Git clone via http(s) fails after gitaly died

Summary

Git clone via http(s) is no longer possible after Gitaly was restarted following high CPU utilization.

Steps to reproduce

Produce high CPU utilisation on Gitaly through a high number of requests and wait for it to die (e.g. the Kubernetes node is dead).

Example Project

What is the current bug behavior?

After the Gitaly pod in Kubernetes dies in our environment due to very high CPU utilization produced by Renovate (which upgrades all dependencies for all projects via merge requests), GitLab Workhorse is no longer able to serve clones via http(s). This mainly affects the GitLab runners, as they clone via http(s) and afterwards see a 502 error:

Created fresh repository.
error: RPC failed; HTTP 502 curl 22 The requested URL returned error: 502
fatal: expected flush after ref listing

A "simple" fix is to restart the GitLab webservice (on Kubernetes the pod contains both Workhorse and webservice). Afterwards cloning works again.
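To check whether the webservice is still in the broken state without running a full clone, one option is to request the ref advertisement that `git clone` fetches first over smart HTTP. The following is a minimal sketch, not part of the original report: the function names `info_refs_url`, `classify` and `probe` are made up for illustration, and the mapping of status codes to verdicts is an assumption. Note that an `auth-required` result only shows that credentials are needed and says nothing about the connection to Gitaly.

```python
import base64
import urllib.error
import urllib.request
from typing import Optional


def info_refs_url(clone_url: str) -> str:
    """Return the smart-HTTP ref advertisement URL that `git clone` requests first."""
    return clone_url.rstrip("/") + "/info/refs?service=git-upload-pack"


def classify(status: int) -> str:
    """Map an HTTP status code to a coarse verdict (hypothetical categories)."""
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "auth-required"
    if status in (502, 503, 504):
        return "backend-unreachable"
    return "unexpected"


def probe(
    clone_url: str,
    user: Optional[str] = None,
    token: Optional[str] = None,
    timeout: float = 30.0,
) -> str:
    """Request the ref advertisement once and classify the response."""
    request = urllib.request.Request(info_refs_url(clone_url))
    if user is not None and token is not None:
        # HTTP Basic auth, e.g. a user name plus an access token.
        credentials = base64.b64encode(f"{user}:{token}".encode()).decode()
        request.add_header("Authorization", f"Basic {credentials}")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses, including the 502 seen here.
        return classify(error.code)
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout while connecting, ...
        return "network-error"
```

Calling `probe("https://git.serious.com/some-group/some-project.git", user, token)` before and after restarting the webservice would be expected to return `backend-unreachable` while the 502 persists and `ok` once cloning works again.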

A possibly important fact is that cloning via SSH is not affected at all. I checked the architecture and saw that SSH goes through gitlab-shell, which is why I suspect a timeout or reconnect issue in Workhorse.
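The suspected failure mode can be illustrated with a toy model. This is purely a sketch of the hypothesis and not Workhorse or Gitaly code: `Backend`, `Connection` and `CachingClient` are invented names, and the assumption is that a long-lived client holds on to a connection that became stale when the backend restarted.

```python
class Backend:
    """Stand-in for Gitaly: a restart invalidates connections opened before it."""

    def __init__(self) -> None:
        self.generation = 0

    def restart(self) -> None:
        self.generation += 1

    def connect(self) -> "Connection":
        return Connection(self, self.generation)


class Connection:
    """A connection that only works for the backend generation it was opened on."""

    def __init__(self, backend: Backend, generation: int) -> None:
        self.backend = backend
        self.generation = generation

    def call(self) -> str:
        if self.generation != self.backend.generation:
            raise ConnectionError("stale connection")
        return "refs"


class CachingClient:
    """Stand-in for a long-lived proxy process that caches its backend connection."""

    def __init__(self, backend: Backend, reconnect: bool) -> None:
        self.backend = backend
        self.reconnect = reconnect
        self.connection = backend.connect()

    def fetch(self) -> int:
        """Return an HTTP-like status code for one request."""
        try:
            self.connection.call()
            return 200
        except ConnectionError:
            if not self.reconnect:
                # The stale connection is kept, so every later request fails too.
                return 502
            # Re-dial once and retry the call on the fresh connection.
            self.connection = self.backend.connect()
            self.connection.call()
            return 200
```

In this model a client created with `reconnect=False` keeps returning 502 after `Backend.restart()` until the client itself is recreated, which mirrors the observation that restarting the webservice resolves the problem. A client that re-dials on failure recovers on its own.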

As an additional note, this started to happen after we upgraded from 14.4.1 to 14.6.3.

What is the expected correct behavior?

Git clone via http(s) continues to work after a Gitaly server crash.

Relevant logs and/or screenshots

At 3 o'clock our Gitaly server reaches such high CPU utilisation that it is unreachable for some time.

Screenshot_2022-01-28_at_09.23.56

From this point on we see a high number of 502 responses in our proxy infrastructure (600-second timeout).

Screenshot_2022-01-28_at_09.27.37

Output of checks

Please see them collapsed inside the corresponding sections, to avoid bloating the ticket.

Results of GitLab environment info

Expand for output related to GitLab environment info

Executed from within gitlab-toolbox pod in Kubernetes

```

System information
System:
Current User:	git
Using RVM:	no
Ruby Version:	2.7.5p203
Gem Version:	3.2.19
Bundler Version:2.2.19
Rake Version:	13.0.6
Redis Version:	unknown
Git Version:	unknown
Sidekiq Version:6.3.1
Go Version:	unknown

GitLab information
Version:	14.6.3
Revision:	e085746f077
Directory:	/srv/gitlab
DB Adapter:	PostgreSQL
DB Version:	12.7
URL:		https://git.serious.com
HTTP Clone URL:	https://git.serious.com/some-group/some-project.git
SSH Clone URL:	git@git.serious.com:some-group/some-project.git
Using LDAP:	no
Using Omniauth:	yes
Omniauth Providers: openid_connect

GitLab Shell
Version:	13.22.1
Repository storage paths:
- default: 	/var/opt/gitlab/repo
GitLab Shell path:		/home/git/gitlab-shell
Git:		/usr/bin/git
```

Results of GitLab application Check

Expand for output related to the GitLab application check

Executed from within the gitlab-toolbox pod. It seems there is no git binary inside it.

```
Redis version >= 5.0.0? ... yes
Ruby version >= 2.7.2 ? ... yes (2.7.5)
Git version >= 2.33.0 ? ... no
Your git bin path is "/usr/bin/git"
  Try fixing it:
  Update your git to a version >= 2.33.0 from Unknown
  Please fix the error above and rerun the checks.
Git user has default SSH configuration? ... yes
Active users: ... 508
Is authorized keys file accessible? ... skipped (authorized keys not enabled)
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... no
  Try fixing it:
  Please migrate all projects to hashed storage
  as legacy storage is deprecated in 13.0 and support will be removed in 14.0.
  For more information see:
  doc/administration/repository_storage_types.md

Checking GitLab App ... Finished

Checking GitLab subtasks ... Finished
```

Possible fixes

Edited by Björn Wenzel