Git clone via http(s) fails after gitaly died
Summary
Git clone via http(s) is no longer possible once gitaly has been restarted following a period of high CPU utilization.
Steps to reproduce
Produce high CPU utilisation on gitaly through a high number of requests and wait for it to die (e.g. the Kubernetes node dies).
Example Project
What is the current bug behavior?
In our environment, the gitaly pod in Kubernetes dies because of very high CPU utilization caused by renovate (which upgrades all dependencies for all projects via merge requests). After that, the gitlab workhorse is no longer able to serve clones via http(s). This mainly affects the gitlab runners, as they clone via http(s) and afterwards see a 502 error:
Created fresh repository.
error: RPC failed; HTTP 502 curl 22 The requested URL returned error: 502
fatal: expected flush after ref listing
A "simple" fix is to restart the gitlab webservice (on Kubernetes it contains the workhorse & webservice). Afterwards it works again.
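The restart workaround could be scripted roughly as follows. This is only a sketch: it assumes `kubectl` is installed and configured for the cluster, and the deployment and namespace names in the usage line are hypothetical placeholders, not the names from our installation.

```python
import subprocess


def rollout_restart_command(deployment: str, namespace: str) -> list[str]:
    """Build the kubectl command that restarts all pods of a deployment."""
    return [
        "kubectl", "rollout", "restart",
        f"deployment/{deployment}",
        "--namespace", namespace,
    ]


def restart_webservice(deployment: str, namespace: str, runner=subprocess.run) -> None:
    """Run the restart; raises CalledProcessError if kubectl exits non-zero."""
    runner(rollout_restart_command(deployment, namespace), check=True)
```

Usage, with placeholder names: `restart_webservice("my-gitlab-webservice", "gitlab")`.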
A possibly important fact is that cloning via ssh is not affected at all. I checked the architecture and saw that ssh goes via the gitlab-shell, which is why I suspect some kind of timeout or reconnect issue in the workhorse.
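To confirm that only the http(s) path is broken, both transports can be checked side by side. The following is a minimal sketch, assuming `git` is on the PATH and credentials (token or SSH key) are already configured; it uses `git ls-remote` as a cheap stand-in for a full clone, which may not exercise exactly the same request that fails for the runners.

```python
import os
import subprocess


def ls_remote_command(url: str) -> list[str]:
    """Build a git command that only lists remote branch heads."""
    return ["git", "ls-remote", "--heads", url]


def check_remote(url: str, timeout: float = 60.0, runner=subprocess.run) -> bool:
    """Return True if the ref listing succeeds for the given URL."""
    # GIT_TERMINAL_PROMPT=0 makes git fail instead of asking for credentials.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        result = runner(
            ls_remote_command(url),
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def compare_transports(http_url: str, ssh_url: str, check=check_remote) -> dict[str, bool]:
    """Check the same project over http(s) and ssh and report both results."""
    return {"http": check(http_url), "ssh": check(ssh_url)}
```

With the clone URLs from the environment info further down, a result of `{"http": False, "ssh": True}` would match the behavior described here.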
As an additional note, this started to happen after we upgraded from 14.4.1 to 14.6.3.
What is the expected correct behavior?
git clone via http(s) works after a gitaly server crash
Relevant logs and/or screenshots
At 3 h the CPU utilisation of our gitaly server gets so high that it is unreachable for some time.
From this point on we see a high number of 502 responses in our proxy infrastructure (600 seconds timeout).
Output of checks
Please see them collapsed inside the corresponding sections so that they do not bloat the ticket.
Results of GitLab environment info
Expand for output related to GitLab environment info
Executed from within the gitlab-toolbox pod in Kubernetes

```
System information
System:
Current User:	git
Using RVM:	no
Ruby Version:	2.7.5p203
Gem Version:	3.2.19
Bundler Version:2.2.19
Rake Version:	13.0.6
Redis Version:	unknown
Git Version:	unknown
Sidekiq Version:6.3.1
Go Version:	unknown

GitLab information
Version:	14.6.3
Revision:	e085746f077
Directory:	/srv/gitlab
DB Adapter:	PostgreSQL
DB Version:	12.7
URL:		https://git.serious.com
HTTP Clone URL:	https://git.serious.com/some-group/some-project.git
SSH Clone URL:	git@git.serious.com:some-group/some-project.git
Using LDAP:	no
Using Omniauth:	yes
Omniauth Providers: openid_connect

GitLab Shell
Version:	13.22.1
Repository storage paths:
- default: 	/var/opt/gitlab/repo
GitLab Shell path:		/home/git/gitlab-shell
Git:		/usr/bin/git
```
Results of GitLab application Check
Expand for output related to the GitLab application check
Executed from within gitlab-toolbox. It seems there is no git inside of it.

Redis version >= 5.0.0? ... yes
Ruby version >= 2.7.2 ? ... yes (2.7.5)
Git version >= 2.33.0 ? ... no
  Your git bin path is "/usr/bin/git"
  Try fixing it:
  Update your git to a version >= 2.33.0 from Unknown
  Please fix the error above and rerun the checks.
Git user has default SSH configuration? ... yes
Active users: ... 508
Is authorized keys file accessible? ... skipped (authorized keys not enabled)
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... no
  Try fixing it:
  Please migrate all projects to hashed storage
  as legacy storage is deprecated in 13.0 and support will be removed in 14.0.
  For more information see:
  doc/administration/repository_storage_types.md

Checking GitLab App ... Finished

Checking GitLab subtasks ... Finished

