[go: up one dir, main page]

Skip to content

Closing a merge train MR doesn't cancel pipelines that contain child pipelines

Summary

The CI pipeline contains a job for MR results that triggers a child pipeline.

If MR created and the pipeline is triggered, when Merge train MR is closed, the pipeline continues running.

Steps to reproduce

This bug is flakey and dosn't occur eveytime. It is difficult to reproduce.

.gitlab-ci.yml:

stages:
  
  - stage_1
  - stage_2
  - stage_3

job0:
  stage: stage_1
  script: echo "something"

job1:
    stage: stage_2
    trigger:
      include: 'dir1/service1.yml'
    rules:
      - if: $CI_MERGE_REQUEST_ID

job2:
  stage: stage_3
  script: echo "something"

The child pipeline config dir1/service1.yml:

job1-1:
  script:
    - echo "something!"
  rules:
    - when: always

Example Project

https://gitlab.com/gitlab-gold/asalii/ci_commit_ref_name-test

What is the current bug behavior?

Create a new branch, and commit some changes to that branch. Create an MR of that branch into main.

Close the MR. The pipeline will continue running.

What is the expected correct behavior?

All the running pipelines should stop and be canceled once the MR is closed.

Expand for output related to the GitLab application check

(For installations with omnibus-gitlab package run and paste the output of: sudo gitlab-rake gitlab:check SANITIZE=true)

(For installations from source run and paste the output of: sudo -u git -H bundle exec rake gitlab:check RAILS_ENV=production SANITIZE=true)

(we will only investigate if the tests are passing)

Possible fixes

  1. Bring the retry_lock closer to the point of update
  2. add retry_lock when updating auto_canceled_by_pipeline_id
  3. Up the number of retries, currently we retry once
  4. When we do see an ActiveRecord::StaleObjectError in the parents, might we may still want to cancel the children?

Implementation Guide

Implements 1 and 2

diff --git a/app/models/ci/pipeline.rb b/app/models/ci/pipeline.rb
index 1cbe63e596d6..fa7d8e43a257 100644
--- a/app/models/ci/pipeline.rb
+++ b/app/models/ci/pipeline.rb
@@ -668,12 +668,10 @@ def cancel_running(retries: 1, cascade_to_children: true, auto_canceled_by_pipel
       end

       cancel_jobs(cancelable_statuses, retries: retries, auto_canceled_by_pipeline_id: auto_canceled_by_pipeline_id)
+      cancel_child_pipelines(cascade_to_children, retries, auto_canceled_by_pipeline_id, execute_async)

-      if cascade_to_children
-        # cancel any bridges that could spin up new child pipelines
-        cancel_jobs(bridges_in_self_and_project_descendants.cancelable, retries: retries, auto_canceled_by_pipeline_id: auto_canceled_by_pipeline_id)
-        cancel_children(auto_canceled_by_pipeline_id: auto_canceled_by_pipeline_id, execute_async: execute_async)
-      end
+    rescue ActiveRecord::StaleObjectError
+      cancel_child_pipelines(cascade_to_children, retries, auto_canceled_by_pipeline_id, execute_async)
     end

     # rubocop: disable CodeReuse/ServiceClass
@@ -1351,8 +1349,15 @@ def merge_request_diff

     private

+    def cancel_child_pipelines(cascade_to_children, retries, auto_canceled_by_pipeline_id, execute_async)
+      return unless cascade_to_children
+
+      # cancel any bridges that could spin up new child pipelines
+      cancel_jobs(bridges_in_self_and_project_descendants.cancelable, retries: retries, auto_canceled_by_pipeline_id: auto_canceled_by_pipeline_id)
+      cancel_children(auto_canceled_by_pipeline_id: auto_canceled_by_pipeline_id, execute_async: execute_async)
+    end
+
     def cancel_jobs(jobs, retries: 1, auto_canceled_by_pipeline_id: nil)
-      retry_lock(jobs, retries, name: 'ci_pipeline_cancel_running') do |statuses|
         preloaded_relations = [:project, :pipeline, :deployment, :taggings]

         statuses.find_in_batches do |status_batch|
@@ -1360,8 +1365,10 @@ def cancel_jobs(jobs, retries: 1, auto_canceled_by_pipeline_id: nil)
           Preloaders::CommitStatusPreloader.new(relation).execute(preloaded_relations)

           relation.each do |job|
-            job.auto_canceled_by_id = auto_canceled_by_pipeline_id if auto_canceled_by_pipeline_id
-            job.cancel
+            retry_lock(job, retries, name: 'ci_pipeline_cancel_running') do |subject|
+              subject.auto_canceled_by_id = auto_canceled_by_pipeline_id if auto_canceled_by_pipeline_id
+              subject.cancel
+            end
           end
         end
       end
Edited by 🤖 GitLab Bot 🤖