Split UpdateMergeRequestsWorker using event-driven architecture for improved resilience

What does this MR do and why?

This MR implements an event-driven hybrid approach to split the UpdateMergeRequestsWorker into smaller, more resilient workers. This addresses the issues mentioned in #554081 (closed) where MRs can get stuck in broken states due to worker failures.

Problem

The current UpdateMergeRequestsWorker is a monolithic worker that:

  1. Can leave MRs in broken states if interrupted by errors
  2. Causes duplicate work when retried due to Redis connection issues
  3. Is difficult to debug due to lack of observability
  4. Can block queues with long-running operations

Solution

Split the worker into an event-driven architecture with:

  • Phase 1: Parallel independent operations (git analysis, branch cleanup, LFS linking)
  • Phase 2: Sequential heavy operation (diff reload) that depends on git analysis
  • Phase 3: Parallel post-processing (notifications, webhooks) that depends on diff reload

Architecture

UpdateMergeRequestsWorker (publishes MergeRequestRefreshStarted)

Phase 1 (Parallel):
├── MergeRequestGitWorker (publishes CommitsAnalyzed)
├── MergeRequestBranchWorker (publishes BranchesProcessed)  
└── MergeRequestLfsWorker (publishes LfsLinked)

Phase 2 (Sequential, waits for CommitsAnalyzed):
MergeRequestDiffWorker (publishes DiffsReloaded)

Phase 3 (Parallel, waits for DiffsReloaded):
├── MergeRequestSuggestionWorker
├── MergeRequestNotificationWorker
└── MergeRequestWebHookWorker
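
The dependency chain above can be sketched with a small in-memory publish/subscribe bus. This is only an illustration of the ordering guarantees, not the code in this MR: `SketchEventBus`, `build_refresh_pipeline`, and the snake_case event symbols are hypothetical names, and the handlers only append the worker name to a log instead of doing real work.

```ruby
# Minimal in-memory event bus. A hypothetical stand-in used only for this
# sketch; it is not GitLab's event infrastructure.
class SketchEventBus
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a handler to run whenever `event_name` is published.
  def subscribe(event_name, &handler)
    @subscribers[event_name] << handler
  end

  # Call every handler synchronously, in subscription order.
  def publish(event_name, payload)
    @subscribers[event_name].each { |handler| handler.call(payload) }
  end
end

# Wire the three phases together. Each handler records its worker name in
# `log` and publishes the follow-up event named in the diagram.
def build_refresh_pipeline(bus, log)
  # Phase 1: three independent subscribers of the start event.
  bus.subscribe(:merge_request_refresh_started) do |payload|
    log << 'MergeRequestGitWorker'
    bus.publish(:commits_analyzed, payload)
  end
  bus.subscribe(:merge_request_refresh_started) do |payload|
    log << 'MergeRequestBranchWorker'
    bus.publish(:branches_processed, payload)
  end
  bus.subscribe(:merge_request_refresh_started) do |payload|
    log << 'MergeRequestLfsWorker'
    bus.publish(:lfs_linked, payload)
  end

  # Phase 2: the diff reload runs only after commits have been analyzed.
  bus.subscribe(:commits_analyzed) do |payload|
    log << 'MergeRequestDiffWorker'
    bus.publish(:diffs_reloaded, payload)
  end

  # Phase 3: post-processing runs only after diffs have been reloaded.
  %w[
    MergeRequestSuggestionWorker
    MergeRequestNotificationWorker
    MergeRequestWebHookWorker
  ].each do |worker_name|
    bus.subscribe(:diffs_reloaded) { |_payload| log << worker_name }
  end
end

bus = SketchEventBus.new
log = []
build_refresh_pipeline(bus, log)
bus.publish(:merge_request_refresh_started, { merge_request_id: 1 })
puts log
```

Because this bus dispatches synchronously, the log comes out depth-first rather than in parallel. What the sketch does preserve is the ordering the design relies on: the diff worker never runs before the git worker, and Phase 3 never runs before the diff worker.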

Benefits

  1. Resilience: Each worker is idempotent and can retry independently
  2. Performance: Independent operations run in parallel
  3. Observability: Clear event flow makes debugging easier
  4. Extensibility: Easy to add new processors without changing existing code
  5. Queue management: Heavy operations don't block lighter ones
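
As a rough illustration of the idempotency claim in item 1, a step can record which inputs it has already handled and skip repeats, so a retry does not redo the work. The class name, the `(merge_request_id, newrev)` key, and the in-memory `Set` are assumptions made for this sketch; a real worker would need a store shared across processes.

```ruby
require 'set'

# Hypothetical idempotent step: running it twice with the same arguments
# performs the underlying work only once.
class IdempotentStepSketch
  attr_reader :runs

  def initialize
    @completed = Set.new # in-memory only; assumed for illustration
    @runs = 0
  end

  def perform(merge_request_id, newrev)
    key = [merge_request_id, newrev]
    return :skipped if @completed.include?(key)

    @runs += 1 # stands in for the real work of the step
    @completed.add(key)
    :processed
  end
end

step = IdempotentStepSketch.new
step.perform(42, 'abc123') # first delivery does the work
step.perform(42, 'abc123') # a retry of the same job is a no-op
```

The key is marked as completed only after the work finishes, so a failure partway through leaves the step eligible for retry.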

Migration Strategy

  • Rollout controlled by the split_update_merge_requests_worker feature flag
  • Backward compatibility maintained
  • Gradual migration with monitoring
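
One way to picture the backward-compatible rollout is a router that sends each refresh down either the existing path or the event-driven path, depending on the flag state. `RefreshRouterSketch` and both lambdas are hypothetical, and the flag value is passed in as a plain boolean; how the flag is actually looked up is outside this sketch.

```ruby
# Hypothetical router between the legacy and event-driven refresh paths.
class RefreshRouterSketch
  def initialize(legacy_path:, event_driven_path:)
    @legacy_path = legacy_path
    @event_driven_path = event_driven_path
  end

  # With the flag off, behaviour is unchanged; with it on, the new path runs.
  def call(flag_enabled:, payload:)
    path = flag_enabled ? @event_driven_path : @legacy_path
    path.call(payload)
  end
end

router = RefreshRouterSketch.new(
  legacy_path: ->(payload) { [:legacy, payload] },
  event_driven_path: ->(payload) { [:event_driven, payload] }
)

router.call(flag_enabled: false, payload: { merge_request_id: 1 })
router.call(flag_enabled: true, payload: { merge_request_id: 1 })
```

Keeping the legacy path callable behind the same entry point means that disabling the flag is enough to roll back.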

Related to #554081 (closed)

How to set up and validate locally

  1. Enable the feature flag in a Rails console:

    Feature.enable(:split_update_merge_requests_worker)

  2. Push to a branch and observe the new worker execution in Sidekiq logs

  3. Verify MR refresh functionality works correctly

MR acceptance checklist

This MR addresses the customer-reported issue while providing a foundation for improved worker resilience across GitLab.