[go: up one dir, main page]

Skip to content

Schedule backfill of system_note_metadata.namespace_id

What does this MR do and why?

Now that the backfill was finalized in the notes table, we can schedule the backfill of system_note_metadata. In the notes table we have records that have both a project_id and a namespace_id, but for some records namespace_id might have the incorrect value. Because of this, we need to fetch projects.project_namespace_id for every record associated with a project, as we know the project_id value is correct. Cleanup of the notes table will be done in the future.

The notes table also has an organization_id column for notes on PersonalSnippet, but because we don't have system_notes for personal snippets, no records exist on the system_note_metadata table that are associated with a personal snippet. That' why we only backfill the namespace_id column. system_note_metadata.organization_id was added before we find out about this, but that column is in the multi release process of getting dropped (already ignored).

DB review

Migration output

UP
main: == [advisory_lock_connection] object_id: 130200, pg_backend_pid: 3483
main: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: migrating =======
main: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: migrated (0.1039s) 

main: == [advisory_lock_connection] object_id: 130200, pg_backend_pid: 3483
ci: == [advisory_lock_connection] object_id: 130200, pg_backend_pid: 3484
ci: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: migrating =======
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main_org].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_ci_cell_local, :gitlab_internal, :gitlab_shared, :gitlab_shared_cell_local, :gitlab_shared_org].
ci: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: migrated (0.0081s) 

ci: == [advisory_lock_connection] object_id: 130200, pg_backend_pid: 3484
DOWN
main: == [advisory_lock_connection] object_id: 129900, pg_backend_pid: 3249
main: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: reverting =======
main: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: reverted (0.0821s) 

main: == [advisory_lock_connection] object_id: 129900, pg_backend_pid: 3249
ci: == [advisory_lock_connection] object_id: 129900, pg_backend_pid: 3274
ci: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: reverting =======
ci: -- The migration is skipped since it modifies the schemas: [:gitlab_main_org].
ci: -- This database can only apply migrations in one of the following schemas: [:gitlab_ci, :gitlab_ci_cell_local, :gitlab_internal, :gitlab_shared, :gitlab_shared_cell_local, :gitlab_shared_org].
ci: == 20251021184303 QueueBackfillSystemNoteMetadataNamespaceId: reverted (0.0278s) 

ci: == [advisory_lock_connection] object_id: 129900, pg_backend_pid: 3274

Query plans

UPDATE namespace_id batch

https://console.postgres.ai/gitlab/gitlab-production-main/sessions/44762/commands/137401

WITH relation AS MATERIALIZED (
  SELECT 
    "system_note_metadata"."id", 
    "system_note_metadata"."note_id" 
  FROM 
    "system_note_metadata" 
  WHERE 
    "system_note_metadata"."id" BETWEEN 1 
    AND 34509 
    AND "system_note_metadata"."id" >= 1 
    AND "system_note_metadata"."id" < 292 
  LIMIT 
    250
), relation_with_namespace_id AS MATERIALIZED (
  SELECT 
    "relation".*, 
    COALESCE(
      "projects"."project_namespace_id", 
      "notes"."namespace_id"
    ) AS namespace_id 
  FROM 
    "relation" 
    INNER JOIN "notes" ON "notes"."id" = "relation"."note_id" 
    LEFT JOIN "projects" ON "projects"."id" = "notes"."project_id" 
  LIMIT 
    250
) 
UPDATE 
  "system_note_metadata" 
SET 
  "namespace_id" = "relation_with_namespace_id"."namespace_id" 
FROM 
  "relation_with_namespace_id" 
WHERE 
  "system_note_metadata"."id" = "relation_with_namespace_id"."id"

Related to #514589

Edited by Mario Celi

Merge request reports

Loading