Introduce FF read_from_new_ci_destinations and update read logic for options/yaml_variables
What does this MR do and why?
Context
As part of our CI data reduction project (Reduce growth rate of ci_builds_metadata partit... (&13886)), we introduced a new table ci_job_definitions to help deduplicate job data. Since some data is moved to new destinations (e.g. intrinsic data moved to ci_builds and immutable processing data to ci_job_definitions) we need to be able to use feature flags to read/write data to specific destination.
In #552057 (closed), we ultimately want to move all necessary getter/setters to Ci::Metadatable so they're all in one place and the behaviour can be switched on/off based on feature flags.
This MR
These changes are based on the POC: !193948 (closed)
Since there are many columns/fields to migrate, I've decided to split this effort into multiple smaller MRs to simplify the review process. This MR is the first iteration of the implementation plan:
- Introduces the Feature flag:
-
read_from_new_ci_destinations(Roll-out: FF: enable `read_from_new_ci_destinations` (#552063 - closed))
- Applies the FF-controlled reading logic to the underlying methods for
optionsandyaml_variables. - Updates all places that preload
:metadataso that it also preloads:job_definition, to avoid N+1 queries.
References
Database queries
NOTES:
- We can't run these queries on Postgres.ai yet because the clone is delayed and the
job_definitionstable doesn't exist. So the query plans below are from local machine. - There isn't any job definition data yet, so the query plans prefer just seq scans for now. Later the queries should perform fine though because we're only filtering on each table's primary key.
Preloading only [:job_definition] association
Raw SQL
SELECT
"p_ci_job_definition_instances"."job_id" AS t0_r0,
"p_ci_job_definition_instances"."job_definition_id" AS t0_r1,
"p_ci_job_definition_instances"."partition_id" AS t0_r2,
"p_ci_job_definition_instances"."project_id" AS t0_r3,
"p_ci_job_definitions"."id" AS t1_r0,
"p_ci_job_definitions"."partition_id" AS t1_r1,
"p_ci_job_definitions"."project_id" AS t1_r2,
"p_ci_job_definitions"."created_at" AS t1_r3,
"p_ci_job_definitions"."updated_at" AS t1_r4,
"p_ci_job_definitions"."interruptible" AS t1_r5,
"p_ci_job_definitions"."checksum" AS t1_r6,
"p_ci_job_definitions"."config" AS t1_r7
FROM
"p_ci_job_definition_instances"
LEFT OUTER JOIN "p_ci_job_definitions" ON "p_ci_job_definitions"."partition_id" IS NOT NULL
AND "p_ci_job_definitions"."id" = "p_ci_job_definition_instances"."job_definition_id"
AND "p_ci_job_definitions"."partition_id" = "p_ci_job_definition_instances"."partition_id"
WHERE
"p_ci_job_definition_instances"."partition_id" = 100
AND "p_ci_job_definitions"."partition_id" = 100
AND "p_ci_job_definition_instances"."job_id" IN (1422, 1421, 1420)
Query plan
Nested Loop (cost=0.00..0.01 rows=1 width=137) (actual time=0.002..0.002 rows=0 loops=1)
Join Filter: (p_ci_job_definition_instances.job_definition_id = p_ci_job_definitions.id)
-> Seq Scan on ci_job_definition_instances_100 p_ci_job_definition_instances (cost=0.00..0.00 rows=1 width=32) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((partition_id = 100) AND (job_id = ANY ('{1422,1421,1420}'::bigint[])))
-> Seq Scan on ci_job_definitions_100 p_ci_job_definitions (cost=0.00..0.00 rows=1 width=105) (never executed)
Filter: ((partition_id IS NOT NULL) AND (partition_id = 100))
Planning:
Buffers: shared hit=6
Planning Time: 0.150 ms
Execution Time: 0.011 ms
Preloading [:job_definition_instance, :job_definition] separately
SELECT
"p_ci_job_definition_instances".*
FROM
"p_ci_job_definition_instances"
WHERE
"p_ci_job_definition_instances"."partition_id" = 100
AND "p_ci_job_definition_instances"."job_id" IN (1422, 1421, 1420)
Query plan
Seq Scan on ci_job_definition_instances_100 p_ci_job_definition_instances (cost=0.00..0.00 rows=1 width=32) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((partition_id = 100) AND (job_id = ANY ('{1422,1421,1420}'::bigint[])))
Planning:
Buffers: shared hit=3
Planning Time: 0.071 ms
Execution Time: 0.007 ms
SELECT
"p_ci_job_definitions".*
FROM
"p_ci_job_definitions"
WHERE
"p_ci_job_definitions"."partition_id" = 100
AND "p_ci_job_definitions"."id" IN (1004, 1005, 1006)
Query plan
Seq Scan on ci_job_definitions_100 p_ci_job_definitions (cost=0.00..0.00 rows=1 width=105) (actual time=0.011..0.012 rows=0 loops=1)
Filter: ((partition_id = 100) AND (id = ANY ('{1004,1005,1006}'::bigint[])))
Planning:
Buffers: shared hit=3
Planning Time: 0.610 ms
Execution Time: 0.032 ms
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Related to #552057 (closed)