Add traversal_ids column to vulnerability_statistics

Problem to solve

As part of Sec DB Decomposition Slice 3 - Vulnerability Fe... (&14197 - closed), we're removing cross-database joins between the sec and the main gitlab database schemas. However, the tables in the sec schema don't provide an efficient way to iterate over the projects of a group and its subgroups. In particular, this makes it difficult to efficiently support the Group.vulnerabilityScanners GraphQL field.

Further details

For Group.vulnerabilityScanners, we've considered using vulnerability_reads.traversal_ids and vulnerability_reads.project_id to collect the vulnerability_scanners of a group and its subgroups, but that was not efficient enough for large groups. See !177229 (comment 2287106459)

Proposal

  • Add traversal_ids and archived columns to the vulnerability_statistics table. Backfill the columns.
  • Leverage the Vulnerabilities::Statistics::ScheduleWorker to update the traversal_ids column regularly, while also updating the project statistics (vulnerability counts and letter grade).
  • Additionally, to reduce latency, add dedicated workers and services that update traversal_ids and archived whenever a project is moved or archived. This would be similar to what's already in place for updating the same column of vulnerability_reads. See EventStore subscription and service.

Then we can leverage the new column to solve dependent issues.
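
To make the shape of the proposal concrete, here is a minimal, framework-free Ruby sketch of the denormalization idea: each statistics row carries a copy of its group's traversal_ids, a transfer event rewrites that copy in bulk, and group-level queries filter on the cached hierarchy without joining projects or namespaces. All class and method names here are hypothetical illustrations, not GitLab's actual API.

```ruby
# Each statistics row caches the project's namespace hierarchy
# (traversal_ids) and its archived state.
Statistic = Struct.new(:project_id, :traversal_ids, :archived)

class StatisticsStore
  def initialize
    @rows = {}
  end

  def upsert(project_id, traversal_ids, archived: false)
    @rows[project_id] = Statistic.new(project_id, traversal_ids, archived)
  end

  def for_project(project_id)
    @rows[project_id]
  end

  # On a transfer event, every moved project gets its cached
  # traversal_ids rewritten in one pass.
  def update_traversal_ids(project_ids, new_traversal_ids)
    project_ids.each do |id|
      row = @rows[id]
      row.traversal_ids = new_traversal_ids if row
    end
  end

  # Group-level query: no join with projects/namespaces is needed,
  # because traversal_ids lives on the statistics row itself.
  # Archived projects are filtered out via the cached flag.
  def in_hierarchy(prefix)
    @rows.values.reject(&:archived).select do |row|
      row.traversal_ids.first(prefix.size) == prefix
    end
  end
end
```

A group query then reduces to a prefix match on the cached array, e.g. `store.in_hierarchy([100])` returns all non-archived statistics rows under namespace 100, including subgroups.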

Pros & cons

Pros & cons compared to Ensure Sec DB remains in sync with CI mirror ta... (#466288 - closed).

Pros

  • Storage: We only add a column and an index to an existing table.
  • Speed? It's optimized to render Group.vulnerabilityGrades. No joins.

Cons

  • Latency: It's not kept in sync with the projects and namespaces tables.
    • This can be mitigated by using the EventStore and implementing a worker that updates the traversal IDs, though.
    • vulnerability_reads.traversal_ids needs to be updated the same way anyway.
  • We no longer filter out archived and deleted projects when listing vulnerabilityScanners.
  • It's kind of a hack, and it might be detrimental to code consistency and code maintenance.
    • It's inconsistent with how we've solved similar problems in the context of the CI database.
    • However, it's consistent with the existing logic that updates traversal IDs in vulnerability tables. See EventStore subscriptions.
  • No dev docs on the adjustment services and workers, compared to https://docs.gitlab.com/ee/development/database/ci_mirrored_tables.html.

Either way we need to...

  • Backfill new columns.
  • Put new queries behind a feature flag, to be enabled on production when the new tables or columns have been backfilled.

It's a two-way door decision.

Planning breakdown

See #512592 (comment 2295645907)

  • Normal migration: Add traversal_ids and archived columns to vulnerability_statistics.
  • Post migration: Create an index on traversal_ids where archived is false. [1]
  • Code change: Change Vulnerabilities::Statistics::ScheduleWorker to update the new columns.
  • Code change: Set archived and traversal_ids upon vulnerability statistics ingestion.
  • Code change: Implement Vulnerabilities::UpdateNamespaceIdsOfVulnerabilityStatisticsWorker, and call it from Vulnerabilities::ProcessTransferEventsWorker#bulk_schedule_worker (registered to EventStore by register_threat_insights_subscribers).
  • Batched background migration: Backfill the new columns.
  • Leverage the new columns to resolve cross-database join in ProjectsGrade.grade_for.
  • Leverage the new columns to resolve cross-database join in EE::Group#vulnerability_scanners.
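
As a sketch, the partial index from the post-migration step above might look like the following. The migration class name, index name, and helper usage are assumptions based on GitLab's usual `add_concurrent_index` pattern, not the final implementation.

```ruby
# Hypothetical post-deployment migration: a partial index on
# traversal_ids limited to non-archived rows, so group hierarchy
# queries can skip archived projects entirely.
class AddTraversalIdsIndexToVulnerabilityStatistics < Gitlab::Database::Migration[2.2]
  disable_ddl_transaction!

  INDEX_NAME = 'index_vulnerability_statistics_on_traversal_ids_active'

  def up
    add_concurrent_index :vulnerability_statistics, :traversal_ids,
      where: 'archived = false',
      name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :vulnerability_statistics, INDEX_NAME
  end
end
```
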

Features that rely on the new columns are behind a feature flag that's enabled on production when the BBM has completed.

The feature flag is enabled by default after finalizing the BBM.

Implementation plan

Normal migration: Add traversal_ids and archived columns to vulnerability_statistics.
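
A rough sketch of that migration follows. The column defaults are assumptions: an empty array for traversal_ids and false for archived, so existing rows remain valid until the batched backfill runs.

```ruby
# Hypothetical regular migration adding the two denormalized columns
# to vulnerability_statistics. Real defaults and null constraints are
# to be decided in the MR.
class AddTraversalIdsToVulnerabilityStatistics < Gitlab::Database::Migration[2.2]
  enable_lock_retries!

  def change
    add_column :vulnerability_statistics, :traversal_ids,
      'bigint[]', default: [], null: false
    add_column :vulnerability_statistics, :archived,
      :boolean, default: false, null: false
  end
end
```
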

Verification steps

None

  1. The index could be added in follow-up issues, based on need. This would allow us to prove that the index is optimal for the new queries. If a composite index is needed, that index should be reused in simpler queries, since vulnerability statistics are updated/refreshed regularly.

Edited by Fabien Catteau