Resolve cross-join in ProjectsGrades.grades_for with groups
Summary
The vulnerabilities/projects_grade.rb model has a scope to include projects which cannot be used when the tables are decomposed.
Further details
def self.grades_for(vulnerables, filter: nil, include_subgroups: false)
projects = vulnerables.map do |v|
collection = include_subgroups ? v.all_projects : v.projects
collection.non_archived
end
relation = ::Vulnerabilities::Statistic.for_project(projects.reduce(&:or))
relation = relation.by_grade(filter) if filter
relation = relation.allow_cross_joins_across_databases(url: 'https://gitlab.com/gitlab-org/gitlab/-/issues/503387')
Proposal
Change ProjectsGrade.grades_for to rely on a new Vulnerabilities::Statistic.by_group scope which is based on Add traversal_ids column to vulnerability_stati... (#512592 - closed).
Vulnerabilities::Statistic.by_group is similar to Vulnerabilities::Read.by_group in the way it compares traversal IDs to collect all the objects that match a group and its subgroup.
Additionally, Vulnerabilities::Statistic provides a way to select objects that are directly in the group, and not in its subgroups. This is necessary to support the include_subgroups paramater. For instance, we could have .by_group and .by_group_not_including_subgroups, or pass a boolean to .by_group to switch b/w including the subgroups or not.
NOTE: The original proposal was to iterate batches of groups, and sub-batches of projects. This has been implemented in Resolve cross-join in ProjectsGrade.grades_for ... (!172720 - merged). However, this implementation wasn't not fast enough for large group such as gitlab-org. See thread: #508631 (comment 2264738914)
Implementation plan
-
Implement Vulnerabilities::Statistic.by_group. It can include subgroups or not. -
Update .grades_forto leverage.by_group. This is behind a feature flag. -
Post migration : Add DB index to make .grades_forefficient when it uses.by_group.
Verification
Use the Group Security Dashboard or the groupVulnerabilityGrades query it triggers.
GraphQL query
query groupVulnerabilityGrades($fullPath: ID!) {
group(fullPath: $fullPath) {
id
vulnerabilityGrades(includeSubgroups: true) {
grade
count
projects {
nodes {
...Project
...ProjectVulnerabilitySeveritiesCount
__typename
}
__typename
}
__typename
}
__typename
}
}
fragment Project on Project {
id
name
nameWithNamespace
securityDashboardPath
fullPath
avatarUrl
path
__typename
}
fragment ProjectVulnerabilitySeveritiesCount on Project {
vulnerabilitySeveritiesCount(state: [DETECTED, CONFIRMED]) {
...VulnerabilitySeveritiesCount
__typename
}
__typename
}
fragment VulnerabilitySeveritiesCount on VulnerabilitySeveritiesCount {
critical
high
info
low
medium
unknown
__typename
}
{
"fullPath": "gitlab-com"
}