Configure Knapsack reports for Unit tests in Charts

Thanks to the great work in Add knapsack for parallelizing tests (!2239 - merged), initial test parallelisation was added. This issue is about storing Knapsack reports so that Knapsack can spread the load more evenly, using the execution times recorded in the report.

For example, in https://gitlab.com/gitlab-org/charts/gitlab/-/pipelines/453617212 specs_without_cluster is split into 5 jobs, 4 of which finished within 10 minutes while the last one ran for 20 minutes. Hopefully, with a Knapsack report in place, the runtime can be balanced better. If not, we need to explore other solutions to even out the runtime.
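To illustrate why recorded timings help, here is a minimal sketch comparing a split that ignores durations with a time-aware split. The file names and durations (in minutes) are hypothetical, and the greedy longest-first heuristic shown here is an assumption about the general approach, not necessarily the exact algorithm Knapsack uses.

```python
import heapq


def split_by_count(durations, nodes):
    """Deal files out round-robin by name, ignoring how long each one takes."""
    buckets = [[] for _ in range(nodes)]
    for i, name in enumerate(sorted(durations)):
        buckets[i % nodes].append(name)
    return [sum(durations[n] for n in bucket) for bucket in buckets]


def split_by_time(durations, nodes):
    """Greedy longest-first: give each file to the currently lightest node."""
    heap = [(0.0, i) for i in range(nodes)]
    heapq.heapify(heap)
    totals = [0.0] * nodes
    by_duration = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    for _name, minutes in by_duration:
        load, idx = heapq.heappop(heap)
        totals[idx] = load + minutes
        heapq.heappush(heap, (totals[idx], idx))
    return totals


# Hypothetical per-file durations in minutes (not taken from the real suite).
durations = {
    "spec/a_spec.rb": 12, "spec/b_spec.rb": 1, "spec/c_spec.rb": 2,
    "spec/d_spec.rb": 3, "spec/e_spec.rb": 4, "spec/f_spec.rb": 8,
    "spec/g_spec.rb": 5, "spec/h_spec.rb": 5, "spec/i_spec.rb": 10,
    "spec/j_spec.rb": 10,
}

print("slowest node, split by count:", max(split_by_count(durations, 5)))
print("slowest node, split by time: ", max(split_by_time(durations, 5)))
```

The pipeline only finishes when the slowest node does, so the maximum per-node total is the number to watch, not the average.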

Related discussion !2348 (comment 818289394)

Update 2025-12-24

Right now, our slowest job takes 37 minutes, while the others take 25 minutes on average. I found that bumping our nodes from 6 to 12 parallel jobs brings the average job duration back down to 10 minutes, but we still have that one 37-minute job. So increasing the number of parallel jobs from 6 to 12 does not help until we calibrate Knapsack. I've updated the proposal on how to proceed with the calibration.

Proposal

We investigated how to proceed as part of Spike: Increase tests parallelization (!4081 - closed). Steps:

We have to properly configure Knapsack by generating a Knapsack report and committing it to the project repo, as explained in the gem docs: https://github.com/KnapsackPro/knapsack?tab=readme-ov-file#gitlab-ci
The knapsack report can only be generated when we're using only one node (parallel: 1).
When running on 1 node our tests take longer than 3 hours.
The project's pipeline timeout is set to 2 hours by default. We have to raise it to something like 4 hours, which any chart maintainer can do.
The runners have a timeout of 3 hours. We can't easily tweak this, so we need to create a specific runner for this project with a timeout of 4 hours.
- We might also need a temporary tag so that this runner only picks up jobs from the specific MR we'll be working on. Let's call it the knapsack_experiment tag for the sake of this example.
After configuring the runner, we need to push a temporary change to the chart that reduces the nodes to 1 (parallel: 1) and publishes the Knapsack report as a job artifact, so that we can add it to our project.
We have to remove the report file name from the .gitignore.
Finally, bring our nodes back to 6 (parallel: 6).
- With the calibration in place, we could actually double the nodes, which is expected to cut the average time roughly in half.
Revert the project timeout settings.
Decommission the temporary runner, and remove any temporary tags.

Example change:

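specs_without_cluster:
  extends: .specs
  variables:
    RSPEC_TAGS: ~type:feature
    # Temporary: make Knapsack record per-file timings into the report file.
    KNAPSACK_GENERATE_REPORT: "true"
    KNAPSACK_REPORT_PATH: knapsack_rspec_report.json
  # Temporary: the report can only be generated on a single node.
  parallel: 1
  rules:
    - !reference [.specs, rules]
    - if: '$PIPELINE_TYPE == "AUTO_DEPLOY_PIPELINE"'
    - if: '$PIPELINE_TYPE == "RELEASE_PIPELINE"'
  needs: ['lint_package']
  artifacts:
    paths:
      # Expose the generated report so it can be downloaded and committed.
      - knapsack_rspec_report.json
  tags:
    # Temporary: route the job to the dedicated runner with the longer timeout.
    - knapsack_experiment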
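
Once the report artifact is downloaded, a quick check like the following could tell us whether doubling the nodes is likely to pay off. This is a rough sketch that assumes the report is a flat JSON object mapping spec file paths to durations in seconds. The sample values are hypothetical, and `lower_bound_seconds` is an illustrative helper, not part of the Knapsack gem.

```python
import json


def lower_bound_seconds(report, nodes):
    """Best-case duration of the slowest node for a given node count.

    No split can beat the even share of the total time, and no split can
    finish faster than the single longest spec file.
    """
    total = sum(report.values())
    longest = max(report.values())
    return max(total / nodes, longest)


# Hypothetical report contents; real paths and timings come from the artifact,
# e.g. json.load(open("knapsack_rspec_report.json")).
sample = json.loads(
    '{"spec/a_spec.rb": 1800.0, "spec/b_spec.rb": 2400.0, "spec/c_spec.rb": 7200.0}'
)

for nodes in (1, 6, 12):
    print(nodes, "node(s):", lower_bound_seconds(sample, nodes), "seconds at best")
```

If the bound stops shrinking between 6 and 12 nodes, a single long spec file dominates, and splitting that file would help more than adding nodes.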

Edited Jan 24, 2025 by João Alexandre Cunha