[Spike] Test the plan for Intelligent Reviewer Assignment leveraging backtesting
Context
@francoisrose commented on the epic for this effort:
IMO it's possible to start prototyping some ideas for that logic; this can be done in any language really, and even outside the monolith, as a spike. We could use backtesting and (see the sketch after this list):
- Create a small dataset of past MRs, their author, reviewers, and any other data that is included in the algo (like a list of eligible reviewers).
- Write an algo as a simple script that outputs a ranked list of reviewers for each MR in the dataset.
- Loop:
  - Run the script against all MRs.
  - Check in which position the actual reviewers (the ones picked by humans) end up in the output.
    - The higher they rank (closer to the top), the better.
  - Edit the algo and repeat.
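A minimal sketch of this backtesting loop, assuming a local `mrs.json` dataset with per-MR fields such as `author`, `reviewers`, `eligible_reviewers`, `files_touched`, and a per-candidate `files_by_user` map. All field names and the scoring heuristic are illustrative placeholders, not the algorithm we would ship:

```python
# Toy backtesting harness: rank reviewers per MR, then check where the
# human-picked reviewers land in the ranking. Schema and scoring are assumptions.
import json
from statistics import mean

def rank_reviewers(mr: dict) -> list[str]:
    """Placeholder ranking: sort eligible reviewers by file overlap with the MR.
    This body is the part to edit on each iteration of the loop."""
    scores = {}
    for candidate in mr["eligible_reviewers"]:
        if candidate == mr["author"]:
            continue  # the author doesn't review their own MR
        overlap = len(set(mr.get("files_touched", [])) &
                      set(mr.get("files_by_user", {}).get(candidate, [])))
        scores[candidate] = overlap
    return sorted(scores, key=scores.get, reverse=True)

def backtest(mrs: list[dict]) -> None:
    positions = []  # 1-based position of each actual reviewer in our ranking
    for mr in mrs:
        ranking = rank_reviewers(mr)
        for actual in mr["reviewers"]:
            if actual in ranking:
                positions.append(ranking.index(actual) + 1)
    if not positions:
        print("no actual reviewers appeared in any ranking")
        return
    print(f"actual reviewers ranked: {len(positions)}")
    print(f"mean position: {mean(positions):.2f}")
    print(f"in top 3: {sum(1 for p in positions if p <= 3)}/{len(positions)}")

if __name__ == "__main__":
    with open("mrs.json") as f:
        backtest(json.load(f))
```

Each iteration of the loop then amounts to editing `rank_reviewers` and re-running the script against the same dataset.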
Goal
Extract real-world data from MRs and their reviewers, and attempt to apply the concepts we discussed in the document: https://docs.google.com/document/d/1Hb3OXmeUzVFgfEuW3kTISTuiFBQ0k82XIlkDTLK1Tyg/edit?tab=t.0#heading=h.qen8y9ys36x8
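As a starting point for the data extraction, something along these lines could pull the raw MR records with python-gitlab. The project path, token handling, and field choices are assumptions; eligible reviewers and file-ownership data would still have to be joined in from other sources:

```python
# Rough sketch of building a small MR dataset with python-gitlab. Field names
# follow the public REST API, where merged MRs expose `author` and `reviewers`.
import json
import gitlab

gl = gitlab.Gitlab("https://gitlab.com", private_token="YOUR_TOKEN")  # placeholder token
project = gl.projects.get("gitlab-org/gitlab")  # example project path

records = []
for mr in project.mergerequests.list(state="merged", order_by="updated_at",
                                     per_page=100, get_all=False):
    records.append({
        "iid": mr.iid,
        "author": mr.author["username"],
        "reviewers": [r["username"] for r in mr.reviewers],
        "labels": mr.labels,  # may feed domain-expertise signals later
    })

with open("mrs.json", "w") as f:
    json.dump(records, f, indent=2)
```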
Expected outcome
A report on how the variables could be used to closely approximate the real-world decisions around reviewers, along with conclusions on whether the variables we're considering were confirmed or dispelled when confronted with real-world data.
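One concrete way the report could quantify this proximity is with mean reciprocal rank (MRR) and hit@k over the backtest output; these metric choices are a suggestion, not something agreed in the epic:

```python
# Possible proximity metrics: MRR and hit@k over (ranking, actual reviewers)
# pairs produced by the backtest. Metric choices are a suggestion only.
def mrr(results: list[tuple[list[str], list[str]]]) -> float:
    """results: one (ranked_candidates, actual_reviewers) pair per MR."""
    reciprocal_ranks = []
    for ranking, actual in results:
        ranks = [ranking.index(a) + 1 for a in actual if a in ranking]
        reciprocal_ranks.append(1.0 / min(ranks) if ranks else 0.0)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

def hit_at_k(results, k: int = 3) -> float:
    """Fraction of MRs where at least one actual reviewer is in the top k."""
    hits = sum(1 for ranking, actual in results
               if any(a in ranking[:k] for a in actual))
    return hits / len(results)

# Example: two MRs, actual reviewer ranked first in one and third in the other.
sample = [(["alice", "bob", "carol"], ["alice"]),
          (["dave", "erin", "frank"], ["frank"])]
print(mrr(sample))          # (1/1 + 1/3) / 2 ≈ 0.67
print(hit_at_k(sample, 3))  # 1.0
```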
Components of research for 18.1
- More directed towards a final delivery of an algorithm (thomas, existing issue) w3
- Backend spike (with BE) to think of which systems would be built (needs issue) w2 / w2