fsck new objects before committing a transaction
Once #5770 (closed) has landed, Gitaly's TransactionManager
guarantees connectivity of new objects introduced within a transaction. We're still not checking the pushed objects are valid. We should fsck
objects to ensure they are valid. Some points:
- Gitaly already walks the new reference tips, and feeds them to
git pack-objects
. This ensures the objects are connected and that they are readable.- Missing objects are treated as transaction's dependencies and are verified to be present in the repository before committing.
- Loose objects are written to the quarantine by Gitaly, and Gitaly shouldn't be writing invalid objects. Hash mismatches shouldn't be possible.
- The untrusted data in Gitaly arrives in packfiles via pushes and fetches.
git fsck
has a limitation where it will only check pack files with the --full
option. This also leads to checking alternate object directories, which we don't want to do. We'd only want to check the quarantine. Further, checking for connectivity is unnecessary as we've already walked the object graph.
receive-pack
uses index-pack --strict
to fsck
objects but --strict
also runs connectivity checks which we don't want to run. index-pack --fsck-objects
on the other hand doesn't seemingly allow configuring which fsck
check to skip as we do here.
Perhaps the best option is to extend index-pack --fsck-objects
to allow configuring the which errors to demote to warnings. That way we only end up checking objects that would ultimately be committed and nothing more.