[go: up one dir, main page]

Skip to content

bundle: fix non-linear performance scaling with refs

The 'git bundle create' command has non-linear performance with the number of refs in the repository. Benchmarking the command shows that a large portion of the time (~75%) is spent in the object_array_remove_duplicates() function.

The object_array_remove_duplicates() function was added in b2a6d1c6 (bundle: allow the same ref to be given more than once, 2009-01-17) to skip duplicate refs provided by the user from being written to the bundle. Since this is an O(N^2) algorithm, in repos with large number of references, this can take up a large amount of time.

Let's instead use a 'strset' to skip duplicates inside write_bundle_refs(). This improves the performance by around 6 times when tested against in a repository with 100000 refs:

Benchmark 1: bundle (refcount = 100000, revision = master) Time (mean ± σ): 14.653 s ± 0.203 s [User: 13.940 s, System: 0.762 s] Range (min … max): 14.237 s … 14.920 s 10 runs

Benchmark 2: bundle (refcount = 100000, revision = HEAD) Time (mean ± σ): 2.394 s ± 0.023 s [User: 1.684 s, System: 0.798 s] Range (min … max): 2.364 s … 2.425 s 10 runs

Summary bundle (refcount = 100000, revision = HEAD) ran 6.12 ± 0.10 times faster than bundle (refcount = 100000, revision = master)

Closes #488 (closed)

Edited by Karthik Nayak

Merge request reports

Loading