diff --git a/Documentation/technical/remembering-renames.adoc b/Documentation/technical/remembering-renames.adoc index 73f41761e2090a467dc0cade4888825d4606a7a7..d1f6c9b6a80cb0239251f1640ba403449f0194a5 100644 --- a/Documentation/technical/remembering-renames.adoc +++ b/Documentation/technical/remembering-renames.adoc @@ -10,32 +10,32 @@ history as an optimization, assuming all merges are automatic and clean Outline: - 0. Assumptions + 1. Assumptions - 1. How rebasing and cherry-picking work + 2. How rebasing and cherry-picking work - 2. Why the renames on MERGE_SIDE1 in any given pick are *always* a + 3. Why the renames on MERGE_SIDE1 in any given pick are *always* a superset of the renames on MERGE_SIDE1 for the next pick. - 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also + 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ always also a rename on MERGE_SIDE1 for the next pick - 4. A detailed description of the counter-examples to #3. + 5. A detailed description of the counter-examples to #4. - 5. Why the special cases in #4 are still fully reasonable to use to pair + 6. Why the special cases in #5 are still fully reasonable to use to pair up files for three-way content merging in the merge machinery, and why they do not affect the correctness of the merge. - 6. Interaction with skipping of "irrelevant" renames + 7. Interaction with skipping of "irrelevant" renames - 7. Additional items that need to be cached + 8. Additional items that need to be cached - 8. How directory rename detection interacts with the above and why this + 9. How directory rename detection interacts with the above and why this optimization is still safe even if merge.directoryRenames is set to "true". -=== 0. Assumptions === +== 1. Assumptions == There are two assumptions that will hold throughout this document: @@ -44,8 +44,8 @@ There are two assumptions that will hold throughout this document: * All merges are fully automatic -and a third that will hold in sections 2-5 for simplicity, that I'll later -address in section 8: +and a third that will hold in sections 3-6 for simplicity, that I'll later +address in section 9: * No directory renames occur @@ -77,9 +77,9 @@ conflicts that the user needs to resolve), the cache of renames is not stored on disk, and thus is thrown away as soon as the rebase or cherry pick stops for the user to resolve the operation. -The third assumption makes sections 2-5 simpler, and allows people to +The third assumption makes sections 3-6 simpler, and allows people to understand the basics of why this optimization is safe and effective, and -then I can go back and address the specifics in section 8. It is probably +then I can go back and address the specifics in section 9. It is probably also worth noting that if directory renames do occur, then the default of merge.directoryRenames being set to "conflict" means that the operation will stop for users to resolve the conflicts and the cache will be thrown @@ -88,10 +88,10 @@ reason we need to address directory renames specifically, is that some users will have set merge.directoryRenames to "true" to allow the merges to continue to proceed automatically. The optimization is still safe with this config setting, but we have to discuss a few more cases to show why; -this discussion is deferred until section 8. +this discussion is deferred until section 9. -=== 1. How rebasing and cherry-picking work === +== 2. How rebasing and cherry-picking work == Consider the following setup (from the git-rebase manpage): @@ -138,8 +138,8 @@ Conceptually the two statements above are the same as a three-way merge of B, B', and C, at least the parts before you decide to record a commit. -=== 2. Why the renames on MERGE_SIDE1 in any given pick are always a === -=== superset of the renames on MERGE_SIDE1 for the next pick. === +== 3. Why the renames on MERGE_SIDE1 in any given pick are always a == +== superset of the renames on MERGE_SIDE1 for the next pick. == The merge machinery uses the filenames it is fed from MERGE_BASE, MERGE_SIDE1, and MERGE_SIDE2. It will only move content to a different @@ -181,8 +181,8 @@ are a subset of those between E and G. Equivalently, all renames between E and G are a superset of those between A and A'. -=== 3. Why any rename on MERGE_SIDE1 in any given pick is _almost_ === -=== always also a rename on MERGE_SIDE1 for the next pick. === +== 4. Why any rename on MERGE_SIDE1 in any given pick is _almost_ == +== always also a rename on MERGE_SIDE1 for the next pick. == Let's again look at the first two picks: @@ -254,15 +254,15 @@ were detected as renames, A:oldfile and A':newfile should also be detectable as renames almost always. -=== 4. A detailed description of the counter-examples to #3. === +== 5. A detailed description of the counter-examples to #4. == -We already noted in section 3 that rename/rename(1to1) (i.e. both sides +We already noted in section 4 that rename/rename(1to1) (i.e. both sides renaming a file the same way) was one counter-example. The more interesting bit, though, is why did we need to use the "almost" qualifier when stating that A:oldfile and A':newfile are "almost" always detectable as renames? -Let's repeat an earlier point that section 3 made: +Let's repeat an earlier point that section 4 made: A':newfile was created by applying the changes between E:oldfile and G:newfile to A:oldfile. The changes between E:oldfile and G:newfile were @@ -285,9 +285,9 @@ E:oldfile -> G:newfile would be detected as a rename, but A:oldfile and A':newfile would not be. -=== 5. Why the special cases in #4 are still fully reasonable to use to === -=== pair up files for three-way content merging in the merge machinery, === -=== and why they do not affect the correctness of the merge. === +== 6. Why the special cases in #5 are still fully reasonable to use to == +== pair up files for three-way content merging in the merge machinery, == +== and why they do not affect the correctness of the merge. == In the rename/rename(1to1) case, A:newfile and A':newfile are not renames since they use the *same* filename. However, files with the same filename @@ -295,14 +295,14 @@ are obviously fine to pair up for three-way content merging (the merge machinery has never employed break detection). The interesting counter-example case is thus not the rename/rename(1to1) case, but the case where A did not rename oldfile. That was the case that we spent most of -the time discussing in sections 3 and 4. The remainder of this section +the time discussing in sections 4 and 5. The remainder of this section will be devoted to that case as well. So, even if A:oldfile and A':newfile aren't detectable as renames, why is it still reasonable to pair them up for three-way content merging in the merge machinery? There are multiple reasons: - * As noted in sections 3 and 4, the diff between A:oldfile and A':newfile + * As noted in sections 4 and 5, the diff between A:oldfile and A':newfile is *exactly* the same as the diff between E:oldfile and G:newfile. The latter pair were detected as renames, so it seems unlikely to surprise users for us to treat A:oldfile and A':newfile as renames. @@ -394,7 +394,7 @@ cases 1 and 3 seem to provide as good or better behavior with the optimization than without. -=== 6. Interaction with skipping of "irrelevant" renames === +== 7. Interaction with skipping of "irrelevant" renames == Previous optimizations involved skipping rename detection for paths considered to be "irrelevant". See for example the following commits: @@ -421,7 +421,7 @@ detection -- though we can limit it to the paths for which we have not already detected renames. -=== 7. Additional items that need to be cached === +== 8. Additional items that need to be cached == It turns out we have to cache more than just renames; we also cache: @@ -438,7 +438,7 @@ These are all stored in struct rename_info, and respectively appear in * merge_trees The reason for (A) comes from the irrelevant renames skipping -optimization discussed in section 6. The fact that irrelevant renames +optimization discussed in section 7. The fact that irrelevant renames are skipped means we only get a subset of the potential renames detected and subsequent commits may need to run rename detection on the upstream side on a subset of the remaining renames (to get the @@ -482,9 +482,9 @@ we store the trees to compare with what we are asked to merge next time. -=== 8. How directory rename detection interacts with the above and === -=== why this optimization is still safe even if === -=== merge.directoryRenames is set to "true". === +== 9. How directory rename detection interacts with the above and == +== why this optimization is still safe even if == +== merge.directoryRenames is set to "true". == As noted in the assumptions section: @@ -500,7 +500,7 @@ As noted in the assumptions section: Let's remember that we need to look at how any given pick affects the next one. So let's again use the first two picks from the diagram in section -one: +two: First pick does this three-way merge: MERGE_BASE: E diff --git a/Documentation/technical/sparse-checkout.adoc b/Documentation/technical/sparse-checkout.adoc index 0f750ef3e36120be8387a7200bf63dab43e279a3..0701c0f90d686f1045acf088040fa11bc6302511 100644 --- a/Documentation/technical/sparse-checkout.adoc +++ b/Documentation/technical/sparse-checkout.adoc @@ -14,7 +14,7 @@ Table of contents: * Reference Emails -=== Terminology === +== Terminology == cone mode: one of two modes for specifying the desired subset of files in a sparse-checkout. In cone-mode, the user specifies @@ -92,7 +92,7 @@ vivifying: When a command restores a tracked file to the working tree (and file), this is referred to as "vivifying" the file. -=== Purpose of sparse-checkouts === +== Purpose of sparse-checkouts == sparse-checkouts exist to allow users to work with a subset of their files. @@ -255,7 +255,7 @@ will perceive the checkout as dense, and commands should thus behave as if all files are present. -=== Usecases of primary concern === +== Usecases of primary concern == Most of the rest of this document will focus on Behavior A and Behavior B. Some notes about the other two cases and why we are not focusing on @@ -300,7 +300,7 @@ Behavior C do not assume they are part of the Behavior B camp and propose patches that break things for the real Behavior B folks. -=== Oversimplified mental models === +== Oversimplified mental models == An oversimplification of the differences in the above behaviors is: @@ -313,7 +313,7 @@ An oversimplification of the differences in the above behaviors is: they can later lazily be populated instead. -=== Desired behavior === +== Desired behavior == As noted previously, despite the simple idea of just working with a subset of files, there are a range of different behavioral changes that need to be @@ -542,7 +542,7 @@ understanding these differences can be beneficial. * gitk? -=== Behavior classes === +== Behavior classes == From the above there are a few classes of behavior: @@ -609,7 +609,7 @@ From the above there are a few classes of behavior: specification. -=== Subcommand-dependent defaults === +== Subcommand-dependent defaults == Note that we have different defaults depending on the command for the desired behavior : @@ -749,7 +749,7 @@ desired behavior : implemented. -=== Sparse specification vs. sparsity patterns === +== Sparse specification vs. sparsity patterns == In a well-behaved situation, the sparse specification is given directly by the $GIT_DIR/info/sparse-checkout file. However, it can transiently @@ -821,7 +821,7 @@ under behavior B index operations are lumped with history and tend to operate full-tree. -=== Implementation Questions === +== Implementation Questions == * Do the options --scope={sparse,all} sound good to others? Are there better options? @@ -892,7 +892,7 @@ operate full-tree. is seamless for them. -=== Implementation Goals/Plans === +== Implementation Goals/Plans == * Get buy-in on this document in general. @@ -920,15 +920,15 @@ operate full-tree. commands. IMPORTANT: make sure diff machinery changes don't mess with format-patch, fast-export, etc. -=== Known bugs === +== Known bugs == This list used to be a lot longer (see e.g. [1,2,3,4,5,6,7,8,9]), but we've been working on it. -0. Behavior A is not well supported in Git. (Behavior B didn't used to +1. Behavior A is not well supported in Git. (Behavior B didn't used to be either, but was the easier of the two to implement.) -1. am and apply: +2. am and apply: apply, without `--index` or `--cached`, relies on files being present in the working copy, and also writes to them unconditionally. As @@ -948,7 +948,7 @@ been working on it. files and then complain that those vivified files would be overwritten by merge. -2. reset --hard: +3. reset --hard: reset --hard provides confusing error message (works correctly, but misleads the user into believing it didn't): @@ -971,13 +971,13 @@ been working on it. `git reset --hard` DID remove addme from the index and the working tree, contrary to the error message, but in line with how reset --hard should behave. -3. read-tree +4. read-tree `read-tree` doesn't apply the 'SKIP_WORKTREE' bit to *any* of the entries it reads into the index, resulting in all your files suddenly appearing to be "deleted". -4. Checkout, restore: +5. Checkout, restore: These command do not handle path & revision arguments appropriately: @@ -1030,7 +1030,7 @@ been working on it. S tracked H tracked-but-maybe-skipped -5. checkout and restore --staged, continued: +6. checkout and restore --staged, continued: These commands do not correctly scope operations to the sparse specification, and make it worse by not setting important SKIP_WORKTREE @@ -1046,11 +1046,11 @@ been working on it. the sparse specification, but then it will be important to set the SKIP_WORKTREE bits appropriately. -6. Performance issues; see: +7. Performance issues; see: https://lore.kernel.org/git/CABPp-BEkJQoKZsQGCYioyga_uoDQ6iBeW+FKr8JhyuuTMK1RDw@mail.gmail.com/ -=== Reference Emails === +== Reference Emails == Emails that detail various bugs we've had in sparse-checkout: