Squashed stacked PRs workflow
Recall that the “stacked PRs” Git workflow deals with a set of changes (say `C` depends on `B`, which itself depends on `A`), each dependent on the last, and all going into some base branch which I’ll call `main` for the sake of this note.
The workflow represents this set of changes as a collection of pull requests: `A` into `main`, `B` into `A`, and `C` into `B`.
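For concreteness, here is a minimal sketch of such a stack in a throwaway repository. The branch names come from this note; the file names, commit messages and committer identity are made up for illustration, and the pull requests themselves live on whatever forge you use, so they are not modelled here.

```shell
set -eu

# Throwaway repository; everything below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch "main" on any Git version
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# A builds on main, B builds on A, C builds on B.
git checkout -q -b A main
echo a > a.txt
git add a.txt
git commit -q -m "A: add a.txt"

git checkout -q -b B A
echo b > b.txt
git add b.txt
git commit -q -m "B: add b.txt"

git checkout -q -b C B
echo c > c.txt
git add c.txt
git commit -q -m "C: add c.txt"

# Show the resulting chain of branches.
git log --oneline --graph --all
```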
Problem statement
The stacked PRs workflow is fine as long as we merge each pull request into its target, because then Git’s standard merge algorithms are “sufficiently associative” that the sequence of merges tends to do the right thing. (Of course, Git’s standard merge algorithms are not associative; see the Pijul manual for concrete examples and discussion of why this is inherently true.)
But if we squash each pull request into its target, then the only way we can merge the entire stack is to merge `C` into `B`, then `B+C` into `A`, then `A+B+C` into `main`.
Any other order, and the rewrite of history in the squash causes the computed merge base of our source and target to be very different from what we actually know it is, and this almost always causes the merge to become wildly conflicted.
For example, if we squash-merge `A` into `main` (which for the sake of argument should be a fast-forward merge, except that we’ve squashed), then we construct a new commit `squash(A)` whose tree is the same as `A` and which has the parent `main`; then we set `main` to point to `squash(A)`.
The merge base of `B` with `squash(A)` would be simply `A` if we hadn’t squashed, but `A` is no longer in the history of `squash(A)`, so the merge base is actually `main^` (i.e. `main` as it was before the squash-merge); and the merge of `squash(A)` and `B` with a base of `main^` is liable to be gruesome.
So we can’t cleanly merge `B` into `main = squash(A)`.
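The change in merge base can be observed directly. The following is a sketch in a throwaway repository (file names and commit messages are invented for illustration); it only shows where the merge base ends up, and this tiny example is not itself conflicted.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b A main
echo a > a.txt
git add a.txt
git commit -q -m "A"

git checkout -q -b B A
echo b > b.txt
git add b.txt
git commit -q -m "B"

# Squash-merge A into main: stage A's changes, then commit them as one new commit.
git checkout -q main
git merge -q --squash A
git commit -q -m "squash(A)"

echo "merge base of A and B (the tip of A):"
git merge-base A B
echo "merge base of main and B (main as it was before the squash, i.e. main^):"
git merge-base main B

# squash(A) nevertheless has exactly the tree of A.
git diff --quiet main A && echo "squash(A) and A have identical trees"
```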
The clean problem statement, then, is:
How do I squash-merge the stack in the order `A -> main`, `B -> main + A`, `C -> main + A + B`, without having to resolve conflicts at each step?
Solution
Since we’re squashing into `main` anyway, we should feel free to make a complete mess of history on our branches.
- Squash-merge `A` into `main`.
- Merge into `B` the `A` commit that’s immediately before the squash into `main`. (This should be clean unless you made changes to `A` which genuinely conflicted with `B`, so this is work you should really have done in preparation for the review of `B` anyway.)
- Fetch `origin/main` locally, and merge into `B` the `main` commit that’s immediately before the squash of `A`. (This should be clean if you’ve been hygienically keeping your branches up to date with `main` by merging `main -> A`, `A -> B`, `B -> C`. If it’s not clean, again this is work that you would have to do anyway even in a non-squashing world.)
Now `B` is up to date both with `main` and `A` as of immediately before `A` was squashed into `main`, so it should be the case that merging `main + A` into `B` would be a no-op: it should not change the tree of `B`.
However, we aren’t merging `main + A`.
We’re instead merging the squashed `main + squash(A)` for some single commit `squash(A)` which Git thinks is completely unrelated to `A`, but which in fact has the same tree as `A`.
So the last step is:
- Merge the squashed `main + squash(A)` commit into `B` with the `ours` strategy: `git merge $COMMIT_HASH --strategy=ours`. That is: since we know `B`’s got the right tree, but its history is woefully incompatible with `main + squash(A)`’s history, we just do a dummy no-op merge to make their histories compatible again.
(Then merge this back up the stack, by merging the new `B` into `C`.)
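The whole procedure can be sketched end to end in a throwaway repository. Assumptions here: everything is local, so the local `main` stands in for `origin/main` and for `$COMMIT_HASH`; the two "immediately before the squash" commits are simply recorded in the hypothetical variables `a_before` and `main_before` before squashing; and the file contents are invented. In this toy repository, merging `main` straight into `B` after the squash should hit an add/add conflict on `a.txt`, which is the situation the procedure avoids.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b A main
echo a1 > a.txt
git add a.txt
git commit -q -m "A: add a.txt"

git checkout -q -b B A
echo b1 >> a.txt                    # B edits the file that A introduced
git commit -q -am "B: extend a.txt"

git checkout -q -b C B
echo c > c.txt
git add c.txt
git commit -q -m "C: add c.txt"

# Record both sides of the squash before it happens.
a_before=$(git rev-parse A)
main_before=$(git rev-parse main)

# Step 1: squash-merge A into main.
git checkout -q main
git merge -q --squash A
git commit -q -m "squash(A)"

# Steps 2 and 3: bring B up to date with the pre-squash A and the pre-squash main.
# (Both are no-ops in this toy repository, because B already contains them.)
git checkout -q B
git merge -q --no-edit "$a_before"
git merge -q --no-edit "$main_before"
tree_before=$(git rev-parse 'HEAD^{tree}')

# Step 4: dummy merge that records squash(A) in B's history and keeps B's tree.
git merge -q --no-edit --strategy=ours main
test "$(git rev-parse 'HEAD^{tree}')" = "$tree_before"

# Merge the new B back up the stack into C.
git checkout -q C
git merge -q --no-edit B

# Payoff: B now squash-merges into main without conflicts.
git checkout -q main
git merge -q --squash B
git commit -q -m "squash(B)"
```

When the squash is performed by a forge rather than locally, the two pre-squash commits have to be looked up instead of recorded in advance; how best to do that depends on the setup and is not covered by this sketch.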
The state after performing this procedure
- `A` has been squashed into `main`.
- `B`’s tree is as if `A` were merged into `main` and then the resulting `main + A` were merged into `B`.
- `B`’s history contains the squashed `main + squash(A)`, so subsequent merges of `main` into `B` or `B` into `main` will be clean.
- `B`’s history looks a bit mad, but we shrug and move on.
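One way to check the history claim above is to ask Git whether the tip of the target branch is already an ancestor of the stacked branch. The helper name `target_is_in_history` is hypothetical; it is a thin wrapper around `git merge-base --is-ancestor`, and should be run inside the repository in question.

```shell
# Succeeds (exit status 0) when the tip of $1 (e.g. main) is already
# contained in the history of $2 (e.g. B).
target_is_in_history() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example use, after the procedure has been carried out:
#   target_is_in_history origin/main B && echo "B contains the squashed main"
```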