How to merge: the overlooked key to a good git strategy
By Marco W. Soijer
Git strategies (or branching strategies) are subject to heated debate. GitFlow, GitHub flow, GitLab flow, and OneFlow all have the objective of creating a simpler, more effective and more sustainable way to handle your commits and merges, or at least a better one than the other strategies out there. They all use the same tool, git, and have the same objective, yet each claims to arrive at a different solution. In the end, the differences are in how many main branches you work with (for example, whether there are development or release branches) and how often you commit. The key words here are trunk-based development with continuous integration on the one hand, and feature-based development on the other.
Why they are all the same
Branches, however, are a means to an end. When the development of a single feature or set of features takes several steps of coding, integration, and testing, you need a branch to keep them organised and to avoid interfering with other developers or teams who are working on something else. It does not matter how large or mature your organisation is. If you work sequentially, or happen to do this one feature sequentially, there is no absolute need for a branch, no matter how big your team is. If you are working on a fix, a performance improvement and a new feature at the same time, even if doing so by yourself, it is worth keeping them separated on different branches.
You commit your code when you have a status worth preserving. That may be because there are test results linked to it, because it was released, or because you use git to share work-in-progress among team members. That may be often, or not. You should not create a branch just because some dogmatic strategy tells you to. Nor should you commit code, or avoid doing so, because a theory tells you that it's time, or forbids it.
The strategy doesn't matter. Create branches and name them as you need them. Make commits as you need them. And merge branches when you don't need them any longer. The most promising way to a repository that is as simple as possible, and therefore more effective and sustainable, is to create just what you need, and nothing more. As a developer, you can figure that out. If you can code it, you know where it belongs.
Which commits to maintain
The reason for making a commit implies whether it is worthwhile to keep in the long run. A work-in-progress commit, maybe even one with a simple typo that was corrected in the next commit, is of no use three years from now. Commits with code that was released, passed a specific test, or proves a certain concept, are worth keeping — and investing some time to ensure you will actually find them when you need them.
OneFlow specifically criticises GitFlow's advice to merge without fast-forward, because it keeps all commits and thus makes the history completely unreadable. That's a fair point. But how do you actually do your merges, in order to filter out those work-in-progress commits? OneFlow itself gives three alternatives, one of which is actually the GitFlow one, as the author notes himself.
So here's what we want to achieve:
- Create a branch whenever you need it, and commit to it whenever it makes sense.
- Merge a code state from a branch into another one, while controlling — hence: reducing — the number of commits on the target branch.
- Resolve merge conflicts in a timely and controlled manner.
- Keep only one long-term branch, which does not get cluttered.
There is some manual work involved here. No tool will be able to decide for you what is of long-term value and what is not. No tool will properly resolve merge conflicts all the time (yes, we have re-introduced some bugs that were already fixed in our main branch, because of bad merging). And no automatic merging will protect you against unexpected merge conflicts.
Merging for sustainable branches
Having left the illusion of automatic merging behind, this is the branching strategy that works for us. With OneFlow, it shares the idea of having a single main branch (no matter whether you call it main, master, or anything else), but it adds a decisive rule on how to do merges, which is the key to achieving the objectives listed above.
Starting a new branch is done as always. Create a new branch feature as a spin-off from main, and check it out at the same time:
    git checkout -b feature main
You can have as many branches in parallel as needed. But they all originate from the main branch and when they are no longer needed, they merge back into the main branch.
Add commits to the new branch, push it to an upstream repository to share it with other developers, and pull updates however you like. After a while, there will be a series of commits on feature, the latest of which is called f4 here, and probably something will have happened to main too, say through the commits m1 and m2.
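To make this situation concrete, the following sketch builds a throwaway repository of that shape. It is only an illustration: the file names, the commit messages and the starting commit m0 are made up, and git symbolic-ref is used merely to make sure the initial branch is called main whatever your git defaults are.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the initial branch "main"
git config user.name "Demo"
git config user.email "demo@example.com"

# m0: the common starting point of both branches.
echo "base" > app.txt
git add app.txt
git commit -q -m "m0: initial state"

# Four work-in-progress commits f1..f4 on the feature branch.
git checkout -q -b feature main
for i in 1 2 3 4; do
    echo "feature step $i" >> feature.txt
    git add feature.txt
    git commit -q -m "f$i: work in progress"
done

# Meanwhile, two commits m1 and m2 land on main.
git checkout -q main
for i in 1 2; do
    echo "main change $i" >> app.txt
    git commit -q -am "m$i: change on main"
done

# Show both lines of history side by side.
git log --oneline --graph --all
```

The graph printed at the end shows the two branches diverging after m0, which is the starting point for the merge described next.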
When the head of feature contains the mature increment that should be merged with the main branch, the commits on the feature branch are no longer needed. This includes the latest commit f4 itself; in fact, you can use any of the intermediate stages on the branch, or even a set of uncommitted changes. All that matters is that the working tree reflects the status you want to merge into main.
Convert the current working tree into a single set of changes with respect to the last commit of main by applying a soft reset:
    git reset --soft main
The working tree will remain unchanged, but all differences between the head of main and the current working tree will be squashed into a single set of uncommitted changes, which git keeps staged in the index.
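A minimal sketch of what the soft reset leaves behind, under made-up assumptions: a tiny repository with one feature commit f1 and one main commit m1, and hypothetical file names app.txt and feature.txt.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "m0: initial state"

git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "f1: work in progress"

git checkout -q main
echo "main change" >> app.txt
git commit -q -am "m1: change on main"

# The step from the text: on feature, soft-reset to the head of main.
git checkout -q feature
git reset --soft main

# Everything that differs from main now shows up as a staged change:
#   M  app.txt      (the feature side lacks m1, so it appears to undo it)
#   A  feature.txt  (the actual new work)
git status --short
```

Two things are worth noticing. The change to app.txt is the reversal of m1, which is why committing everything blindly would undo work on main. And the branch name feature now points at the head of main; the old feature commits are only reachable through the reflog, ORIG_HEAD, or a copy pushed earlier.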
Switch to the main branch:
    git switch main
This is where the work starts. If you simply commit all changes, you will revert the changes made to the main branch through the commits m1 and m2. So you must filter those out. At Circle Networks, we always have a code review at this point, where the authors of the current feature branch and the ones who have done the last merge (the one that put m2 onto the main branch) sit together. We go through all differences, discuss them, identify any conflicts arising from the changes, and decide which ones to keep and which ones to discard.
Note that this is something well beyond automatic merging or conflict resolution. We are not just talking about lines or blocks of code that were touched on both sides: this is much more about interface changes and assumptions on how pieces of code behave, which may have changed. It makes developers aware of what was changed, and why. Very often during the review, we find little bugs which an automatic merge would have introduced, and we correct them on the fly (remember: it is only the current working tree that counts, so you can change whatever you like during the merge process). Occasionally, such a merge review may fail and may need to be repeated.
Selective staging is your friend in this process. Since the soft reset leaves all changes staged, unstage them first (for example with git restore --staged .). Then go through all changes (at code block or even line level), see whether they belong to the new increment from feature, and, if everyone feels comfortable, stage them. At the end, you commit the new increment to main in the usual manner.
The uncommitted changes that remain are the reversal of the changes made on main through m1 and m2. They can, and should, all be discarded:
    git reset --hard
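The whole sequence can be sketched end to end as follows. This is an illustration under assumptions, not a record of our actual workflow: the repository, file names and commit messages are made up, the selection is done per file with git add so that the script runs unattended, and the unstaging step with git restore --staged is our reading of what is needed after a soft reset.

```shell
#!/bin/sh
set -eu

# Throwaway repository (hypothetical example data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "m0: initial state"

git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "f1: work in progress"

git checkout -q main
echo "main change" >> app.txt
git commit -q -am "m1: change on main"

# 1. Squash the feature work into one set of changes relative to main.
git checkout -q feature
git reset --soft main
git switch -q main

# 2. The soft reset left everything staged; unstage it to start the review.
git restore --staged .

# 3. Stage only what belongs to the new increment and commit it.
git add feature.txt
git commit -q -m "Add the new feature"

# 4. What is left is the reversal of m1; discard it.
git reset -q --hard

git log --oneline main
cat app.txt
```

In a real review you would more likely pick individual hunks with git add -p or a graphical client. Note that git add -p skips untracked files, so new files have to be added as a whole or announced with git add -N first.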
The selective-staging merge can be done as often as you want, in order to merge intermediate but mature stages of development on the feature branch into the main branch, and to keep the number of changes controllable.
When you no longer need the branch feature, you can delete it locally and remotely:
    git branch -d feature
    git push origin --delete feature
    git remote prune origin
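The clean-up can be tried out safely against a local bare repository that stands in for the remote. In this sketch the paths and the single commit are hypothetical, and feature is deliberately created without an upstream.

```shell
#!/bin/sh
set -eu

# A bare repository plays the role of "origin" (hypothetical paths).
work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
git remote add origin "$work/origin.git"

echo "base" > app.txt
git add app.txt
git commit -q -m "m0: initial state"
git push -q -u origin main

# A feature branch that exists locally and on the remote.
git branch feature
git push -q origin feature

# The clean-up from the text.
git branch -d feature               # local branch
git push -q origin --delete feature # branch on the remote
git remote prune origin             # stale remote-tracking refs, if any

git branch --all
```

Two hedged remarks. The push with --delete already removes the local remote-tracking ref, so pruning mainly matters in other clones, or when someone else deleted the branch. And according to the git-branch documentation, -d requires the branch to be fully merged into its upstream, or into HEAD if no upstream is set; a branch that tracks origin/feature and was soft-reset as described may therefore be refused, in which case -D forces the deletion.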
You now have a single, clean commit on main for the new feature, which properly acknowledges intermediate changes on the main branch.