TIL: git and github diff Differently
My team switched over to the SkullCandy git workflow last spring, and we did not make a new develop branch for a long time, as deleting the branch on github automatically closes any open pull requests on that branch as well.
So, this week we ripped the band-aid off and remastered develop.
It’s been painful.
I was hoping a pull request from the develop branch into the master branch would tell us the commits on develop that are not in master, so we could sort out the differences.
That pull request did not tell us anything. In fact, it revealed a disturbing fact: changes that I thought were in both branches were not there. How is that so??
I ran experiments to see what’s going on. You can see it here.
Replication
This is what I replicated on the repository, which is the workflow used for SkullCandy:
- start a master branch.
- create a develop branch by cloning the master branch.
- when starting a new feature, clone off the develop branch.
- when ready to merge a change into develop, make a pull request into it.
- after the change is in develop and validated, cherry-pick the commit from the branch into master and make a new pull request.
- done.
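The steps above can be sketched locally with plain git commands. This is a minimal sketch under stated assumptions, not the repository's actual history: the file name notes.txt, the branch names feature1 and feature1-to-master, and the commit identity are all hypothetical, and the local merges stand in for pull requests merged on github with the merge-commit strategy.

```shell
#!/bin/sh
# Sketch of the workflow in a throwaway repository.
# Hypothetical names: notes.txt, feature1, feature1-to-master, the Example identity.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master on any git version
git config user.name "Example"
git config user.email "example@example.com"

# start a master branch
echo "base" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# create a develop branch from master
git branch develop

# start a new feature off develop
git checkout -q -b feature1 develop
echo "feature work" >> notes.txt
git commit -q -am "add feature work"

# merge the change into develop (local stand-in for the pull request)
git checkout -q develop
git merge -q --no-ff -m "merge feature1 into develop" feature1

# cherry-pick the commit onto a branch off master, then merge that branch
# into master (local stand-in for the second pull request)
git checkout -q -b feature1-to-master master
git cherry-pick feature1 >/dev/null
git checkout -q master
git merge -q --no-ff -m "merge feature1 into master" feature1-to-master

# show the resulting history of all branches
git --no-pager log --oneline --graph --all
```

After it runs, master and develop hold the same file contents, even though the change reached each branch through a different commit.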
After experiments with different merge strategies (merge commit, squash commit, rebase commit), I started to notice: on github, a change on the master branch was treated as the same change if and only if its commit SHA matched.
I then checked locally the difference between master and the corresponding develop and feature branches.
Example: develop3 and master
Let’s go through an example from the repository:
The master branch has all the work, and its file contents are:
The branch which also has the same work, develop3, has the same file, and its contents are:
Locally
Doing a git diff on the command line produces:
On github
When making a Pull Request on github.com, the result is:
which is pretty much as if the work never existed, but is there!
https://github.com/a-leung/commit_tests/compare/master...develop3?expand=1
Why does this matter?
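It’s important because there are differences between git and github. I can’t trust github to be consistent with git, even for a simple change, if the SHAs do not match.
git can recognize the same code appearing under different SHAs; github relies on the SHA to compute differences between branches.
The reason for the difference? A local git diff compares the file contents at the tips of the two branches, so identical code shows no difference. A github pull request lists the commits on one branch whose SHAs are missing from the other and diffs against the branches' common ancestor, so a cherry-picked change, which gets a new SHA, shows up as if it had never been merged.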
The only difference between the branches master and develop3 is the SHA value of the change:
On master branch:
On develop3 branch:
So, that’s one area git and github differ!
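The difference can be reproduced without github. The sketch below is an approximation under stated assumptions: the file name, the commit identity and the backdated commit are hypothetical, and the three-dot form of git diff stands in for what a github pull request displays, which is how I read github's documentation rather than something checked against its implementation.

```shell
#!/bin/sh
# Hypothetical throwaway repo: notes.txt, the Example identity and the
# commit messages are made up for illustration.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# make the change on develop3; the commit is backdated so that the later
# cherry-pick is guaranteed a different committer timestamp, hence a new SHA
git checkout -q -b develop3
echo "feature work" >> notes.txt
GIT_COMMITTER_DATE="2000-01-01T00:00:00" git commit -q -am "add feature work"

# cherry-pick the same change into master
git checkout -q master
git cherry-pick develop3 >/dev/null

# two-dot diff: compares the files at the two branch tips
two_dot=$(git diff master develop3)

# three-dot diff: compares develop3 with the common ancestor of both branches
three_dot=$(git diff master...develop3)

# commits reachable from develop3 but not from master, matched by SHA
by_sha=$(git log --oneline master..develop3)

# same commits, but marked "-" when an equivalent patch is already on master
by_patch=$(git cherry -v master develop3)

echo "== git diff master develop3";      echo "$two_dot"
echo "== git diff master...develop3";    echo "$three_dot"
echo "== git log master..develop3";      echo "$by_sha"
echo "== git cherry -v master develop3"; echo "$by_patch"
```

The two-dot diff prints nothing because both tips hold identical files, while the three-dot diff and the SHA-based log still report the change. git cherry marks the develop3 commit with a leading - because an equivalent patch already exists on master, which makes it a handy local check before opening the pull request.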
Lesson Learned
We have to adjust our workflow for the ways git and github treat differences in code. It’s a subtle difference, but one with large consequences: we cannot use the tooling to help us, which adds work (that is not value add!).
For now, I will be remastering the develop branch more frequently.