Solving semantic conflicts with Git and SemanticMerge

Thursday, July 14, 2016

Here is the scenario: you have a source file with a class and some methods, and you decide it is a good idea to do some cleanup. You know: sort the methods in visibility order (public first), maybe create a subclass to group some functionality together, or place methods close to each other depending on how they are called, just to improve readability.

But someone was making changes to the same file concurrently (you know, it happens), and now they are less than happy to merge their fixes together with your cleanup...

So, here is the deal: shouldn't we try to keep the code as clean as possible? Yes, of course, following whatever common style rules the team agreed on. But then, in real life, isn't it a merge killer?

This blogpost shows how SemanticMerge helps to solve this case when integrated with Git. For those of you who don't know it, SemanticMerge parses the code before calculating any merges. Unlike other 3-way merge tools, it is not just based on text. It can parse Java code, C#/VB.NET code (Roslyn based) and C. There are also external, community-written parsers for Delphi and JavaScript.

Merge scenario

Well, I'm using a clone of the Kestrel server source code, and more specifically working on the file MemoryPool.cs.

Here are the changes the two developers are going to perform:

Merge case

Base is how the code was originally. (In fact, the figure hides all the methods not involved in the example; the real code is slightly more complicated than that.)

Source (src) and Destination (dst) are the changes that the two developers are going to make.

Look at the icons: C stands for "changed" and M for "moved". As you can see, both methods will be changed concurrently (C on both sides) and also moved by one of the developers.

This figure is actually taken from SemanticMerge.

Diffing the code

The developer doing the cleanup (that's you in this story) can diff the changes (we're using Visual Studio Code for this example) and sees something like this:

Classic diff

Needless to say, "traditional" diff is not very helpful with moved code.

You can always run git difftool from the command line to get a "move aware" diff:

Semantic diff
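To see why a text-only diff struggles with moved code, here is a minimal sketch using Python's standard difflib module. The class and method bodies are hypothetical, simplified stand-ins (not the real MemoryPool.cs), and this is not how SemanticMerge or Git compute diffs internally. It only illustrates what any line-based comparison has to report when a method changes position.

```python
import difflib

# Hypothetical, simplified class body; not the real MemoryPool.cs.
BASE = """\
class Pool {
    void Lease() {
        lease();
    }
    void Return() {
        give_back();
    }
    void Dispose() {
        release();
    }
}
""".splitlines()

# Same code, but Dispose() was moved above Lease(). No line inside it changed.
MOVED = """\
class Pool {
    void Dispose() {
        release();
    }
    void Lease() {
        lease();
    }
    void Return() {
        give_back();
    }
}
""".splitlines()


def line_diff(old, new):
    """Return only the added/removed lines of a plain line-based diff."""
    return [
        line
        for line in difflib.ndiff(old, new)
        if line.startswith(("+ ", "- "))
    ]


changes = line_diff(BASE, MOVED)
for line in changes:
    print(line)
```

Even though nothing inside Dispose() was edited, the line-based diff reports the whole method as three added lines plus three removed lines. Now imagine the method was also edited: the real change gets buried in that noise.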

Merging

To create the actual conflict, I used two branches: task001, where I did the refactor/reorganization of the file, and master, where I just modified the methods.

Then, I checked out master and merged from task001:

Running git merge

And, as expected, Git detects a conflict on MemoryPool.cs.

Solving it with SemanticMerge will be simpler than it seems. Just run git mergetool, provided you configured Git to use SemanticMerge as its merge tool, which is quite simple to do (see how to set it up).

The key issue you face when trying to merge code that has been moved is that traditional merge tools are not "code aware". They don't parse the code, and as such they just try to match lines that are close. But, that fails when the methods are reordered like in this case.

Since SemanticMerge parses the code first, it "knows" where methods are, and calculates the conflicts on a method-by-method basis (or function by function, property by property, and so on, depending on the actual language):

Merge tool explained

I added red circles to the screenshot to highlight some interesting points:

  • There are only 2 conflicts to solve. Remember, only 2 methods were changed.
  • The tool starts with the first conflict, Dispose() in this case. Check how the 3 versions involved are aligned on the Dispose() method. I mean, check the line numbers. Remember Dispose() was moved up and changed, and modified in the original location by the other developer. SemanticMerge detects the conflict, but also shows it in a way that is easy to understand. Traditional "line sync" is broken to sync on actual methods.
  • Finally, check line 177 on the left and 72 on the right. These are the actual changes made to the method.

You also see the C and M icons on the method declarations. There are dropdowns there to let you run the merge of the Dispose() method. It will be an automatic merge since the lines are not colliding.

And, the same will happen for the Return() method that was moved down.
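The idea of merging method by method can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not SemanticMerge's actual algorithm: the "parsing" is already done (each version is a dict mapping a method name to its body lines, in file order), every method exists in all three versions, and edits only change lines in place so bodies keep the same length. The method bodies below are made up for the example.

```python
def merge3(base, src, dst):
    """Classic three-way rule for one value: take whichever side changed."""
    if src == dst:
        return src
    if src == base:
        return dst
    if dst == base:
        return src
    return None  # both sides changed it, and differently


def merge_method(base, src, dst):
    """Merge one method body line by line; None means a real conflict."""
    whole = merge3(base, src, dst)
    if whole is not None:
        return whole
    if not (len(base) == len(src) == len(dst)):
        return None  # sketch limitation: only in-place line edits
    lines = [merge3(b, s, d) for b, s, d in zip(base, src, dst)]
    return None if None in lines else lines


def semantic_merge(base, src, dst):
    """Each argument maps method name -> list of body lines, in file order."""
    # The method order is merged with the same three-way rule.
    order = merge3(list(base), list(src), list(dst))
    if order is None:
        raise ValueError("both sides reordered the methods differently")
    merged, conflicts = {}, []
    for name in order:
        body = merge_method(base[name], src[name], dst[name])
        if body is None:
            conflicts.append(name)
        else:
            merged[name] = body
    return merged, conflicts


# Hypothetical bodies, loosely mirroring the scenario in this post.
base = {
    "Return": ["check(block);", "push(block);"],
    "Lease": ["return pop();"],
    "Dispose": ["free_blocks();", "free_slabs();"],
}
# task001: Dispose() moved up, Return() moved down, one line changed in each.
src = {
    "Dispose": ["free_all_blocks();", "free_slabs();"],
    "Lease": ["return pop();"],
    "Return": ["validate(block);", "push(block);"],
}
# master: methods stay in place, a different line changed in each.
dst = {
    "Return": ["check(block);", "push_back(block);"],
    "Lease": ["return pop();"],
    "Dispose": ["free_blocks();", "free_all_slabs();"],
}

merged, conflicts = semantic_merge(base, src, dst)
for name, body in merged.items():
    print(name, body)
print("conflicts:", conflicts)
```

Because the match is done by method name instead of by position, the move and the edits stop colliding: the merged result keeps the new order from task001 and both sets of line changes, with nothing left to resolve by hand.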

Wrapping up

Git does a great job calculating the 3 contributors (base, yours and theirs) involved in each file merge. It implements merge tracking to do that: it calculates the common ancestor and then asks an external tool to handle the job when there are manual conflicts (conflicts its algorithm can't figure out without manual intervention).

By plugging SemanticMerge into Git, you extend the "merge power" you are used to so that it also works inside the files. And since location-dependent conflicts are no longer conflicts, cleaning up code and reordering methods are not the root of all merge evil anymore.

But all of this works just "inside" the same file. "What we really need is a merge tool that tracks moved code across files!", I hear you say. Yep, that's correct, but we need to develop our own custom merge driver for Git to do that. Something we will definitely do, so stay tuned.

If you want to download the tool and give it a try, just go to www.semanticmerge.com.

Bonus track

If you found this blogpost interesting, you might want to watch the tool in action. You can see the same scenario described above here:

We develop Plastic SCM, a version control that excels in branching and merging, can deal with huge projects and big binary assets natively, and it comes with GUIs and tools to make everything simpler.

If you want to give it a try, download it from here.

We are also the developers of SemanticMerge, and the gmaster Git client.
