How to enable Git v2 in Gerrit Code Review

git-2-26

(c) Shutterstock / spainter_vfx

Git protocol v2 landed in Gerrit 3.1 on the 11th of October 2019. This is the last email from David Ostrovsky concluding a thread of discussion about it:

It is done now. Git wire protocol v2 is a part of open source Gerrit and will be
shipped in upcoming Gerrit 3.1 release.

And, it is even enabled per default!

Huge thank to everyone who helped to make it a reality!

A big thanks to David and the whole community for the hard work in getting this done!

This was the 3rd attempt to get the feature in Gerrit after a couple of issues encountered along the path.

Why Git protocol v2?

The Git protocol v2 introduces a big optimization in the way client and server communicate during clones and fetches.

The big change has been the possibility of filtering server-side the refs not required by the client. In the previous version of the protocol, whenever a client was issuing a fetch, all the references were sent from the server to the client, even if the client was fetching a single ref!

In Gerrit this issue was even more evident, since, as you might know, Gerrit leverages a lot the refs for its internal functionality, even more with the introduction of NoteDb.

Whenever you are creating a Change in Gerrit you are updating/creating at least 3 refs:

  • refs/changes/NN/<change-num>/<patch-set>
  • refs/changes/NN/<change-num>/meta
  • refs/sequences/changes

In the Gerrit project itself, there are currently about 104K refs/change and 24K refs/change/*/meta. Imagine you are updating a repo which is behind just a couple of commits, you will get all those references which will take up most of your bandwidth.

Git protocol v2 will avoid this, just sending you back the references that the Git client requested.

Is it really faster?

Let’s see if it really does what is written on the tin. We have enabled Gerrit v2 at the end of 2019 on GerritHub.io, so let’s test it there. You will need a Git client from version 2.18 onwards.

> git clone "ssh://barbasa@review.gerrithub.io:29418/GerritCodeReview/gerrit"
> cd gerrit
> export GIT_TRACE_PACKET=1
> git -c protocol.version=2 fetch --no-tags origin master
19:16:34.583720 pkt-line.c:80           packet:        fetch< version 2
19:16:34.585050 pkt-line.c:80           packet:        fetch< ls-refs
19:16:34.585064 pkt-line.c:80           packet:        fetch< fetch=shallow
19:16:34.585076 pkt-line.c:80           packet:        fetch< server-option
19:16:34.585084 pkt-line.c:80           packet:        fetch< 0000
19:16:34.585094 pkt-line.c:80           packet:        fetch> command=ls-refs
19:16:34.585107 pkt-line.c:80           packet:        fetch> 0001
19:16:34.585116 pkt-line.c:80           packet:        fetch> peel
19:16:34.585124 pkt-line.c:80           packet:        fetch> symrefs
19:16:34.585133 pkt-line.c:80           packet:        fetch> ref-prefix master
19:16:34.585142 pkt-line.c:80           packet:        fetch> ref-prefix refs/master
19:16:34.585151 pkt-line.c:80           packet:        fetch> ref-prefix refs/tags/master
19:16:34.585160 pkt-line.c:80           packet:        fetch> ref-prefix refs/heads/master
19:16:34.585168 pkt-line.c:80           packet:        fetch> ref-prefix refs/remotes/master
19:16:34.585177 pkt-line.c:80           packet:        fetch> ref-prefix refs/remotes/master/HEAD
19:16:34.585186 pkt-line.c:80           packet:        fetch> 0000
19:16:35.052622 pkt-line.c:80           packet:        fetch< d21ee1980f6db7a0845e6f9732471909993a205c refs/heads/master
19:16:35.052687 pkt-line.c:80           packet:        fetch< 0000
From ssh://review.gerrithub.io:29418/GerritCodeReview/gerrit
 * branch                  master     -> FETCH_HEAD
19:16:35.175324 pkt-line.c:80           packet:        fetch> 0000

> git -c protocol.version=1 fetch --no-tags origin master
19:16:57.035135 pkt-line.c:80           packet:        fetch< d21ee1980f6db7a0845e6f9732471909993a205c HEAD\0 include-tag multi_ack_detailed multi_ack ofs-delta side-band side-band-64k thin-pack no-progress shallow agent=JGit/unknown symref=HEAD:refs/heads/master
19:16:57.037456 pkt-line.c:80           packet:        fetch< 07c8a169d6341c586a10163e895973f1bdccff92 refs/changes/00/100000/1
19:16:57.037489 pkt-line.c:80           packet:        fetch< 0014ca6443ac0af338e2677b45e538782bb7a12e refs/changes/00/100000/meta
19:16:57.037502 pkt-line.c:80           packet:        fetch< b4af8cad4d3982a0bba763a5e681d26078da5a0e refs/changes/00/100400/1
19:16:57.037513 pkt-line.c:80           packet:        fetch< 9ec6e507c493f4f1905cd090b47447e66b51b7e1 refs/changes/00/100400/meta
19:16:57.037523 pkt-line.c:80           packet:        fetch< a80359367529288eea3c283e7d542164bced1e2f refs/changes/00/100800/1
19:16:57.037533 pkt-line.c:80           packet:        fetch< 170cced6d81c25d1082d95e50b37883e113efd01 refs/changes/00/100800/meta
19:16:57.037544 pkt-line.c:80           packet:        fetch< 6cb616e0ad4b3274d4b728f8f7b641b6bd22dce4 refs/changes/00/100900/1
19:16:57.037554 pkt-line.c:80           packet:        fetch< 286d1ee1574127b76c4c1a6ef0f918ad4c61953a refs/changes/00/100900/meta
19:16:57.037606 pkt-line.c:80           packet:        fetch< 312ba566d2620b43fb90be3e7c406949edf6b6d9 refs/changes/00/10100/1
19:16:57.037619 pkt-line.c:80           packet:        fetch< dde4b73cb011178584aae4fb29a528018149d20b refs/changes/00/10100/meta

…. This will go on forever …. 

As you can see there is a massive difference in the data sent back on the wire!

How to enable it?

If you want to enable it, you just need to update you git config (etc/jgit.config in 3.1 and $HOME/.gitconfig in previous versions) with the protocol version to enable it and restart your server:

[protocol]
  version = 2

Enjoy your new blazing fast protocol!

If you are interested in more details about the Git v2 protocol you can find the specs here.

Fabio Ponciroli (GerritForge)
Gerrit Code Review Contributor

Crunch Code Review hashtags with Gerrit DevOps Analytics

Screenshot 2020-05-12 at 11.13.06.pngWe have already discussed in previous posts how important it is to speedup the feedback loop in your Software Development Lifecycle. Having early feedbacks gives you the chance of evaluating your hypothesis and eventually change direction if needed. 

The more information you have, the smarter can be your decisions.

We recently added in our Gerrit DevOps Analytics the possibility of extracting data coming from Code Reviews’ metadata to extend the knowledge we can get out of Gerrit.

Furthermore, it is possible to extract meta-data from repositories not necessarily hosted on the Gerrit instance running the analytics processing. This is a big improvement since it allows to fully analyse repositories coming from any Gerrit server.

For example, the Gerrit analytics we are providing on https://analytics.gerrithub.io are coming from the Gerrit repository hosted on the gerrit-review.googlesource.com, the Gerrit server hosted by Google.

Hashtags aggregation

One important type of meta-data contained in the Code Reviews is the hashtag.

Hashtags are freeform strings associated with a change, like on social media platforms. In Gerrit, you explicitly associate hashtags with changes using a dedicated area of the UI; they are not parsed from commit messages or comments.

Similar to topics, hashtags can be used to group related changes together and to search using the hashtag: operator. Unlike topics, a change can have multiple hashtags, and they are only used for informational grouping; changes with the same hashtags are not necessarily submitted together.

You can use them, for example, to mark and easily search all the changes blocking a particular release:

Screenshot 2020-05-12 at 10.43.00.png

Hashtags can also be used to aggregate all the changes people have been working on during a particular event, for example, the Gerrit User Summit 2019 hackathon:

Screenshot 2020-05-12 at 10.44.39.png

The latest version of the Gerrit Analytics plugin exposes the hashtags attached to their respecting Git commit data. Let’s explore together some use cases:

The most popular Gerrit Code Review hashtags over the last 12 months

Screenshot 2020-05-12 at 10.47.49.png

Throughput of changes created during an event

see for example the Palo alto hackathon (#palo-alto-2018). We can see at the end of the week the spike of changes to release Gerrit 2.16.

Screenshot 2020-05-12 at 10.56.44.png

The extend of time for a feature

Removing GWT was an extensive effort which started in 2017 and ended in 2019. It took several hackathons to tackle the removal as shown by the hashtags distribution. Some changes were started in one hackathon and finalised in the next one.

Screenshot 2020-05-12 at 11.05.51.png

Those were some example of useful information on how to leverage the power of GDA.

The above examples are taken from the GDA dashboard provided and hosted by GerritForge on https://analytics.gerrithub.io which mirror commits and reviews on a regular basis from the Gerrit project and its plugin ecosystem.

How to setup GDA on Gerrit

Hashtag extraction is currently available from Gerrit 3.1 onwards. You can download the latest version released from the Gerrit CI.

To enable hashtag extraction you need to enable the feature extraction in the plugin config file as follow:

# analitycs.config
[contributors]
  extract-hashtags = true

For more information on how to configure and run the plugin, look at the analytics plugin documentation.

Conclusion

Data is the goldmine of your company. You need more and more of it for making smarter decision. The latest version of the GDA allows you to leverage even more data produced during the code review process.

You can explore the potential of the information held in Gerrit on the analytics dashboard provided by GerritForge on analytics.gerrithub.io.

If you would like your open source Gerrit hosted project to be added to our dashboard or would need help in setting up and supporting GDA for your organization, get in touch with GerritForge Sales Team and we can help you making smarter decisions today.