Report from a Gerrit hackathon: Repository Optimizer PoC

The Gerrit spring hackathon just ended on Discord (with GerritForge attending from London, SAP, Google from Germany, and WikiMedia from France). One of the PoCs we have been working on is a prototype for a scalable and “intelligent” repository optimizer.

Following last year’s release of the git-repo-metrics plugin, presented at the previous user summit, which tracks live information on Git repositories, we thought that having a tool that can “automagically” do something with the collected data would be helpful.
We started working on what we called RepoVet©, a modular tool that can make intelligent and autonomous decisions on what needs to be improved in a repository.

Architecture

The main constraints we aimed for were:

  • Git server implementation agnostic: we want the tool to be usable on any Git repository, not necessarily one managed by Gerrit Code Review
  • Modular: the different components of the tool must be independent and pluggable, making it possible to integrate them into existing Git server setups.

After a couple of whiteboard rounds, we developed the following components: Monitor, RuleEngine, and Optimizer.

Each is independent, highly configurable, and communicates with the other components via a message broker (AWS SQS). The responsibilities of each are as follows:

  • Monitor: watches the filesystem and notifies about activity happening in the Git repository, e.g., an increase or decrease in repository size
  • RuleEngine: listens for notifications from Monitor and decides whether any activity is needed on the repository, e.g., a git GC, a git repack, etc. The decision can be based not only on the repository parameters (number of loose objects, number of refs, etc.) but also, for example, on traffic patterns. If RuleEngine decides an optimization is needed, it will notify the Optimizer.
  • Optimizer: listens for instructions coming from the RuleEngine and executes them. This can be a git GC, a git repack, etc. It is not its call to decide which activity to carry out. However, it will determine whether it is the right moment. For example, it will avoid running concurrent GCs and will perform an operation only if there are enough resources.

Following is an example of interaction among the components, where the decision to run a GC is based on some thresholds set in the repository configuration:

In the above example, Monitor reports an increase in the repository size and notifies the RuleEngine via the broker RepoActivity queue.

RuleEngine gets the repository configuration and decides a GC is needed since some thresholds were exceeded. It notifies the operation type and the repository to the Optimizer via the broker RepoIntervention queue.

Optimizer checks whether other GCs are currently running and whether there are enough resources, then runs the GC and keeps track of its result and timestamp.
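The interaction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the queue names RepoActivity and RepoIntervention come from the text, while the message fields, the loose-object threshold, and the in-memory queues standing in for SQS are hypothetical and not taken from the PoC code.

```python
import queue
from dataclasses import dataclass


# Hypothetical message shapes; the PoC's real schema is not shown in this post.
@dataclass
class RepoActivity:
    repo: str
    loose_objects: int


@dataclass
class RepoIntervention:
    repo: str
    operation: str


# In-memory stand-ins for the two broker queues named in the text.
repo_activity = queue.Queue()
repo_intervention = queue.Queue()

# Assumed threshold, standing in for the repository configuration.
MAX_LOOSE_OBJECTS = 1000


def monitor(repo, loose_objects):
    """Monitor: report an observed change in the repository."""
    repo_activity.put(RepoActivity(repo, loose_objects))


def rule_engine():
    """RuleEngine: decide whether the reported activity needs an intervention."""
    try:
        event = repo_activity.get_nowait()
    except queue.Empty:
        return
    if event.loose_objects > MAX_LOOSE_OBJECTS:
        repo_intervention.put(RepoIntervention(event.repo, "gc"))


def optimizer(gc_running, has_resources):
    """Optimizer: execute the instruction only if this is the right moment."""
    try:
        job = repo_intervention.get_nowait()
    except queue.Empty:
        return None
    if gc_running or not has_resources:
        repo_intervention.put(job)  # postpone: leave the job for a later attempt
        return None
    return f"git {job.operation} on {job.repo}"
```

Note how the Optimizer never decides what to run: it only decides when, which mirrors the split of responsibilities listed earlier.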

As shown above, we met the criteria we initially aimed for since:

  • None of the components needs or uses Gerrit, even though the repository was hosted in a Gerrit Code Review setup
  • Components are independent and swappable. For example, if we used Gerrit, the Monitor could be swapped with a plugin acting as a bridge between Gerrit stream events and the broker.

Lessons learned

  • Having low coupling among the different components will allow:
    • The user to pick only the components needed in their installation
    • The user to integrate the tool into a pre-existing infrastructure
    • The developers to potentially work with different technologies and different lifecycles
  • SQS proved to be straightforward to work with during the prototyping phase, allowing us to quickly spin up the service locally with Docker
  • Modeling the messages among the components is crucial and has to be carefully thought-out at the beginning
  • More planning needs to go into choosing the broker system; for example, handling non-processable messages and managing dead-letter queues (DLQs) hasn’t been considered at all
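To make the last point concrete, here is a minimal sketch of what handling non-processable messages could look like. It is not part of the PoC: the retry budget, the function name, and the in-memory lists are assumptions used purely to illustrate the idea of moving a message to a dead-letter queue after repeated failures.

```python
from collections import deque

MAX_RECEIVES = 3  # assumed retry budget before a message is considered "dead"


def drain(messages, handler, max_receives=MAX_RECEIVES):
    """Process messages; return the ones that failed max_receives times."""
    pending = deque((message, 0) for message in messages)
    dead_letters = []
    while pending:
        message, receives = pending.popleft()
        try:
            handler(message)
        except Exception:
            receives += 1
            if receives >= max_receives:
                dead_letters.append(message)  # give up: park it for inspection
            else:
                pending.append((message, receives))  # retry later
    return dead_letters
```

A real broker would apply this policy on the server side; the sketch only shows why the message model and the failure policy need to be designed together.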

Next steps

We aim to start working on an MVP as soon as possible, possibly starting from one of the components and slowly adding the others.

As soon as we have an MVP, the code will be made available as usual, open to contributions and feedback.

As is our tradition, we will use gerrithub.io to dogfood it, and we will report back.

Stay tuned!

New year, new JGit contributions coming

We have shared GerritForge’s goals for improving Gerrit in 2022: most of them will include significant contributions to JGit, the Java-based engine powering Gerrit Code Review’s support for Git data-format and protocol.

GerritForge will contribute many more changes to JGit during 2022, all focused on improving the functionality and performance of large mono-repos. All changes will go through formal review on the Eclipse Foundation’s JGit Gerrit Project.

Lack of knowledge and reviews

The JGit project has suffered major losses in the past few years, which is clearly shown by comparing the list of top contributors with their activity over the last 12 months. I have run a “git blame” against the whole JGit code base, which is a heuristic (and therefore a rough approximation) of who last wrote or edited each part of the JGit code.

  1. (49998 LOC) – Shawn Pearce
  2. (37854 LOC) – Thomas Wolf
  3. (31417 LOC) – Matthias Sohn
  4. (13593 LOC) – David Pursehouse
  5. (13200 LOC) – Christian Halstrick

See below the number of contributions (excluding merges and trivial changes) of the above 5 maintainers in the past 12 months:

  1. Shawn Pearce – 0 changes – He sadly passed away in 2018 
  2. Thomas Wolf – 86 changes
  3. Matthias Sohn – 128 changes
  4. David Pursehouse – 0 changes
  5. Christian Halstrick – 0 changes

The above stats show that the currently active maintainers (Thomas and Matthias) appear in the git blame of 69k out of the total 390k LOCs.
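The 69k figure can be checked directly from the numbers listed above. The snippet below simply redoes the arithmetic; the 390k total is the approximate value quoted in the text.

```python
# Figures quoted above from the git blame heuristic.
blame_loc = {"Thomas Wolf": 37854, "Matthias Sohn": 31417}
total_loc = 390_000  # approximate total quoted in the text

active_loc = sum(blame_loc.values())
share = active_loc / total_loc
print(f"{active_loc} LOC, {share:.1%} of the code base")
# prints "69271 LOC, 17.8% of the code base"
```

In other words, the two active maintainers were the last to touch less than a fifth of the code base.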

Thomas and Matthias are doing a fantastic job in keeping up as much as possible with the incoming changes’ pace and reviewing them at their best. At times, though, the incoming change may touch parts of the code they are less familiar with and, therefore, would require more eyes or more time to review.

Breaking the vicious circle

The JGit project is in a dangerous vicious circle.

  1. Incoming changes take longer to get reviewed and merged.
  2. The lengthy reviews discourage contributors, who may lose interest in following up on their contributions or uploading new changes.
  3. The lack of contributions and merged changes keeps the pressure on current maintainers, which fuels point 1 again.

How can GerritForge help break this vicious circle, provide meaningful contributions, and get them merged fast and with proper and thorough reviews?

Keeping up the pace of contributions is key to avoiding this: GerritForge will therefore create a “dev branch” of JGit. All of GerritForge’s contributions to the JGit master branch will be part of the dev branch and will go through a rigorous code-review and E2E validation cycle, including the Gatling tests for Gerrit.

Two-step validation workflow

GerritForge’s workflow for validating JGit changes with Gerrit
  1. A new change is uploaded to the Eclipse Foundation JGit project.
  2. The normal Eclipse Foundation’s CI verification builds the change and, if it passes all tests, provides a Verified +1
  3. One of the JGit maintainers, or members of the GerritForge’s contributors, can provide a Code-Review +1 score with the additional description “Approved for dev”
  4. The special “Approved for dev” description triggers the cherry-pick of the Change onto the GerritForge’s JGit dev branch
  5. The new change for review on the JGit dev branch triggers the creation of a Change on gerrit-review.googlesource.com/gerrit with the update of the JGit submodule pointing to the open change.
  6. The JGit submodule update Change triggers the current E2E validation using the Gatling tests, developed and hosted by GerritForge. If all tests are passing, the Gerrit change receives a Verified +1.
  7. The cherry-picked Change on JGit dev branch receives a Verified +1
  8. The cherry-picked Change is merged to the JGit dev branch
  9. The merge of the cherry-picked Change is notified on the original Change with a Code-Review +1 score with description “Merged in dev”.
  10. One of the JGit maintainers can finalise the review and, if all is good, provide the final Code-Review +2 and merge the change on JGit master.

NOTE: The above workflow will only apply to the upcoming changes on the master branch, where we do need to innovate and implement new features at a faster pace. We have no plans to apply the workflow to any stable branches.
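The trigger in steps 3 and 4 boils down to a simple predicate on a review vote. The following sketch is hypothetical: the function name and its parameters are made up for illustration and do not reflect how the automation is actually implemented; only the label, score, and “Approved for dev” text come from the workflow above.

```python
APPROVED_FOR_DEV = "Approved for dev"  # marker text from step 3 of the workflow


def should_cherry_pick_to_dev(label, score, message, branch="master"):
    """Return True when a vote should trigger the cherry-pick onto dev."""
    return (
        branch == "master"  # the workflow does not apply to stable branches
        and label == "Code-Review"
        and score == 1
        and APPROVED_FOR_DEV in message
    )
```

For example, a Code-Review +1 without the marker text, or the same vote on a stable branch, would leave the change on the regular review path.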

Pluses and minuses

There are good things in the above workflow; however, there are also risks:

  • Complexity: the secondary dev branch will undergo an E2E validation process with Gerrit, which is complex and may break at times.
  • Danger of forking: if the JGit maintainers veto the Change at step 10, the changes already merged in dev would effectively make the dev and master branches diverge, which isn’t a good thing and should be avoided as much as possible.

The augmented lifecycle, which also involves Gerrit E2E tests with Gatling, also has many advantages:

  • Additional E2E validation: incoming changes on JGit would go through an E2E validation with Gerrit against the suite of E2E Gatling tests, which is good feedback and gives more confidence in merging code even in less known parts of the JGit codebase.
  • Increased velocity: it speeds up the validation of new incoming changes and gets them merged to the JGit dev branch, without impacting the pace and quality of reviews from the current JGit maintainers.
  • Gerrit edge release: it makes available a downloadable Gerrit change that includes the JGit dev branch, allowing canary deployments to see how Gerrit behaves with the latest and greatest JGit code.

From the JGit project’s perspective, the flow of incoming changes will have an additional E2E validation, which is always a good thing. Additionally, it will bring more contributions and inspiration for innovative changes to the project, attracting more and more talent.

Ready to gear-up contributions on JGit?

The workflow proposed is a starting point; however, we are committed to giving it a go and seeing how it would work in practice and if it will be enough to gear up the contributions to the JGit project.

The Virtual Gerrit User Summit is tomorrow!

Join the Gerrit Community tomorrow and Friday from 8 am PST for everything related to the Gerrit Code Review Community.

Gerrit provides web-based code review and repository management for the Git version control system. Whether you are experienced or new to Gerrit, you should know that it provides a framework you and your teams can use to review code before it becomes part of the code base. Come and take this chance to join and learn about Gerrit Code Review.

Find here the full schedule of the sessions you will have access to.

Register and join the community event!

Be an active part of the Summit: Last Call for Presentations

There are still a few slots open for you to present at the virtual Gerrit User Summit.

Submit your presentation proposal by creating a change to the Gerrit Summit 2021 repository by following these steps:

  • Log in to https://gerrit-review.googlesource.com
  • Go to the Gerrit Summit 2021 repository
  • Click the “CREATE CHANGE” button and specify the branch (master) and the headline of your talk
  • Click on the “EDIT” button on the top-right to edit your change
  • Click on the “ADD/OPEN/UPLOAD” button and enter the filename for your talk (e.g. sessions/super-duper-repos.md for a talk or lightning-talks/mini-session.md for a lightning talk), then upload the text for your talk by dragging the markdown text into the window
  • Click the “PUBLISH EDIT” button on the top-right of the change screen
  • Click on the “MARK AS ACTIVE” button on the top-right of the change screen

Your talk will then be reviewed by the community and, when accepted, merged into the Gerrit User Summit 2021 site.

Don’t miss out!

SAVE THE DATE: Gerrit User Summit is back on 2 & 3 Dec 2021

REGISTRATION is open for the Virtual Gerrit User Summit. It will take place on 2nd and 3rd December 2021: https://gerrit.googlesource.com/summit/2021/+/refs/heads/master/index.md

The Gerrit Community is happy to announce the Gerrit Virtual User Summit 2021, THE event of the year for everything related to Gerrit Code Review and the trunk-based development pipeline.

A Virtual Summit

The Gerrit User Summit 2021 will be held online only, to allow most of the community around the globe to attend and share their experience and ideas, and avoid the problems with the travelling restrictions due to the COVID-19 pandemic.

The 2-day User Summit is open to all the members of the community as well as those that are willing to learn and adopt Gerrit Code Review in their development process.

Gerrit v3.4.0-rc2: weekly update

Merged Changes

From Milutin Kristofic:

    Fix underlining on hover for subject column in dashboard
    Change-Id: Ie0c55bbe550f1714f1f9f630ac81f6c686184221

    Revert "A11y - show outline when focused on tab title in change view"    
    Change-Id: I6d8f74c4d01004a88b0dbf207d9d9741a6965755

From Mike Frysinger:

    Fix typo in hashtags docs
    Change-Id: Idabd8994651451183472eb3b777064e891e8a765

From Dmitrii Filippov:

    Do not reload page after clicking Cancel in apply-fix dialog    
    Change-Id: I5830caac655df9d3dc7058dc9674ba727685012b

From Paladox:

    Add undefined check for access[repo] in getRepoAccess
    Change-Id: I16f3344c9f11cb230adafa7fdc8457ac37b4ae70

    Add web links to project:<project>
    Change-Id: Iefda4bb5571d917d5fa9f24c02c8e4a3be2c6c52

From Dhruv Srivastava:

    Turn off syntax highlighting in comment context if disabled
    Change-Id: I89625a9b15678c1a63543d36f9f7d835bb39b359

    Allow comment context to take up remaining width 
    Change-Id: I1dab57df26b0fed98efead1a18083d9a504ed40f

    Make the line length indicator color clearer
    Issue: Bug 13648
    Change-Id: I523d971f39db477d67e212d1175ec6f5be8be24d

    Remove Add/No Patchset description label for merged changes
    Change-Id: I18643bcf01e10177a28411089ac1b312222d3e23

From David Ostrovsky:

    Bazel: Disable worker multiplexer to avoid sporadic build failures
    Change-Id: I2265c4ed7128a4b6ed259f44c5594ab717de58b0

From Ben Rohlfs:

    Allow line wrapping in check result messages
    Change-Id: Icedc05bbc28fbe513c60a5eb9853dadde636381b

Issues Fixed

  • Issue 13648: Make the line width guide in commit description to be darker

Issues Raised

Gatling E2E tests results

Git Protocol Simulation:

Gatling full results:
https://gerrit-ci.gerritforge.com/job/gatling-gerrit-test/305/gatling/report/gerritgitsimulation-20210419190719877/source/index.html

Gatling simulation class:
https://github.com/GerritForge/gatling-sbt-gerrit-test/blob/master/src/test/scala/gerritforge/GerritGitSimulation.scala

Gerrit UI REST Simulation

Gatling full results:
https://gerrit-ci.gerritforge.com/job/gatling-gerrit-test/305/gatling/report/gerritrestsimulation-20210419190957647/source/index.html

Gatling simulation class:
https://github.com/GerritForge/gatling-sbt-gerrit-test/blob/master/src/test/scala/gerritforge/GerritRestSimulation.scala

Gatling 10-days trend

Gerrit v3.4.0-rc1: weekly update

Starting from this week, GerritForge provides a weekly status update of how the Gerrit v3.4.0 release plan is progressing:

  1. List of merged changes since the previous RC
  2. List of issues fixed
  3. Result of the Gatling E2E tests on AWS

We hope that having these regular updates would help you focus on what is changing during the release plan execution and do more research, learning or specific investigation on the areas that are more pertinent to your use case.

Also, because the Gerrit release candidates do not come with an associated set of release notes, the list below would help people understand the new functionalities or fixes coming through every week.

Both Gatling-Git and the AWS-Gerrit project with a complete production-ready setup are projects started by GerritForge and contributed as OpenSource to the Gerrit community.

Merged changes

From Matthias Sohn:

    Log memory allocated per command in httpd_log 
    Change-Id: Ie0ce1382a8515e6dfb7d0d3fe10b3e64c0cf9aee

    Log cpu usage per http request   
    Change-Id: I9e78bed5219f9baf57a2b76f0f947efff334ffe5

    Log memory allocated per command in sshd_log
    Change-Id: Ifc1d274bf42eb3cb9b2cf46271b6be0117aa8b18

    Add metrics for monitoring Java memory pools
    Change-Id: I60e5960899c0cff8c05983d299b414d7a646bb07

    Log cpu usage in sshd_log
    Change-Id: I1c53f64caf982c2f85195e6bda4c6d790f79a810

    Encapsulate fields of SshScope.Context
    Change-Id: If989630425ad40922aaf8958c4335aab0bb5c2c9

    Log "-" for missing log fields in sshd_log
    Change-Id: I90adc7618864f702b42029ab596c6014bd4c6cfe

From Ben Rohlfs:

    Remove backend support for HTML UI plugins
    Change-Id: I44cc0d15910937de7e1f9b9780a799d4b85b0673

    Stop producing html version of plugins
    Change-Id: I1036f06e385f2997f7bea849755729df2789acaa

From Milutin Kristofic:

    A11y - show outline when focused on tab title in change view
    Change-Id: Ie9456a5d886e70a77dae8f055a54a3a1a0045daf

    A11y - show outline when focused on change subject in dashboard
    Change-Id: I9c73e49de661a17928c6f96a290c2069503bdfb4

    A11y - fix and improve label when navigating dashboard
    Change-Id: Id01302aea38c783687443401290995bdd0764126

From Hermann Loose:

    Allow setting image viewer max-width and max-weight externally
    Change-Id: Ie151a85d8ede5cb7fa0899d9367ca0dccd887538

From Han-Wen Nienhuys:

    Fix meta_diff documentation
    Change-Id: I9c59a4857724cdfd59b995a6dc255f77d29b017e

From Paladox:

    Update plugins/codemirror-editor and plugins/delete-project
    Change-Id: Ieca7d2e36b9eefffe5c830962109e1fa62134b5c

    gr-confirm-move-dialog: Fix _getProjectBranchesSuggestions
    Change-Id: I8d4ead0bf3a8c30d0ecefdc190e4ba4ea7ede29d

    Remove unused _handleDropdownTap
    Change-Id: Ie4a8effd6d814985cafa6dea9239903f503dae33

From Dhruv Srivastava:

    Clean up upload change help dialog
    Change-Id: I0c72450c37326ec2c2922b74928e0b059df0043e

From Saša Živkov:

    Fix binding of DELETE REST calls from plugins
    Change-Id: I9b9632e8f719937e5f7c61466996be79e6f29c14

Issues fixed

  • Issue 14335: CodeMirror plugin broken
  • Issue 14127: REST API DELETE query for delete-project plugin doesn’t work

Gatling E2E tests results

Git Protocol Simulation

Gatling full results:
https://gerrit-ci.gerritforge.com/job/gatling-gerrit-test/287/gatling/report/gerritgitsimulation-20210412213548212/source/index.html

Gatling simulation class:
https://github.com/GerritForge/gatling-sbt-gerrit-test/blob/master/src/test/scala/gerritforge/GerritGitSimulation.scala

Gerrit UI REST Simulation

Gatling full results:
https://gerrit-ci.gerritforge.com/job/gatling-gerrit-test/287/gatling/report/gerritrestsimulation-20210412213912089/source/index.html

Gatling simulation class:
https://github.com/GerritForge/gatling-sbt-gerrit-test/blob/master/src/test/scala/gerritforge/GerritRestSimulation.scala

What’s new in Gerrit Code Review v3.2

gerrit-code-review-3.2-intro

Gerrit Code Review is unstoppable: despite the recent COVID-19 pandemic and the cancellation of the Spring Hackathon 2020, the community has made an extraordinary effort to deliver remotely and on-time the Gerrit v3.2 release on the 1st of June.

GerritForge migrated GerritHub.io on day 1 of the release and is happy to share with you the highlights of this new release. If you need help assessing your current setup and migrating, please get in touch with us at https://gerritforge.com/contact.

Get ready to migrate: get rid of zombie comments

The migration process cleans up the zombie draft comments in the All-Users.git repository that have been left behind since the introduction of NoteDb back in v2.16.
Every user commenting on any change was creating a series of commits on the All-Users.git repository, where the draft comments are stored. Once the comments were finalised and applied to the change, they were not fully removed from All-Users.git. That created a backlog of zombie comments on All-Users.git that are now being completely removed during the Gerrit v3.2 migration process.

Since Gerrit v2.16.16, there is a standalone utility to remove the zombie draft comments. You may want to do that operation upfront to make sure that the migration to v3.2 does not have a lot of processing during the init step. Also, make sure that the All-Users.git resides on a fast access local filesystem for minimizing the migration time.

If you do nothing, the cleanup utility will be automatically executed when migrating to Gerrit v3.2, bearing in mind that it may take quite a long time to complete. In our tests, it took around 10 minutes for 10k zombie comments.

WARNING: the execution time is not linear and it may take up to 48h of processing time for a staggering number of 1M zombie comments.
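A quick back-of-the-envelope calculation shows what “not linear” means here. It uses only the two data points quoted above (around 10 minutes for 10k comments, up to 48h for 1M) and compares the reported worst case with a naive linear extrapolation.

```python
minutes_per_10k = 10          # measured in our tests, as quoted above
reported_hours_for_1m = 48    # worst case quoted in the warning

# Naive linear extrapolation from 10k to 1M zombie comments.
linear_hours_for_1m = (1_000_000 / 10_000) * minutes_per_10k / 60
slowdown = reported_hours_for_1m / linear_hours_for_1m

print(round(linear_hours_for_1m, 1))  # 16.7
print(round(slowdown, 1))             # 2.9
```

A linear cost would put 1M comments at under 17 hours, so the worst case is almost three times slower than that, which is a good reason to run the cleanup utility ahead of the migration.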

Migrate with zero-downtime

If you are on Gerrit v3.1.x in a high-availability configuration, you can upgrade seamlessly to Gerrit v3.2, without having to suspend or degrade the service in any way. GerritForge has a record number of installations done in high-availability and multi-site: if you are running a single Gerrit master today, you should get in touch with the GerritForge Team for help moving to high-availability.

For the very first time, the whole Gerrit Community can benefit from the ability to perform a rolling upgrade without any downtime.

The zero-downtime upgrade consists of the following steps:

  1. Have Gerrit masters upgraded to v3.1.6 (or later) in a high-availability configuration, healthy and able to handle the incoming traffic properly.
  2. Set gerrit.experimentalRollingUpgrade to true in gerrit.config on both Gerrit masters.
  3. Set the first Gerrit master unhealthy.
  4. Shut down the first Gerrit master and then upgrade it to v3.2.
  5. Start up the first Gerrit master and wait for the online reindex to complete.
  6. Verify that the first Gerrit master is working correctly and then make it healthy again.
  7. Wait for the first Gerrit master to start serving traffic regularly.
  8. Repeat steps 3. to 7. for the second Gerrit master.
  9. Remove gerrit.experimentalRollingUpgrade from gerrit.config on both Gerrit masters.
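The ordering of these steps matters: the flag is set on all masters first, each master is then upgraded one at a time, and the flag is removed only at the end. The sketch below merely expands the procedure into an ordered checklist for a given list of masters; the function and the node names are hypothetical, and it does not execute anything against Gerrit.

```python
def rolling_upgrade_plan(masters, target="v3.2"):
    """Expand the zero-downtime procedure into an ordered list of actions."""
    flag = "gerrit.experimentalRollingUpgrade"
    # Step 2: enable the flag everywhere before touching any node.
    plan = [f"{m}: set {flag} = true in gerrit.config" for m in masters]
    # Steps 3 to 8: upgrade one master at a time.
    for m in masters:
        plan += [
            f"{m}: mark unhealthy",
            f"{m}: shut down and upgrade to {target}",
            f"{m}: start up and wait for the online reindex",
            f"{m}: verify and mark healthy",
            f"{m}: wait until it serves traffic regularly",
        ]
    # Step 9: remove the flag once every master runs the new version.
    plan += [f"{m}: remove {flag} from gerrit.config" for m in masters]
    return plan


for action in rolling_upgrade_plan(["gerrit-1", "gerrit-2"]):
    print(action)
```

At any point in the plan at least one master is healthy and serving traffic, which is what makes the upgrade free of downtime.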

NOTE: Gerrit v3.1.6 has not been released yet. However, if you want to perform a rolling upgrade today, you can download the latest build on the stable-3.1 branch from the GerritForge’s CI at https://gerrit-ci.gerritforge.com/job/Gerrit-bazel-stable-3.1/

GerritHub.io has been successfully upgraded on the 1st of June without any interruption of any kind using the above procedure.

Java 11 official support

Gerrit is now officially supported on Java 11, in addition to Java 8. Running on Java 11 was already possible from v2.16.13, v3.0.4 and v3.1.0, but not officially supported because of the lack of a CI validation on Java 11 for stable-2.16, stable-3.0 and stable-3.1 branches.

Gerrit v3.2 has been validated with Java 11, with the following known issues:

  • Issue 11567: Java 11 runtime & startTLS LDAP broken: ‘error code 8 – BindSimple: Transport encryption’.
  • Issue 12639: WARNING: An illegal reflective access operation has occurred, when starting Gerrit.

After 24h of adoption of Gerrit v3.2 on GerritHub.io, we have seen two major benefits from the migration to Java 11: overall reduction of the “old generation” build up in the JVM heap and massive reduction of GC cycles times and full-GCs.

screenshot-2020-06-02-at-11.48.30

Before the 29th of May, all GerritHub.io nodes were on Gerrit v3.1 / Java 8. The old-generation JVM heap kept building up constantly until it reached 60GB and triggered a full GC cycle. After the upgrade to Gerrit v3.2 / Java 11, memory consumption is very much under control. There are still possibilities of peaks with associated full GCs (see the one on the 30th of May around 12:00 BST) but there is no build-up of old-generation objects anymore.

screenshot-2020-06-02-at-11.52.43

Java11 brings a lot of benefits also in reducing the latency of the individual GC cycles, showing much better performance with large heaps.
After the migration on the 29th of May, the GC graph is pretty much flat. The only full GC peak that is noticeable on the 30th of May lasted for just 5 msecs while the normal GC cycles are well below 1 msec, barely noticeable.

Performance is a feature

Shawn Pearce, the Gerrit Code Review project founder, used to say “performance is a feature”, which is very true. Any software nowadays can provide some basic features out of the box, thanks to the plethora of open-source components available. However, designing an architecture and making it scale and perform to the levels that an Enterprise Code Review system needs is not easy.

Gerrit v3.2 is yet another significant milestone in the continued effort of the Gerrit maintainers and contributors in making Gerrit Code Review faster, more stable and available than ever before.

Performance tuning isn’t a “one-off task” but a continuous improvement of thousands of little details, ranging from front-end JavaScript tuning down to the backend of the platform.

New accounts cache

From the data collected on googlesource.com, Patrick Hiesel (Google) has identified the loading of accounts from NoteDb as a significant cause of delay in backend calls. That is true for all Gerrit installations, but especially for distributed setups or setups that restart often.

Gerrit v3.2 introduces a brand-new AccountCache decomposed into smaller chunks that can be cached individually:

  1. External IDs + user name (cached in ExternalIdCache)
  2. CachedAccountDetails (newly cached)
  3. Gerrit’s default settings

CachedAccountDetails is a new class representing all information stored under the user’s ref (refs/users/<sharded-id>).

The new structure is cleverly designed to require a lot less I/O when an entry needs to be reloaded and to lower the cache-miss ratio when a user’s details are updated.

The new structure has the following advantages:

  1. CachedAccountDetails contains only details from refs/users/<sharded-id>. By that, we can use the SHA1 of that ref as cache key and start serializing the cache to eliminate cold start penalty as well as router assignment change penalty (for distributed setups). It also means that we don’t have to do invalidation ourselves anymore.
  2. When the server’s default preferences change, we don’t have to invalidate all accounts anymore.
  3. The projected speed improvements that come from persisting the cache make it possible to remove the logic to load accounts in parallel.
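The first advantage is worth illustrating: when the cache key is the SHA-1 that the user’s ref points to, an update to the account moves the ref to a new SHA-1, so stale entries are simply never looked up again. The class below is a conceptual sketch in Python, not Gerrit’s Java implementation; the class name, the loader callback, and the sample data are all hypothetical.

```python
class AccountDetailsCache:
    """Cache keyed by the SHA-1 of the user's ref (conceptual sketch)."""

    def __init__(self, load):
        self._load = load    # hypothetical loader: SHA-1 -> account details
        self._entries = {}
        self.loads = 0       # counts how often the backing store was read

    def get(self, ref_sha1):
        if ref_sha1 not in self._entries:
            self.loads += 1
            self._entries[ref_sha1] = self._load(ref_sha1)
        return self._entries[ref_sha1]


# Made-up data: two successive states of the same user's ref.
store = {"a1": {"name": "Ann"}, "b2": {"name": "Ann B."}}
cache = AccountDetailsCache(store.__getitem__)

cache.get("a1")   # first read: loaded from the store
cache.get("a1")   # second read: served from the cache
cache.get("b2")   # the ref moved after an update: new key, no invalidation
print(cache.loads)  # 2
```

Because an entry for a given SHA-1 can never become stale, it is also safe to persist the cache across restarts, which is where the cold-start benefit comes from.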

Migration to Polymer 3

The PolyGerrit UX roadmap continues with yet another important milestone: the migration to Polymer 3. The result is visible in a more polished GUI, a significant speedup of rendering, and reduced page loading times.
There is also a significant number of small refinements to the GUI, coming from the meticulous work of fixes included in this release.
Not surprisingly, the number of issues fixed in v3.2 on the PolyGerrit UX outnumbers by far the overall changes in the release notes.

gerrit-3.2-findings

PolyGerrit is giving special attention to the classification of the feedback coming from robots rather than humans.
Most of the efforts made in the past 12 months target improving the support for robot-comments and giving some extra dedicated space to them.
In Gerrit v3.2 there is a special place for them in a brand-new “Findings” tab. It is currently empty on GerritHub.io as people have not started using them much. However, I do see a lot of room for adoption of this new feature, giving the ability to integrate more linters and automatic validation feedback in this tab.

A flooding of fixes and small improvements

The list of fixes and improvements in Gerrit v3.2 is really huge. Please check the release notes on the Gerrit Code Review release page for all the details.

There are a lot of reasons to migrate to Gerrit v3.2, the fastest, most stable and most scalable release of Gerrit Code Review ever.


Thanks a lot to the whole Gerrit Code Review Community of maintainers and contributors for making this release happen. Thanks to Patrick Hiesel for the technical description of the account cache improvements and the replication clustering.

Luca Milanesio (GerritForge)
Gerrit Code Review Maintainer, Release Manager, ESC Member

 

How to enable Git v2 in Gerrit Code Review

git-2-26

(c) Shutterstock / spainter_vfx

Git protocol v2 landed in Gerrit 3.1 on the 11th of October 2019. This is the last email from David Ostrovsky concluding a thread of discussion about it:

It is done now. Git wire protocol v2 is a part of open source Gerrit and will be
shipped in upcoming Gerrit 3.1 release.

And, it is even enabled per default!

Huge thank to everyone who helped to make it a reality!

A big thanks to David and the whole community for the hard work in getting this done!

This was the 3rd attempt to get the feature in Gerrit after a couple of issues encountered along the path.

Why Git protocol v2?

The Git protocol v2 introduces a big optimization in the way client and server communicate during clones and fetches.

The big change has been the possibility of filtering server-side the refs not required by the client. In the previous version of the protocol, whenever a client was issuing a fetch, all the references were sent from the server to the client, even if the client was fetching a single ref!

In Gerrit this issue was even more evident since, as you might know, Gerrit makes heavy use of refs for its internal functionality, even more so since the introduction of NoteDb.

Whenever you are creating a Change in Gerrit you are updating/creating at least 3 refs:

  • refs/changes/NN/<change-num>/<patch-set>
  • refs/changes/NN/<change-num>/meta
  • refs/sequences/changes

In the Gerrit project itself, there are currently about 104K refs/changes and 24K refs/changes/*/meta refs. If you update a repo that is just a couple of commits behind, you will still receive all those references, which will take up most of your bandwidth.

Git protocol v2 avoids this by sending back only the references that the Git client requested.
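Conceptually, the server-side filtering is a prefix match on the ref names. The sketch below is a simplified model of that idea, not JGit or Git code: the function name is made up, and the prefixes mirror what a Git client typically sends when fetching master.

```python
def ls_refs(all_refs, ref_prefixes):
    """Return only the refs whose name starts with one of the prefixes."""
    return {
        name: sha
        for name, sha in all_refs.items()
        if any(name.startswith(prefix) for prefix in ref_prefixes)
    }


# A tiny sample of refs; a real Gerrit repository has many thousands.
refs = {
    "refs/heads/master": "d21ee1980f6db7a0845e6f9732471909993a205c",
    "refs/changes/00/100000/1": "07c8a169d6341c586a10163e895973f1bdccff92",
    "refs/changes/00/100000/meta": "0014ca6443ac0af338e2677b45e538782bb7a12e",
}

# Prefixes a client would send for "fetch origin master".
prefixes = [
    "master",
    "refs/master",
    "refs/tags/master",
    "refs/heads/master",
    "refs/remotes/master",
    "refs/remotes/master/HEAD",
]

print(ls_refs(refs, prefixes))
# {'refs/heads/master': 'd21ee1980f6db7a0845e6f9732471909993a205c'}
```

With the previous protocol version there is no such filter, so the equivalent of the whole `refs` dictionary is advertised on every fetch.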

Is it really faster?

Let’s see if it really does what it says on the tin. We enabled Git protocol v2 on GerritHub.io at the end of 2019, so let’s test it there. You will need a Git client from version 2.18 onwards.

> git clone "ssh://barbasa@review.gerrithub.io:29418/GerritCodeReview/gerrit"
> cd gerrit
> export GIT_TRACE_PACKET=1
> git -c protocol.version=2 fetch --no-tags origin master
19:16:34.583720 pkt-line.c:80           packet:        fetch< version 2
19:16:34.585050 pkt-line.c:80           packet:        fetch< ls-refs
19:16:34.585064 pkt-line.c:80           packet:        fetch< fetch=shallow
19:16:34.585076 pkt-line.c:80           packet:        fetch< server-option
19:16:34.585084 pkt-line.c:80           packet:        fetch< 0000
19:16:34.585094 pkt-line.c:80           packet:        fetch> command=ls-refs
19:16:34.585107 pkt-line.c:80           packet:        fetch> 0001
19:16:34.585116 pkt-line.c:80           packet:        fetch> peel
19:16:34.585124 pkt-line.c:80           packet:        fetch> symrefs
19:16:34.585133 pkt-line.c:80           packet:        fetch> ref-prefix master
19:16:34.585142 pkt-line.c:80           packet:        fetch> ref-prefix refs/master
19:16:34.585151 pkt-line.c:80           packet:        fetch> ref-prefix refs/tags/master
19:16:34.585160 pkt-line.c:80           packet:        fetch> ref-prefix refs/heads/master
19:16:34.585168 pkt-line.c:80           packet:        fetch> ref-prefix refs/remotes/master
19:16:34.585177 pkt-line.c:80           packet:        fetch> ref-prefix refs/remotes/master/HEAD
19:16:34.585186 pkt-line.c:80           packet:        fetch> 0000
19:16:35.052622 pkt-line.c:80           packet:        fetch< d21ee1980f6db7a0845e6f9732471909993a205c refs/heads/master
19:16:35.052687 pkt-line.c:80           packet:        fetch< 0000
From ssh://review.gerrithub.io:29418/GerritCodeReview/gerrit
 * branch                  master     -> FETCH_HEAD
19:16:35.175324 pkt-line.c:80           packet:        fetch> 0000

> git -c protocol.version=1 fetch --no-tags origin master
19:16:57.035135 pkt-line.c:80           packet:        fetch< d21ee1980f6db7a0845e6f9732471909993a205c HEAD\0 include-tag multi_ack_detailed multi_ack ofs-delta side-band side-band-64k thin-pack no-progress shallow agent=JGit/unknown symref=HEAD:refs/heads/master
19:16:57.037456 pkt-line.c:80           packet:        fetch< 07c8a169d6341c586a10163e895973f1bdccff92 refs/changes/00/100000/1
19:16:57.037489 pkt-line.c:80           packet:        fetch< 0014ca6443ac0af338e2677b45e538782bb7a12e refs/changes/00/100000/meta
19:16:57.037502 pkt-line.c:80           packet:        fetch< b4af8cad4d3982a0bba763a5e681d26078da5a0e refs/changes/00/100400/1
19:16:57.037513 pkt-line.c:80           packet:        fetch< 9ec6e507c493f4f1905cd090b47447e66b51b7e1 refs/changes/00/100400/meta
19:16:57.037523 pkt-line.c:80           packet:        fetch< a80359367529288eea3c283e7d542164bced1e2f refs/changes/00/100800/1
19:16:57.037533 pkt-line.c:80           packet:        fetch< 170cced6d81c25d1082d95e50b37883e113efd01 refs/changes/00/100800/meta
19:16:57.037544 pkt-line.c:80           packet:        fetch< 6cb616e0ad4b3274d4b728f8f7b641b6bd22dce4 refs/changes/00/100900/1
19:16:57.037554 pkt-line.c:80           packet:        fetch< 286d1ee1574127b76c4c1a6ef0f918ad4c61953a refs/changes/00/100900/meta
19:16:57.037606 pkt-line.c:80           packet:        fetch< 312ba566d2620b43fb90be3e7c406949edf6b6d9 refs/changes/00/10100/1
19:16:57.037619 pkt-line.c:80           packet:        fetch< dde4b73cb011178584aae4fb29a528018149d20b refs/changes/00/10100/meta

…. This will go on forever …. 

As you can see there is a massive difference in the data sent back on the wire!
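To make the v2 trace easier to read, here is a rough sketch of how the client's ls-refs request is framed on the wire. It assumes the standard pkt-line framing (four hex digits giving the length of the line including those four bytes, `0000` as flush packet, `0001` as delimiter packet); the helper names are invented for this example.

```python
FLUSH_PKT = b"0000"  # ends a request or a response
DELIM_PKT = b"0001"  # separates the command section from its arguments


def pkt_line(payload: str) -> bytes:
    """Frame one line: 4 hex digits of total length, then the data."""
    data = payload.encode("utf-8") + b"\n"
    return f"{len(data) + 4:04x}".encode("ascii") + data


def ls_refs_request(prefixes) -> bytes:
    """Build an ls-refs request like the one visible in the trace."""
    parts = [pkt_line("command=ls-refs"), DELIM_PKT]
    parts += [pkt_line("peel"), pkt_line("symrefs")]
    parts += [pkt_line(f"ref-prefix {prefix}") for prefix in prefixes]
    parts.append(FLUSH_PKT)
    return b"".join(parts)


if __name__ == "__main__":
    print(pkt_line("command=ls-refs"))  # b'0014command=ls-refs\n'
    print(ls_refs_request(["refs/heads/master"]))
```

The whole request is a few hundred bytes, and the answer contains only the matching refs, instead of one line for each of the refs stored on the server.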

How to enable it?

If you want to enable it, you just need to set the protocol version in your git config (etc/jgit.config in 3.1 and $HOME/.gitconfig in previous versions) and restart your server:

[protocol]
  version = 2

Enjoy your new blazing fast protocol!

If you are interested in more details about the Git v2 protocol you can find the specs here.

Fabio Ponciroli (GerritForge)
Gerrit Code Review Contributor

Crunch Code Review hashtags with Gerrit DevOps Analytics

We have already discussed in previous posts how important it is to speed up the feedback loop in your Software Development Lifecycle. Having early feedback gives you the chance to evaluate your hypotheses and change direction if needed.

The more information you have, the smarter your decisions can be.

We recently added to Gerrit DevOps Analytics (GDA) the ability to extract data from Code Review metadata, extending the knowledge we can get out of Gerrit.

Furthermore, it is possible to extract metadata from repositories not necessarily hosted on the Gerrit instance running the analytics processing. This is a big improvement since it allows you to fully analyse repositories coming from any Gerrit server.

For example, the Gerrit analytics we provide on https://analytics.gerrithub.io come from the Gerrit repository hosted on gerrit-review.googlesource.com, the Gerrit server hosted by Google.

Hashtags aggregation

One important type of meta-data contained in the Code Reviews is the hashtag.

Hashtags are freeform strings associated with a change, like on social media platforms. In Gerrit, you explicitly associate hashtags with changes using a dedicated area of the UI; they are not parsed from commit messages or comments.

Similar to topics, hashtags can be used to group related changes together and to search using the hashtag: operator. Unlike topics, a change can have multiple hashtags, and they are only used for informational grouping; changes with the same hashtags are not necessarily submitted together.
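The same hashtag: operator is available through the Gerrit REST API. The following is a minimal sketch, assuming the standard /changes/ query endpoint and the `)]}'` prefix that Gerrit puts in front of its JSON responses; the helper names and the sample response body are made up for illustration, and no network call is performed.

```python
import json
from urllib.parse import urlencode

XSSI_PREFIX = ")]}'"  # Gerrit prepends this line to every JSON response


def hashtag_query_url(base_url: str, hashtag: str, limit: int = 25) -> str:
    """Build the URL that searches changes carrying the given hashtag."""
    query = urlencode({"q": f"hashtag:{hashtag}", "n": limit})
    return f"{base_url.rstrip('/')}/changes/?{query}"


def parse_gerrit_json(body: str):
    """Strip the anti-XSSI prefix, then decode the JSON payload."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


if __name__ == "__main__":
    print(hashtag_query_url("https://gerrit-review.googlesource.com",
                            "palo-alto-2018"))
    # Hypothetical response body, trimmed to the fields used here.
    sample = ")]}'\n" + json.dumps(
        [{"_number": 1, "subject": "Example", "hashtags": ["palo-alto-2018"]}]
    )
    for change in parse_gerrit_json(sample):
        print(change["_number"], change["hashtags"])
```

To run it against a real server, fetch the URL with any HTTP client and pass the response text to `parse_gerrit_json`.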

You can use them, for example, to mark and easily search all the changes blocking a particular release:

Screenshot 2020-05-12 at 10.43.00.png

Hashtags can also be used to aggregate all the changes people have been working on during a particular event, for example, the Gerrit User Summit 2019 hackathon:

Screenshot 2020-05-12 at 10.44.39.png

The latest version of the Gerrit Analytics plugin exposes the hashtags attached to their respective Git commit data. Let’s explore some use cases together:

The most popular Gerrit Code Review hashtags over the last 12 months

Screenshot 2020-05-12 at 10.47.49.png

Throughput of changes created during an event

See, for example, the Palo Alto hackathon (#palo-alto-2018). At the end of the week we can see the spike of changes made to release Gerrit 2.16.

Screenshot 2020-05-12 at 10.56.44.png

The extent of time for a feature

Removing GWT was an extensive effort which started in 2017 and ended in 2019. It took several hackathons to tackle the removal as shown by the hashtags distribution. Some changes were started in one hackathon and finalised in the next one.

Screenshot 2020-05-12 at 11.05.51.png

Those were some examples of the useful information you can get by leveraging the power of GDA.

The above examples are taken from the GDA dashboard provided and hosted by GerritForge on https://analytics.gerrithub.io, which mirrors commits and reviews on a regular basis from the Gerrit project and its plugin ecosystem.

How to set up GDA on Gerrit

Hashtag extraction is currently available from Gerrit 3.1 onwards. You can download the latest version released from the Gerrit CI.

To enable hashtag extraction you need to enable the feature in the plugin config file as follows:

# analytics.config
[contributors]
  extract-hashtags = true

For more information on how to configure and run the plugin, look at the analytics plugin documentation.

Conclusion

Data is the goldmine of your company. You need more and more of it to make smarter decisions. The latest version of GDA allows you to leverage even more of the data produced during the code review process.

You can explore the potential of the information held in Gerrit on the analytics dashboard provided by GerritForge on analytics.gerrithub.io.

If you would like your open source Gerrit-hosted project to be added to our dashboard, or need help setting up and supporting GDA for your organization, get in touch with the GerritForge Sales Team and we can help you make smarter decisions today.