14 years of JGit/EGit Code Reviews migrated to GerritHub

21 November 2023 (Sunnyvale, CA) – GerritForge Inc., the leader in Gerrit Code Review Enterprise Support, has successfully re-hosted the Eclipse JGit/EGit projects on GerritHub.io, preserving 14 years of the repository history, including all changes, reviews and comments. Everything that was produced and historically available on the https://git.eclipse.org/r website is now fully available on https://eclipse.gerrithub.io.

From repo.or.cz to Eclipse

Shawn Pearce (RIP) started the JGit project back in 2006 on repo.or.cz and later joined Google in 2008 where he was given the task to adapt the Gerrit Rietveld Code Review tool for the development of the Android Operating System.

Later in 2009 Shawn started the dogfooding practice by also re-hosting the project on a Gerrit Code Review instance, kindly offered to the Eclipse Foundation as self-hosting of the Eclipse plugin for Git (i.e. EGit) and its 100% pure Java implementation of the Git protocol and data format (i.e. JGit). The URL of the self-hosted dogfooding Gerrit instance was https://egit.eclipse.org, which was later exposed as https://git.eclipse.org/r.

Here is the first Gerrit change, https://git.eclipse.org/r/c/egit/egit/+/1, hosted on the first Gerrit Code Review server, which Shawn Pearce and Matthias Sohn ran themselves on a vserver provided by the Eclipse Foundation.

Since then, the Gerrit Code Review project has massively evolved, and Google adopted the tool for all its Open-Source projects in a highly available multi-site and multi-domain setup across the globe. Noteworthy examples are https://gerrit-review.googlesource.com, https://android-review.googlesource.com and https://chromium-review.googlesource.com.

Project growth on Eclipse

The Eclipse Foundation started to encourage all of its projects to adopt Gerrit Code Review, which became the main hub where all the other Open-Source components and contributors were uploading their code and collaborating.

Today, the https://git.eclipse.org/r site hosts over 1300 repositories and tens of thousands of contributors and reviewers.

The risks of the announced shutdown 

The Eclipse Foundation started looking at more comprehensive hosting solutions well beyond pure Git hosting and associated Code Review, including GitHub and GitLab and started using them side-by-side with their existing https://git.eclipse.org/r.
In November 2021, the organisation decided to shut down the Gerrit Code Review instance, offering migration of the projects to either GitHub or GitLab as the alternatives.

Although both GitHub and GitLab would have offered to keep the code history of all projects, the review information would have been completely lost. Gerrit Code Review has a JSON format (code-named NoteDb) for storing all the review comments together with the repository so that code and review meta-data can be kept safe in the same place. However, GitHub and GitLab have a more traditional relational DBMS approach and would have been unable to render Gerrit’s NoteDb.

If the projects had migrated to GitHub or GitLab, three main issues would have arisen:

  1. All the review history would have been formally accessible in the repository but not visible on the GitHub or GitLab UI.
  2. All associations between the NoteDb data and the committers’ identity would have been lost.
  3. New reviews of the code developed on GitHub or GitLab UI would have been stored on a server-side relational DBMS.

GerritForge offers to rescue 14 years of review data

GerritForge, the largest contributor to the Gerrit Code Review project outside of Google, leader of the Gerrit Code Review Enterprise Support, launched a new dogfooding project called GerritHub.io back in 2013 with the aim of providing the richer Code Review experience of Gerrit on top of every GitHub repository.

The main goal of GerritHub.io was to enable anyone who has a public or private repository on GitHub to use Gerrit Code Review on top of their existing data. All the authentication, authorisation and publishing of the repository stay on GitHub, whilst GerritHub.io provides the Code Review and collaboration experience.

Because the Eclipse Foundation offered GitHub as one of the alternatives to https://git.eclipse.org/r, GerritHub.io was the most likely candidate to achieve a win-win situation:

  1. The Eclipse Foundation’s win: they have been able to shut down https://git.eclipse.org/r and save on hosting and maintenance costs.
  2. The projects’ win: all their repositories would have been moved to GitHub, and all existing 14 years of review history and new reviews would be accessible through GerritHub.io.

The migration project from git.eclipse.org/r to eclipse.gerrithub.io

The migration journey started six months ago, when Matthias Sohn, the project leader of JGit and EGit, announced on the Eclipse Foundation issue tracker that he was planning to use GerritHub.io as Code-Review frontend for his migrated projects in GitHub.

The project was made possible thanks to the introduction of the “importing feature” in Gerrit v3.7, where projects can be moved between Gerrit instances while keeping their change numbers, account identity mappings and all associated review data.

Using existing GitHub projects on GerritHub.io is straightforward, and anyone can get started in a matter of minutes; however, the Eclipse Foundation case was more complex because of multiple additional requirements:

Last but not least, the migration from https://git.eclipse.org/r to https://eclipse.gerrithub.io needed to be completed with zero downtime and minimal disruption for the existing committers and contributors to the project. Therefore, a classic “big-bang” migration with a planned outage was not an option.

Gerrit multi-site and the enablement of smooth migration paths

Gerrit Code Review has been multi-site at Google for many years, but that deployment was limited to the forked version hosted in Google’s data centres.
GerritForge and the rest of the Open-Source community have invested a lot into publicly available multi-site support since 2018, and it is currently able to provide an equivalent solution on a standard infrastructure, leveraging a global-refdb and events-broker off-the-shelf.

Being multi-site means that the “logical domain” (e.g. eclipse.gerrithub.io), instead of being served by a set of hosts in a single data centre, can point to different locations across the globe, all active at the same time and all accepting read/write operations such as Git push, clone, fetch and code reviews. The full design of the solution is available on the multi-site plugin repository.

When two users are pushing code at the same time to two different sites, Gerrit will check the destination refs against the SHA1 stored in the global-refdb and will coordinate the transactions to avoid ending up in a split-brain situation. Synchronisation between sites is achieved using the pull-replication plugin.
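
The coordination described above can be pictured as a compare-and-swap on the ref's SHA1. The following is only a minimal sketch under stated assumptions: `GlobalRefDb`, `compare_and_put` and `push` are hypothetical names for illustration and are not the API of the multi-site plugin, and the shared store is simulated in memory.

```python
import threading


class GlobalRefDb:
    """Hypothetical in-memory stand-in for a shared ref database."""

    def __init__(self):
        self._refs = {}
        self._lock = threading.Lock()

    def get(self, project, ref):
        with self._lock:
            return self._refs.get((project, ref))

    def compare_and_put(self, project, ref, expected_sha1, new_sha1):
        """Update the ref only if it still points at expected_sha1."""
        with self._lock:
            if self._refs.get((project, ref)) != expected_sha1:
                return False
            self._refs[(project, ref)] = new_sha1
            return True


def push(refdb, project, ref, local_sha1, new_sha1):
    """Accept a push only when the site's view of the ref is up to date."""
    if refdb.compare_and_put(project, ref, local_sha1, new_sha1):
        return "accepted"
    return "rejected: site is behind the global-refdb, sync and retry"
```

With two sites that both believe the branch is at the same SHA1, only the first push wins the swap; the second site has to catch up (for example through replication) before its push can be accepted.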

Gerrit Code Review is designed to be future-proof, thanks to a clear separation and contract between the front end and the backend REST-API. That allows a smooth blue-green migration between releases because every release of Gerrit is forward and backwards compatible with its next release +1. For example, GerritHub.io is running two different versions of Gerrit Code Review on different sites as we speak: v3.8.2 in the US and Canada (https://review-am.gerrithub.io) and v3.9.0-rc5 in Europe (https://review-eu.gerrithub.io), without anyone noticing any disruption. Each site progresses towards newer releases bi-weekly whilst the overall service remains active.

Project-based migration from git.eclipse.org to eclipse.gerrithub.io

Gerrit projects include all the commits and meta-data in the same repository and, therefore, have the perfect design to allow an easy migration between servers. However, there are some gotchas:

  • Every Gerrit server has a server-id associated with it, which is used to “tag” every change. That prevents Gerrit from parsing and indexing data that does not necessarily belong to the server.
  • Every NoteDb meta-data record is strictly decoupled from any Personally Identifiable Information (aka PII), including the full name and e-mails of the authors, committers, owners and reviewers of the changes under review. The lookup between the anonymised identity (aka account-id) and the PII is contained in a centralised repository called ‘All-Users.git’, which isn’t accessible.
  • Every change has a unique incremental number associated with it, the change number. The numbering sequence is unique per Gerrit server, but when moving projects between different servers, you may have numbering conflicts.

Luca Milanesio and Matthias Sohn, both maintainers of the Gerrit Code Review project, have cooperated to find solutions to all three problems and have included them in Gerrit v3.7 onwards.

GerritForge has configured the server ID of git.eclipse.org as an “external imported server ID” so that every project coming from the Eclipse Foundation can be parsed and indexed. Its review metadata is rendered on the UI.

The identities are mapped using the public REST-API https://git.eclipse.org/r/accounts/NN/detail, which allows the association of GerritHub users with the legacy Eclipse Foundation account IDs matched by e-mail address.
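
As an illustration of the e-mail matching, here is a minimal sketch that parses account details as returned by the Gerrit REST-API and builds a legacy-to-local account ID table. The function names and the shape of `local_accounts_by_email` are assumptions made for this example; the `)]}'` prefix and the `_account_id` and `email` fields are part of Gerrit's JSON responses. No network call is made: the response body is passed in as a string.

```python
import json

# Gerrit prepends this prefix to JSON responses to prevent XSSI attacks.
XSSI_PREFIX = ")]}'"


def parse_gerrit_json(body):
    """Strip the XSSI-protection prefix and decode the JSON payload."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def map_accounts(legacy_details, local_accounts_by_email):
    """Map legacy account IDs to local account IDs, matching on e-mail.

    local_accounts_by_email is a hypothetical dict of lower-cased e-mail
    address -> local account ID; accounts without a match are skipped.
    """
    mapping = {}
    for detail in legacy_details:
        email = (detail.get("email") or "").lower()
        local_id = local_accounts_by_email.get(email)
        if local_id is not None:
            mapping[detail["_account_id"]] = local_id
    return mapping
```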

With regard to the change numbers, the legacy sequence numbers coming with https://git.eclipse.org/r are in conflict with the changes on GerritHub.io; see, for example, https://review.gerrithub.io/5819 and https://git.eclipse.org/r/5819, both valid change numbers but pointing to different projects on different servers.
GerritForge has developed a new ad-hoc plugin to allow existing URLs, previously pointing to https://git.eclipse.org/r, to continue to work as expected on the projects migrated to eclipse.gerrithub.io.
The plugin has a full list of the legacy URLs on https://git.eclipse.org/r and performs the correct redirect to the full equivalent project / change on eclipse.gerrithub.io.
For example, https://git.eclipse.org/r/5819 and https://eclipse.gerrithub.io/5819 are both referring to the same Change-Id:Iff84409c of the JGit project.
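
The redirect logic can be sketched roughly as follows. This is not the plugin's code: the lookup table, the function name and the project name `eclipse-jgit/jgit` are assumptions for illustration, and only the simplest legacy URL form (ending in the change number) is handled.

```python
from urllib.parse import urlparse

LEGACY_HOST = "git.eclipse.org"
NEW_BASE = "https://eclipse.gerrithub.io"

# Hypothetical lookup table: legacy change number -> migrated project name.
MIGRATED_CHANGES = {5819: "eclipse-jgit/jgit"}


def redirect_target(legacy_url):
    """Return the new URL for a legacy change URL, or None if not migrated."""
    parsed = urlparse(legacy_url)
    if parsed.hostname != LEGACY_HOST or not parsed.path.startswith("/r/"):
        return None
    # The change number is taken from the last path segment.
    last_segment = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if not last_segment.isdigit():
        return None
    number = int(last_segment)
    project = MIGRATED_CHANGES.get(number)
    if project is None:
        return None
    return f"{NEW_BASE}/c/{project}/+/{number}"
```

Because the change number is preserved by the import, the table only needs to tell which project a legacy number belongs to.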

eclipse.gerrithub.io as a Gerrit Code Review multi-tenant domain

Gerrit Code Review has secretly supported multi-tenant domains for over a decade; however, that was implemented using a private fork implemented at Google and only in their data centres, as Patrick Hiesel presented at the Gerrit User Summit 2017 in London.

The Open-Source version does not have support for multi-tenancy in the Gerrit core. However, I developed a minimalistic solution six years ago that would give the “user experience” of virtual hosting on Gerrit.
The idea behind the solution is quite simple: hide unwanted projects based on the full domain name, much like virtual hosts work in the HTTP server world.

For example, you could define eclipse.gerrithub.io as follows:

 [server "eclipse.gerrithub.io"]
  projects = eclipse-jgit/*
  projects = eclipse-egit/*
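
To make the filtering idea concrete, here is a small sketch of how such a configuration could be applied. It is not the virtual-host libmodule's code: the dictionary, the function name and the use of shell-style wildcards are assumptions for this example.

```python
from fnmatch import fnmatchcase

# Mirrors the configuration above: hostname -> visible project patterns.
VIRTUAL_HOSTS = {
    "eclipse.gerrithub.io": ["eclipse-jgit/*", "eclipse-egit/*"],
}


def visible_projects(hostname, all_projects):
    """Return the projects that should be listed for the given hostname.

    Hostnames without a virtual-host section see every project, which is
    an assumption of this sketch.
    """
    patterns = VIRTUAL_HOSTS.get(hostname)
    if patterns is None:
        return list(all_projects)
    return [
        project
        for project in all_projects
        if any(fnmatchcase(project, pattern) for pattern in patterns)
    ]
```

The filter only decides what is listed for a domain; it does not move or copy any repository.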

Shawn himself was stunned when he saw the source code of the virtual-host libmodule back in 2017, with the comment “how did I end up writing so much code, if you did everything in just 7 Java classes?”

To be fair, the solution Shawn implemented on review-*.googlesource.com was a lot more comprehensive than the virtual-host libmodule, because it also included the ability to have different gerrit.config per tenant, whilst the solution implemented on GerritHub.io is a simple extra permission filter applied based on the domain name.

That means that all the Eclipse repositories are effectively available on any of the GerritHub.io sites and also accessible with the main domain URL https://review.gerrithub.io; the filtering on the virtual-host is a pure visibility setting that keeps the users coming from the Eclipse Foundation from being overwhelmed by the other 50k projects hosted on GerritHub.io.

The advantage is that all the current GerritHub.io sites replicate the Eclipse Foundation’s repositories, therefore providing additional redundancy to the overall setup. All commits pushed to any of the repositories on eclipse.gerrithub.io will also be replicated to all sites, including the ones NOT starting with eclipse.gerrithub.io. Thanks to this redundancy, all the projects hosted on GerritHub.io can benefit from an astonishing 99.997% availability, well above any other free Git hosting sites for Open-Source available right now.
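
To put the availability figure in perspective, it can be converted into allowed downtime. The helper below is plain arithmetic, assuming a 365-day year and a 30-day month; the function name is made up for this example.

```python
def allowed_downtime_minutes(availability_percent, period_days):
    """Minutes of downtime allowed in a period at a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_days * 24 * 60


# 99.997% availability leaves roughly 15.8 minutes of downtime per year,
# or about 1.3 minutes in a 30-day month.
yearly = allowed_downtime_minutes(99.997, 365)
monthly = allowed_downtime_minutes(99.997, 30)
```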

What’s next for the other 1,300 repositories on git.eclipse.org?

The work done to migrate the JGit and EGit projects to https://eclipse.gerrithub.io lays the groundwork for many more repositories and projects that want to keep their review history to follow the same path before the Eclipse Foundation shuts down the legacy git.eclipse.org site.
The scope definition, the user accounts association, and the provision of the users and projects are going to be exactly the same for any other project that wants to move to keep its history.

Once all the projects are migrated, the Eclipse Foundation can define a redirection rule that serves all the incoming requests to https://git.eclipse.org/r and redirects them to https://eclipse.gerrithub.io.

Lessons learnt and takeaway for other migrations

Migrating projects between Gerrit instances was declared impossible just a few years ago; however, that was the end goal of the whole Gerrit NoteDb project. Shawn Pearce used to say that he “would like to make all his reviews locally on his laptops and just push code and reviews once they landed”, making the Code Review an integral part of the Git data format.

The success of this migration project is the demonstration that Shawn’s vision was really innovative and, thanks to the cooperation of the community, projects can last and persevere well beyond the boundaries and lifetime of the people who initially founded them.

Migrating projects and consolidating Gerrit Servers is not something that is only applicable to this example of the Eclipse Foundation server shutdown, but can be further applied to other domains and use cases.
Companies are constantly changing, splitting and merging; projects need to follow the organisation and also move between Gerrit Servers and domains.

All the innovations introduced in Gerrit v3.7 and beyond can serve as an example of the implementation of a different migration path compared to the traditional big-bang approach.

One important lesson from the Eclipse Foundation’s experience is that every migration comes with many little but important details: all of them need accurate evaluation, implementation and testing. Upfront planning is needed; however, many more details are often found along the migration path, making it difficult to estimate all the associated effort and cost correctly. Migrating is like daily exercise: the first round feels lengthy and challenging, but the following rounds can reuse the tools and experience gained in the previous migrations.

Lastly, this exercise has shown how important it is to keep the project’s history for planning its future. It would have been unthinkable for the JGit/EGit projects to continue developing without being able to leverage the learnings, discussions and experience from the past.

“The Code Review history is our legacy; learning from our past gives us direction for our future.”

Luca Milanesio
GerritForge, Inc. – CEO and CTO
Gerrit Code Review Maintainer
Gerrit Release Manager
Member of the Gerrit Engineering Steering Committee

What’s new in Gerrit Code Review v3.2

Gerrit Code Review is unstoppable: despite the recent COVID-19 pandemic and the cancellation of the Spring Hackathon 2020, the community has made an extraordinary effort to deliver remotely and on-time the Gerrit v3.2 release on the 1st of June.

GerritForge has already migrated GerritHub.io on day 1 of the release and is happy to share with you the highlights of this new release. If you need help to assess your current setup and migrate, please get in touch with us at https://gerritforge.com/contact.

Get ready to migrate: get rid of zombie comments

The migration process performs the cleanup of the zombie draft comments in the All-Users.git repository that have been left behind since the introduction of NoteDb back in v2.16.
Every user commenting on any change was creating a series of commits on the All-Users.git repository, where the draft comments are stored. Once the comments were finalised and applied to the change, they were not fully removed from the All-Users.git. That created a backlog of zombie comments on All-Users.git that are now being completely removed during the Gerrit v3.2 migration process.

Since Gerrit v2.16.16, there is a standalone utility to remove the zombie draft comments. You may want to do that operation upfront to make sure that the migration to v3.2 does not have a lot of processing during the init step. Also, make sure that the All-Users.git resides on a fast access local filesystem for minimizing the migration time.

If you do nothing, the cleanup utility will be automatically executed when migrating to Gerrit v3.2, bearing in mind that it may take quite a long time to complete. In our tests, it took around 10 minutes for 10k zombie comments.

WARNING: the execution time is not linear and it may take up to 48h of processing time for a staggering number of 1M zombie comments.

Migrate with zero-downtime

If you are on Gerrit v3.1.x in a high-availability configuration, you can upgrade seamlessly to Gerrit v3.2, without having to suspend or degrade the service in any way. GerritForge has a record number of installations done in high-availability and multi-site: if you are running a single Gerrit master today, you should get in touch with the GerritForge Team to help you move to high-availability.

For the very first time, the whole Gerrit Community can benefit from the ability to perform a rolling upgrade without any downtime.

The zero-downtime upgrade consists of the following steps:

  1. Have Gerrit masters upgraded to v3.1.6 (or later) in a high-availability configuration, healthy and able to handle the incoming traffic properly.
  2. Set gerrit.experimentalRollingUpgrade to true in gerrit.config on both Gerrit masters.
  3. Set the first Gerrit master unhealthy.
  4. Shut down the first Gerrit master and then upgrade to v3.2.
  5. Start up the first Gerrit master and wait for the on-line reindex to complete.
  6. Verify that the first Gerrit master is working correctly and then make it healthy again.
  7. Wait for the first Gerrit master to start serving traffic regularly.
  8. Repeat steps 3 to 7 for the second Gerrit master.
  9. Remove gerrit.experimentalRollingUpgrade from gerrit.config on both Gerrit masters.
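
The ordering constraint behind steps 3 to 8 is that a master is only taken out of service while another healthy one can absorb the traffic. The sketch below simulates that loop on a purely hypothetical model (a list of dicts); it does not call any Gerrit API, and steps 2 and 9 (setting and removing the configuration flag) are left out.

```python
def rolling_upgrade(masters, target_version):
    """Upgrade masters one at a time and return a log of the steps taken.

    `masters` is a list of dicts with "name", "version" and "healthy" keys
    (a hypothetical model used only for this illustration).
    """
    log = []
    for master in masters:
        others_healthy = any(m["healthy"] for m in masters if m is not master)
        if not others_healthy:
            raise RuntimeError("no other healthy master to take the traffic")
        # Step 3: take the master out of the load balancer rotation.
        master["healthy"] = False
        log.append(f"{master['name']}: marked unhealthy")
        # Steps 4-5: shut down, upgrade, start up, wait for the reindex.
        master["version"] = target_version
        log.append(f"{master['name']}: upgraded to {target_version}")
        # Steps 6-7: verify and put the master back into rotation.
        master["healthy"] = True
        log.append(f"{master['name']}: healthy again")
    return log
```

Run against a single master, the function refuses to proceed, which mirrors the requirement of a high-availability setup for a zero-downtime upgrade.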

NOTE: Gerrit v3.1.6 has not been released yet. However, if you want to perform a rolling upgrade today, you can download the latest build on the stable-3.1 branch from the GerritForge’s CI at https://gerrit-ci.gerritforge.com/job/Gerrit-bazel-stable-3.1/

GerritHub.io has been successfully upgraded on the 1st of June without any interruption of any kind using the above procedure.

Java 11 official support

Gerrit is now officially supported on Java 11, in addition to Java 8. Running on Java 11 was already possible from v2.16.13, v3.0.4 and v3.1.0, but not officially supported because of the lack of a CI validation on Java 11 for stable-2.16, stable-3.0 and stable-3.1 branches.

Gerrit v3.2 has been validated with Java 11, with the following known issues:

  • Issue 11567: Java 11 runtime & startTLS LDAP broken: ‘error code 8 – BindSimple: Transport encryption’.
  • Issue 12639: WARNING: An illegal reflective access operation has occurred, when starting Gerrit.

After 24h of adoption of Gerrit v3.2 on GerritHub.io, we have seen two major benefits from the migration to Java 11: overall reduction of the “old generation” build up in the JVM heap and massive reduction of GC cycles times and full-GCs.

[Screenshot: JVM old-generation heap usage on the GerritHub.io nodes]

Before the 29th of May, all GerritHub.io nodes were on Gerrit v3.1 / Java8. The old-generation JVM heap kept building up until it reached 60GB and triggered a full GC cycle. After the upgrade to Gerrit v3.2 / Java11, memory consumption is very much under control. There are still possibilities of peaks with associated full GCs (see the one on the 30th of May around 12:00 BST) but there is no longer any build-up of old-generation objects.

[Screenshot: GC cycle times on the GerritHub.io nodes]

Java11 brings a lot of benefits also in reducing the latency of the individual GC cycles, showing much better performance with large heaps.
After the migration on the 29th of May, the GC graph is pretty much flat. The only full GC peak that is noticeable on the 30th of May lasted for just 5 msecs while the normal GC cycles are well below 1 msec, barely noticeable.

Performance is a feature

Shawn Pearce, the Gerrit Code Review project founder, used to say “performance is a feature”, which is very true. Any software nowadays can provide some basic out of the box features, thanks to the plethora of open-source components available. However, designing an architecture and making it scale and perform to the levels that an Enterprise Code Review system needs is not easy.

Gerrit v3.2 is yet another significant milestone in the continued effort of the Gerrit maintainers and contributors in making Gerrit Code Review faster, more stable and available than ever before.

Performance tuning isn’t a “one-off task” but is a continuous improvement on thousands of little details ranging from the front-end javascript tuning down to the backend of the platform.

New accounts cache

From the data collected on googlesource.com Patrick Hiesel (Google) has identified the accounts loading from NoteDb as a significant cause of the delay of backend calls. That is true for all Gerrit installations, but especially for distributed setups or setups that restart often.

Gerrit v3.2 introduces a brand-new AccountCache decomposed into smaller chunks that can be cached individually:

  1. External IDs + user name (cached in ExternalIdCache)
  2. CachedAccountDetails (newly cached)
  3. Gerrit’s default settings

CachedAccountDetails is a new class representing all information stored under the user’s ref (refs/users/<sharded-id>).

The new structure is cleverly designed to require a lot less I/O when an entry needs to be reloaded and to lower the cache-miss ratio when a user’s details are updated.

The new structure has the following advantages:

  1. CachedAccountDetails contains only details from refs/users/<sharded-id>. By that, we can use the SHA1 of that ref as cache key and start serializing the cache to eliminate cold start penalty as well as router assignment change penalty (for distributed setups). It also means that we don’t have to do invalidation ourselves anymore.
  2. When the server’s default preferences change, we don’t have to invalidate all accounts anymore.
  3. The projected speed improvements that come from persisting the cache make it possible to remove the logic to load accounts in parallel.
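
The first advantage is easier to see with a toy example: when the SHA1 of the user's ref is part of the cache key, any update of the ref produces a new key, so stale entries are never served and no explicit invalidation is needed. The class below is a hypothetical sketch and not Gerrit's AccountCache implementation; the loader callback and the `loads` counter exist only for illustration.

```python
class AccountDetailsCache:
    """Hypothetical cache keyed by (account id, SHA1 of the user's ref)."""

    def __init__(self, loader):
        # loader(account_id, ref_sha1) is called on a cache miss.
        self._loader = loader
        self._entries = {}
        self.loads = 0

    def get(self, account_id, ref_sha1):
        key = (account_id, ref_sha1)
        if key not in self._entries:
            self._entries[key] = self._loader(account_id, ref_sha1)
            self.loads += 1
        return self._entries[key]
```

Entries for older SHA1s simply stop being requested; a real cache would evict them through its size or age policy, which is not shown here. Since the key is derived from immutable data, the entries can also be persisted and reused after a restart.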

Migration to Polymer 3

PolyGerrit UX roadmap continues with yet another important milestone: the migration to Polymer 3. The result is visible with an improved polishing of the GUI and significant speedup of rendering and reduction of page loading times.
There is also a significant number of small refinements to the GUI, coming from the meticulous work of fixes included in this release.
Not surprisingly, the number of issues fixed in v3.2 on the PolyGerrit UX outnumbers by far the overall changes in the release notes.

[Screenshot: the “Findings” tab in Gerrit v3.2]

PolyGerrit is giving special attention to the classification of the feedback coming from robots rather than humans.
Most of the efforts made in the past 12 months targeted improving the support for robot-comments and giving them some extra dedicated space.
In Gerrit v3.2 there is a special place for them in a brand-new “Findings” tab. It is currently empty on GerritHub.io as people have not started using them much. However, I do see a lot of room for adoption of this new feature, which allows more integration of linters and automatic validation feedback in this tab.

A flood of fixes and small improvements

The list of fixes and improvements in Gerrit v3.2 is really huge. Please check the release notes on the Gerrit Code Review release page for all the details.

There are a lot of reasons to migrate to Gerrit v3.2, the fastest, most stable and most scalable release of Gerrit Code Review ever.


Thanks a lot to the whole Gerrit Code Review Community of maintainers and contributors for making this release happen. Thanks to Patrick Hiesel for the technical description of the account cache improvements and the replication clustering.

Luca Milanesio (GerritForge)
Gerrit Code Review Maintainer, Release Manager, ESC Member

 

GerritHub.io is moving to Gerrit v3.0

It has been a very long journey, from the initial adoption of PolyGerrit at GerritHub to the epic moment when Gerrit’s historic GWT UI was dropped with Gerrit v3.0 last month.

GerritHub.io has always been aligned with the latest and greatest of Gerrit Code Review and thus the moment has come for us to upgrade to v3.0 and drop forever the GWT UI.

PolyGerrit vs. GWT adoption

[Screenshot: PolyGerrit vs. GWT UI adoption on GerritHub.io]

The PolyGerrit UX was pretty much experimental until the beginning of 2018: the features were incomplete and people needed to go back to the old GWT UI for many of the basic use-cases.

However, things started to change radically in April 2018 when GerritHub.io adopted Gerrit v2.15, which had a 100% functionally complete PolyGerrit UI. The number of users choosing PolyGerrit jumped from 10% to 35% (3.5x) with a +70% growth in the number of accesses overall. That means that the adoption was mainly driven by users attracted by the new UI.

In the past 12 months, PolyGerrit became the default user-interface and was just renamed as Gerrit UI. Gradually more and more users abandoned the old GWT interface that now represents 30% of the overall accesses.

Timeline of the upgrade

For the 70% of people that are already using the new Gerrit UI, the upgrade to Gerrit v3.0 will not be noticeable at all:

  • Gerrit v3.0 UI is absolutely identical to the current one in v2.16
  • All existing API and integration points (e.g. Jenkins integration) in Gerrit v3.0 are 100% compatible with v2.16

For the 30% of people that are still using the old GWT UI, things will be very different as their favorite interface will not be available anymore.

The upgrade will happen with zero-downtime across the various GerritHub.io multi-site deployments and will start around mid-June.

Can I still use GWT with GerritHub.io?

The simple answer is NO: Gerrit v3.0 does not contain any GWT code anymore and thus it is impossible for GerritHub.io to bring back the old UI.

The journey to fill the gaps and reach 100% feature and functional equivalence between the old GWT and the new Polymer-based UI took around 6 years, 18k commits and 1M lines of code written by 260+ contributors from 60+ different organizations. It has been tested by hundreds of thousands of developers across the globe and is 100% production-ready and functionally complete.

If you feel that there was “something you could do in the GWT UI and cannot do anymore with the new Polymer-based UI”, please file a bug to the Gerrit Code Review issue tracker and you will get prompt attention and replies from the community.

Can I stay with Gerrit v2.16 on GerritHub.io?

If your organization cannot migrate to Gerrit v3.0, you could still request dedicated hosting from GerritForge Ltd, which is the company behind GerritHub.io.

Please fill in the GerritForge feedback form and a Sales Representative will get back to you with the possible options and associated costs.

If you fully endorse GerritHub.io with Gerrit v3.0 and start using the new UI, the service will continue to be FREE for public and private repositories, organizations of all types and size. You can optionally purchase Enterprise Support from one of our plans if you require extra help in using and configuring your Gerrit projects with your tools and organization.

Enjoy the future of Gerrit v3.0 with GerritHub.io and GerritForge.

Luca Milanesio, GerritForge Ltd.
Gerrit Code Review Maintainer and Release Manager
Member of the Engineering Steering Committee

Gerrit v3.0 is here

Gerrit v3.0 was released during the last Spring Hackathon at Google in Munich, which involved 20+ developers for one week.

It can be downloaded from www.gerritcodereview.com/3.0.html and installed on top of any existing Gerrit v2.16/NoteDb installations. Native packages have been distributed through the standard channels and upgrading is as simple as shutting down the service, running the Rpm, Deb or Dnf upgrade command and starting again.

You can also try Gerrit v3.0 using Docker by simply running the following command:

docker run -ti -p 8080:8080 -p 29418:29418 gerritcodereview/gerrit:3.0.0

This article goes through the whole history of the Gerrit v3.0 development and highlights the differences between the previous releases.

Milestone for the Gerrit OpenSource Project

After 6 years, 18k commits and 1M lines of code written by 260+ contributors from 60+ different organizations, Gerrit v3.0 is finally out.

The event is a fundamental milestone for the project for two reasons:

  • The start of a new journey for Gerrit, without the legacy code of the old GUI based on Google Web Toolkit and without any relational database. Gerrit is now fully based on a Git repository and nothing else.
  • The definition of a clear community organization, with the foundation of a new Engineering Steering Committee and the role of Community Manager.

The new structure will drive the product forward for the years to come and will help to define a clear roadmap to bring Gerrit back to the center of the Software Development Pipeline.

Evolution vs. revolution

When a product release increments the first major number, it typically introduces a series of massive breaking changes and, unfortunately, a period of instability. Gerrit, however, is NOT a typical OpenSource product, because since the beginning it has been based on rigorous Code Review that brought stability and reliability from its initial inception back in 2008. Gerrit v3.0 was developed during the years by following a rigorous backward compatibility rule that has made Gerrit one of the most reliable and scalable Code Review systems on the planet.

For all the existing Gerrit v2.16 installations, the v3.0 will be much more similar to a rather minor upgrade and may not even require any downtime and interruption of the incoming read/write traffic, assuming that you have at least a high-availability setup. How is this possible? Magic? Basically, yes, it’s a “kind of magic” that made this happen, and it is all thanks to the new repository format for storing all the review meta-data: NoteDb.

Last but not least, all the features that Gerrit v3.0 brings to the table have been implemented iteratively over the last 6 years and released gradually from v2.13 onwards. Gerrit v3.0 is the “final step” of the implementation that fills the gaps left open in the past v2.16 release.

With regard to the statistics of the changes from v2.16 to v3.0, it is clear that the code-base has been basically stabilized and cleaned up, as you can see from the official GerritForge Code Analytics extracted from analytics.gerrithub.io.

  • 1.5k commits from 63 contributors worldwide
  • 62k lines added and 72k lines removed
  • Google, CollabNet, and GerritForge are the top 3 organizations that invested in developing this release

In a nutshell, the Gerrit code-base has shrunk by 10k lines of code compared to v2.16. So, instead of talking about what’s new in v3.0, we should describe what is inside the 72k lines removed.

Removal of the GWT UI

The GWT UI, also referred to as the “Old UI”, has been around since the inception of the project back in 2008.

Back in 2008, it seemed a good idea to build the Gerrit UI on top of GWT, a Web Framework released by Google two years earlier and aimed at reusing the same Java language for both the backend and the Ajax front-end.

However, starting in 2012, things started to change. The interest of the overall community in GWT decreased, as clearly shown by the StackOverflow trends.

In 2015, Andrew Bonventre from the Chromium Project, one of the major users of the Gerrit Code Review platform, apart from the Android Developers, presented the new prototype of the Gerrit Code Review UI, based on the Polymer project, with the code-name of PolyGerrit, and merged as change #72086.

commit ba698359647f565421880b0487d20df086e7f82a
Author: Andrew Bonventre <andybons@google.com>
Date: Wed Nov 4 11:14:54 2015 -0500

Add the skeleton of a new UI based on Polymer, PolyGerrit

This is the beginnings of an experimental new non-GWT web UI developed
using a modern JS web framework, http://www.polymer-project.org/. It
will coexist alongside the GWT UI until it is feature-complete.

The functionality of this change is light years from complete, with
a full laundry list of things that don't work. This change is simply
meant to get the starting work in and continue iteration afterward.

The contents of the polygerrit-ui directory started as the full tree of
https://github.com/andybons/polygerrit at 219f531, plus a few more
local changes since review started. In the future this directory will
be pruned, rearranged, and integrated with the Buck build.

Change-Id: Ifb6f5429e8031ee049225cdafa244ad1c21bf5b5

The PolyGerrit project introduced two major innovations:

  • Gerrit REST-API: for the first time, the interaction of the code-review process was formalized in a stable and well-documented REST-API that can be used as a “backend contract” for the design of the new GUI.
  • The PolyGerrit front-end Team: for the first time, a dedicated, experienced Team focused on user experience and UI workflow was assigned to rethink and iteratively redesign all the components of the Gerrit Code Review interactions.

The GWT UI and PolyGerrit lived in the same “package” from v2.14 onwards for two years, with users left with the option to switch between the two. Then, in 2018, with v2.16, the PolyGerrit UI became the “default” interface and was thus renamed simply the “Gerrit” UI.

With Gerrit v3.0, the entire GWT code-base in Gerrit has been completely removed with the epic change by David Ostrovsky, “Remove GWT UI”, which deleted 33k lines of code in one single commit.

The new Polymer-based UI of Gerrit Code Review is not very different from the one seen in Gerrit v2.16, but it includes more bug fixes and is 100% feature complete, including project administration and ACL configuration.

Removal of ReviewDb

Gerrit v3.0 does not have a DBMS anymore, not even for storing its schema version, as was the case in v2.16. This means that almost everything gets stored in the Git repositories.

The journey started back in October 2013, when Shawn Pearce gave Dave Borowitz the task of converting all the review meta-data managed by Gerrit into a new format inside the Git repository, called NoteDb.

After two years of design and implementation, Dave Borowitz presented NoteDb at the Gerrit User Summit 2015 and named Gerrit v3.0 as the release that would work fully without the need for any external DBMS (see the full description of the talk at https://storage.googleapis.com/gerrit-talks/summit/2015/NoteDB.pdf).

Google started adopting NoteDb in parallel with ReviewDb on their own internal setup, and in June 2017 the old changes table was definitively removed. However, there was more on the to-do list: at the Gerrit User Summit 2017, Dave Borowitz presented the final roadmap to make ReviewDb finally disappear from everyone’s Gerrit server.

In the initial plans, the first version with NoteDb fully working should have been v2.15. However, things went a bit differently and a new minor release was needed in 2018 to make the format really stable and reliable with v2.16.

Gerrit v2.16 is officially the last release that contains both code-bases and allows the migration from ReviewDb to NoteDb.

Dave Borowitz used the hashtag “RemoveReviewDb” to allow anyone to visualize the huge set of commits that removed 35k lines of code complexity from the Gerrit project.

Migrating to Gerrit v3.0, step-by-step

Gerrit v3.0 requires NoteDb as a prerequisite: if you are on v2.16 with NoteDb, the migration to v3.0 is straightforward and can be done with the following simple steps:

  1. Shut down Gerrit
  2. Upgrade Gerrit war and plugins
  3. Run Gerrit init with the “batch” option
  4. Start Gerrit

If you are running Gerrit in a high-availability configuration, the above process can be executed on the two nodes individually, with a rolling restart and without interrupting the incoming traffic.
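
The four steps above can be sketched as a list of commands. This is only a minimal illustration, not an official procedure: the site directory and WAR path are hypothetical, plugin jars are assumed to be copied into the site's plugins directory separately, and the sketch relies on the standard bin/gerrit.sh control script and the init command with its --batch and -d options. The function only builds the commands; it does not run them.

```python
from typing import List


def upgrade_commands(site: str, new_war: str) -> List[List[str]]:
    """Build (but do not run) the commands for a v2.16 -> v3.0 upgrade.

    `site` and `new_war` are hypothetical paths supplied by the caller.
    """
    gerrit_sh = f"{site}/bin/gerrit.sh"
    return [
        # Step 1: shut down Gerrit.
        [gerrit_sh, "stop"],
        # Steps 2 and 3: run init from the new WAR in batch mode, which
        # upgrades the site without interactive prompts.
        ["java", "-jar", new_war, "init", "--batch", "-d", site],
        # Step 4: start Gerrit again.
        [gerrit_sh, "start"],
    ]


for command in upgrade_commands("/var/gerrit", "/tmp/gerrit-3.0.0.war"):
    print(" ".join(command))
```

In a high-availability setup, the same list would be applied to one node at a time, so that the other node keeps serving traffic.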

If you are running an earlier version of Gerrit and you are still on ReviewDb, then you should upgrade in three steps:

  1. Migrate from your version v2.x (x < 16) to v2.16, staying on ReviewDb. Make sure to upgrade through all the intermediate versions. (Example: migrate from v2.13 to v2.14, then from v2.14 to v2.15 and finally from v2.15 to v2.16)
  2. Convert v2.16 from ReviewDb to NoteDb
  3. Migrate v2.16 to v3.0
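
As an illustration of the three-step rule above, the following sketch computes the ordered list of actions for a given starting version. The release list only contains the versions named in this post, and the function name and step wording are made up for the example.

```python
from typing import List

# Only the releases mentioned in this post; earlier ones are left out on purpose.
RELEASES = ["2.13", "2.14", "2.15", "2.16", "3.0"]


def upgrade_path(current: str, on_notedb: bool = False) -> List[str]:
    """Return the ordered upgrade actions from `current` up to v3.0."""
    steps: List[str] = []
    start = RELEASES.index(current)
    for previous, following in zip(RELEASES[start:], RELEASES[start + 1:]):
        # v3.0 requires NoteDb, so convert while still on v2.16 if needed.
        if following == "3.0" and not on_notedb:
            steps.append("convert v2.16 from ReviewDb to NoteDb")
        steps.append(f"upgrade v{previous} to v{following}")
    return steps


for step in upgrade_path("2.13"):
    print(step)
```

Starting from v2.13 on ReviewDb, this prints the three intermediate upgrades, then the NoteDb conversion, then the final upgrade to v3.0.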

The leftover of a DBMS stored onto H2 files

Is Gerrit v3.0 running completely without any DBMS at all? Yes and no. There are some leftovers that aren’t necessarily associated with the Code Review meta-data, and which therefore did not make sense to store in NoteDb.

  • Persistent storage for in-memory caches.
    Some of the Gerrit caches store their status on the filesystem as H2 tables, so that Gerrit can save a lot of CPU time after a restart by reusing the previous in-memory cache status.
  • Reviewed flag of changes.
    Represents the flag that enables the “bold” rendering of a change, storing the update status for every user. It is stored by default on the filesystem as an H2 table; however, it can alternatively be stored on a remote DBMS or potentially managed by a plugin.

New core plugins

Some of the plugins that were initially distributed only with the Native Packages and Docker versions are now an integral part of the WAR distribution as well:

  • delete-project
    which allows removing a project from Gerrit and the associated changes.
  • gitiles
    a lightweight code-browser created by Dave Borowitz based on JGit
  • plugin-manager
    the interface to discover, download and install Gerrit plugins
  • webhooks
    the HTTP-based remote trigger to schedule remote builds on CI systems or activate any other service from a Gerrit event

The above four plugins already existed before Gerrit v3.0, but they were not included in the gerrit.war.

Farewell to Dave Borowitz and the PolyGerrit Team

After having completed the feature parity between GWT and PolyGerrit, the original PolyGerrit Team members left the Gerrit Code Review project.

Their journey came to an end with the release of the new shiny Polymer-based Gerrit UI. The PolyGerrit Team contributed 45k lines of code in 5.3k commits over 4 years.

Then the last event unfolded during the release of Gerrit v3.0: Dave Borowitz announced that he was leaving the Gerrit Code Review project. I described the event as akin to “Linus Torvalds announcing he was abandoning the Linux Kernel project”.

Dave Borowitz contributed 316k lines of code in 3.6k commits across 36 repositories over 8 years. He also helped the development of the new Gerrit Multi-Site plugin by donating a Zookeeper-based implementation of a global ref-database.

On behalf of GerritForge and the Gerrit Code Review community, I would like to thank all the past contributors and maintainers who brought the PolyGerrit and NoteDb code-bases into Gerrit: Dave, Logan, Kasper, Becky, Viktar, Andrew and Wyatt.

Luca Milanesio – GerritForge
Gerrit Code Review Maintainer, Release Manager
and member of the Engineering Steering Committee

GerritHub is on NoteDb … with a bump

On the 26th of April at 9:10 AM EDT, the 400K changes on GerritHub.io were successfully migrated to NoteDb.
See below the historical log entry in error_log.2018-04-26:

[2018-04-26 09:10:55,429] [OnlineNoteDbMigrator] INFO com.google.gerrit.server.notedb.rebuild.OnlineNoteDbMigrator : Online NoteDb migration completed in 8630s
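
To relate the duration in that log entry to wall-clock time, a small sketch can extract the number of seconds and split it into hours, minutes and seconds. The regular expression is an assumption based only on the single log line quoted above, not on a documented log format.

```python
import re
from typing import Tuple

LOG_LINE = (
    "[2018-04-26 09:10:55,429] [OnlineNoteDbMigrator] INFO "
    "com.google.gerrit.server.notedb.rebuild.OnlineNoteDbMigrator : "
    "Online NoteDb migration completed in 8630s"
)


def migration_duration(line: str) -> Tuple[int, int, int]:
    """Return (hours, minutes, seconds) from a 'completed in <N>s' log line."""
    match = re.search(r"completed in (\d+)s", line)
    if match is None:
        raise ValueError("no migration duration found in log line")
    hours, remainder = divmod(int(match.group(1)), 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


print(migration_duration(LOG_LINE))  # (2, 23, 50)
```

The 8630 seconds therefore correspond to 2 hours, 23 minutes and 50 seconds.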

What is NoteDb?

NoteDb is the next generation of Gerrit storage backend, which replaces the traditional SQL backend for change and account metadata by storing that data in the same repository as the code changes. In a nutshell, you can access all the reviews from your local Git repository by using the “git log -p” command line, even when you are offline, which is really neat.

Whilst all the major competitors of Gerrit Code Review still rely on a traditional database for reviews, NoteDb is innovative and provides many major benefits:
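
As a rough sketch of where the review data lives, each change has a meta ref in its repository. The helper below assumes Gerrit's usual ref layout, in which changes are sharded by the last two digits of the change number; the function name is made up for the example.

```python
def change_meta_ref(change_number: int) -> str:
    """Return the NoteDb meta ref for a change (assumed standard layout)."""
    if change_number <= 0:
        raise ValueError("change numbers are positive integers")
    # Changes are grouped into shards named after the last two digits.
    shard = change_number % 100
    return f"refs/changes/{shard:02d}/{change_number}/meta"


print(change_meta_ref(123))  # refs/changes/23/123/meta
```

With that ref name, fetching it (for example with "git fetch origin refs/changes/23/123/meta") and running "git log -p FETCH_HEAD" should show the review history of change 123 as a sequence of commits.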

  • Simplicity
    All data is stored in one location in the site directory, rather than being split between the site directory and a possibly external database server.
  • Consistency
    Replication and backups can use a snapshot of the Git repository refs, which will include both the branch and patch set refs, and the change metadata that points to them.
  • Auditability
    Rather than storing mutable rows in a database, modifications to changes are stored as a sequence of Git commits, automatically preserving history of the metadata.
  • Extensibility
    Plugin developers can add new fields to metadata without the core database schema having to know about them.
  • New features
    Enables simple federation between Gerrit servers, as well as offline code review and interoperation with other tools.

Large-scale, world’s first.

GerritHub.io is the first large-scale Gerrit Code Review installation, apart from Google’s of course, to set two significant records:

  1. The world’s most advanced and up-to-date Gerrit release in production: v2.15.1-143
  2. The world’s first NoteDb on-line migration in production

Being the “first” has a lot of advantages because it allows people and companies to work faster and more efficiently than their competitors, which is paramount in the modern global economy. However, there are disadvantages as well: being the “first” means that at times you are going into unexplored space, and the road could be bumpy.

See below a summary of what happened yesterday on GerritHub.io during the NoteDb migration.

Timeline of events

06:47 AM – Starting online NoteDb migration

The online migration process starts. All incoming changes and reviews are still happening on ReviewDb; however, Gerrit starts creating the /meta refs on the existing changes to translate all the existing DBMS records into Review Notes.

This migration state is called: WRITE (changes are written to both NoteDb and ReviewDb)

07:58 AM – Setting primary storage to NoteDb

The primary storage for new changes is moved to NoteDb. New changes will be stored to NoteDb while existing changes that have been modified between 6:47 AM and 7:58 AM will be delta-migrated and then flagged as “NoteDb only” one by one.

When a new change is created, it will be assigned a sequence number coming from NoteDb and no longer from ReviewDb.

08:01 AM – Errors when trying to push new changes to GerritHub.io

One developer of the Python zVM SDK open-source project tries to push a new change to GerritHub.io but receives the following error:

$ git push origin HEAD:refs/for/master
Counting objects: 10, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (10/10), 657 bytes | 0 bytes/s, done.
Total 10 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4)
remote: Processing changes: new: 1, refs: 1, done
remote:
remote: New Changes:
remote: https://review.gerrithub.io/#/c/mfcloud/python-zvm-sdk/+/407671
remote:
To ssh://balaskoa@review.gerrithub.io:29418/mfcloud/python-zvm-sdk
! [remote rejected] HEAD -> refs/for/master (internal server error: Error inserting change/patchset)

Other errors are appearing with the identical symptoms on other projects. It isn’t, however, a general failure because other new changes are getting through and existing changes are reviewed correctly as expected.

08:36 AM – Problem notified to the Gerrit mailing list

The troubleshooting starts, and it seems that some of the new changes created on NoteDb have sequences in conflict with existing changes on ReviewDb but on other projects.

Not all changes are impacted though, so migration continues.

09:02 AM – Migrated primary storage

All the changes have been migrated and flagged as “NoteDb only”; there will be no more read access to ReviewDb for those.

09:06 AM – Cause identified

A bug has been identified in the code that manages the generation of sequence numbers for the new changes on NoteDb: the switch of the primary storage to NoteDb did not update the sequence number stored in All-Projects/refs/sequences/changes, and thus newly created changes may conflict with existing ones on ReviewDb.
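
The effect of a stale sequence number can be illustrated with a toy model. This is not Gerrit's code: the class, the function and all the numbers are hypothetical, and the sketch only shows why some new changes collide with existing ones while others get through.

```python
from typing import List, Set, Tuple


class Sequence:
    """Hands out consecutive change numbers, starting from a stored value."""

    def __init__(self, next_value: int) -> None:
        self._next = next_value

    def next(self) -> int:
        value = self._next
        self._next += 1
        return value


def create_changes(existing: Set[int], seq: Sequence, count: int) -> Tuple[List[int], List[int]]:
    """Try to create `count` changes; return (created, rejected) numbers."""
    created: List[int] = []
    rejected: List[int] = []
    for _ in range(count):
        number = seq.next()
        if number in existing:
            rejected.append(number)  # conflicts with an existing change
        else:
            existing.add(number)
            created.append(number)
    return created, rejected


reviewdb_changes = set(range(1, 101))  # changes 1..100 already exist

# Stale sequence: the stored value was not advanced to the latest change.
print(create_changes(set(reviewdb_changes), Sequence(98), 5))
# Corrected sequence: starts right after the highest existing change.
print(create_changes(set(reviewdb_changes), Sequence(max(reviewdb_changes) + 1), 5))
```

In the stale case the first three attempts are rejected and later ones succeed, which matches the observed symptom of only some new changes failing.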

09:10 AM – Migration completed

09:49 AM – Acknowledgement by Google

Dave Borowitz, the leader of the Gerrit Code Review project, analyzes the discussion topic on the mailing list and agrees with the diagnosis of the issue.

Dave Borowitz’s words were: “Nice catch, thank you Luca.”

10:02 AM – GerritHub.io production patched, problem resolved.

9:05 PM – A software fix to the Gerrit v2.15 stable branch uploaded

A definitive fix for the software glitch is uploaded to Gerrit-Review and is reviewed by the Gerrit Code Review contributors.

The “bump” on the road

Migration is always a pain, and you need to plan it, test it and fix all the issues you can potentially verify in a “like-for-like” pre-production environment. However, this time the testing itself had produced an unprecedented situation.

When Gerrit was migrated from v2.14 to v2.15.1, traffic was moved between Data-Centers (DCs), from Canada to Germany and then back to Canada, using a “ping-pong” technique with zero-downtime.
That means that the on-line migration to NoteDb had *already* been tested on the Canada-DC a few days earlier; it succeeded, and Gerrit stored the “last known sequence number” from ReviewDb into NoteDb.

The second NoteDb migration yesterday followed exactly the same path as the previous test made on Canada-DC but, this time, the “last known sequence number” was not updated.
That is an “edge-case” that was not foreseen when writing the code, and it produced the failures experienced by new changes.

The Gerrit NoteDb code is very resilient: it immediately detected the situation and avoided inserting and indexing the changes with conflicting IDs.

Statistics of migration

  • Total migration time: 2h 23m
  • Reaction time to investigate failures: 36m
  • Resolution time: 2h
  • Software fix: 13h
  • Number of changes impacted: 33 out of 400k – 0.008%
  • Number of projects impacted: 14 out of 14k – 0.1%
  • Data loss: 0%
  • Incidents created and closed: 3
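
Some of the figures above can be re-derived from numbers given in this post. The sketch below recomputes the two impact percentages and the total migration time. It assumes that the total is measured from the 06:47 AM start to the 09:10 AM completion in the timeline, written here in 24-hour form; the helper names are made up for the example.

```python
def percentage(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`, rounded to 3 decimals."""
    return round(part / total * 100, 3)


def minutes_between(start: str, end: str) -> int:
    """Minutes between two same-day 'HH:MM' timestamps in 24-hour form."""
    start_h, start_m = (int(x) for x in start.split(":"))
    end_h, end_m = (int(x) for x in end.split(":"))
    return (end_h * 60 + end_m) - (start_h * 60 + start_m)


print(percentage(33, 400_000))            # 0.008
print(percentage(14, 14_000))             # 0.1
print(divmod(minutes_between("06:47", "09:10"), 60))  # (2, 23)
```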

Current situation

No more errors or problems have been reported, and production is stable.

GerritHub adopts 100% PolyGerrit

GerritHub.io has been successfully migrated to Gerrit Code Review v2.15. Thanks to the 5-day Gerrit Hackathon hosted by Axis in Lund, all the remaining issues we had on v2.15 have been resolved, and from today all the 15k active users of GerritHub can use the 100% feature-complete PolyGerrit UX.

What is PolyGerrit?

PolyGerrit is the code-name of the new UX, wholly redesigned using web components, a set of web platform APIs that allow you to create new custom, reusable, encapsulated HTML tags to use in web pages and web apps. Custom components and widgets built on the Web Component standards will work across modern browsers and can be used with any JavaScript library or framework that works with HTML.

To access PolyGerrit, just go to the GerritHub.io footer and click on the link “Switch to New UI”, or add “?polygerrit=1” as an extra query string (e.g. https://review.gerrithub.io/?polygerrit=1).
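
For scripts or bookmarks that build such links, the query string can be added programmatically. This is a minimal sketch using only the Python standard library; the function name is made up, and the only fact taken from this post is the “polygerrit=1” parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_polygerrit(url: str) -> str:
    """Return `url` with polygerrit=1 set in its query string."""
    parts = urlsplit(url)
    # Keep the other parameters and drop any previous polygerrit value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "polygerrit"
    ]
    query.append(("polygerrit", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_polygerrit("https://review.gerrithub.io/"))
```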

To have a more comprehensive description of PolyGerrit, you can read the previous blog post about the Google talk at the past Gerrit User Summit in London.

Zero-downtime “ping-pong” migration to Gerrit v2.15

As usual, we migrated with near-zero downtime, with only a “ten minutes *read-only* window” while we waited for the final replications to drain as traffic was moved between Data-Centers.

There are two Data-Centers (DCs) active for GerritHub.io; the main one is hosted in Canada and the second in Germany. Both DCs have a high-availability configuration and run the same version of Gerrit with the same data. However, during major upgrades, we use a “ping-pong” technique to give a seamless experience to the users.

Ping phase

DC-Canada stays on Gerrit v2.14 while DC-Germany gets upgraded to v2.15. Traffic then gets forwarded smoothly from DC-Canada to DC-Germany thanks to the HAProxy that is serving traffic for https://review.gerrithub.io.

Pong phase

DC-Canada gets upgraded to v2.15 while DC-Germany syncs back to Canada in near-real time. Once the upgrade is complete, the HAProxy forwards the traffic back to DC-Canada.
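
The two phases can be modelled as a tiny state machine. This is a toy sketch rather than the actual tooling: it only tracks which DC receives traffic and which Gerrit version each DC runs, and it leaves out replication and the read-only window.

```python
from typing import Dict, List, Tuple

Snapshot = Tuple[str, str, Dict[str, str]]  # (phase, active DC, versions)


def ping_pong_upgrade(old: str, new: str) -> List[Snapshot]:
    """Return the state after each phase of a ping-pong upgrade."""
    versions = {"DC-Canada": old, "DC-Germany": old}
    history: List[Snapshot] = []

    # Ping: upgrade the standby DC first, then forward traffic to it.
    versions["DC-Germany"] = new
    active = "DC-Germany"
    history.append(("ping", active, dict(versions)))

    # Pong: upgrade the now-idle main DC, then forward traffic back.
    versions["DC-Canada"] = new
    active = "DC-Canada"
    history.append(("pong", active, dict(versions)))
    return history


for phase, active, versions in ping_pong_upgrade("2.14", "2.15"):
    print(phase, active, versions)
```

The point of the ordering is that the DC being upgraded is never the one serving traffic at that moment.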

Benefits of the ping-pong upgrade

There are two significant benefits of using the combined zero-downtime rollout with the ping-pong technique:

  1. No general service disruption, minimal read-only time.
    Nobody would notice any significant service disruption on GerritHub.io: the read-only window is a minor service degradation that lasts for only a few minutes. Given that 90% of the traffic is represented by “git fetch/clone” and web browsing, the degradation is hardly noticed by anyone.
  2. Validation of the disaster recovery procedure.
    Because DC-Germany is used as the disaster recovery site, it is essential to make sure that it is always working fine and that you can actually fail over to it at any given time when needed.
    You do not want to find out that the disaster recovery isn’t working when it is too late.

What’s new in Gerrit v2.15?

There are many changes in Gerrit v2.15; for the details, you can have a look at the Google presentation at the Gerrit User Summit 2017 in London.

In a nutshell, here are the headlines of the most visible changes you will notice:

  1. Support for draft changes and draft patch sets has been completely removed.
    You now have two possible states for a change: WIP (work-in-progress) and Private. All the changes that were in “draft” status at the time of the migration have been moved to the WIP state.
  2. New URL Scheme
    Gerrit URLs generated and used by the UI include not just the change number but the project name as well.
    For instance, the Change 123 on project ‘myproject’ would now be accessible on the URL: https://review.gerrithub.io/#/c/myproject/+/123.
    Existing URLs containing only the change number (e.g. https://review.gerrithub.io/123) are redirected to the new scheme.
  3. New workflows on the PolyGerrit UX
    The PolyGerrit UX is now 100% feature complete. It is not only an engineering rewrite but also a whole redesign of the user-flows and experience.
    See https://www.gerritcodereview.com/releases/2.15.md#new-workflows for the details of all the new flows.
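
The new URL scheme described in item 2 above can be sketched as a small helper that builds the change URL from the project name and the change number. The function name is made up; the URL pattern is the one shown in the example for change 123.

```python
def change_url(base: str, project: str, change_number: int) -> str:
    """Build a change URL in the v2.15 scheme: <base>/#/c/<project>/+/<number>."""
    return f"{base.rstrip('/')}/#/c/{project}/+/{change_number}"


print(change_url("https://review.gerrithub.io", "myproject", 123))
# https://review.gerrithub.io/#/c/myproject/+/123
```

Project names containing slashes, such as organization/repository names, fit the same pattern without any extra handling.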

What’s next?

Gerrit v2.15 includes a brand-new storage for reviews, code-named “NoteDb”, that is actually your Git repository itself. That means that all the meta-data, comments, scores, history and audit records will be stored in the same GitHub repository as your code.

Our next step is to perform an online migration from ReviewDb to NoteDb, which will be again with zero-downtime. Thanks to the fantastic work made by Dave Borowitz (Google), there will be no need for a read-only window: you will not even notice.

Want to know all the details of Gerrit v2.15?

For an overview of what’s new in GerritHub with v2.15, you can look at the Gerrit User Summit 2017 presentation.

 

Git vs. Subversion – An Executive Decision Guide

svn-vs-git

Subversion or Git? Making the right choice for your Business.

Git is the fastest growing distributed SCM (source code management) tool. Yet, with 50%+ market share, Apache Subversion continues to be the undisputed leader for SCM in the enterprise. The success of both open source SCM tools provides a dilemma for some organizations, in particular those migrating off legacy systems such as ClearCase, Visual SourceSafe (VSS) or CVS. Should you move to Subversion, to Git – or to both?

Discover all you need to know to make the right choice in the Webinar, organized and sponsored by CollabNet Inc, on Tuesday, August 13, 2013, 9:00 AM – 10:00 AM PDT.

What will we cover in the Webinar?

  • Pros and cons of Git vs. Subversion in the enterprise
  • Proven integration and migration strategies
  • Industry trends and market predictions
  • Security and compliance implications

Do not miss this opportunity to watch the Webinar live and ask GerritForge the questions that will help you make your decisions.

REGISTER NOW at http://visit.collab.net/Webinar_Gitvs.SubversionAnExecutiveDecisionGuide.html

ClearCase Migration Git vs. ClearCase – a Developer’s View

Git is the fastest-growing DVCS (distributed version control system) in the enterprise. It provides a viable alternative to ClearCase – in particular when deployed side-by-side with Subversion or binary repository systems.

This technical session provides a developer’s point of view on the pros and cons of Git versus ClearCase. It also provides practical guidelines and reference charts to enable a smooth transition.

What we cover:

  • Detailed comparison: features and SCM processes
  • How to translate SCM commands, side-by-side
  • Security and scalability – tips and tricks
  • Migration topics – pitfalls and solutions
  • Multi-site replication – Git versus ClearCase

Do not miss the forthcoming Webinar on how to migrate from ClearCase to Git, organised and sponsored by CollabNet Inc.

When?
Thursday, August 1, 2013, 9:00 AM – 10:00 AM PST

Where?
Register now http://visit.collab.net/2013Q3WebinarClearCaseMigrationGitvsClearCaseaDevelopersView.html