How to migrate Gerrit from v2.15 to v2.16

Time has come to migrate gerrithub.io to the latest Gerrit v2.16, from the outdated v2.15 we had so far. The big change between the two is the full adoption of NoteDB: the internal Gerrit groups were still kept in ReviewDb on v2.15, which forced us to keep a PostgreSQL instance active in production. This means we can finally say goodbye to the ReviewDb ūüĎč and eliminated yet another SPoF (Single-Point-of-Failure) from the GerritHub high-availability infrastructure.

Migrating to Gerrit v2.16 implies:

  1. Gerrit WAR upgrade
  2. GIT repos upgrade because of a change in the NoteDb format
  3. Change in the database used, from PostgreSQL to H2 (for the schema_version)
  4. Introduction of the new Projects index

The above is a quite complex process and, here at GerritForge, we executed the migration on a running GerritHub.io with 15k of active users avoiding any downtime during the migration.

Architecture

This is the initial architecture we are starting the GerritHub.io v2.15 migration from:

Initial status - 15_01.png

In this setup, we have 2 sites, one in Canada (active) and one in Germany (active for analytics and disaster recovery). The latter is aligned with the active master via replication plugin.

The HA Plugin used between the 2 Canadian nodes is a GerritForge fork enhanced with the ability to align the Lucene Indexes, Caches and Events when sharing repositories via NFS with caching enabled.

NOTE: The original High-Availability plugin is certified and tested on Gerrit v2.14 / ReviewDb only and requires the use of NFS without caching, which requires a direct fiber-channel connection between the Gerrit nodes the disks.

The traffic is routed with HAProxy to the active node. This allows us easy code migrations with no downtimes, using what we call the ‚Äúping-pong‚ÄĚ technique between the Canadian and the German site, which is inspired by the classical Blue/Green deployment with some adjustments for the peculiarities of the Gerrit multi-site setup.

The migration pattern, in a nutshell, is composed of the following phases:

  1. Upgrade code in Germany
    The Gerrit site in Germany is used for Analytics and thus can be upgraded first with low risk associated.
    German site -> passive, Canadian site -> active
     
  2. Redirect traffic in Germany
    Once the site in Germany is ready and warmed up, the GerritHub users are redirected to it. GerritHub is technically serving the v2.16 user-experience to all users.
    German site -> active, Canadian site -> passive
     
  3. Upgrade code in Canada
    The site in Canada is put offline and upgraded as well.
    German site -> active, Canadian site -> passive
     
  4. Redirect traffic back to Canada
    Once the site in Canada is fully ready and warmed up, the entire user-base is redirected back.
    German site -> passive, Canadian site -> active

Each HAProxy has the same configuration with a primary and 2 backups as follow:

HAProxy CA Primary.png

Timeline of events – 2nd of Jan 2019

2/1/2019 – 8:00 GMT: Starting point of the GerritHub configuration

  • Review-1 – Gerrit 2.15 – active node
  • Review-2 – Gerrit 2.15 – ¬†failover node
  • Review-DE – Gerrit 2.15 – analytics node, used for disaster recovery

2/1/2019 Р10:10 GMT: Upgrade disaster recovery server

  • Stopped all services using Gerrit on review-de (we use the disaster recovery to crunch and serve the analytics dashboard)
  • Disabled replication plugin
  • Stopped Gerrit 2.15 and upgraded to Gerrit 2.16
  • Restarted Gerrit

2/1/2019 Р10:44 GMT: Re-enabled disaster recovery server 

  • Re-Enabled replication from review 1…boom!
    • First issue: mirror option of the replication plugin was set to true, hence all the branches containing the groups on the All-Users repo been dropped from the recovery server. All the Groups were suddenly gone from the disaster recovery server
  • Remove mirror option in replication plugin
  • Re-Enabled replication from review-1…this time everything was ok!
  • Migration re-executed and everything was fine

2/1/2019 –¬†11:00 GMT:¬†Removed ReviewDB

  • Once we were happy with the replication of the Groups we could remove PostgreSQL

The only information left outside NoteDB is the schema_version table, which contains only one row and it is static. We moved it into H2 by copying the DB from a vanilla 2.16 installation and changing Gerrit Config to use it.

DE 2.16 - 15_01.png

Before the next step, we had to wait for the online reindexing on review-de to finish (~2 hours).

Note:¬†we didn’t consider offline¬†reindexing since it is basically sequential, and it would have been way slower compared to the concurrent¬†online one. Additionally, it does not compute all the Prolog rules in full.

2/1/2019 Р15:15 GMT: Reduce delta between masters

  • Reducing the delta of data between the 2 sites (Canada and Germany) will allow having a shorter read-only window when upgrading the currently active master
  • Manually replicate and reindex misaligned repositories on review-de (see below the effect on the system load)

Screenshot 2019-01-14 at 20.33.10.png

Screenshot 2019-01-14 at 20.33.23.png

  • Pro tip: if you want to check queue status to see, for example, if the replication is still ongoing this command can be used:

    ssh -p 29419 <gerrit_admin_user>@localhost \
                 gerrit show-queue --by-queue --wide

2/1/2019 Р15:50 GMT: Final master catchup

  • Switched on read-only plugin on the active master
  • Service degraded for few minutes (i.e.: Gerrit was read-only), but most of the operations were available, i.e.: Gerrit index/query/plugin/version, git-upload-pack, replication
  • Waited for review-de to catch up with the latest changes that come in review-1 (we monitored it using the above ‚Äúgerrit show-queue‚ÄĚ command)

CA Readonly - 15_01.png

2/1/2019 Р15:54 GMT: Made disaster recovery active

  • Changed HAProxy configuration, and reloaded, to re-direct all the traffic to review-de, which become the active node in the cluster

HAProxy-DE-primary-transition.png

  • See the transition of the traffic to review-de

Screenshot 2019-01-14 at 20.39.22.png

  • Left review-de the whole night as the primary site. This way we also tested the disaster recovery site stability

DE Active - 15_01.png

2/1/2019 Р19:47 GMT: Upgrade review-1 and review-2 to Gerrit 2.16

  • Stopped Gerrit 2.15 and upgraded to Gerrit 2.16
  • Wait for offline reindexing of Projects, Accounts and Groups
  • Started with Gerrit 2.16 with online reindexing of the changesCA 2.16 - 15_01.png

It was possible to see an expected increase in the system load due to the reindexing, lasted for about 2 hours:

System load.png

Furthermore, despite review-1 not being the active node, the HTTP workload grew disproportionately:

HTTP requests.png

This was due to a well-known issue of the high-availability plugin, where the reindexing are forwarded to the passive nodes, creating an excessive workload on them.

3/1/2019 Р10:14 GMT: Made review 1 active

  • We used the same pattern used when upgrading¬†review-de to align the data between masters
  • Changed HAProxy configuration, and reloaded, to re-direct back all the traffic to review-1

 

Final - 15_01.png

Conclusions

Migration was completed and production is back to stable again with the latest and greatest Gerrit v2.16.2 and the full PolyGerrit UI. With the migration of the Groups in NoteDB, ReviewDB leaves the stage completely to NoteDB. PostgreSQL is no more needed, simplifying the overall architecture.

The migration itself was quite smooth, the only issue was due to a plugin misconfiguration, nothing to have with Gerrit core. With the good monitoring we have in place, we managed to spot the issues straight away. Still, we will further automate our release process to avoid these issues from happening again.

Fabio Ponciroli (aka Ponch) – Gerrit Code Review Contributor – GerritForge

New year, free GerritHub: unlimited private reviews with anyone, forever

Today GitHub has announced the extension of its free plan to include unlimited private repositories. This is great because allows a lot more people to start experimenting their side projects and keep them confidential until they ready to be shared publicly.

GerritHub.io allows extending this amazing offer by having a fully-featured code review process on top of their GitHub private repositories and still keep the confidentiality needed for early-stage projects. Differently from GitHub, however, GerritHub allows you to have an unlimited number of reviewers and collaborators, for free, forever.

A wonderful new 2019 is starting with two amazing free offers to allow everyone to experiment and unleash their potential:

  • Free unlimited repos from GitHub, limited to 3 collaborators
  • Free unlimited repos from GerritHub, with unlimited collaborators for reviews

That’s super-cool, how do I start?

Getting started with your private GitHub repositories on GerritHub is easy:

  1. Go to https://review.gerrithub.io
  2. Click the top-right “Sign-in” link
  3. Select “Private” option and click the top-right “Login” button
  4. Enter your GitHub credentials
  5. Allow GerritHub to access in reading/writing your private repositories
  6. Select the GitHub SSH keys and profile into Gerrit, and click the top-right “Next” button
  7. Select the organization and repositories to import into GerritHub, and click the top-right “Import” button
  8. Select the GitHub PRs you want to import into GerritHub for review, and click the top-right “Import Selected” button

Once you’re done with the above steps, you’re up-and-running with GerritHub and you are free to invite collaborators and accept reviews.

You can follow the GerritHub video on YouTube which describes the above process.

I am new to Gerrit Code Review, where do I start?

There is plenty of information on the web about Gerrit Code Review. The best place to start is the project’s tutorial in the documentation.

Alternatively, you can watch the presentation by Shawn Pearce, the Gerrit Code Review project’s founder.

 

Have questions? Get in touch with the Community.

In case of issues, questions, you can get in touch with the Gerrit Code Review Community, and they will be happy to guide you through and provide support.

Want to use Gerrit into your Enterprise?

If you decide to use Gerrit Code Review in your Enterprise and you need the service level compliant with your company standards, you can get in touch with GerritForge which offers the full coverage of the Enterprise Support you will need:

  • Silver: 8×5 Support, with 24h turnaround for P1 issues
  • Gold: 24×7 Support, with 8h turnaround for P1 issues
  • Platinum: 24x7x365 Support, with 4h turnaround for P1 issues

What’s next?

With GitHub and GerritHub you have no excuses anymore to start innovating right now, with free unlimited repositories and free unlimited Gerrit reviewers and contributors.

Go and innovate, the future is now. 

Security: do not use Git v2 in Gerrit 2.16

Git protocol v2 was released as an experimental feature in Gerrit v2.16. However, the introduction came with a quite serious unnoticed bug: all refs are visible to all users,
regardless of the ACL configuration, giving to any registered users the complete access to
of all branches, tags and meta-data refs, their associated commit SHA1s and
the ability to fetch them locally … ouch!

What is the Security impact?

If you were using Gerrit v2.16 for OpenSource projects, not much: everything is visible to everyone anyway, so what’s the point?

For not-so-OpenSource projects, well, it cannot be used as-is in production for sure. It was flagged as experimental after all because it was intended more for an early adopter to “have a go with it” rather than using it at full scale in production for sensitive projects.

See the full details of the security advisory at:
https://groups.google.com/forum/#!topic/repo-discuss/z_QCw2QHbbc

How did we get there?

The problem is mainly located in the JGit implementation of the refs filtering for the Git protocol v2.¬†Gerrit ACLs are enforced using the JGit’s AdvertiseRefsHook which calls
RefFilter, where a Gerrit-specific implementation of it processes the access permissions associated with a user.

The AdvertiseRefsHook is usually set by UploadPack.setAdvertiseRefsHook
but, if Gerrit has the protocol v2 enabled in the gerrit.config and the client
is leveraging the git protocol v2 feature, the hook is not invoked. The bug has been already fixed in JGit but, of course, before enabling the support again in v2.16 we need to be sure that no other vulnerabilities are exposed.

What if I have Gerrit v2.16 in production?

On Gerrit v2.16 and v2.16.1, the Git protocol v2 was disabled anyway by default.

If you had it enabled in production by mistake (or bravery?), then just set it as disabled explicitly.

[receive]
enableProtocolV2 = false

If you can, just upgrade to v2.16.2 whenever possible, where the Git protocol v2 is always
disabled.

What’s next?

Git protocol v2 is coming back to Gerrit v2.16 very soon, possibly as early as next week or the one after, ready for Christmas ūüôā

The fix is ready, but the Gerrit and JGit teams are working hard to put more specific security testing in place so that the new reintroduction can be safer to be rolled out.

Merry Christmas to everyone … and hope that Git Protocol v2 will be back with us very soon, just in time for the end of year celebrations.

Luca Milanesio – Gerrit Code Review Maintainer.

Gerrit User Summit 2018

The Gerrit User Summit 2018 has ended. It has been a truly memorable event and with strong emotions, gratitude and celebration of the success of the Gerrit OpenSource project.

During the hackathon, some major events happened:

  • The release of Gerrit v2.16
  • Support for Kubernetes
  • New plugins contributed by Qualcomm
  • The announcement of a new maintainer, Marco Miller from Ericsson, congrats!

Hackathon and Gerrit v2.16

The hackathon took place at SAP (2 days) and Cloudera (1 day) in Palo Alto with the main focus of completing the migration to PolyGerrit and cut the v2.16 release. Three major milestones of the Gerrit project:

  1. PolyGerrit as default UI for Gerrit Code Review.
  2. Migration of the groups to NoteDb
  3. Support of the Git protocol v2

With regards to the achievement #1, during the hackathon, the commit for removing all the GWT related code from Gerrit master has been finally merged.

PolyGerrit as default Gerrit Code Review UI

The PolyGerrit Team has reached the summit of the long and troubled journey of getting rid of the outdated and Gerrit GWT UI. Kasper and Logan (Google) burned every single bit of their development energy to fix, implement and fill all the remaining gaps in the user journeys.

On Gerrit v2.16, PolyGerrit UI is the default and standard way of interacting with Gerrit Code Review. GWT is still there for allowing people to smoothly adapt to the new interface. However, the next forthcoming version of Gerrit v3.0 will not contain anymore any reference to the old UI. People will still have a few more months to adapt and, from the feedback received on GerritHub.io, they seem to not miss the GWT interface at all.

Well done to the entire PolyGerrit Team, starting from Andrew who kicked off the project back in 2015 to Wyatt, Becky, Victor, Kasper and Logan who brought it up to the usability and portability to multiple devices as it is today. Thanks to their amazing effort, we can all interact with Gerrit Code Review wherever and whenever we want, without losing speed and effectiveness.

Migration of Gerrit groups to NoteDb

One more step towards the complete elimination of ReviewDb has been achieved in Gerrit v2.16: there isn’t anymore any mutable data in the leftover of the (in)famous Gerrit DB. The new version includes a new git repository (All-Groups) which contains all the information related to the groups create through the Gerrit UI.

What’s left in ReviewDb? Not much: only the schema version. When you install Gerrit it is populated with a single row and a single column with the value ‘170’ and will never change.¬†There is no need to share it across master or slaves: just set database.type = h2 and forget about it.

Git protocol v2

It is a major achievement of Gerrit v2.16 as the Git wire protocol v2 solves the performance problems of large repositories: the refs filtering during the advertisement phase.

In a nutshell, if you were fetching a single from a repository with 500k refs, the server would have sent a huge payload of all refs with the associated SHA1s even if at the end of the day you wanted only to fetch a single one. The delay could have been of several seconds if not minutes, which is quite relevant if you are fetching all the times, like in a CI/CD pipeline.

Git protocol v2 reduced the refs advertisement phase by over 70%, which makes Gerrit v2.16 so appealing that you may want to place in the top 5 tasks of your backlog.

Farewell to GWT UI code

Even though Gerrit v2.16 keeps the ability to display the old-fashioned GWT Gerrit UI, the code has been definitely removed on the Gerrit master branch, which will the base for Gerrit v3.0 that will be released in Spring 2019.

This is a memorable moment because represents the final act of making PolyGerrit UI the unified user-experience of Gerrit moving forward.

Data and Insights of the Hackathon

This year the event has been truly focused as well on the analytics side of the Code Review and the entire CI/CD pipeline: was it possibly the location, Cloudera, the world leader in OpenSource BigData and Analytics, giving the right vibrations? Possibly, yes.

Numbers

Over 20 people have attended the Hackathon on-site and, for the first time, remotely via video-conference, from companies and countries all around the world:

  1. Google (USA and Germany)
  2. CollabNet (Japan and Germany)
  3. GerritForge (UK, Ireland, Italy, and Germany)
  4. SAP (Germany)
  5. Qualcomm (USA)
  6. Ericsson (Canada)
  7. Wikimedia Foundation (UK)

Over 300+ changes have been uploaded and 244 of them have been already merged.

They have been working on 44 projects, showing how diverse is today the universe of Gerrit Code Review and its associated plugins and integrations.

Gerrit DevOps Analytics

GerritForge has been hacking on improving the platform that powers the Gerrit Code Review Analytics Platform which has been serving the community for many years.

For the first time, thanks to the introduction of branch-level analytics, the community had data automatically crunched in near-real time of what was happening on the master branch and on the on-going stable 2.16 release.

As a matter of fact, the numbers mentioned before are directly coming from Gerrit DevOps Analytics. Thanks again GerritForge for innovating and improving the Gerrit Code Review platform.

Gerrit Analytics Wizard

For the first time, two new contributors from GerritForge, Tony and Ponch, finalized and demoed a new plugin called ‘analytics-wizard’ that allows anyone with a Gerrit server to set up a mini-version of the Gerrit DevOps Analytics, using Docker, ElasticSearch, and Kibana.

Two new plugins: batch and tasks

Qualcomm has released two new plugins based on their experience of running complex pipelines on a multi-master Gerrit setup. The majority of the code was developed in-house before the hackathon. However, the final release and announced happened this week and I am sure it will allow many other companies to benefit from their experience of validating complex multi-repo changes on a large scale CI system.

Batch

This plugin provides a mechanism for building and previewing sets of proposed updates to multiple projects/refs that should be applied in a batch. These updates are built with Gerrit changes.

Task

The plugin provides a mechanism to manage tasks which need to be performed on changes along with a way to expose and query this information. Tasks are organized hierarchically, and task definitions use Gerrit queries to define which changes each task applies to, and how to define the status criteria for each task. An important use case of the task plugin is to have a common place for CI systems to define which changes they will operate on, and when they will do so.

Gerrit on Kubernetes from SAP and GerritForge

Luke (GerritForge), Matthias (SAP) and Thomas (SAP remotely from Germany) worked on a PoC for supporting Kubernetes deployments.

Last year SAP started using containerized slaves and decided to go cloud native and base our future Gerrit deployments on Kubernetes and leverage project Gardener to support multiple cloud providers. Matthias worked with Thomas to prepare a PoC and a demo the current work in progress and discuss plans moving forward.

The code has been published to the new k8s-gerrit repository and new commits will be subject to the standard Gerrit Code Review process.

Luke prepared an example of Gerrit v2.15 deployment in Kubernetes (single master at the moment) with shared storage and showed how to upgrade easily by leveraging Helm Charts deployments through Tiler.

Even though the two projects started in parallel, they agreed to cooperate and work together with the community to have a unified Kubernetes deployment code-base for installing and upgrading multi-master Gerrit setups. This is definitely an area where we will see very interesting developments very soon.

What about the Summit?

The talks of the summit have been very interesting this year and have covered mostly three main topics: Gerrit roadmap and scalability, sharing of real-life migrations to Gerrit v2.14 and v2.15 and, as previously mentioned, Data Analytics & Insights including research work on using machine-learning for supporting the Code Review process.

All the talks are getting published as we speak on each talk of the Gerrit User Summit Schedule.

All the talks have been recorded and will be published within the next few weeks on the GerritForge TV YouTube channel.

More blog posts will follow with more background on each talk, stay tuned and watch this space to know more about what has been presented and announced at the Summit.

Proposals for the next Gerrit User Summit 2019

The overall feedback on the summit has been very positive, including the Birthday Party organized and sponsored by GerritForge for the 10th anniversary of the Gerrit Code Review project.

During the final closing keynote, many interesting proposals have been made:

  • Gerrit users’ wish-list session.
    Any Gerrit user can feed the backlog of new feature proposing what they will like to see implemented
  • Gerrit plugin hackathon.
    Learn how to implement Gerrit plugin and trying to put together in one or two days something useful using any of the programming languages they wish, including Java, Groovy, Scala or anything else.

More details will come as soon as the closing keynote session recording will be published.

Remembering Shawn Pearce and his legacy

Dave Borowitz, the leader of the Gerrit Code Review project, made a thoughtful and touching speech about Shawn Pearce, the father of the Gerrit 2, who started the project exactly 10 years ago. The cake, the hackathon, the summit, and the v2.16 release have been fully dedicated to his memory as appreciation for his humanity and passion for OpenSource.

Thanks, Shawn. We will continue on your legacy and improve the Gerrit Code Review platform as you would have wanted us to do.

 

 

 

GerritHub.io and the new Reviewer role

Screen Shot 2018-06-21 at 08.57.58

Good news for all the people that wanted to use Gerrit Code Review on top of their GitHub repositories, but so far have been concerned about sharing their profile information: you can now keep your details private and still review other people’s changes on Gerrit.
If you are an EU citizen, you have rights guaranteed by the GDPR and the default role for accessing your GitHub OAuth scope can now be limited to what is the bare minimum.

The default scope explained

When signing up with GerritHub.io, you have the ability to define the scope of access to your personal information held on GitHub:

  • Your e-mail
  • Your membership to organizations and teams
  • Your repositories

The default access requested for doing some activity on Gerrit is: user:email + public:repo + read:org, that allows Gerrit to see your e-mail, clone and push on your behalf to your public repositories and see the list of your teams and organizations. Those permissions are needed when you want to push some code to GerritHub.io, and thus you need to allow Gerrit to resolve your groups (Organizations/Teams) and grant you permissions to push your code to Gerrit and eventually GitHub as well on your behalf.

The problem with GDPR

The default role may not fit with some of the conditions of GDPR for EU where you are not willing to contribute any code but participate in the review of existing changes. That’s why GerritHub.io introduced a new GitHub scope page that allows external reviewers to sign-up with the bare minimum information that Gerrit needs to let you in: your e-mail address.

Why can’t I stay anonymous on Gerrit?

Even though you could create a GitHub account without exposing any private information or exposing your e-mail, you cannot sign-up with Gerrit Code Review if you are not willing to share who you are.
The e-mail address in Gerrit is a crucial property, and without it, some parts of the code-review workflow would stop working. That condition of Gerrit is so pervasive in the architecture because of the requirements of the Android OpenSource Project (code-named AOSP) which is the project historically hosted on the platform: the . See the details of the Android contribution license and process at https://source.android.com/setup/start/licenses.

What if I still want to stay completely anonymous on Gerrit Code Review?

Gerrit has the concept of the “Anonymous Coward” (not an offense, just the default name assigned to it), which is an account that has only an ID without either a full name or an e-mail address. With the anonymous user, however, you cannot review code, for obvious legal reasons and for preventing spamming and abuse.

I am in doubt on what to chose, where to start from?

If you are concerned about sharing your Personal Information and you need to decide which level to use, then just chose “Reviewer” and you can amend your choice and extend your scope at any time later.

GitHub acquired by Microsoft: what’s next?

The world woke up this morning with shocking and exciting news at the same time: GitHub is going to be a Microsoft Business.
There are mixed feelings and GitLab already reported a tremendous increase in its rate of imported projects from GitHub and a record of registration of new accounts all tagged with the #MoveToGitLab Twitter hashtag.

Do not press the panic button

Microsoft had, unfortunately, a historical record of acquisitions that did not go very well. However, that doesn’t mean that GitHub is going to follow the same path.

The question is: what is going to change in the next few weeks? Possibly nothing at all. It is not the time to panic and looking frantically for quick alternatives without really thinking about it. GitHub is there, works and is not going to change in the near term.

Looking for more independence and Openness

One thing that people should do right now, is to say with GitHub and keep their presence as it is today. At the same time, it is clear that economics of the staggering $7.5Bn price tag will start to impact the future decisions and the bias of their services, but nobody knows when and how.

If you are looking for something better, more open and more powerful, you should look at what the best of the OpenSource community proposes on Git and Gerrit Code Review.

OpenSource Code Review, 10 years of independence

Gerrit Code Review was founded the 1st of October 2008 by Google and, since then, has been paramount of Openness and vendor neutrality. There is NO “Community” vs. “Enterprise” editions, no “vendor-locking”, no pull-request filtering for enterprise-class features.
According to the Official Gerrit Analytics page (http://gerrit-analytics.gerritforge.com), over 160+ organizations contributed to Gerrit a stunning 36k commits and the project keeps growing.

Gerrit Code Review project contributions since its inception over 10 years

Screen Shot 2018-06-04 at 14.29.23

Try Gerrit Code Review workflow and stay on GitHub

Since 2013, a new service called GerritHub allows OpenSource projects and private companies to leverage Gerrit Code Review workflow and keep their public presence on GitHub.
In addition to a much more powerful and functional workflow, they get for free the ability to be discoverable on GitHub and accept contributions as Pull Requests.

What if I want to leave GitHub anyway?

Should you decide to stay on Gerrit Code Review and leave GitHub in the future, you will always have your repos and reviews on Gerrit and decide to cancel your GitHub subscription at any time, without any consequence to your Community.

So, why not giving Gerrit Code Review a try?
https://review.gerrithub.io/static/intro.html

 

 

 

GerritHub.io and GDPR

Screen Shot 2018-05-25 at 16.17.49

GerritHub.io has changed some of its key components to be compliant with the new regulations related to the European General Data Protection Rules (aka GDPR). You can find the updated Privacy Policy on the GerritForge Web Site.

Impact on Gerrit Code Review

Gerrit Code Review had a fundamental problem with some of the concepts in the GDPR: accounts could not be removed from the system once created the first time. Yes, you could have disabled them, but their associated personal data would have remained in the Gerrit DB or All-Users git repository.

We have developed a brand-new plugin for Gerrit that enable the following features:

  1. General permission to allow the removal of accounts
    Group of users can be deleted to remove their own accounts data. Gerrit administrators can be delegated to remove other accounts.
  2. New SSH command for account removal
    Users that have been granted permissions, can access a new ‘account’ command to remove their own personal information.
  3. Simple and easy self-service online form to review its Personal Information and self-remove its account.

The new plugin has been shared with the wider Gerrit Code Review community and will be soon public domain for every other publicly available website that wants to host Gerrit and being publicly accessible by EU citizens.

Access and control of your Personal Information

  • Right to Access
    Every personal profile is directly accessible to its owner through the User’s settings in Gerrit Code Review.
  • Data Portability
    Gerrit stores the user’s profile in a specific set of branches of the All-Users repository. GerritHub.io can make the personal branches available to everyone that would like to have direct access to its data and reuse or import them into its system.
  • Privacy by Design
    Gerrit already stores only hashed passwords. Additionally, the GitHub credentials are never passed to Gerrit Code Review and the authentication and profile access delegation process is completely controlled by the user thanks to the OAuth 2.0 and Scope selection process.
  • User Data and “right to be forgotten”
    Thanks to the new Gerrit ‘account’ plugin, every user on GerritHub.io is always in control of its data and can decide to remove its account permanently at any time.
    Note that Gerrit will have to “remember” the account UUID for consistency of all the previous review work done by the user. Every account removed from the platform will become an “Anonymous UUID” and will not be able to be reused anymore in the future.

Questions?

If you still have doubts or questions about Gerrit Code Review and the EU GDPR regulations, you can get in touch with GerritForge Ltd.

GerritHub is on NoteDb … with a bump

516px-Road-sign-Speed_bump.svg

The 26th of April at 9:10 AM EDT, the 400K changes on GerritHub.io have been successfully migrated to NoteDb.
See below the historical log entry in error_log.2018-04-26

[2018-04-26 09:10:55,429] [OnlineNoteDbMigrator] INFO com.google.gerrit.server.notedb.rebuild.OnlineNoteDbMigrator : Online NoteDb migration completed in 8630s

What is NoteDb?

NoteDb is the next generation of Gerrit storage backend, which replaces the traditional SQL backend for change and account metadata with storing data in the same repository as code changes. In a nutshell, you can access all the reviews from your local Git repository as well by using the “git log -p” command line and even when you are offline, which is really neat.

Whilst all the major competitors of Gerrit Code Review still rely on a traditional DataBase for reviews, NoteDb is innovative and provides many major benefits:

  • Simplicity
    All data is stored in one location in the site directory, rather than being split between the site directory and a possibly external database server.
  • Consistency
    Replication and backups can use a snapshot of the Git repository refs, which will include both the branch and patch set refs, and the change metadata that points to them.
  • Auditability
    Rather than storing mutable rows in a database, modifications to changes are stored as a sequence of Git commits, automatically preserving history of the metadata.
  • Extensibility
    Plugin developers can add new fields to metadata without the core database schema having to know about them.
  • New features
    Enables simple federation between Gerrit servers, as well as offline code review and interoperation with other tools.

Large-scale, world’s first.

GerritHub.io is the first large-scale Gerrit Code Review installation, apart from Google’s of course, that has hit essential records targets:

  1. The world’s most advanced and up-to-date Gerrit release in production: v2.15.1-143
  2. The world’s first NoteDb on-line migration in production

Being the “first” has a lot of advantages because allow people and companies to work faster and more efficiently than the competitors, which is paramount of the modern global economy. However, there are disadvantages as well: being the “first” means that at times you are going into unexplored space, and the road could be bumpy.

See below a summary of what happened yesterday on GerritHub.io during the NoteDb migration.

Timeline of events

06:47 AM – Starting online NoteDb migration

The online migration process starts. All incoming changes and reviews are still happening on ReviewDb, however, Gerrit start creating the /meta refs on the existing changes to translate all the existing DBMS records into Review Notes.

This migration state is called: WRITE (changes are written to both NoteDb and ReviewDb)

07:58 AM – Setting primary storage to NoteDb

The primary storage for new changes is moved to NoteDb. New changes will be stored to NoteDb while existing changes that have been modified between 6:47 AM and 7:58 AM will be delta-migrated and then flagged as “NoteDb only” one by one.

When a new change is created, it will be assigned from a sequence number coming from NoteDb and not anymore from ReviewDb.

08:01 AM – Errors when trying to push new changes to GerritHub.io

One developer of the Python zVM SDK OpenSource project tries to create a new change to GerritHub.io but receives the following error:

$ git push origin HEAD:refs/for/master
Counting objects: 10, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (5/5), done.
Writing objects: 100% (10/10), 657 bytes | 0 bytes/s, done.
Total 10 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4)
remote: Processing changes: new: 1, refs: 1, done
remote:
remote: New Changes:
remote: https://review.gerrithub.io/#/c/mfcloud/python-zvm-sdk/+/407671
remote:
To ssh://balaskoa@review.gerrithub.io:29418/mfcloud/python-zvm-sdk
! [remote rejected] HEAD -> refs/for/master (internal server error: Error inserting change/patchset)

Other errors are appearing with the identical symptoms on other projects. It isn’t, however, a general failure because other new changes are getting through and existing changes are reviewed correctly as expected.

08:36 AM – Problem notified to the Gerrit mailing list

The troubleshooting starts, and it seems that some of the new changes created on NoteDb have sequences in conflict with existing changes on ReviewDb but on other projects.

Not all changes are impacted though, so migration continues.

09:02 AM – Migrated primary storage

All the changes have been migrated and flagged as “NoteDb only”, there will be no more read access to ReviewDb for those.

09:06 AM – Cause identified

A bug has been identified in the code that manages the generation of sequencing numbers for the new changes on NoteDb: the switch to the primary storage to NoteDb has not updated the sequencing number on the All-Projects/refs/sequences/changes and thus new changes created may be conflicting with existing ones on ReviewDb.

09:10 AM – Migration completed

09:49 AM – Acknowledge by Google

Dave Borowitz, the leader of the Gerrit Code Review project, analyzes the discussion topic on the mailing list and agrees on the diagnosis of the issue.

Dave Borowitz words were: “Nice catch, thank you Luca.”

10:02 AM – GerritHub.io production patched, problem resolved.

9:05 PM – A software fix to the Gerrit v2.15 stable branch uploaded

A definite fix for the software glitch is uploaded to Gerrit-Review and is reviewed by the Gerrit Code Review contributors.

The “bump” on the road

Migration is always a pain, and you need to plan it, test and fix all the issues you can potentially verify in a “like-for-like” pre-production environment. However, this time at least, testing had produced a situation that was unprecedented.

When Gerrit was migrated from v2.14 to v2.15.1, traffic has been moved between Data-Centers (DCs), from Canada to Germany and then back to Canada, using a “ping-pong” technique with zero-downtime.
That means that the testing of the on-line migration to NoteDb has been tried *already* on the Canada-DC a few days ago and it actually succeeded and Gerrit stored the “last known sequence number” in ReviewDb into NoteDb.

The second NoteDb migration yesterday followed exactly the same trace of the previous test made on Canada-DC but, this time, the “last known sequence number” was not updated.
That is an “edge-case” that was not foreseen when writing the code and has produced the failures experienced by new changes.

Gerrit NoteDb code is very resilient and immediately detected the situation and avoided to insert and index the changes with conflicting IDs.

Statistics of migration

  • Total migration time: 2h 23m
  • Reaction time to investigate failures: 36m
  • Resolution time: 2h
  • Software fix: 13h
  • Number of changes impacted: 33 over 400k – 0.008%
  • Number of projects impacted: 14 over 14k – 0.1%
  • Data loss: 0%
  • Incidents created and closed: 3

Current situation

No more errors or problems reported, production is stable

GerritHub adopts 100% PolyGerrit

p-logo

GerritHub.io has been successfully migrated Gerrit Code Review v2.15. Thanks to the 5-days Gerrit Hackathon hosted by Axis in Lund, all the remaining issues we had on v2.15 have been resolved and all the 15k active users of GerritHub from today can use the 100% feature complete PolyGerrit UX.

What is PolyGerrit?

PolyGerrit is the code-name of the new UX wholly redesigned using web components, a set of web platform APIs that allow you to create new custom, reusable, encapsulated HTML tags to use in web pages and web apps. Custom components and widgets build on the Web Component standards, will work across modern browsers, and can be used with any JavaScript library or framework that works with HTML.

To access the PolyGerrit, just go to the GerritHub.io footer and click on the link “Switch to New UI” or add “?polygerrit=1” as an extra query string (e.g. https://review.gerrithub.io/?polygerrit=1)

To have a more comprehensive description of PolyGerrit, you can read the previous blog post about the Google talk at the past Gerrit User Summit in London.

Zero-downtime “ping-pong” migration to Gerrit v2.15

As usual, we migrated with near-zero-downtime, with only a “ten minutes *read-only* window” where we were waiting to drain the final replications were moved between Data-Centers.

There are two Data-Centers (DC) active for GerritHub.io; the main one is hosted in Canada and the second in Germany. Both DC have a high-availability configuration and run the same version of Gerrit with the same data. However, during major upgrades, we use a “ping-pong” technique to give a seamless experience to the users.

Ping phase

DC-Canada is with Gerrit v2.14 and DC-Germany gets upgraded to v2.15. Traffic gets forwarded smoothly from DC-Canada to DC-Germany thanks to the HAProxy that is serving traffic for https://review.gerrithub.io.

Pong phase

DC-Canada gets upgraded to v2.15 and DC-Germany sync back to Canada in near-real time. Once the upgrade is complete, the HAProxy forwards back the traffic to DC-Canada.

Benefits of the ping-pong upgrade

There two significant benefits of using the combined zero-downtime rollout with the ping-pong technique:

  1. No general service disruption, minimal read-only time.
    Nobody would notice any significant service disruption on GerritHub.io: the read-only window is a minor service degradation that lasts for only a few minutes. Given that 90% of the traffic is represented by “git fetch/clone” and web browsing, the degradation is hardly noticed by anyone.
  2. Validation of the disaster recovery procedure.
    Because the DC-Germany is used as disaster recovery site, it is essential to make sure that is always working fine and you can actually failover to it at any given time when needed.
    You do not want to find out that the disaster recovery isn’t working when is too late.

What’s new in Gerrit v2.15?

There are many changes in Gerrit v2.15, for the details you can have a look at the Google presentation at the Gerrit User Summit 2017 in London.

In a nutshell, here are the headlines of the most visible changes you will notice:

  1. Support for draft changes and draft patch sets has been completely removed.
    You have now two possible states for a change: WIP (work-in-progress) and Private. All the changes that were in “draft” status at the migration have been moved to WIP state.
  2. New URL Scheme
    Gerrit URLs generated and used by the UI include not just the change number but the project name as well.
    For instance, the Change 123 on project ‘myproject’ would now be accessible on the URL: https://review.gerrithub.io/#/c/myproject/+/123.
    Existing URLs containing only the change number (e.g. https://review.gerrithub.io/123) are redirected to the new scheme.
  3. New workflows on the PolyGerrit UX
    The PolyGerrit UX is now 100% feature complete. It is not only an engineering rewrite but also a whole redesign of the user-flows and experience.
    See at https://www.gerritcodereview.com/releases/2.15.md#new-workflows the details of all the new flows.

What’s next?

Gerrit v2.15 includes a brand-new storage for reviews, code-named “NoteDb”, that is actually your Git repository itself. That means that all the meta-data, comments, scores, history, audit, will all be stored in the same GitHub repository with your code.

Our next step is to perform an online migration from ReviewDb to NoteDb, which will be again with zero-downtime. Thanks to the fantastic work made by Dave Borowitz (Google), there will be no need for a read-only window: you will not even notice.

What do know all the details of Gerrit v2.15?

For an overview of what’s new in GerritHub with v2.15, you can look at the Gerrit User Summit 2017 presentation.

 

Gerrit Analytics

Screen Shot 2018-01-02 at 12.01.41.png

I am pleased to announce the availability of Gerrit Code Review Analytics, an Apache 2.0 Open Source solution for extracting, processing and visualizing statistics about your code and developers community.

Why extracting analytics from the source code?

Actually, this is what most of the Git servers have available already out of the box as basic code commit metrics.

GitHub shows for every repository, a basic set of graphs including the overall daily commits and the breakdown on a per-contributor basis.

GitLab displays an overview of what happened on his platform, including push, merge and issue-related events with their correlation to the other GitLab CI components.

… and what about Gerrit? Well, there was basically nothing and we¬†needed to fill the functionality gap and provide even more insights on what happens on the teams and projects that are managed with the Gerrit workflow.

What gets extracted and analyzed from Gerrit Code Review?

Gerrit Code Review has a goldmine of information related to the overall software development lifecycle:

  • Contributors and their team and company ownership
  • Repositories and their associated metadata
  • Git commits with their associated review notes
  • Feedback on code and review scores
  • Events on what happens in terms of commits, reviews, refs and much more

Additionally, Gerrit Code Review allows having a repository-agnostic view by allowing to query and search for information across multiple repositories.

All that “gold” can be extracted, organized and refined to be leveraged on a global scale and extract useful KPI for the Teams and entire Company.

Which components are needed?

To build an extraction pipeline to dig the “Gerrit Code Review goldmine of information” you need the following components:

  1. Gerrit Code Review Ver. 2.13 or later
  2. Gerrit Analytics extractor plugin
  3. Gerrit Analytics ETL Spark Job
  4. ELK stack

We have built a working pipeline that is active 24×7 and is displaying the overall Gerrit Analytics of … the Gerrit Code Review project itself. Best way to iterate on the Gerrit Analytics features is dogfooding as we always did on the Gerrit Code Review project itself.

When can you start using it?

You can start building your Gerrit Analytics pipeline right now by installing the Gerrit Analytics plugin and running the ETL transformation on a regular basis.

If you need any help GerritForge can build and manage the pipeline for you. GerritForge provides the SaaS (Software as a Service) solution for setting up the extraction flows and give you the “out-of-the-box” solution for your company.

To learn more about how GerritForge can support and help you, go to https://gerritforge.com/contact.