Gerrit Code Review has been integrated with Jenkins for over nine years. The integration dates back to when Kohsuke was still a Senior Engineer at Sun Microsystems, whose acquisition by Oracle had just been announced, and his open-source CI project was still called Hudson.
Jenkins and Gerrit are the most critical components of the DevOps pipeline because of their focus on people (the developers), their code and collaboration (the code review), and their builds and tests (the Jenkinsfile pipeline), which together produce the value stream to the end user.
The integration between code and build is so important that other solutions like GitLab have made it a single integrated tool, and even GitHub started covering the “last mile” a few months ago by offering powerful actions APIs and workflows to automate build actions around code collaboration.
Accelerate the CI/CD pipeline
DevOps is all about iteration and fast feedback. That can be achieved by automating the build and verification of code changes in a target environment, by giving all the stakeholders early access to what the feature will look like, and by validating the results with speed and quality at the same time.
Every development team wants to shorten the cycle time and spend less time on tedious work by automating it as much as possible. That trend has created an explosion of fully automated processes called “Bots”, which are increasingly responsible for the tasks that developers are not interested in doing manually over and over again.
As a result, developers are doing more creative and design work, are more motivated and productive, can address technical debt a lot sooner and allow the business to go faster in more innovative directions.
As more and more companies adopt DevOps, it becomes more important to be better and faster than your competitors. The most effective way to accelerate is to extract your data, understand where your bottlenecks are, experiment with changes and measure progress.
Humans vs. Bots
The Gerrit Code Review project is fully based on an automated DevOps pipeline using Jenkins. We collect the data produced during the development and testing of the platform and continuously extract metrics and graphs from it at https://analytics.gerrithub.io, thanks to the open-source solution Gerrit DevOps Analytics (aka GDA).
By looking at the protocol and code statistics, we found out that bots work much harder than humans on GerritHub.io, which, apart from the mirrored Gerrit Code Review projects, also hosts many other popular open-source projects.
That should not come as a surprise if you think of how many activities could potentially happen whenever a PatchSet is submitted in Gerrit: style checking, static code analysis, unit and integration testing, etc.
We also noticed that most of the bots' activity happens over SSH. We started to analyze what the bots are doing, what their impact is on our service and whether there are any improvements we can make.
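As a rough illustration of this kind of breakdown, the sketch below tallies requests by account type and protocol. The records are made up for the example and do not come from GerritHub.io or from GDA, and the function name `share_by_group` is hypothetical.

```python
from collections import Counter

# Made-up sample records for illustration only: one (account_type, protocol)
# pair per request. These are NOT real GerritHub.io measurements.
SAMPLE_REQUESTS = [
    ("bot", "ssh"), ("bot", "ssh"), ("bot", "http"),
    ("human", "http"), ("bot", "ssh"), ("human", "ssh"),
]


def share_by_group(requests):
    """Return the fraction of all requests for each (account_type, protocol) pair."""
    counts = Counter(requests)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


if __name__ == "__main__":
    # Print the groups from the largest share to the smallest.
    for group, share in sorted(share_by_group(SAMPLE_REQUESTS).items(),
                               key=lambda item: item[1], reverse=True):
        print(group, f"{share:.0%}")
```

Grouping by both dimensions at once is what makes a pattern like "mostly bots, mostly SSH" visible in a single table.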
Build integration, the wrong way
GerritHub has an active site with multiple nodes serving read/write traffic and a disaster recovery site ready to take over whenever the active one has any problem.
Whenever we roll out a new version of Gerrit, we swap the roles of the two sites using the so-called ping-pong technique (see here for more details). Within the same site, traffic can also move from one node to another in the same cluster using active failover, based on health, load and availability. The issue is that we end up in a situation like the following:
The “old” instance still served SSH traffic after the switch. We noticed we had loads of long-lived SSH connections. These are mostly integration tools keeping SSH connections open listening to Gerrit events.
Long-lived SSH connections have several issues:
- SSH traffic doesn’t allow smart routing. Hence we end up with HTTP traffic going to the currently active node while most of the SSH traffic stays on the old one
- There is no resource pooling since the connections are not released
- There is the potential loss of events when restarting the connections
That impacts the overall stability of the software delivery lifecycle, extending the feedback loop and slowing your DevOps pipeline down.
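The routing problem can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions: the `Cluster` and `Stream` classes and the node names are hypothetical, and nothing here reflects how Gerrit or a real load balancer is implemented.

```python
class Stream:
    """A long-lived connection, pinned to the node that accepted it."""

    def __init__(self, node):
        self.node = node


class Cluster:
    """Toy model of a set of nodes with a switchable 'active' node."""

    def __init__(self, nodes, active):
        self.nodes = list(nodes)
        self.active = active

    def route_request(self):
        # A short-lived request is routed at the moment it is made,
        # so it lands on whichever node is active right now.
        return self.active

    def open_stream(self):
        # A long-lived connection stays on the node that was active
        # when it was opened, even if the active node changes later.
        return Stream(self.active)

    def switch_to(self, node):
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.active = node


if __name__ == "__main__":
    cluster = Cluster(["node-a", "node-b"], active="node-a")
    stream = cluster.open_stream()      # opened before the switch
    cluster.switch_to("node-b")         # roles are swapped
    print("stream is served by:", stream.node)
    print("new request goes to:", cluster.route_request())
```

In this model the stream only moves to the new node if the client closes and reopens it, which is also the moment when events can be lost.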
Then we started to drill down into the stateful connections to understand why they exist, where they are coming from and, most importantly, which part of the pipeline they belong to.
Jenkins Integration use-case
The Gerrit Trigger plugin for Jenkins is one of the integration tools that has historically suffered from those problems. Unfortunately, over the years the initially tight integration has become less effective, slow and complex to use.
There are mainly two options to integrate Jenkins with Gerrit:
- the Gerrit Trigger Plugin
- the Gerrit Code Review Plugin
We use both of them with the Gerrit Code Review project, and we have put together a summary of how they compare to each other:
| | Gerrit Trigger Plugin | Gerrit Code Review Plugin | Notes |
|---|---|---|---|
| Trigger mechanism | Jenkins listens for Gerrit events stream | Gerrit webhooks notify events to Jenkins | Stateful stream events consume resources on both Jenkins and Gerrit. |
| Transport protocol | SSH session on a long-lived stream events connection | HTTP calls for each individual stream event | SSH cannot be load-balanced; SSH connections cannot be pooled or reused. |
| Setup complexity | Hard: requires a node-level and project-level configuration. No native Jenkinsfile pipeline integration. | Easy: no special knowledge required. Integrates natively with Jenkinsfile and multi-branch pipeline. | Configuring the Gerrit Trigger Plugin is more error-prone because it requires a lot of parameters and settings. |
| System dependencies | Tightly coupled with Gerrit versions and plugins. | Uses Gerrit as a generic Git server, loosely coupled. | An upgrade of Gerrit might break the Gerrit Trigger Plugin integration. |
| Gerrit knowledge | Admin: you need to know a lot of Gerrit-specific settings to integrate with Jenkins. | User: you only need to know the Gerrit clone URL and credentials. | The Gerrit Trigger Plugin requires a special user and permissions to listen to Gerrit stream events. |
| Fault tolerance to Jenkins restart | Missed events: unless you install a server-side DB to capture and replay the events. | Transparent: all events are sent as soon as Jenkins is back. | Gerrit webhooks automatically track and retry events transparently. |
| Tolerance to Gerrit rolling restart | Events stuck: Gerrit events are stuck until the connection is reset. | Transparent: any of the active Gerrit nodes continues to send events. | The Gerrit Trigger Plugin is forced to terminate the stream with a watchdog, but will still miss events. |
| Differentiate actions per stage | No flexibility to tailor the Gerrit labels to each stage of the Jenkinsfile pipeline. | Full access to Gerrit labels and comments in the Jenkinsfile pipeline. | |
| Multi-branch support | Custom: you need to use the Gerrit Trigger Plugin environment variables to check out the right branch. | Native: integrates with multi-branch projects and Jenkinsfile pipelines, without having to set up anything special. | |
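To make the "HTTP call for each individual event" model concrete, here is a minimal sketch of what a receiver could do with one event delivered as a JSON body. It is not the code of either plugin. The field names (`type`, `change.project`, `change.branch`, `patchSet.ref`, `patchSet.revision`) follow the Gerrit stream-events JSON format, while the function name and the sample values are assumptions made up for the example.

```python
import json


def build_request_from_event(payload: bytes):
    """Turn one Gerrit event (JSON bytes) into a build description.

    Returns None for event types this sketch does not act on.
    """
    event = json.loads(payload)
    if event.get("type") != "patchset-created":
        return None
    change = event["change"]
    patch_set = event["patchSet"]
    return {
        "project": change["project"],
        "branch": change["branch"],
        "ref": patch_set["ref"],            # Git ref to fetch for the build
        "revision": patch_set["revision"],  # commit to verify
    }


# Hypothetical sample event; the project name, numbers and revision are made up.
SAMPLE_EVENT = json.dumps({
    "type": "patchset-created",
    "change": {"project": "demo/project", "branch": "master", "number": 1234},
    "patchSet": {"number": 2, "ref": "refs/changes/34/1234/2", "revision": "0123abcd"},
}).encode("utf-8")


if __name__ == "__main__":
    print(build_request_from_event(SAMPLE_EVENT))
```

Because each event arrives as its own self-contained request, any node can send it and any receiver instance can handle it, with no connection state to keep alive in between.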
Gerrit and Jenkins friends again
After so many years of adoption, evolution and struggles in using them together, Gerrit Code Review finally has a first-class integration with Jenkins, freeing the development team from the tedious configuration and business-as-usual (BAU) management of triggering a build from a change under review.
Jenkins users truly love using Gerrit and the other way around, friends and productive together, again.
Thanks to Gerrit DevOps Analytics (GDA), we managed to find one of the bottlenecks of the Gerrit DevOps pipeline and make changes that leave it faster, more effective and more reliable than ever before.
In this case, just by picking the right Jenkins integration plugin, your Gerrit Code Review master server will run faster and use fewer resources. Your Jenkins pipeline will be simpler and more reliable in validating each change under review, without delays or hiccups.
The Gerrit Code Review plugin for Jenkins is definitely the first-class integration with Gerrit. Give it a try yourself: you won’t believe how easy it is to set up.
Gerrit Code Review Contributor, GerritForge.