Jenkins ❤︎ Gerrit Code Review, again

Gerrit Code Review has been integrated with Jenkins for over nine years. The integration dates back to when Kohsuke was still a Senior Engineer at Sun Microsystems, whose acquisition by Oracle had just been announced, and his OpenSource CI project was still called Hudson.

Jenkins and Gerrit are the most critical components of the DevOps Pipeline because of their focus on people (the developers), their code and collaboration (the code review), and their builds and tests (the Jenkinsfile pipeline) that produce the value stream to the end user.

The integration between code and build is so important that other solutions like GitLab have made it a single integrated tool, and even GitHub started covering the “last mile” a few months ago by offering powerful actions APIs and workflows to automate build actions around the code collaboration.

Accelerate the CI/CD pipeline

DevOps is all about iteration and fast feedback. That can be achieved by automating the build and verification of the code changes into a target environment, by allowing all the stakeholders to have early access to what the feature will look like, and by validating the results with speed and quality at the same time.

Every development team wants to shorten the cycle time and spend less time on tedious work by automating it as much as possible. That trend has created an explosion of fully automated processes called “Bots”, which are increasingly responsible for the tasks that developers are not interested in doing manually over and over again.

As a result, developers are doing more creative and design work, are more motivated and productive, can address technical debt a lot sooner and allow the business to go faster in more innovative directions.

As more and more companies adopt DevOps, it becomes more important to be better and faster than your competitors. The most effective way to accelerate is to extract your data, understand where your bottlenecks are, experiment with changes and measure progress.

Humans vs. Bots

The Gerrit Code Review project is fully based on an automated DevOps pipeline using Jenkins. We collect the data produced during the development and testing of the platform and constantly extract metrics and graphs from it at https://analytics.gerrithub.io, thanks to the OpenSource solution Gerrit DevOps Analytics (aka GDA).

By looking at the protocol and code statistics, we found out that bots work much harder than humans on GerritHub.io, which hosts many other popular OpenSource projects in addition to the mirrored Gerrit Code Review projects.

That should not come as a surprise if you think of how many activities could potentially happen whenever a PatchSet is submitted in Gerrit: style checking, static code analysis, unit and integration testing, etc.

[Image: human vs. bot]

We also noticed that most of the bots’ activity is over SSH. We started to analyze what the Bots are doing, to see what the impact is on our service and whether there are any improvements we can make.
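The kind of breakdown described above can be sketched in a few lines of Python. The record layout (`actor`, `is_bot`, `protocol`) and the sample values are made up for illustration; they are neither GDA's actual schema nor real GerritHub.io data.

```python
from collections import Counter

# Hypothetical activity records: field names and values are illustrative only.
SAMPLE_ACTIVITY = [
    {"actor": "alice", "is_bot": False, "protocol": "HTTP"},
    {"actor": "ci-bot", "is_bot": True, "protocol": "SSH"},
    {"actor": "ci-bot", "is_bot": True, "protocol": "SSH"},
    {"actor": "lint-bot", "is_bot": True, "protocol": "HTTP"},
    {"actor": "bob", "is_bot": False, "protocol": "SSH"},
]


def activity_breakdown(records):
    """Count activities per (actor kind, protocol) pair."""
    counts = Counter()
    for record in records:
        kind = "bot" if record["is_bot"] else "human"
        counts[(kind, record["protocol"])] += 1
    return counts


def ssh_share(counts, kind):
    """Fraction of a given actor kind's activity that goes over SSH."""
    total = sum(n for (k, _), n in counts.items() if k == kind)
    if total == 0:
        return 0.0
    return counts[(kind, "SSH")] / total
```

Calling `ssh_share(activity_breakdown(SAMPLE_ACTIVITY), "bot")` on real records would give the proportion of bot traffic that uses SSH, which is the figure that prompted the investigation in the next section.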

Build integration, the wrong way

GerritHub has an active site with multiple nodes serving read/write traffic and a disaster recovery site ready to take over whenever the active one has any problem.

Whenever we roll out a new version of Gerrit, using the so-called ping-pong technique, we swap the roles of the two sites (see here for more details). Within the same site, the traffic can also jump from one node to another in the same cluster using active failover, based on health, load and availability. The issue is that we end up in a situation like the following:

[Image: basic use case diagram]

The “old” instance still served SSH traffic after the switch. We noticed we had loads of long-lived SSH connections. These are mostly integration tools keeping SSH connections open listening to Gerrit events.

Long-lived SSH connections have several issues:

  • SSH traffic doesn’t allow smart routing. Hence we end up with HTTP traffic going to the currently active node and most of the SSH traffic still on the old one
  • There is no resource pooling since the connections are not released
  • There is the potential loss of events when restarting the connections

That impacts the overall stability of the software delivery lifecycle, extending the feedback loop and slowing your DevOps pipeline down.
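A toy model makes the event-loss issue listed above concrete. It assumes a listener that is simply offline for a time window and a sender that retries; nothing here reflects how Gerrit or any plugin is actually implemented, and all names are hypothetical.

```python
def deliver_over_stream(event_times, offline_from, offline_until):
    """Stateful stream: events emitted while the listener is disconnected are never seen."""
    return [t for t in event_times if not (offline_from <= t < offline_until)]


def deliver_with_retry(event_times, offline_from, offline_until):
    """Stateless delivery with sender-side retry.

    Events emitted during the outage are held by the sender and re-sent once the
    receiver is back. Returns (event_time, delivery_time) pairs.
    """
    delivered = []
    for t in event_times:
        if offline_from <= t < offline_until:
            delivered.append((t, offline_until))  # retried after the outage
        else:
            delivered.append((t, t))  # delivered immediately
    return delivered
```

With events emitted at times 1 to 5 and the listener offline from 2 (inclusive) to 4 (exclusive), `deliver_over_stream([1, 2, 3, 4, 5], 2, 4)` returns `[1, 4, 5]`: two events are lost. `deliver_with_retry` returns all five, with events 2 and 3 delivered late at time 4.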

Then we started to drill down into the stateful connections to understand why they exist, where they come from and, most importantly, which part of the pipeline they belong to.

Jenkins Integration use-case

The Gerrit Trigger plugin for Jenkins is one of the integration tools that has historically suffered from those problems. Unfortunately, the initially tight integration has become less effective, slower and more complex to use over the years.

There are mainly two options to integrate Jenkins with Gerrit:

  • the Gerrit Trigger plugin
  • the Gerrit Code Review plugin

We use both of them with the Gerrit Code Review project, and we have put together a summary of how they compare to each other:

| | Gerrit Trigger Plugin | Gerrit Code Review Plugin | Notes |
|---|---|---|---|
| Trigger mechanism | Stateful: Jenkins listens for Gerrit events stream | Stateless: Gerrit webhooks notify events to Jenkins | Stateful stream events are consuming resources on both Jenkins and Gerrit |
| Transport Protocol | SSH session on a long-lived stream events connection | HTTP calls for each individual stream event | SSH cannot be load-balanced; SSH connections cannot be pooled or reused |
| Setup Complexity | Hard: requires a node-level and project-level configuration. No native Jenkinsfile pipeline integration | Easy: no special knowledge required. Integrates natively with Jenkinsfile and multi-branch pipeline | Configuring the Gerrit Trigger Plugin is more error-prone because it requires a lot of parameters and settings. |
| Systems dependencies | Tightly coupled with Gerrit versions and plugins. | Uses Gerrit as a generic Git server, loosely coupled. | Upgrade of Gerrit might break the Gerrit Trigger Plugin integration. |
| Gerrit knowledge | Admin: you need to know a lot of Gerrit-specific settings to integrate with Jenkins. | User: you only need to know Gerrit clone URL and credentials. | The Gerrit Trigger plugin requires special user and permissions to listen to Gerrit stream events. |
| Fault tolerance to Jenkins restart | Missed events: unless you install a server-side DB to capture and replay the events. | Transparent: all events are sent as soon as Jenkins is back. | Gerrit webhook automatically tracks and retries events transparently. |
| Tolerance to Gerrit rolling restart | Events stuck: Gerrit events are stuck until the connection is reset. | Transparent: any of the Gerrit nodes active continue to send events. | Gerrit trigger plugin is forced to terminate stream with a watchdog, but will still miss events. |
| Differentiate actions per stage | No flexibility to tailor the Gerrit labels to each stage of the Jenkinsfile pipeline. | Full availability to Gerrit labels and comments in the Jenkinsfile pipeline | |
| Multi-branch support | Custom: you need to use the Gerrit Trigger Plugin environment variables to checkout the right branch. | Native: integrates with the multi-branch projects and Jenkinsfile pipelines, without having to set up anything special. | |
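To illustrate the stateless model from the comparison, here is a minimal sketch of what a receiver could do with the payload of one webhook call. It assumes the body is a JSON-encoded Gerrit event carrying the usual `type`, `change` and `patchSet` fields; the function name and the returned dictionary are hypothetical, and this is not the Gerrit Code Review plugin's code.

```python
import json


def build_request_from_event(body):
    """Turn one JSON event payload into a build request, or None if not relevant.

    Each call is independent: no connection or session state is kept between events.
    """
    event = json.loads(body)
    if event.get("type") != "patchset-created":
        return None
    change = event["change"]
    patch_set = event["patchSet"]
    return {
        "project": change["project"],
        "branch": change["branch"],
        "ref": patch_set["ref"],  # the Git ref to fetch for this patch set
        "revision": patch_set["revision"],
    }
```

Because every event arrives as its own HTTP request, any node behind a load balancer can handle it, which is exactly what a long-lived SSH stream cannot offer.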

Gerrit and Jenkins friends again

After so many years of adoption, evolution and also struggles in using them together, Gerrit Code Review finally has a first-class integration with Jenkins, liberating the Development Team from the tedious configuration and business-as-usual (BAU) management of triggering a build from a change under review.

Jenkins users truly love using Gerrit, and the other way around: friends and productive together, again.

Conclusion

Thanks to Gerrit DevOps Analytics (GDA) we managed to find one of the bottlenecks of the Gerrit DevOps Pipeline and make changes to make it faster, more effective and more reliable than ever before.

In this case, by just picking the right Jenkins integration plugin, your Gerrit Code Review Master Server will run faster, with less resource utilization. Your Jenkins pipeline is going to be simpler and more reliable with the validation of each change under review, without delays or hiccups.

The Gerrit Code Review plugin for Jenkins is definitely the first-class integration to Gerrit. Give it a try yourself: you won’t believe how easy it is to set up.

Fabio Ponciroli
Gerrit Code Review Contributor, GerritForge.

Accelerate with Gerrit DevOps Analytics, in one click!

Accelerating your time to market while delivering high-quality products is vital for any company of any size. This fast-paced and ever-evolving world relies on getting quicker and better in the production pipeline of the products. The DevOps and Lean methodologies help to achieve the speed and quality needed by continuously improving the process in a so-called feedback loop. The faster the cycle, the quicker you gain the competitive advantage needed to outperform the competition.

It is fundamental to have a scientific approach and put metrics in place to measure and monitor the progress of the different actors in the whole software lifecycle and delivery pipeline.

Gerrit DevOps Analytics (GDA) to the rescue

We need data to build metrics and to design our continuous improvement lifecycle around them. We need to extract information from all the components we use, directly or indirectly, on a daily basis:

  • SCM/VCS (Source and Configuration Management, Version Control System)
    How many commits are going through the pipeline?
  • Code Review
    What’s the lead time for a piece of code to get validated?
    How are people interacting and cooperating around the code?
  • Issue tracker (e.g. Jira)
    How long does the end-to-end lifecycle take outside the development, from idea to production?

Getting logs from these sources and understanding what they are telling us is fundamental to anticipate delays in deliveries, evaluate the risk of a product release and make changes in the organization to accelerate the teams’ productivity. That is not an easy task.
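As an example of such a metric, the code-review lead time can be sketched as follows. The sketch assumes each change is a dictionary with `created` and (once merged) `submitted` timestamps in the `yyyy-mm-dd hh:mm:ss.fffffffff` form used by the Gerrit REST API, and it treats submission as the moment the code is "validated", which is a simplification. It is not how GDA computes its metrics.

```python
from datetime import datetime
from statistics import median


def parse_timestamp(ts):
    """Parse 'yyyy-mm-dd hh:mm:ss.fffffffff', truncating nanoseconds to microseconds."""
    return datetime.strptime(ts[:26], "%Y-%m-%d %H:%M:%S.%f")


def lead_time_hours(change):
    """Hours between a change being created and being submitted."""
    delta = parse_timestamp(change["submitted"]) - parse_timestamp(change["created"])
    return delta.total_seconds() / 3600


def median_lead_time_hours(changes):
    """Median lead time over the changes that have actually been submitted."""
    times = [lead_time_hours(c) for c in changes if "submitted" in c]
    return median(times) if times else None
```

The median is used rather than the mean so that a handful of long-running changes does not dominate the result; that choice is an assumption of this sketch.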

Gerrit DevOps Analytics (aka GDA) is an OpenSource solution for collecting data, aggregating it along different dimensions and exposing meaningful metrics in a timely fashion.

GDA is part of the Gerrit Code Review ecosystem and was presented at the last Gerrit User Summit 2018 at Cloudera HQ in Palo Alto. However, GDA is not limited to Gerrit and aims at integrating and processing information coming from other version control and code-review systems, including GitLab, GitHub and BitBucket.

Case study: GDA applied to the Gerrit Code Review project

One of the golden rules of Lean and DevOps is continuous improvement: “eating your own dog food” is the perfect way to measure the progress of the solution, by using its outcome in our daily life of developing GDA.

As part of the Gerrit project, I have been working with GerritForge to create Open Source tools to develop the GDA dashboards. These are based on events coming from Gerrit and Git, but we also extract data coming from the CI system and the issue tracker. These tools include the ETL for the data extraction, and the presentation of the data.

As you will see in the examples, Gerrit is not just the code review tool itself but also its plugins ecosystem, hence you might want to include the plugins as well in any collection and processing of analytics data.

Wanna try GDA? You are just one click away.

We made GDA more accessible to everybody, so more people can play with it and understand its potential. We created the Gerrit Analytics Wizard plugin so you can get some insights into your data with just one click.

What you can do

With the Gerrit Analytics Wizard you can get started quickly. With only one click you get:

  • An initial setup with an Analytics playground and some default charts
  • A Dashboard populated with data coming from one or more projects of your choice

The full GDA experience

When using the full GDA experience, you have full control of your data and can do what the Wizard cannot:

  • Schedule recurring data imports. The Wizard is just meant to run a one-off import of the data
  • Create a production-ready environment. The Wizard is meant to build a playground to explore the potential of GDA

What components are needed?

To run the Gerrit Analytics Wizard you need:

You can find more detailed information about the installation here.

One click to crunch loads of data

Once you have Gerrit and the GDA Analytics and Wizard plugins installed, choose the top menu item Analytics Wizard > Configure Dashboard.

You land on the Analytics Wizard and can configure the following parameters:

  • Dashboard name (mandatory): name of the dashboard to create
  • Projects prefix (optional): prefix of the projects to import, e.g. “gerrit” will match all the projects whose names start with “gerrit”. NOTE: The prefix does not support wildcards or regular expressions.
  • Date time-frame (optional): date and time interval of the data to import. If not specified, the whole history will be imported without restrictions of date or time.
  • Username/Password (optional): credentials for Gerrit API, if basic auth is needed to access the project’s data.
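The prefix and time-frame semantics described above can be sketched as follows. This is an illustration of the documented behavior (literal prefix match, optional interval), not the Wizard's code; the function names are hypothetical and treating both interval bounds as inclusive is an assumption.

```python
from datetime import date


def select_projects(projects, prefix=None):
    """Keep projects whose name starts with the literal prefix (no wildcards, no regex)."""
    if not prefix:
        return list(projects)
    return [name for name in projects if name.startswith(prefix)]


def in_time_frame(day, since=None, until=None):
    """True if the day falls inside the optional interval; a missing bound means no restriction."""
    if since is not None and day < since:
        return False
    if until is not None and day > until:
        return False
    return True
```

For instance, the prefix `gerrit` selects `gerrit` and `gerrit-ci` but not `plugins/gerrit-x`, and a prefix such as `ger*` matches nothing because the asterisk is compared literally.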

Sample dashboard analytics wizard page:

[Image: Analytics Wizard configuration page]

Once you are done with the configuration, press the “Create Dashboard” button and wait for the Dashboard, tailored to your data, to be created. Beware that this operation will take a while, since it requires downloading several Docker images and running an ETL job to collect and aggregate the data.

At the end of the data crunching you will be presented with a Dashboard with some initial Analytics graphs like the one below:

[Image: sample Dashboard with initial Analytics graphs]

You can now navigate among the different charts along different dimensions (time, projects, people and Teams), uncovering the potential of your data thanks to GDA!

What has just happened behind the scenes?

When you press the “Create Dashboard” button, loads of magic happens behind the scenes. Several Docker images are downloaded to run an ElasticSearch and Kibana instance locally, to set up the Dashboard and to run the ETL job that imports the data. Here is a sequence workflow illustrating the chain of events:

[Image: components and sequence of events]

Conclusion

Getting insights into your data is so important and has never been so simple. GDA is an OpenSource and SaaS (Software as a Service) solution designed, implemented and operated by GerritForge. GDA allows setting up the extraction flows and gives you the “out-of-the-box” solution for accelerating your company’s business right now.

Contact us if you need any help with setting up a Data Analytics pipeline or if you have any feedback about Gerrit DevOps Analytics.

Fabio Ponciroli – Gerrit Code Review Contributor – GerritForge Ltd.