2023: New Year and opportunities for GerritForge and Gerrit Code Review

TL;DR: GerritForge has been dedicating its efforts to organising and managing the Gerrit User Summit in London back in November 2022, in conjunction with the release of Gerrit v3.7. The event has been a great success, with a significant presence on-site and record-breaking attendees on the GerritForge TV youtube channel. It has also committed to its promises to research and improve the JGit and Gerrit scalability to large mono-repos, with tens of millions of objects and refs. 2023 will see the finalisation of these efforts with an increase in development efforts and a new JGit Committer for pushing the platform to a new level of performance and scalability and a new innovating system for collecting and optimising the repository metrics automatically. Stay tuned.

Read the full story here below (9 mins read).


2022 has been a critical year for turning the Gerrit Code Review community and development back on track after the COVID-19 pandemic. At GerritForge, we’ve been working hard to make sure that the development, support, and innovation of Gerrit Code Review continue on its main objectives.

Gerrit Code Review v3.6 and v3.7

We have continued to deliver on the development and release of Gerrit Code Review and its plugins, helping the testing and releasing of versions v3.6.0 (May) and v3.7.0 (November).

Some numbers of the past 12 months’ development contributions by individual committers and companies:

  • 3,627 Changes have been merged on 76 projects related to the Gerrit Code Review platform, including JGit
  • 113 committers from 42 different organisations

A special mention to the top #10 contributors: Google (Ben Rohlfs, Edwin Kempin, Chris Pouchet, Dhruv Srivastava, Frank Borden, Milutin Kristofic), GerritForge (Luca Milanesio), Wikimedia (Paladox) and SAP (Matthias Sohn and Thomas Dräbing).

In comparison with 2021, we had 25% fewer changes merged but with more contributors coming from more companies, which is a symptom to a very healthy and thriving ecosystem of maintainers.

GerritForge has committed to resuming the face-to-face user summits, which were suspended since 2020.

The Gerrit User Summit 2022 took place in London, UK the 10-11 of November in a hybrid format, with people having the opportunity to participate either on-site or remotely on GerritForge’s YouTube TV channel.

It was a glorious success, with record-breaking attendance from all around the globe:

  • 50 people registered to attend on-site, 26 of them managed to arrive despite the London tube strike, whilst the others attended remotely
  • 235 people viewed the summit on YouTube with an average view time of 40 mins (one talk)

The summit survey had an outstanding report showing a huge acceptance and appreciation of the event:

  • 82% rated the remote video streaming as “good” or “outstanding”
  • 96% rated the quality of the summit as “good” or “outstanding.”
  • 100% would recommend the summit to a colleague, with 83% strongly recommending it

GerritHub.io SLA gets closer to five-nines.

We have been working hard to make Gerrit more stable and resilient throughout 2022, discovering and fixing many issues in the code base and on the multi-site software architecture.
In 2022, GerritHub.io had only six small hiccups for a total of 19 mins of downtime (SLA = 99.997%) over a 12-month period, a 75% reliability improvement compared to 2021.

We have run extensive RCAs on the causes of the downtime and identified two leading issues, which are explained in the details below.

The “anonymous unlimited query” hole in Gerrit
GerritHub.io has been subject to a 15 mins outage because of anonymous users being able to bring offline all the sites before the system could auto-recover.
Gerrit allows bypassing of all limits set in the ACLs for running queries by simply adding the “no-limit” parameter.
Returning an arbitrary payload without limits could allow a single user to generate a server-side workload for collecting and building a GBytes-sized JSON payload; unfortunately, that option was available to everyone, including anonymous users making any publicly faced Gerrit Code Review installation subject to deny-of-service attacks.
We have identified the issue, reported and fixed it in Gerrit with Change 333304, which has been included in Gerrit v3.3.10, v3.4.4, v3.5.1, and all v3.6.0 or later releases.

More granular monitoring and alerting
We have lowered the threshold of uptime checks on GerritHub.io to 1 minute, giving us the ability to detect and react immediately to 4 smaller hiccups. We have detected a lack of scalability for some specific higher-load projects. Those hiccups have been responsible for 2 mins of downtime over the 2nd part of 2022. Many more projects are also planning to be onboarded on GerritHub.io; hence we do need to address this project-specific capacity needs.

Scaling Gerrit Code Review and JGit beyond its limits

We have been investing a massive effort in building a test environment designed to stress Gerrit and JGit to its limits and identify all the limitations and bottlenecks that prevented us from scaling further.

Scaling the test repository
We have created over the months some test repositories that increased in every dimension:

  • Tens of millions of refs as both refs/changes and refs/heads
  • Millions of delta-chains
  • Tens of millions of Git objects
  • Packfiles of tens of Giga-bytes and packed refs of hundreds of megabytes

For generating a significant load on both client and server side, we have invested more into the aws-gerrit cloud setups and gatling-git performance loading tool.

There were some “well-known” issues and additional surprising ones.

SHA1 complexity and CPU utilization for large entities
JGit has been used SHA1 for identifying uniqueness not just for Git objects but also for other large entities. However, computing SHA1 has become increasingly CPU intensive because of the relatively recent findings about collisions on shattered.io.
We have highlighted two major potential improvements in cooperation with Matthias Sohn (SAP) on the raw SHA1 performance and its application for detecting packed-refs changes on the filesystem.

Commit priority queues
JGit has a custom implementation of priority queues which are intensively used in RevWalk, which has almost quadratic complexity. That isn’t a problem for small to medium chains of commits; however, when the number of commits reaches millions, the performance degradation becomes unbearable.
We have replaced the JGit’s custom implementation with the one provided by the Java JVM library, which has a logarithmic complexity that massively improves its performance with large commit chains.

Unwanted reachability checks
JGit needs to perform a full reachability check whenever a remote unknown client is advertising refs, which makes sense when serving a remote client. However, the cost of full reachability of millions of advertised refs can be a daunting task that may be alleviated if the remote end can be considered trusted.

Fixing JGit bitmaps
Since the introduction of Git bitmap, the whole community has learned how key they are in speeding up the counting and selection during the clone phase.
However, large and unoptimized bitmaps could be so unhelpful for Git that instead of speeding up, they could represent a massive overhead for the system, causing CPU spikes and, eventually, lowering the throughput of the server.
Git bitmaps are compressed using the JavaEWAH library, which is good for memory consumption but evil for CPU utilization: that is the reason why the smaller is best for performance.
We have discovered and fixed a critical issue with the JGit bitmap generation that was causing the inclusion of all commits and BLOBs pointed by annotated tags. Also, we have introduced the ability to inform JGit about the heads that can be excluded from the bitmap, allowing to shorten the creation tens of thousands times (5h generation time for a 2k refs to as little as 60s) and increase its effectiveness by 200%.

Millions of unneeded ref logs
When performing a clone of a repository with millions of heads, JGit created one local reflog file for every remote ref, including the ones there were not actually cloned but just fetched as remote references. This was creating a significant performance gap between JGit and Git, which would instead lazily create the reflog files once they are effectively checked out the first time. Cloning a single branch of a repository with millions of remote refs took around 1h, compared to a few minutes of Git.

All of the findings were included in multiple updates on the following components:

  • JGit changes: all fixes were also provided to stable-5.13, the last supported branch for Java 8, which allows benefiting from these improvements for older versions of Gerrit from v2.16 onwards.
  • pull-replication went through major performance improvements, achieving a 1000x times faster execution time compared to the traditional replication plugin
  • aws-gerrit is going through upgrades for making use of pull-replication plugin, including the support for the bearer token which allows to replicate virtually any repository, including All-Users.git
  • gatling-git: we have upgraded the Gatling version and JGit to the latest stable-5.13 to include the latest performance improvements.
  • git-repo-metrics: we have introduced a brand-new plugin that allows us to keep under control the major dimensions of a repository and therefore graph their increase over time.

GerritForge goals for 2023

We are definitely not done yet with the performance improvements on Gerrit and JGit: there are still significant improvements to be made, and JGit changes to get merged into the mainstream branches.
We believe we are on track to finalize the job and allow a stable and scalable platform for large Git repositories in 2023.

Finalise what we cooked in 2022 for JGit
JGit has a new maintainer, David Ostrovsky, awarded in 2022 as Git committer of the project. GerritForge’s devs are focused to get more reviews and attention to the JGit performance improvements. We are committed to finalising all the open changes related to large repositories.

JGit multi-pack indexes support
There is still a major gap between JGit and Git when dealing with very active repositories: multi-pack indexes. The proliferation of packfiles would eventually lead to a long and painful search-for-reuse phase for BLOBs which could be cut down 100s of times with a multi-pack index.

Git repository optimiser for Gerrit
We have been working on tracking the live information on the Git repository, thanks to the git-repo-metrics plugin. Wouldn’t it be nice to have a tool that can do something with it and automatically?
We would be doing R&D on how to correlate the repository metrics, the Git audit trail, and the performance data for making AI-based decisions on what needs to be improved on the repository.
This work stream is going to be useful for any Git repository, not just the ones powered by Gerrit Code Review. The ‘git-repo-metrics’ and the repository optimiser would also apply to other products, including GitHub and GitLab.

Gerrit v3.8 and projects-specific change numbers
We will finalise the design document for the transition to project-specific change numbers in Gerrit v3.8. That would allow the seamless migration of projects across Gerrit setups without having to worry about changes renumbering anymore.

Gerrit Code Review testing and GerritForge-certified binaries
GerritForge is spending a tremendous amount of time developing test environments and tools for serving the Gerrit community with more stable releases and improving the quality of its code. We want to intensify the effort and also offer our platinum support customers a unique service that includes the GerritForge digital signature and rubber stamp on the binaries of Gerrit Code Review and its plugins that have been successfully tested and validated for being production-ready.
Stay tuned; more details are coming soon …

GerritForge company forecast in 2023

GerritForge Inc. will finalise its roll-out to the USA, and all contracts and services will be run from Sunnyvale, CA and Europe. Over 2022, 60% of the customers and businesses have already been moved, and the operation will be completed over the course of 2023.

We are looking forward to doubling our revenue figures in 2023 and also our contributions to the open-source community, with a main focus on JGit as the driver of performance growth for Gerrit Code Review.


2023 is going to be an incredible year for GerritForge, Gerrit Code Review, and the JGit community altogether.

Happy New start of the Year 2023!

Luca Milanesio (GerritForge)
Gerrit Code Review Maintainer and Release Manager
Member of the Gerrit Engineering Steering Committee

2022 GOALS for Gerrit

The year 2021 has been a challenging one because of the COVID-19 global emergency; nevertheless, the Gerrit Code Review project has continued to deliver what the community expected:

GerritForge delivered on the promise of making Gerrit more cloud-native, with a particular focus on AWS, the platform that most users have adopted for running in the cloud. GoogleCloud has also been our focus, assuring a cloud-neutral approach to Gerrit and providing support for events over PubSub.

Focusing on Gerrit unique values

A successful product focuses on what makes it unique and innovative, compared to anything else in the world.

We believe that the key aspects that make Gerrit Code Review THE platform of choice for developing software based on Git repositories are:

  • Large-scale
    Gerrit is THE best platform for developing large-scale projects, huge monorepos, and a large number of changes and refs.
  • Maximum availability
    Large organizations and communities of developers need a platform that is always available, anywhere, anytime, 24×7, and 365 days a year.
  • Performance
    The need to work remotely poses multiple issues, one of them being the increase of network latency. Gerrit multi-site distribution of the repositories and reviews allows anyone, anywhere in the world, to clone, push and review at optimal latency and performance.
  • Quality of tracing of reviews
    Gerrit is based on single-commit code reviews, a winning approach in terms of review accuracy and supporting changes chains, and full traceability of the entire review history and workflow.

Many popular Git code-review tools exist in the Open-Source community; Gerrit is the winning choice when scale, availability, performance, and quality do matter.

GerritForge goals for improving Gerrit in 2022

Scale Gerrit beyond limits

GerritForge and the rest of the community have worked hard to identify the bottlenecks of large mono-repos with Gerrit. Some of them can be mitigated by keeping the Git repository lean and organized, despite the massive amount of push traffic and reviews coming from large teams.

We want to focus on improving at least ten times the following KPIs, without having a significant impact on the overall system performance:

  • Number of changes and refs in a repository: millions of changes and tens of millions of refs
  • Size of the repository: hundreds of GBs

GerritForge will step up its involvement in the JGit project in 2022 and introduce many innovations, some of them already implemented in the C-Git implementation:

  • JGit support for multi-pack index
  • Revamp of JGit cache, allowing the pluggability of high-performance implementations
  • Improvement of JGit bitmaps for large number of refs
  • Support for high-performance large storage systems
  • Introduction of new performance metrics
  • Replace Prolog with native submit rules in the owners plugin

99.999% up-time

GerritForge maintains a free service known as GerritHub.io to demonstrate what Gerrit can do and achieve. GerritHub.io is the most advanced and reliable Open-Source vanilla Gerrit deployment, apart from Google’s.

GerritHub.io uptime in 2021 – checked and reported by PIngdom.com

We achieved an astonishing 99.99% SLA in 2021; we want to push the GerritHub.io uptime further to 99.999%, reducing the annual downtime to just 315s.

In order to reach a five-nines uptime, we will work on:

  • Granular probing and health-checks
  • Advanced repositories performance monitoring and alerting
  • Gerrit limits and deadlines
  • RCAs
  • Multi-site improvements

Goal #3: Increase 1000x times the Gerrit replication performance

GerritForge has presented the innovating pull-replication plugin at the Gerrit Virtual User Summit 2021, showing that it is possible to replicate Git commits and changes meta-data across the globe with msec latency. The pull-replication plugin technology and speed is going to be improved and made available and Open-Sourced to anyone and match and outperform the traditional replication plugin features.

Join the 2022 endeavor


We need YOU and the Gerrit community’s help and support in this 2022 endeavor.

GerritForge has already increased his Team of contributors working on the project, including three Gerrit maintainers and two Gerrit release managers. However, Gerrit’s success is in the cooperation, contribution, and ideas of the whole community of contributors, Gerrit admins, and users.

Let us know what you think about our goals. We are happy to cooperate and work with anyone sharing the same values and goals.

2022 is the year where Gerrit Code Review is pushed beyond its limits even further, making it the MOST innovative tool for large-scale repositories and teams worldwide.

GerritForge 2020 year in Review

Dear Gerrit Code Review fellow,

It has been a challenging year, a strange year, for everyone. Most of us confined in our homes and working remotely, finding new ways of dealing with old problems.

Still, we believe the Gerrit Community as a whole has demonstrated an outstanding level of resilience in the face of exceptional difficulties. As far as we are concerned, we have seen no lesser activity, interest, new projects, compared to the previous years. For this, we are thankful to the community we are so strongly proud of being a part of.

In sharing our most sincere vows for happy festivities and a fruitful new year, we wanted to take the opportunity to share with you some of the achievements that made this 2020 worth getting through.

Top-Ten achievements in 2020

1. GerritForge has confirmed to be the world’s largest non-Google contributor to the Gerrit Code Review project.

GerritForge contributed almost a thousand changes merged, over 47 different components in the past 12 months. That is an outstanding achievement confirming the commitment and dedication of the company to the Gerrit Code Review platform and community.

2. Improved Gerrit DevOps Analytics with the ability to process Changes hashtags

The Gerrit DevOps Analytics platform continued to expand its possibilities, with the full support of the parsing of the NoteDb changes and the extractions of precious meta-data, such as the changes hashtags.

3. Driven two major releases of Gerrit Code Review

GerritForge helped driving the release of two major Gerrit Code Review version: Gerrit v3.2 – 1st June and with Ericsson making the Gerrit v3.3 – 1st December. GerritForge is also providing the CI/CD pipeline for the build and validation of the releases and helped the migration to Java 11.

4. Released major security fixes for the whole Gerrit Community, of all Gerrit versions dating back to v2.14

GerritForge coordinated together with Ericsson the release of critical security fixes across 6 different versions of the Gerrit Code Review platform, showing a continued commitment to a secure and scalable adoption of Gerrit in the enterprise.

5. Brand-new Gerrit module cache-chroniclemap, a new high-performance persistent cache in Gerrit v3.3

GerritForge continued its efforts in improving Gerrit’s scalability and performance with the development of a brand-new persistent cache backend powered by ChronicleMap, showing unprecedented low latency and performance on Gerrit v3.3.

6. Introduced the world-first official Open-Source AWS-based Gerrit Code Review Deployment

In a year of remote working and flexible environments, GerritForge provided its contribution to the whole community to help adopting Gerrit in the cloud. The aws-gerrit project is a brand-new production-ready AWS deployment now available for everyone and fully Open-Source. The new project is based on the GitOps best-practices and has been successfully adopted for the testing and validation of Gerrit v3.3.

7. New read-write scalability for Gerrit in High-Availability

GerritForge’s mission to scalability and high-availability (HA) plugin continued with the ability to scale horizontally with multiple R/W Gerrit Servers behind the same load balancer.

8. Improved reliability and flexibility of Gerrit Multi-Site

The Gerrit Multi-Site (MS) plugin has received exciting improvements with the support for geographically distributed R/W Gerrit servers across the globe.

9. Brand-new pull-replication plugin for improving latency and performance of large mono-repos replication.

The need to have Gerrit multi-site brought the request to have faster replication, especially for large mono-repos. GerritForge introduced a new pull-replication plugin project that, used in conjunction with Git Protocol v2, assure top-notch performance in replicating repositories with large number of refs.

10. Helped large-scale OpenSource Community migrating to Gerrit v3

The Eclipse Foundation and the OpenDev platform (by OpenStack), upgraded to the latest version of Gerrit Code Review, thanks to the continuous help and support from GerritForge and the rest of the Gerrit Code Review community.


All the best, Stay Safe, and we will shake hands again soon!

The GerritForge Team

Crunch Code Review hashtags with Gerrit DevOps Analytics

Screenshot 2020-05-12 at 11.13.06.pngWe have already discussed in previous posts how important it is to speedup the feedback loop in your Software Development Lifecycle. Having early feedbacks gives you the chance of evaluating your hypothesis and eventually change direction if needed. 

The more information you have, the smarter can be your decisions.

We recently added in our Gerrit DevOps Analytics the possibility of extracting data coming from Code Reviews’ metadata to extend the knowledge we can get out of Gerrit.

Furthermore, it is possible to extract meta-data from repositories not necessarily hosted on the Gerrit instance running the analytics processing. This is a big improvement since it allows to fully analyse repositories coming from any Gerrit server.

For example, the Gerrit analytics we are providing on https://analytics.gerrithub.io are coming from the Gerrit repository hosted on the gerrit-review.googlesource.com, the Gerrit server hosted by Google.

Hashtags aggregation

One important type of meta-data contained in the Code Reviews is the hashtag.

Hashtags are freeform strings associated with a change, like on social media platforms. In Gerrit, you explicitly associate hashtags with changes using a dedicated area of the UI; they are not parsed from commit messages or comments.

Similar to topics, hashtags can be used to group related changes together and to search using the hashtag: operator. Unlike topics, a change can have multiple hashtags, and they are only used for informational grouping; changes with the same hashtags are not necessarily submitted together.

You can use them, for example, to mark and easily search all the changes blocking a particular release:

Screenshot 2020-05-12 at 10.43.00.png

Hashtags can also be used to aggregate all the changes people have been working on during a particular event, for example, the Gerrit User Summit 2019 hackathon:

Screenshot 2020-05-12 at 10.44.39.png

The latest version of the Gerrit Analytics plugin exposes the hashtags attached to their respecting Git commit data. Let’s explore together some use cases:

The most popular Gerrit Code Review hashtags over the last 12 months

Screenshot 2020-05-12 at 10.47.49.png

Throughput of changes created during an event

see for example the Palo alto hackathon (#palo-alto-2018). We can see at the end of the week the spike of changes to release Gerrit 2.16.

Screenshot 2020-05-12 at 10.56.44.png

The extend of time for a feature

Removing GWT was an extensive effort which started in 2017 and ended in 2019. It took several hackathons to tackle the removal as shown by the hashtags distribution. Some changes were started in one hackathon and finalised in the next one.

Screenshot 2020-05-12 at 11.05.51.png

Those were some example of useful information on how to leverage the power of GDA.

The above examples are taken from the GDA dashboard provided and hosted by GerritForge on https://analytics.gerrithub.io which mirror commits and reviews on a regular basis from the Gerrit project and its plugin ecosystem.

How to setup GDA on Gerrit

Hashtag extraction is currently available from Gerrit 3.1 onwards. You can download the latest version released from the Gerrit CI.

To enable hashtag extraction you need to enable the feature extraction in the plugin config file as follow:

# analitycs.config
[contributors]
  extract-hashtags = true

For more information on how to configure and run the plugin, look at the analytics plugin documentation.

Conclusion

Data is the goldmine of your company. You need more and more of it for making smarter decision. The latest version of the GDA allows you to leverage even more data produced during the code review process.

You can explore the potential of the information held in Gerrit on the analytics dashboard provided by GerritForge on analytics.gerrithub.io.

If you would like your open source Gerrit hosted project to be added to our dashboard or would need help in setting up and supporting GDA for your organization, get in touch with GerritForge Sales Team and we can help you making smarter decisions today.

Accelerate with Gerrit DevOps Analytics, in one click!

 

Accelerating your time to market while delivering high-quality products is vital for any company of any size. This fast pacing and always evolving world relies on getting quicker and better in the production pipeline of the products. The whole DevOps and Lean methodologies help to achieve the speed and quality needed by continuously improving the process in a so-called feedback loop. The faster the cycle, the quicker is the ability to achieve the competitive advantage to outperform and beat the competition.

It is fundamental to have a scientific approach and put metrics in place to measure and monitor the progress of the different actors in the whole software lifecycle and delivery pipeline.

Gerrit DevOps Analytics (GDA) to the rescue

We need data to build metrics to design our continuous improvement lifecycle around it. We need to juice information from all the components we use, directly or indirectly, on a daily basis:

  • SCM/VCS (Source and Configuration Management, Version Control System)
    how many commits are going through the pipeline?
  • Code Review
    what’s the lead time for a piece of code to get validated?
    How are people interacting and cooperating around the code?
  • Issue tracker (e.g. Jira)
    how long does it take the end-to-end lifecycle outside the development, from idea to production?

Getting logs from these sources and understanding what they are telling us is fundamental to anticipate delays in deliveries, evaluate the risk of a product release and make changes in the organization to accelerate the teams’ productivity. That is not an easy task.

Gerrit DevOps Analytics (aka GDA) is an OpenSource solution for collecting data, aggregating them based on different dimensions and expose meaningful metrics in a timely fashion.

GDA is part of the Gerrit Code Review ecosystem and has been presented during the last Gerrit User Summit 2018 at Cloudera HQ in Palo Alto. However, GDA is not limited to Gerrit and is aiming at integrating and processing any information coming from other version control and code-review systems, including GitLab, GitHub and BitBucket.

Case study: GDA applied to the Gerrit Code Review project

One of the golden rules of Lean and DevOps is continuous improvement: “eating your dog food” is the perfect way to measure the progress of the solution by using its outcome in our daily life of developing GDA.

As part of the Gerrit project, I have been working with GerritForge to create Open Source tools to develop the GDA dashboards. These are based on events coming from Gerrit and Git, but we also extract data coming from the CI system, the Issue tracker. These tools include the ETL, for the data extraction and the presentation of the data.

As you will see in the examples Gerrit is not just the code review tool itself, but also its plugins ecosystem, hence you might want to include them as well into any collection and processing of analytics data.

Wanna try GDA? You are just one click away.

We made the GDA more accessible to everybody, so more people can play with it and understand its potentials. We create the Gerrit Analytics Wizard plugin so you can have some insights in your data with just one click.

What you can do

With the Gerrit Analytics Wizard you can get started quickly and with only one click you can get:

  • Initial setup with an Analytics playground with some defaults charts
  • Populate the Dashboard with data coming from one or more projects of your choice

The full GDA experience

When using the full GDA experience, you have the full control of your data:

  • Schedule recurring data imports. It is just meant to run a one-off import of the data
  • Create a production ready environment. It is meant to build a playground to explore the potentials of GDA

What components are needed?

To run the Gerrit Analytics Wizard you need:

You can find here more detailed information about the installation.

One click to crunch loads of data

Once you have Gerrit and the GDA Analytics and Wizard plugins installed, chose the top menu item Analytics Wizard > Configure Dashboard.

You land on the Analytics Wizard and can configure the following parameters:

  • Dashboard name (mandatory): name of the dashboard to create
  • Projects prefix (optional): prefix of the projects to import, i.e.: “gerrit” will match all the projects that are starting with the prefix “gerrit”. NOTE: The prefix does not support wildcards or regular expressions.
  • Date time-frame (optional): date and time interval of the data to import. If not specified the whole history will be imported without restrictions of date or time.
  • Username/Password (optional): credentials for Gerrit API, if basic auth is needed to access the project’s data.

Sample dashboard analytics wizard page:

wizard.pngOnce you are done with the configuration, press the “Create Dashboard” button and wait for the Dashboard, tailored to your data, to be created (beware this operation will take a while since it requires to download several Docker images and run an ETL job to collect and aggregate the data).

At the end of the data crunching you will be presented with a Dashboard with some initial Analytics graphs like the one below:

dashboard-e1549490575330.png

You can now navigate among the different charts from different dimensions, through time, projects, people and Teams, uncovering the potentials of your data thanks to GDA!

What has just happened behind the scenes?

When you press the “Create Dashboard” button, loads of magic happens behind the scenes. Several Docker images will be downloaded to run an ElasticSearch and Kibana instance locally, to set up the Dashboard and run the ETL job to import the data. Here a sequence workflow to illustrate the chain of events is happening:

components.png

Conclusion

Getting insights into your data is so important and has never been so simple. GDA is an OpenSource and SaaS (Software as a Service) solution designed, implemented and operated by GerritForge. GDA allows setting up the extraction flows and gives you the “out-of-the-box” solution for accelerating your company’s business right now.

Contact us if you need any help with setting up a Data Analytics pipeline or if you have any feedback about Gerrit DevOps Analytics.

Fabio Ponciroli – Gerrit Code Review Contributor – GerritForge Ltd.

Gerrit User Summit 2018

The Gerrit User Summit 2018 has ended. It has been a truly memorable event and with strong emotions, gratitude and celebration of the success of the Gerrit OpenSource project.

During the hackathon, some major events happened:

  • The release of Gerrit v2.16
  • Support for Kubernetes
  • New plugins contributed by Qualcomm
  • The announcement of a new maintainer, Marco Miller from Ericsson, congrats!

Hackathon and Gerrit v2.16

The hackathon took place at SAP (2 days) and Cloudera (1 day) in Palo Alto with the main focus of completing the migration to PolyGerrit and cut the v2.16 release. Three major milestones of the Gerrit project:

  1. PolyGerrit as default UI for Gerrit Code Review.
  2. Migration of the groups to NoteDb
  3. Support of the Git protocol v2

With regards to the achievement #1, during the hackathon, the commit for removing all the GWT related code from Gerrit master has been finally merged.

PolyGerrit as default Gerrit Code Review UI

The PolyGerrit Team has reached the summit of the long and troubled journey of getting rid of the outdated and Gerrit GWT UI. Kasper and Logan (Google) burned every single bit of their development energy to fix, implement and fill all the remaining gaps in the user journeys.

On Gerrit v2.16, PolyGerrit UI is the default and standard way of interacting with Gerrit Code Review. GWT is still there for allowing people to smoothly adapt to the new interface. However, the next forthcoming version of Gerrit v3.0 will not contain anymore any reference to the old UI. People will still have a few more months to adapt and, from the feedback received on GerritHub.io, they seem to not miss the GWT interface at all.

Well done to the entire PolyGerrit Team, starting from Andrew who kicked off the project back in 2015 to Wyatt, Becky, Victor, Kasper and Logan who brought it up to the usability and portability to multiple devices as it is today. Thanks to their amazing effort, we can all interact with Gerrit Code Review wherever and whenever we want, without losing speed and effectiveness.

Migration of Gerrit groups to NoteDb

One more step towards the complete elimination of ReviewDb has been achieved in Gerrit v2.16: there isn’t anymore any mutable data in the leftover of the (in)famous Gerrit DB. The new version includes a new git repository (All-Groups) which contains all the information related to the groups create through the Gerrit UI.

What’s left in ReviewDb? Not much: only the schema version. When you install Gerrit it is populated with a single row and a single column with the value ‘170’ and will never change. There is no need to share it across master or slaves: just set database.type = h2 and forget about it.

Git protocol v2

It is a major achievement of Gerrit v2.16 as the Git wire protocol v2 solves the performance problems of large repositories: the refs filtering during the advertisement phase.

In a nutshell, if you were fetching a single from a repository with 500k refs, the server would have sent a huge payload of all refs with the associated SHA1s even if at the end of the day you wanted only to fetch a single one. The delay could have been of several seconds if not minutes, which is quite relevant if you are fetching all the times, like in a CI/CD pipeline.

Git protocol v2 reduced the refs advertisement phase by over 70%, which makes Gerrit v2.16 so appealing that you may want to place in the top 5 tasks of your backlog.

Farewell to GWT UI code

Even though Gerrit v2.16 keeps the ability to display the old-fashioned GWT Gerrit UI, the code has been definitely removed on the Gerrit master branch, which will the base for Gerrit v3.0 that will be released in Spring 2019.

This is a memorable moment because represents the final act of making PolyGerrit UI the unified user-experience of Gerrit moving forward.

Data and Insights of the Hackathon

This year the event has been truly focused as well on the analytics side of the Code Review and the entire CI/CD pipeline: was it possibly the location, Cloudera, the world leader in OpenSource BigData and Analytics, giving the right vibrations? Possibly, yes.

Numbers

Over 20 people have attended the Hackathon on-site and, for the first time, remotely via video-conference, from companies and countries all around the world:

  1. Google (USA and Germany)
  2. CollabNet (Japan and Germany)
  3. GerritForge (UK, Ireland, Italy, and Germany)
  4. SAP (Germany)
  5. Qualcomm (USA)
  6. Ericsson (Canada)
  7. Wikimedia Foundation (UK)

Over 300+ changes have been uploaded and 244 of them have been already merged.

They have been working on 44 projects, showing how diverse is today the universe of Gerrit Code Review and its associated plugins and integrations.

Gerrit DevOps Analytics

GerritForge has been hacking on improving the platform that powers the Gerrit Code Review Analytics Platform which has been serving the community for many years.

For the first time, thanks to the introduction of branch-level analytics, the community had data automatically crunched in near-real time of what was happening on the master branch and on the on-going stable 2.16 release.

As a matter of fact, the numbers mentioned before are directly coming from Gerrit DevOps Analytics. Thanks again GerritForge for innovating and improving the Gerrit Code Review platform.

Gerrit Analytics Wizard

For the first time, two new contributors from GerritForge, Tony and Ponch, finalized and demoed a new plugin called ‘analytics-wizard’ that allows anyone with a Gerrit server to set up a mini-version of the Gerrit DevOps Analytics, using Docker, ElasticSearch, and Kibana.

Two new plugins: batch and tasks

Qualcomm has released two new plugins based on their experience of running complex pipelines on a multi-master Gerrit setup. The majority of the code was developed in-house before the hackathon. However, the final release and announced happened this week and I am sure it will allow many other companies to benefit from their experience of validating complex multi-repo changes on a large scale CI system.

Batch

This plugin provides a mechanism for building and previewing sets of proposed updates to multiple projects/refs that should be applied in a batch. These updates are built with Gerrit changes.

Task

The plugin provides a mechanism to manage tasks which need to be performed on changes along with a way to expose and query this information. Tasks are organized hierarchically, and task definitions use Gerrit queries to define which changes each task applies to, and how to define the status criteria for each task. An important use case of the task plugin is to have a common place for CI systems to define which changes they will operate on, and when they will do so.

Gerrit on Kubernetes from SAP and GerritForge

Luke (GerritForge), Matthias (SAP) and Thomas (SAP remotely from Germany) worked on a PoC for supporting Kubernetes deployments.

Last year SAP started using containerized slaves and decided to go cloud native and base our future Gerrit deployments on Kubernetes and leverage project Gardener to support multiple cloud providers. Matthias worked with Thomas to prepare a PoC and a demo the current work in progress and discuss plans moving forward.

The code has been published to the new k8s-gerrit repository and new commits will be subject to the standard Gerrit Code Review process.

Luke prepared an example of Gerrit v2.15 deployment in Kubernetes (single master at the moment) with shared storage and showed how to upgrade easily by leveraging Helm Charts deployments through Tiler.

Even though the two projects started in parallel, they agreed to cooperate and work together with the community to have a unified Kubernetes deployment code-base for installing and upgrading multi-master Gerrit setups. This is definitely an area where we will see very interesting developments very soon.

What about the Summit?

The talks of the summit have been very interesting this year and have covered mostly three main topics: Gerrit roadmap and scalability, sharing of real-life migrations to Gerrit v2.14 and v2.15 and, as previously mentioned, Data Analytics & Insights including research work on using machine-learning for supporting the Code Review process.

All the talks are getting published as we speak on each talk of the Gerrit User Summit Schedule.

All the talks have been recorded and will be published within the next few weeks on the GerritForge TV YouTube channel.

More blog posts will follow with more background on each talk, stay tuned and watch this space to know more about what has been presented and announced at the Summit.

Proposals for the next Gerrit User Summit 2019

The overall feedback on the summit has been very positive, including the Birthday Party organized and sponsored by GerritForge for the 10th anniversary of the Gerrit Code Review project.

During the final closing keynote, many interesting proposals have been made:

  • Gerrit users’ wish-list session.
    Any Gerrit user can feed the backlog of new feature proposing what they will like to see implemented
  • Gerrit plugin hackathon.
    Learn how to implement Gerrit plugin and trying to put together in one or two days something useful using any of the programming languages they wish, including Java, Groovy, Scala or anything else.

More details will come as soon as the closing keynote session recording will be published.

Remembering Shawn Pearce and his legacy

Dave Borowitz, the leader of the Gerrit Code Review project, made a thoughtful and touching speech about Shawn Pearce, the father of the Gerrit 2, who started the project exactly 10 years ago. The cake, the hackathon, the summit, and the v2.16 release have been fully dedicated to his memory as appreciation for his humanity and passion for OpenSource.

Thanks, Shawn. We will continue on your legacy and improve the Gerrit Code Review platform as you would have wanted us to do.