Zero-downtime Git and Gerrit Code Review @GoogleSource.com

Posted on March 22, 2015 by Git and Gerrit Code Review for the Enterprise

Where is this coming from?

Zero downtime image from http://www.couchbase.com

Yesterday GitHub was down for a DB upgrade, an outage that overall lasted for 23 minutes. This may not sound a problematic downtime at all, but when you think that nowadays GitHub is used not only for Software development worldwide as a Git server but as also as a source and binary packaging repository and distribution centre, a Markdown pages server and possibly much more … and you multiply by the number of users / repos hosted, then 23 minutes may translate in a significant disruption and, for some mission-critical business use-cases, even financial loss.

We never needed planned outages for DB upgrades on Gerrit Code Review used for a lot of other OpenSource projects (ranging from Android to Chromium): how the Gerrit team is managing to outperform GitHub? I asked Shawn Pearce to spend some time to describe how his team at Google managed to implement its roll-out strategy in the delivery pipeline going through tons of major DB upgrades with zero downtime worldwide.

He kindly responded on the Gerrit Code Review mailing list with this post, and we are very thankful for having shared his experience with us, hoping that GitHub guys will read this post and may learn from it for future GitHub DB upgrades.

I am reporting here Shawn’s post AS-IS, in order to maximise the audience and enable more people to access its content.

How googlesource.com manages database upgrades with no downtime (by Shawn Pearce)

In light of the recent GitHub database outage, Luca Milanesio asked me to describe how googlesource.com has managed nearly 3 years of database upgrades with zero downtime. So… here is an attempt. 🙂

tl;dr: protobuf, Bigtable, and multi-master.

Long version…

Bigtable … not SQL

Years ago we settled on using Google Bigtable as the backing database for googlesource.com instead of MySQL or PostgreSQL. This decision actually came about because of virtual hosting (see below), not because Google is any better at running Bigtable than MySQL or PostgreSQL (we run those well too).

Briefly, Bigtable is a NoSQL database that organizes data into tables of column families; read the Bigtable paper for an overview. Rows can contain irregular shapes of columns, and two rows in the same table do not need to have the same layout (columns).

To support Gerrit Code Review I hand-wrote a complete implementation of the ReviewDB interface and all of its sub interfaces to transport data between the application and Bigtable.

Data is stored in ~3 Bigtables:

Accounts: Accounts, AccountDiffPreferences, AccountExternalIds, …
Changes: Changes, PatchSets, PatchLineComments, …
SiteData: AccountGroups, AccountGroupByIds, …

We mash data for multiple ReviewDb tables into the same Bigtable by assigning the tables to different column families. Data for an Accounts row goes into the “Accounts.data” column family, while data for an AccountDiffPreferences row goes into the “AccountDiffPreferences.data” column family. E.g.:

row: 100151 # account_id
Accounts.data:
... data for account object ...

AccountsDiffPreferences.data:
... data for diff pref object ...

Our guiding principal for what goes where is based (mostly) on the primary key declaration. If Account.Id was first in the primary key, the row(s) go into the Accounts Bigtable. If Change.Id was first in the primary key, the row(s) go into the Changes Bigtable. This means the StarredChanges data is stored in the Accounts Bigtable, and PatchLineComments is in the Changes Bigtable.

Everything else that didn’t quite fit the Accounts or Changes pattern went into SiteData. AccountGroups for example are in SiteData.

To be honest, this is all arbitrary. I could have randomly assigned ReviewDb tables to Bigtables. Or put them all in a single Bigtable.

Creating new tables

New table creation is handled by pushing a new column family to Bigtable. This is an online operation that does not require changing any existing data. Internally column families are just unique tags written before the stored data. Adding a column family just assigns a new tag that has not been used yet.

Protobuf

The really important part of our online schema upgrade process is actually Google protobuf.

Bigtable doesn’t store structured data. Bigtable stores sequences of bytes in column families. Googlers get structure by storing encoded protobuf messages in column families. Protobuf encodes messages by writing a unique integer tag before each field. The tag allows readers to match data back up to the runtime object during decoding.

Protobuf gives us very critical features:

– Unknown fields are skipped (and ignored). If a field has been deleted from the model, but still exists in data records, the application code can safely skip over that data by reading the tag, recognizing its an unknown field, skipping its encoded bytes, and continuing onto the next field.

– Unknown fields are preserved. If a field is not recognized its encoded bytes are kept in memory. When the application makes changes to a message and writes the message back to the database table, the unknown fields are preserved and written back as-is.

– Fields can be missing. If a field is not present in the data, it simply has no tag present in the encoded message. The field is assumed to be its default value by the application.

Each database table in ReviewDb is described by its own protobuf message. The @Column() annotations in ReviewDb include the unique field numbers used by protobuf to tag data in encoded messages. You can see this schema by printing the protobuf schema out:

java -jar gerrit.war ProtoGen -o reviewdb.proto ; cat reviewdb.proto

In our Bigtable mapping the Gerrit application server encodes an Account object into a protobuf message, then writes the encoded protobuf to the Accounts.data column family. Reading from the database is merely the reverse process.

Column deletion

Columns can be removed from a table by removing its @Column annotation from the Java object. The field definitions will be omitted from the protobuf description. New application code that reads from the database table will skip over the (now unknown) field. During updates of a row the deleted/unknown field will be preserved and written back to the database table.

It is very important that the field number is never reused.

Nothing prunes the old fields from the Bigtable. Disk storage is cheap, disk IOs are not. Leaving the deleted data on disk is cheaper than scanning through every row and clipping out the deleted fields.

This is why we leave deleted fields commented out in source code, so future developers know not to reuse a field number.

Column addition

Columns can be trivially added to an existing table by assigning a new field number. When newer application code reads an old record it won’t find the new tags and will simply assume the default that is supplied in the protobuf description.

Unfortunately the defaults used in @Column annotations don’t always match with the real intended defaults. We have had to hack this at Google by applying a patch to every version of Gerrit for 2 fields:

- optional bool size_bar_in_change_table = 16;
+ optional bool size_bar_in_change_table = 16 [default = true];
optional bool legacycid_in_change_table = 17;
optional string review_category_strategy = 18;
- optional bool mute_common_path_prefixes = 19;
+ optional bool mute_common_path_prefixes = 19 [default = true];

The open source project chose to apply these defaults using Schema_NNN upgrade files that rewrite all existing accounts to set the fields true during init. We do not have that luxury and instead patch every release we make to assume the “correct” default if the field is not present in the stored data. This is why I lobby so hard against boolean columns being true by default via Schema_NNN upgrades. 🙂

Because of the unknown field properties described earlier, it is (usually) safe to run newer binaries alongside old binaries against the same database. A newer binary may store new fields to a row. The older binary will ignore these, but preserves the unknown field data during updates.

Of course cross-field semantics could be confused if we attempted this. We limit our risk by staying close to HEAD and try really, really hard to avoid cross-field semantic issues (e.g. anything like status and open in changes).

Column rename

We really don’t care about column renames. The column names themselves are not stored in Bigtable or in the encoded protobuf messages. Column names are only in the application software. A column name change is just a recompile, similar to a method name change.

What we cannot do is change field IDs. Once used in an @Column annotation, we are stuck with that ID number forever. 🙂

Virtual Hosting

googlesource.com implements virtual hosting for hundreds of Gerrit sites. All sites are combined together into the same 3 Bigtables by prefixing each row with the site name, for example:

row: gerrit:100151 # $site:$account_id
Accounts.data:
... data for account object ...

AccountsDiffPreferences.data:
... data for diff pref object ...

The application server itself is virtual hosted by running a javax.servlet.Filter in front of Gerrit. The filter extracts the host name from the HTTP Host header and stores it somewhere accessible by the hand-coded ReviewDb implementation. All database operations include the host name as part of the row keys being accessed.

It is this virtual hosting strategy that forced our hand and required such smooth online schema migrations.

When we update the binary, we update the binary for hundreds of “servers” at once. We can’t shutdown everyone for 200 * 5 minutes to upgrade 200 sites at 5 minutes each while we run a Schema_NNN process serially. We also don’t want to use 200 CPUs to update 200 sites in parallel during a global 5 minute downtime window, too much can go wrong, and there will always be straggling sites. Neither option appealed to us.

So smooth online migrations it was. 🙂

Multi-master hosting

We don’t run one Gerrit server. We run many Gerrit servers against the same Bigtables. Requests load-balance across this pool of servers, based on a number of factors that are out of scope for this particular article.

We use this multi-master hosting to help do online binary upgrades of Gerrit.

Given N servers where N >= 3:

1) we take one out of the load balancing rotation
2) wait for in-flight requests to finish
3) stop the process
4) install the new version
5) restart it
6) add it back to the rotation
7) goto 1

We size N such that N is larger than the number we actually need to handle traffic; this allows us to lose a server without impact to traffic to do the upgrade dance.

Linux operating system upgrades can be coordinated the same way, as the servers are on different machines.

Multi-data center hosting

Given multi-master hosting, we don’t put all of our servers in the same data center. We run them in multiple data centers and allow the load balancers to route across all of them.

This strategy allows us to perform data center level maintenance without service interruption by taking some servers out of the load balancing rotation before maintenance starts.

Sometimes data center level maintenance is power related; e.g. servers may need to be shutdown to repair a failed UPS. Other times its database related. I recently corrupted a database replica in one data center by accident. I “shutdown” our servers in that data center while I manually restored a known good database. Nobody except my team at Google knew about my mistake, or the impact.

Once you are multi-data center, cross-site database consistency becomes an issue. Frankly we just reuse Google Megastore to get cross data center consistency based on a high quality Paxos implementation. Each of our data centers has a full copy of the database local to it and Paxos is used to ensure the application has a consistent view.

And by this point, you are probably wishing you had stopped at the tl;dr … 🙂

GitHub fully operational again

Posted on March 21, 2015 by Git and Gerrit Code Review for the Enterprise

GitHub outage latest for around 23 minutes and now the site has resumed normal stable operations.

GerritHub.io and his users have not been impacted by the GitHub outage, everything went smoothly and the cache TTL extension avoided any negative effects on our systems. Replication to GitHub resumed smoothly without any misalignment caused by the the outage.

Will this be the last GitHub outage? Have they learned how to implement effectively DB roll-outs with Continuous Delivery practices?

It would be very interesting if Shawn Pearce could put together a presentation on how Continuous Delivery is achieved for Gerrit Code Review at Google, avoiding downtime even during DB upgrades and roll-outs. Possibly GitHub could be inspired by us 🙂

GitHub outage started … hopefully won’t be long :-)

Posted on March 21, 2015 by Git and Gerrit Code Review for the Enterprise

As previously announced, GitHub service outage has officially started.

GerritHub.io is available as usual and sign-in is working, thanks to the an extended cache TTL set to 2 days. If you have signed in over the past two days, your cookie will still be valid and your group ownership / permissions are cached on our systems.

Please remember that some of the other non-cacheable services won’t be available:

Sign-Up for a new GerritHub.io account
Import of a GitHub profile
Import of a GitHub repository or pull request
Replication to GitHub

You can still use the Gerrit Code Review functionalities as normal, including review Web GUI and git push/pull over SSH or HTTPS.

Once GitHub will be back on-line, we will reschedule an extra maintenance replication to make sure that all Gerrit changes are replicated back to GitHub.

Thank you for your patience and in case of any issue please report to https://gerritforge.com/support.

GitHub Scheduled Maintenance – Saturday 3/21/2015 @ 12:00 UTC

Posted on March 19, 2015 by Git and Gerrit Code Review for the Enterprise

GitHub planned outage

GitHub announced a scheduled downtime of its API starting from this forthcoming Saturday, 21st of March 2015 from 12PM UTC … I have to say that this is really the first time and I am quite surprised. I have always considered GitHub as one of the best examples of continuous deployment and feedback, allowing the transparent roll-out of dozen of changes every week; however sometimes even “The Rich Also Cry”.

What are the implications of this outage for GerritHub.io?

GerritHub.io uses the GitHub API for the following operations:

Sign-Up and Sign-In to Gerrit Code Review GUI
Import user profile, repositories and pull requests
Gerrit groups lookup
Replication using GitHub OAuth

As all the GitHub API would return 503 (Service Unavailable) the basic Gerrit Code Review functionalities could be eventually impacted.

How can we minimise the impact?

We will be rolling out longer cache TTL and cookie expiry times on Friday 20th of March on Gerrit Code Review, allowing to keep existing sessions for a much longer time up to 2 days validity. Similarly the Group and Accounts caches TTL will be extended in order to fill the GitHub API blackout.

And what about replication?

Whilst we can minimise the impact on Gerrit Code Review which is under our control, we can do little about GitHub availability: the commits pushed to GerritHub.io will be “parked” until GitHub services will be resumed again.

They will still be accessible to your Team but only through the GerritHub.io clone URLs.

What should I do when GitHub services will be resumed?

GitHub has not notified yet the length of his maintenance window but you will be able to receive notifications on its status on https://status.github.com and we will notify the progress and the impact on our services on https://gitenterprise.me, Twitter @gitenterprise and Facebook on https://facebook.com/gitenterprise.

Once the GitHub services will be back and fully operational, we do suggest to sign-in and verify the replication status of your repositories to GitHub, checking the SHA-1 of your branches on GerritHub.io against the corresponding ones on GitHub.

Example on how check the replication status of myorg/myrepo:

$ git ls-remote https://review.gerrithub.io/myorg/myrepo | \
  egrep -e "(heads|tags)" | awk '{print $2"\t"$1}' | \
  sort > /tmp/myrepo.gerrit
$ git ls-remote https://github.com/myorg/myrepo | \
  egrep -e "(heads|tags)" | awk '{print $2"\t"$1}' | \
  sort > /tmp/myrepo.github
$ diff /tmp/myrepo.gerrit /tmp/myrepo.github

What should I do to resync the repositories?

First of all you need to establish which one is the “source of truth”. If you have been using GerritHub.io as main code review, then the answer is always review.gerrithub.io.
In order to resync your GitHub repository, you just need to manually pull from review.gerrithub.io and push to github.com.

Example on how to resync myorg/myrepo:

$ git clone --mirror https://review.gerrithub.io/myorg/myrepo 
$ cd myrepo.git
$ git push --all --tags https://github.com/myorg/myrepo

What should I do if the push to GitHub fails?

There is not a unique answer to this question: if the push fails it means that your GerritHub.io and GitHub.com repositories started diverging. This happens when people pushes directly to GitHub without going any Code Review, which is potentially possible if you have left the permissions doors wide opened on GitHub.

My suggestion is always to check what is in GitHub that has not gone through Gerrit Code Review and, if possible and does not create conflicts, pull that set of commits into your GerritHub.io repository.

Example of pulling changes from a GitHub branch (e.g. mybranch) that are not contained in GerritHub.io:

$ git clone https://review.gerrithub.io/myorg/myrepo 
$ cd myrepo && git checkout mybranch
$ git pull https://github.com/myorg/myrepo mybranch
$ git push origin mybranch

Questions? Doubts? Problems?

If you have any questions or you need any assistance during the outage because you are experiencing problems, feel free to contact our customer support at https://gerritforge.com/support or tweet us at @gitenterprise.

Alternatively for any Gerrit-related problems, the best free source of information is always the Gerrit mailing list at https://groups.google.com/forum/#!forum/repo-discuss.

Gerrit moves into Cloud IDE space with in-line edit

Posted on March 10, 2015 by Git and Gerrit Code Review for the Enterprise

Gerrit Code Review has begun the Ver. 2.11 release cycle and the first release candidate been released this morning on Gerrithub.io.

Entering into the battlefield

Gerrit is entering for the first time into the field of Cloud based IDE integrating a Browser-based editing functionality into the code review lifecycle. For the very first time you are just a couple of clicks away from a review-edit-submit turnaround: see below the additional icon to access the functionality from the Gerrit change screen.

What is in-line edit and how can I use it?

As this is a brand-new functionality with a complete new UX, a new dedicated page has been published to guide through the new functionality.

You can experiment today the in-line edit by creating a new project on GitHub and sign-in to GerritHub.io. The new turnaround is quick and the flow is splendid ! This has been a masterpiece in collective code ownership and review of the Gerrit Team; this feature has been lead by David Ostrovsky after a series of early betas shared and discussed collectively.

What else is included in Gerrit 2.11?

A lot of new enhancements are coming, mainly related to the improvements of Gerrit REST-API to support this new feature.

The full list of changes can be accessed at https://gerrit.googlesource.com/gerrit/+/refs/heads/master/ReleaseNotes/ReleaseNotes-2.11.txt.

Where is Gerrit heading to?

We foresee a near future where Gerrit becomes the central hub of the code-review and integration workflow, together with a CI engine such as Jenkins. It has recently proposed a new build of Gerrit without a GUI and exposing its review capabilities in headless mode: the presentation logic will then be implemented by the various UX plugins integrated with other IDEs.

Should this scenario materialise as future of Gerrit, we will soon see other UX that will expose the power, flexibility and scalability of Code Review system in a brand-new HTML5 or native experience.

The IntelliJ and Eclipse plugins are already a reality of this, but more will come and I bet they will be more focused on the Cloud IDE use-case.

Gerrit 2.10 rpm and debian packages available

Posted on February 27, 2015 by Git and Gerrit Code Review for the Enterprise

We are pleased to announce the general availability of RPM and Debian native distributions for Gerrit Ver. 2.10. They have been packaged from the original Gerrit WAR using the new native installers tools developed and contributed back to the Gerrit community.

Why yet another packaging format?

Gerrit Code Review has always been released as pure Java executable WAR that has then to be invoked with a target installation directory. This style of packaging has been working fine right now, however many companies use the Linux native software packaging tool to standardise the way they install and update software.

The reasons behind that choice are:

automatic compatibility matrix and dependencies tracing and resolution: Gerrit dependencies are automatically checked and download upon installation
automatic update of the latest patch-set released on Gerrit using “yum update” or “apt-get install –only-upgrade”
out-of-the-box support for Puppet / Chef for unattended installations
compliance with company standards: system administrators do not have to manage a “special case” for Gerrit

What type of packages are provided?

The following two packaging formats are currently supported:
– RPM packages for RedHat / CentOS Linux distributions
– Deb packages for Ubuntu or other Debian distributions

How to enable Gerrit native packages on my Linux system?

Gerrit packages are distributed using the native repository distribution system provided by the Linux distribution:

YUM repository for RedHat / CentOS
Debian source for Ubuntu

To facilitate the set-up, GerritForge provided an RPM for RedHat / CentOS that will do the job for you:

rpm -i https://gerritforge.com/gerritforge-repo-1-5.noarch.rpm

On Ubuntu 14.x up to 16.x / Debian you need to define the extra source pointing to the GerritForge distribution site:

echo "deb mirror://mirrorlist.gerritforge.com/deb gerrit contrib" > /etc/apt/sources.list.d/gerritforge.list

apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 1871F775
apt-get update

On Ubuntu 18.x or alter you need to define a specific source pointing to the GerritForge distribution site with higher security keys and hashing requirements:

echo "deb mirror://mirrorlist.gerritforge.com/bionic gerrit contrib" > /etc/apt/sources.list.d/gerritforge.list

apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 55787ed781304950
apt-get update

Once the additional distribution repository has been configured, installing Gerrit 2.10 and all future versions will be just one command away and can be completely automated.

Simpler or more complex than Gerrit init wizard?

The complexity of a native packaging system is all concentrated in the first repository set-up, however it is aligned with any other type of repository additions that System Administrators are very familiar with.

Once the repository set-up phase is finalised, then installing a working version of gerrit is just one command away:

On RedHat / CentOS you simply need to execute:

yum install -y gerrit

On Ubuntu / Debian the command is:

apt-get install gerrit

Additional once configured for production, the configuration files are kept across updates and you can get constantly aligned with the most updated patch-level for your installation.

Gerrit then appears in the list of the installed software alongside with its version number and patch-level.

What is the Gerrit default configuration for native packaging?

Gerrit comes out-of-the-box with a “playground-style” configuration, to allow to have a working instance that can be verified without the need of any extra configuration step:

H2 Database
DEVELOPMENT_BECOME_ANY_ACCOUNT authentication
HTTP (no SSL) and SSH protocols

Once installed, the System Administrators will need to go through the Gerrit documentation and configure the authentication, security protocols and DB of choice. All configuration settings will remain local and will be kept across package upgrades.

Who is hosting Gerrit native packages?

Gerrit native packaging hosting follows the typical strategy of the native distribution sites: a list of mirrors that will be automatically chosen by the Linux installer based on speed and location.

We currently have two mirrors:

dl.bintray.com – kindly offered by JFrog
gerritforge.com – kindly offered by GerritForge

The list of sources can be extended as the native packaging systems allow to define a set of mirrors that are automatically selected based on your location and speed.

If anyone is willing to host a mirror of those native packages, please contact GerritForge or post a message on the Gerrit mailing list and we’ll update the official list of mirrors available.

How to tailor Gerrit native packages for your Company?

Gerrit RPM and Deb packages are built using 100% OpenSource tools and scripts published. Just checkout the gerrit-installer project and make the packages again defining your desired version, organisation and distribution URLs.

What’s next?

We are going to publish soon more native packaging formats:

Docker (coming very soon)
MacOS Pkg
Windows MSI

Stay tuned and let us know what do you think.

Pingdom status for GerritHub.io

Image

You can check the status of GerritHub.io services in real-time thanks to the public page offered by PingDom.com

The GerritHub.io status page is http://status.gerrithub.io and displays:

Current status with response time
History of the past 7 days with uptime
Details of the last 24 hours

Tonight for instance reports a temporary service outage (3 times, around 15′ each) caused by an intermittent unavailability of the GitHub API. As GitHub was not able to provide the validity of its OAuth code credentials, GerritHub was not able to allow the completion of its login handshake and thus resulting in a partial outage.

We will use the PingDom.com reports to reinforce our production infrastructure and make GerritHub.io more resilient in the future. For instance in this case (GitHub API unavailable) we will look at reusing cached credentials for allowing known people with non-expired Gerrit cookies to complete their operations. For unknown users, we will display next time a courtesy message explaining that the sign-up is unavailable for GitHub API temporary outage, avoiding the allocations and time-outs of HTTP connections.

GitMinutes #30: Luca Milanesio on Gerrit Code Review

Posted on July 7, 2014 by Git and Gerrit Code Review for the Enterprise

Many thanks to Thomas Ferris Nicolaisen for inviting me to talk about Gerrit Code Review at GitMinutes.

It has been a very interesting discussion on the benefits of Code Review and how Gerrit can help out small and large companies embracing it.

The interview is available on-line at http://episodes.gitminutes.com/2014/07/gitminutes-30-luca-milanesio-on-gerrit.html, alternatively you can download and listen the 1h 27′ conversation on PodCast at https://itunes.apple.com/de/podcast/gitminutes-podcasts/id637843725?l=en.

Use the force Luca!

We started (of course!) talking about the [in]famous force push of 186 Jenkins repositories to GitHub, I was on the Top-10 HackersNews over 7h … so I was expecting the question to pop-up during the interview 🙂

My friend Alex Blewitt took the opportunity as well to forge a Star-Wars like headline for his InfoQ article on what happened.

Git adoption in the Enterprise, where all began

We moved the discussion to the foundation of my business on Git and Code Review and the reasons and challenges that an Enterprise company is facing when moving to Git. We went through the history on how LMIT started GitEnterprise.com and then focused on Gerrit Code Review based product and services for large Enterprises World-Wide: a niche and successful business nowadays.

GitHub or Gerrit? or both with GerritHub?

As I expected, we ended up comparing GitHub and Gerrit analysing the similarities and differences between the two. This topic has been presented as well in two conferences at Gerrit User Summit @GooglePlex – Mountain View CA and 33rd Degree.org Java Developers Conference in Krakow; slides are available at http://www.slideshare.net/lucamilanesio/gerrit-codereviewgit-hubplugin.

Gerrit has historically been considered as “more difficult” than GitHub: true in the past but not anymore today apart from the Web User-Experience CSS styling, much nicer and pleasent on GitHub. The availability of http://gerrithub.io allowed over 1,800 developers since October 2013 to get started with Gerrit in less than 5 minutes by watching an Gerrit Introductionary YouTube video: using it was then just 3 clicks away, no installation or configuration needed! The availability of an easy and accessible Public Cloud instance represents a big improvement in accessibility and usability of Gerrit.

For which teams is Gerrit the right choice?

We talked about the “typical learning curve” of people coming from previous version control systems, such as Subversion. Does it make sense to get started with Git and Gerrit at the same time? When is Gerrit needed and when is it going to provide most of its value?

I’ve covered the topic in the past webinars and talks: hands-on Webinars recordings are freely available on-line at:

The size of the project (in terms of number of people x number of repositories) is typically one of the key factors in Code Review adoption. Gerrit however can be used as well as a standalone OpenSource Git Server , even without leveraging its Code Review capabilities: this makes the choice of Gerrit a good first step towards a smoother Git adoption.

What are Gerrit Topics about?

We went through a very interesting discussion about “Gerrit Topic”, a feature that is not new to Gerrit but is sometimes forgotten besides its important and relevance for medium-large teams.

With the forthcoming support of multi-repositories atomic commits in Gerrit, it will be possible to merge multiple changes on multiple repositories at the same time for a single topic. This feature is not ready yet but coming hopefully in the near future and Google Gerrit Team developers and contributors are working on it.

The ability to make an atomic commit across multiple repositories will allow to have a more consistent Jenkins build process as well, with less broken builds because of interdependent changes on multiple components.

Who is using Gerrit today?

We talked about the adoption of Gerrit in the community, which is growing year after year. A lot of medium companies adopted Gerrit in the past, including Spotify side-by-side with GitHub.

The ability to “submit a change” to any project without the risk to break the build is definitely an incentive to encourage even more people to contribute to share the knowledge and improve the code base, without the risk of breaking anything or forking the code. This is one of the reason that drove large OpenSource organisations such as the Eclipse Foundation and OpenStack to the adoption Gerrit Code Review in their tools platform.

How to embrace Code Review in a Team or Company?

We went through an interesting comparison / discussion of Agile Methodology vs. Code Review. Often Teams misunderstand and confuse the concept of “review” with “pair-programming”: the problem was well analysed in my book “Learning Gerrit Code Review” (available on Amazon.com at http://www.amazon.com/Learning-Gerrit-Code-Review-Milanesio/dp/1783289473). I defined the pair-programming as a dot in a time/people space: two developers writing a piece of code at the same time. This however does not exclude all the other points in the time/people space where multiple people at different times will read the code and provide their feedback: pair-programming is then a “specific example” of the “code review space”.

Because of the different perspectives (pair-programming is a dot whilst code-review is a “cloud of dots” in time/people space) they are not one exclusive of the other: they are equally important and both enable effective collective code ownership and knowledge sharing.

References and greetings.

It has been a very long but interesting discussion with Thomas and hope you’ll enjoy it.

See below the links of the resources we mentioned during the interview:

Thanks again to Thomas for his fantastic initiative: GitMinutes PodCast!

Luca Milanesio

Heartbleed: GitEnterprise and GerritHub are safe

Posted on April 11, 2014 by Git and Gerrit Code Review for the Enterprise

A few days ago a large part of the Word Wide Web has been found vulnerable to the heartbleed bug in OpenSSL.

What is the vulnerability about?

The vulnerability is effectively a bug in all the versions of OpenSSL from Ver. 1.0.1 to 1.0.2. In reality a lot of web-sites are either using the older and still popular OpenSSL 0.9.8 or they have already upgraded to the latest patched version of OpenSSL and thus are NOT vulnerable to heartbleed.

Are you passwords safe ?

In a nutshell yes when they are posted or exchanged with a server that is not vulnerable to this attach:

GitEnterprise (gitent-scm.com) has never used any OpenSSL 1.0.1-1.0.2 (see: https://www.ssllabs.com/ssltest/analyze.html?d=gitent%2dscm.com) and thus is not vulnerable: you can keep your existing password as they are safe.
GerritHub (gerrithub.io) has been vulnerable for only 5 days and then has been upgraded (see https://www.ssllabs.com/ssltest/analyze.html?d=gerrithub.io). However GerritHub DOES NOT exchange passwords over the Internet but rely on your existing GitHub session through OAuth Token authentication. This means that during the 5 days of vulnerability your account has NOT been at risk on GerritHub.

What about GitHub ?

Unfortunately GitHub has been vulnerable (see https://github.com/blog/1818-security-heartbleed-vulnerability) but the problem has been resolved or is under resolution right now as the nodes get upgraded.

We do recommend then to change your GitHub password in order to be sure that any previous credentials potentially stolen would not impact the security of your account and repositories.

GerritHub relies on GitHub OAuth, so is GerritHub at risk as well ?

In real terms the answer is “potentially yes”: if a potential attacker had been stolen your GitHub password, he could have initiated a login on your behalf and then accessed GerritHub as well.

How can I strengthen my GitHub security ?

GitHub already support today the two-factor authentication (see https://help.github.com/articles/about-two-factor-authentication): if you have this extra security enabled, nobody other than you can ever access your account, even if they could have potentially stolen your password.

Can I have a GerritHub account secured independetly from GitHub ?

Not yet, however we are working on an advanced security option for the private GerritHub accounts. We will offer for a monthly extra fee:

Access to your GitHub private Teams and Repositories
Extra scripting functionality to hook Gerrit events on the server side
(commit validation, issue tracking association, …)
Integration with Atlassian Jira or BugZilla
Integration with BuildHive from CloudBees for Continuous Integration
Extra enterprise account protection for GerritHub.io accounts (additional password / X.509 Certificates)

Wow, that is amazing ! When can I get GerritHub private edition ?

We are currently in public beta stage, you can start using the implemented features for FREE during the trial by logging in to GerritHub using the URL:

https://review.gerrithub.io/login?scope=scopesPrivate

Can I provide suggestions and give feedback during the public beta trial ?

Yes, you are very welcome to provide your feedback and we are very opened to adjust the development of GerritHub private features to your needs !

For problems and getting support:
http://gerritforge.com/support

For suggestions and feedback, please use the Gerrit Code Review forum:
https://groups.google.com/forum/#!forum/repo-discuss

Is GerritHub OpenSource ?

Absolutely YES: GerritHub is based on Gerrit Code Review 2.10-SNAPSHOT master with a selected set of enterprise plugins:

GitHub plugin
Codenvy plugin
ITS-Jira plugin
Scripting provider plugin
SingleUserGroup plugin
Download commands plugin
Replication plugin
Gravatar plugin
Review notes plugin

If you want to directly review and contribute to Gerrit, you are welcome to the developers and contributors community !

-2 days to the Gerrit User Summit 2014

Posted on March 19, 2014 by Git and Gerrit Code Review for the Enterprise

The Gerrit User Summit 2014 is about to start in only 2 days: it is going to be a two days of exciting news and innovations on the world of Code Review. There are names from the largest industries in the world that have adopted the Code Review workflow in large enterprise environments: Google, SAP, SonyMobile, Ericsson, IBM, Garmin, HP, CollabNet, GerritForge, Codenvy, Eclipse Foundation and LibreOffice.

During all this week there is a special promotional discount on the Learning Gerrit Code Review book. Additionally, for the attendees of the conference, there will be a limited number of signed paperback copies available at the session “Gerrit or GitHub? Take both !”

In order to redeem the book promotion, scan the QR code and enter one of the following PROMO-CODEs:

Book PROMO-CODE: LGCRB20
eBook PROMO-CODE: LGCReB20

The Gerrit User Summit Agenda has been published yesterday and has a lot of very interesting talks and announcements:

Day 1 – Friday 21st of March

What’s new in Gerrit 2.8 (David Pursehouse – Gerrit maintainer – SonyMobile)
Scaling Gerrit at Ericsson (Patrick Renaud, Vladimir Cantiru, Hugo Ares – Ericsson)
Monitoring Gerrit (Doug Kelly – Garmin)
Browsing Repository Content with Gerrit’s REST API (Simon Kaegi – IBM)
Gerrit@LibreOffice (David Ostrovsky – LibreOffice)
Gerrit plugins made easy with Scripting (Luca Milanesio – GerritForge)
The Angular revolution in Gerrit! (Dariusz Luksza – CollabNet)

The day 1 would end with a very interesting Q&A with the Gerrit User Community about the features they would like to see coming up in the next forthcoming releases!

Day 2 – Saturday 22nd of March

2014 Roadmap (Shawn Pearce – Gerrit project founder, Google)
Gerrit@SAP (Edwin Kempin – Gerrit Code Review maintainer – SAP)
Integrating CLA and Origin checks with Gerrit (Denis Roy – Eclipse Foundation)
Guiding Diffy to the Enterprise land (Dariusz Luksza, Eryk Szymanski – CollabNet)
Collaboration at Scale: The Openstack CI toolbox (Khai Do – HP)
Gerrit or GitHub? Take Both! (Luca Milanesio – GerritForge)
Diffy gets Enterprise grade (Dariusz Luksza, Eryk Szymanski – CollabNet)
Continuous Development with Gerrit (Tyler Jewell & Luca Milanesio – Codenvy & GerritForge)

The day 2 will end with a meet-up with food and drinks sponsored and organised by Codenvy where the Gerrit Community can discuss and exchange their post-Summit impressions and ideas on the future of Code Review.

It is going to be again a huge leap forward for the Code Review community and the Git and Gerrit projects improvement !