Why Facebook* doesn't use Git

First of all, why does this matter to me?

I'm working on building Graphite, which is inspired by Facebook*'s internal tooling. When I decided to create a startup with friends, I had never heard of Mercurial, although I had always been passionate about developer tools. My previous development experience included personal projects, college homework, iOS development at Google, and infrastructure development at Airbnb. Throughout my career, using Git had been as natural as breathing. It was so popular that I considered it the only sensible tool for creating and managing code changes.

Funnily enough, Mercurial expert Gregory Szorc worked next to me at Airbnb, but I knew him only as a pleasant colleague and had no idea that he was a Mercurial contributor.

In 2021, my teammates Thomas and Nick opened my eyes. They came from Facebook* and, to my surprise, barely knew Git. But they had a deep understanding of Mercurial and of the Facebook* workflow based on stacked diffs. Over time, they convinced me of the usefulness of this pattern, and we pivoted the company toward implementing stacked diffs for GitHub developers.

But this post is not about our startup. It's about a question that has been bothering me for the last three years. Why doesn't Facebook* use Git? Why did it choose Mercurial and build its own workflows on top of it? I know Google doesn't use Git, but that makes sense: Google's development culture predates Git by several years. Facebook*, however, was founded in 2004, about a year before Git's first release in 2005, and by the time Facebook* started getting serious about source management tools, Git was already more popular than Mercurial. So why doesn't Facebook* use Git?

This question is more specific, and more interesting to me, than it is to the average developer, but I think it is worth thinking about. If Facebook* had chosen to become a Git contributor in the early 2010s, the development world might look different today. Git might be more user-friendly and support stacked changes natively. GitHub might have developed better support for closed source software development. Companies seeded with early Facebook* alumni, like Uber and Pinterest, might also have used Git and GitHub for source management rather than Phabricator and Mercurial, resulting in a less fragmented ecosystem over the past decade.

But Facebook* did not choose Git (for its main monorepositories). Instead, it uses Mercurial for version control and incrementally adds its own tooling on top. Why? First, I decided to Google the answer and found the following detailed post:

https://engineering.fb.com/2014/01/07/core-infra/scaling-mercurial-at-facebook/

This article, written ten years ago, along with more recent technical talks on YouTube, gave me my initial answer: “performance”.

But I wanted something deeper: I wanted to hear from the engineers who made this decision. With the help of a colleague, I posted a question in a group of former Facebookers. I also sent cold emails to two engineers who worked on the Mercurial migration project. They kindly agreed to answer confidentially and shared their personal opinions about the project. Here's what I've learned about why Facebook* isn't using Git. Hopefully this post will help document the history of why our tools look the way they do in 2024.

It seems to me that the article does not make it clear enough that the Mercurial developers, unlike the Git developers, welcomed the idea of Facebook* engineers writing patches for Mercurial so that it would scale better for huge repositories.

Former Facebooker, 2024

Why and how Facebook* migrated from Git

According to a 2014 Facebook* post, the company started with Git. As you might expect, this was the company's standard source control system. But around 2012, it began to run into problems with limited scalability. The post claims that the codebase was “many times larger than even the Linux kernel, which consisted of 17 million lines and 44 thousand files.” In particular, engineers began to feel that Git operations were becoming too slow. Not terribly slow, but slow enough to start investigating.

The main bottleneck was the process of “stat-ing” every file. “Git examines each file and naturally becomes slower as the number of files increases.” Engineers ran a simulation, creating a synthetic repository that matched the size Facebook*'s codebase was expected to reach several years later. The result was horrific: simple git commands took over 45 minutes to run. As an engineer working on the project said, “This is not something you can turn a blind eye to until all the engineers start complaining. By this point, everything will be out of control. It would take a Herculean feat to come up with a cleaner solution.”
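
To see why the cost grows with repository size, here is a minimal Python sketch of stat-based change detection. It is not Git's actual implementation (Git is written in C and keeps cached stat data in its index); the names `snapshot` and `changed_files` are hypothetical and exist only for this illustration. The point is simply that each status check has to touch every tracked file, so the work is linear in the number of files.

```python
import os
import tempfile


def snapshot(root):
    """Stat every file under root and record (size, mtime_ns) per relative path."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # one system call per file: this is the O(n) cost
            state[os.path.relpath(path, root)] = (st.st_size, st.st_mtime_ns)
    return state


def changed_files(root, previous):
    """Return paths that were added, removed, or whose stat data changed."""
    current = snapshot(root)  # a full re-scan, even if only one file was edited
    paths = current.keys() | previous.keys()
    return sorted(p for p in paths if current.get(p) != previous.get(p))


if __name__ == "__main__":
    # Illustrative demo: 1,000 small files, one of which is then edited.
    with tempfile.TemporaryDirectory() as root:
        for i in range(1000):
            with open(os.path.join(root, f"file_{i}.txt"), "w") as f:
                f.write("hello\n")
        before = snapshot(root)
        with open(os.path.join(root, "file_42.txt"), "a") as f:
            f.write("edited\n")  # the size changes, so the stat data differs
        print(changed_files(root, before))
```

Finding a single edited file still requires a stat call for each of the 1,000 files. Scale that to the 1.3 million files mentioned in the mailing list thread and the check becomes expensive no matter how fast each individual call is.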

So a scrappy group of software developers began exploring possible solutions. They first contacted the Git maintainers to understand what it would take to extend Git to better support large monorepositories.

Here are selected quotes from the email exchange with the Git maintainers. Twelve years have passed, but I still feel some irritation reading these messages:

It looks like you have everything in a single .git. Split your huge repository into separate smaller .git repositories.

While this /can/ be done, it's a bad idea, you should split the repositories

I agree. I work for a company with a long history of developing many huge CVS repositories, and we are slowly but surely migrating the codebase from CVS to Git. Break up your projects. This will help you reorganize them and, in my opinion, has no downsides.

Although Git could perform better [sic] with large repositories (in particular, applying commits in an interactive rebase seems to slow down with large repositories), when stat-ing 1.3 million files, you can't expect much more.

These responses show little desire to collaborate and, in hindsight, with large monorepositories now commonplace, do not look particularly insightful. The Git maintainers declined to work on performance and instead recommended that Facebook* break up its monorepository. Sharding, however, was not an option for the Facebook* team, and its members recall being surprised by the reluctance to extend Git's capabilities. Typically, an offer of free open source labor from a large technology company is seen as a gift that can help ensure a project's long life.

FB*: Hey Git maintainers, we want Git to scale better for large repos! Shall we work together?

Git: Nope. You're doing everything wrong. You need a bunch of small repos. There's no reason to make Git work with large repositories because they shouldn't exist.

FB*: …

FB*: Hello Mercurial maintainers, we want Hg to scale better for large repositories. Shall we work together?

Hg: Great! Let's.

Former Facebooker, 2024

As far as I remember, the Git community was not interested in FB*'s scaling proposals. It didn't want to support such an insane scale. And Hg turned out to be more open.

Former Facebooker, 2024

I don't mean to say that the Git project had to obey Facebook*'s requests, and I in no way intend to portray the maintainers as the “bad guys.” A project shouldn't do something just because Facebook* asked for it. It's interesting, though, that the Git maintainers seem to have changed their minds a couple of years later, after seeing Facebook* make useful improvements to Mercurial:

They emailed the list about git performance issues. Judging by what I saw, there was practically no response.

My impression was that the git community had little understanding of performance issues in large repositories.

The question is whether the git community is interested in being competitive with such large-scale projects; meanwhile, Mercurial has this functionality natively

Ten years later, Git has made significant changes to improve support for monorepositories.

Today the situation has changed quite a lot. Git already works well even with very large repositories (if you know how to set it up correctly).

Former Facebooker, 2024

Alternatives Considered

In 2012, there were few alternatives to Git. The FB* team considered using closed source Perforce (formerly Google's source control system). In one of the first calls with Perforce sales engineers, the Facebook* team pointed out an architectural flaw in the local consistency between read and write nodes. Perforce's response did not inspire confidence – the engineers were not aware of the fundamental problem and they had no plans to fix it.

Other solutions, such as BitKeeper, were also considered, but they were all quickly rejected. The last option was Mercurial. Its performance was similar to Git's, but it had a cleaner architecture. Git was a complex interweaving of Bash and C code, while Mercurial was written in Python using object-oriented patterns and designed with extensibility in mind.

One of the engineers considering the options had extensive experience with Mercurial, so the team decided to attend a Mercurial hackathon in Amsterdam to explore the issue further.

And since I've worked a lot with Mercurial before, I was keen to consider all other alternatives before suggesting it.

Former Facebooker, 2024

They found a system that could be easily extended, and a community of maintainers who welcomed the aggressive changes proposed by the Facebook* team.

I think this video is from that very hackathon, although I'm not sure. It is very much in the spirit of the early 2010s: https://www.youtube.com/watch?v=fml4s6MEjW8

Migration of an entire organization

After the hackathon in Amsterdam, the Facebook* team made its decision. All that remained was to convince the rest of the company of the need to migrate. The task was daunting: engineers can be very sensitive about changing tools (remember the battles between vim and emacs), and changing source control systems is a serious matter.

The team began preparations as smoothly as possible so as not to scare off other engineers. What followed was a master class in migrating internal development tools. The team spent several months preparing the company for a possible migration to Mercurial and compiled a list of equivalent commands and processes between Git and Mercurial. It even measured how often each Git command was run across the company and specifically documented how the most common operations would work in Mercurial.

The team then gave developers an opportunity to voice their concerns and discuss edge cases that might be challenging in the new system. It expected to get bogged down in arguments about hypothetical problems. But, to its surprise, fellow engineers turned out to be adaptable and friendly. “No one started complaining about their particular situation.”

Personally, I've always liked Mercurial's high-level commands better than Git's, but I don't think that was the reason. By the way, we hired the author of Mercurial.

Former Facebooker, 2024

In the end, everyone was in favor of the migration, and the company moved to Mercurial. Facebook* contributed performance improvements to Mercurial, making it the best option for large monorepositories. Evan Priestley extended Phabricator, adding support for Mercurial. Facebook* used Mercurial's diff concept to create a “stacking” pattern that allowed code reviews to be parallelized in a new way. Former Facebookers left for new companies and took their workflows with them, creating a small but significant cult of stacked diff fans. Later, I met some of these fans and decided to dedicate myself to implementing Mercurial-style stacked diffs in Git and GitHub.

Finally

What conclusion can be drawn from this story? As I reflected on the quotes and interviews, I was reminded of the old adage that many important technical decisions in history were determined not by technology, but by people.

As is often the case, the decision was partly social, partly technical. I don't regret it one bit.

Former Facebooker, 2024

Facebook* didn't choose Mercurial because it was more performant than Git. It chose Mercurial because the maintainers and the codebase were more open to collaboration. Facebook* engineers met the Mercurial maintainers in person, and the maintainers liked the idea of a partnership. When the entire organization had to be convinced, the decision was carried by smart communication, not because one technology was clearly better than the other.

I think FB*'s adoption of Mercurial was largely due to Brian's genius; it should be treated as a case study in how to introduce new technology into a company.

Former Facebooker, 2024

* Meta Platforms, along with its social networks Facebook and Instagram, is recognized as an extremist organization, and its activities are prohibited in Russia.
