How GitHub Replaced SourceForge as the Dominant Code Hosting Platform

Translator's note: I didn't live through the pre-GitHub era and always took GitHub for granted. So I was curious to learn what tools engineers in the 2000s used and which features allowed GitHub to conquer the market. I hope you will also enjoy delving into the history of the project, as compiled by Greg Foster. The author's text follows.

I've been coding since high school. I have vague memories of making an Android game with a friend using TortoiseSVN. In college, I learned how to clone GitHub repositories to access my computer science homework. Later, during my internships, I joined the ranks of those who use GitHub for PR reviews and merges. Most developers who started their careers in the last decade have probably had similar experiences. GitHub has become synonymous with source code and code changes, for open source and closed source projects alike.

It's easy to take GitHub's ubiquity for granted, but how did we get here?

I asked my colleague David how he learned about GitHub. David has been coding for about ten years longer than I have and has grown a lot as a professional. He told me that in the 2000s, programmers used SVN. He downloaded software from SourceForge but found the site's interface utilitarian and “lousy.” Over time, David found himself increasingly visiting GitHub to find documentation and download open source tools like Rails. This led David to read about the service's underlying version control system, Git, and eventually to use a git-to-svn converter for his work. But many companies hosted code on SVN until 2010, and it wasn't until years later that most private organizations fully migrated to Git.

David's story only heightened my interest. How did GitHub enter the market? What existed before it, and what niche did the new platform fill?

The World Before GitHub

Three years before GitHub was founded, in 2005, Linus Torvalds created Git. While many people still used SVN, Git was rapidly gaining popularity as a distributed version control system. It offered undeniable advantages. Unlike earlier version control tools such as CVS and SVN, Git let users store complete copies of the source code on their own computers without having to go to a centralized server. They could even update the code offline and share copies with each other; there was no need to store it centrally (and spend money on hosting a single source). And while branching code in SVN required duplicating the entire repository, creating branches in Git was fast and cheap.
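To make the "fast and cheap" branching claim a little more concrete, here is a minimal Python sketch. It is not how Git is actually implemented (real Git stores blobs, trees, and commit objects on disk); it only assumes a toy model in which commits are immutable snapshots and a branch is nothing more than a name pointing at a commit id. The `ToyRepo` class and its `commit`, `create_branch`, and `clone` methods are hypothetical names invented for this illustration.

```python
import hashlib


class ToyRepo:
    """Toy model of a distributed repository (an illustration, not real Git)."""

    def __init__(self):
        self.commits = {}   # commit id -> (parent commit id, snapshot of files)
        self.branches = {}  # branch name -> commit id

    def commit(self, branch, files):
        """Record a new snapshot on `branch` and move the branch pointer to it."""
        parent = self.branches.get(branch)
        payload = repr((parent, sorted(files.items()))).encode()
        commit_id = hashlib.sha1(payload).hexdigest()
        self.commits[commit_id] = (parent, dict(files))
        self.branches[branch] = commit_id
        return commit_id

    def create_branch(self, name, from_branch):
        """Create a branch by copying a single pointer; no files are duplicated."""
        self.branches[name] = self.branches[from_branch]

    def clone(self):
        """Return an independent copy holding the full history, usable offline."""
        other = ToyRepo()
        other.commits = dict(self.commits)
        other.branches = dict(self.branches)
        return other


repo = ToyRepo()
first = repo.commit("main", {"README": "hello"})
repo.create_branch("feature", "main")  # one pointer copied, zero files copied

laptop = repo.clone()                  # the whole history now lives locally
laptop.commit("feature", {"README": "hello", "game.py": "print('hi')"})

print(len(repo.commits), len(laptop.commits))  # prints: 1 2
```

In this sketch, creating the `feature` branch adds no new commit and copies no file contents, and the clone can keep committing without ever contacting the original repository, which is the behavior the paragraph above describes.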

As you might guess, these improvements led to an explosion of creative development in the Open Source community. Git was created specifically for distributed democratic development, and so it quickly went viral. However, Git was extremely slow to penetrate corporate code bases, which were quite happy with their private, centrally managed SVN servers with legacy processes.

Fast forward to 2008. Open source projects like Rails followed Linux and began to adopt Git. Private organizations continued to use SVN servers and Perforce for centralized source code management. Open source software was distributed primarily through SourceForge, then through Google Code or alternatives such as personal servers (but these were rare).

Despite its dominant position, SourceForge left much to be desired. It did not support Git until 2009, although by that time a whole year had passed since GitHub's creation, and the number of its users had exceeded 100 thousand. But the differences were not only in technology.

Today, when we think of online code hosting, we imagine a site where we can easily view the code itself, browse issues, and follow contributors. SourceForge didn't have that capability. Like Google Code, it was more focused on distributing software to end users than on collaborating on code. Yes, both sites solved part of the problem of distributing Open Source projects, but they offered little in terms of comments, source code review, change review, and other basic modern features.

It gets worse. Creating and managing repositories on SourceForge was a real pain. For example, to create a new repository you had to fill out an application and get human approval. Moreover, private repositories were not supported at all, making the site useless for hosting closed source projects.

Leaving comments and opening issues on projects was difficult: the process was extremely unintuitive, and forking was extremely rare. To contribute to a project, in most cases you had to create a patch and submit it via a mail server rather than just forking and opening a pull request. When I first learned about SourceForge, I was struck by how similar it was to today's Apple App Store. It's now clear what niche GitHub opened up in the market.

The SourceForge and Google Code Era

Take a look at SourceForge's landing page from 2008. What do you see? Proud claims to be the largest Open Source site. Download stats and featured projects. Latest software releases, even ads in the right column. But no mention of Git, no emphasis on user profiles, and no private repositories. The focus is on distribution, not developer collaboration.

An example repository on SourceForge. Note that the primary call to action is the download button, not the clone button.

Google Code was a step up from SourceForge. The site aimed to make it easy to host code and documentation for free. It relied on Google's powerful search engine and offered thoughtful documentation for developers looking to learn. However, it was critically lacking in the “social” and collaboration features that would later prove so important to open source developers. Finally, while it launched with SVN support, Google Code only added Git support in 2011, a move that Wired covered in an article with a sarcastic title.

Creating GitHub

One evening in 2008, Tom Preston-Werner and Chris Wanstrath stopped by a sports bar in San Francisco after a Ruby on Rails meetup. The Rails community was increasingly using Git, but there was no central, SourceForge-like site for hosting Git repositories. Moreover, social networking was clearly gaining traction, and there was no such network for open source developers. Git had made it much easier to collaborate on software, and sites like SourceForge helped distribute new releases. But there was no “home” platform for full-fledged collaboration. What if anyone could host source code, discuss issues, and ask maintainers to pull in changes from forks, all with a Facebook-like profile and comment feed?

The first version of GitHub was developed as a pet project. The authors worked on the basic functionality of the platform on weekends (using Rails, of course). Eventually, it acquired the features of a finished product — Chris could use it at his main job. Constantly testing GitHub in practice, the authors eliminated bugs and closed critical gaps in functionality.

“GitHub as a company actually grew out of this pet project, so we didn't have any big strategy or dream or ambition. We just wanted to work on something cool.” (Chris Wanstrath)

The creator of Ruby on Rails was an early adopter of GitHub, helping the site grow rapidly in popularity.

The GitHub profile of David Heinemeier Hansson, creator of the Ruby on Rails framework

GitHub in 2008

GitHub’s original logo included the phrase “Social Code Hosting,” and its core brand promise was “Git repository hosting is no longer a pain in the ass.” GitHub clearly defined its core selling points at launch: social networking features and the ability to host Git repositories online. No other site offered this.

GitHub's home page from 2008

News feed of projects and developers. Follow your favorite projects and the people working on them.

View source code. Easily view code in any version, branch or tag.

Public profiles of developers. Keep track of what other developers are working on and how many commits they've made.

“What's amazing about GitHub is how it adds a social aspect to the process. Chris and Tom show us how Git development should work. Personally, I've been repeatedly amazed by something as simple as pulling commits from external Git repositories.” (Rick Olson)

“You've probably heard it a hundred times in the last week: GitHub is powerful. I've never had a reason to host my code on a hosting service like this, but now I do.” (Josh Sasser)

GitHub's Rapid Growth

Having tested the MVP, the co-founders of GitHub launched a free beta for friends, giving them the ability to host public repositories.

GitHub's growth in the open source community was meteoric: the product was exactly what the market had been waiting for. Rails immediately switched to the platform, so anyone who wanted to use Ruby on Rails had to interact with GitHub. This is how people like my colleague David first learned about GitHub and Git.

  • In its first year, GitHub grew to 46,000 public repositories.

  • The next year, their number increased to 90,000; 100,000 people used the platform.

  • By the third year, the number of repositories had reached 1 million, and by 2011, GitHub had overtaken SourceForge and Google Code.

But it was too early to rejoice in the rapid growth: the founders continued to count every dollar and develop the business on their own. The main task that stood before them was to start generating income. Instead of focusing on advertising, like SourceForge, or on sales to corporate clients, like Perforce, the co-founders of GitHub began selling individual subscriptions for hosting closed repositories. The model was clear, self-service, and somewhat original compared to other hosting products and social networks. Google Code and SourceForge not only did not host Git repositories, but also did not offer options for hosting closed code. Besides GitHub, the only option for hosting closed repositories was your own server.

Beyond the quick revenue from subscriptions, GitHub’s co-founders were looking for other ways to make money and save money. They experimented with alternative revenue streams, such as one-time ad placements, a merch store, Git training services, and a job board. To keep business expenses as low as possible, GitHub partnered with Engine Yard and then Rackspace, offering them footer ads in exchange for free hosting.

Example of GitHub footer ad

In 2009, GitHub launched a self-hosted version that allowed larger enterprises to use the platform. Instead of using SVN servers or products like Perforce, engineers could use the new Git tools for both open source and proprietary development. GitHub’s Enterprise and Teams plan was launched in 2010, further cementing the platform’s presence in the closed-source and collaborative development market. And in 2011, GitHub:Fi became the official server product for enterprise customers.

“GitHub Enterprise was created to help bring GitHub to the masses. Whether you're stuck behind a firewall or have full access to the web, we want to make GitHub work for you.” (Chris Wanstrath)

By changing the approach to the code hosting business model, GitHub made a lot of money. It could now scale the team and gradually improve the user experience. But focusing on revenue and independent growth was not the founders’ whim — venture funding was not available. Until 2010, “companies building developer tools could not attract significant investment” (Forbes).

“Even when tool companies have sold businesses, like when Facebook acquired mobile app tool maker Parse in 2013, the reported $80 million was considered a modest result. For some, developer tools will always remain a risky venture capital area.” (Forbes)

It took investment in companies such as Heroku, Atlassian, Stripe, Twilio, and SendGrid to expand the market. It wasn’t until 2012 that GitHub received investment from Andreessen Horowitz, a $100 million Series A round that was billed as the largest in history.

Who Hasn't Used GitHub

Google never adopted GitHub. They used Perforce throughout the 2000s, and later developed their own version control system called Piper. As far as I know, in addition to their unique, cutting-edge version control tools, Google engineers also invented their own web interface for code reviews. Their early code review dashboard, created in 2004, was inspired by the Gmail interface and became the gold standard for enterprise software development processes. They had no immediate need for Git and GitHub.

Facebook also did without GitHub in its internal development. Around 2010, Facebook’s Evan Priestley developed Phabricator, a year before GitHub officially launched self-hosting for companies. Even if GitHub had offered this solution earlier, Facebook probably would have preferred its own tool, which would have better integrated into the company’s internal systems and handled monorepo scaling (something even Git has struggled with). Moreover, Facebook switched from Git to Mercurial, for which GitHub never developed full support.

Some unicorns, like Airbnb, have used GitHub from the start. Other big names, like Uber and Pinterest, have forked Phabricator and hosted it on their servers. I’m not sure, but I suspect they did this because:

  • Phabricator was best suited for self-hosting and open source control;

  • Their ranks included many former Facebook employees who missed the old tools.

This is what the Phabricator interface looked like. Since June 2021, active support for the project has ceased.

Launched in 2011, GitLab took a different path, focusing on a full-fledged DevOps platform rather than “social coding.” Its creators took advantage of the rapidly growing DevOps trend and bet on CI/CD to gain market share among large tech companies like NVIDIA.

Working on Graphite in 2024, I talk to hundreds, if not thousands, of high-tech engineering teams. And I rarely hear about companies that don't use GitHub. According to the Stack Overflow Survey 2022, GitHub's market share is only twice that of GitLab. However, in practice, my observation is that about 95% of modern tech companies use GitHub, and only a few of them host GitHub Enterprise themselves. The rest either use GitLab, Phabricator, or Gerrit, or have developed their own code hosting platforms.

The Future of GitHub and Code Hosting

Linus Torvalds, the creator of Git, praised GitHub, saying:

“GitHub hosting is great. They've done a great job with this. I think GitHub deserves a lot of credit for making hosting open source projects so easy.”

However, he also sharply criticized GitHub's implementation of the merge interface, noting:

“Git has a great pull request module, but GitHub decided to replace it with their own, completely crappy version. As a result, GitHub is useless for this kind of thing. It's great for hosting, but pull requests and online commit editing are terrible.” (Wired)

What is the future of code hosting? In his famous book “The Innovator's Dilemma,” Clayton Christensen argues that the first innovative products often start out as integrated solutions. I would argue that GitHub and GitLab are examples of such integrated offerings, providing a “single pane of glass” for teams looking to manage source code.

Christensen argues that as the market matures, solutions become specialized and modular. We’re already seeing this in some areas of “social programming.” Jira and Linear offer modular solutions for issue tracking, and Jenkins and Buildkite offer modular solutions for CI. GitHub was the first host for Git repositories, but over time Bitbucket and AWS CodeCommit have offered similar solutions. GitHub now offers an integrated merge queue, but there are more focused solutions on the market, such as Mergify, Aviator, Trunk, and Graphite.

GitHub maintains its monopoly on open source code thanks to its strong network effects and features like forking, forum-style comments, and moderation that are ideal for open source development. For closed source repositories, GitHub was initially chosen for its specialization in hosting Git, but this feature has now become commonplace. GitHub’s social features are of little use to commercial companies, where discussions are conducted via Slack, Notion, Linear, and Zoom.

I think there will be a separation in the future between open source and closed source development tools. For open source collaboration, discussions, profiles, moderation, forks, and project discovery are important. For closed projects, it is critical that code changes are reviewed in hours rather than days. This requires trunk-based processes, merge queues, emergency procedures, and CI/CD coordination. There are overlaps in both areas, but I think the world will become even more specialized over time in different solutions for these specific cases.

There are already specialized solutions from Facebook and Google. By developing their version control systems independently of GitHub's restrictions for Open Source, they have created powerful patterns such as Google's PR inbox and Facebook's stacked diffs.

I’d like to see modularity evolve to the point where every engineer can choose how to host their source code, completely independent of the tools they use to change that code. This trend is already happening in other aspects of development, such as allowing engineers to freely use their preferred IDE or cloud hosting provider. GitHub has earned its monopoly by specializing in code hosting and social programming, but that’s not the end of the story. Hopefully, someday in the future, developers will have not one, but five viable options for hosting their code.

Disclaimer: I'm fairly young and hadn't done any professional programming before GitHub. Most of the information in this article is gathered from online sources, interviews, and engineering work. I've sifted through a bunch of sources trying to understand how things were and where we're going.

Complete chronology of events

  • 1999: SourceForge begins operations, becoming the first free hosting provider for Open Source projects.

  • Before 2005: CVS and SVN are the standard version control tools; SourceForge is the leader in hosting open source projects.

  • 2005: Linus Torvalds creates Git, revolutionizing version control with a distributed system.

  • 2006: Google Code launches, originally supporting SVN.

  • 2008: GitHub is founded, offering Git repository hosting with a focus on social programming.

  • 2009: SourceForge adds support for Git; GitHub introduces a self-hosted version, laying the groundwork for GitHub Enterprise.

  • 2010: Facebook develops Phabricator, a set of web tools for code review and software development.

  • 2011: GitLab is founded, specializing in creating a full-fledged DevOps platform; Google Code adds Git support.

  • 2012: GitHub launches GitHub Enterprise, which addresses the private hosting needs of large organizations.

  • 2016: Google Code shuts down, highlighting GitHub's growing dominance in code hosting.

  • 2018: GitHub introduces GitHub Actions, a platform for automating software development workflows.

  • 2021: Phabricator is no longer actively maintained, resulting in an increase in GitHub users.

P.S.

The original article includes a nice note that Greg devotes a lot of time to writing in order to spread the word about Graphite. As a thank-you for the interesting content, we are including a link to the tool's website.
