Can a single machine host more than 100,000 projects? A look at the hardware behind GitLab.com's Git repositories

If you want to host your project, consider GitLab.com, where we run a single instance of GitLab. Nearly 20,000 users currently use the service, and its more than 100,000 projects are all hosted on a single machine.

Single Server

Previously, GitLab.com ran on Amazon's AWS platform, on the highest-specification instance AWS offered. As the number of users kept growing, that single AWS instance could no longer meet our needs, particularly in CPU and storage. We had to find an alternative.

100,000 repositories require multiple terabytes of storage, so storage capacity is critical. Because we use Git, the storage must be a single file system rather than an object storage service such as Amazon S3, and we want to be able to expand it easily. In addition, a large number of people pushing and pulling code places heavy demands on the CPU, so having more cores helps keep the service responsive under high load.

The most cost-effective solution appeared to be running our own servers. Fortunately, GitLab is easy to run on your own hardware.

We therefore currently run GitLab.com on two independent servers: one is the active primary server and the other is a backup. The server configuration is as follows:

  • Server model: HP DL180 G6 (manufactured in 2009)

  • Processor: 2x X5690 (24 logical cores in total, counting Hyper-Threading)

  • 32GB RAM

  • 12x 2TB HDDs (two in RAID 1 for the root volume, the other ten in RAID 10 with an ext4 filesystem)

We actually only use 16 of the cores.
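As a rough check on the disk layout listed above, the usable capacity of each array can be worked out with simple arithmetic. The sketch below is only an illustration: the function name `raid_usable_tb` is hypothetical, and it assumes idealized RAID behavior (decimal terabytes, no filesystem or controller overhead).

```python
def raid_usable_tb(disks: int, disk_tb: float, level: str) -> float:
    """Return the idealized usable capacity in TB for a RAID 1 or RAID 10 array."""
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds a full copy, so usable capacity equals one disk.
        return disk_tb
    if level == "raid10":
        if disks < 4 or disks % 2 != 0:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        # Disks are paired into mirrors and the mirrors are striped,
        # so half of the raw capacity is usable.
        return (disks // 2) * disk_tb
    raise ValueError(f"unsupported RAID level: {level}")


# Layout from the server configuration: 2 disks in RAID 1, 10 disks in RAID 10.
root_tb = raid_usable_tb(2, 2.0, "raid1")
data_tb = raid_usable_tb(10, 2.0, "raid10")
print(f"root volume: {root_tb:.0f} TB usable")
print(f"data volume: {data_tb:.0f} TB usable")
```

Under these assumptions, the ten-disk RAID 10 array gives about 10 TB for repositories, which is consistent with the multiple terabytes of storage mentioned earlier.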

Failure and failover

Migrating from Amazon meant that we could no longer take advantage of some of the features of the AWS platform, so we needed some failover measures in case of server failure.

We use DRBD to create a master-slave architecture in which only one application server is active at a time; if something goes wrong, we fail over to the other server.

Our DRBD tooling is available to subscribers.
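The "only one active server at a time" rule described above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not DRBD's API and not the tooling mentioned above: the `Node` and `ActivePassivePair` classes, the `healthy` flag, and the server names are all hypothetical, and a real setup would also have to handle replication state and split-brain protection.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True
    role: str = "secondary"  # "primary" serves traffic; "secondary" only replicates


class ActivePassivePair:
    """Hypothetical model of a two-node active/passive pair (not DRBD's API)."""

    def __init__(self, primary: Node, secondary: Node) -> None:
        primary.role = "primary"
        secondary.role = "secondary"
        self.nodes = [primary, secondary]

    @property
    def active(self) -> Node:
        return next(n for n in self.nodes if n.role == "primary")

    @property
    def standby(self) -> Node:
        return next(n for n in self.nodes if n.role == "secondary")

    def check_and_failover(self) -> bool:
        """Promote the standby if the active node is unhealthy.

        Returns True if the roles were swapped, False otherwise.
        """
        old, new = self.active, self.standby
        if old.healthy:
            return False
        if not new.healthy:
            raise RuntimeError("both nodes are unhealthy; no failover target")
        # Demote before promoting so there is never more than one primary.
        old.role = "secondary"
        new.role = "primary"
        return True


pair = ActivePassivePair(Node("server-a"), Node("server-b"))
pair.active.healthy = False   # simulate a failure of the primary
pair.check_and_failover()
print(pair.active.name)       # server-b
```

Demoting the failed node before promoting the standby keeps the invariant that at most one node is primary at any moment, which is the property the master-slave setup relies on.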

Future scalability

GitLab.com runs well on its current hardware, but it is growing rapidly. Scaling the existing hardware is expensive and, in some respects, difficult.

In the future, GitLab.com will be hosted on Amazon's AWS platform again, which will allow us to scale horizontally with ease. In addition, Amazon has just announced EBS volumes larger than 10TB, which will make our migration easier.

Original English text: The hardware that powers 100,000 git repositories
