How does Google manage 2 billion lines of code?

Asked "How big is Google?", you might answer with revenue, stock price, number of customers, or metaphysical influence. But that is not the whole picture. Since Google is the world's largest Internet company, we can also approach the question with "Internet thinking", for example by counting its lines of code. Rachel Potvin of Google offered a reference answer at an engineering conference held in Silicon Valley on Monday.

She said the software that runs all of Google's Internet services, including Google Search, Gmail, Google Maps and others, has about 2 billion lines of code. By comparison, the Windows operating system, which has been in development since the 1980s and is one of the most complex software tools ever developed for a single computer, has only 50 million lines of code.

So to put it simply, building Google is equivalent to building 40 Windows systems.

Of course, the 50 million lines of code drive only Windows itself, while the 2 billion lines cover all of Google. Google's business is extremely broad, spanning search, maps, documents, social networking, calendar, mail, video, and other Internet services. All 2 billion lines are stored in a single code repository that is available to all 25,000 Google engineers. Internally, Google treats its code like one huge operating system. Potvin said: "Although I can't prove it, I think this is the largest single repository in the world."

Google is an extreme example, but it shows how complex software has become in the Internet age, and how we have to change our coding tools and philosophies to cope with that complexity. Google's huge repository is available only to internal programmers, but in some ways it resembles GitHub - a public source code repository where engineers can share code over the Internet. We are moving towards a world where we need to collaborate on code frequently and at scale to keep up with the development of modern Internet services.

Sam Lambert, director of systems at GitHub, said: "Google has 25,000 engineers who can share code internally with people across all kinds of different skill sets. But small companies can use GitHub and open source to get the same advantages."

On the other hand, building and running a massive system with 2 billion lines of code is not easy. Lambert said: "This is a technical challenge and a huge feat. The numbers are quite staggering."

GitHub, which lets programmers easily share code and collaborate, covers millions of projects, but it does not treat them as a single project. Google has taken things a step further by combining many projects into one. Given how many engineers are involved and how hard it is to deal with so much code at the same time, this sounds crazy.

Piper

To handle all of this code at once, Google has built its own "version control system": Piper. It runs on Google's vast network infrastructure and spans 10 different Google data centers.

This system not only stores all 2 billion lines of code in one place and makes them available to internal engineers, but also gives engineers more freedom to use and merge code across countless projects. Potvin said: "When you start a new project, Google has already provided a rich set of libraries, and almost everything has been done for you." More importantly, engineers can make a code change and deploy it immediately across all Google services. Update one thing and you can update everything.

Of course, there are limits to this system. Potvin said that some highly confidential code, such as the PageRank search algorithm, is stored in a separate repository and is available only to certain employees. And because the Android and Chrome operating systems are very different from the online services, Google stores their code in separate version control systems. But for the most part, Google's code is a single whole.

Machine Programmer

Lambert pointed out that building and running such a system requires not only know-how, but also huge computing power. Piper spans about 85TB of data (85,000GB), and Google's 25,000 engineers make about 45,000 commits (changes) to the repository every day.
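To put the article's figures side by side, here is a rough back-of-the-envelope sketch. It uses only the rounded numbers quoted in the article; the variable names are illustrative, and the per-engineer values are simple averages rather than anything Google has reported.

```python
# Rounded figures quoted in the article; treat them as estimates.
GOOGLE_LINES = 2_000_000_000   # lines of code across Google's services
WINDOWS_LINES = 50_000_000     # lines of code in Windows
ENGINEERS = 25_000             # Google engineers with access to the repository
COMMITS_PER_DAY = 45_000       # daily commits to the repository

# How many "Windows-sized" codebases fit into Google's code.
windows_equivalents = GOOGLE_LINES / WINDOWS_LINES

# Simple averages; real activity is certainly not spread evenly.
commits_per_engineer = COMMITS_PER_DAY / ENGINEERS
lines_per_engineer = GOOGLE_LINES / ENGINEERS

print(f"Windows equivalents: {windows_equivalents:.0f}")
print(f"Average commits per engineer per day: {commits_per_engineer:.1f}")
print(f"Lines of code per engineer: {lines_per_engineer:,.0f}")
```

On these numbers, the average works out to just under two commits per engineer per day, against a codebase of roughly 80,000 lines per engineer.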

At the same time, Piper must also clear away much of the redundancy that human programmers generate. It has to ensure that the code is correct, that programmers do not interfere with each other, and that errors and unused code can be removed from the repository. Precisely because of these difficulties, Piper has to take over some of the work that humans used to do. Google has now switched from its previous version control system, Perforce, to Piper, letting machines do part of the work.

This doesn't mean that Google is letting robots write code, but the robots do generate much of the data and many of the configuration files needed to run the software. Programmers and robots work together to keep the code healthy. It's no longer only humans who maintain the code.

Piper benefits everyone

Could other companies benefit from the same kind of system? Of course they could, and some already do. Facebook's main app has more than 20 million lines of code, and the company manages the whole thing as a single project. Other companies do the same on a smaller scale, and as they approach the size of Google or Facebook, they will need similar tools. Google and Facebook are exploring ways to make that possible for everyone.

The two giants are developing an open source version control system that anyone can use to work with code at very large scale. It is based on an existing system called Mercurial, which Google is trying to scale up to handle repositories of Google's size.
