Make Code Review a Habit

Make Code Review a Habit

1. Introduction

In May, I suddenly received a notification from summer from code.oa.com [a code management platform within Tencent], saying that the participation rate of Guangdiantong's code review was outstanding among all departments in the company, and the overall performance of our group (Guangdiantong advertising targeting group) in code review was the best among all groups in the company. This surprised me a little, and it felt like an accident. Summer hoped that I could write about some of our group's experience in code review. Recalling my experience of learning to use code review, the sentence that first came to my mind was a comment about code review that I saw a few years ago:

The biggest thing that makes Google's code so good is simple: code review. That's not specific to Google – it's widely recognized as a good idea, and a lot of people do it. But I've never seen another large company where it was such a universal. At Google, no code, for any product, for any project, gets checked in until it gets a positive review.

http://scientopia.org/blogs/goodmath/2011/07/06/things-everyone-should-do-code-review/

This is a passage that left a deep impression on me. It was written by a Google engineer. I quoted it as the beginning of the article to express my heartfelt recognition and respect for the CodeReview culture after learning and practicing CodeReview in collaboration with some outstanding engineers in Tencent over the years.

2. Rietveld

I joined Tencent in 2008, and mainly stayed in R2 to participate in the development of soso. From 2008 to 2010, I was very disdainful of R2's R&D quality management, and felt that it was really bad: there was no unified coding standard, no unittest, and even experienced that some people did not know how to use svn and did not understand code version control. Of course, I myself did not know much about quality management in the development process, and I had never heard of codereview at that time. In early 2010, the company began to increase investment in search and recruited some great people, especially some engineers from Google, and these people also brought some of Google's development culture into R2. Although the final fate of soso was not as expected, from 2010 to 2013, many soso engineers learned and experienced what is a rigorous and powerful development process, and also saw the Google technology and development process that we once copied. So far, when we recall it with many colleagues, we often joke that we have seen pigs run and eaten pork.

In April 2010, a talented person who came to our department from Google was Yiwang, a PhD in machine learning. He gave a report on LDA topic modeling, which was very impressive. After several contacts, I took the initiative to find Yiwang to work together, learn LDA and participate in the development of parallel topic modeling. The first thing Yiwang asked me to do was to study Google's open source Rietveld code review system and find a machine to build Rietveld. So after a lot of trouble, we built the company's first strict code review system, of course, at that time it was only for our own use within the group. After building Rietveld, the second thing we did was to unify the code style to the C++ coding style released by Google. The first four engineers who conducted code review on this system were Yiwang, Leostarzhou, Charlieyan, and me.

All those who have worked with Yiwang have suffered a lot during code review. He strictly enforces Google C++ coding style. If there is an extra space or a blank line in the code we submit, or a missing space after the comment symbol "//", all of them will be sent back for revision; not to mention the big taboos such as variable and function names not conforming to the specification and arbitrary use of variable abbreviations. For our initial code, Yiwang pointed out the places where we violated the specification one by one, and pasted the URL anchor of the specific requirements of the Google C++ specification in the comment to explain why we violated the specification. At first, I was a little skeptical, but after several rounds of back and forth, I felt embarrassed. I read the Google coding style carefully from beginning to end three times, carefully memorized some details, and then I started to get started soon.

Naturally, we worked with more and more engineers later, and many of them questioned and challenged the necessity of code review. So during the years of working with Yiwang, I repeatedly listened to his three stories.

The first one is about himself. He started coding when he was young, got the engineer certification in high school, did a lot of coding before entering Google, and the first 100-line program he wrote after entering Google had more than 100 lines of comments. So don't complain about our strict standards. Strict standards can produce beautiful code. The code review we do at Tencent today is really nothing.

The second one is about the controversy of code review within Google. It is said that when the code review system was established within Google, many engineers were not happy about it, especially the older engineers. They had been writing code for many years, and they could not stand being criticized by a group of young brats. *** said that a very important person in Google spoke out: If we want to control the quality of the project, no matter how experienced we are, we must abide by the common code standards, so the code review system was established.

The third one is to explain why Google C++ coding style is used. Google's coding standards are formulated by a group of top coders. Variable names are lowercase and function names are camel case. They are based on strict visual aesthetics and are pleasing to the eye. They have even been studied for visual comparison. Although I have my doubts, I still agree that Google C++ coding style is the best coding style I have ever seen. An HR from Tencent also expressed the same view when chatting with me:-).

Many rules of coding style are relative, and the great thing about Google style is that the purpose of most of the rules he set is not to prove that he is correct, but to list the pros and cons of each rule in detail, so that you can understand these rules more deeply. The famous statistician George EP Box said a widely circulated saying in data modeling: All models are wrong, but some are useful. Applying this sentence to the coding style of code review, I would say that All coding rules are wrong, but some are useful. Of course, an additional reason why Google style is good is that Google provides the cpplint tool to automatically check whether there are any violations of the standards in your code, and most of the low-level errors will be checked out.

3. Promote CodeReview

When our team first built Google's Rietveld internally and promoted its use within the department, we discovered some problems, mainly including:

1. How to send the issue to the company's outlook account via email after submission

2. The issue submission tool upload.py provided by Rietveld does not support Chinese well

3. The underlying text files are used for storage. When the traffic is large and when there are many issues submitted, the speed is not good.

Charlieyan and I made some changes to the Python source code of Rietveld for the first two problems, and they were quickly solved. We have not been able to solve the third problem. However, it was OK to support the development of several of our groups at that time. So Yiwang continued to promote the use of Rietveld in R2. By 2011, this tool had been used by multiple departments in R2. The following picture shows the usage of each department at that time that we had counted.

When we were promoting codereview in 2010, Yiwang also contacted colleagues in the R&D management department, who were the maintainers of code.oa.com. At that time, code.oa.com did not have a strict codereview function. It provided a function for codereview after code checkin, so there was no upload.py script for sending codereview; this function is certainly useful, but the review after the fact is often flawed and it is difficult to replace the review before the fact. In fact, the more important problem is that the culture of codereview had not been formed within the company at that time. When Yiwang communicated with people in the R&D management department, he strongly recommended the Rietveld system. Soon after, code.oa.com also developed a formal codereview function based on Rietveld, and the tcr.py used to submit issues was also modified based on Rietveld's upload.py.

The situation was somewhat different within Soso. Zhu Huican, who came from Google, led the search cloud platform (search infrastructure department) to strongly promote the construction of codereview culture. After several experts from the infrastructure department began to use this system, Wujie and Phong took over the CodeReview system. At that time, the third problem mentioned above had become a serious problem. Wujie and Phong began to improve the performance of Rietveld, replaced the underlying text file storage with MySQL database, and used Apache server to replace Rietvled's Django-based HttpServer, so the third performance-related problem of Rietveld was completely solved. Later, Huanyu applied for the domain name codereview.soso.oa.com to provide codereview services. So until Soso and Sogou merged, brothers from several departments within Soso had been doing codereview based on the Rietveld system.

4. C++ Readability

The contextual advertising center of the search advertising department should be the place where the code review culture was established most thoroughly and perfectly. At that time, Yiwang was the director of this center, and Huan and I were in charge of a group respectively. Our center was responsible for developing a set of contextual advertising products of AFC (ads for content), which had many similarities with the current Guangdiantong. Huan, Yiwang, and our GM Paulyan were all engineers from Google, and the AFC product was developed almost from scratch, so naturally, many development processes were copied from Google. Although the AFC product did not succeed in the company for various reasons, many engineers who participated in the development should have been impressed by the development process we established at that time. Of course, code review is a very important part of it. Huan borrowed from Google's code review process and promoted the establishment of a C++ readability system in the center. The readability of a language indicates the engineer's familiarity with the language and the corresponding coding standards. This C++ readability was initially granted to only a limited number of senior engineers. Each codereview issue needs to be approved by at least one engineer with C++ readability before it can be submitted to the code repository.

If an engineer who does not have C++ readability wants to obtain it, he needs to submit a code review issue to apply for it. The title of this issue must contain "Apply for C++ Readability". The issue must contain at least three files, xxx.cc, xxx.h, xxx_test.cc. If this issue is approved through strict code review, the engineer can be granted C++ readability. Usually, everyone will be very careful and meticulous in the review process of this application.

5. Code Review in Guangdiantong

[[131760]]

In October 2013, the search advertising platform department was merged into Guangdiantong to form the performance advertising platform department of SNG. The original AFC system was basically abandoned due to this merger, but many modules and components were gradually migrated to the Guangdiantong system during the development process. When we first entered Guangdiantong, Guangdiantong used the code.oa.com platform for code review and established some code review culture, but there were many places where it was not implemented properly. Although the coding style also advocated Google coding style, it was not strictly enforced, so the code style of the entire code base was very inconsistent. This also brought great challenges to our code review execution:

The old code style is not unified. How can we unify the code style? Does the old code need to be rewritten based on the new style?

The entire product is undergoing rapid development and iteration. The boss has clear expectations for the development and evolution of product features. The product manager urges the developers to make progress every day. Should we perform strict code reviews?

Guangdiantong.com has clearly stipulated that the coding style should be unified with Google C++ coding style. Our team's CodeReview has also been moved to code.oa.com. We no longer use Rietveld (but this system has not been abandoned. Wujie built this system to serve WeChat in January 2014 and applied for the domain name cr.oa.com). After several years of improvement, the CodeReview function of code.oa.com has indeed become very useful. Considering the rhythm of product iteration, our advertising quality center is advancing gradually:

All new submissions must use Google Style

All new modified code must use Google style

Within the specified time frame, the code of important modules must be refactored to Google style

In order to establish and implement the code review culture more thoroughly in Guangdiantong, several directors and architects also quickly promoted the implementation of the code owner mechanism in code.oa.com. The code owner is clearly set under each directory. Each change issue must be approved by the code owner before it can be submitted. When it was first implemented, in order to ensure the implementation of code review, the Ad Quality Center was very centralized in code review. The owner of all relevant directories was only written as the director. Therefore, all code review issues issued by the quality center can only be submitted after the director approves it. This does affect the development efficiency for a certain period of time, but it is effective in educating all developers about code review. After nearly three weeks of strict control, the owner of the directory was added to the team leader, and then gradually released to some key members of the team.

For our team, half of them are former AFC developers who have undergone strict code review training; the other half are former Guangdiantong developers who have not undergone strict code review training. In order to educate everyone on code review awareness, I repeatedly emphasize the importance of code review in the team's weekly meeting; at the same time, I clearly point out that everyone's evaluation is related to code review performance.

Engineering outputs for the advertising business mainly include design documents, code implementation, online A/B test experiments and follow-up analysis of experimental results, as well as daily sharing within the group (usually active email discussions, wiki construction, techtalk sharing within the group, etc.). For all front-line engineers, regardless of their rank, the most important engineering output principle is to show me the code, and code review is the best way to reflect this objective output. Within the group, we try to make everyone's participation in code review open and transparent, so we require

Each code review issue must be sent to your project collaborator, and submission is allowed only after the project collaborator reviews it.

Every issue must be forwarded to all members of the group. Anyone in the group can review other people's code if they want.

Since all codereview issues will be sent to all members of the team, I can see everyone's code output on code.oa.com. I judge the engineering output of each engineer in the team by doing codereview for everyone. On code.oa.com, I can roughly see the degree of participation of each person in the team in codereview, including the issues he sent out, the codereview comments he made to the members in the team, etc. code.oa.com provides a function that can dump everyone's codereview issues, so when evaluating, I will check the codereview output of each member within six months and check the quality of the issues submitted by the members in the team.

However, the function of checking everyone's codereview participation status on code.oa.com is still too weak. If code.oa.com can provide a statistical analysis function, everyone can see some of their own statistical indicators, including

The total number of files I modified, the total number of lines of code modified and committed

How many times I reviewed code for collaborators and how many comments I provided

How many times did my collaborators comment on me?

If the team leader can see these statistical indicators of the team members, it will undoubtedly provide a lot of objective and fair information for evaluating the engineering output of each team member.

6. Code Review Insights

After 4 years of doing Code Review, this process has basically been deeply rooted in our team and has become a behavioral habit in the group's engineering work. If one day I leave Tencent, I believe that the relatively strict Code Review training I have experienced in Tencent will help me do a good job in engineering quality management for the next project. There are many arguments on the Internet about why to do Code Review. When I worked with Yiwang before, the following points were summarized within the team (mainly summarized by Yiwang):

[[131761]]

Code Review ensures the quality of development. The whole process is well-regulated and there is no need to worry about "face". It can correct many minor problems that are not noticed or committed knowingly.

Code Review allows efficient communication and exchange, without the need for phone calls and video conferences, allowing colleagues in Beijing and Shenzhen to work together

Code Review helps transfer knowledge and share knowledge in a fine-grained manner within the team. Commenting on the code line by line is more detailed than technical exchange meetings and KM submissions.

Code Review ensures the sustainable use of code. Inconsistent code style causes a system to be rewritten every time it is handed over to the person in charge.

Code Review and Agile Development are compatible, not mutually exclusive. Agile is not about pursuing the number of releases, but about ensuring the quality of releases.

CodeReview is a social approach that encourages more technical exchanges between team members.

Code review is extremely beneficial to the growth of newcomers. Many newcomers feel terrible and overwhelmed when doing code review for the first time. The worst issue I have seen was required to be modified more than 20 times before being allowed to be submitted. However, this method is very efficient for the growth of newcomers. An engineer with good coding skills can quickly write qualified code after a few code review issues.

CodeReview increases the maintainability of code within the team, and at least two people will be familiar with the same code - the author and the reviewer of the code.

How to build a team's code review culture, following Google's footsteps, there are also some specific suggestions:

We need to train qualified reviewers. We need to establish a readability certification mechanism for various programming languages, starting with a group of “seed” reviewers, and allow more colleagues to obtain certification and become qualified reviewers.

The code style of various languages ​​needs to be refined and made public. The URL of the recommended source can be given in the comments to ensure that the code style and maintainability are implemented.

A Code Lab mechanism needs to be established to help engineers get started with development processes and specifications

A change approval mechanism needs to be established. More engineers in the team are encouraged to improve the code (agile), but before the modification is submitted, it must not only pass the review, but also require the approval of the relevant responsible person; while accelerating the development process, it ensures clear division of responsibilities.

Guide lightweight review. We have all experienced that when some engineers send a change issue, they send dozens of files at the same time. One issue may be a week's worth of modifications. Reviewers are almost panicked when they see such issues, and it is impossible to ensure the quality of the review. Therefore, good issues should be lightweight, and large issues should be split into small issues. The figure below is extracted from Cisco's codereview report. If the number of modified code lines in an issue exceeds 400, it is basically impossible for the reviewer to review it carefully and help you find bugs.

It is the engineer's responsibility to make it clear whether an issue can be submitted. Developers sometimes complain that others are slow to review and delay the time. Our team usually clearly states that whether an issue can be submitted is the developer's responsibility. Developers should actively encourage collaborators to help review the issue, and promote the issue to be approved and submitted in time.

7. Conclusion

All high-quality papers in the academic world are published only after peer review. All masters and doctoral students who have written papers have an experience that the various review opinions received are sometimes very painful. However, whether you reject the reviewer's opinions or accept the opinions and modify the article, you will go through a serious thinking process to improve the quality of your article. High-quality papers are strictly reviewed by scientific researchers, and I believe that high-quality code can also be produced through the review of engineers. Make code review a habit of the team, and we will gradually discover the positive energy released by this habit. I hope that one day, we can also do this:

At Tencent, no code, for any product, for any project, gets checked in until it gets a positive review.

<<:  Programming classes in prison: He became a programmer after he was released

>>:  The first killer app for Apple Watch is born

Recommend

Why can’t you find coconut water and coconut flesh in the coconuts you eat?

I still remember the first time I ate coconut, I ...

How to become a top coder in three years

In a typical intellectual industry like programmi...

Why is it not recommended to buy a mobile phone with 128G memory now?

Recently, Apple phones have once again sparked he...

How do operators build a data analysis framework?

Data analysis, as a core skill that operations pe...

Briefly talk about 20 laws of brand communication

Once the defense is broken, it will kill instantl...

Bilibili product analysis: Station B’s path to change

The multiple historical highs created by Bilibili...

"Qunar" operation strategy: 10 times user growth skills

Based on his own professional experience, the aut...

Tips for building a private domain traffic pool!

Private domain traffic is now a "battlefield...

[Inspirational story] Don’t be afraid of falling and never give up

[Inspirational Story] Don't be afraid of falli...

How to monetize short videos? Here are 7 ways

In recent years, short videos have been very popu...

Online activity plan operation process template

This template is a relatively general activity te...