I saw a hot post on Zhihu that I found very interesting, titled "Why are some big companies so weak in technology?" When I first saw the title, I reacted with a preconceived stereotype: like the top answer, I frowned and felt contempt for the person asking. But shortly after reading the post, I found myself on the questioner's side. As one answer in the thread put it: "After reading the title I thought the poster was an idiot, but after reading the body I realized it was really the company that was the idiot." The probability of that happening is actually quite low, but I think this time it really did.

What I find regrettable is that most of the answers try to "educate" the questioner: the world is not perfect, we should compromise and accept such facts, we should helplessly swallow the bitter fruit of reality. Isn't it abnormal for that view to be so widespread? For example: "Writing good code is a programmer's moral integrity. Sorry, how much is moral integrity per kilogram, and how much does a square meter of commercial housing inside Beijing's Third Ring Road cost?" Some questions cannot be answered in a few words, especially when facing the reality of an entire industry. But one thing is clear: when a project's code is written this badly, the project will not last long.

So let's stick to the facts. I will post the original below and analyze it item by item, explaining why the questioner is basically right and the problem lies with this company and this project team.

=================================

At the beginning of this year, I interned at an Internet company that is a leader in its domestic industry, yet weak in both technology and management. The programmers there spend every day checking emails and problem tickets, and most of those problems are caused by their own improper design.
>> This is so-called operations work, which is boring most of the time. It usually also means the system is complex, the load is high, and problems are hard to locate and fix; these are often large systems inherited from history. That is not surprising: outsiders see the glorious AWS, while the engineers inside know how much maintenance work it takes and how great the pressure is. I remember a talk by a principal engineer not long ago about how the torture of oncall made him grow. I think what he said is true; it's a pity I haven't reached that level yet. The questioner can see that most of the problems are caused by improper design, which is good. The initial architects of this project, as well as the backbone engineers who later looked after the architecture, deserve criticism. For a "domestic industry leader", this should not happen.

The code is a mess, all copy and paste, to the point that even the original author's name was left unchanged. People generally don't write comments or format the code, so the code is crooked all over.

>> This is another thing that shouldn't happen. Everyone has their own standard for code quality, but I don't believe a genuinely "leading" company can accept descriptions like "copy and paste", "didn't even change the author", "no comments", and "no formatting". These basics are like eating a meal one bite at a time: no matter how ambitious you are, they can never be skipped. When companies recruit, since it is a two-way choice, perhaps both sides should show each other their code; once the code has been seen, neither side has to waste the other's time.

In one project there are four kinds of httpclient: one written by the company's R&D department, one an old version of an open-source project, one a new version of that open-source project, and one a wheel reinvented by the developers themselves.
>> The ideal, of course, is to unify on a single one. When you see this many implementations, including self-invented wheels, the cause is usually "historical reasons". And given that the code is plainly "copy and paste", it is not surprising that people wrote to their own understanding without looking at the existing implementations. I can accept more than one httpclient implementation for some special reason (I have also written about the benefits of reinventing the wheel), but four of them is not a good thing.

They try to log interface requests and responses, but don't know how to use an interceptor. The error log carries no context information. Everyone has their own logging style, all of them weird, and many important intermediate steps are not logged at all.

>> Interceptors are a good thing: they simplify the code and keep verbose logging from cluttering the business logic. They also have drawbacks, such as being unintuitive, harder to debug, and occasionally a performance concern, so they are not mandatory; it depends on the project. The weird log styles are mostly a symptom of missing project management: everyone doing their own thing, with no communication. Never mind that the developers are not very good; the project management is even weaker. Not logging important steps only proves that this group of developers lacks engineering awareness.

The configuration files of IDEA, Eclipse and MyEclipse were all committed into the project.

>> It's fine for people to use different IDEs, but committing the IDE configuration files too? If even such a low-level problem can occur... is there no code review at all?
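To make the interceptor point above concrete, here is a minimal, framework-free sketch of interceptor-style request/response logging. In a real Java web project this would be a servlet Filter or a Spring HandlerInterceptor; the names LoggingInterceptor, wrap, and the in-memory LOG list here are hypothetical, chosen only so the idea stands on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// A sketch of interceptor-style logging: the request and response of
// every handler are logged in ONE place, instead of scattering log
// statements through the business code.
public class LoggingInterceptor {
    // Collected log lines; a real project would write to a logger (e.g. SLF4J).
    static final List<String> LOG = new ArrayList<>();

    // Wrap any request handler so each call logs its input and output.
    static Function<String, String> wrap(String name, Function<String, String> handler) {
        return request -> {
            LOG.add(name + " request: " + request);
            String response = handler.apply(request);
            LOG.add(name + " response: " + response);
            return response;
        };
    }

    public static void main(String[] args) {
        // Business logic stays clean: no logging code inside the handler itself.
        Function<String, String> echo = wrap("echo", req -> "echoed:" + req);
        echo.apply("hello");
        LOG.forEach(System.out::println);
    }
}
```

The point is exactly what the commentary says: the handler body contains no logging at all, so verbose logs cannot clutter the business logic, and the log format is decided in one place rather than per developer.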
A programmer who had been at the company for two years was building a query interface with an express-delivery company, and didn't even know the waybill number should be encrypted.

>> Whether such a field needs to be encrypted usually depends on the protocol agreed between the two calling parties, but any sensitive information, private data in particular, should be encrypted to reduce the risk of leakage.

There is no requestId in any inter-service communication, which makes sessions difficult to track.

>> If it is only service-to-service communication and the call is a simple query, I think having no requestId is acceptable. As for session tracking: if that means the system can trace all, or at least the important, behavior from a request's arrival through its processing (which interfaces were called, what results came back, which operations were performed, and so on), then that capability is indeed necessary, but it can be achieved without a requestId in the inter-service protocol. For example, keep a thread variable in the main system, prefix every log line with it, and let a downstream log-analysis tool collect all the logs of that session.

An edge interface with negligible QPS actually uses an asynchronous producer-consumer + blocking queue design. Showing off, are we? Don't you know asynchrony increases maintenance cost and makes testing harder? Moreover, the task queue has no persistence, so many tasks were simply lost at release time.

>> For an edge interface with negligible QPS, such an asynchronous design does look like overkill. But this needs to be judged in context; it might, for example, anticipate future growth, if a significant increase in requests is foreseeable for some interfaces.
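The "thread variable" approach to session tracking suggested a moment ago can be sketched in a few lines. This is the same idea as SLF4J's MDC, reduced to a plain ThreadLocal; the class name RequestContext and the "req-" id format are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// A sketch of thread-local session tracking: the entry point assigns an
// id once, every log line on that thread is prefixed with it, and a log
// analysis tool can later stitch the whole session back together,
// without a requestId in any inter-service protocol.
public class RequestContext {
    private static final AtomicLong COUNTER = new AtomicLong();
    private static final ThreadLocal<String> ID = new ThreadLocal<>();

    // Called once when a request enters the system.
    public static void open() { ID.set("req-" + COUNTER.incrementAndGet()); }

    // Called by every log statement; no id is threaded through method signatures.
    public static String tag(String message) { return "[" + ID.get() + "] " + message; }

    // Called when the request is done, to avoid leaking ids across pooled threads.
    public static void close() { ID.remove(); }

    public static void main(String[] args) {
        open();
        System.out.println(tag("calling payment service"));
        System.out.println(tag("payment ok"));
        close();
    }
}
```

One caveat worth noting: a plain ThreadLocal stops working the moment a request hops threads or crosses a service boundary, which is precisely when an explicit requestId (or a distributed-tracing system) starts to earn its keep.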
Considering the whole context, though, I'm more inclined to believe the project team neglected management, and then a self-styled "big shot" arrived and built a set of high-end tricks to impress everyone; or someone who badly wanted to try something new used this project for practice. As for the task queue lacking persistence, it depends on the specific case, but usually, if you are designing a general-purpose task queue, persistence is a must. Admittedly you can't say so absolutely: you might make the system entirely indifferent to task loss, or have the scheduler continuously retry failed tasks so their loss doesn't matter. But think about it; such cases are quite rare.

To read a small XML or exe configuration file, they actually use streaming parsing. I have never seen anything so stupid.

>> This is the same butcher-knife-to-kill-a-chicken problem as above, and has already been explained.

Optimization is all done by snapping fingers. Don't you know how to analyze logs with Excel and scan the project with JProfiler? To traverse a constant set of fewer than 100 elements, someone even wrote an optimization algorithm, with the algorithm and the business logic mixed into a mess. And everyone keeps talking about performance, algorithms, distributed computing...

>> What's meant here seems to be performance optimization, which includes at least two parts. One is analyzing requirement-level data at the design stage to decide how much "optimization" is needed. The other is working backwards from logs and live operational data, as the questioner mentions (I wrote something about this a few years ago). Either is far more reliable than patting your head or your thigh. As for writing a special optimization algorithm just to traverse a constant set of fewer than 100 elements:
This may not always be a bad thing; sometimes everyone really is following best practice. But considering the context (including "the algorithm and the business logic are mixed together"), I'm more inclined to file it under the problems already described. And in this situation, everyone still shouting "performance, algorithms, distributed computing" misses the principal contradiction, which is fixing the most basic problems in this code. I remember my calligraphy teacher criticizing me as a child: don't reach for techniques yet; first practice the basic horizontal and vertical strokes.

There is almost no documentation; all the logic has to be deduced from the code.

>> Code and documentation are often at odds, and this is not uncommon. Summary-level documentation is feasible; detailed documentation is usually unrealistic, and even when written it inevitably goes stale. Most companies' internal documentation is a mess.

He doesn't use enums, but insists on hard-coding the enumeration values one by one on every page. Do you know how hard that makes the code to change later?

>> A lack of basic programmer discipline.

Deceptive variable names: one stores AES-encrypted data but carries a DES suffix; another stores lowercase letters but is named upperStr. A method has more than a dozen parameters, a third of them extreme abbreviations, and of course no comments. Classes of three or four thousand lines are common.

>> Reading this, I don't even have the strength to complain.

For development and self-testing, they put all the code on a shared machine and use SVN, treating it as an FTP. SVN is full of meaningless commits, and more than half of them don't even compile.
I saw a fresh graduate change two lines and commit immediately, saying he was afraid the code would get lost.

>> Using SVN during self-testing is not unusual: just as with git you cut a new branch for a feature, develop, commit, and deploy to various machines. But using it as an FTP is plainly wrong. "More than half of the commits don't compile": does this project have no project management at all? No unit tests on the development machine? As for "change two lines and commit immediately", I see nothing wrong with it, as long as pre-commit tests and review pass.

In a project that had been running for two years, Spring's package scanning was obviously misconfigured; some beans could not be scanned in at all, and nobody noticed. Half the beans are managed by Spring; the other half are instantiated through hand-written singletons.

>> The first point is more evidence of poor management: even the main programmers lack a sense of responsibility for the project as a whole, or the code is so bad that nobody can muster any interest in it. The second is a classic bad practice. Such a compromise may occasionally be necessary, but if half the beans live outside Spring's management, why introduce Spring in the first place?

They built an audit system with report generation on MySQL, and one report takes 8 minutes to run. It turned out someone stored multiple values in a single string (comma-separated) and queried it with LIKE in the SQL, so the index could not be used. Why not use pg? PostgreSQL has richer SQL features, is better suited to statistics, and supports arrays natively.

>> A report taking 8 minutes can be perfectly normal. Storing multiple values in a string, and the index going unused, need case-by-case analysis; in principle I don't see an inherent problem.
To solve it cleanly in MySQL, you still have to split the multiple values into multiple rows in a new table. There are also NoSQL systems that support multi-valued attributes natively, such as DynamoDB, but that is off topic. As for why pg wasn't used, that goes back to the original technology selection. It is easy for those who come later to say "if you had used xxx, then yyy", without knowing what technical considerations existed at the start. Then again, given how bad this project is, MySQL may simply have been the convenient choice for an early prototype. In any case, a questioning mindset is always worth encouraging.

The programmers all have a laissez-faire attitude; they consider the job done once the code is checked in and the tests run.

Why do large Internet companies have such poor technology and management? How did they come to be this way?
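As a closing illustration of the enum complaint raised earlier (hard-coding the enumeration values one by one on every page), here is what the fix looks like. OrderStatus and its values are hypothetical examples, not from the original project.

```java
// A sketch of the enum fix: define the magic values ONCE, instead of
// hard-coding the same raw ints and strings on every page.
public class EnumDemo {
    enum OrderStatus {
        CREATED(0, "created"),
        PAID(1, "paid"),
        SHIPPED(2, "shipped");

        final int code;
        final String label;
        OrderStatus(int code, String label) { this.code = code; this.label = label; }

        // One lookup method replaces scattered if/else chains on raw ints;
        // adding or renumbering a status is now a one-line change.
        static OrderStatus fromCode(int code) {
            for (OrderStatus s : values()) if (s.code == code) return s;
            throw new IllegalArgumentException("unknown status code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(OrderStatus.fromCode(1).label);  // prints "paid"
    }
}
```

This is exactly the "hard to modify later" problem the questioner describes: with hard-coded values, renumbering a status means hunting down every page; with the enum, it means editing one constant.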