Printing logs is an art, but one that developers have long neglected. Logs are like car insurance: no one wants to pay for it, but once something goes wrong, everyone wishes they had it.
We tend to be casual when writing logs, yet when we have to read them we complain about all kinds of sloppy output, including our own. Write every log line well, and hold each other to it.

What is a log?

A log, as defined by Wikipedia, is a record of the operation of computer equipment such as servers, or of software. Log files provide an accurate record of the system, and error details and root causes can ultimately be located from them.

The defining characteristic of logs is that they describe discrete (discontinuous) events. For example, an application writes INFO or ERROR messages to a rolling file, and a log collection system ships them to a storage engine (such as Elasticsearch) for easy querying.

What is the use of logs?

As explained above, the role of logs is to provide accurate system records that facilitate root cause analysis. In which specific areas do they help?

① Print debugging: Use logs to record variables or a piece of logic. Recording the program's execution path, that is, which code the program actually ran, makes it easier to troubleshoot logic problems.

② Problem location: When the program behaves abnormally or fails, logs let you locate the problem quickly so it can be fixed. Because you cannot attach a debugger in the online production environment, and reproducing production in a test environment is time-consuming and laborious, locating problems from the information recorded in logs is very important. You can also record traffic and later use ELK (or EFK) to compute traffic statistics.

③ User behavior log: Records user operations for big data analysis, such as monitoring, risk control, and recommendation. These logs are generally consumed by other teams, possibly several of them, so there are usually fixed format requirements. Developers should record according to that format so other teams can use the data.
Which behaviors and operations to record is usually agreed upon in advance, so the developer mainly plays an execution role.

④ Root cause analysis (essential for deflecting blame): Record logs at key points. This makes it easier to locate problems across the various parties involved. If someone claims the fault lies in your program, you can produce your log and say, "Look, execution reached this point and the status was correct." The other party will then go and check their own code instead of passing the buck.

When to log?

The importance of logs has been covered above, so when do we need to record them?

① System initialization: Log system or service startup parameters. The initialization of core modules or components often relies on key configuration, and different parameters result in different services being provided. Be sure to write an INFO log here that prints the parameters and states that startup has completed.

② Exceptions raised by the programming language: All mainstream programming languages now include exception mechanisms, and popular business frameworks have complete exception modules. A caught exception is the system telling developers that something needs attention; it is a very high-quality error report. Record an appropriate log, at WARN or ERROR level depending on the actual business situation.

③ Business process does not meet expectations: Beyond platform and language exceptions, a result in the project code that does not match expectations is also a scenario for logging. In short, every process branch can be considered; it is up to the developer to judge whether the situation is tolerable. Common cases include incorrect external parameters and data processing problems that push a return code outside its reasonable range.
④ System core roles and key component actions: Business actions triggered by core roles in the system deserve more attention and are important indicators of whether the system is operating normally. INFO-level logs are recommended. Examples include the entire flow from user login to order placement in an e-commerce system, interactions between microservice nodes, inserts, deletes, and updates on core data tables, and the operation of core components. If the log frequency is high or the volume is particularly large, record the key points at INFO and consider DEBUG for the rest.

⑤ Remote calls to third-party services: An important principle in a microservice architecture is that third parties are never to be trusted. For remote calls to third-party services, print the request and response parameters so that problems can be located on each side and you are not left helpless because the third party provides no logs.

Log Printing

Slf4j & Logback

Slf4j stands for "Simple Logging Facade for Java". A facade is an abstraction layer placed in front of the concrete logging frameworks: a project codes against Slf4j and plugs in the logging system of its choice.

Logback is the native implementation of Slf4j. It was created by the founder of Log4j, and it offers more features and better performance than Log4j. Building on the earlier work on Log4j, Logback rewrote its internals and can be up to 10 times faster in certain scenarios, while also requiring less memory.

Log files

Log files are placed in a fixed directory and named according to a fixed template. The recommended log file names are:
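As an illustration of naming by template, the sketch below builds file names from an application name and a log kind. The template itself (app-kind.log for the file being written, with a .yyyy-MM-dd suffix once it has rolled into history) and the class name LogFileNames are assumptions made for this example, not names prescribed by any standard.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical naming template; the pattern is an assumption for illustration. */
final class LogFileNames {
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    /** Name of the file currently being written, e.g. "order-service-error.log". */
    static String active(String appName, String kind) {
        return appName + "-" + kind + ".log";
    }

    /** Name of a file already rolled into history: the active name plus the day it covers. */
    static String rolled(String appName, String kind, LocalDate day) {
        return active(appName, kind) + "." + DAY.format(day);
    }
}
```

Keeping the level or purpose in the file name means a collector can route files by name alone, without parsing their content.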
Log variable definition

It is recommended to use the Lombok (code generator) annotation @lombok.extern.slf4j.Slf4j to generate the log variable instance:
Code example:
Log Configuration

Logs are recorded by level. The levels correspond to the log file names, and log entries of different levels are written to different log files. Logs in a special format, such as access logs, go to a separate file. Take care to avoid printing the same entry twice (setting additivity="false" on the logger prevents this).

Parameter placeholder format

Use parameterized {} placeholders and wrap each parameter in [] to isolate it. This is more readable, and the parameters are only processed when the entry is actually going to be printed.
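With Slf4j, a parameterized call looks like log.info("order created, orderId=[{}]", orderId). To show why this form defers work, the sketch below uses a small stand-in class, PlaceholderDemo. It is hypothetical and is not the Slf4j API: it only mimics {} substitution and a level check, so the effect can be observed with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an Slf4j-style logger; not the Slf4j API itself. */
final class PlaceholderDemo {
    /** Lines that were actually emitted. */
    static final List<String> OUTPUT = new ArrayList<>();
    /** Set to true to simulate DEBUG being enabled. */
    static boolean debugEnabled = false;

    /** Replaces each "{}" in the template with the next argument, left to right. */
    static String format(String template, Object... args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = template.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders: ignore the extras
            }
            sb.append(template, from, at).append(arg); // arg.toString() runs here
            from = at + 2;
        }
        return sb.append(template.substring(from)).toString();
    }

    /** Formats only when DEBUG is enabled, so disabled calls never call toString(). */
    static void debug(String template, Object... args) {
        if (debugEnabled) {
            OUTPUT.add(format(template, args));
        }
    }
}
```

By contrast, a concatenated message such as "value=" + costly is built before the logger is even called, whether or not the level is enabled.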
Basic log format

① Log time

The date and time at which the log entry was generated. This value is very important and is generally accurate to the millisecond: yyyy-MM-dd HH:mm:ss.SSS

② Log level

Log output is divided into levels, and different settings and situations print different levels. Four levels are mainly used:

DEBUG: Mainly outputs debugging content and is used during development and testing. Logs at this level should be as detailed as possible. Developers can record all kinds of detail at DEBUG, including parameter values, debugging details, and return values, to make analysis easier when problems or exceptions occur during development and testing.

INFO: Mainly records key system information, with the aim of retaining key operating indicators during normal operation. Developers can write initial system configuration, business state changes, or the core steps of user business flows to INFO logs, which helps daily operations and lets you reconstruct the context when tracing an error. Once the project is complete, it is recommended to set the log level to INFO in the test environment and check whether the INFO output alone is enough to understand how the application is being used, and whether it would provide useful troubleshooting information if something went wrong.

WARN: Mainly outputs warnings for situations that are predictable and planned for, for example when a method input parameter is empty or its value does not meet the conditions the method requires. WARN entries should include fairly detailed information to facilitate later log analysis.

ERROR: Mainly for unpredictable situations, such as errors and exceptions.
For example, network communication and database connection exceptions caught in a catch block. If the exception has little impact on the overall flow of the system, WARN-level output is acceptable. When writing ERROR-level logs, output as much data as possible, such as method input parameters and objects created during method execution. When the faulty data or the exception is available as an object, output that object along with the message.

③ DEBUG/INFO selection

DEBUG is lower than INFO and carries more detailed information about the system's running state during debugging, such as variable values; all of that can go to the DEBUG log. INFO is the default output level for online logs and reports the current state of the system to the end user, so the information it outputs should be meaningful to the end user. From a functional perspective, INFO output can be regarded as part of the software product, so treat it with care and do not emit it casually. If an entry is printed frequently, or is not useful for debugging most of the time, consider lowering its level to DEBUG:
④ WARN/ERROR selection

These levels apply when a method or function produces an unexpected result or when a framework reports an error. Solutions to common problems include:
Generally speaking, WARN-level logs do not trigger an SMS alarm, whereas ERROR-level logs trigger an SMS alarm or even a phone call. An ERROR-level log means that a very serious problem has occurred and someone must deal with it immediately, for example the database is unavailable or a key business process cannot proceed. Misusing the level has serious consequences. Logging every problem as ERROR regardless of its importance is irresponsible, because a mature system has a complete error reporting mechanism in which the decision to send an alert is often based on the number of ERROR logs per unit of time.

⑤ Emphasize ERROR alarm

ERROR-level logging is usually accompanied by an alarm notification, so an ERROR entry should correspond to an impaired business function, that is, the kind of very serious problem described above that someone must deal with immediately. The goal of the ERROR log is to give the person handling it direct and accurate information, so that the ERROR information forms a closed loop. Problem location:
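As a sketch of an ERROR entry that supports problem location, the hypothetical helper below assembles what failed, the method inputs, and the full exception stack into one entry. The class and method names are assumptions made for this example. With Slf4j the same effect is normally obtained by passing the exception as the last argument, as in log.error("pay order failed, orderId=[{}]", orderId, e).

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Arrays;

/** Hypothetical helper that gathers the context an ERROR entry should carry. */
final class ErrorEntry {
    /** Returns "what, inputs=[...]" followed by the stack trace of the cause. */
    static String describe(String what, Throwable cause, Object... inputs) {
        StringBuilder sb = new StringBuilder(what);
        // Arrays.toString is null-safe, so a null input cannot break the log call.
        sb.append(", inputs=").append(Arrays.toString(inputs));
        StringWriter trace = new StringWriter();
        cause.printStackTrace(new PrintWriter(trace, true));
        return sb.append(System.lineSeparator()).append(trace.toString()).toString();
    }
}
```

Writing the inputs and the stack in the same entry lets the person handling the alarm act on it without having to reproduce the failure first.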
⑥ Thread name

Output the name of the thread that wrote the log. A synchronous request in an application is generally handled by a single thread, so the thread name groups the logs produced by each request and makes it easier to pick out the logs of the current request context.

⑦ OpenTracing identifier

In a distributed application, a single user request is served by several services, and the calls may be nested. The logs for one request are therefore not in one application's log file but scattered across the log files of different application nodes on different servers. This identifier is used to tie together the call logs of one request across the entire system:
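The idea can be sketched with a per-thread context that holds the trace id for the duration of a request. TraceContext below is a hypothetical stand-in written with the JDK only. In a real setup the id would come from the tracing library and be exposed to the logging framework through the MDC.

```java
import java.util.UUID;

/** Hypothetical per-thread trace context; a stand-in for a tracing library plus MDC. */
final class TraceContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Reuses the caller's trace id when one arrived with the request, else starts a new trace. */
    static String begin(String incomingTraceId) {
        String id = (incomingTraceId == null || incomingTraceId.isEmpty())
                ? UUID.randomUUID().toString().replace("-", "")
                : incomingTraceId;
        TRACE_ID.set(id);
        return id;
    }

    /** Prefixes a message with the current trace id so every line of a request can be found. */
    static String tag(String message) {
        String id = TRACE_ID.get();
        return "[" + (id == null ? "-" : id) + "] " + message;
    }

    /** Must run when the request ends, otherwise pooled threads keep the old id. */
    static void end() {
        TRACE_ID.remove();
    }
}
```

A downstream service that receives the id in the request and passes it to begin() writes the same value into its own log lines, which is what makes a cross-service search possible.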
By searching for a trace id, you can find every log entry generated by the corresponding request as it flows through the entire system.

⑧ Biz identifier

In business development, logs are always tied to the business, and sometimes we need to group them by user or by business entity. If a request can be grouped by some identifier, print that identifier in the log:
⑨ Logger name

The logger name is usually the class name. The simple class name can be output in the log file; whether the package name and line number are needed depends on the situation. Its main purpose is to let you find the class that produced an entry so you can locate the problem.

⑩ Log content

Do not use System.out.println or System.err.println.

Use variable parameter substitution instead of string concatenation. Objects that are written to the log should implement a fast toString method in their class, so that the log shows their content rather than only the class name and hashCode.

Prevent null pointers: do not call methods on an object inside a log statement to obtain values unless you are certain the object is not null. Otherwise the log statement itself may well cause a NullPointerException in the application.

⑪ Exception stack

The exception stack usually appears in ERROR or WARN entries. It contains the method call chain and the root cause of the exception. The stack trace belongs to the log entry on the preceding line and must be merged into that entry when logs are collected.

Best Practices

① Log format

The log format is as follows:

2019-12-01 00:00:00.000|pid|log-level|[svc-name,trace-id,span-id,user-id,biz-id]|thread-name|package-name.class-name : log message
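A minimal sketch that assembles one line in the layout given above, using only the JDK. The class name LogLineFormat and its parameter names are assumptions made for this example. In practice the logging framework's pattern produces this line, and the application does not build it by hand.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical formatter that mirrors the field order of the log format. */
final class LogLineFormat {
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    static String format(LocalDateTime time, long pid, String level,
                         String svcName, String traceId, String spanId,
                         String userId, String bizId,
                         String threadName, String loggerName, String message) {
        // The bracketed block carries the identifiers used to group and search entries.
        String context = String.join(",", svcName, traceId, spanId, userId, bizId);
        return TS.format(time) + "|" + pid + "|" + level + "|[" + context + "]|"
                + threadName + "|" + loggerName + " : " + message;
    }
}
```

Because every field sits at a fixed position between fixed separators, a collector can split the prefix without regular expressions.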
② Log module expansion

The log module is extended based on the following technical points:
For each traced call, the context information in the OpenTracing Scope is placed in the MDC, and it is emitted through the logging.pattern.level value, which is extended via the Spring Boot Logging extension interface. Related source code reference:
Standardization can be achieved by modifying the pattern of each appender in the logback configuration file to the following default value.
Excerpt from logback.xml:
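As a hedged illustration of such a configuration, the fragment below defines one pattern and applies it to a daily rolling file appender. The property names APP_NAME, LOG_DIR, and LOG_PATTERN, the sample values, and the MDC keys traceId, spanId, userId, and bizId are all assumptions made for this example. PID is assumed to be supplied as a system property, as Spring Boot does.

```xml
<configuration>
  <!-- Assumed values; replace with the real service name and log directory. -->
  <property name="APP_NAME" value="demo-service"/>
  <property name="LOG_DIR" value="logs"/>

  <!-- %X{key} reads a value from the MDC and prints nothing when the key is absent. -->
  <property name="LOG_PATTERN"
            value="%d{yyyy-MM-dd HH:mm:ss.SSS}|${PID:- }|%level|[${APP_NAME},%X{traceId},%X{spanId},%X{userId},%X{bizId}]|%thread|%logger : %msg%n"/>

  <appender name="INFO_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>${LOG_DIR}/${APP_NAME}-info.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <!-- One history file per day, kept for 30 days. -->
      <fileNamePattern>${LOG_DIR}/${APP_NAME}-info.log.%d{yyyy-MM-dd}</fileNamePattern>
      <maxHistory>30</maxHistory>
    </rollingPolicy>
    <encoder>
      <pattern>${LOG_PATTERN}</pattern>
    </encoder>
  </appender>

  <root level="INFO">
    <appender-ref ref="INFO_FILE"/>
  </root>
</configuration>
```

Defining the pattern once as a property and referencing it from every appender keeps all files in the same format.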
Example of code usage:
Logging:
Log Service SLS

Alibaba Cloud Log Service

Alibaba Cloud Log Service (SLS) is a one-stop service for log data that was developed and hardened in a large number of big data scenarios at Alibaba Group. It lets you collect, consume, deliver, query, and analyze log data without custom development, improving operations efficiency and providing the capacity to process massive volumes of logs in the DT (data technology) era.

Project: The Project is the basic unit for managing logs. For service logs, it is recommended to create one Project per environment. That way the log records form a closed loop: they are generated along with the service calls throughout that environment.

Logstore: The log store. It is recommended to divide logstores by log type, for example access logs in a specific format and info/warn/error logs. Logs in a specific format can be given more convenient indexes and alarm settings.

Note: Do not divide logstores by application service. In a microservice architecture a request crosses multiple application services and its logs are scattered across them. Dividing logstores by service requires developers to know the running state and call topology of the applications well, which they often do not.

① Real-time collection and consumption

Function:
Applications: data cleaning (ETL), stream computing, monitoring and alarming, machine learning, and iterative computing.

② Query analysis

Real-time indexing, query, and analysis of data:
Applications: DevOps/online operation and maintenance, real-time log data analysis, security diagnosis and analysis, operation and customer service systems.
③ Consumer delivery

Stable and reliable log delivery: core log data is delivered to storage services for retention. Compression, custom partitions, and both row and column storage formats are supported.

Purpose: data warehouse plus data analysis, auditing, recommendation systems, and user profiling.

④ Alarm

The alarm function of the log service is built on the query charts in the dashboard. Set alarm rules on the query page or dashboard page of the log service console, specifying the rule's configuration, check conditions, and notification method. Once the alarm is set, the log service periodically checks the dashboard's query results and sends a notification when the results meet the preset conditions, giving you real-time monitoring of service status.

⑤ Best Practices

Alibaba Cloud's log service is quite powerful. To make good use of it, refer to: https://help.aliyun.com/document_detail/29090.html?spm=a2c4g.11186623.6.1079.4edd3aabvs50OW

ELK Universal Log Solution

ELK is an acronym formed from the initials of three open source projects: Elasticsearch, Logstash, and Kibana. It is also known as the Elastic Stack.

Elasticsearch is a distributed, near-real-time search platform built on Lucene and accessed through a RESTful interface. It can serve as the underlying engine for large-scale full-text search scenarios of the kind seen in search engines such as Baidu and Google, which shows how powerful its search capabilities are. Elasticsearch is often abbreviated as ES.

Logstash is the central data flow engine of ELK. It collects data in different formats from different sources (files, data stores, message queues) and, after filtering, outputs it to different destinations (files, message queues, Redis, Elasticsearch, Kafka, and so on).
Kibana displays Elasticsearch data through a friendly web page and provides real-time analysis capabilities.

Practical Notes

Common format log:
Ordinary logs have a fixed prefix, so the tokenized index can be fixed as well, which makes querying and analysis faster. Logs in a specific format, taking access logs as an example:
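A hypothetical access-log entry can be sketched as follows. The field order, the | separator, and the class name AccessLogEntry are assumptions made for this example. The fields http-status, take-time, and biz-code are the kind of values such a log is indexed on.

```java
/** Hypothetical access-log entry; field order and '|' separator are assumptions. */
final class AccessLogEntry {
    final String method;
    final String uri;
    final int httpStatus;
    final long takeTimeMs;
    final String bizCode;

    AccessLogEntry(String method, String uri, int httpStatus, long takeTimeMs, String bizCode) {
        this.method = method;
        this.uri = uri;
        this.httpStatus = httpStatus;
        this.takeTimeMs = takeTimeMs;
        this.bizCode = bizCode;
    }

    /** Renders "method|uri|http-status|take-time|biz-code". */
    String toLine() {
        return method + "|" + uri + "|" + httpStatus + "|" + takeTimeMs + "|" + bizCode;
    }

    /** Parses a line produced by toLine(); fixed positions make each field easy to index. */
    static AccessLogEntry parse(String line) {
        String[] f = line.split("\\|", -1);
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + line);
        }
        return new AccessLogEntry(f[0], f[1], Integer.parseInt(f[2]), Long.parseLong(f[3]), f[4]);
    }

    /** Example alert condition: a server error, or a response slower than the threshold. */
    boolean needsAlert(long slowThresholdMs) {
        return httpStatus >= 500 || takeTimeMs > slowThresholdMs;
    }
}
```

Since each value has its own position, an alarm rule can test http-status or take-time directly instead of searching free text.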
Logs in a specific format can be indexed by field, which supports focused queries, analysis, and alerts based on values such as take-time, http-status, and biz-code.