Troubleshooting starts with error code 406

Background

A while ago, colleagues in operations told me that a Hujiang teacher could not log in to his Hujiang account while abroad. It looked like a very common fault, but troubleshooting it turned out to be far from simple. Along the way we learned more than we expected, and we would like to share it here.

We first determined that the fault lay in the frontend rather than the backend, so we quickly checked the frontend logs. There we found that the interface used to determine the client's geographic location kept failing, with a large number of HTTP 406 status codes (more than 10,000 within 24 hours). According to the HTTP specification, status codes starting with 4 indicate client errors. Since the fault affected only this one teacher, we initially suspected that the 406 errors were the root cause.

As we gathered more information and dug deeper, we quickly resolved the foreign teacher's problem. Unfortunately, we also confirmed that it had nothing to do with the 406 errors.

However, we could not stop there. Under normal circumstances the response status should be 200, so where did this flood of 406 responses come from? Why couldn't we reproduce it? What caused it? An outbreak on this scale should have triggered user complaints, so why was user feedback so quiet?

The following figure shows the 406 error in the log platform

Troubleshooting process

To ensure performance, our Node client does not record every request in detail, so the 406 log entries alone did not reveal the specific cause. To troubleshoot the issue, we rushed out a production patch that records detailed information for each request, and then saw the following request in the log platform:

For comparison, we captured a normal request in the browser, as shown below:

Comparing the two requests carefully, and keeping the definition of error code 406 in mind, we focused on the Accept header.

In the log

The normal browser behavior

So we simulated the faulty request in Postman and, sure enough, reproduced the 406 error, confirming that the problem was caused by the Accept header.
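The same kind of reproduction can be scripted instead of using Postman. The following is a minimal sketch using only the Python standard library. The URL and the Accept value are hypothetical placeholders, since the actual endpoint and header value from the logs are not given in this article.

```python
import urllib.error
import urllib.request


def build_probe(url: str, accept: str) -> urllib.request.Request:
    """Build a GET request that carries the given Accept header."""
    return urllib.request.Request(url, headers={"Accept": accept})


def probe_status(url: str, accept: str, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(build_probe(url, accept), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is on the exception.
        return err.code


# Hypothetical usage (not executed here): compare a suspect Accept value
# against the browser default.
# probe_status("https://api.example.invalid/geo", "text/html")
# probe_status("https://api.example.invalid/geo", "*/*")
```

Running both probes against the same endpoint and comparing the status codes isolates the Accept header as the only variable.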

The 406 Not Acceptable status code is a client error. It indicates that the content characteristics of the requested resource cannot satisfy the conditions in the request headers, so no response entity can be generated. (Translated from the HTTP protocol specification RFC document.)

We searched online and discussed the 406 error code with our backend colleagues, and learned that a 406 is returned when the Accept header in a request does not match the pre-agreed contract. The API service that reported the errors returns data in application/json format, but the Accept header in the failing requests indicated that the client did not support this format, so the service responded with 406.
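The check described above can be sketched roughly as follows. This is an illustration of the general rule, not the backend's actual implementation (which is not shown in this article). The function names are made up for the example, and q weights are ignored for simplicity.

```python
from typing import Optional


def media_range_matches(media_range: str, content_type: str) -> bool:
    """Return True if one Accept media range (e.g. "application/*") covers content_type."""
    range_type, _, range_subtype = media_range.partition("/")
    ctype, _, csubtype = content_type.partition("/")
    if range_type == "*":
        # "*/*" is the only valid form with a wildcard type.
        return range_subtype == "*"
    return range_type == ctype and range_subtype in ("*", csubtype)


def status_for(accept_header: Optional[str], produced: str = "application/json") -> int:
    """Return 200 if the Accept header allows `produced`, otherwise 406."""
    if not accept_header:
        # A request without an Accept header accepts any media type.
        return 200
    for item in accept_header.split(","):
        media_range = item.split(";")[0].strip().lower()
        if media_range and media_range_matches(media_range, produced):
            return 200
    return 406


print(status_for("*/*"))        # 200
print(status_for("text/html"))  # 406
```

A service that only produces application/json therefore answers 406 to any request whose Accept header lists neither that type nor a matching wildcard.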

We carefully checked the requests sent by common browsers and found that they all contain Accept: */*. It therefore seemed that the requests causing 406 were not sent by ordinary users. So who sent them?

Is it CDN?

CDN stands for Content Delivery Network. Its purpose is to let users fetch the content they need from a nearby location, relieving Internet congestion and improving the response speed of websites. A CDN caches the server's content on nodes distributed around the world and connects each user to a nearby node based on the user's IP address. (Quoted from google.com)

Nowadays a CDN is standard for most companies, and Hujiang is no exception. We carefully studied the source IPs of the requests that caused 406 and found that they all came from a few Beijing Unicom nodes. This made the CDN a prime suspect. There were two possibilities: 1. the Accept header in the original request was already wrong; 2. the Accept header in the original request was correct but was tampered with as it passed through a CDN node. Since we had run into CDN header tampering before, we initially judged it to be a CDN problem.
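Grouping the failing requests by source IP is a simple aggregation. Here is a minimal sketch, assuming each log record is a dict with `client_ip` and `status` fields. These field names and the sample addresses (taken from the reserved documentation ranges) are assumptions for illustration, not the actual schema of our log platform.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_sources(records: Iterable[Dict], status: int = 406, limit: int = 5) -> List[Tuple[str, int]]:
    """Count requests with the given status per source IP, most frequent first."""
    counts = Counter(r["client_ip"] for r in records if r["status"] == status)
    return counts.most_common(limit)


sample = [
    {"client_ip": "192.0.2.10", "status": 406},
    {"client_ip": "192.0.2.10", "status": 406},
    {"client_ip": "192.0.2.11", "status": 406},
    {"client_ip": "198.51.100.7", "status": 200},
]
print(top_sources(sample))  # [('192.0.2.10', 2), ('192.0.2.11', 1)]
```

When nearly all errors collapse onto a handful of addresses, as they did here, the source is almost certainly a proxy layer or an automated client rather than ordinary users.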

Next, we temporarily switched the Beijing Unicom nodes to go back to the origin, both to verify whether the CDN was tampering with the header and to obtain the real end-user IP. When we looked up this IP online, it was clearly identified as a crawler belonging to a certain search engine. So the 406 errors did not come from ordinary users after all, but from a search engine crawler.

Highlights

While writing this article over the past few days, I noticed that the error logs had dropped sharply and the 406 errors were gone. I assumed the search engine had mended its ways, so I searched the log platform for the IP address that had caused the errors. It turned out that the search engine had simply changed its strategy: its Accept header had been modified and its own identifier had been added to the UA header, so it suddenly looked like a regular search engine crawler again.

Summary

For developers: when a site sees a large number of 406 errors, don't worry too much. Check the logs carefully; the errors are very likely caused by a search engine crawler.

To summarize this 406 incident: when a search engine crawled Hujiang pages, the Accept header in its requests did not match what the backend service could return, resulting in a large number of 406 errors.

Appendix: the Accept header in detail

Accept

The Accept request header tells the server which content types the client can process. These content types are expressed as MIME types. (Quoted from MDN)

Content Type

text/html, application/xhtml+xml, and application/xml are all MIME types, which can also be called media types and content types.

For example, in Accept: application/json, application is the type and json is the subtype. This header means that the client can only receive responses of the application/json type. If the server cannot return a response of this type, it should return a 406 error.

The wildcard * represents any type

For example, Accept: */* means the client can handle all types.

Accept can list multiple types, separated by commas.

With the help of content negotiation, the server can choose one of many options to apply and use the Content-Type response header to inform the client of its choice.

For example, Accept: text/html, application/xhtml+xml, application/xml shows that the client can receive only three types of responses: text/html, application/xhtml+xml, and application/xml.
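Content negotiation over a multi-type Accept header can be sketched as below. This is a simplified illustration under stated assumptions: the function name is made up for the example, q weights are ignored, and wildcard matching is delegated to `fnmatch`, which is convenient for a sketch but is not a full implementation of the HTTP matching rules.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def choose_content_type(accept_header: str, available: Sequence[str]) -> Optional[str]:
    """Pick the first type the server can produce that the client accepts.

    Returns the value to send in the Content-Type response header,
    or None when nothing matches (the server would answer 406).
    """
    ranges = [item.split(";")[0].strip().lower() for item in accept_header.split(",")]
    for content_type in available:  # server's own preference order
        if any(r and fnmatchcase(content_type, r) for r in ranges):
            return content_type
    return None


accept = "text/html, application/xhtml+xml, application/xml"
print(choose_content_type(accept, ["application/json", "application/xml"]))  # application/xml
print(choose_content_type(accept, ["application/json"]))                     # None
```

The second call mirrors the incident in this article: a JSON-only service has nothing to offer a client that lists only HTML and XML types.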

Quality factor weight (q)

q is a value between 0 and 1 that is attached to a type after a ";". Its default value is 1, and q=0 means the type is not acceptable. The larger the q value, the more strongly the client prefers the type written before the ";".

It shows that the client prefers responses in text/html format, followed by application/xhtml+xml, and finally application/xml, */*.
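How q values order the client's preferences can be sketched with a small parser. The header string below is an illustrative value chosen to match the ordering just described; it is not taken from the original logs, and the function name is made up for the example.

```python
from typing import List, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, most preferred first."""
    parsed = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    pass  # malformed weight: keep the default in this sketch
        if q > 0:  # q=0 marks a type as not acceptable, so drop it
            parsed.append((media_range, q))
    # sorted() is stable, so types with equal q keep their header order.
    return sorted(parsed, key=lambda pair: -pair[1])


header = "text/html, application/xhtml+xml;q=0.9, application/xml;q=0.8, */*;q=0.8"
print(parse_accept(header))
# [('text/html', 1.0), ('application/xhtml+xml', 0.9), ('application/xml', 0.8), ('*/*', 0.8)]
```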
