A website log is a file, normally with the ".log" extension, that records raw information about the web server: the requests it receives and processes, and any runtime errors. Strictly speaking, it is a server log. Its greatest value is that it records how the website operates, from the running of the hosting space to every access request. Through the website log you can clearly see which page of your site a visitor requested, from which IP, at what time, with which operating system, browser, and screen resolution, and whether the request was handled successfully.

In other words, the website log is a record of how the server handles every request it receives, whether the request is processed normally or ends in an error. By analyzing the log files we can see the behavior of both users and search engine spiders on the site, and from that data judge their preferences and the overall health of the website. In log analysis, what we mainly need to study is spider behavior. During crawling and indexing, a search engine allocates a corresponding amount of crawl resources to a site according to its weight. A search-engine-friendly website should make full use of those resources, so that spiders can quickly, accurately, and comprehensively crawl the valuable content users want, without wasting resources on useless content or pages with abnormal access.

Website log data analysis and interpretation:

1. Number of visits, dwell time, and crawl volume

From these three figures we can derive the average number of pages crawled per visit, the dwell time per crawled page, and the average dwell time per visit:

Average pages crawled per visit = total crawl volume / number of visits
Dwell time per crawled page = dwell time per visit / pages crawled per visit
Average dwell time per visit = total dwell time / number of visits

These numbers reflect the spider's activity, its affinity for the site, and its crawl depth. The higher the total visits, dwell time, crawl volume, average pages crawled, and average dwell time, the more popular the website is with search engines. The dwell time per crawled page indicates page response speed: the longer it is, the slower the site, which hurts crawling and inclusion. We should do our best to speed up page loading and reduce per-page dwell time, so that the same crawl resources can fetch and index more pages. From the same data we can also plot the site's overall trends over a period of time, such as the spider visit trend, dwell time trend, and crawl volume trend.

2. Directory crawl statistics

Log analysis shows which directories on the site the spiders favor, how deep they crawl, how well important page directories are crawled, and how much effort is spent on invalid page directories. By comparing crawling with actual inclusion for the pages in each directory, we can uncover further problems. For important directories, we should raise their weight and crawl rate through on-site and off-site adjustments; invalid pages we block in robots.txt. Statistics gathered over multiple days also show the effect of on-site and off-site actions on each directory, whether the optimization is reasonable, and whether it has achieved the expected results. For the same directory observed over a long period, we can see how its pages perform and infer the reasons from the spiders' behavior.
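All of the figures in points 1 and 2 come straight out of the raw access log. As a rough illustration of how they can be extracted, here is a minimal Python sketch; it assumes the common Apache/Nginx combined log format and a hypothetical file name access.log, identifies spiders by a few User-Agent keywords, and simply counts spider requests in total and per directory. It is a sketch of the idea, not a finished analysis tool.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Typical Apache/Nginx "combined" log line (assumed format):
# 1.2.3.4 - - [10/Oct/2023:13:55:36 +0800] "GET /news/123.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; ...)"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) "[^"]*" "(?P<agent>[^"]*)"'
)

SPIDER_KEYWORDS = ("Baiduspider", "Googlebot", "bingbot", "Sogou")

def spider_stats(log_file="access.log"):
    """Count spider requests overall and per directory (crawl volume)."""
    total = 0
    per_dir = Counter()
    with open(log_file, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LOG_RE.match(line)
            if not m:
                continue
            if not any(k in m.group("agent") for k in SPIDER_KEYWORDS):
                continue  # keep only search engine spiders
            total += 1
            path = urlparse(m.group("path")).path
            # "/news/123.html" -> "/news/", "/index.html" -> "/"
            directory = path.rsplit("/", 1)[0] + "/"
            per_dir[directory] += 1
    return total, per_dir

if __name__ == "__main__":
    total, per_dir = spider_stats()
    print("total spider crawl volume:", total)
    for directory, count in per_dir.most_common(10):
        print(directory, count)
```

Dwell time and the per-visit averages above would additionally require grouping these requests into visits, for example by gaps between timestamps from the same spider, which is left out here for brevity.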
3. Page crawling

In the log we can see the specific pages each spider fetched. From these we can identify pages that should be blocked from crawling, pages with no value worth indexing, duplicate URLs that are being crawled repeatedly, and so on. To make full use of crawl resources, these addresses should be disallowed in robots.txt. The log also helps explain why certain pages are not included: for a new article, was it never crawled at all, or was it crawled but not yet released? Some pages have little reading value but may still be needed as crawl channels; for those, should we add a noindex tag? On the other hand, would a spider really be so clumsy that it has to rely on such meaningless channel pages to discover content? Doesn't it understand sitemaps? I still have doubts about this.

4. Spider access IP

Someone once suggested using the spider's IP segment to judge whether a site has been demoted. I do not find this very meaningful, because it is pure hindsight; a demotion is better judged from the first three data items above, and a single IP segment tells you little. IP analysis is more useful for detecting scraper spiders, fake spiders, malicious click bots, and the like.

5. Access status codes

Spiders frequently encounter status codes such as 301 and 404 in the log. These must be dealt with promptly to avoid harming the site.

6. Crawl time period

By comparing each spider's daily crawl volume we can find the hours in which a given spider is most active on this site, and weekly comparisons reveal its activity cycle across the week. Knowing this gives some guidance on when to publish and update content; the earlier folklore about fixed "small update" days (the so-called "small three, small four") is not scientific.
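To tie points 4 to 6 back to the log as well, here is a second sketch in the same spirit as the one above: it tallies the status codes a given spider hits and its crawl volume per hour, and includes a reverse-DNS helper for checking whether an IP claiming to be Baiduspider is genuine. The file name access.log and the User-Agent keyword are again assumptions, and the accepted host suffixes (baidu.com / baidu.jp) follow Baidu's published guidance but should be verified before relying on them.

```python
import re
import socket
from collections import Counter

# Same combined-log assumptions as the sketch above; "access.log" is hypothetical.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^:]+:(?P<hour>\d{2})[^\]]*\] '
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def is_real_baiduspider(ip):
    """Reverse-DNS check: genuine Baiduspider hosts resolve under baidu.com/baidu.jp."""
    try:
        host = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    return host.endswith(".baidu.com") or host.endswith(".baidu.jp")

def status_and_hours(log_file="access.log", spider="Baiduspider"):
    """Tally status codes and hourly crawl volume for one spider."""
    status_counts, hourly_counts = Counter(), Counter()
    with open(log_file, encoding="utf-8", errors="replace") as f:
        for line in f:
            m = LOG_RE.match(line)
            if not m or spider not in m.group("agent"):
                continue
            status_counts[m.group("status")] += 1
            hourly_counts[m.group("hour")] += 1
    return status_counts, hourly_counts

if __name__ == "__main__":
    statuses, hours = status_and_hours()
    print("status codes:", dict(statuses))   # e.g. many 404s point to dead links to fix
    for hour in sorted(hours):               # shows the spider's active periods by hour
        print(hour, hours[hour])
```

In practice you would run the reverse-DNS check only on suspicious IPs pulled from the log, since resolving every address is slow.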