What is a website log? How does a website log help SEO?

A website log is a file with the ".log" extension that records raw information about the web server: the requests it receives and processes, runtime errors, and so on. Strictly speaking, it is a server log. Its greatest value is that it records how the site actually operates, such as how the hosting space behaves and every access request it handles. From the website log you can see exactly which page of your site a user visited, from which IP, at what time, with which operating system, browser, and screen resolution, and whether the visit succeeded.

In other words, a website log records how the server hosting the site handled every request it received from users. Whether a request was processed normally or produced an error, it is written to the website log, a file that uses .log as its extension.

By analyzing website log files, we can see the behavior of both users and search engine spiders on the site. This data lets us gauge how users and spiders take to the site and how healthy the site is. In website log analysis, spider behavior is what we mainly need to examine.
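For readers who want to work with the raw log directly, here is a minimal sketch in Python, assuming the common Apache/Nginx "combined" log format (the sample line and bot names are purely illustrative). It parses one entry and tells a spider hit apart from a normal user visit:

```python
import re

# One line in the common Apache/Nginx "combined" log format (a made-up example).
SAMPLE = ('1.2.3.4 - - [10/Oct/2023:13:55:36 +0800] "GET /news/123.html HTTP/1.1" '
          '200 5120 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; '
          '+http://www.baidu.com/search/spider.html)"')

# Regex for the combined format: IP, identity, user, time, request, status, bytes, referer, user-agent.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match the format."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

record = parse_line(SAMPLE)
print(record["ip"], record["status"], record["request"])

# A simple user-agent check separates user visits from search engine spider visits.
is_spider = any(bot in record["agent"] for bot in ("Baiduspider", "Googlebot", "bingbot"))
print("spider visit" if is_spider else "user visit")
```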

When crawling and indexing, a search engine allocates a certain amount of crawl resources to a site according to its weight. A search-engine-friendly site should make full use of these resources so that spiders can quickly, accurately, and comprehensively crawl the valuable content users want, rather than waste them on worthless pages or pages with access problems.

Website log data analysis and interpretation:

1. Number of visits, dwell time, and crawl volume

From these three figures we can derive the average number of pages crawled per visit, the dwell time per crawled page, and the average dwell time per visit:

Average pages crawled per visit = total crawl volume / number of visits

Dwell time per crawled page = total dwell time / total crawl volume (i.e. dwell time per visit / pages crawled per visit)

Average dwell time per visit = total dwell time / number of visits
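As a quick illustration, this small sketch computes the three averages from a day's totals (the numbers are made up, not real data):

```python
# Daily spider totals pulled from the log (illustrative values only).
total_crawl_volume = 1800   # pages fetched by the spider that day
number_of_visits = 120      # distinct spider visits
total_dwell_time = 5400     # seconds the spider spent on the site in total

pages_per_visit = total_crawl_volume / number_of_visits   # average pages crawled per visit
dwell_per_visit = total_dwell_time / number_of_visits     # average dwell time per visit (s)
dwell_per_page = total_dwell_time / total_crawl_volume    # dwell time per crawled page (s)

print(f"pages/visit: {pages_per_visit:.1f}, "
      f"dwell/visit: {dwell_per_visit:.1f}s, dwell/page: {dwell_per_page:.2f}s")
```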

These figures show the spider's activity level, its affinity for the site, its crawl depth, and so on. The higher the total number of visits, the total dwell time, the crawl volume, the average pages crawled per visit, and the average dwell time per visit, the more the search engine favors the site. The dwell time per crawled page, on the other hand, reflects page load speed: the longer it is, the slower the site responds, which hurts crawling and indexing. We should do our best to speed up page loading and reduce the per-page dwell time, so the same crawl resources can fetch and index more pages.

In addition, from these figures we can chart the site's overall trends over a period of time, such as the spider visit trend, the dwell time trend, and the crawl volume trend.
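One simple way to build such a trend is to count the spider's crawl volume per day and watch how it moves across the weeks. A minimal sketch, using hypothetical parsed records:

```python
from collections import Counter

# Hypothetical (date, crawled URL) pairs extracted from the log for one spider.
spider_hits = [
    ("2023-10-08", "/news/1.html"), ("2023-10-08", "/news/2.html"),
    ("2023-10-09", "/news/3.html"), ("2023-10-10", "/news/1.html"),
]

# Daily crawl volume; plotting these counts over weeks shows the crawling trend.
daily_crawl = Counter(date for date, _ in spider_hits)
for date in sorted(daily_crawl):
    print(date, daily_crawl[date])
```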

2. Directory crawling statistics

Through log analysis we can see which directories of the site the spiders favor, how deep they crawl into each directory, how well important page directories are being crawled, and how much crawling is wasted on invalid page directories. By comparing crawling against indexing for the pages in each directory, we can uncover further problems. Important directories should be given more weight and more crawling through on-site and off-site adjustments; invalid pages should be blocked in robots.txt.
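One way to get directory-level crawl statistics is to group the spider's crawled URLs by their top-level directory. A minimal sketch with hypothetical URLs:

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical URLs taken from spider hits in the log.
crawled_urls = ["/news/1.html", "/news/2.html", "/product/a.html", "/tag/old?page=3"]

def top_directory(url):
    """Return the first path segment, e.g. '/news/' for '/news/1.html'."""
    path = urlsplit(url).path
    parts = [p for p in path.split("/") if p]
    return "/" + parts[0] + "/" if parts else "/"

dir_counts = Counter(top_directory(u) for u in crawled_urls)
print(dir_counts.most_common())   # which directories the spider favors
```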

In addition, by collecting log statistics over many days, we can see how on-site and off-site actions affect each directory, whether the optimization is reasonable, and whether it achieved the expected results. For a given directory, tracking it over a long period shows how its pages perform, and the behavior data lets us infer why.

3. Page crawling

In the website log we can see exactly which pages the spider has crawled. Among them, we can identify pages that should be blocked from crawling, pages with no indexing value that were crawled anyway, duplicate URLs that were crawled, and so on. To make full use of crawl resources, these addresses should be disallowed in robots.txt.
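To spot duplicate URLs that waste crawl resources, one approach is to group crawled URLs by path and flag any path fetched with several different query strings. The sketch below uses hypothetical URLs, and the robots.txt rule in the comment is only an example you would adapt to your own URL scheme:

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical crawled URLs from the log: the same path fetched with different
# query strings usually means the spider is spending resources on duplicates.
crawled_urls = ["/list.html?sort=price", "/list.html?sort=date", "/list.html", "/about.html"]

by_path = defaultdict(set)
for u in crawled_urls:
    parts = urlsplit(u)
    by_path[parts.path].add(parts.query)

for path, queries in by_path.items():
    if len(queries) > 1:
        # Candidate for a robots.txt rule such as "Disallow: /*?sort=" (adjust to your site).
        print("duplicate variants crawled:", path, queries)
```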

We can also analyze why certain pages are not indexed. For new articles, is it because they were never crawled, or because they were crawled but not yet released? Some pages offer little value to readers, yet we may still need them as crawl paths; for such pages, should we add a noindex tag? Then again, would spiders really be so clumsy as to depend on these meaningless channel pages to reach other pages? Don't spiders understand sitemaps? I still have doubts about this.

4. Spider access IP

Some people have suggested using the spider's IP segment to judge whether a site has been demoted. I do not find this very meaningful, because it is pure hindsight, and a demotion should be judged mainly from the first three kinds of data above; a single IP segment tells you little. IP analysis is more useful for spotting scraping spiders, fake spiders, malicious click bots, and the like.
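A common way to separate genuine spiders from fake ones is the reverse-then-forward DNS check (Google, for example, documents this method for verifying Googlebot). The sketch below is a rough illustration only; the example IP is illustrative and the result depends on live DNS:

```python
import socket

def looks_like_real_googlebot(ip):
    """Reverse-resolve the IP, check the host name belongs to Google's crawler
    domains, then resolve that name forward and confirm it maps back to the IP."""
    try:
        host = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    if not (host.endswith(".googlebot.com") or host.endswith(".google.com")):
        return False
    try:
        return ip in socket.gethostbyname_ex(host)[2]
    except OSError:
        return False

# Usage sketch: run this for log entries whose user-agent claims to be Googlebot.
print(looks_like_real_googlebot("66.249.66.1"))  # example IP; outcome depends on DNS
```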

5. Access status code

Spider requests frequently return status codes such as 301 and 404. These must be dealt with promptly so they do not harm the site.
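A quick way to keep an eye on these codes is to tally the status codes of spider hits and pull out the 404 URLs. A minimal sketch with hypothetical data:

```python
from collections import Counter

# Hypothetical (url, status) pairs for spider hits extracted from the log.
spider_hits = [("/a.html", "200"), ("/old.html", "404"), ("/moved.html", "301"), ("/b.html", "200")]

status_counts = Counter(status for _, status in spider_hits)
print(status_counts)   # e.g. Counter({'200': 2, '404': 1, '301': 1})

# List the 404 URLs so they can be fixed or redirected promptly.
print([url for url, status in spider_hits if status == "404"])
```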

6. Crawl time period

By analyzing and comparing each spider's crawl volume hour by hour across days, we can see when a particular spider is most active on the site; comparing data across weeks shows its activity cycle over the week. Knowing this gives some guidance on when to publish site updates, and the old folklore about fixed "minor update" days and times is not scientific.
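To find a spider's active hours, you can bucket its fetches by hour of day. The sketch below parses a few hypothetical timestamps in the combined-log time format:

```python
from collections import Counter
from datetime import datetime

# Hypothetical spider fetch timestamps in the combined-log time format.
times = ["10/Oct/2023:02:14:01 +0800", "10/Oct/2023:02:47:33 +0800", "10/Oct/2023:14:05:12 +0800"]

hours = Counter(datetime.strptime(t, "%d/%b/%Y:%H:%M:%S %z").hour for t in times)
for hour in sorted(hours):
    print(f"{hour:02d}:00  {hours[hour]} fetches")   # the busiest hours are the spider's active period
```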
