Why does Toutiao always know what you like to watch?


Many apps, such as Toutiao and Qingmang Reading, offer personalized content recommendations. Why do they feel so different when they all do personalized recommendation? Today's article briefly explains, from three angles, how such apps recommend personalized content and judge whether those recommendations are any good.

I don't know how Toutiao works internally. However, I was responsible for the personalized recommendation and ranking of News Feed when I worked at Facebook, so I can describe how Facebook measures the quality of its recommendations and ranking.

In practice, there are three main ways to evaluate a recommendation engine: machine learning model metrics, product data, and user surveys.

1. Machine Learning Model

Machine learning is at the core of a recommendation engine (everyone now calls it artificial intelligence, but it is still essentially supervised learning). Academia already has a mature set of practical methods for assessing the quality of a machine learning model.

Whether you are selecting a model (for example, replacing a decision tree with a neural network) or iterating on one (for example, training on twice as much data), you can use standard supervised learning metrics. The most common one is AUC, the area under the ROC curve.
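As a rough illustration of comparing two candidate models by AUC, here is a minimal sketch. It uses the pairwise definition of AUC (the probability that a randomly chosen positive example is scored above a randomly chosen negative one). The labels and the scores for `model_a` and `model_b` are made-up values for illustration, not data from Facebook.

```python
def auc(labels, scores):
    """Pairwise AUC: chance that a random positive outranks a random negative.

    Ties count as half a win. This is O(P * N), which is fine for a toy example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = the user engaged with the item)
# and the scores two candidate models assigned to the same items.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.4, 0.8, 0.3, 0.5, 0.2, 0.7, 0.6]
model_b = [0.9, 0.3, 0.8, 0.6, 0.4, 0.2, 0.7, 0.5]

auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
print(f"model A AUC = {auc_a:.4f}, model B AUC = {auc_b:.4f}")
```

On this toy data, model B ranks every engaged item above every non-engaged one, so it gets the higher AUC. In a real system the comparison would run on a large held-out set.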

There are also finer-grained indicators for specific questions. For example, feature importance can tell you whether a newly added feature is actually useful.
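The text does not say which importance measure was used. One common choice is permutation importance: shuffle one feature's values and see how much the model's AUC drops. The sketch below assumes a hypothetical, already trained linear scorer (`WEIGHTS`) and toy rows, purely for illustration.

```python
import random


def auc(labels, scores):
    """Pairwise AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical "trained" model: it relies on feature 0 and ignores feature 1.
WEIGHTS = (1.0, 0.0)


def model_score(row):
    return sum(w * x for w, x in zip(WEIGHTS, row))


def permutation_importance(rows, labels, feature_idx, repeats=20, seed=0):
    """Average AUC drop when one feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = auc(labels, [model_score(r) for r in rows])
    drops = []
    for _ in range(repeats):
        column = [r[feature_idx] for r in rows]
        rng.shuffle(column)
        shuffled = [
            r[:feature_idx] + (v,) + r[feature_idx + 1:]
            for r, v in zip(rows, column)
        ]
        drops.append(baseline - auc(labels, [model_score(r) for r in shuffled]))
    return sum(drops) / repeats


# Toy data: (existing feature, newly added feature) per example.
rows = [
    (0.9, 0.1), (0.8, 0.7), (0.7, 0.3), (0.6, 0.9),
    (0.4, 0.2), (0.3, 0.8), (0.2, 0.5), (0.1, 0.6),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

importance_existing = permutation_importance(rows, labels, feature_idx=0)
importance_new = permutation_importance(rows, labels, feature_idx=1)
print(importance_existing, importance_new)
```

Here the new feature has zero importance because the model never uses it, which is the kind of signal that tells you an added feature is not pulling its weight.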

2. Product Data

No matter how strong a machine learning model is, it must be tested against actual product data. This is the part everyone is familiar with: KPIs. But at Facebook, especially somewhere like News Feed where one change can affect the entire system, we track a range of metrics to describe the product rather than relying on a single one.

These metrics include, but are not limited to:

  • Daily/monthly active users (DAU, MAU)
  • User interaction (likes, comments, reposts, etc.)
  • User post count
  • User dwell time and amount of content consumed
  • Revenue
  • User interaction rate (e.g. the share of viewed content that gets liked, commented on, read at length, or saved)
  • Number of user reports and blocks
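To make the "range of metrics rather than a single number" idea concrete, here is a small sketch that derives several of the metrics above from one event log. The log format, the event names, and the `daily_scorecard` helper are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical one-day event log: (user_id, event_type).
events = [
    ("u1", "view"), ("u1", "like"), ("u1", "view"),
    ("u2", "view"), ("u2", "report"),
    ("u3", "view"), ("u3", "comment"), ("u3", "view"), ("u3", "post"),
]

# Event types counted as interactions in this sketch.
INTERACTIONS = {"like", "comment", "share"}


def daily_scorecard(events):
    """Summarize one day of events as several metrics side by side."""
    counts = Counter(kind for _, kind in events)
    views = counts["view"]
    interactions = sum(counts[k] for k in INTERACTIONS)
    return {
        "active_users": len({user for user, _ in events}),
        "views": views,
        "interactions": interactions,
        "interaction_rate": interactions / views if views else 0.0,
        "posts": counts["post"],
        "reports": counts["report"],
    }


scorecard = daily_scorecard(events)
for name, value in scorecard.items():
    print(f"{name}: {value}")
```

Reading the metrics together matters: a change that raises the interaction rate while also raising reports tells a different story from one that raises the interaction rate alone.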

Moreover, for rapid day-to-day iteration and A/B testing, these aggregate metrics are not enough. We also need more detailed data to really understand what a change does. For example:

  • How the distribution of content types changes: the share of original versus reshared content, the share of web links versus pictures and videos, the share of long versus short videos, etc.
  • How public accounts are affected: what kinds of public accounts benefit from the change
  • Which major third parties are affected, and whether the impact is reasonable: for example, my first project as an intern at FB was cracking down on spam accounts. That change hit Zynga hard (because Zynga relied heavily on users pestering their friends to drive traffic), but everyone agreed it was reasonable, so we asked the public relations team to communicate it and then shipped it.
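The first item in the list, the shift in content-type distribution, can be sketched as a comparison of impression shares between the control and treatment groups of an A/B test. The impression lists below are invented for illustration, and `content_mix` and `mix_shift` are hypothetical helper names.

```python
from collections import Counter


def content_mix(impressions):
    """Share of impressions per content type."""
    counts = Counter(impressions)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


def mix_shift(control, treatment):
    """Change in share (treatment minus control) for every content type."""
    before, after = content_mix(control), content_mix(treatment)
    kinds = sorted(set(before) | set(after))
    return {k: after.get(k, 0.0) - before.get(k, 0.0) for k in kinds}


# Hypothetical impressions logged for each arm of an A/B test.
control = ["photo"] * 5 + ["link"] * 3 + ["video"] * 2
treatment = ["photo"] * 4 + ["link"] * 2 + ["video"] * 4

shift = mix_shift(control, treatment)
for kind, delta in shift.items():
    print(f"{kind}: {delta:+.0%}")
```

In this made-up case the change moves about 20 points of share to video at the expense of photos and links, which is the sort of side effect an aggregate metric would hide.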

In addition, to guard against chasing short-term gains, we maintain a long-term holdout for every important product decision so that we can evaluate the decision's long-term impact. For example:

  • For the decision to put ads in the feed, we select a small group of users who are shown no ads for a long period, then compare their activity with that of users who see ads normally to measure the long-term impact of advertising.
  • Similarly, for the decision to rank News Feed at all, we keep a holdout group whose feeds are sorted purely chronologically.
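A long-term holdout needs a stable assignment, so that the same small set of users stays in the group for months. The sketch below shows one common way to do that, by hashing the user ID together with an experiment name. This is an assumption about how such a holdout could be built, not a description of Facebook's system; the experiment name, the 1% size, and the session numbers are all hypothetical.

```python
import hashlib


def in_holdout(user_id, experiment, holdout_pct=1.0):
    """Deterministically place roughly holdout_pct percent of users in the holdout."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) % 10000  # stable bucket in 0..9999
    return bucket < holdout_pct * 100


def relative_lift(treated_values, holdout_values):
    """Relative difference of the treated mean versus the holdout mean."""
    treated = sum(treated_values) / len(treated_values)
    holdout = sum(holdout_values) / len(holdout_values)
    return (treated - holdout) / holdout


# The same user always lands in the same group for a given experiment.
print(in_holdout("u42", "feed_ads"))

# Hypothetical weekly sessions per user: users who see ads vs. the no-ads holdout.
lift = relative_lift([4.0, 5.0], [5.0, 5.0])
print(f"activity of users who see ads vs. holdout: {lift:+.0%}")
```

Because the assignment depends only on the user ID and the experiment name, it can be recomputed at any time without storing a membership list, and different experiments get independent holdout groups.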

This way, for every potentially controversial decision, we know at any later point exactly what trade-off we are making. With that safety net, we can afford to take more risks and move faster when making decisions.

3. User Survey

Most product data is limited because it is explicit and passive: it records what users do, not what they think. For example, if you push a piece of vulgar, attention-grabbing content to a user, the user may well click and read it in the moment, so the numbers look good.

However, the user may privately think poorly of that content and, by extension, of the product as a content platform. In the long run this does the product great harm.

There is a consensus in Silicon Valley internet circles that KPIs cannot fully describe product quality, but each company has a different answer to how to solve this problem.

Twitter's CEOs, whether Jack Dorsey or Evan Williams, have tended to play down KPIs and rely on their own subjective judgment to make decisions.

Google and Facebook took another approach and decided to incorporate user ratings into their KPIs.

Google started working in this area relatively early, so there is more public information about it. In general, they hire a large number of ordinary people to give subjective scores, from a user's perspective, to the quality of Google's search rankings and ad recommendations.

Once the volume of ratings is large enough, the data becomes a stable, effective KPI that can be continuously tracked and improved. Facebook has taken a similar approach to personalized recommendations, albeit in a different product area.
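One way to read "large enough" is through the standard error of the mean rating, which shrinks as more ratings come in. The sketch below uses made-up 1-to-5 ratings; the scale and the `rating_summary` helper are assumptions for illustration, not a description of either company's rating program.

```python
import math


def rating_summary(scores):
    """Return (mean, standard error of the mean) for a list of rater scores."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    return mean, math.sqrt(variance / n)


# Hypothetical 1-5 quality ratings for the same ranking change.
few_ratings = [4, 5, 3, 4, 4]
many_ratings = few_ratings * 10  # same spread, ten times the volume

mean_few, se_few = rating_summary(few_ratings)
mean_many, se_many = rating_summary(many_ratings)
print(f"5 ratings:  mean={mean_few:.2f} +/- {se_few:.2f}")
print(f"50 ratings: mean={mean_many:.2f} +/- {se_many:.2f}")
```

With the same spread of opinions, the larger sample pins down the average far more tightly, which is what makes a subjective score usable as a tracked metric.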

To close this answer, I want to reiterate two principles:

  • Never rely on a single KPI to evaluate work on a product. No single KPI can fully describe any product.
  • As long as the limitations of KPIs are clearly understood, numbers can put an end to most meaningless wrangling, whether technical or political.

 


Author: @宋一松. Compiled and published by Qinggua Media. Please credit the author and source when reprinting.
