Cold start is the first hurdle for any recommendation system. Recommendation systems generally need a large amount of data to make accurate recommendations, and at cold start that data does not exist yet. An app's cold start may directly determine whether a new user keeps using it, and the cold start of a new item affects the enthusiasm of its producer, so handling it well matters. Cold start problems are divided into 3 categories:
User Cold Start Idea

The most common user cold start scenario is the arrival of a new user. A new user becomes an established user along this path: interest acquisition (building an initial profile for the cold start user) -> content consumption and interest convergence -> settled interests, at which point the user counts as an established user. The first step is generally to "do everything possible" to obtain a user profile, or to let users generate one themselves. There are several methods to consider.

Use the user's social attributes, such as gender, age, and region. When a user opens an app for the first time, many apps prompt for this information or leave an entry point for filling it in. Even if the user does not enter anything, you can try to bring in profile information from external channels (channel profiles, matrix profiles, applist, etc.), while paying attention to user overlap and relevance. With this information, coarse-grained personalized recommendations can be made based on social attributes.

Use the user's relationship chain. It can be collected through operational activities (such as Alipay campaigns that collect friend and parent-child relationships) or brought in from outside (third-party login or open APIs). Content that friends like is then recommended to the user, on the principle that "birds of a feather flock together".

Use popular content. When "nothing" is known about the user, herd mentality and the 80/20 rule suggest recommending popular content. This method depends on how the scope and algorithm of popularity are defined, and it performs better than random recommendation. The same goes for high-quality content. (Figure: examples from Weibo, left, and Toutiao, right.)

User cold start can be measured by the profile indicators of new users (average number of interests, profile coverage, profile accuracy, etc.) and by the activity of new users (click-through rate, retention, etc.).

Suppose an international app can recommend better from the start if it knows nationality and gender. How can it obtain this information?
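The social-attribute and popular-content methods above can be combined into one fallback chain. The following is a minimal sketch, not the author's implementation: the click log format, the attribute names, and the function names `build_popularity` and `recommend_new_user` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_popularity(click_log):
    """Count clicks per item overall and per (attribute, value) segment.

    click_log: iterable of (profile_dict, item_id) pairs (hypothetical format).
    """
    overall = Counter()
    by_segment = defaultdict(Counter)
    for profile, item in click_log:
        overall[item] += 1
        for key, value in profile.items():
            by_segment[(key, value)][item] += 1
    return overall, by_segment


def recommend_new_user(profile, overall, by_segment, k=3):
    """Coarse-grained recommendation for a user with no behavior data.

    Items are scored by clicks from users who share the new user's
    attributes. If the profile is empty or unseen, the list is filled
    with globally popular items instead.
    """
    scores = Counter()
    for key, value in profile.items():
        scores.update(by_segment.get((key, value), {}))
    ranked = [item for item, _ in scores.most_common(k)]
    # Pad with global popularity when the segments give too few items.
    for item, _ in overall.most_common():
        if len(ranked) >= k:
            break
        if item not in ranked:
            ranked.append(item)
    return ranked


# Hypothetical click log from existing users.
log = [
    ({"gender": "f", "region": "US"}, "yoga"),
    ({"gender": "f", "region": "US"}, "yoga"),
    ({"gender": "f", "region": "JP"}, "anime"),
    ({"gender": "m", "region": "JP"}, "anime"),
    ({"gender": "m", "region": "JP"}, "anime"),
    ({"gender": "m", "region": "US"}, "nfl"),
]
overall, by_segment = build_popularity(log)
print(recommend_new_user({}, overall, by_segment, k=2))                # nothing known
print(recommend_new_user({"region": "JP"}, overall, by_segment, k=2))  # region only
```

A user with an empty profile receives only popular items, while a user with even one known attribute gets the segment's favorites first. How popularity is scoped (time window, category, region) is left open here, as it is in the text.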
This type of hidden exploration requires skill in selecting items:
Item Cold Start Idea

Recommend by using item content:
Introduction to related algorithms

What are the commonly used algorithms at this stage? Assume that user A is a new user with only a sparse profile.
UserCF and ItemCF use the same user behavior data but aggregate it along different dimensions. In a user-by-item click matrix (1 means the user clicked on the item), UserCF computes similarity between users along the rows, and ItemCF computes similarity between items along the columns.

Both UserCF and ItemCF suffer from the "first mover" problem at cold start. With UserCF, a new item must first appear in some user's display list so that people can give feedback on it and it can spread. The first mover problem is therefore the question of where the first user discovers the new item. ItemCF computes item similarity from user behavior at intervals, because the logs are huge and the computation is time-consuming: if a large number of users have viewed both item a and item b, the two items are considered similar. The output is an item relevance matrix. A newly added item does not enter that matrix automatically; a user must first discover it.

ContentItemKNN computes the item relevance table from the content features of items. The table can be updated frequently and there is no first mover problem. However, it ignores user behavior and therefore the patterns contained in it. Its results are low in accuracy and high in novelty, and it generally performs worse than collaborative filtering. If user behavior is strongly driven by a particular content feature, though, content filtering has its strengths.

Is there a way to solve the first mover problem described above, that is, the item cold start problem, also known as "new item trial"? Trialing new items means "items looking for users". With "users looking for items", the Matthew effect is likely to occur: popular categories get most of the exposure and the long tail is severe. There are two ways for items to find users:
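The contrast drawn above between UserCF, ItemCF and ContentItemKNN can be seen on a toy click matrix. This is a minimal sketch under stated assumptions: the matrix, the content tag vectors, and the choice of cosine similarity are illustrative and are not taken from the article.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def column(matrix, j):
    """Return column j of a list-of-rows matrix."""
    return [row[j] for row in matrix]


# Rows are users A..D, columns are items a..d; 1 means the user clicked the item.
# Item d is new, so nobody has clicked it yet.
clicks = [
    [1, 1, 0, 0],  # user A
    [1, 1, 1, 0],  # user B
    [0, 1, 1, 0],  # user C
    [1, 0, 0, 0],  # user D
]

# Hypothetical content features per item (for example, one-hot tags).
content = [
    [1, 0, 1],  # item a
    [1, 0, 0],  # item b
    [0, 1, 0],  # item c
    [1, 0, 1],  # item d, same tags as item a
]

# UserCF compares rows of the click matrix.
user_sim_ab = cosine(clicks[0], clicks[1])

# ItemCF compares columns of the click matrix.
item_sim_ab = cosine(column(clicks, 0), column(clicks, 1))
item_sim_da = cosine(column(clicks, 3), column(clicks, 0))  # new item d vs item a

# ContentItemKNN compares content features and needs no clicks.
content_sim_da = cosine(content[3], content[0])

print(user_sim_ab, item_sim_ab, item_sim_da, content_sim_da)
```

Item d has an all-zero column, so its ItemCF similarity to every other item is zero and it cannot be recommended until someone clicks it. Its content similarity to item a is high from the moment it is created, which is why content features are a common entry point for new items.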
Suppose we define new items as those with fewer than 500 exposures (information products usually also have a time limit, such as within 6 hours). New items and users are represented as multi-dimensional vectors, the distance between the vectors is calculated, and on that basis the items are distributed to more active users, with ranking weights and re-ranking restrictions applied during the cold start phase. During the cold start phase an item settles at a stable click-through rate (or another composite indicator), which becomes the basis for its subsequent traffic distribution. Based on click-through performance on a small amount of traffic, items that perform well enter the next, larger traffic pool, and items that perform poorly are eliminated or downgraded. This gradient traffic distribution strategy is a fairly common "horse racing mechanism" in personalized recommendation. What indicators are used to evaluate the cold start effect of items?
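The vector matching and gradient traffic pools described above can be sketched roughly as follows. Only the 500-exposure threshold comes from the text; the larger pool sizes, the click-through threshold, the `Item` fields, and the function names are hypothetical values chosen for illustration.

```python
import math
from dataclasses import dataclass

# Exposure budget of each traffic pool. The first value follows the
# 500-exposure definition of a new item; the others are hypothetical.
POOL_SIZES = [500, 5000, 50000]
PROMOTE_CTR = 0.05  # hypothetical click-through rate needed to move up a pool


@dataclass
class Item:
    item_id: str
    vector: list
    exposures: int = 0
    clicks: int = 0
    pool: int = 0

    @property
    def ctr(self):
        return self.clicks / self.exposures if self.exposures else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pick_trial_users(item, active_users, n):
    """Return the ids of the n active users whose vectors best match the item.

    active_users: dict mapping user id -> interest vector.
    """
    ranked = sorted(
        active_users.items(),
        key=lambda kv: cosine(item.vector, kv[1]),
        reverse=True,
    )
    return [user_id for user_id, _ in ranked[:n]]


def next_pool(item):
    """Decide the item's pool once its current exposure budget is used up.

    Returns the current pool while data is still being collected, the next
    larger pool if the click-through rate is high enough, or -1 to mark the
    item as eliminated or downgraded.
    """
    if item.exposures < POOL_SIZES[item.pool]:
        return item.pool
    if item.ctr >= PROMOTE_CTR:
        return min(item.pool + 1, len(POOL_SIZES) - 1)
    return -1


users = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [0.9, 0.1]}
new_item = Item("i1", [1.0, 0.0])
print(pick_trial_users(new_item, users, 2))

new_item.exposures, new_item.clicks = 500, 40
print(next_pool(new_item))
```

A real system would also need the time limit mentioned in the text and a rule for what "downgraded" means; both are omitted here to keep the sketch short.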
We also need to pay attention to the user's contextual information, including time and location, and to follow some hard rules. For example, in an e-commerce app, a new user who logs in during the summer should not be shown down jackets, and a new user who logs in during the Mid-Autumn Festival should not be shown content about the Dragon Boat Festival. This applies beyond the cold start phase: the user's context should be taken into account throughout the recommendation scenario.

Author: Zhang Xiaomiao Miu
Source: Zhang Xiaomiao Miu