Collaborative Filtering

Collaborative filtering (CF) and its variants are among the most commonly used recommendation algorithms. Even a beginner in data science can use it to build a personalized movie recommendation system, for example as a resume project.

When we want to recommend something to a user, the most logical approach is to find users with similar tastes, analyze their behavior, and recommend the same items to the target user. Alternatively, we can focus on items similar to what the user has bought before and recommend similar products. These are the two basic flavors of collaborative filtering: user-based CF and item-based CF.

In both cases, the recommendation algorithm consists of two steps:

1. Find the users/items in the database that are most similar to the target user/item.
2. Predict the target user's rating for an item as a weighted average of the ratings given by the most similar users (or of the user's ratings for the most similar items), weighted by similarity.

What does "most similar" mean in this algorithm? For each user we have a preference vector (a column of the rating matrix R), and for each product we have a vector of user ratings (a row of R). First, keep only the elements whose values are known in both vectors. For example, if we want to compare Bill and Jane, and we know that Bill has not seen Titanic and Jane has not seen Batman, then we can only measure their similarity through Star Wars. (How could anyone not watch Star Wars, right?) A good way to measure similarity is the cosine similarity or the correlation of the user/item vectors. The final step is to fill the empty cells of the table with a similarity-weighted average.

Matrix Factorization for Recommendations

Another interesting approach is to use matrix decomposition.
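The two steps above can be sketched in a few lines. This is a minimal user-based CF example with made-up ratings (Bill, Jane, and Ted are the hypothetical users from the text; the numbers are invented for illustration):

```python
# Toy user-based collaborative filtering: cosine similarity over co-rated
# items, then a similarity-weighted average to fill an unknown rating.
from math import sqrt

# Hypothetical ratings: absent key means "not rated".
ratings = {
    "Bill": {"Star Wars": 5, "Batman": 4},              # Bill hasn't seen Titanic
    "Jane": {"Star Wars": 4, "Titanic": 5},             # Jane hasn't seen Batman
    "Ted":  {"Star Wars": 5, "Batman": 5, "Titanic": 2},
}

def cosine_sim(a, b):
    """Cosine similarity restricted to the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    na = sqrt(sum(a[m] ** 2 for m in common))
    nb = sqrt(sum(b[m] ** 2 for m in common))
    return dot / (na * nb)

def predict(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or movie not in r:
            continue
        s = cosine_sim(ratings[user], r)
        num += s * r[movie]
        den += abs(s)
    return num / den if den else None

print(round(predict("Bill", "Titanic"), 2))  # → 3.5
```

Note that Bill and Jane end up with similarity 1.0 because they share only one rated movie; real systems typically require a minimum overlap or shrink the similarity toward zero for small overlaps.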
This is an elegant recommendation algorithm because, when we decompose a matrix, we usually don't think much about what the rows and columns of the resulting factor matrices mean. But with this recommendation tool, we can see clearly that u is a vector describing the interests of the i-th user, and v is a vector describing the parameters of the j-th movie. We can then estimate x (the rating the i-th user gives the j-th movie) as the dot product of u and v. We build these vectors from the known ratings and use them to predict the unknown ones. For example, suppose matrix decomposition gives us Ted's vector (1.4, 0.9) and movie A's vector (1.4, 0.8). We can then reconstruct Ted's rating for movie A simply by taking the dot product of (1.4, 0.9) and (1.4, 0.8), which gives 1.4 × 1.4 + 0.9 × 0.8 = 2.68.

Clustering

The previous recommendation algorithms are simple and suitable for small systems. Until now we have also framed recommendation as a supervised machine learning task; it is time to look at unsupervised methods. Imagine we are building a large-scale recommendation system, where collaborative filtering and matrix decomposition would take too long. The first idea is clustering. In the early stages of a business there is often no prior user segmentation, and clustering is the best starting point. Used alone, however, clustering is somewhat weak, because all we are really doing is identifying groups of users and recommending the same items to everyone in each group. Once we have enough data, it is better to use clustering as a first step that narrows down the set of relevant neighbors for a collaborative filtering algorithm; this can also improve the performance of a complex recommendation system. Each cluster is assigned a representative preference profile based on the preferences of the users belonging to it.
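A minimal sketch of learning such u and v vectors with stochastic gradient descent, on a small made-up rating matrix (the data, learning rate, and regularization constant are all hypothetical choices, not the method used by any particular system):

```python
# Toy matrix factorization trained with SGD over the known entries only.
# R[i][j] is user i's rating of movie j; None marks unknown cells.
import random

R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]
n_users, n_items, k = len(R), len(R[0]), 2  # k latent factors

random.seed(0)
U = [[random.uniform(0, 1) for _ in range(k)] for _ in range(n_users)]
V = [[random.uniform(0, 1) for _ in range(k)] for _ in range(n_items)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

lr, reg = 0.01, 0.02
for _ in range(2000):                 # SGD epochs over known ratings
    for i in range(n_users):
        for j in range(n_items):
            if R[i][j] is None:
                continue
            err = R[i][j] - dot(U[i], V[j])
            for f in range(k):        # gradient step with L2 regularization
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)

# Predict an unknown cell: user 0's rating of movie 2 is just u · v.
print(round(dot(U[0], V[2]), 2))
```

After training, every known rating is reconstructed closely by the corresponding dot product, and the same dot product fills in the unknown cells.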
Each user in a cluster then receives the recommendations computed at the cluster level.

Deep Learning Methods for Recommender Systems

In the past decade, neural networks have made great leaps. They are now used in a wide variety of applications and are gradually replacing traditional machine learning methods. Below I will show how deep learning is used at YouTube. Needless to say, building a recommender system for such a service is a very challenging task due to its huge scale, the constantly changing corpus, and various unobservable external factors. According to the paper "Deep Neural Networks for YouTube Recommendations", YouTube's recommendation algorithm consists of two neural networks: one for candidate generation and one for ranking. If you are short on time, here is a brief summary. Taking the user's history as input, the candidate generation network drastically reduces the number of videos, selecting a set of the most relevant ones from the large corpus. The generated candidates are those most relevant to the user, and the goal of this network is only to provide broad personalization via collaborative filtering. At this point we have a smaller set of candidates that are closer to the user's needs. The goal now is to analyze all the candidates carefully so that we can make the best decision. This task is handled by the ranking network, which assigns a score to each video according to a desired objective function, using features describing the video and information about the user's behavior. With this two-stage approach, we can make recommendations from a very large video corpus while still being certain that the small number of videos that actually reach the user are personalized and engaging. This design also makes it possible to blend in candidates generated by other sources.
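The two-stage idea can be illustrated with a toy pipeline. Here both stages are stand-ins for the two neural networks: candidate generation is a dot-product retrieval over embeddings, and ranking adds a richer (here, entirely made-up) per-video feature. All vectors and the "freshness" feature are invented for illustration:

```python
# Toy two-stage recommender: retrieve a short candidate list from the
# corpus, then re-rank only those candidates with a richer score.

video_embeddings = {            # hypothetical learned video vectors
    "v1": [0.9, 0.1], "v2": [0.8, 0.3], "v3": [0.1, 0.9],
    "v4": [0.2, 0.8], "v5": [0.5, 0.5],
}
user_embedding = [1.0, 0.2]     # hypothetical user-history embedding

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Stage 1 (candidate generation): keep the top-k videos by similarity
# to the user vector, narrowing the corpus to a short list.
k = 3
candidates = sorted(video_embeddings,
                    key=lambda v: dot(user_embedding, video_embeddings[v]),
                    reverse=True)[:k]

# Stage 2 (ranking): score only the candidates with an extra feature
# (a hypothetical freshness signal) and order them for display.
freshness = {"v1": 0.1, "v2": 0.9, "v3": 0.2, "v4": 0.4, "v5": 0.6}
ranked = sorted(candidates,
                key=lambda v: dot(user_embedding, video_embeddings[v])
                              + 0.5 * freshness[v],
                reverse=True)
print(ranked)  # → ['v2', 'v1', 'v5']
```

The expensive scoring in stage 2 only ever touches the k candidates, which is what makes the design scale to a corpus of millions.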
The recommendation task can be framed as an extreme multiclass classification problem: the prediction problem becomes accurately classifying a specific video watch (w_t) at time t as one class (i) among the millions of videos in the corpus (V), given the user (U) and the context (C).
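Concretely, the watch probability is a softmax over the corpus: the score for video i is the dot product of its embedding v_i with a user/context embedding u, normalized over all videos in V. A tiny numerical sketch (the vectors here are made up, and a real corpus has millions of classes rather than two):

```python
# Softmax over video scores: P(w_t = i | U, C) = exp(v_i · u) / sum_j exp(v_j · u)
from math import exp

u = [1.0, 0.2]                               # hypothetical user/context embedding
V = {"v1": [0.9, 0.1], "v2": [0.1, 0.9]}     # hypothetical video embeddings (classes)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

logits = {i: dot(v, u) for i, v in V.items()}
z = sum(exp(s) for s in logits.values())
probs = {i: exp(s) / z for i, s in logits.items()}
print(probs)
```

The probabilities sum to one, and the video whose embedding best matches the user gets the highest probability; training then reduces to maximizing the probability of the videos the user actually watched.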