Movie Recommendation
- Author: Ajay Karthick Senthil Kumar (@i_ajaykarthick)
View Project here: Project Repo
Objective
This project trains a machine learning model that learns users' preferences from a movie ratings dataset and then recommends movies to any given user based on what it has learned. Because the dataset is large, the project relies on big data tooling, specifically the Spark framework.
Dataset
The dataset used in this project is the MovieLens 20M Dataset.
The dataset describes ratings and free-text tagging activity from MovieLens, a movie recommendation service. It contains 20,000,263 ratings and 465,564 tag applications across 27,278 movies. The data were created by 138,493 users between January 09, 1995 and March 31, 2015, and the dataset was generated on October 17, 2016. Users were selected at random for inclusion. All selected users had rated at least 20 movies.
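To make the shape of the ratings data concrete, here is a minimal plain-Python sketch (not the Spark code the project uses) that groups ratings by user. It assumes the `userId,movieId,rating,timestamp` column layout of the MovieLens `ratings.csv` file; the function name `load_ratings` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def load_ratings(lines):
    """Return {user_id: {movie_id: rating}} from CSV lines with a header row."""
    ratings = defaultdict(dict)
    for row in csv.DictReader(lines):
        ratings[int(row["userId"])][int(row["movieId"])] = float(row["rating"])
    return ratings


# Made-up sample rows in the assumed ratings.csv layout.
sample = io.StringIO(
    "userId,movieId,rating,timestamp\n"
    "1,10,3.5,1000000000\n"
    "1,20,4.0,1000000100\n"
    "2,10,2.0,1000000200\n"
)
ratings = load_ratings(sample)
print(dict(ratings))
```

For the full 20M-row file a nested dictionary like this would be memory-hungry, which is the motivation for a distributed framework such as Spark.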
User-Based Collaborative Filtering
User-to-user collaborative filtering is a machine learning technique that predicts which items a target user might like from the ratings given to those items by other users whose tastes are similar to the target user's.
In this project, the users whose tastes are similar to those of user $i$ are referred to as the neighbors of user $i$, and each neighbor is denoted as user $i'$. The similarity score, or weight, between user $i$ and neighbor $i'$ is calculated as follows:

$$w_{ii'} = \frac{\sum_{j \in \Psi_{ii'}} (r_{ij} - \bar{r}_i)(r_{i'j} - \bar{r}_{i'})}{\sqrt{\sum_{j \in \Psi_{ii'}} (r_{ij} - \bar{r}_i)^2}\,\sqrt{\sum_{j \in \Psi_{ii'}} (r_{i'j} - \bar{r}_{i'})^2}}$$

where,
- $\Psi_i$: set of movies that user $i$ has rated
- $\Psi_{i'}$: set of movies that user $i'$ has rated
- $\Psi_{ii'}$: set of movies that both users $i$ and $i'$ have rated
- $r_{ij} - \bar{r}_i$ is the deviation of user $i$'s rating $r_{ij}$ on movie $j$ from his/her average rating $\bar{r}_i$. Each user interprets the rating scale differently (one user's 3 may be another user's 4), so we compare how far each rating deviates from that user's own average rather than comparing raw ratings.
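The similarity weight described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the project's Spark implementation: the function name `user_similarity` and the sample users are hypothetical, and it assumes each user's average is taken over all movies that user has rated while the sums run only over the movies both users rated.

```python
from math import sqrt


def user_similarity(ratings_a, ratings_b):
    """Correlation-style weight between two users.

    Each argument maps movie id -> rating for one user.
    """
    common = set(ratings_a) & set(ratings_b)  # movies both users rated
    if not common:
        return 0.0
    # Assumption: averages are taken over everything each user rated.
    mean_a = sum(ratings_a.values()) / len(ratings_a)
    mean_b = sum(ratings_b.values()) / len(ratings_b)
    # Deviation of each co-rated movie's rating from the user's own average.
    dev_a = {j: ratings_a[j] - mean_a for j in common}
    dev_b = {j: ratings_b[j] - mean_b for j in common}
    numerator = sum(dev_a[j] * dev_b[j] for j in common)
    denominator = sqrt(sum(d * d for d in dev_a.values())) * sqrt(
        sum(d * d for d in dev_b.values())
    )
    return numerator / denominator if denominator else 0.0


alice = {1: 5.0, 2: 3.0, 3: 1.0}
bob = {1: 4.0, 2: 3.0, 3: 2.0}    # same ordering of tastes, milder scale
carol = {1: 1.0, 2: 3.0, 3: 5.0}  # opposite ordering of tastes

print(user_similarity(alice, bob))
print(user_similarity(alice, carol))
```

Because the weight is computed on deviations, Alice and Bob come out as near-perfectly similar even though Bob uses a narrower part of the rating scale, while Alice and Carol come out as near-perfectly dissimilar.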
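The predicted rating score for a user $i$ on a movie $j$ is calculated as follows:

$$s(i, j) = \bar{r}_i + \frac{\sum_{i' \in \Omega_j} w_{ii'} (r_{i'j} - \bar{r}_{i'})}{\sum_{i' \in \Omega_j} |w_{ii'}|}$$

Here $w_{ii'}$ is the similarity weight between user $i$ and neighbor $i'$, $\bar{r}_i$ is user $i$'s average rating, and the sums run over the neighbors $i'$ of user $i$ that belong to $\Omega_j$, the set of users who rated movie $j$.
This score gives an estimate of how user $i$ would have rated movie $j$, based on the ratings that his/her weighted neighbors gave to movie $j$.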
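As a rough sketch of this prediction step, the snippet below adds a weighted average of the neighbors' deviations to the target user's own average rating. It assumes the neighbor weights have already been computed; the function name `predict_user_based` and the tuple layout for neighbors are hypothetical choices for illustration, not the project's code.

```python
def predict_user_based(target_mean, neighbors, movie):
    """Estimate the target user's rating of `movie`.

    `neighbors` is a list of (weight, neighbor_mean, neighbor_ratings)
    tuples, where neighbor_ratings maps movie id -> rating.
    """
    numerator = 0.0
    denominator = 0.0
    for weight, neighbor_mean, neighbor_ratings in neighbors:
        if movie not in neighbor_ratings:
            continue  # only neighbors who rated the movie contribute
        numerator += weight * (neighbor_ratings[movie] - neighbor_mean)
        denominator += abs(weight)
    if denominator == 0:
        return target_mean  # no usable neighbor: fall back to the user's average
    return target_mean + numerator / denominator


neighbors = [
    (1.0, 3.0, {10: 5.0}),   # very similar neighbor who liked movie 10
    (0.5, 2.0, {10: 3.0}),   # moderately similar neighbor, also above average
    (0.9, 4.0, {11: 4.5}),   # did not rate movie 10, so ignored
]
print(predict_user_based(3.0, neighbors, 10))
```

Dividing by the sum of absolute weights keeps the adjustment on the rating scale even when some neighbors have negative weights.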
The similarity-weighted relationships between users are depicted in the following graph:
Item-Based Collaborative Filtering
Item-item collaborative filtering is a recommendation method that predicts which items a target user might like from the ratings that user has given to similar items.
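In this project, a movie that is similar to movie $j$ is referred to as movie $j'$. The similarity score between movies $j$ and $j'$ is calculated as follows:

$$w_{jj'} = \frac{\sum_{i \in \Omega_{jj'}} (r_{ij} - \bar{r}_j)(r_{ij'} - \bar{r}_{j'})}{\sqrt{\sum_{i \in \Omega_{jj'}} (r_{ij} - \bar{r}_j)^2}\,\sqrt{\sum_{i \in \Omega_{jj'}} (r_{ij'} - \bar{r}_{j'})^2}}$$

where,
- $r_{ij}$ - rating given by user $i$ to movie $j$
- $\Omega_j$ - users who rated movie $j$
- $\Omega_{jj'}$ - users who rated both movie $j$ and movie $j'$
- $\bar{r}_j$ - average rating for movie $j$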
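The item-side similarity can be sketched starting from raw (user, movie, rating) triples, which is closer to how a ratings file is laid out. This is a plain-Python illustration and not the project's Spark code: the helper names `ratings_by_movie` and `item_similarity` and the sample triples are hypothetical, and it assumes each movie's average is taken over all users who rated it.

```python
from collections import defaultdict
from math import sqrt


def ratings_by_movie(triples):
    """Group (user, movie, rating) triples into {movie: {user: rating}}."""
    by_movie = defaultdict(dict)
    for user, movie, rating in triples:
        by_movie[movie][user] = rating
    return by_movie


def item_similarity(by_movie, movie_a, movie_b):
    """Correlation-style weight between two movies over their common raters."""
    users_a = by_movie.get(movie_a, {})
    users_b = by_movie.get(movie_b, {})
    common = set(users_a) & set(users_b)  # users who rated both movies
    if not common:
        return 0.0
    # Assumption: a movie's average is taken over all of its raters.
    mean_a = sum(users_a.values()) / len(users_a)
    mean_b = sum(users_b.values()) / len(users_b)
    numerator = sum((users_a[i] - mean_a) * (users_b[i] - mean_b) for i in common)
    norm_a = sqrt(sum((users_a[i] - mean_a) ** 2 for i in common))
    norm_b = sqrt(sum((users_b[i] - mean_b) ** 2 for i in common))
    return numerator / (norm_a * norm_b) if norm_a and norm_b else 0.0


triples = [
    ("u1", "A", 5.0), ("u2", "A", 3.0), ("u3", "A", 1.0),
    ("u1", "B", 4.0), ("u2", "B", 3.0), ("u3", "B", 2.0),
    ("u1", "C", 1.0), ("u2", "C", 3.0), ("u3", "C", 5.0),
]
by_movie = ratings_by_movie(triples)
print(item_similarity(by_movie, "A", "B"))
print(item_similarity(by_movie, "A", "C"))
```

In this toy data, movies A and B are liked and disliked by the same users, so their weight is close to 1, while A and C split the audience the opposite way and score close to -1.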
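The predicted rating score for a user $i$ on a movie $j$ is calculated as follows:

$$s(i, j) = \bar{r}_j + \frac{\sum_{j' \in \Psi_i} w_{jj'} (r_{ij'} - \bar{r}_{j'})}{\sum_{j' \in \Psi_i} |w_{jj'}|}$$

where,
- $\Psi_i$ - movies that user $i$ has rated
- $w_{jj'}$ - similarity score between movie $j$ and movie $j'$
- $\bar{r}_j$ - average rating for movie $j$
This score gives an estimate of how user $i$ would have rated movie $j$, based on his/her ratings of the movies that are similar to movie $j$.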
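A minimal sketch of the item-based prediction follows. It assumes the movie-to-movie weights and per-movie average ratings are already available; the function name `predict_item_based`, the dictionary layouts, and the sample numbers are hypothetical and chosen only for illustration.

```python
def predict_item_based(user_ratings, movie_means, weights, target):
    """Estimate one user's rating of `target`.

    user_ratings: {movie: rating} for the movies this user has rated
    movie_means:  {movie: average rating of that movie}
    weights:      {other_movie: similarity weight between target and other_movie}
    """
    numerator = 0.0
    denominator = 0.0
    for other, rating in user_ratings.items():
        weight = weights.get(other)
        if other == target or weight is None:
            continue  # skip the target itself and movies without a weight
        numerator += weight * (rating - movie_means[other])
        denominator += abs(weight)
    baseline = movie_means[target]
    if denominator == 0:
        return baseline  # nothing similar was rated: fall back to the movie's average
    return baseline + numerator / denominator


user_ratings = {"B": 5.0, "C": 2.0}
movie_means = {"A": 3.5, "B": 4.0, "C": 3.0}
weights_for_a = {"B": 0.8, "C": -0.4}
print(predict_item_based(user_ratings, movie_means, weights_for_a, "A"))
```

Here the user rated the similar movie B above its average and the dissimilar movie C below its average, and both facts push the estimate for movie A above that movie's own average.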
The similarity-weighted relationships between movies are depicted in the following graph:
Matrix Factorization