Users today are inundated with information and data coming from products and services. Recommender systems have become an active research area in recent years and are hugely important in almost any commercial application. These techniques can be roughly classified into push-based recommendations, user-user or item-user neighborhood models, simple matrix factorization models, and latent factor models. The main challenge in improving them is to build more accurate models that take into account item (resource) biases, user biases, and biases in user preferences.
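To make the idea of user and item biases concrete, here is a minimal sketch of a baseline predictor that adds a per-user and a per-item offset to the global mean rating. It is a simplification for illustration: the function names and the toy data are my own assumptions, and the offsets are plain unregularized averages rather than the fitted, regularized estimates used in serious systems.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Estimate the global mean plus simple user and item biases.

    ratings: list of (user, item, rating) tuples.
    Assumption: biases are unregularized averages of residuals.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item bias: how far an item's ratings sit from the global mean.
    item_dev = defaultdict(list)
    for _, item, r in ratings:
        item_dev[item].append(r - mu)
    item_bias = {i: sum(d) / len(d) for i, d in item_dev.items()}

    # User bias: what is left after removing the mean and the item bias.
    user_dev = defaultdict(list)
    for user, item, r in ratings:
        user_dev[user].append(r - mu - item_bias[item])
    user_bias = {u: sum(d) / len(d) for u, d in user_dev.items()}

    return mu, user_bias, item_bias


def predict_baseline(mu, user_bias, item_bias, user, item):
    """Baseline estimate; unknown users or items fall back to a zero bias."""
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)


# Hypothetical toy data: ann rates generously, item "A" is well liked.
toy = [("ann", "A", 5), ("ann", "B", 3), ("bob", "A", 4), ("bob", "B", 2)]
mu, user_bias, item_bias = fit_baseline(toy)
```

With this toy data the global mean is 3.5, so a critical user rating a popular item and a generous user rating an unpopular one can end up with the same estimate even though their tastes differ, which is exactly the effect the bias terms are meant to separate out.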
Collaborative filtering is a core component for recommending products and services. The two main approaches are neighborhood models and latent factor models (such as Singular Value Decomposition, SVD). The former focus on computing relationships between items or between users, while the latter map both items and users to the same latent factor space, thus making them directly comparable.
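The difference between the two families can be sketched in a few lines. The code below is only an illustration under my own assumptions: the neighborhood side uses cosine similarity and a similarity-weighted average (one common choice among several), the latent factor side assumes the factor vectors have already been learned, and all names and numbers are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two items over the users who rated both.

    a, b: dicts mapping user -> rating.
    """
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_neighborhood(user, item, item_ratings, k=2):
    """Item-item neighborhood estimate: similarity-weighted average of the
    user's own ratings on the k items most similar to the target item."""
    candidates = [
        (cosine(item_ratings[item], ratings), ratings[user])
        for other, ratings in item_ratings.items()
        if other != item and user in ratings
    ]
    top = sorted(candidates, reverse=True)[:k]
    weight = sum(abs(sim) for sim, _ in top)
    return sum(sim * r for sim, r in top) / weight if weight else None


def predict_latent(user_factors, item_factors, user, item):
    """Latent factor estimate: dot product of user and item vectors that
    live in the same factor space (assumed to be learned elsewhere)."""
    return sum(p * q for p, q in zip(user_factors[user], item_factors[item]))


# Hypothetical toy data: item -> {user: rating}; u3 has not rated "A".
item_ratings = {
    "A": {"u1": 5, "u2": 4},
    "B": {"u1": 5, "u2": 4, "u3": 2},
    "C": {"u1": 1, "u2": 5, "u3": 5},
}
neighborhood_estimate = predict_neighborhood("u3", "A", item_ratings)
latent_estimate = predict_latent({"u3": [1.0, 0.5]}, {"A": [2.0, 2.0]}, "u3", "A")
```

The neighborhood estimate is easy to explain to a user ("because you rated B and C"), while the latent estimate compares a user and an item directly without going through any neighbors.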
After this short review of the main approaches to collaborative filtering, I am going to focus on the subject of this post: “The BellKor Solution to the Netflix Prize” [1]. The Netflix Prize is a contest to improve the accuracy of Netflix's Cinematch algorithm, measured by the root mean squared error (RMSE) of the predicted ratings, with a prize of up to $1M. The authors of this algorithm (Bob Bell and Chris Volinsky, from the Statistics Research group at AT&T Labs, and Yehuda Koren) won the prize with the first approach that merges both models (neighborhood and SVD) to obtain a more accurate model. Some of the main features of this approach are:
- a new model for neighborhood recommenders based on optimizing a global cost function keeping advantages such as explainability and handling new users while improving accuracy
- a set of extensions to existing SVD models to integrate implicit feedback and time features
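Two of the ideas above are easy to sketch: the RMSE metric used to score the contest, and the way implicit feedback can be folded into an SVD-style prediction. The second function follows the general shape of the SVD++ prediction rule as I understand it from “Factorization Meets the Neighborhood”: the user's factor vector is shifted by a normalized sum of factor vectors for the items the user has interacted with. The parameter names and all numbers are illustrative assumptions, no training is shown, and the time features are left out.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def predict_with_implicit(mu, b_u, b_i, p_u, q_i, implicit_factors):
    """SVD++-style estimate (sketch only, parameters assumed already learned).

    mu: global mean rating; b_u, b_i: user and item biases.
    p_u, q_i: explicit user and item factor vectors of equal length.
    implicit_factors: one factor vector y_j per item the user has
    interacted with (for example rated or rented), whatever the rating.
    """
    n = len(implicit_factors)
    scale = 1.0 / math.sqrt(n) if n else 0.0
    # Shift the user vector by the scaled sum of the implicit item vectors.
    profile = [
        p + scale * sum(y[f] for y in implicit_factors)
        for f, p in enumerate(p_u)
    ]
    return mu + b_u + b_i + sum(a * b for a, b in zip(profile, q_i))


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
estimate = predict_with_implicit(
    mu=3.5, b_u=0.5, b_i=-1.0,
    p_u=[1.0, 0.0], q_i=[0.5, 2.0],
    implicit_factors=[[0.0, 0.25]] * 4,
)
error = rmse([3.0, 4.0], [3.0, 2.0])
```

Because RMSE squares each error before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones, which is part of why small gains on this metric were so hard to obtain.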
Thus a new approach for recommender systems was presented in 2008-2009 to win the Netflix contest (a complete description of the algorithm is available in the article “Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model”), but some issues are still open:
- Scalability (millions of users and items) and real-time processing (map/reduce techniques to continuously process new data)
- Explainability
- Implicit and explicit feedback
- Factorization techniques (please read this article from the same authors)
- Quality, including more data such as dates, attributes of users, etc.
- …in general, recommender systems are a young area in which many improvements can still be made
Finally, it is worth checking the latest publications by Yehuda Koren:
- Care to Comment? Recommendations for Commenting on News Stories. Erez Shmueli; Amit Kagian; Yehuda Koren; Ronny Lempel. WWW’12, ACM, 2012
- Build Your Own Music Recommender by Modeling Internet Radio Streams. Natalie Aizenberg; Yehuda Koren; Oren Somekh. WWW’2012, ACM, 2012
- OrdRec: An Ordinal Model for Predicting Personalized Item Rating Distributions. Yehuda Koren; Joe Sill. ACM Recommender Systems 2011 (RecSys’11), ACM, 2011
- Automatically Tagging Email by Leveraging Other Users’ Folders. Yehuda Koren; Edo Liberty; Yoelle Maarek; Roman Sandler. KDD 2011: 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, ACM, 2011
- Advances in Collaborative Filtering. Yehuda Koren; Robert Bell. Recommender Systems Handbook, 2011
- Adaptive Bootstrapping of Recommender Systems Using Decision Trees. Nadav Golbandi; Yehuda Koren; Ronny Lempel. WSDM’11, 2011
- On Bootstrapping Recommender Systems. Nadav Golbandi; Yehuda Koren; Ronny Lempel. CIKM’10, 2010
- Performance of Recommender Algorithms on Top-N Recommendation Tasks. Paolo Cremonesi; Yehuda Koren; Roberto Turrin. ACM RecSys’10, 2010