
BellKor Solution to the Netflix Prize

Users today are inundated with information and data coming from products and services. Recommender systems have emerged as a research area in recent years and are hugely important in any commercial application. A simple classification of these techniques distinguishes push-based recommendations, user-based or item-based neighborhood models, and simple matrix factorization or latent factor models. The main challenge in improving these techniques is to obtain more accurate models that take into account item (resource) biases, user biases and biases in user preferences.

Collaborative filtering is a key component for recommending products and services. Its two main approaches are neighborhood models and latent factor models (such as Singular Value Decomposition, SVD). The former focus on computing relationships between items or between users, while the latter map both items and users to the same latent factor space, thus making them directly comparable.
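To make the difference between the two families concrete, here is a minimal sketch in Python. It is my own illustration, not code from the BellKor paper: the toy ratings, the function names `item_cosine` and `latent_score`, and the choice of cosine similarity are all assumptions made for the example.

```python
import math

# Toy ratings: user -> {item: rating}. All values are made up for illustration.
ratings = {
    "ann": {"alien": 5.0, "brazil": 3.0, "casablanca": 1.0},
    "bob": {"alien": 4.0, "brazil": 3.0},
    "cat": {"alien": 1.0, "casablanca": 5.0},
}


def item_cosine(item_a, item_b):
    """Neighborhood view: cosine similarity over users who rated both items."""
    common = [u for u, r in ratings.items() if item_a in r and item_b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][item_a] * ratings[u][item_b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][item_a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][item_b] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def latent_score(user_factors, item_factors):
    """Latent factor view: users and items live in one space, so the
    affinity between them is simply a dot product."""
    return sum(p * q for p, q in zip(user_factors, item_factors))
```

In the neighborhood view a prediction is built from similarities such as `item_cosine("alien", "brazil")`, whereas in the latent factor view a user vector and an item vector (both learned from the ratings, which this sketch does not do) are compared directly with `latent_score`.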

After this short review of the main approaches to collaborative filtering, I am going to focus on the subject of this post: “The BellKor Solution to the Netflix Prize” [1]. The Netflix Prize is a contest to improve the accuracy of the Cinematch algorithm, measured by the root mean squared error (RMSE), with a prize of up to $1M. The authors of this algorithm (Bob Bell and Chris Volinsky, from the Statistics Research group at AT&T Labs, and Yehuda Koren) won the prize with the first approach that merges both models (neighborhood and SVD) into a more accurate one. The main features of this approach are:

  • a new model for neighborhood recommenders based on optimizing a global cost function, which keeps advantages such as explainability and the handling of new users while improving accuracy
  • a set of extensions to existing SVD models to integrate implicit feedback and time features
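As a rough illustration of the quality metric and of the second feature above, the sketch below computes RMSE and a prediction that adds implicit feedback to a biased SVD model. The prediction rule (global mean plus user and item biases plus a dot product in which the user factors are shifted by the normalized sum of factors of the items the user has interacted with) is how I read the SVD++ extension in the paper. The function names, parameter names and numbers are hypothetical, and no training step is shown.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long rating lists."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def predict_with_implicit(mu, b_u, b_i, p_u, q_i, implicit_factors):
    """Sketch of: mu + b_u + b_i + q_i . (p_u + |N(u)|^(-1/2) * sum_j y_j).

    implicit_factors holds one factor vector y_j per item in N(u), the
    set of items for which the user gave implicit feedback.
    """
    profile = list(p_u)
    if implicit_factors:
        scale = 1.0 / math.sqrt(len(implicit_factors))
        for y_j in implicit_factors:
            for f, value in enumerate(y_j):
                profile[f] += scale * value
    return mu + b_u + b_i + sum(q * p for q, p in zip(q_i, profile))


# Made-up numbers: four implicit items push the user's second factor up.
score = predict_with_implicit(
    mu=3.6, b_u=0.2, b_i=-0.1,
    p_u=[1.0, 0.0], q_i=[0.5, 0.5],
    implicit_factors=[[0.0, 1.0]] * 4,
)
print(round(score, 3))                           # 5.2
print(round(rmse([3.0, 4.0], [3.0, 2.0]), 4))    # 1.4142
```

With an empty `implicit_factors` list the function falls back to the plain biased SVD prediction, which is a convenient way to see what the implicit term contributes.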

Thus, a new approach to recommender systems was presented in 2008-2009 to win the Netflix contest (a complete description of the algorithm is available in the article “Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model”), but some issues remain open:

  • Scalability (millions of users and items) and real-time operation (map/reduce techniques to continuously process new data)
  • Explainability
  • Implicit and explicit feedback
  • Factorization techniques (please read this article from the same authors)
  • Quality, by including more data about dates, user attributes, etc.
  • …in general, recommender systems are a young area in which many improvements can still be made

Finally, it is worth checking the latest publications of Yehuda Koren.
