Continuing my review of existing recommender systems on multimedia sites, I have found, through Marcos Merino, the recommendation engine provided by Hulu (an online video service that offers a selection of hit shows, clips, movies and more).
It brings together a large selection of videos from over 350 content companies, including FOX, NBCUniversal, ABC, The CW, Univision, Criterion, A&E Networks, Lionsgate, Endemol, MGM, MTV Networks, Comedy Central, National Geographic, Digital Rights Group, Paramount, Sony Pictures, Warner Bros., TED and more. (Hulu, About)
But what is the underlying technology behind Hulu?
According to their tech blog, they have put a lot of effort into building a great recommendation engine, and they decided to recommend shows to users instead of individual videos. This keeps the content organized, because videos from the same show are usually closely related. As with Netflix, one of the drivers of the recommendation is user behavior data (implicit and explicit feedback). The algorithm implemented in Hulu is based on a collaborative filtering approach (user-based or item-based), but the most important part lies in Hulu’s architecture, which comprises the following components:
- User profile builder
- Recommendation core
- Filtering
- Ranking
- Explanation
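To make the five components concrete, here is a minimal sketch of how such a pipeline could be wired together around item-based collaborative filtering. It is only an illustration under my own assumptions: the watch history, the function names and the cosine similarity are hypothetical and are not taken from Hulu's actual implementation.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical watch history: user -> {show: feedback score}
HISTORY = {
    "ann": {"Show A": 1.0, "Show B": 1.0},
    "bob": {"Show A": 1.0, "Show B": 1.0, "Show C": 1.0},
    "cat": {"Show B": 1.0, "Show C": 1.0, "Show D": 1.0},
}

def build_profile(user, history):
    """User profile builder: here simply the user's show -> score map."""
    return dict(history.get(user, {}))

def item_similarity(history):
    """Cosine similarity between shows, based on the users who watched both."""
    co = defaultdict(lambda: defaultdict(float))
    norm = defaultdict(float)
    for shows in history.values():
        for a, wa in shows.items():
            norm[a] += wa * wa
            for b, wb in shows.items():
                if a != b:
                    co[a][b] += wa * wb
    return {a: {b: v / sqrt(norm[a] * norm[b]) for b, v in row.items()}
            for a, row in co.items()}

def recommend_core(profile, sim):
    """Recommendation core: score shows similar to the ones in the profile.

    Also remembers, per candidate, the watched show that contributed most.
    """
    scores = defaultdict(float)
    reasons = {}
    for seen, weight in profile.items():
        for cand, s in sim.get(seen, {}).items():
            contrib = weight * s
            scores[cand] += contrib
            if cand not in reasons or contrib > reasons[cand][1]:
                reasons[cand] = (seen, contrib)
    return scores, reasons

def filter_candidates(scores, profile):
    """Filtering: drop shows the user has already watched."""
    return {show: s for show, s in scores.items() if show not in profile}

def rank(scores, top_n=3):
    """Ranking: highest score first, ties broken by title."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

def explain(ranked, reasons):
    """Explanation: attach the watched show that best justifies each pick."""
    return [(show, f"Because you watched {reasons[show][0]}") for show, _ in ranked]

def recommend(user, history=HISTORY, top_n=3):
    profile = build_profile(user, history)
    sim = item_similarity(history)
    scores, reasons = recommend_core(profile, sim)
    ranked = rank(filter_candidates(scores, profile), top_n)
    return explain(ranked, reasons)

if __name__ == "__main__":
    for show, why in recommend("ann"):
        print(show, "-", why)
```

Keeping each stage as a separate function mirrors the component list: the filtering, ranking and explanation steps can be changed without touching the similarity computation.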
Just because a recommendation system can accurately predict user behavior does not mean it produces a show that you want to recommend to an active user. (Hulu, Tech Blog)
Other key points of their approach lie in explanation-based diversity and temporal diversity. This shows that the problems of recommending information resources are much the same across domains. Nevertheless, depending on the domain (user behavior, type of content, etc.), new metrics such as novelty can emerge. On the other hand, real-time capabilities, offline processing and performance are again key enablers of a “good” recommendation engine, apart from accuracy. Some interesting lessons from Hulu’s experience are highlighted below:
- Explicit Feedback data is more important than implicit feedback data
- Recent behaviors are much more important than old behaviors
- Novelty, Diversity, and offline Accuracy are all important factors
- Most researchers focus on improving offline accuracy, such as RMSE, precision/recall. However, recommendation systems that can accurately predict user behavior alone may not be good enough for practical use. A good recommendation system should consider multiple factors together. In our system, after considering novelty and diversity, the CTR has improved by more than 10%.

Please check this document out: “Automatic Generation of Recommendations from Data: A Multifaceted Survey” (a technical report from the School of Information Technology at Deakin University, Australia)
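The lessons above about recent behaviors and diversity can be illustrated with a small sketch. It assumes an exponential time decay for behavior weights and a greedy, genre-based re-ranking; both choices, as well as the `half_life_days` and `penalty` parameters and the sample data, are my own hypothetical illustration and not Hulu's published method.

```python
def recency_weight(age_days, half_life_days=30.0):
    """Exponential decay: a behavior half_life_days old counts half as much."""
    return 0.5 ** (age_days / half_life_days)

def rerank_with_diversity(candidates, genres, top_n=3, penalty=0.5):
    """Greedy re-ranking that trades a little accuracy for diversity.

    candidates: show -> relevance score (e.g. from an accuracy-tuned model)
    genres:     show -> genre label
    Each already-picked show of the same genre multiplies a candidate's
    score by `penalty`, so one genre cannot fill the whole list.
    """
    remaining = dict(candidates)
    picked = []
    genre_counts = {}
    while remaining and len(picked) < top_n:
        def adjusted(show):
            return remaining[show] * penalty ** genre_counts.get(genres[show], 0)
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=adjusted)
        picked.append(best)
        genre_counts[genres[best]] = genre_counts.get(genres[best], 0) + 1
        del remaining[best]
    return picked

if __name__ == "__main__":
    scores = {"Drama 1": 0.9, "Drama 2": 0.8, "Comedy 1": 0.5}
    genres = {"Drama 1": "drama", "Drama 2": "drama", "Comedy 1": "comedy"}
    print(rerank_with_diversity(scores, genres, top_n=2))
```

With a pure accuracy ranking the top two slots would both go to dramas; the penalty lets the comedy take the second slot, which is the kind of trade-off the lessons describe.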
- Classification and prediction in heterogeneous networks
- Pattern-analysis methods
- Link mining and link prediction
- Semantic search over heterogeneous networks
- Mining with user interactions
- Semantic mining with light-weight reasoning
- Extending LOD and Quality of LOD: disambiguation, identity, provenance, integration
- Personalized mining in heterogeneous networks
- Domain specific mining (e.g., Life Science and Health Care)
- Collective intelligence mining
- …