
Presentation by Todd Holloway
Ensemble learning is the practice of combining multiple supervised learning models to make a prediction. The talk argues that combining several different types of predictor models produces more accurate results than any single model.
Relating movies based on user recommendations does not work well because we don't have enough data on every movie.
Netflix Prize: about 17,000 sample movies and millions of sample ratings. A one million dollar prize for a ten percent improvement over the current Netflix model.
Using multiple models decreases error as long as they are independent decision makers.
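To see why independence matters, here is a rough sketch (mine, not from the slides) that computes how often a simple majority vote is wrong. It assumes every model has the same error rate and that their mistakes are completely independent, which real models rarely manage. The function name majority_vote_error is made up for this example.

```python
from math import comb


def majority_vote_error(n_models: int, p_error: float) -> float:
    """Probability that more than half of n independent models are wrong.

    Assumes each model errs with the same probability p_error and that
    the errors are independent (a strong, idealized assumption).
    """
    k_min = n_models // 2 + 1  # smallest number of wrong votes that wins
    return sum(
        comb(n_models, k) * p_error**k * (1 - p_error) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )


if __name__ == "__main__":
    # Each model is wrong 30% of the time; compare ensemble sizes.
    for n in (1, 5, 21):
        print(n, majority_vote_error(n, 0.3))
```

With a 30% individual error rate, the vote of five such models is wrong about 16% of the time, and the error keeps shrinking as more independent models are added. If the models make the same mistakes, this benefit disappears.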
To get independence and diversity, we use different relatedness measures for each model.
This adds complexity but gives better results, which seems to go against Ockham's Razor.
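AdaBoost is the process of training a classifier, testing it, and then giving extra weight to the examples it got wrong when training the next classifier. Unfortunately, this emphasizes noise, because mislabeled examples keep getting misclassified and so keep gaining weight.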
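The loop described above can be sketched in a few lines. This is my own minimal version, not code from the talk: it assumes 1-D inputs, labels of +1 and -1, and one-threshold "decision stumps" as the weak classifiers. The toy data and all function names (train_stump, stump_predict, adaboost, predict) are made up for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """A decision stump: predict `polarity` at or above the threshold."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (weighted error, threshold, polarity) of the best stump."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w
                for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, reweighting the examples after each one."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's say in the vote
        ensemble.append((alpha, threshold, polarity))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

if __name__ == "__main__":
    ensemble = adaboost(xs, ys, rounds=3)
    print([predict(ensemble, x) for x in xs])
```

On this toy data the first stump alone gets three points wrong, but the weighted vote of three stumps classifies every point correctly. The same reweighting step is what causes the noise problem: a point with a wrong label never stops being "hard", so its weight keeps growing.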
www.abeautifulwww.com for slides and more info.