Machine Learning Meetup Notes:2011-4-13

From Noisebridge
Revision as of 20:09, 13 April 2011

Anthony Goldbloom from Kaggle Visits

    • The winner of the HIV competition used random forests. The term "Random Forests" is trademarked. He taught himself machine learning by watching YouTube videos. Random forests are pretty robust to new data.
      • Used the caret package in R (http://cran.r-project.org/web/packages/caret/) to work with random forests.
    • Kaggle splits the test dataset in two and uses one half for the leaderboard.
    • Often the score difference between the winning model and second place is not statistically significant, so they award prizes to the top few. Might impose restrictions on the model's execution time.
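
The last two points (the split test set, and score gaps that are not statistically significant) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Kaggle's actual procedure: the data is synthetic, the metric (mean absolute error) is chosen for simplicity, and the names `split_test_set`, `mean_abs_error`, `bootstrap_diff_interval`, `public_rows`, and `private_rows` are hypothetical. The significance check is a plain percentile bootstrap, picked here because it needs nothing beyond the standard library.

```python
import random


def split_test_set(n_rows, seed=0):
    """Randomly split test-row indices into two halves.

    One half would feed the leaderboard; the other is held back.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    half = n_rows // 2
    return sorted(indices[:half]), sorted(indices[half:])


def mean_abs_error(y_true, y_pred, rows):
    """Mean absolute error over the given row indices (assumed metric)."""
    return sum(abs(y_true[i] - y_pred[i]) for i in rows) / len(rows)


def bootstrap_diff_interval(y_true, pred_a, pred_b, rows, n_boot=2000, seed=0):
    """Approximate 95% percentile-bootstrap interval for error(A) - error(B).

    Both models are scored on the same resampled rows each round,
    so the comparison is paired.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]
        diffs.append(
            mean_abs_error(y_true, pred_a, sample)
            - mean_abs_error(y_true, pred_b, sample)
        )
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data: two models whose errors are almost equally noisy.
data_rng = random.Random(42)
y_true = [data_rng.gauss(0, 1) for _ in range(200)]
model_a = [y + data_rng.gauss(0, 0.50) for y in y_true]
model_b = [y + data_rng.gauss(0, 0.52) for y in y_true]

public_rows, private_rows = split_test_set(len(y_true), seed=1)
print("leaderboard MAE, model A:", mean_abs_error(y_true, model_a, public_rows))
print("leaderboard MAE, model B:", mean_abs_error(y_true, model_b, public_rows))

lower, upper = bootstrap_diff_interval(y_true, model_a, model_b, private_rows)
print("95% interval for MAE(A) - MAE(B) on held-back half:", (lower, upper))
if lower <= 0.0 <= upper:
    print("Interval contains zero: the gap is not clearly significant.")
```

If the interval straddles zero, ranking one model above the other on that half is mostly luck of the draw, which is the situation the notes describe as motivation for awarding prizes to the top few.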