Machine Learning/Datasets

From Noisebridge
Revision as of 23:36, 14 March 2011

This page describes in detail the datasets used for the NBML Course.

Classification

  • MNIST Handwritten Digits
    • Classify handwritten digits (0 through 9) using this very popular dataset, which provides 60,000 training images and 10,000 test images.
  • Heart Disease
    • Predict whether a person has heart disease using a subset of the dataset's 76 recorded factors.
  • Census Income
    • Predict whether a person's annual income is above or below $50K.
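
A useful first step for any of the classification tasks above is a majority-class baseline: always guess the most common training label and see how often that is right. The sketch below is a minimal illustration using only the Python standard library. The function names and the tiny label lists are hypothetical and made up for this example; they are not taken from the Census Income data or from the course materials.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return the most common label in the training set."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up labels for illustration only (assumption: two income classes).
train = ["<=50K", "<=50K", ">50K", "<=50K"]
test = ["<=50K", ">50K"]

guess = majority_baseline(train)
score = accuracy([guess] * len(test), test)
print(guess, score)
```

Any real classifier trained on these datasets should score above this baseline on held-out data; if it does not, the features or the training procedure probably need another look.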

Regression

Time Series

  • Gun-related Deaths in Australia
    • "Deaths from gun-related homicides and suicides and non-gun-related homicides and suicides. Australia: 1915-2004. Source: Neill and Leigh (2007)."
  • Immigration Rates (http://robjhyndman.com/tsdldata/data/immig.dat)
    • "Annual immigration into the United States: thousands. 1820 – 1962. From Kendall & Ord (1990), p.13."
  • Percent of Men with Beards 1866-1911 (http://robjhyndman.com/tsdldata/roberts/beards.dat)
    • "Percent of Men with full beards, 1866 – 1911. Source: Hipel and Mcleod (1994)."
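
The time series above are distributed as plain-text data files. The sketch below shows one way such a file might be read and smoothed, assuming the file holds whitespace-separated numbers, possibly preceded by a descriptive header line. That layout is an assumption, not something this page confirms, so check the actual file before relying on it. The function names and the sample text are hypothetical, and the sample values are made up rather than real observations.

```python
def parse_series(text):
    """Collect numeric values in reading order.

    Any line containing a token that is not a number is treated as a
    header or comment and skipped entirely.
    """
    values = []
    for line in text.splitlines():
        try:
            row = [float(token) for token in line.split()]
        except ValueError:
            continue  # non-numeric line, e.g. a description of the series
        values.extend(row)
    return values


def moving_average(values, window):
    """Simple moving average over consecutive windows of the given size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up sample text for illustration only; not real data.
sample = "Hypothetical header line\n1 2 3\n4 5 6\n"

series = parse_series(sample)
print(moving_average(series, 3))
```

To try this on a downloaded file, read its contents into a string and pass that string to `parse_series` in place of `sample`.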

Clustering