From Noisebridge

Introduction

Decision trees are among the most approachable and fundamental machine-learned labelling (classification) algorithms.

Subject Matter Expert


Requirements

  1. Explain the idea behind a decision tree, including converting a set of decision criteria into a graphical representation
  2. Describe at least three applications of decision trees
  3. Discuss the strengths and weaknesses of decision trees
  4. Discuss the appropriate inputs and outputs for a decision tree
  5. Explain fundamental machine learning concepts relevant to decision trees
    1. Explain the process of discretization of data
    2. Explain the causes of, and problems resulting from, an overfit model
  6. Describe the relationship between decision trees and entropy
    1. Demonstrate an understanding of information-theoretic entropy, including at least three computations by hand
    2. Explain information gain and how it relates to entropy
    3. Explain how entropy guides the learning of a decision tree
  7. Demonstrate basic decision tree creation (all nominal values, no missing values)
    1. Demonstrate the creation of a decision tree by hand on a small dataset
    2. Demonstrate the creation of a decision tree on a larger dataset, using computer tools (off-the-shelf or custom)
    3. Explain the idea of pruning and its motivations
  8. Demonstrate converting a set of criteria into executable code in any programming language, and validate with a test set
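
As a starting point for the entropy and information gain requirements, here is a minimal Python sketch for checking hand computations. It assumes all-nominal attributes and no missing values, as in requirement 7. The function names (`entropy`, `information_gain`) and the toy dataset are hypothetical, chosen only for illustration; this is one possible way to set it up, not a reference implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy from splitting the data on one nominal attribute."""
    # Group the labels by the value each row has for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy dataset (an assumption, not taken from this page).
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

# A greedy tree learner splits on the attribute with the highest gain.
best_attribute = max(rows[0], key=lambda a: information_gain(rows, labels, a))
```

In this toy dataset, "outlook" separates the two labels perfectly while "windy" tells us nothing, so a greedy learner would put "outlook" at the root. Applying the same choice recursively to each resulting subset is one common way to grow the rest of the tree.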

Resources