MeritBadges/DecisionTrees
Introduction
Decision trees are among the most approachable and fundamental machine-learned labelling (classification) algorithms.
Subject Matter Expert
Requirements
- Explain the idea behind a decision tree, including converting a set of decision criteria into a graphical representation
- Describe at least three applications of decision trees
- Discuss the strengths and weaknesses of decision trees
- Discuss the appropriate inputs and outputs for a decision tree
- Explain fundamental machine learning concepts relevant to decision trees
- Explain the process of discretization of data
- Explain the causes of, and problems resulting from, an overfit model
- Describe the relationship between decision trees and entropy
- Demonstrate an understanding of information-theoretic entropy, including at least 3 computations by hand
- Explain information gain and how it relates to entropy
- Explain how entropy guides the learning of a decision tree
- Demonstrate basic decision tree creation (all nominal values, no missing values)
- Demonstrate the creation of a decision tree by hand on a small dataset
- Demonstrate the creation of a decision tree on a larger dataset, using computer tools (off-the-shelf or custom)
- Explain the idea of pruning and its motivations
- Demonstrate converting a set of criteria into executable code in any programming language, and validate with a test set
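As a way to check hand computations for the entropy and information gain requirements, the following is a minimal Python sketch, not a reference implementation. It assumes Shannon entropy measured in bits and all-nominal attributes with no missing values. The function names `entropy` and `information_gain`, and the toy `rows` table with its `outlook`, `windy`, and `play` columns, are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy of the target before a split minus the weighted entropy after it."""
    before = entropy([row[target] for row in rows])
    # Group the target labels by the value each row has for the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weight each group's entropy by the fraction of rows it contains.
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


# Hypothetical toy dataset: all values nominal, none missing.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "yes"},
]

# The attribute with the highest gain is the one a greedy learner would split on first.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, a, "play"))
```

On this toy table, `outlook` separates the labels perfectly while `windy` tells us nothing, so `best` is `"outlook"`. Repeating the same choice on each resulting subset is one common way to grow a tree greedily.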
Resources
- Decision Tree Learning: another overview of decision trees, good for the entropy, gain, and manual creation steps.
- Overview of Decision Trees: an overview of decision trees and their construction, in a fair bit of detail.
- Andrew Moore's Decision Tree Slides, which offer a great review of the motivations and ideas of decision trees, but are a little terse.