Lecture 12: Tree-Based Algorithms

Applied Machine Learning

Volodymyr Kuleshov
Cornell Tech

Part 1: Decision Trees

We are now going to see a different way of defining machine learning models, called decision trees.

Review: Components of A Supervised Machine Learning Problem

At a high level, a supervised machine learning problem has the following structure:

$$ \underbrace{\text{Training Dataset}}_\text{Features + Targets} + \underbrace{\text{Learning Algorithm}}_\text{Model Class + Objective + Optimizer } \to \text{Predictive Model} $$

The UCI Diabetes Dataset

To explain what a decision tree is, we are going to use the UCI Diabetes dataset, which we have worked with earlier.

Let's start by loading this dataset.
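A minimal sketch of loading the data, assuming the version of the diabetes dataset that ships with sklearn:

```python
from sklearn import datasets

# Load the diabetes dataset; as_frame=True returns pandas
# objects with named feature columns.
diabetes = datasets.load_diabetes(as_frame=True)
X, y = diabetes.data, diabetes.target
print(X.shape)  # 442 patients, 10 features
```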

We can also look at the data directly.
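For instance, we can inspect the first few rows with pandas (again assuming sklearn's copy of the dataset):

```python
from sklearn import datasets

# frame holds features and target together in one DataFrame.
diabetes = datasets.load_diabetes(as_frame=True)
df = diabetes.frame
print(df.head())       # first few rows
print(df.describe())   # summary statistics per column
```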

Decision Trees: Intuition

Decision trees are machine learning models that mimic how a human would approach this problem.

  1. We start by picking a feature (e.g., age).
  2. Then we branch on that feature based on its value (e.g., age > 65?).
  3. We may select and branch on one or more additional features (e.g., is the patient a man?).
  4. Finally, we return an output that depends on all the features we've seen (e.g., a man over 65).
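The steps above can be sketched as a hand-built tree of nested if-statements; the features, thresholds, and outputs here are purely illustrative, not learned from data:

```python
def predict_risk(age, sex):
    """A tiny hand-built decision tree: branch on age, then on sex.
    Thresholds and outputs are hypothetical, not learned."""
    if age > 65:                 # step 1-2: pick a feature and branch
        if sex == "male":        # step 3: branch on another feature
            return "high risk"   # step 4: output depends on the path taken
        return "medium risk"
    return "low risk"

print(predict_risk(70, "male"))  # a man over 65
```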

Decision Trees: Example

Let's first see an example on the diabetes dataset.

We will train a decision tree using its implementation in sklearn.
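A minimal sketch of fitting a tree with sklearn's `DecisionTreeClassifier`; since the dataset's disease-progression target is continuous, we binarize it at a hypothetical cutoff of 150 to obtain a classification problem:

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = datasets.load_diabetes(return_X_y=True)
y_high = (y > 150).astype(int)  # hypothetical cutoff for "high progression"

# Fit a shallow tree so the learned rules stay readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y_high)
print(export_text(clf))  # print the learned if/else rules
```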

Decision Rules

Let's now define a decision tree a bit more formally. The first important concept is that of a decision rule: a yes/no question asked of the input, e.g., $r(x) = \mathbb{I}\{\text{age} > 65\}$, which maps a feature vector $x$ to True or False.

Decision Regions

The next important concept is that of a decision region. Applying a sequence of rules along a path from the root of the tree to a leaf carves out a decision region: a subset $R \subseteq \mathcal{X}$ of the feature space on which the tree makes the same prediction.

Decision Trees: Definition

A decision tree is a model $f : \mathcal{X} \to \mathcal{Y}$ of the form $$ f(x) = \sum_{R \in \mathcal{R}} y_R \mathbb{I}\{x \in R\}, $$ where $\mathcal{R}$ is the set of decision regions and $y_R$ is the prediction associated with region $R$.
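As an illustration of this definition, a one-dimensional piecewise-constant model over hypothetical regions can be written out directly:

```python
# Regions R partition the input space [0, 100); each carries a
# constant prediction y_R. Intervals and values are illustrative.
regions = [
    ((0, 30), 0.0),    # y_R for 0 <= x < 30
    ((30, 65), 0.5),   # y_R for 30 <= x < 65
    ((65, 100), 1.0),  # y_R for 65 <= x < 100
]

def f(x):
    # f(x) = sum over regions of y_R * indicator(x in R);
    # exactly one indicator is nonzero since the regions partition X.
    return sum(y_R * (lo <= x < hi) for (lo, hi), y_R in regions)

print(f(70))  # x = 70 falls in [65, 100)
```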

We can also illustrate decision trees via this figure from Hastie et al.