Optional Lab: Decision Trees
Lab Overview
This notebook visualizes how decision trees split using information gain, implementing the core concepts from the lectures with hands-on code examples.
Dataset and Encoding
Cat Classification Dataset
10 training examples, each with three categorical features:
| Animal | Ear Shape | Face Shape | Whiskers | Cat |
| --- | --- | --- | --- | --- |
| 0 | Pointy | Round | Present | 1 |
| 1 | Floppy | Not Round | Present | 1 |
| 2 | Floppy | Round | Absent | 0 |
| … | … | … | … | … |
One-Hot Encoding Applied
Binary representation of categorical features:
- Ear Shape: Pointy = 1, Floppy = 0
- Face Shape: Round = 1, Not Round = 0
- Whiskers: Present = 1, Absent = 0
Result: X_train contains the 10 examples encoded as 3 binary features each; y_train contains the 10 labels (1 = cat, 0 = not cat)
import numpy as np

X_train = np.array([[1, 1, 1],   # Pointy, Round, Present
                    [0, 0, 1],   # Floppy, Not Round, Present
                    [0, 1, 0],   # Floppy, Round, Absent
                    # ... more examples
                    ])
y_train = np.array([1, 1, 0, 0, 1, 1, 0, 1, 0, 0])
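For reference, the binary columns above can be produced from the raw categorical values with simple lookup dictionaries. This is an illustrative sketch only; the names ear_map, face_map, whisker_map, raw_features, and X_encoded are not part of the lab:

# Illustrative encoding of the three examples shown in the table above
ear_map = {'Pointy': 1, 'Floppy': 0}
face_map = {'Round': 1, 'Not Round': 0}
whisker_map = {'Present': 1, 'Absent': 0}

raw_features = [('Pointy', 'Round', 'Present'),     # example 0
                ('Floppy', 'Not Round', 'Present'), # example 1
                ('Floppy', 'Round', 'Absent')]      # example 2

X_encoded = np.array([[ear_map[e], face_map[f], whisker_map[w]]
                      for e, f, w in raw_features])
print(X_encoded)  # matches the first three rows of X_train above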
Entropy Implementation
Entropy Function
Key implementation from lecture material:
def entropy(p):
    """Calculate entropy for probability p"""
    if p == 0 or p == 1:
        return 0
    else:
        return -p * np.log2(p) - (1-p) * np.log2(1-p)

# Example: root node entropy
print(entropy(0.5))  # Output: 1.0 (maximum impurity)
Entropy Behavior Visualization
The interactive plot shows the entropy curve:
- Maximum at p = 0.5: Entropy = 1.0 (highest uncertainty)
- Minimum at the extremes (p = 0 or p = 1): Entropy = 0.0 (complete certainty)
- Shape: Concave (inverted-U) curve peaking at the 50/50 class split, as reproduced in the sketch below
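A minimal sketch that reproduces the curve described above, reusing the entropy function from the previous section and assuming matplotlib is available in the notebook environment (the lab's own plotting helpers may differ):

import numpy as np
import matplotlib.pyplot as plt

# Evaluate the entropy function defined above on a grid of probabilities
p_values = np.linspace(0, 1, 101)
h_values = [entropy(p) for p in p_values]

plt.plot(p_values, h_values)
plt.xlabel("p (fraction of positive examples)")
plt.ylabel("H(p)")
plt.title("Entropy is 0 at p = 0 and p = 1, and peaks at 1.0 at p = 0.5")
plt.show()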
Information Gain Calculation
Helper Functions
Split Indices Function:
def split_indices(X, index_feature):
    """Split dataset based on feature value"""
    left_indices = []    # feature = 1
    right_indices = []   # feature = 0
    for i, x in enumerate(X):
        if x[index_feature] == 1:
            left_indices.append(i)
        else:
            right_indices.append(i)
    return left_indices, right_indices
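For example, applying split_indices to just the three examples written out explicitly in X_train above (feature 0 is ear shape); X_small here is only those three rows, not the lab's full dataset:

X_small = np.array([[1, 1, 1],   # Pointy, Round, Present
                    [0, 0, 1],   # Floppy, Not Round, Present
                    [0, 1, 0]])  # Floppy, Round, Absent

left, right = split_indices(X_small, 0)  # split on Ear Shape
print(left, right)  # [0] [1, 2] -> example 0 has pointy ears; examples 1 and 2 have floppy ears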
Weighted Entropy Function:
def weighted_entropy(X, y, left_indices, right_indices):
    """Calculate weighted entropy after split"""
    w_left = len(left_indices) / len(X)
    w_right = len(right_indices) / len(X)
    p_left = sum(y[left_indices]) / len(left_indices)
    p_right = sum(y[right_indices]) / len(right_indices)
    return w_left * entropy(p_left) + w_right * entropy(p_right)
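Continuing the small illustration above with the corresponding labels from the table (1, 1, 0), the weighted entropy of the ear-shape split can be worked out by hand; these numbers describe only the three-example subset, not the full dataset:

y_small = np.array([1, 1, 0])            # labels of the three examples above
left, right = split_indices(X_small, 0)  # left = [0], right = [1, 2]

# w_left = 1/3, w_right = 2/3, p_left = 1.0, p_right = 0.5
# weighted entropy = (1/3) * H(1.0) + (2/3) * H(0.5) = 0 + 2/3 ≈ 0.67
print(weighted_entropy(X_small, y_small, left, right))  # ≈ 0.67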
Information Gain Implementation
def information_gain(X, y, left_indices, right_indices):
    """Calculate information gain from split"""
    p_node = sum(y) / len(y)
    h_node = entropy(p_node)
    w_entropy = weighted_entropy(X, y, left_indices, right_indices)
    return h_node - w_entropy
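On that same three-example subset, the gain of the ear-shape split is the node entropy H(2/3) ≈ 0.92 minus the weighted entropy ≈ 0.67 computed above (illustration only; the full-dataset results follow in the next section):

# p_node = 2/3, so H(p_node) ≈ 0.918, and 0.918 - 0.667 ≈ 0.25
print(information_gain(X_small, y_small, left, right))  # ≈ 0.25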
Feature Comparison Results
Root Node Analysis
Starting conditions:
- 10 examples total: 5 cats, 5 dogs
- Root entropy: H(0.5) = 1.0
- Goal: Find the feature with the highest information gain (the root entropy is checked in the snippet below)
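A quick check of the starting conditions, reusing y_train and entropy() from earlier in the lab:

# 5 of the 10 labels in y_train are 1 (cat), so the root node is a 50/50 split
p_root = sum(y_train) / len(y_train)
print(p_root)           # 0.5
print(entropy(p_root))  # 1.0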
Information Gain Calculation
for i, feature_name in enumerate(['Ear Shape', 'Face Shape', 'Whiskers']):
    left_indices, right_indices = split_indices(X_train, i)
    i_gain = information_gain(X_train, y_train, left_indices, right_indices)
    print(f"Feature: {feature_name}, Information Gain: {i_gain:.2f}")

# Expected output:
# Feature: Ear Shape, Information Gain: 0.28
# Feature: Face Shape, Information Gain: 0.03
# Feature: Whiskers, Information Gain: 0.12
Best Split Selection
Result: Ear Shape provides the highest information gain (0.28)
Decision: Split the root node on the ear shape feature
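The same selection can be made programmatically. A minimal sketch, assuming the full 10-example X_train from the lab (the array shown earlier elides some rows); the gains list and the np.argmax call here are illustrative, not necessarily the lab's exact code:

feature_names = ['Ear Shape', 'Face Shape', 'Whiskers']
gains = []
for i in range(len(feature_names)):
    left_indices, right_indices = split_indices(X_train, i)
    gains.append(information_gain(X_train, y_train, left_indices, right_indices))

best = int(np.argmax(gains))
print(f"Split on: {feature_names[best]}")  # Split on: Ear Shape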
Tree Visualization
Single-Level Tree
After the first split on ear shape:
- Left branch: Examples with pointy ears
- Right branch: Examples with floppy ears
- Visualization: Interactive tree diagram showing split
Complete Tree Construction
The recursive process continues (a code sketch follows the list below):
- Maximum depth: Set to 2 for demonstration
- Stopping criteria: Pure nodes or depth limit
- Final tree: Multiple levels with leaf predictions
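As a rough sketch of that recursion, reusing the helper functions above and assuming the full 10-example X_train and y_train. This is simplified and is not the lab's actual tree-building or plotting utilities; the build_tree name, its printed output format, and the zero-gain stopping rule are choices made here for illustration:

def build_tree(X, y, indices, max_depth, depth=0, branch="Root"):
    """Recursively split the examples in `indices`, printing the chosen split at each node."""
    p = sum(y[indices]) / len(indices)

    # Stopping criteria: pure node or depth limit reached
    if p == 0 or p == 1 or depth == max_depth:
        print("  " * depth + f"{branch} leaf: fraction of cats = {p:.2f}")
        return

    # Evaluate every feature on this subset; skip features that are constant here
    gains = []
    for i in range(X.shape[1]):
        left, right = split_indices(X[indices], i)
        if len(left) == 0 or len(right) == 0:
            gains.append(0.0)
        else:
            gains.append(information_gain(X[indices], y[indices], left, right))

    best = int(np.argmax(gains))
    if gains[best] == 0.0:
        # No feature improves purity: stop and make a leaf
        print("  " * depth + f"{branch} leaf: fraction of cats = {p:.2f}")
        return

    left, right = split_indices(X[indices], best)
    print("  " * depth + f"{branch} node: split on feature {best}")
    # Map positions within the subset back to indices into the full dataset
    build_tree(X, y, [indices[k] for k in left], max_depth, depth + 1, "Left")
    build_tree(X, y, [indices[k] for k in right], max_depth, depth + 1, "Right")

# Depth-2 demonstration tree, matching the lab's depth limit
build_tree(X_train, y_train, list(range(len(X_train))), max_depth=2)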
Key Lab Insights
Algorithmic Process
- Start at root: All examples together
- Calculate gains: For each possible feature
- Select best: Feature with highest information gain
- Split data: According to selected feature
- Recurse: Repeat process on subsets until stopping criteria met
Mathematical Implementation
- Entropy: Measures impurity using the log₂ formula (written out after this list)
- Information gain: Reduction in entropy from splitting
- Weighted averages: Account for different branch sizes
- Optimization: Choose splits that maximize purity
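In symbols, the quantities listed above, consistent with the entropy, weighted_entropy, and information_gain code earlier in the lab:

$$H(p) = -p\,\log_2(p) - (1-p)\,\log_2(1-p)$$

$$\text{Information gain} = H(p_{\text{node}}) - \big(w_{\text{left}}\,H(p_{\text{left}}) + w_{\text{right}}\,H(p_{\text{right}})\big)$$

where $w_{\text{left}}$ and $w_{\text{right}}$ are the fractions of the node's examples sent to each branch, and each $p$ is the fraction of cats in the corresponding group.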
Practical Understanding
- Feature comparison: Systematic evaluation of all options
- Pure nodes: Natural stopping points (entropy = 0)
- Tree depth: Controls model complexity
- Recursive structure: Build large trees from smaller subtrees
The lab provides hands-on experience with the mathematical foundations of decision trees, making the algorithmic concepts concrete through implementation and visualization.