Single Tree
High Variance
- Sensitive to data changes
- Prone to overfitting
- Unreliable predictions
Single decision trees are highly sensitive to small changes in training data, leading to poor robustness.
Original Dataset: 10 examples with specific features
Modified Dataset: Change just ONE example
Tree Ensemble: Collection of multiple decision trees that vote on final predictions
Benefits:
Example Prediction: Animal with pointy ears, not round face, whiskers present
Individual Tree Problems:
Ensemble Solutions:
Single Tree
High Variance
Tree Ensemble
Low Variance
Question: How do you create multiple plausible but different decision trees?
Requirements for Good Ensemble:
Sampling with Replacement: Statistical technique to create different training sets
Classification: Each tree votes for a class, majority wins Regression: Average the predictions from all trees
For new test example:1. Pass example through Tree 1 → get prediction 12. Pass example through Tree 2 → get prediction 23. Pass example through Tree 3 → get prediction 34. Combine predictions using voting/averaging5. Output final ensemble prediction
Tree ensembles fundamentally solve the robustness problem of single decision trees by leveraging the wisdom of crowds - multiple diverse trees working together produce more reliable and accurate predictions than any individual tree could achieve alone.