Pablo Rodriguez

Tree Ensembles Quiz

For the random forest, how do you build each individual tree so that they are not all identical to each other?

  • Sample the training data without replacement
  • Sample the training data with replacement and select a random subset of features to build each tree ✓
  • Train the algorithm multiple times on the same training set. This will naturally result in different trees.
  • If you are training B trees, train each one on 1/B of the training set, so each tree is trained on a distinct set of examples.

Answer Location: Found in Section 14: Random Forest uses “sampling with replacement to create new training set” and “at every node when choosing a feature to use to split… pick a random subset of K less than N features.”
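The two sources of randomness quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the helper names `bootstrap_sample` and `random_feature_subset`, the stand-in training set of 10 example indices, and the values `n_features = 16` and `k = 4` are all made up for the example.

```python
import random

def bootstrap_sample(examples, rng):
    """Draw len(examples) examples with replacement (hypothetical helper)."""
    n = len(examples)
    return [examples[rng.randrange(n)] for _ in range(n)]

def random_feature_subset(num_features, k, rng):
    """Pick k distinct feature indices out of num_features, with k < num_features."""
    return sorted(rng.sample(range(num_features), k))

rng = random.Random(0)
training_set = list(range(10))   # stand-in examples, identified by index
num_trees = 3                    # B, the number of trees (assumed value)
n_features, k = 16, 4            # N features, K allowed per split (assumed values)

for b in range(num_trees):
    # Each tree gets its own resampled training set, so duplicates and omissions differ per tree.
    sample = bootstrap_sample(training_set, rng)
    # Candidate features for one split; a real tree would redraw this subset at every node.
    candidates = random_feature_subset(n_features, k, rng)
    print(f"tree {b}: examples={sample} split candidates={candidates}")
```

Because every tree sees a different resampled training set and a different set of candidate features at each split, the trees end up different from one another even though they come from the same original data.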

You are choosing between a decision tree and a neural network for a classification task where the input x is a 100x100 resolution image. Which would you choose?

  • A decision tree, because the input is unstructured and decision trees typically work better with unstructured data.
  • A neural network, because the input is structured data and neural networks typically work better with structured data.
  • A neural network, because the input is unstructured data and neural networks typically work better with unstructured data. ✓
  • A decision tree, because the input is structured data and decision trees typically work better with structured data.

Answer Location: Found in Section 16: “I will not recommend using decision trees and tree ensembles on unstructured data. That’s data such as images, video, audio, and texts” while “Neural networks…tend to work better for unstructured data task.”

What does sampling with replacement refer to?

  • Drawing a sequence of examples where, before picking each next example, all previously drawn examples are first put back into the set we are picking from. ✓
  • It refers to using a new sample of data that we use to permanently overwrite (that is, to replace) the original data.
  • Drawing a sequence of examples where, before picking each next example, all previously drawn examples are first removed from the set we are picking from.
  • It refers to a process of making an identical copy of the training set.

Answer Location: Found in Section 13: “The term with replacement means that if I take out the next token, I’m going to take this, and put it back in, and shake it up again, and then take on another one.”
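The token analogy can be tried directly with Python's standard `random` module. This is a minimal sketch: the four token colors and the fixed seed are assumptions chosen for illustration.

```python
import random

tokens = ["red", "yellow", "green", "blue"]  # illustrative bag of tokens
rng = random.Random(0)

# With replacement: each token goes back in the bag before the next draw,
# so every draw sees all four tokens and repeats are possible.
with_replacement = [rng.choice(tokens) for _ in range(len(tokens))]

# Without replacement: a drawn token stays out of the bag,
# so four draws always return each token exactly once.
without_replacement = rng.sample(tokens, k=len(tokens))

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

Drawing without replacement as many times as there are tokens always returns the same set, only reordered, which is why it cannot by itself produce a genuinely different training set; drawing with replacement can.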