
Practice Lab: Advice for Applying Machine Learning


This lab explores techniques for evaluating and improving machine learning models, using polynomial regression and neural networks as case studies. In it, you will:

  • Split datasets into training, cross-validation, and test sets
  • Calculate evaluation metrics for model performance
  • Diagnose bias and variance problems
  • Apply techniques to improve model performance
  • Understand model complexity trade-offs

1. Evaluating Learning Algorithms (Polynomial Regression)


Dataset Splitting (sketched below):

  • Training set: 60% - tune model parameters w and b
  • Cross-validation set: 20% - tune hyperparameters like polynomial degree, regularization
  • Test set: 20% - evaluate final performance on new data
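
A minimal sketch of the 60/20/20 split, assuming scikit-learn's train_test_split and placeholder arrays X and y (the lab provides its own data and may use different utilities):

split_sketch.py
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data for illustration; the lab supplies its own X and y.
X = np.arange(100, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + np.random.randn(100)

# First carve off 40% for CV + test, then split that portion 50/50,
# yielding 60% train / 20% cross-validation / 20% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.40, random_state=1)
X_cv, X_test, y_cv, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=1)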

Key Function - Error Calculation:

eval_mse.py
# EXERCISE 1: Implement MSE calculation
def eval_mse(y, yhat):
    """
    Calculate the mean squared error on a data set.
    Args:
      y    : target values
      yhat : predicted values
    Returns:
      err  : mean squared error
    """
    m = len(y)
    err = 0.0
    for i in range(m):
        # STUDENT IMPLEMENTATION REQUIRED
        err_i = (yhat[i] - y[i])**2   # square the error for example i
        err += err_i                  # accumulate squared errors
    err = err / (2*m)                 # mean with the conventional 1/2 factor
    return err
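
As a quick sanity check, eval_mse on a small hand-worked example (values chosen for illustration):

import numpy as np

y_true = np.array([2.0, 4.0, 6.0])
y_pred = np.array([2.5, 4.0, 5.0])
# Squared errors: 0.25, 0.0, 1.0 -> sum 1.25; 1.25 / (2*3) ≈ 0.2083
print(eval_mse(y_true, y_pred))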

High Degree Polynomial Results:

  • Training error: Very low (overfitting training data)
  • Test error: Much higher (poor generalization)
  • Conclusion: High variance problem

Systematic Testing (sketched below):

  • Test polynomial degrees 1 through 10
  • Plot training vs cross-validation error
  • Identify optimal degree where CV error is minimized
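
A sketch of the degree sweep, assuming scikit-learn's PolynomialFeatures and LinearRegression plus the split from the earlier sketch (the lab's own utilities differ in detail):

degree_sweep_sketch.py
import numpy as np
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression

train_errs, cv_errs = [], []
for degree in range(1, 11):                      # polynomial degrees 1..10
    poly = PolynomialFeatures(degree, include_bias=False)
    scaler = StandardScaler()
    X_train_p = scaler.fit_transform(poly.fit_transform(X_train))
    X_cv_p = scaler.transform(poly.transform(X_cv))

    lin = LinearRegression().fit(X_train_p, y_train)
    train_errs.append(eval_mse(y_train, lin.predict(X_train_p)))
    cv_errs.append(eval_mse(y_cv, lin.predict(X_cv_p)))

best_degree = int(np.argmin(cv_errs)) + 1        # degree minimizing CV error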

Lambda Parameter Testing (sketched below):

  • Test range: [0.0, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10, 100]
  • High λ → High bias (underfitting)
  • Low λ → High variance (overfitting)
  • Find optimal λ that minimizes cross-validation error
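
A sketch of the λ sweep, using Ridge regression as the regularized linear model (an assumption; Ridge's alpha plays the role of λ) and reusing features expanded at the chosen degree from the previous sketch:

lambda_sweep_sketch.py
import numpy as np
from sklearn.linear_model import Ridge

lambdas = [0.0, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10, 100]
cv_errs_lambda = []
for lam in lambdas:
    # alpha acts as lambda; X_train_p / X_cv_p are the expanded features.
    ridge = Ridge(alpha=lam).fit(X_train_p, y_train)
    cv_errs_lambda.append(eval_mse(y_cv, ridge.predict(X_cv_p)))

best_lambda = lambdas[int(np.argmin(cv_errs_lambda))]   # lambda minimizing CV error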

Training Set Size Impact (learning curve sketched below):

  • As training examples increase → training error increases
  • As training examples increase → CV error typically decreases
  • High variance: More data helps significantly
  • High bias: More data provides limited improvement
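
A learning-curve sketch that makes this concrete: train on growing subsets of the training data and track both errors (subset step size is illustrative; continues from the sketches above):

learning_curve_sketch.py
from sklearn.linear_model import LinearRegression

sizes = range(10, len(X_train_p) + 1, 10)
train_errs_m, cv_errs_m = [], []
for m in sizes:
    lin = LinearRegression().fit(X_train_p[:m], y_train[:m])
    train_errs_m.append(eval_mse(y_train[:m], lin.predict(X_train_p[:m])))
    cv_errs_m.append(eval_mse(y_cv, lin.predict(X_cv_p)))
# A large, persistent gap between the two curves suggests high variance;
# two high curves that converge suggest high bias.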

Classification Error Function:

eval_cat_err.py
# EXERCISE 2: Implement classification error
def eval_cat_err(y, yhat):
    """
    Calculate the categorization error.
    Args:
      y    : target class indices
      yhat : predicted class indices
    Returns:
      cerr : classification error rate
    """
    m = len(y)
    incorrect = 0
    for i in range(m):
        # STUDENT IMPLEMENTATION REQUIRED
        if yhat[i] != y[i]:       # prediction does not match target
            incorrect += 1        # count incorrect predictions
    cerr = incorrect / m          # fraction of misclassified examples
    return cerr
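
A quick check of eval_cat_err on class indices (values chosen for illustration):

import numpy as np

y_true = np.array([0, 1, 2, 1])
y_pred = np.array([0, 2, 2, 1])
# One of four predictions is wrong -> error rate 0.25
print(eval_cat_err(y_true, y_pred))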

Complex Model Architecture:

complex_model.py
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.optimizers import Adam

# EXERCISE 3: Build complex neural network
model = Sequential([
    # STUDENT IMPLEMENTATION REQUIRED
    Dense(120, activation='relu', name="L1"),   # large hidden layer
    Dense(40, activation='relu', name="L2"),    # medium hidden layer
    Dense(6, activation='linear', name="L3")    # output layer (6 classes)
], name="Complex")
model.compile(
    # STUDENT IMPLEMENTATION REQUIRED
    loss=SparseCategoricalCrossentropy(from_logits=True),
    optimizer=Adam(0.01),
)
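
A sketch of training the model and scoring it with eval_cat_err (here X_train, y_train, X_cv, y_cv are the lab's classification split, not the regression data above; the epoch count is illustrative):

import numpy as np

model.fit(X_train, y_train, epochs=100, verbose=0)

# With from_logits=True the network outputs raw scores, so take the
# argmax over the 6 logits to recover predicted class indices.
yhat_train = np.argmax(model.predict(X_train), axis=1)
yhat_cv = np.argmax(model.predict(X_cv), axis=1)
print("train err:", eval_cat_err(y_train, yhat_train))
print("cv err:   ", eval_cat_err(y_cv, yhat_cv))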

Simple Model Architecture:

simple_model.py
# EXERCISE 4: Build simple neural network (imports as in complex_model.py)
model_s = Sequential([
    # STUDENT IMPLEMENTATION REQUIRED
    Dense(6, activation='relu', name="L1"),     # small hidden layer
    Dense(6, activation='linear', name="L2")    # output layer (6 classes)
], name="Simple")
model_s.compile(
    # STUDENT IMPLEMENTATION REQUIRED
    loss=SparseCategoricalCrossentropy(from_logits=True),
    optimizer=Adam(0.01),
)

Regularized Complex Model:

regularized_model.py
import tensorflow as tf  # other imports as in complex_model.py

# EXERCISE 5: Add regularization to complex model
model_r = Sequential([
    # STUDENT IMPLEMENTATION REQUIRED
    Dense(120, activation='relu',
          kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L1"),
    Dense(40, activation='relu',
          kernel_regularizer=tf.keras.regularizers.l2(0.1), name="L2"),
    Dense(6, activation='linear', name="L3")
], name="ComplexRegularized")
model_r.compile(
    # STUDENT IMPLEMENTATION REQUIRED
    loss=SparseCategoricalCrossentropy(from_logits=True),
    optimizer=Adam(0.01),
)

Model Comparison Results:

  • Complex model: Low training error, high CV error (overfitting)
  • Simple model: Moderate training error, reasonable CV error
  • Regularized model: Balanced performance, similar to the “ideal” model

Regularization (λ) Tuning Results:

  • λ = 0.0: High variance (overfitting)
  • λ = 0.01–0.05: Good balance
  • λ > 0.1: High bias (underfitting)

Key Takeaways:

  1. Data splitting enables differentiation between underfitting and overfitting
  2. Three-way splits allow hyperparameter tuning on the CV set and unbiased evaluation on the test set
  3. Bias/variance analysis shows whether a model needs more complexity or more data
  4. Regularization can help complex models generalize better without sacrificing capacity

The lab demonstrates systematic approaches to diagnosing and improving machine learning models through proper evaluation techniques and architectural decisions.