When working with the full linear regression model f(x) = wx + b, plotting the cost function J(w,b) over both parameters w and b produces a 3D surface.
3D Surface Plot Characteristics
Shape: Bowl-like surface (similar to soup bowl, dinner plate, or hammock)
Axes: w and b as horizontal axes, J(w,b) as vertical axis
Points: Each (w,b) combination corresponds to a point on the surface
Height: Represents the cost value for those parameters
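As a minimal sketch of how such a surface can be produced, the snippet below evaluates a squared-error cost J(w,b) over a grid of parameter values and draws it with matplotlib. The training data, grid ranges, and variable names are illustrative assumptions, not values from the original lecture.

```python
import numpy as np
import matplotlib.pyplot as plt

# Tiny illustrative training set (assumed values)
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.5, 4.5, 6.5, 8.5])   # roughly y = 2x + 0.5

def compute_cost(w, b, x, y):
    """Squared error cost J(w,b) = (1/2m) * sum((w*x + b - y)^2)."""
    m = x.shape[0]
    return np.sum((w * x + b - y) ** 2) / (2 * m)

# Evaluate J(w,b) on a grid of parameter values
w_vals = np.linspace(-2, 6, 100)
b_vals = np.linspace(-4, 6, 100)
W, B = np.meshgrid(w_vals, b_vals)
J = np.array([[compute_cost(w, b, x_train, y_train) for w in w_vals] for b in b_vals])

# Bowl-shaped 3D surface: w and b on the horizontal axes, J(w,b) on the vertical axis
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(W, B, J, cmap="viridis")
ax.set_xlabel("w"); ax.set_ylabel("b"); ax.set_zlabel("J(w,b)")
plt.show()
```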
From the 3D Surface to the 2D Contour Plot
Process: Take horizontal slices of the 3D bowl
Result: Each slice becomes an oval/ellipse on the 2D plot
Meaning: Points on same contour have identical cost values
Horizontal axis: Parameter w
Vertical axis: Parameter b
Contour lines: Ovals showing equal cost values
Center: Global minimum at center of smallest oval
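The same grid of cost values can also be rendered as a contour plot. This is a sketch under the same assumptions (toy data, illustrative ranges); each drawn level corresponds to one horizontal slice of the bowl.

```python
import numpy as np
import matplotlib.pyplot as plt

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.5, 4.5, 6.5, 8.5])

def compute_cost(w, b, x, y):
    m = x.shape[0]
    return np.sum((w * x + b - y) ** 2) / (2 * m)

w_vals = np.linspace(-2, 6, 200)
b_vals = np.linspace(-4, 6, 200)
W, B = np.meshgrid(w_vals, b_vals)
J = np.array([[compute_cost(w, b, x_train, y_train) for w in w_vals] for b in b_vals])

# Each contour line (oval) connects (w, b) pairs with identical cost;
# the global minimum sits at the center of the smallest oval.
plt.contour(W, B, J, levels=20)
plt.xlabel("w")   # horizontal axis: parameter w
plt.ylabel("b")   # vertical axis: parameter b
plt.title("Contour plot of J(w,b)")
plt.show()
```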
Why These Visualizations Are Useful
Parameter selection: Identify regions of good performance
Algorithm behavior: See how optimization algorithms navigate the cost landscape
Model improvement: Understand relationship between parameter changes and performance
Both 3D surface plots and 2D contour plots represent the same mathematical relationship between parameters (w,b) and cost J(w,b), providing different perspectives for understanding how to find optimal model parameters.
Example: Poorly Chosen Parameters
Result: Line that’s further from optimal compared to previous examples
Cost position: Even further from the minimum
Performance: Worse fit than the previous two examples
Example: Near-Optimal Parameters
Function: f(x) with parameters close to optimal
Performance: Pretty good fit to the training set
Cost location: Very close to center of smallest ellipse (near global minimum)
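To make the comparison concrete, here is a small sketch that evaluates the cost for one poorly chosen and one near-optimal parameter pair. The specific numbers are illustrative assumptions, not the values used in the lecture examples.

```python
import numpy as np

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.5, 4.5, 6.5, 8.5])   # roughly y = 2x + 0.5

def compute_cost(w, b, x, y):
    m = x.shape[0]
    return np.sum((w * x + b - y) ** 2) / (2 * m)

# A poorly chosen line vs. one close to the underlying trend (assumed values)
poor = (-0.5, 5.0)         # lands far from the center of the contour ovals
near_optimal = (2.0, 0.5)  # lands very close to the global minimum

for label, (w, b) in [("poor", poor), ("near-optimal", near_optimal)]:
    print(f"{label:>12}: w={w}, b={b}, J(w,b)={compute_cost(w, b, x_train, y_train):.3f}")
```

The poorly chosen pair produces a much larger cost, which is exactly what its position far from the center of the smallest oval indicates.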
How the Parameters Shape the Line
w (slope): Controls line steepness and direction
b (y-intercept): Controls where line crosses vertical axis
Combined effect: Together determine line position and orientation
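A quick numeric illustration of these roles; the parameter values are arbitrary.

```python
# How w and b shape the line f(x) = w*x + b (illustrative values)
def f(x, w, b):
    return w * x + b

x = 3.0
print(f(x, w=2.0, b=1.0))   # 7.0  -> steeper line, crosses the y-axis at 1
print(f(x, w=0.5, b=1.0))   # 2.5  -> shallower line, same y-intercept
print(f(x, w=2.0, b=4.0))   # 10.0 -> same slope, shifted up (crosses the y-axis at 4)
```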
Assessing How Well a Line Fits
Visual inspection: How closely the line follows the data points
Error distances: Vertical gaps between points and line
Cost value: Mathematical measure of overall fit quality
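The error distances and the resulting cost can be computed directly. This sketch uses the same assumed toy data and an arbitrary parameter choice.

```python
import numpy as np

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.5, 4.5, 6.5, 8.5])

w, b = 1.5, 1.0                      # illustrative parameter choice
predictions = w * x_train + b
errors = predictions - y_train       # vertical gaps between the line and each point
cost = np.sum(errors ** 2) / (2 * len(x_train))

print("errors:", errors)             # [ 0.  -0.5 -1.  -1.5]
print("cost J(w,b):", cost)          # 0.4375
```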
Next Steps
Need for an efficient algorithm: Automatically find the optimal w and b
Gradient descent: Algorithm that systematically finds the minimum
Next topic: Learn how gradient descent navigates the cost landscape
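As a preview, a bare-bones gradient descent sketch for this cost might look as follows. The data, learning rate, and iteration count are assumptions for illustration, not the course's exact setup.

```python
import numpy as np

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.5, 4.5, 6.5, 8.5])
m = len(x_train)

w, b = 0.0, 0.0            # arbitrary starting point on the cost surface
alpha = 0.05               # learning rate (assumed)

for _ in range(1000):
    errors = w * x_train + b - y_train
    # Partial derivatives of J(w,b) = (1/2m) * sum((w*x + b - y)^2)
    dj_dw = np.sum(errors * x_train) / m
    dj_db = np.sum(errors) / m
    # Step downhill on the bowl-shaped surface
    w -= alpha * dj_dw
    b -= alpha * dj_db

print(w, b)                # approaches roughly w = 2, b = 0.5, the center of the contours
```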
Understanding these examples shows why systematic optimization is necessary - manual parameter selection is inefficient and unlikely to find the true optimum.