Linear regression demonstrates the overall supervised learning workflow. The concepts learned here apply to many other machine learning models throughout the specialization.
Goal : Predict house prices based on house size using a dataset from Portland.
Data Structure :
X-axis : House size (square feet)
Y-axis : House price (thousands of dollars)
Data points : Each cross represents a house with its size and sale price
A client asks: “How much do you think I can get for this house?”
Process :
Measure the house size (e.g., 1250 square feet)
Use linear regression to fit a straight line to the data
Find where the house size intersects the best-fit line
Read the corresponding price prediction (approximately $220,000)
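The four steps above can be sketched in a few lines of NumPy. The dataset here is made up for illustration (the real Portland set is not reproduced in these notes), and `np.polyfit` stands in for the learning algorithm that fits the straight line:

```python
import numpy as np

# Hypothetical training data: house sizes (sq ft) and sale prices ($1000s)
sizes = np.array([1000, 1500, 2000, 2104, 2500], dtype=float)
prices = np.array([200, 280, 370, 400, 460], dtype=float)

# Fit a straight line (degree-1 polynomial) to the data
w, b = np.polyfit(sizes, prices, deg=1)

# Find where the 1,250 sq ft house falls on the best-fit line
predicted = w * 1250 + b
print(f"Predicted price: ${predicted:.0f}k")
```

With the invented data above the prediction lands in the low $200k range; the exact value depends entirely on the training points used.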
Why This is Supervised Learning
Training data includes “right answers” : Both house sizes (inputs) and actual sale prices (outputs)
Learning from examples : Model learns from houses with known size-price pairs
Prediction capability : Trained model can predict prices for new houses
Regression
Output : Predicts numbers (like $220,000, 1.5, -33.2)
Range : Any of infinitely many possible numbers
Example : Linear regression for house prices
Classification
Output : Predicts categories or discrete classes
Examples : Cat vs. dog, disease diagnosis, email spam detection
Range : Limited set of possible categories
Plot shows relationship between size and price
Each data point represents one house sale
Visual pattern suggests linear relationship
Columns : Size (input feature) and Price (output target)
Rows : Individual training examples
Example : First row shows 2,104 sq ft house sold for $400,000
Corresponds to specific points on the graph
Machine Learning Notation
x : Input variable (feature) - house size
y : Output variable (target) - house price
m : Number of training examples (47 in this dataset)
(x, y) : Single training example pair
(x^(i), y^(i)) : The i-th training example (superscript is index, not exponentiation)
x^(1) = 2,104 : Size of first house
y^(1) = 400 : Price of first house (in thousands)
m = 47 : Total number of houses in dataset
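The notation maps directly onto arrays in code. Only the first training example (2,104 sq ft, $400k) comes from the notes; the remaining rows below are invented to fill out the sketch. Note that Python indexes from 0, so x^(1) corresponds to index 0:

```python
import numpy as np

# Training set (first row from the notes; the rest are made-up examples)
x_train = np.array([2104, 1600, 1420, 852], dtype=float)  # sizes in sq ft
y_train = np.array([400, 330, 250, 178], dtype=float)     # prices in $1000s

m = x_train.shape[0]                 # m : number of training examples
x_1, y_1 = x_train[0], y_train[0]    # (x^(1), y^(1)) : first training example
print(f"m = {m}, x^(1) = {x_1}, y^(1) = {y_1}")
```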
This notation provides a standardized way to discuss machine learning concepts and will be used consistently throughout the specialization; it becomes more familiar with practice.
Linear regression serves as a foundation because:
Demonstrates core supervised learning principles
Uses concepts applicable to complex models
Provides intuitive understanding of prediction
Shows relationship between input features and target outputs
Understanding linear regression prepares you for more sophisticated algorithms while establishing essential machine learning vocabulary and concepts.
The supervised learning process involves taking a training set and producing a function that can make predictions on new data.
Training Set Components
Input Features : Size of house (x)
Output Targets : Price of house (y)
“Right Answers” : The actual prices the model learns from
Feed both input features (x) and output targets (y) to the learning algorithm
These represent examples with known correct answers
Algorithm produces a function f
Historically called a “hypothesis,” but referred to as “function f” in this course
Function represents the learned model
Function f takes new input x (without known output)
Produces estimate/prediction called ŷ (y-hat)
ŷ represents the predicted value, may or may not equal actual true value
Definition : The model produced by the learning algorithm
Purpose : Takes input and produces predictions
Notation : f(x) or f_w,b(x)
Definition : Input feature or input variable
Example : Size of house in square feet
Role : Information available for making predictions
Definition : Model’s estimated output (y-hat)
Distinction : Different from true value y
Example : Estimated house price vs. actual sale price
For linear regression, the function f is represented as:
f_w,b(x) = wx + b
Where:
w and b : Numbers (parameters) that determine the prediction
Different values of w and b : Create different prediction functions
Alternative notation : f(x) = wx + b (simplified form)
The straight line function f(x) = wx + b makes predictions using a linear relationship:
Takes input x (house size)
Multiplies by w (weight/slope)
Adds b (bias/y-intercept)
Produces prediction ŷ (estimated price)
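The four steps above translate into a one-line function. The parameter values w = 0.15 and b = 30 below are illustrative placeholders, not values learned from data:

```python
def predict(x, w, b):
    """Linear model f_w,b(x) = w*x + b: returns the estimated price y-hat."""
    return w * x + b

# Illustrative (not learned) parameter values
w, b = 0.15, 30.0
y_hat = predict(1250, w, b)  # 0.15 * 1250 + 30 = 217.5, i.e. ~$217,500
print(y_hat)
```

Changing w tilts the line (the slope); changing b shifts it up or down (the intercept), which is exactly why different values of w and b create different prediction functions.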
Linear Regression : Uses linear function for predictions
Variations :
Linear regression with one variable : Single input feature (univariate)
Univariate linear regression : “univariate” is a Latin-derived term meaning “one variable” (“uni” = one)
Multiple variable regression : Uses multiple input features (covered later)
In the house price example:
Training : Learn from houses with known sizes and prices
Function : Creates linear relationship between size and price
Prediction : Estimate price for new house based on its size
Practical use : Help real estate agent advise clients on pricing
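The training-then-prediction workflow can be sketched end to end with the closed-form least-squares formulas for w and b (one standard way to fit a line; the course itself introduces gradient descent for this). The dataset is again invented for illustration:

```python
import numpy as np

# Hypothetical training set: sizes (sq ft) and prices ($1000s)
x = np.array([852, 1250, 1534, 2104, 2600], dtype=float)
y = np.array([178, 230, 315, 400, 470], dtype=float)

# Training: closed-form least-squares estimates of w (slope) and b (intercept)
w = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - w * x.mean()

# Prediction: estimate the price of a new, unseen house from its size
new_size = 1800
y_hat = w * new_size + b
print(f"f({new_size}) = {y_hat:.1f} (thousand dollars)")
```

This is the whole supervised loop in miniature: the “right answers” in `y` shape the parameters w and b, and the resulting function then prices houses that were never in the training set.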
The supervised learning process transforms training data into a practical tool for making informed predictions on new, unseen data.