Linear regression demonstrates the overall supervised learning workflow. The concepts learned here apply to many other machine learning models throughout the specialization.
Goal : Predict house prices based on house size using a dataset from Portland.
Data Structure :
X-axis : House size (square feet)
Y-axis : House price (thousands of dollars)
Data points : Each cross represents a house with its size and sale price
A client asks: “How much do you think I can get for this house?”
Process :
Measure the house size (e.g., 1250 square feet)
Use linear regression to fit a straight line to the data
Find where the house size intersects the best-fit line
Read the corresponding price prediction (approximately $220,000)
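The four steps above can be sketched in a few lines of NumPy. The dataset here is made up for illustration (the real Portland set is not reproduced in these notes), and `np.polyfit` stands in for the learning algorithm that fits the straight line:

```python
import numpy as np

# Hypothetical training data: house sizes (sq ft) and sale prices ($1000s)
sizes = np.array([1000, 1500, 2000, 2104, 2500], dtype=float)
prices = np.array([200, 280, 370, 400, 460], dtype=float)

# Fit a straight line (degree-1 polynomial) to the data
w, b = np.polyfit(sizes, prices, deg=1)

# Find where the 1,250 sq ft house falls on the best-fit line
predicted = w * 1250 + b
print(f"Predicted price: ${predicted:.0f}k")
```

With the invented data above the prediction lands in the low $200k range; the exact value depends entirely on the training points used.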
Why This is Supervised Learning
Training data includes “right answers” : Both house sizes (inputs) and actual sale prices (outputs)
Learning from examples : Model learns from houses with known size-price pairs
Prediction capability : Trained model can predict prices for new houses
Regression
Output : Predicts numbers (like $220,000, 1.5, -33.2)
Range : Any of infinitely many possible numbers
Example : Linear regression for house prices
Classification
Output : Predicts categories or discrete classes
Examples : Cat vs. dog, disease diagnosis, email spam detection
Range : Limited set of possible categories
Plot shows relationship between size and price
Each data point represents one house sale
Visual pattern suggests linear relationship
Columns : Size (input feature) and Price (output target)
Rows : Individual training examples
Example : First row shows 2,104 sq ft house sold for $400,000
Corresponds to specific points on the graph
Machine Learning Notation
x : Input variable (feature) - house size
y : Output variable (target) - house price
m : Number of training examples (47 in this dataset)
(x, y) : Single training example pair
(x^(i), y^(i)) : The i-th training example (superscript is index, not exponentiation)
x^(1) = 2,104 : Size of first house
y^(1) = 400 : Price of first house (in thousands)
m = 47 : Total number of houses in dataset
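The notation maps directly onto arrays in code. Only the first training example (2,104 sq ft, $400k) comes from the notes; the remaining rows below are invented to fill out the sketch. Note that Python indexes from 0, so x^(1) corresponds to index 0:

```python
import numpy as np

# Training set (first row from the notes; the rest are made-up examples)
x_train = np.array([2104, 1600, 1420, 852], dtype=float)  # sizes in sq ft
y_train = np.array([400, 330, 250, 178], dtype=float)     # prices in $1000s

m = x_train.shape[0]                 # m : number of training examples
x_1, y_1 = x_train[0], y_train[0]    # (x^(1), y^(1)) : first training example
print(f"m = {m}, x^(1) = {x_1}, y^(1) = {y_1}")
```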
This notation provides a standardized way to discuss machine learning concepts and will be used consistently throughout the specialization; it becomes more familiar with practice.
Linear regression serves as a foundation because:
Demonstrates core supervised learning principles
Uses concepts applicable to complex models
Provides intuitive understanding of prediction
Shows relationship between input features and target outputs
Understanding linear regression prepares you for more sophisticated algorithms while establishing essential machine learning vocabulary and concepts.
The supervised learning process involves taking a training set and producing a function that can make predictions on new data.
Training Set Components
Input Features : Size of house (x)
Output Targets : Price of house (y)
“Right Answers” : The actual prices the model learns from
Feed both input features (x) and output targets (y) to the learning algorithm
These represent examples with known correct answers
Algorithm produces a function f
Historically called a “hypothesis,” but referred to as “function f” in this course
Function represents the learned model
Function f takes new input x (without known output)
Produces estimate/prediction called ŷ (y-hat)
ŷ represents the predicted value, may or may not equal actual true value
Definition : The model produced by the learning algorithm
Purpose : Takes input and produces predictions
Notation : f(x) or f_w,b(x)
Definition : Input feature or input variable
Example : Size of house in square feet
Role : Information available for making predictions
Definition : Model’s estimated output (y-hat)
Distinction : Different from true value y
Example : Estimated house price vs. actual sale price
For linear regression, the function f is represented as:
f_w,b(x) = wx + b
Where:
w and b : Numbers (parameters) that determine the prediction
Different values of w and b : Create different prediction functions
Alternative notation : f(x) = wx + b (simplified form)
The straight line function f(x) = wx + b makes predictions using a linear relationship:
Takes input x (house size)
Multiplies by w (weight/slope)
Adds b (bias/y-intercept)
Produces prediction ŷ (estimated price)
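The four steps above translate into a one-line function. The parameter values w = 0.15 and b = 30 below are illustrative placeholders, not values learned from data:

```python
def predict(x, w, b):
    """Linear model f_w,b(x) = w*x + b: returns the estimated price y-hat."""
    return w * x + b

# Illustrative (not learned) parameter values
w, b = 0.15, 30.0
y_hat = predict(1250, w, b)  # 0.15 * 1250 + 30 = 217.5, i.e. ~$217,500
print(y_hat)
```

Changing w tilts the line (the slope); changing b shifts it up or down (the intercept), which is exactly why different values of w and b create different prediction functions.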
Linear Regression : Uses linear function for predictions
Variations :
Linear regression with one variable : Single input feature (univariate)
Univariate linear regression : “univariate” is a Latin-derived term meaning “one variable” (“uni” = one)
Multiple variable regression : Uses multiple input features (covered later)
In the house price example:
Training : Learn from houses with known sizes and prices
Function : Creates linear relationship between size and price
Prediction : Estimate price for new house based on its size
Practical use : Help real estate agent advise clients on pricing
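The training-then-prediction workflow can be sketched end to end with the closed-form least-squares formulas for w and b (one standard way to fit a line; the course itself introduces gradient descent for this). The dataset is again invented for illustration:

```python
import numpy as np

# Hypothetical training set: sizes (sq ft) and prices ($1000s)
x = np.array([852, 1250, 1534, 2104, 2600], dtype=float)
y = np.array([178, 230, 315, 400, 470], dtype=float)

# Training: closed-form least-squares estimates of w (slope) and b (intercept)
w = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - w * x.mean()

# Prediction: estimate the price of a new, unseen house from its size
new_size = 1800
y_hat = w * new_size + b
print(f"f({new_size}) = {y_hat:.1f} (thousand dollars)")
```

This is the whole supervised loop in miniature: the “right answers” in `y` shape the parameters w and b, and the resulting function then prices houses that were never in the training set.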
The supervised learning process transforms training data into a practical tool for making informed predictions on new, unseen data.