Pablo Rodriguez

Logistic Regression

Logistic regression fits an S-shaped curve to classification data instead of the straight line used in linear regression. This produces outputs between 0 and 1, making it suitable for binary classification problems.

The sigmoid function (also called the logistic function) is defined as:

g(z) = 1 / (1 + e^(-z))

Where:

  • e ≈ 2.718 (Euler's number, the base of the natural logarithm)
  • z can be any real number
  • Output is always between 0 and 1
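The definition translates directly into Python. A minimal sketch (the function name `sigmoid` is my own choice):

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))   # 0.5
print(sigmoid(2))   # between 0.5 and 1
print(sigmoid(-2))  # between 0 and 0.5
```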

Large Positive z

When z is very large (e.g., z = 100), e^(-z) is vanishingly small, so g(z) ≈ 1

Large Negative z

When z is very negative (e.g., z = -100), e^(-z) is enormous, so g(z) ≈ 0

z = 0

When z = 0, g(z) = 1/(1+1) = 0.5
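These three cases can be checked numerically. A quick sketch, restating the sigmoid definition from above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Large positive z: g(z) saturates at 1.
print(sigmoid(100))   # ≈ 1.0

# Large negative z: g(z) saturates at 0.
print(sigmoid(-100))  # ≈ 0.0

# z = 0: exactly halfway.
print(sigmoid(0))     # 0.5
```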

The model is built in two steps:

  1. Linear Combination: Calculate z = w·x + b (same as linear regression)
  2. Sigmoid Application: Apply g(z) to get f(x) = g(w·x + b)

The logistic regression model is:

f(x) = g(w·x + b) = 1 / (1 + e^(-(w·x + b)))

This outputs a value between 0 and 1 for any input x.
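Putting the two steps together for a multi-feature input x. A sketch; the weights, bias, and feature values below are made up purely for illustration:

```python
import math

def predict_proba(x, w, b):
    """Logistic regression model: f(x) = g(w·x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b  # step 1: linear combination
    return 1.0 / (1.0 + math.exp(-z))             # step 2: sigmoid

# Hypothetical parameters and a two-feature input.
w = [0.5, -1.2]
b = 0.3
x = [2.0, 1.0]

p = predict_proba(x, w, b)  # estimated probability that y = 1
```

Here z = 0.5·2.0 + (−1.2)·1.0 + 0.3 = 0.1, so the output lands just above 0.5.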

The output f(x) represents the probability that y = 1 given input x.

Example: If f(x) = 0.7 for a tumor classification:

  • 70% chance the tumor is malignant (y = 1)
  • 30% chance the tumor is benign (y = 0)
  • The two probabilities always sum to 1, so P(y = 0 | x) = 1 − f(x)
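Because the two outcomes are complementary, the benign probability is just one minus the model output. A trivial check using the example's value:

```python
p_malignant = 0.7               # f(x): model's estimate that y = 1
p_benign = 1 - p_malignant      # P(y = 0 | x), ≈ 0.3

print(p_malignant + p_benign)   # the two probabilities sum to 1
```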

In research literature, you may see:

f(x) = P(y = 1 | x; w,b)

This notation means “probability that y equals 1, given input x, with parameters w and b.”

Logistic regression combines a linear function (w·x + b) with the sigmoid function to produce probability estimates between 0 and 1. This makes it ideal for binary classification tasks where we need to estimate the likelihood of an outcome rather than just predict a category.