Simplified Cost Function

The loss function was defined separately for each case:

  • If y = 1: Loss = -log(f(x))
  • If y = 0: Loss = -log(1 - f(x))

Since y can only be 0 or 1, we can combine these into a single expression:

Loss = -y*log(f(x)) - (1-y)*log(1-f(x))
  • When y = 1: first term = -1 × log(f(x)) = -log(f(x)); second term = -(1-1) × log(1-f(x)) = 0; result: -log(f(x)) ✓
  • When y = 0: first term = -0 × log(f(x)) = 0; second term = -(1-0) × log(1-f(x)) = -log(1-f(x)); result: -log(1-f(x)) ✓
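
As a quick sanity check, here is a minimal Python sketch (the function names are illustrative, not from the original material) confirming that the single expression matches the piecewise definition for both label values:

```python
import numpy as np

def loss_piecewise(f_x, y):
    """Loss defined case by case, as above."""
    if y == 1:
        return -np.log(f_x)
    return -np.log(1 - f_x)

def loss_unified(f_x, y):
    """Single expression: -y*log(f(x)) - (1-y)*log(1-f(x))."""
    return -y * np.log(f_x) - (1 - y) * np.log(1 - f_x)

# The two forms agree for any prediction and either label value.
for f_x in (0.1, 0.5, 0.9):
    for y in (0, 1):
        assert np.isclose(loss_piecewise(f_x, y), loss_unified(f_x, y))
```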

The cost function is the average loss across all training examples:

J(w,b) = (1/m) * Σ[Loss(f(x⁽ⁱ⁾), y⁽ⁱ⁾)]

Substituting the unified loss function:

J(w,b) = -(1/m) * Σ[y⁽ⁱ⁾*log(f(x⁽ⁱ⁾)) + (1-y⁽ⁱ⁾)*log(1-f(x⁽ⁱ⁾))]

This is the standard cost function used throughout the machine learning community for logistic regression.
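
To make the formula concrete, here is a minimal sketch (assuming a NumPy feature matrix X of shape (m, n) and labels y in {0, 1}; the function names are illustrative) that computes J(w, b) example by example, mirroring the summation:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_loop(w, b, X, y):
    """J(w,b): average loss over the m training examples."""
    m = X.shape[0]
    total = 0.0
    for i in range(m):
        f_i = sigmoid(np.dot(w, X[i]) + b)  # model prediction f(x⁽ⁱ⁾)
        total += y[i] * np.log(f_i) + (1 - y[i]) * np.log(1 - f_i)
    return -total / m
```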

Key properties of this cost function:

  • Convex: gradient descent is guaranteed to find the global minimum
  • Single expression: no conditional logic is needed; it works for both the y = 0 and y = 1 cases
  • Theoretically grounded: derived from maximum likelihood estimation principles
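
For the last point, a brief sketch of the maximum likelihood argument: treating each label as a Bernoulli outcome with P(y = 1 | x) = f(x), the cost is the negative average log-likelihood of the training set.

```latex
P(y \mid x) = f(x)^{\,y}\,\bigl(1 - f(x)\bigr)^{1-y}
\qquad \text{(likelihood of one example)}

\log L(w,b) = \sum_{i=1}^{m} \Bigl[\, y^{(i)} \log f(x^{(i)})
            + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - f(x^{(i)})\bigr) \Bigr]

J(w,b) = -\tfrac{1}{m}\, \log L(w,b)
```

Maximizing the log-likelihood is therefore equivalent to minimizing J(w, b).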

The unified expression makes implementation straightforward:

  • No conditional statements needed
  • Single formula handles both classes
  • Easier to vectorize for computational efficiency
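
As one possible sketch of that vectorization (same assumptions as the earlier snippet; in practice the predictions are often clipped slightly away from 0 and 1 to avoid log(0)):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_vectorized(w, b, X, y):
    """Same J(w,b), computed for all m examples at once: no per-example loop
    and no branching on the label."""
    f = sigmoid(X @ w + b)                           # predictions for every example
    loss = -y * np.log(f) - (1 - y) * np.log(1 - f)  # unified loss, elementwise
    return loss.mean()
```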

Comparing different parameter values:

  • Better-fitting decision boundaries have lower cost
  • The cost function effectively measures model quality
  • Enables systematic parameter optimization
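
As a small illustration (the dataset and parameter values below are made up for this example), the same cost function scores a well-placed decision boundary lower than a poorly placed one:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(w, b, X, y):
    f = sigmoid(X @ w + b)
    return float(np.mean(-y * np.log(f) - (1 - y) * np.log(1 - f)))

# Toy 1-D dataset: class 0 below x = 3, class 1 above.
X = np.array([[1.0], [2.0], [4.0], [5.0]])
y = np.array([0, 0, 1, 1])

print(cost(np.array([2.0]), -6.0, X, y))  # boundary near x = 3: cost ≈ 0.07
print(cost(np.array([2.0]),  0.0, X, y))  # boundary near x = 0: cost ≈ 1.54
```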

The simplified cost function elegantly combines both cases of the loss function into a single expression. This unified form maintains the same optimization properties while simplifying implementation and providing a clean mathematical foundation for gradient descent optimization.