Gaussian (Normal) Distribution

What is the Gaussian Distribution?

The Gaussian distribution is also called the normal distribution; the two terms mean exactly the same thing. Because of the shape of its density curve, it is also known as the “bell-shaped distribution” or bell curve.

Mathematical Definition

If x is a random variable with a Gaussian distribution, the distribution is described by the following parameters:

Mean parameter: μ (center of curve)
Variance parameter: σ²
Standard deviation: σ (width of curve)

Probability Density Function

p(x) = (1 / √(2π)) * (1/σ) * e^(-(x-μ)²/(2σ²))

Where:

π ≈ 3.14159 (ratio of circle’s circumference to diameter)
e ≈ 2.71828 (Euler’s number, the base of the natural logarithm)
μ = mean parameter
σ = standard deviation parameter
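
As a minimal sketch, the density formula above can be written directly in Python using only the standard library. The function name gaussian_pdf and its argument names are illustrative choices, not part of the original notes.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian density at x for mean mu and standard deviation sigma (> 0)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Density at the mean of the standard normal, equal to 1 / sqrt(2 * pi).
peak = gaussian_pdf(0.0, mu=0.0, sigma=1.0)
print(round(peak, 4))
```

The curve is highest at x = μ and takes equal values at equal distances on either side of μ.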

Visual Characteristics

Bell-Shaped Curve

Center: Located at mean μ
Width: Determined by standard deviation σ
Shape: Symmetric bell curve
Area under curve: Always equals 1 (the total probability must be 1)

Historical Context

Called “bell-shaped” because the curve resembles the profile of a classic tower bell
Example: The top portion of the Liberty Bell has roughly this shape

Parameter Effects on Distribution

Changing Standard Deviation (σ)

σ = 1 (μ = 0):

Standard normal distribution
Moderate width curve

σ = 0.5 (μ = 0):

Narrower curve (less variance)
Taller peak (area still = 1)
σ² = 0.25 (variance)

σ = 2 (μ = 0):

Wider curve (more variance)
Shorter peak (area still = 1)
σ² = 4 (variance)
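
The trade-off between width and peak height can be checked numerically. The sketch below is one possible way to do it, assuming a simple midpoint Riemann sum for the area; the helper names gaussian_pdf and area_under_curve are hypothetical.

```python
import math

def gaussian_pdf(x, mu, sigma):
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def area_under_curve(mu, sigma, half_width=10.0, steps=20000):
    """Midpoint Riemann sum of the density over mu ± half_width * sigma."""
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = sum(gaussian_pdf(lo + (i + 0.5) * dx, mu, sigma) for i in range(steps))
    return total * dx

for sigma in (0.5, 1.0, 2.0):
    peak = gaussian_pdf(0.0, 0.0, sigma)   # height at the mean: 1 / (sigma * sqrt(2*pi))
    area = area_under_curve(0.0, sigma)    # stays approximately 1 for every sigma
    print(f"sigma={sigma}: peak={peak:.4f}, area={area:.4f}")
```

The peak heights come out at about 0.7979, 0.3989, and 0.1995: halving σ doubles the peak, and doubling σ halves it, while the area stays at 1.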

Changing Mean (μ)

Different μ values:

Shifts distribution left or right
Does not change shape or width
Width still determined by σ
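
To illustrate that μ only shifts the curve, the sketch below evaluates the density at the same distances from the mean for two different values of μ. The helper curve_at_offsets and the chosen offsets are assumptions made for this example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def curve_at_offsets(mu, sigma, offsets):
    """Evaluate the density at mu + d for each offset d from the mean."""
    return [gaussian_pdf(mu + d, mu, sigma) for d in offsets]

offsets = [-2.0, -0.5, 0.0, 0.5, 2.0]
centered = curve_at_offsets(0.0, 1.0, offsets)   # mu = 0
shifted = curve_at_offsets(3.0, 1.0, offsets)    # mu = 3, same sigma

# The two curves agree at every offset, so only the location has changed.
same_shape = all(math.isclose(a, b) for a, b in zip(centered, shifted))
print(same_shape)
```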

Parameter Estimation

Given Dataset

With m examples: x⁽¹⁾, x⁽²⁾, …, x⁽ᵐ⁾

Estimate Mean (μ)

μ = (1/m) * Σ(i=1 to m) x⁽ⁱ⁾

Calculation: Average of all training examples

Estimate Variance (σ²)

σ² = (1/m) * Σ(i=1 to m) (x⁽ⁱ⁾ - μ)²

Calculation: Average of squared differences from mean
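
A minimal sketch of both estimates, written to mirror the two formulas above. The function name estimate_gaussian and the small dataset are made up for illustration.

```python
import math

def estimate_gaussian(xs):
    """Return (mu, sigma_squared) using the 1/m formulas."""
    m = len(xs)
    if m == 0:
        raise ValueError("need at least one example")
    mu = sum(xs) / m                                  # average of the examples
    sigma_sq = sum((x - mu) ** 2 for x in xs) / m     # average squared difference from mu
    return mu, sigma_sq

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]      # illustrative values only
mu_hat, var_hat = estimate_gaussian(data)
sigma_hat = math.sqrt(var_hat)
print(mu_hat, var_hat, sigma_hat)                     # 5.0 4.0 2.0
```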

Statistical Notes

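These formulas are the maximum likelihood estimates of μ and σ²
Some statistics classes use (1/(m-1)) instead of (1/m) for the variance, which gives an unbiased estimate
In practice, the difference between 1/m and 1/(m-1) is negligible when m is large; it only matters for very small datasets
Using 1/m is more common in machine learning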
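
The size of the gap between the two conventions can be seen with Python's statistics module, where pvariance divides by m and variance divides by m-1. This is a sketch on randomly generated data; the seed and sample sizes are arbitrary choices.

```python
import random
import statistics

def variance_estimates(xs):
    """Return (variance with 1/m, variance with 1/(m-1))."""
    mle = statistics.pvariance(xs)       # divides by m
    unbiased = statistics.variance(xs)   # divides by m - 1
    return mle, unbiased

rng = random.Random(0)                   # fixed seed so the run is repeatable
for m in (5, 50, 5000):
    xs = [rng.gauss(0.0, 1.0) for _ in range(m)]
    mle, unbiased = variance_estimates(xs)
    # The ratio is always m / (m - 1), whatever the data happen to be.
    print(f"m={m}: 1/m -> {mle:.4f}, 1/(m-1) -> {unbiased:.4f}, ratio={unbiased / mle:.4f}")
```

The ratio is 1.25 for m = 5 but only about 1.0002 for m = 5000, which is why the choice rarely matters on large training sets.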

Interpretation of p(x)

Probability Meaning

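If you drew:

100 numbers from this distribution → normalized histogram roughly approximates the bell curve
1,000 numbers → closer approximation
Infinitely many numbers with ever finer bins (in the limit) → histogram matches the bell curve exactly

Note: p(x) is a probability density, not a probability. It can exceed 1 when σ is small; probabilities correspond to areas under the curve.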
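
The histogram picture can be simulated. The sketch below draws samples with random.gauss, builds a normalized histogram, and measures the largest gap between bin heights and the density at the bin centers. The bin range, bin count, and seed are arbitrary assumptions, and the helper names are hypothetical.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def histogram_density(samples, lo, hi, bins):
    """Bin heights scaled by 1 / (n * bin_width) so they are comparable to a density."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            index = min(int((s - lo) / width), bins - 1)
            counts[index] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]

def max_gap(n, mu=0.0, sigma=1.0, lo=-4.0, hi=4.0, bins=20, seed=0):
    """Largest absolute difference between histogram heights and the true density."""
    rng = random.Random(seed)
    samples = [rng.gauss(mu, sigma) for _ in range(n)]
    heights = histogram_density(samples, lo, hi, bins)
    width = (hi - lo) / bins
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    return max(abs(h - gaussian_pdf(c, mu, sigma)) for h, c in zip(heights, centers))

for n in (100, 1000, 100000):
    print(n, round(max_gap(n), 4))
```

The gap shrinks as the number of samples grows, though with a fixed bin width it will not reach exactly zero.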

Practical Usage

High p(x): Example likely normal (near center)
Low p(x): Example likely anomalous (far from center)

Example Application

With fitted Gaussian distribution:

Example near center: High p(x), considered normal
Example far from center: Low p(x), considered anomalous
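
Putting the pieces together, one possible sketch of this idea fits μ and σ² to some training values and flags a new example when its density falls below a threshold. The threshold name epsilon, its value, and the training data are all hypothetical; the notes do not specify how a threshold should be chosen.

```python
import math

def fit_gaussian(xs):
    """Estimate (mu, sigma) from data using the 1/m formulas."""
    m = len(xs)
    mu = sum(xs) / m
    sigma_sq = sum((x - mu) ** 2 for x in xs) / m
    return mu, math.sqrt(sigma_sq)

def gaussian_pdf(x, mu, sigma):
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def is_anomalous(x, mu, sigma, epsilon):
    """Flag x when its density under the fitted Gaussian is below epsilon."""
    return gaussian_pdf(x, mu, sigma) < epsilon

train = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]   # made-up "normal" readings
mu, sigma = fit_gaussian(train)
epsilon = 0.05                                           # hypothetical threshold

for x in (10.05, 11.0):
    label = "anomalous" if is_anomalous(x, mu, sigma, epsilon) else "normal"
    print(x, gaussian_pdf(x, mu, sigma), label)
```

Here the value close to the fitted mean is treated as normal, while the value several standard deviations away has a density far below the threshold and is flagged.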

Understanding the Gaussian distribution is essential for anomaly detection as it provides a principled way to model normal behavior and identify deviations from expected patterns.