
Visualizing distributions

updated 2026-05-02 · 2 min read · #statistics #fundamentals #visualization

Most stats intuitions come from being able to picture a distribution. This note collects the half-dozen shapes that come up constantly and what each one looks like at a glance.

The normal distribution

The default. Mean $\mu$, variance $\sigma^2$:

$$
f(x \mid \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left(-\frac{(x - \mu)^2}{2 \sigma^2}\right)
$$

*Figure: standard normal distribution with sigma markers.*
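
A minimal sketch for regenerating a figure like this, assuming NumPy and Matplotlib; the styling choices are my own, not the original plot's:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-4, 4, 400)
pdf = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(x, pdf)
for k in (1, 2, 3):
    ax.axvline(-k, linestyle="--", alpha=0.4)  # ±kσ markers
    ax.axvline(+k, linestyle="--", alpha=0.4)
ax.set_xlabel("x")
ax.set_ylabel("density")
plt.show()
```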

Useful facts:

  • $\approx 68\%$ of mass within $\pm 1\sigma$
  • $\approx 95\%$ within $\pm 2\sigma$
  • $\approx 99.7\%$ within $\pm 3\sigma$

The Central Limit Theorem says that sums of many independent, finite-variance random variables (suitably centered and scaled) tend toward a normal, which is why it shows up everywhere.
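
Both facts are easy to sanity-check by simulation. A minimal sketch, assuming NumPy; sample sizes are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

# 68–95–99.7 rule, checked empirically on standard normal draws
z = rng.normal(0, 1, 100_000)
for k in (1, 2, 3):
    print(f"within ±{k}σ: {np.mean(np.abs(z) <= k):.3f}")

# CLT in miniature: means of 50 exponential draws already look roughly normal
means = rng.exponential(1.0, size=(100_000, 50)).mean(axis=1)
print(f"mean ≈ {means.mean():.3f}, std ≈ {means.std():.3f}")  # expect ≈ 1 and ≈ 1/√50 ≈ 0.141
```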

Distribution zoo

| Distribution | Shape | Use when… | Parameters |
| --- | --- | --- | --- |
| Normal | bell | sums of many small effects | $\mu, \sigma$ |
| Log-normal | right-skewed | products of positive effects (incomes, file sizes) | $\mu, \sigma$ |
| Exponential | declining | time between independent events | $\lambda$ |
| Poisson | discrete bell | count of rare events in a window | $\lambda$ |
| Binomial | discrete bell | $k$ successes in $n$ trials | $n, p$ |
| Beta | flexible on $[0, 1]$ | Bayesian priors over probabilities | $\alpha, \beta$ |
| Power-law | heavy tail | "rich get richer" processes | $\alpha$ |
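
The discrete and bounded entries (Poisson, Binomial, Beta) aren't covered by the plotting code below; a minimal sketch of drawing from them with NumPy, where the parameter values are arbitrary examples:

```python
import numpy as np

rng = np.random.default_rng(0)

counts    = rng.poisson(lam=3.0, size=10_000)        # rare-event counts, λ = 3
successes = rng.binomial(n=20, p=0.3, size=10_000)   # k successes in n = 20 trials
probs     = rng.beta(a=2.0, b=5.0, size=10_000)      # values on [0, 1], e.g. a prior over p

print(counts.mean(), successes.mean(), probs.mean())  # ≈ 3, ≈ 6, ≈ 2/7
```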

Generating + plotting

A reasonable default workflow with NumPy and Matplotlib:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# 10k draws from each of the shapes above
samples = {
    "normal":      rng.normal(0, 1, 10_000),
    "log-normal":  rng.lognormal(0, 0.5, 10_000),
    "exponential": rng.exponential(1.0, 10_000),
    "power-law":   rng.pareto(1.5, 10_000) + 1,
}

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for ax, (name, x) in zip(axes.flat, samples.items()):
    ax.hist(x, bins=60, density=True, alpha=0.7)
    ax.set_title(name)
    ax.set_xlim(np.quantile(x, [0.001, 0.999]))  # trim long tails
plt.tight_layout()
plt.show()
```

Tip: when plotting heavy-tailed distributions, switch to log–log axes (`ax.set_xscale("log")` plus `ax.set_yscale("log")`); a power law then shows up as a straight line.
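
A minimal sketch of that check on the power-law sample from above, assuming NumPy and Matplotlib; it plots the empirical survival function rather than a histogram:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
x = np.sort(rng.pareto(1.5, 10_000) + 1)        # same power-law sample as above
survival = 1.0 - np.arange(len(x)) / len(x)     # empirical P(X ≥ x); never hits zero, so log scale is safe

fig, ax = plt.subplots()
ax.plot(x, survival)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("x")
ax.set_ylabel("P(X ≥ x)")
plt.show()
```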

Choosing one

```mermaid
flowchart TD
    A{What are you modeling?} --> B[continuous & symmetric]
    A --> C[continuous & positive only]
    A --> D[discrete counts]
    A --> E[probability of success]
    B --> B1[Normal]
    C --> C1{tail behavior?}
    C1 --> C2[exponential / log-normal]
    C1 --> C3[Pareto / power-law]
    D --> D1{rare events?}
    D1 --> D2[Poisson]
    D1 --> D3[Binomial]
    E --> E1[Beta or Binomial]
```
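
The flowchart is qualitative; one quantitative complement is to fit a few candidates and compare log-likelihoods. A rough sketch, assuming SciPy is available and using an illustrative stand-in sample (the parameter count in the AIC is approximate, since fixed `loc` values are still counted):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.lognormal(0, 0.5, 5_000)   # stand-in for whatever positive sample you have

candidates = {
    "normal":      (stats.norm,    stats.norm.fit(data)),
    "log-normal":  (stats.lognorm, stats.lognorm.fit(data, floc=0)),
    "exponential": (stats.expon,   stats.expon.fit(data, floc=0)),
}
for name, (dist, params) in candidates.items():
    loglik = dist.logpdf(data, *params).sum()
    aic = 2 * len(params) - 2 * loglik    # lower AIC is better
    print(f"{name:12s} AIC = {aic:.1f}")
```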

Tail risk

The most common modeling mistake is assuming normal when the underlying process is heavy-tailed. A few markers that you’re in heavy-tail territory:

  • sample mean keeps drifting as you add data
  • variance estimates are unstable across subsamples
  • log-log plot of survival function is roughly linear
  • one observation moves the mean by more than a percent

If two or more of these are true, a Gaussian model will systematically underestimate risk. See cap-theorem for an analogous “average case hides the worst case” pattern in distributed systems.
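
A minimal sketch of the first marker (the running mean that keeps drifting), assuming NumPy; the Pareto parameter is chosen so the variance is infinite but the mean is finite:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
draws = {
    "normal": rng.normal(0, 1, n),
    "pareto": rng.pareto(1.5, n) + 1,   # finite mean, infinite variance
}
for name, x in draws.items():
    running_mean = np.cumsum(x) / np.arange(1, n + 1)
    # the normal running mean settles almost immediately; the Pareto one keeps wandering
    print(name, running_mean[[999, 9_999, 99_999]])
```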

Related

  • attention — softmax is a categorical distribution; cross-entropy is KL-divergence to a one-hot
  • cap-theorem — tail behavior of latency in distributed systems