# Visualizing distributions
Most stats intuitions come from being able to picture a distribution. This note collects the half-dozen shapes that come up constantly and what each one looks like at a glance.
## The normal distribution

The default. Mean $\mu$, variance $\sigma^2$:

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$
Useful facts (the 68–95–99.7 rule):

- ~68% of mass within $\pm 1\sigma$
- ~95% within $\pm 2\sigma$
- ~99.7% within $\pm 3\sigma$
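These coverage numbers are easy to verify empirically; a quick sketch with NumPy (sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 1_000_000)  # standard normal draws

# Empirical fraction of mass within k standard deviations of the mean;
# should land near 0.683, 0.954, 0.997
for k in (1, 2, 3):
    frac = np.mean(np.abs(x) <= k)
    print(f"within {k} sigma: {frac:.3f}")
```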
The Central Limit Theorem says sums of independent finite-variance random variables tend to a normal — which is why it shows up everywhere.
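A minimal illustration of the CLT: sums of uniform draws, which individually look nothing like a bell curve (the choice of $n$ and trial count here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 50, 100_000

# Each row sums 50 i.i.d. Uniform(0, 1) draws.
# The CLT predicts approximately Normal(n/2, n/12).
sums = rng.uniform(0, 1, (trials, n)).sum(axis=1)

mu, sigma = n / 2, np.sqrt(n / 12)
print(sums.mean(), sums.std())              # close to mu and sigma
print(np.mean(np.abs(sums - mu) <= sigma))  # close to 0.683, as for a normal
```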
## Distribution zoo
| Distribution | Shape | Use when… | Parameters |
|---|---|---|---|
| Normal | bell | sums of many small effects | $\mu$, $\sigma^2$ |
| Log-normal | right-skewed | products of positive effects (incomes, file sizes) | $\mu$, $\sigma$ of $\log x$ |
| Exponential | declining | time between independent events | rate $\lambda$ |
| Poisson | discrete bell | count of rare events in a window | rate $\lambda$ |
| Binomial | discrete bell | successes in $n$ trials | $n$, $p$ |
| Beta | flexible | Bayesian priors over probabilities | $\alpha$, $\beta$ |
| Power-law | heavy tail | "rich get richer" processes | exponent $\alpha$ |
## Generating + plotting
A reasonable default workflow with NumPy and Matplotlib:
```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
samples = {
    "normal": rng.normal(0, 1, 10_000),
    "log-normal": rng.lognormal(0, 0.5, 10_000),
    "exponential": rng.exponential(1.0, 10_000),
    "power-law": rng.pareto(1.5, 10_000) + 1,
}

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for ax, (name, x) in zip(axes.flat, samples.items()):
    ax.hist(x, bins=60, density=True, alpha=0.7)
    ax.set_title(name)
    ax.set_xlim(np.quantile(x, [0.001, 0.999]))  # trim long tails
plt.tight_layout()
plt.show()
```
Tip — when plotting heavy-tailed distributions, switch to log-log axes (`ax.set_xscale("log")` and `ax.set_yscale("log")`). A power-law becomes a straight line.
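The same straight-line fact works without a plot: a log-log line fit to the empirical survival function of Pareto samples should recover (the negative of) the tail exponent. A sketch, with illustrative sample size and exponent:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 1.5
x = np.sort(rng.pareto(alpha, 100_000) + 1)  # Pareto samples with minimum 1

# Empirical survival function S(x) = P(X > x), one point per sorted sample
surv = 1.0 - np.arange(1, len(x) + 1) / len(x)

# Fit log S(x) vs log x over the bulk, dropping the noisy extreme tail.
# Theory: S(x) = x^(-alpha), so the slope should be close to -alpha.
mask = surv > 1e-3
slope, intercept = np.polyfit(np.log(x[mask]), np.log(surv[mask]), 1)
print(slope)  # near -1.5
```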
## Choosing one
```mermaid
flowchart TD
    A{What are you modeling?} --> B[continuous & symmetric]
    A --> C[continuous & positive only]
    A --> D[discrete counts]
    A --> E[probability of success]
    B --> B1[Normal]
    C --> C1{tail behavior?}
    C1 --> C2[exponential / log-normal]
    C1 --> C3[Pareto / power-law]
    D --> D1{rare events?}
    D1 --> D2[Poisson]
    D1 --> D3[Binomial]
    E --> E1[Beta or Binomial]
```
## Tail risk
The most common modeling mistake is assuming normal when the underlying process is heavy-tailed. A few markers that you’re in heavy-tail territory:
- sample mean keeps drifting as you add data
- variance estimates are unstable across subsamples
- log-log plot of survival function is roughly linear
- one observation moves the mean by more than a percent
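The first marker can be checked mechanically. A sketch comparing running-mean drift for a Gaussian versus a finite-mean, infinite-variance Pareto (the α = 1.1 choice and sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

def running_mean_drift(x):
    """Spread (max - min) of the running mean over the second half of the data."""
    r = np.cumsum(x) / np.arange(1, len(x) + 1)
    half = r[len(x) // 2:]
    return half.max() - half.min()

normal = rng.normal(0, 1, n)
heavy = rng.pareto(1.1, n) + 1  # finite mean, infinite variance

print(running_mean_drift(normal))  # settles quickly: tiny drift
print(running_mean_drift(heavy))   # keeps drifting as data arrives
```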
If two or more of these are true, a Gaussian model will systematically underestimate risk. See cap-theorem for an analogous “average case hides the worst case” pattern in distributed systems.
## Related
- attention — softmax is a categorical distribution; cross-entropy is KL-divergence to a one-hot
- cap-theorem — tail behavior of latency in distributed systems