The normal distribution is
the log-normal distribution
Werner Stahel, Seminar für Statistik, ETH Zürich
and Eckhard Limpert
2 December 2014
The normal Normal distribution
We like it!
• Nice shape.
• Named after Gauss. Decorated the 10 DM bill.
• We know it. Passed the exam.
[Figure: standard normal density with marked ranges µ ± σ covering 2/3 (68%) and µ ± 2σ covering 95% (95.5%)]
Why it is right.
It is given by mathematical theory.
• Adding normal random variables gives a normal sum.
• Linear combinations Y = α0 + α1X1 + α2X2 + ...
remain normal.
• −→ Means of normal variables are normally distributed.
• Central Limit Theorem: Means of non-normal variables
are approximately normally distributed.
• −→ “Hypothesis of Elementary Errors”:
If random variation is the sum of many small random effects,
a normal distribution must be the result.
• Regression models assume normally distributed errors.
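A small simulation can illustrate this chain of statements; the number and size of the "elementary errors" below are arbitrary choices (Python sketch):

import numpy as np

rng = np.random.default_rng(1)
# 10_000 observations, each the sum of 200 small uniform "elementary errors"
effects = rng.uniform(-0.1, 0.1, size=(10_000, 200))
sums = effects.sum(axis=1)

# a normal distribution has skewness 0 and kurtosis 3
z = (sums - sums.mean()) / sums.std()
print("skewness:", round((z**3).mean(), 3))   # close to 0
print("kurtosis:", round((z**4).mean(), 3))   # close to 3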
Is it right?
Mathematical statisticians believe(d) that it is prevalent in Nature.
Well, it is not. Purpose of this talk: What are the consequences?
1. Empirical Distributions
2. Laws of Nature
3. Logarithmic Transformation, the Log-Normal Distribution
4. Regression
5. Advantages of using the log-normal distribution
6. Conclusions
1. Empirical Distributions
Measurements:
size, weight, concentration, intensity, duration, price, activity
All > 0 −→ “amounts” (John Tukey)
Example: HydroxyMethylFurfurol (HMF) in honey (Renner 1970)
[Figure: histogram of the HMF concentrations, frequency vs. concentration (0 to 50); markedly right-skewed]
The distribution is skewed: left steep, right flat (positive skewness),
unless the coefficient of variation cv(X) = sd(X)/E(X) is small.
Other variables may have other ranges and negative skewness.
They may have a normal distribution.
They are usually derived variables, not original measurements.
Any examples?
Our examples: Position in space and time, angles, directions. That’s it!
For some, 0 is a probable value: rain, expenditure for certain goods, ...
pH, sound and other energies [dB] −→ already measured on a log scale!
The 95% Range Check
For every normal distribution, negative values have a probability > 0.
−→ normal distribution inadequate for positive variables.
Becomes relevant when the 95% range x̄ ± 2σ̂ reaches below 0.
Then, the distribution is noticeably skewed.
[Figure: the HMF histogram with a fitted normal density; the range x̄ ± 2σ̂ extends to negative concentrations]
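A minimal sketch of this range check; the data below are simulated "amounts", not the honey data:

import numpy as np

rng = np.random.default_rng(2)
x = rng.lognormal(mean=1.0, sigma=0.8, size=500)   # hypothetical positive data

lower = x.mean() - 2 * x.std()
upper = x.mean() + 2 * x.std()
print(f"95% range: [{lower:.2f}, {upper:.2f}]")
if lower < 0:
    print("range reaches below 0 -> expect noticeable skewness;"
          " a normal model is inadequate here")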
2. Laws of Nature
(a) Physics
Energy E = m · c²
Stopping distance s = v²/(2 · a) ; velocity v = F · t/m
Gravitation F = G · m₁ · m₂/r²
Gas laws p · V = n · R · T ; R = p₀ · V₀/T₀
Radioactive decay N_t = N₀ · e^(−kt)
(b) Chemistry
Reaction velocity v = k · [A]^(n_A) · [B]^(n_B) ;
change with changing temperature: ∆T = +10 °C =⇒ v → ×2,
based on Arrhenius’ law k = A · e^(−E_A/(R·T))
E_A = activation energy; R = gas constant
Law of mass action: A + B ↔ C + D : K_c = ([C] · [D])/([A] · [B])
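A rough numerical check of the "×2 per +10 °C" rule via Arrhenius' law; the activation energy of 50 kJ/mol is an assumed, typical value:

import math

R = 8.314        # gas constant, J/(mol K)
EA = 50_000.0    # assumed activation energy, J/mol
T1, T2 = 298.0, 308.0   # 25 °C and 35 °C, in kelvin

# k = A * exp(-EA/(R*T)); the prefactor A cancels in the ratio
ratio = math.exp(-EA / (R * T2)) / math.exp(-EA / (R * T1))
print(f"k(T + 10 °C) / k(T) = {ratio:.2f}")   # roughly 2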
(c) Biology
Multiplication (of unicellular organisms): 1 − 2 − 4 − 8 − 16
Growth, size s_t = s₀ · kᵗ
Hagen–Poiseuille law, volume flow:
V_t = (∆P · r⁴ · π)/(8 · η · L) ; ∆P : pressure difference
Permeability
Other laws in biology?
3. Logarithmic Transformation, Log-Normal Distribution
Transform data by log transformation
[Figure: histogram of the concentrations (left) and of log(concentration) (right); after the log transformation the histogram looks approximately normal]
The log transform Z = log(X)
• turns multiplication into addition,
• turns variables X > 0 into Z with unrestricted values,
• reduces positive skewness (and may turn it into negative skewness),
• often turns skewed distributions into normal ones.
Note: Base of logarithm is not important.
• natural log for theory,
• log10 for practice.
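A minimal sketch of the effect on skewness, using simulated positive data:

import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(3)
x = rng.lognormal(mean=2.0, sigma=0.7, size=2000)   # positive, right-skewed

print("skewness of x:     ", round(skew(x), 2))           # clearly positive
print("skewness of log(x):", round(skew(np.log(x)), 2))   # close to 0
# the base does not matter: log10(x) = log(x)/log(10) is a mere rescaling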
The Log-Normal Distribution
If Z = log(X) is normally distributed (Gaussian), then
the distribution of X is called log-normal.
Densities
[Figure: log-normal densities for σ* = 1.2, 1.5, 2.0, 4.0, 8.0; green: normal distribution]
Density: f(x) = 1/(σ √(2π) · x) · exp( −(1/2) · ((log(x) − µ)/σ)² )
Parameters: µ, σ : Expectation and st.dev. of log(X)
More useful:
• e^µ = µ* : median, geometric “mean”, scale parameter
• e^σ = σ* : multiplicative standard deviation, shape parameter
σ* (or σ) determines the shape of the distribution.
Contrast to
• expectation E(X) = e^(µ + σ²/2)
• standard deviation sd(X) from var(X) = (e^(σ²) − 1) · e^(2µ + σ²)
Less useful!
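A minimal sketch of estimating these parameters from data (simulated here), contrasting µ* and σ* with the moment parameters:

import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 1.0, 0.5
x = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

z = np.log(x)
mu_star = np.exp(z.mean())     # scale: geometric mean, estimates e^mu
sigma_star = np.exp(z.std())   # shape: multiplicative sd, estimates e^sigma
print("mu* =", round(mu_star, 3), " sigma* =", round(sigma_star, 3))

# the "less useful" moment parameters, for contrast
print("E(X) :", round(np.exp(mu + sigma**2 / 2), 3),
      " sample mean:", round(x.mean(), 3))
print("sd(X):", round(np.sqrt((np.exp(sigma**2) - 1) * np.exp(2*mu + sigma**2)), 3),
      " sample sd:  ", round(x.std(), 3))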
Ranges
Probability    normal     log-normal
2/3 (68%)      µ ± σ      µ* ×/ σ*
95%            µ ± 2σ     µ* ×/ σ*²
×/ : “times-divide”
[Figure: log-normal density with marked values µ* ÷ σ*², µ* ÷ σ*, µ*, µ* · σ*, µ* · σ*²; the range µ* ×/ σ* covers 2/3 (68%) and µ* ×/ σ*² covers 95% (95.5%)]
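A quick empirical check of the times-divide ranges on simulated log-normal data:

import numpy as np

rng = np.random.default_rng(5)
x = rng.lognormal(mean=0.0, sigma=0.6, size=100_000)

z = np.log(x)
mu_star, sigma_star = np.exp(z.mean()), np.exp(z.std())

for k, expected in [(1, "2/3"), (2, "95%")]:
    lo, hi = mu_star / sigma_star**k, mu_star * sigma_star**k
    coverage = np.mean((x >= lo) & (x <= hi))
    print(f"mu* x/ sigma*^{k}: [{lo:.2f}, {hi:.2f}] covers {coverage:.1%}"
          f" (expected {expected})")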
Properties
We had for the normal distribution:
• Adding normal random variables gives a normal sum.
• Linear combinations Y = α0 + α1X1 + α2X2 + ...
remain normal.
• −→ Means of normal variables are normally distributed.
• Central Limit Theorem: Means of non-normal variables
are approximately normally distributed.
• −→ “Hypothesis of Elementary Errors”:
If random variation is the sum of many small random effects,
a normal distribution must be the result.
• Regression models assume normally distributed errors.
Properties: We have for the log-normal distribution:
• Multiplying log-normal random variables gives a log-normal product.
• −→ Geometric means of log-normal var.s are log-normally distr.
• Multiplicative Central Limit Theorem: Geometric means
of (non-log-normal) variables are approx. log-normally distributed.
• −→ Multiplicative “Hypothesis of Elementary Errors”:
If random variation is the product of several random effects,
a log-normal distribution must be the result.
Better name: Multiplicative normal distribution!
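A minimal simulation of the multiplicative central limit theorem; the growth-factor distribution is an arbitrary choice:

import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(6)
# 10_000 units, each the product of 50 independent positive random factors
factors = rng.uniform(0.8, 1.3, size=(10_000, 50))
products = factors.prod(axis=1)

print("skewness of the products:  ", round(skew(products), 2))          # right-skewed
print("skewness of log(products): ", round(skew(np.log(products)), 2))  # ~ 0
# log-normal on the original scale <-> normal on the log scale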
Quincunx
Galton: additive. Limpert (improving on Kapteyn): multiplicative.
Additive board (steps ± 50):
100
50 150
0 100 200
−50 50 150 250
−100 0 100 200 300
bin proportions 1 : 4 : 6 : 4 : 1
Multiplicative board (steps ×/ 1.5):
100
67 150
44 100 225
30 67 150 338
20 44 100 225 506
bin proportions 1 : 4 : 6 : 4 : 1
Back to Properties
• −→ Multiplicative “Hypothesis of Elementary Errors”:
If random variation is the product of several random effects,
a log-normal distribution must be the result.
Note: For “many small” effects, the geometric mean will have
a small σ ∗ −→ approx. normal AND log-normal!
Such normal distributions are “intrinsically log-normal”.
Keeping this in mind may lead to new insight!
• Regression models assume normally distributed errors! ???
4. Regression
Multiple linear regression:
Y = β0 + β1X1 + β2X2 + ... + E
Regressors Xj may be functions of original input variables
−→ model also describes nonlinear relations, interactions, ...
Categorical (nominal) input variables = “factors”
−→ “dummy” binary regressors
−→ Model includes Analysis of Variance (ANOVA)!
Linear in the coefficients βj
−→ “simple”, exact theory, exact inference
estimation by Least Squares −→ simple calculation
Characteristics of the model:
Formula:
Y = β0 + β1X1 + β2X2 + ... + E
additive effects, additive error
Error term E ∼ N(0, σ²) −→
– constant variance
– symmetric error distribution
If the target variable has a skewed (error) distribution
and the standard deviation of the error increases with Y :
−→ transform Y −→ log(Y ) !
log(Ỹ) = Y = β₀ + β₁X₁ + β₂X₂ + ... + E
Ordinary, additive model               Multiplicative model
Formula:
Y = β₀ + β₁X₁ + β₂X₂ + ... + E         log(Ỹ) = Y = β₀ + β₁X₁ + β₂X₂ + ... + E
                                       Ỹ = β̃₀ · X₁^β₁ · X₂^β₂ · ... · Ẽ
additive effects, additive error       multiplicative effects, multiplicative errors
Error term:
E ∼ N(0, σ²) −→                        Ẽ ∼ ℓN(1, σ*) −→
– constant variance                    – constant relative error
– symmetric error distribution         – skewed error distribution
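A minimal sketch of fitting the multiplicative model by least squares on the log scale; the data, coefficients, and error size below are all simulated/assumed:

import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.uniform(0, 2, size=n)
# multiplicative model: Y~ = exp(beta0) * exp(beta1*x) * error
y = np.exp(0.5 + 1.2 * x + rng.normal(0, 0.3, size=n))

# ordinary least squares on log(y)
X = np.column_stack([np.ones(n), x])
(b0, b1), *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print("estimates on the log scale:", round(b0, 2), round(b1, 2))

# back-transformed, exp(b0 + b1*x) estimates the MEDIAN of Y~ given x;
# one unit more x multiplies the median by exp(b1)
print("multiplicative effect of x:", round(np.exp(b1), 2))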
Example: Yu et al. (2012), “Upregulation of transmitter release probability improves a conversion of synaptic analogue signals into neuronal digital spikes”.
[Figure 1: The probability of releasing glutamates increases during sequential presynaptic spikes...]
[Figure 4, same paper: Presynaptic Ca²⁺ enhances an efficiency of probability-driven facilitation.]
5. Advantages of using the log-normal distribution
... or of applying the log transformation to data.
The normal and log-normal distributions are difficult to distinguish
for σ* < 1.2 ↔ cv < 0.18,
where the coefficient of variation is cv ≈ σ* − 1.
−→ We discuss the case of larger σ*.
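A quick numerical check of this correspondence at the stated threshold:

import math

sigma_star = 1.2
sigma = math.log(sigma_star)                  # sd of log(X)
cv_exact = math.sqrt(math.exp(sigma**2) - 1)  # exact log-normal cv
print(f"sigma* = {sigma_star}: cv = {cv_exact:.3f}"
      f" (rule of thumb sigma* - 1 = {sigma_star - 1:.2f})")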
More meaningful parameters
• The expected value of a skewed distribution is less typical
than the median.
• (cv or) σ* characterizes the size of the relative error
• Characteristic σ* values are found in diseases:
latent periods for different infections: σ* ≈ 1.4 ;
survival times after diagnosis of cancer, for different types: σ* ≈ 3
−→ Deeper insight?
Fulfilling assumptions, power
What happens to inference based on the normal distribution
if the data is log-normal?
• Level (= prob. of falsely rejecting the null hypothesis)
and coverage prob. of confidence intervals are o.k.
• Loss of power! −→ wasted effort!
• Loss of power! −→ wasted effort!
[Figure: difference between 2 groups (samples): effort n/n₀ (%) required for 90% power, as a function of σ* (1.0 to 3.5), for group sizes n₀ = 5, 10, 50]
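A minimal power simulation in this spirit; group size, σ*, and the group difference below are arbitrary choices:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(8)
n, n_sim = 20, 2000
sigma = 1.0    # sigma* = e^1 ~ 2.72
shift = 0.8    # group difference on the log scale

raw_hits = 0
log_hits = 0
for _ in range(n_sim):
    a = rng.lognormal(0.0, sigma, size=n)
    b = rng.lognormal(shift, sigma, size=n)
    raw_hits += ttest_ind(a, b).pvalue < 0.05                   # t test on raw data
    log_hits += ttest_ind(np.log(a), np.log(b)).pvalue < 0.05   # t test on logs

print("power of the t test on raw data:", raw_hits / n_sim)
print("power of the t test on logs:    ", log_hits / n_sim)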
More informative graphics
[Figure: latency vs. time for the groups ad.0, ad.30, leth.0, leth.20, leth.30, on a linear scale]
More informative graphics
[Figure: the same data with latency on a log scale]
−→ More significance!
6. Conclusions
Genesis
• The normal distribution is good for estimators, test statistics,
data with small coef.of variation, and log-transformed data.
The log-normal distribution is good for original data.
• Summation, Means, Central limit theorem, Hyp. of elem. errors
−→ normal distribution
Multiplication, Geometric means, ...
−→ log-normal distribution
Applications
• Adequate ranges: µ* ×/ σ*² covers ≈ 95% of the data
• Gain of power of hypothesis tests −→ save efforts for experiments
(e.g., saves animals!)
• Regression models assume normally distributed errors.
−→ Regression model for log(Y ) instead of Y .
Back transformation: Ỹ = β̃₀ · X₁^β₁ · X₂^β₂ · ... · Ẽ
• Parameter σ ∗ may characterize a class of phenomena
(e.g., diseases) −→ new insight ?!
Mathematical Statistics adds −→ uses the normal distribution.
Nature multiplies −→ yields the log-normal distribution.
Scientists (and applied statisticians) add logarithms!
−→ use the normal distribution for log(data) and theory,
−→ use the log-normal distribution for data.
Thank you for your attention!