The normal distribution is
the log-normal distribution
Werner Stahel, Seminar für Statistik, ETH Zürich
and Eckhard Limpert
2 December 2014
The normal Normal distribution
We like it!
• Nice shape.
• Named after Gauss. Decorated the 10 DM bill.
• We know it. Passed the exam.
[Figure: standard normal density with marked ranges µ ± σ covering 2/3 (68%) and µ ± 2σ covering 95% (95.5%)]
Why it is right.
It is given by mathematical theory.
• Adding normal random variables gives a normal sum.
• Linear combinations Y = α0 + α1X1 + α2X2 + ...
remain normal.
• −→ Means of normal variables are normally distributed.
• Central Limit Theorem: Means of non-normal variables
are approximately normally distributed.
• −→ “Hypothesis of Elementary Errors”:
If random variation is the sum of many small random effects,
a normal distribution must be the result.
• Regression models assume normally distributed errors.
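A small simulation can illustrate this chain of statements; the number and size of the "elementary errors" below are arbitrary choices (Python sketch):

import numpy as np

rng = np.random.default_rng(1)
# 10_000 observations, each the sum of 200 small uniform "elementary errors"
effects = rng.uniform(-0.1, 0.1, size=(10_000, 200))
sums = effects.sum(axis=1)

# a normal distribution has skewness 0 and kurtosis 3
z = (sums - sums.mean()) / sums.std()
print("skewness:", round((z**3).mean(), 3))   # close to 0
print("kurtosis:", round((z**4).mean(), 3))   # close to 3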
Is it right?
Mathematical statisticians believe(d) that it is prevalent in Nature.
Well, it is not. Purpose of this talk: What are the consequences?
1. Empirical Distributions
2. Laws of Nature
3. Logarithmic Transformation, the Log-Normal Distribution
4. Regression
5. Advantages of using the log-normal distribution
6. Conclusions
1. Empirical Distributions
Measurements:
size, weight, concentration, intensity, duration, price, activity
All > 0 −→ “amounts” (John Tukey)
Example: HydroxyMethylFurfurol (HMF) in honey (Renner 1970)
[Figure: histogram of the HMF concentrations, frequency vs. concentration (0 to 50); markedly right-skewed]
The distribution is skewed: left steep, right flat (positive skewness),
unless the coefficient of variation cv(X) = sd(X)/E(X) is small.
Other variables may have other ranges and negative skewness.
They may have a normal distribution.
They are usually derived variables, not original measurements.
Any examples?
Our examples: Position in space and time, angles, directions. That’s it!
For some, 0 is a probable value: rain, expenditure for certain goods, ...
pH, sound and other energies [dB] −→ already measured on a log scale!
The 95% Range Check
For every normal distribution, negative values have a probability > 0.
−→ normal distribution inadequate for positive variables.
Becomes relevant when the 95% range x̄ ± 2σ̂ reaches below 0.
Then, the distribution is noticeably skewed.
[Figure: the HMF histogram with a fitted normal density; the range x̄ ± 2σ̂ extends to negative concentrations]
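A minimal sketch of this range check; the data below are simulated "amounts", not the honey data:

import numpy as np

rng = np.random.default_rng(2)
x = rng.lognormal(mean=1.0, sigma=0.8, size=500)   # hypothetical positive data

lower = x.mean() - 2 * x.std()
upper = x.mean() + 2 * x.std()
print(f"95% range: [{lower:.2f}, {upper:.2f}]")
if lower < 0:
    print("range reaches below 0 -> expect noticeable skewness;"
          " a normal model is inadequate here")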
2. Laws of Nature
(a) Physics
Energy E = m · c²
Stopping distance s = v²/(2 · a) ; velocity v = F · t/m
Gravitation F = G · m₁ · m₂/r²
Gas laws p · V = n · R · T ; R = p₀ · V₀/T₀
Radioactive decay N_t = N₀ · e^(−kt)
(b) Chemistry
Reaction velocity v = k · [A]^(n_A) · [B]^(n_B) ;
change with changing temperature: ∆T = +10 °C =⇒ v → ×2,
based on Arrhenius’ law k = A · e^(−E_A/(R·T))
E_A = activation energy; R = gas constant
Law of mass action: A + B ↔ C + D : K_c = ([C] · [D])/([A] · [B])
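A rough numerical check of the "×2 per +10 °C" rule via Arrhenius' law; the activation energy of 50 kJ/mol is an assumed, typical value:

import math

R = 8.314        # gas constant, J/(mol K)
EA = 50_000.0    # assumed activation energy, J/mol
T1, T2 = 298.0, 308.0   # 25 °C and 35 °C, in kelvin

# k = A * exp(-EA/(R*T)); the prefactor A cancels in the ratio
ratio = math.exp(-EA / (R * T2)) / math.exp(-EA / (R * T1))
print(f"k(T + 10 °C) / k(T) = {ratio:.2f}")   # roughly 2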
(c) Biology
Multiplication (of unicellular organisms): 1 − 2 − 4 − 8 − 16
Growth, size s_t = s₀ · kᵗ
Hagen–Poiseuille law, volume flow:
V_t = (∆P · r⁴ · π)/(8 · η · L) ; ∆P : pressure difference
Permeability
Other laws in biology?
3. Logarithmic Transformation, Log-Normal Distribution
Transform data by log transformation
[Figure: histogram of the concentrations (left) and of log(concentration) (right); after the log transformation the histogram looks approximately normal]
The log transform Z = log(X)
• turns multiplication into addition,
• turns variables X > 0 into Z with unrestricted values,
• reduces positive skewness (and may turn it into negative skewness),
• often turns skewed distributions into normal ones.
Note: Base of logarithm is not important.
• natural log for theory,
• log10 for practice.
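A minimal sketch of the effect on skewness, using simulated positive data:

import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(3)
x = rng.lognormal(mean=2.0, sigma=0.7, size=2000)   # positive, right-skewed

print("skewness of x:     ", round(skew(x), 2))           # clearly positive
print("skewness of log(x):", round(skew(np.log(x)), 2))   # close to 0
# the base does not matter: log10(x) = log(x)/log(10) is a mere rescaling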
The Log-Normal Distribution
If Z = log(X) is normally distributed (Gaussian), then
the distribution of X is called log-normal.
Densities
[Figure: log-normal densities for σ* = 1.2, 1.5, 2.0, 4.0, 8.0; green: normal distribution]
Density: f(x) = 1/(σ √(2π) · x) · exp( −(1/2) · ((log(x) − µ)/σ)² )
Parameters: µ, σ : Expectation and st.dev. of log(X)
More useful:
• e^µ = µ* : median, geometric “mean”, scale parameter
• e^σ = σ* : multiplicative standard deviation, shape parameter
σ* (or σ) determines the shape of the distribution.
Contrast to
• expectation E(X) = e^(µ + σ²/2)
• standard deviation sd(X) from var(X) = (e^(σ²) − 1) · e^(2µ + σ²)
Less useful!
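A minimal sketch of estimating these parameters from data (simulated here), contrasting µ* and σ* with the moment parameters:

import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 1.0, 0.5
x = rng.lognormal(mean=mu, sigma=sigma, size=100_000)

z = np.log(x)
mu_star = np.exp(z.mean())     # scale: geometric mean, estimates e^mu
sigma_star = np.exp(z.std())   # shape: multiplicative sd, estimates e^sigma
print("mu* =", round(mu_star, 3), " sigma* =", round(sigma_star, 3))

# the "less useful" moment parameters, for contrast
print("E(X) :", round(np.exp(mu + sigma**2 / 2), 3),
      " sample mean:", round(x.mean(), 3))
print("sd(X):", round(np.sqrt((np.exp(sigma**2) - 1) * np.exp(2*mu + sigma**2)), 3),
      " sample sd:  ", round(x.std(), 3))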
Ranges
Probability    normal     log-normal
2/3 (68%)      µ ± σ      µ* ×/ σ*
95%            µ ± 2σ     µ* ×/ σ*²
×/ : “times-divide”
[Figure: log-normal density with marked values µ* ÷ σ*², µ* ÷ σ*, µ*, µ* · σ*, µ* · σ*²; the range µ* ×/ σ* covers 2/3 (68%) and µ* ×/ σ*² covers 95% (95.5%)]
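A quick empirical check of the times-divide ranges on simulated log-normal data:

import numpy as np

rng = np.random.default_rng(5)
x = rng.lognormal(mean=0.0, sigma=0.6, size=100_000)

z = np.log(x)
mu_star, sigma_star = np.exp(z.mean()), np.exp(z.std())

for k, expected in [(1, "2/3"), (2, "95%")]:
    lo, hi = mu_star / sigma_star**k, mu_star * sigma_star**k
    coverage = np.mean((x >= lo) & (x <= hi))
    print(f"mu* x/ sigma*^{k}: [{lo:.2f}, {hi:.2f}] covers {coverage:.1%}"
          f" (expected {expected})")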
Properties
We had for the normal distribution:
• Adding normal random variables gives a normal sum.
• Linear combinations Y = α0 + α1X1 + α2X2 + ...
remain normal.
• −→ Means of normal variables are normally distributed.
• Central Limit Theorem: Means of non-normal variables
are approximately normally distributed.
• −→ “Hypothesis of Elementary Errors”:
If random variation is the sum of many small random effects,
a normal distribution must be the result.
• Regression models assume normally distributed errors.
Properties: We have for the log-normal distribution:
• Multiplying log-normal random variables gives a log-normal product.
• −→ Geometric means of log-normal var.s are log-normally distr.
• Multiplicative Central Limit Theorem: Geometric means
of (non-log-normal) variables are approx. log-normally distributed.
• −→ Multiplicative “Hypothesis of Elementary Errors”:
If random variation is the product of several random effects,
a log-normal distribution must be the result.
Better name: Multiplicative normal distribution!
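A minimal simulation of the multiplicative central limit theorem; the growth-factor distribution is an arbitrary choice:

import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(6)
# 10_000 units, each the product of 50 independent positive random factors
factors = rng.uniform(0.8, 1.3, size=(10_000, 50))
products = factors.prod(axis=1)

print("skewness of the products:  ", round(skew(products), 2))          # right-skewed
print("skewness of log(products): ", round(skew(np.log(products)), 2))  # ~ 0
# log-normal on the original scale <-> normal on the log scale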
Quincunx
Galton: additive. Limpert (improving on Kapteyn): multiplicative.
Additive board (steps ± 50):
100
50 150
0 100 200
−50 50 150 250
−100 0 100 200 300
bin proportions 1 : 4 : 6 : 4 : 1
Multiplicative board (steps ×/ 1.5):
100
67 150
44 100 225
30 67 150 338
20 44 100 225 506
bin proportions 1 : 4 : 6 : 4 : 1
Back to Properties
• −→ Multiplicative “Hypothesis of Elementary Errors”:
If random variation is the product of several random effects,
a log-normal distribution must be the result.
Note: For “many small” effects, the geometric mean will have
a small σ ∗ −→ approx. normal AND log-normal!
Such normal distributions are “intrinsically log-normal”.
Keeping this in mind may lead to new insight!
• Regression models assume normally distributed errors! ???
4. Regression
Multiple linear regression:
Y = β0 + β1X1 + β2X2 + ... + E
Regressors Xj may be functions of original input variables
−→ model also describes nonlinear relations, interactions, ...
Categorical (nominal) input variables = “factors”
−→ “dummy” binary regressors
−→ Model includes Analysis of Variance (ANOVA)!
Linear in the coefficients βj
−→ “simple”, exact theory, exact inference
estimation by Least Squares −→ simple calculation
Characteristics of the model:
Formula:
Y = β0 + β1X1 + β2X2 + ... + E
additive effects, additive error
Error term E ∼ N(0, σ²) −→
– constant variance
– symmetric error distribution
If the target variable has a skewed (error) distribution
and the standard deviation of the error increases with Y :
−→ transform Y −→ log(Y ) !
log(Ỹ) = Y = β₀ + β₁X₁ + β₂X₂ + ... + E
Ordinary, additive model               Multiplicative model
Formula:
Y = β₀ + β₁X₁ + β₂X₂ + ... + E         log(Ỹ) = Y = β₀ + β₁X₁ + β₂X₂ + ... + E
                                       Ỹ = β̃₀ · X₁^β₁ · X₂^β₂ · ... · Ẽ
additive effects, additive error       multiplicative effects, multiplicative errors
Error term:
E ∼ N(0, σ²) −→                        Ẽ ∼ ℓN(1, σ*) −→
– constant variance                    – constant relative error
– symmetric error distribution         – skewed error distribution
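A minimal sketch of fitting the multiplicative model by least squares on the log scale; the data, coefficients, and error size below are all simulated/assumed:

import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.uniform(0, 2, size=n)
# multiplicative model: Y~ = exp(beta0) * exp(beta1*x) * error
y = np.exp(0.5 + 1.2 * x + rng.normal(0, 0.3, size=n))

# ordinary least squares on log(y)
X = np.column_stack([np.ones(n), x])
(b0, b1), *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print("estimates on the log scale:", round(b0, 2), round(b1, 2))

# back-transformed, exp(b0 + b1*x) estimates the MEDIAN of Y~ given x;
# one unit more x multiplies the median by exp(b1)
print("multiplicative effect of x:", round(np.exp(b1), 2))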
Example: Yu et al. (2012), “Upregulation of transmitter release probability improves a conversion of synaptic analogue signals into neuronal digital spikes”.
[Figure 1: The probability of releasing glutamates increases during sequential presynaptic spikes...]
[Figure 4, same paper: Presynaptic Ca²⁺ enhances an efficiency of probability-driven facilitation.]
5. Advantages of using the log-normal distribution
... or of applying the log transformation to data.
The normal and log-normal distributions are difficult to distinguish
for σ* < 1.2 ↔ cv < 0.18,
where the coefficient of variation is cv ≈ σ* − 1.
−→ We discuss the case of larger σ*.
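A quick numerical check of this correspondence at the stated threshold:

import math

sigma_star = 1.2
sigma = math.log(sigma_star)                  # sd of log(X)
cv_exact = math.sqrt(math.exp(sigma**2) - 1)  # exact log-normal cv
print(f"sigma* = {sigma_star}: cv = {cv_exact:.3f}"
      f" (rule of thumb sigma* - 1 = {sigma_star - 1:.2f})")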
More meaningful parameters
• The expected value of a skewed distribution is less typical
than the median.
• (cv or) σ* characterizes the size of the relative error
• Characteristic σ* values are found in diseases:
latent periods for different infections: σ* ≈ 1.4 ;
survival times after diagnosis of cancer, for different types: σ* ≈ 3
−→ Deeper insight?
Fulfilling assumptions, power
What happens to inference based on the normal distribution
if the data is log-normal?
• Level (= prob. of falsely rejecting the null hypothesis)
and coverage prob. of confidence intervals are o.k.
• Loss of power! −→ wasted effort!
• Loss of power! −→ wasted effort!
[Figure: difference between 2 groups (samples): effort n/n₀ (%) required for 90% power, as a function of σ* (1.0 to 3.5), for group sizes n₀ = 5, 10, 50]
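A minimal power simulation in this spirit; group size, σ*, and the group difference below are arbitrary choices:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(8)
n, n_sim = 20, 2000
sigma = 1.0    # sigma* = e^1 ~ 2.72
shift = 0.8    # group difference on the log scale

raw_hits = 0
log_hits = 0
for _ in range(n_sim):
    a = rng.lognormal(0.0, sigma, size=n)
    b = rng.lognormal(shift, sigma, size=n)
    raw_hits += ttest_ind(a, b).pvalue < 0.05                   # t test on raw data
    log_hits += ttest_ind(np.log(a), np.log(b)).pvalue < 0.05   # t test on logs

print("power of the t test on raw data:", raw_hits / n_sim)
print("power of the t test on logs:    ", log_hits / n_sim)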
More informative graphics
[Figure: latency vs. time for the groups ad.0, ad.30, leth.0, leth.20, leth.30, on a linear scale]
More informative graphics
[Figure: the same data with latency on a log scale]
−→ More significance!
6. Conclusions
Genesis
• The normal distribution is good for estimators, test statistics,
data with small coef.of variation, and log-transformed data.
The log-normal distribution is good for original data.
• Summation, Means, Central limit theorem, Hyp. of elem. errors
−→ normal distribution
Multiplication, Geometric means, ...
−→ log-normal distribution
Applications
• Adequate ranges: µ* ×/ σ*² covers ≈ 95% of the data
• Gain of power of hypothesis tests −→ save efforts for experiments
(e.g., saves animals!)
• Regression models assume normally distributed errors.
−→ Regression model for log(Y ) instead of Y .
Back transformation: Ỹ = β̃₀ · X₁^β₁ · X₂^β₂ · ... · Ẽ
• Parameter σ ∗ may characterize a class of phenomena
(e.g., diseases) −→ new insight ?!
Mathematical Statistics adds −→ uses the normal distribution.
Nature multiplies −→ yields the log-normal distribution.
Scientists (and applied statisticians) add logarithms!
−→ use the normal distribution for log(data) and theory,
−→ use the log-normal distribution for data.
Thank you for your attention!