Meth 2024 SM1

The document discusses probability distributions, particularly in the context of survival analysis, focusing on random variables representing time-to-event. It covers methods to characterize these distributions, including distribution and survival functions, hazard functions for both discrete and continuous random variables, and specific distributions like the exponential and Weibull distributions. Exercises are included to reinforce understanding of the concepts presented.


Incomplete Data methods
Supplementary Material 1:
On the probability distributions

Valentin Patilea

Ensai 2A, Oct-Dec 2024


This version: October 31, 2024

Agenda

The distribution of a duration


Characterizing the distribution
Common distributions in duration models

▶ The notions presented below are mostly stated for random variables with non-negative values

▶ The reason is that these notions are mostly used in survival analysis

▶ However, these notions extend to random variables with values on the whole real line

Ways to characterize a probability distribution (1/2)

▶ Assume that we study a random variable $Y \in \mathbb{R}_+ = [0, \infty)$ that represents a time-to-event.
▶ Distribution function: $F_Y : \mathbb{R}_+ \to [0, 1]$ defined by

$$F_Y(y) = P(Y \le y), \qquad y \ge 0$$

▶ Survival function: $S_Y : \mathbb{R}_+ \to [0, 1]$ defined by

$$S_Y(y) = P(Y > y) = 1 - F_Y(y), \qquad y \ge 0$$

▶ Some authors denote the survival function by $\overline{F}_Y$ and call it the survivor or reliability function.
▶ Notation: $S_Y(y-) = P(Y \ge y)$, $y \ge 0$

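To make the difference between S(y) and S(y−) concrete, here is a minimal Python sketch that evaluates both on the empirical distribution of a small sample. The sample values and the function names are hypothetical and chosen only for illustration. The two quantities differ exactly at points that carry positive probability mass.

```python
def survival(sample, y):
    """Empirical S(y): proportion of observations strictly greater than y."""
    return sum(1 for v in sample if v > y) / len(sample)


def survival_left(sample, y):
    """Empirical S(y-): proportion of observations greater than or equal to y."""
    return sum(1 for v in sample if v >= y) / len(sample)


durations = [2, 3, 3, 5, 8]  # hypothetical observed durations

print(survival(durations, 3))       # 2 of 5 observations exceed 3
print(survival_left(durations, 3))  # 4 of 5 observations are at least 3
print(survival(durations, 4), survival_left(durations, 4))  # equal: no mass at 4
```
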
Ways to characterize a probability distribution (2/2)
▶ We say that $Y$ admits a density (with respect to the Lebesgue measure on the real line) if there exists a measurable function $f_Y : \mathbb{R}_+ \to \mathbb{R}_+$ such that

$$F_Y(y) = \int_0^y f_Y(t)\,dt, \quad y \ge 0 \qquad \Big(\text{also } S_Y(y) = \int_y^\infty f_Y(t)\,dt, \quad y \ge 0\Big)$$

In this case, we also say that $Y$ is absolutely continuous, or simply continuous
▶ If $F_Y(\cdot)$ is differentiable at $y$, then

$$f_Y(y) = F_Y'(y) = -S_Y'(y)$$

▶ In the following we also use the simplified notation $f$, $F$ and $S$ (or $\overline{F}$) instead of $f_Y$, $F_Y$ and $S_Y$ (or $\overline{F}_Y$)
▶ Other ways to characterize the distribution of $Y$: characteristic function, moment generating function, etc.

Exercise: Moments Using the Survival Function
▶ Let $Y$ be a nonnegative random variable with survival function $S$.
▶ Exercise: Show that, for any $\alpha > 0$,

$$E(Y^\alpha) = \alpha \int_0^\infty y^{\alpha-1} S(y)\,dy,$$

in the sense that if one side converges so does the other.¹

▶ Deduce

$$E(Y) = \int_0^\infty S(y)\,dy.$$

▶ Exercise. Propose an alternative, direct proof for the relationship

$$E(Y) = \int_0^\infty S(y)\,dy$$

using Fubini's Theorem.

¹ See Feller (1966), An Introduction to Probability Theory and Its Applications, vol. 2, Lemma 1, p. 150.

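Before attempting the proof, the moment identity can be checked numerically. The sketch below is only a sanity check, not a proof. It assumes an exponential survival function S(y) = exp(−λy), for which E(Y^α) = Γ(α + 1)/λ^α is known in closed form. The rate, the exponent, the truncation point `upper` and the midpoint rule are all illustrative choices. The example uses α ≥ 1 because the integrand is singular at 0 when α < 1, where a plain midpoint rule is less reliable.

```python
import math


def moment_via_survival(alpha, survival, upper, n=200_000):
    """Midpoint-rule approximation of alpha * int_0^upper y^(alpha-1) S(y) dy."""
    h = upper / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += y ** (alpha - 1) * survival(y)
    return alpha * h * total


rate = 0.5   # hypothetical exponential rate
alpha = 2.0  # order of the moment

approx = moment_via_survival(alpha, lambda y: math.exp(-rate * y), upper=80.0)
exact = math.gamma(alpha + 1) / rate ** alpha  # closed form for the exponential law

print(approx, exact)  # the two values should agree to several decimals
```
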
Hazard functions: discrete random variables (1/2)
▶ Let $Y \in \{y_1, y_2, \ldots\}$ with $0 \le y_1 < y_2 < \cdots$

▶ Let $p_k = P(Y = y_k) > 0$, $k \ge 1$

▶ The hazard function (also called hazard rate, or failure rate) is defined as

$$\lambda(y_k) = \frac{P(Y = y_k)}{P(Y \ge y_k)} = \frac{p_k}{\sum_{j \ge k} p_j} = \frac{p_k}{S(y_{k-1})} = \frac{p_k}{S(y_k-)}, \qquad k \ge 1.$$

▶ We could also write the hazard function as a conditional probability

$$\lambda(y_k) = P(Y = y_k \mid Y \ge y_k)$$

▶ The hazard rate is thus the probability that the event occurs at time $y_k$ given that it did not occur previously

Hazard functions: discrete random variables (2/2)
▶ Exercise: show that for each $k \ge 1$,

$$\lambda(y_k) = 1 - \frac{S(y_k)}{S(y_{k-1})} = 1 - \frac{S(y_k)}{S(y_k-)}, \qquad S(y_k) = \prod_{j=1}^{k} \{1 - \lambda(y_j)\}$$

(by definition $S(y_1-) = 1$).

▶ The cumulative hazard function is defined as

$$\Lambda(y_k) = \sum_{1 \le j \le k} \lambda(y_j), \qquad k \ge 1$$

Proposition
The hazard function characterizes the distribution of $Y$. The same is true for the cumulative hazard function.

▶ Exercise: Prove the Proposition.

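The discrete formulas above can be tried out on a small example. The following Python sketch uses a hypothetical three-point distribution and exact rational arithmetic. It computes the hazard from the probabilities p_k, rebuilds the survival function through the product formula, and accumulates the hazard. The function names are illustrative, and the computation is a numerical illustration rather than a proof of the Proposition.

```python
from fractions import Fraction
from itertools import accumulate


def discrete_hazard(pmf):
    """Return lambda(y_k) = p_k / sum_{j >= k} p_j for each support point."""
    hazards = []
    at_risk = sum(pmf)  # P(Y >= y_1), equal to 1 for a proper distribution
    for p in pmf:
        hazards.append(p / at_risk)
        at_risk -= p  # becomes P(Y >= y_{k+1})
    return hazards


def survival_from_hazard(hazards):
    """Return S(y_k) = prod_{j <= k} (1 - lambda(y_j)) for each support point."""
    values, current = [], Fraction(1)
    for lam in hazards:
        current *= 1 - lam
        values.append(current)
    return values


pmf = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]  # hypothetical p_1, p_2, p_3

hazards = discrete_hazard(pmf)
survival = survival_from_hazard(hazards)
cumulative = list(accumulate(hazards))  # Lambda(y_k)

print(hazards)     # hazard at the last support point equals 1
print(survival)    # matches 1 - (p_1 + ... + p_k)
print(cumulative)  # the cumulative hazard may exceed 1
```

In this example the survival values recovered from the hazard coincide with those computed directly from the probabilities, which is the discrete characterization at work.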
Hazard functions: continuous random variables (1/2)

▶ Assume the random variable $Y \ge 0$ admits the density $f$

▶ The hazard function (also called hazard rate, or failure rate) is defined as

$$\lambda(y) = \frac{f(y)}{S(y)} = \frac{f(y)}{P(Y \ge y)}, \qquad y \ge 0$$

▶ Herein we will always use the convention $0/0 = 0$.

▶ The cumulative hazard function is defined as

$$\Lambda(y) = \int_0^y \lambda(t)\,dt, \qquad y \ge 0.$$

Hazard functions: continuous random variables (2/2)

▶ In the case of a random variable with a density, which we assume continuous, we also have

$$\lambda(y) = \lim_{h \downarrow 0} \frac{1}{h} P\big(Y \in [y, y+h) \mid Y \ge y\big) \qquad (1)$$

Proposition
In the case where $Y \ge 0$ admits the density $f$ we have the following relationships

$$S(y) = \exp(-\Lambda(y)) \quad \text{and} \quad f(y) = \lambda(y) \exp\left(-\int_0^y \lambda(t)\,dt\right).$$

In particular, either of $\lambda(\cdot)$ and $\Lambda(\cdot)$ can be used to characterize the distribution of $Y$

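The relationships on this slide can be explored numerically. The sketch below assumes a hypothetical linearly increasing hazard λ(t) = 0.5 t, for which Λ(y) = 0.25 y² in closed form. It recovers S(y) = exp(−Λ(y)) by numerical integration and then evaluates the ratio in relationship (1) for a small but finite h. The midpoint rule, the step size and the function names are illustrative choices, and the check does not replace the proofs.

```python
import math


def cumulative_hazard(hazard, y, n=10_000):
    """Midpoint-rule approximation of Lambda(y) = int_0^y hazard(t) dt."""
    h = y / n
    return h * sum(hazard((i + 0.5) * h) for i in range(n))


def survival(hazard, y):
    """Survival function recovered from the hazard: S(y) = exp(-Lambda(y))."""
    return math.exp(-cumulative_hazard(hazard, y))


def hazard(t):
    """Hypothetical linearly increasing hazard rate."""
    return 0.5 * t


y, step = 3.0, 1e-5
s_y = survival(hazard, y)

# Finite-h version of relationship (1): P(Y in [y, y+h) | Y >= y) / h
ratio = (s_y - survival(hazard, y + step)) / (step * s_y)

print(s_y, math.exp(-0.25 * y ** 2))  # numerical vs closed-form survival
print(ratio, hazard(y))               # the ratio approaches the hazard as h shrinks
```
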
Exercises:

▶ Prove the relationship (1) in the case where the density f (·)
is continuous

▶ Prove the Proposition on the previous slide

Agenda

The distribution of a duration


Characterizing the distribution
Common distributions in duration models

Exponential distribution

▶ A law for a nonnegative random variable $Y$
▶ The law is indexed by one positive parameter $\lambda$
▶ The survivor function: $S(y) = \exp(-\lambda y)$, $y \ge 0$
▶ Density: $f(y) = \lambda \exp(-\lambda y)$, $y \ge 0$
▶ $E(Y) = 1/\lambda$; $\mathrm{Var}(Y) = 1/\lambda^2$
▶ Caution: some authors use a different parametrization, $\gamma = 1/\lambda$
▶ Mode: one mode at $y = 0$
▶ Quantile function: $q(p) = -\lambda^{-1} \log(1 - p)$
▶ Hazard function: $\lambda(y) \equiv \lambda$ (constant hazard rate)
▶ Cumulative hazard function: $\Lambda(y) = \lambda y$

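As a quick check of the exponential formulas, the following Python sketch implements the survivor and quantile functions and verifies that they are consistent with each other, that is S(q(p)) = 1 − p. The rate value and the function names are hypothetical. `math.log1p(-p)` computes log(1 − p) and is used only for numerical accuracy when p is small.

```python
import math


def exp_survival(y, rate):
    """Survivor function S(y) = exp(-rate * y) of the exponential law."""
    return math.exp(-rate * y)


def exp_quantile(p, rate):
    """Quantile function q(p) = -log(1 - p) / rate, for 0 <= p < 1."""
    return -math.log1p(-p) / rate


rate = 2.0  # hypothetical value of the parameter lambda

median = exp_quantile(0.5, rate)
print(median, math.log(2) / rate)   # the median equals log(2) / lambda
print(exp_survival(median, rate))   # close to 0.5, since S(q(p)) = 1 - p
```
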
Weibull distribution
▶ A law for a nonnegative random variable $Y$
▶ The law is indexed by two parameters $\lambda > 0$ (scale parameter) and $k > 0$ (shape parameter)
▶ The survivor function: $S(y) = \exp(-(\lambda y)^k)$, $y \ge 0$
▶ Density: $f(y) = k \lambda (\lambda y)^{k-1} \exp(-(\lambda y)^k)$, $y \ge 0$
▶ $E(Y) = \Gamma(1 + 1/k)/\lambda$; $\mathrm{Var}(Y)$: exercise
▶ Caution: some authors use a different parametrization, $\gamma = 1/\lambda$, or $\gamma = \lambda^k$, or yet another one
▶ The random variable $W = (\lambda Y)^k$ has an exponential law with parameter equal to 1
▶ If $U \sim U[0, 1]$, then $Y = \lambda^{-1} (-\log(U))^{1/k}$ is a Weibull random variable with parameters $\lambda$ and $k$
▶ One mode at $y = 0$ if $k \le 1$, and at $((k - 1)/k)^{1/k}/\lambda$ if $k > 1$
▶ Quantile function: $q(p) = \lambda^{-1} (-\log(1 - p))^{1/k}$
▶ Hazard function: $\lambda(y) = k \lambda (\lambda y)^{k-1}$
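The simulation recipe based on a uniform variable can be sketched in a few lines of Python. The example below is a minimal Monte Carlo illustration under assumed values λ = 0.5 and k = 2. It draws Weibull variables from uniforms, compares the sample mean with Γ(1 + 1/k)/λ, and checks that the transformed variable (λY)^k has mean close to 1, as expected for a unit exponential law. The seed, the sample size and the function name are arbitrary choices.

```python
import math
import random


def weibull_sample(lam, k, rng):
    """Draw Y = (1 / lam) * (-log U)^(1 / k) with U uniform on (0, 1]."""
    u = 1.0 - rng.random()  # rng.random() lies in [0, 1), so u lies in (0, 1]
    return (-math.log(u)) ** (1.0 / k) / lam


rng = random.Random(2024)  # fixed seed so the run is reproducible
lam, k = 0.5, 2.0          # hypothetical scale and shape parameters

draws = [weibull_sample(lam, k, rng) for _ in range(200_000)]

sample_mean = sum(draws) / len(draws)
theory_mean = math.gamma(1.0 + 1.0 / k) / lam
w_mean = sum((lam * y) ** k for y in draws) / len(draws)

print(sample_mean, theory_mean)  # should be close to each other
print(w_mean)                    # should be close to 1
```

Note that software libraries may use yet another parametrization (for instance a scale equal to 1/λ), so the arguments should be checked against the library documentation before comparing results.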
