Q-Q Plots, Likelihood, and MLE
• Q-Q Plots (understanding, interpreting, assessing): evaluate model fit with Q-Q plots.
• Likelihood (defining, interpreting, log-likelihood): understand likelihood as a measure of model fit.
• MLE (properties, MLE in survival analysis, parameter estimation): apply MLE to estimate survival model parameters.
Imagine a clinical trial testing a new drug for a
specific type of cancer. Researchers are
interested in understanding how long patients
survive after starting the treatment.
In a cancer drug trial, how can researchers analyze patient survival time?
Approaches to Survival Analysis
Non-Parametric Models (Kaplan-Meier Estimator):
• Assumptions: Minimal assumptions, making it suitable for exploratory
analysis.
• Advantages: Provides a non-parametric estimate of the survival function.
• Disadvantages: Less powerful for comparing groups or identifying risk
factors.
Semi-Parametric Models (Cox Proportional Hazards Model)
• Assumptions: Does not assume a specific distribution for survival
times but assumes proportional hazards.
• Advantages: Flexible and robust to distributional assumptions.
• Disadvantages: Limited to proportional hazards assumption.
The topic of today's lecture is
Parametric Survival Models
These models assume a specific distribution for survival times, such
as exponential, Weibull, or log-normal. This assumption allows for
precise estimation of survival probabilities and confidence intervals.
However, it's crucial to correctly specify the distribution, as incorrect
assumptions can lead to biased results.
Q-Q Plots in Survival Analysis
What is a Q-Q Plot?
A Q-Q plot (Quantile-Quantile plot) is a graphical tool used to
assess if a sample comes from a particular theoretical
distribution. It compares the quantiles of the observed data to
the quantiles of the theoretical distribution.
Used to assess if a dataset follows a specified distribution
(e.g., normal, Weibull).
Quantile
A quantile is a value that divides a dataset or probability distribution into equal-sized intervals. It helps to understand the distribution of data by marking specific points on the scale.
For example, in a dataset:
• Median (50th percentile) is a quantile that splits data into two equal halves.
• Quartiles (like the 25th and 75th percentiles) divide data into four equal parts.
• Percentiles divide data into 100 equal parts, where the 90th percentile, for
instance, indicates that 90% of the data lies below this value.
Quantiles are widely used in statistics to assess data distribution, compare
distributions, and identify outliers.
Q-Q Plot
The x-axis represents the theoretical quantiles, and the y-axis
represents the sample quantiles.
Purpose:
• Assessing Distribution Fit:
⚬ Used to assess if a dataset follows
a specified theoretical distribution
(e.g., normal, Weibull, log-normal).
⚬ Helps to visually check the
goodness-of-fit for the chosen
distribution.
How to Interpret a Q-Q Plot:
• If the points on the Q-Q plot roughly follow a straight line, it suggests
that the data may follow the assumed distribution.
• Deviations from the straight line indicate potential departures from
the assumed distribution.
• From the straight-line equation we determine the best-fit line for the data:
y = bx + a
How to draw a Q-Q plot to assess if the data follows the assumed distribution
1. *Sort the data*: Begin by sorting your data in ascending order. This step is essential for calculating the quantiles.
2. *Calculate the quantiles of the distribution*: Determine the theoretical quantiles of the assumed distribution you wish to
compare your data against. Common choices include the quantiles of a normal or uniform distribution, depending on the context.
3. *Calculate the quantiles of the data*: Compute the quantiles of your sorted data. These are essentially the data points ordered
in such a way that they can be compared to the theoretical quantiles.
4. *Plot the quantiles*: On a scatter plot, plot the calculated quantiles of your data against the theoretical quantiles of the
assumed distribution. The x-axis would show the quantiles of the assumed distribution, while the y-axis would display the
quantiles of your data.
5. *Interpret the plot*: If the points on the Q-Q plot fall along a straight line, it indicates that your data closely follows the assumed
distribution. Deviations from a straight line suggest departures from the assumed distribution.
6. *Assess the goodness of fit*: Analyze the Q-Q plot. If the points follow a straight line closely, it suggests a good fit to the assumed distribution. Any curvature, outliers, or deviations indicate that the data may not conform to the expected distribution.
1-Straight Line (Good Fit)
• If the points follow the 45-degree line closely,
your data likely follows the assumed
distribution.
• Small random deviations around the line are
normal but should not form any specific
pattern.
3-S-Shaped Curve (Light or Heavy Tails)
• Upward curve (concave): Indicates heavy
tails. Your data has more extreme values
than expected for the theoretical distribution.
• Downward curve (convex): Indicates light
tails. Your data has fewer extreme values
than expected.
Example:
Does the following sample come from a normally distributed population?
3.89 4.75 6.33 4.75 7.21 5.78 5.80 5.20 7.90
• First, order the data from smallest to largest:
3.89 4.75 4.75 5.20 5.78 5.80 6.33 7.21 7.90
• Then plot these values against the appropriate quantiles of the standard normal distribution.
• Divide the distribution into n + 1 = 10 equal areas (here n = 9).
• Find the theoretical quantiles: from the standard normal table, find the values of a standard normal random variable that cut off these areas, and mark them on the x-axis.
Identify the sample quantiles corresponding to the 25th and 75th percentiles:
• 25th percentile (Q1, sample): 4.75
• 75th percentile (Q3, sample): 6.33
Identify the theoretical quantiles corresponding to the 25th and 75th percentiles:
• 25th percentile (Q1, theoretical): −0.674
• 75th percentile (Q3, theoretical): 0.674
Calculate the slope (m):
m = (6.33 − 4.75) / (0.674 − (−0.674)) = 1.58 / 1.348 ≈ 1.17
Calculate the intercept (b):
b = 4.75 − 1.17 · (−0.674) = 4.75 + 0.79 ≈ 5.54
The equation of the line is:
y = m·x + b
Substituting m = 1.17 and b = 5.54:
y = 1.17·x + 5.54
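The slope and intercept above can be reproduced in a few lines of R (qnorm and quantile give the theoretical and sample quartiles):

```r
x <- sort(c(3.89, 4.75, 6.33, 4.75, 7.21, 5.78, 5.80, 5.20, 7.90))

q_sample <- quantile(x, c(0.25, 0.75))   # sample quartiles: 4.75 and 6.33
q_theo   <- qnorm(c(0.25, 0.75))         # theoretical quartiles: -0.674 and 0.674

m <- diff(q_sample) / diff(q_theo)       # slope, approx. 1.17
b <- q_sample[1] - m * q_theo[1]         # intercept, approx. 5.54

qqnorm(x)          # Q-Q plot against the standard normal
abline(b, m)       # add the quartile-based reference line
```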
1.Normal Distribution: The plot compares your data quantiles with the theoretical quantiles
of a normal distribution. The closer the points are to the red line, the more normally
distributed your data is.
2.Log-Normal Distribution: This plot uses a log-normal scale. It shows how well your data
matches a log-normal distribution.
3.Log-Logistic Distribution: The plot is based on the log-logistic distribution, using its
theoretical quantiles for comparison.
Likelihood Function
The likelihood function L(θ∣X) is a function of the parameters θ given the
observed data X. It represents the probability (or probability density) of
observing the data X under a specified model with parameters θ.
Role in Survival Analysis:
In survival analysis, the likelihood function is used to estimate the parameters
of the survival distribution.
• By maximizing the likelihood function, we find the parameter values that
are most likely to have generated the observed data.
Likelihood Function
2. Constructing the Likelihood Function
Suppose we have:
• A dataset X = {x1, x2, …, xn}, where each xi represents an observed data point.
• A probability distribution with parameter(s) θ that we want to estimate.
The likelihood function is the product of the probability densities (or probabilities, for
discrete data) of each observed data point, given the parameter θ:
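In symbols, assuming independent observations:

```latex
L(\theta \mid X) \;=\; \prod_{i=1}^{n} f(x_i;\, \theta)
```

Taking the logarithm turns this product into a sum, which is why the log-likelihood is used in practice.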
Likelihood Function
•Hazard Function h(x; θ):
- Represents the instantaneous risk of an event at time
x.
- For exponential distribution: h(x; θ) = λ.
•Survival Function S(x; θ):
- Represents the probability of survival up to time x.
- For exponential distribution: S(x; θ) = e^(-λx).
We will use these functions shortly. For uncensored data, note that the density factors as f(x; θ) = h(x; θ)·S(x; θ).
Censored Data: each type of censoring makes a specific contribution to the likelihood function.
• In survival analysis, data may be censored due to study limitations or event timing outside observation periods.
1. Right-Censored Data: the event happened after a certain time.
2. Left-Censored Data: the event happened before a certain time.
3. Interval-Censored Data: the event happened between two known times.
Notation:
• xi: uncensored event times (exact times of the event),
• yj: right-censored times (the event occurs after yj),
• zk: left-censored times (the event occurs before zk),
• (ai, bi): interval-censored times.
The likelihood function for a dataset with all types of censored data
is:
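With the notation above, the standard combined likelihood reads:

```latex
L(\theta) \;=\; \prod_{i=1}^{n_0} f(x_i;\theta)\;
\prod_{j=1}^{n_r} S(y_j;\theta)\;
\prod_{k=1}^{n_l} F(z_k;\theta)\;
\prod_{i=1}^{n_i} \bigl[F(b_i;\theta) - F(a_i;\theta)\bigr]
```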
where:
n0: Number of uncensored observations.
nr: Number of right-censored observations.
nl: Number of left-censored observations.
ni: Number of interval-censored
observations.
If any type of censoring is not present, we simply omit its factor from the likelihood.
Maximum likelihood estimation
•Definition:
•Maximum Likelihood Estimation (MLE) is a method to estimate model
parameters.
•Finds parameter values that maximize the probability of observing the data
under the model.
•Purpose of MLE:
•Identifies parameters that best fit the data within a specified model.
•How MLE Works:
•Step 1: Construct the likelihood function.
Maximum likelihood estimation
Step 2: Log-Likelihood Function:
To simplify the calculations, we take the logarithm of the likelihood function to obtain the log-
likelihood function:
Step 3: Taking the Derivative:
To find the MLE for θ, we differentiate the log-likelihood function with respect to θ and set it equal to zero:
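For reference, the two steps in symbols:

```latex
\ell(\theta) \;=\; \log L(\theta) \;=\; \sum_{i=1}^{n} \log f(x_i;\theta),
\qquad
\frac{d\,\ell(\theta)}{d\theta} \;=\; 0
```

A solution is a maximum when the second derivative of the log-likelihood is negative there.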
General example
Consider five subjects with different types of event-time or censoring information. We assume a probability density function f(t) and hazard function h(t) for the outcome.
Write down the contribution to the likelihood for each subject.
Then construct the full likelihood L(θ) by combining each individual contribution, using the hazard function h(t), survival function S(t), and cumulative distribution function F(t) as necessary.
Answer:-
• Alice experienced the event at time t = 3
• Bob was right-censored at time t = 7
• Chris experienced the event at time t = 4
• Dana was left-censored at time t = 2
• Erin was interval-censored between t = 5 and t = 9
(For each contribution we can use either f(t) or the equivalent form h(t)·S(t), depending on what the question provides.)
Answer:-
The full likelihood L(θ) is obtained by taking the product of each subject’s
independent contribution:
Expanding each term using the hazard and survival
functions:
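Writing out the product (a reconstruction of the elided slide equations, using f(t) = h(t)S(t) and F(t) = 1 − S(t)):

```latex
L(\theta) \;=\; f(3)\, S(7)\, f(4)\, F(2)\,\bigl[F(9) - F(5)\bigr]
\;=\; h(3)S(3)\cdot S(7)\cdot h(4)S(4)\cdot\bigl[1 - S(2)\bigr]\cdot\bigl[S(5) - S(9)\bigr]
```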
The Maximum Likelihood Estimation (MLE) for θ can then be obtained by maximizing this likelihood function or,
more commonly, the log-likelihood function based on these contributions.
This case is more complex than most because it includes all types of censoring; exam questions usually involve only one or two types.
Now we will talk about MLE under different survival distributions.
EXPONENTIAL
DISTRIBUTION
One-Parameter Exponential Distribution:
The one-parameter exponential distribution is a continuous probability
distribution that is often used to model the time between events. It has a
single parameter, typically denoted by λ (lambda), which represents the rate
parameter.
Mean = 1/λ
Density function: f(t) = λe^(-λt), t ≥ 0
Survival function: S(t) = e^(-λt)
Hazard function: h(t) = λ
Cumulative hazard function: H(t) = λt
Estimation of λ for Data without Censored Observations:
Suppose that there are n persons in the study and everyone is followed to death or failure
Let t1 , t2 ,..., tn be the exact survival times of the n people.
The likelihood
function:
the log-likelihood function:
the MLE of λ :
the MLE of mean :
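The elided formulas on this slide take the standard form:

```latex
L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda t_i} = \lambda^{n} e^{-\lambda \sum t_i},
\qquad
\ell(\lambda) = n\log\lambda - \lambda \sum_{i=1}^{n} t_i
```

Setting dℓ/dλ = n/λ − Σti = 0 gives

```latex
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} t_i},
\qquad
\widehat{\text{mean}} = \frac{1}{\hat{\lambda}} = \bar{t}
```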
Confidence intervals
A large-sample 95% confidence interval for λ is λ̂ ± z(0.975)·λ̂/√n, and for the mean it is (1/λ̂) ± z(0.975)·(1/λ̂)/√n. (Exact intervals based on the chi-square distribution also exist; the normal approximation is used here.)
Example
Consider the following remission times in weeks for 21 patients with acute leukemia: 1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 8, 8, 9, 10, 10, 12, 14, 16, 20, 24, and 34. Assume that remission duration follows the exponential distribution. Obtain:
(a) The MLE of λ (b) The MLE of the mean (c) The 95% confidence intervals for λ and the mean
(a) n = 21 and Σti = 198, so λ̂ = 21/198 ≈ 0.106.
(b) The MLE of the mean is 1/λ̂ = 198/21 ≈ 9.43 weeks.
(c) At significance level 0.05 (z = 1.96), the large-sample normal approximation gives 0.106 ± 1.96(0.106)/√21 ≈ (0.061, 0.151) for λ and 9.43 ± 1.96(9.43)/√21 ≈ (5.40, 13.46) weeks for the mean.
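These results can be checked in R (the large-sample normal interval below is one common choice; an exact chi-square interval would give slightly different limits):

```r
times <- c(1, 1, 2, 2, 3, 4, 4, 5, 5, 6, 8, 8, 9, 10, 10, 12, 14, 16, 20, 24, 34)
n <- length(times)                   # 21 patients
lambda_hat <- n / sum(times)         # MLE of lambda: 21/198, approx. 0.106
mean_hat <- 1 / lambda_hat           # MLE of the mean: approx. 9.43 weeks

z <- qnorm(0.975)                    # 1.96
ci_lambda <- lambda_hat + c(-1, 1) * z * lambda_hat / sqrt(n)
ci_mean   <- mean_hat   + c(-1, 1) * z * mean_hat   / sqrt(n)
```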
Estimation of λ for Data with Censored Observations:
The likelihood function with right-censored observations:
the log-likelihood function:
MLE of the parameter :
the MLE of mean :
Confidence interval: use the same formulas as for uncensored data, replacing n (the sample size) with r (the number of uncensored observations).
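In standard form, with r uncensored times and the remaining times right-censored:

```latex
L(\lambda) = \prod_{i=1}^{r} \lambda e^{-\lambda t_i}\;
             \prod_{j=r+1}^{n} e^{-\lambda t_j^{+}}
           = \lambda^{r} e^{-\lambda \sum_{\text{all}} t},
\qquad
\hat{\lambda} = \frac{r}{\sum_{\text{all}} t},
\qquad
\widehat{\text{mean}} = \frac{\sum_{\text{all}} t}{r}
```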
Example
Suppose that in a laboratory experiment 10 mice are exposed to carcinogens. The experimenter decides to terminate the study after half of the mice are dead and to sacrifice the other half at that time. The survival times of the five dead mice are 4, 5, 8, 9, and 10 weeks. The survival data of the 10 mice are therefore 4, 5, 8, 9, 10, 10+, 10+, 10+, 10+, and 10+. Assume that the failure of these mice follows an exponential distribution.
Here λ̂ = r/Σti = 5/86 ≈ 0.0581 per week, and the probability of surviving a given time can be estimated from Ŝ(t) = e^(-λ̂t). For example, the probability that a mouse exposed to the same carcinogen will survive longer than 8 weeks is Ŝ(8) = e^(-0.0581 × 8) ≈ 0.629.
The probability of dying within 8 weeks is then 1 − 0.629 = 0.371.
Weibull Distribution
Weibull Distribution in Survival analysis
The Weibull distribution is widely used in survival analysis to model time-to-
event data, particularly when the event rate varies over time. This distribution
is flexible because it can model increasing, constant, or decreasing hazard
rates, depending on its shape parameter
Mean = Γ(1 + 1/𝛾)/λ
Density function: f(t) = λ𝛾(λt)^(𝛾−1) e^(−(λt)^𝛾), t ≥ 0
Survivorship function: S(t) = e^(−(λt)^𝛾)
Hazard function: h(t) = λ𝛾(λt)^(𝛾−1)
Cumulative hazard function: H(t) = (λt)^𝛾
Estimation of λ and γ for Data without Censored Observations:
Suppose that there are n persons in the study and everyone is followed to death or failure
Let t1 , t2 ,..., tn be the exact survival times of the n people.
The likelihood
function:
the log-likelihood function:
Maximum likelihood Estimation of λ ,𝛾
for Data without Censored Observations
The MLE of λ :
The MLE of 𝛾 :
The MLE of mean:
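With the parameterization f(t) = λγ(λt)^(γ−1) e^(−(λt)^γ), the elided equations are (a reconstruction):

```latex
\ell(\lambda,\gamma) = n\log(\lambda\gamma)
  + (\gamma-1)\sum_{i=1}^{n}\log(\lambda t_i)
  - \sum_{i=1}^{n} (\lambda t_i)^{\gamma}
```

Setting the partial derivatives to zero gives

```latex
\hat{\lambda} = \left(\frac{n}{\sum t_i^{\hat\gamma}}\right)^{1/\hat\gamma},
\qquad
\frac{n}{\hat\gamma} + \sum \log(\hat\lambda t_i)
  - \sum (\hat\lambda t_i)^{\hat\gamma}\log(\hat\lambda t_i) = 0,
\qquad
\widehat{\text{mean}} = \frac{\Gamma(1+1/\hat\gamma)}{\hat\lambda}
```

The equation for γ̂ has no closed form and must be solved numerically (e.g., Newton-Raphson).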
Example on the MLE Weibull Uncensored Data
Consider the following remission times in weeks for 21 patients with acute
leukemia:1,1,2,2,3,4,4,5,5,6,8,8,9,10,10,12,14,16, 20, 24, and 34. Assume that remission duration
follows the Weibull Distribution. obtain:
(a) The MLE of λ (b) The MLE of 𝛾 (c) The MLE of mean
The second derivative in Newton-Raphson:
• Provides information on the curvature of the log-likelihood function.
• Helps adjust step size and direction to improve convergence.
• Reduces the number of iterations needed for convergence,
especially in cases with well-behaved functions.
Since 𝛾̂ is slightly greater than 1, the estimated hazard increases slightly over time (for 𝛾 = 1 the Weibull reduces to the exponential with constant hazard).
Now let's apply the estimated parameters to S(t).
After calculating S(8)=0.6472, it means that there is a 64.72% probability that a patient will remain in
remission for at least 8 weeks. In other words, 64.72% of the patients are expected to survive beyond the
8-week mark.
Estimation of λ and γ for Data with Censored Observations:
Suppose that there are n persons in the study; some are followed to death or failure and the others are censored.
Let t1, t2, ..., tn be the observed survival or censoring times of the n people.
The likelihood
function:
Let δi be the censoring indicator, where δi = 1 if the event was observed (uncensored) and δi = 0 if the observation is censored.
the log-likelihood function:
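With the censoring indicator δi, the elided formulas take the standard form:

```latex
L(\lambda,\gamma) = \prod_{i=1}^{n} f(t_i)^{\delta_i}\, S(t_i)^{1-\delta_i},
\qquad
\ell(\lambda,\gamma) = \sum_{i=1}^{n} \delta_i
  \bigl[\log(\lambda\gamma) + (\gamma-1)\log(\lambda t_i)\bigr]
  - \sum_{i=1}^{n} (\lambda t_i)^{\gamma}
```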
Maximum likelihood Estimation of λ ,𝛾
for Data with Censored Observations
The MLE of λ :
The MLE of 𝛾 :
The MLE of mean: Calculating the exact mean can be more complex and generally requires
numerical integration. The mean will depend on the censoring level and the
distribution of the censored observations, as well as the parameter estimates λ, 𝛾
Example on the MLE Weibull Censored Data
Suppose that in a laboratory experiment 10 mice are exposed to carcinogens. The experimenter decides to terminate the study after half of the mice are dead and to sacrifice the other half at that time. The survival times of the five dead mice are 4, 5, 8, 9, and 10 weeks. The survival data of the 10 mice are therefore 4, 5, 8, 9, 10, 10+, 10+, 10+, 10+, and 10+. Assuming that the failure of these mice follows a Weibull distribution, obtain:
(a) The MLE of λ (b) The MLE of 𝛾 (c) The MLE of mean
where r is the number of uncensored data
points.
Step 3 Partial Derivative with Respect to λ:
Setting the derivative to 0 we get
Step 4 Partial Derivative with Respect to 𝛾:
Setting the derivative to 0 we get
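With the parameterization above, the elided estimating equations are (a reconstruction; the (λt)^γ sums run over all n times, the others over the r uncensored times):

```latex
\hat{\lambda} = \left(\frac{r}{\sum_{\text{all}} t_i^{\hat\gamma}}\right)^{1/\hat\gamma},
\qquad
\frac{r}{\hat\gamma} + \sum_{\text{unc}} \log(\hat\lambda t_i)
 - \sum_{\text{all}} (\hat\lambda t_i)^{\hat\gamma} \log(\hat\lambda t_i) = 0
```

These reproduce the numerical estimates quoted below.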
The two estimating equations generally do not have a closed-form solution, so they must be solved numerically; in practice we use R or other statistical software to solve them.
After using R to estimate the parameters, we obtain 𝛾̂ ≈ 1.938 and λ̂ ≈ 0.079.
Now let's apply the estimated parameters to S(t).
For instance let’s calculate S(8) by substituting t=8, λ=0.07900991, 𝛾= 1.93827
After getting S(8)=0.6630, This result means that the probability a
mouse survives beyond 8 weeks is approximately 66.3%.
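As a cross-check, the same data can be fitted with survreg from the survival package. Note that survreg uses S(t) = exp(−(t/b)^a) with b = exp(intercept) and a = 1/scale, so the slide's λ corresponds to 1/b; this conversion matches the parameterization assumed above:

```r
library(survival)

time  <- c(4, 5, 8, 9, 10, 10, 10, 10, 10, 10)
event <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)   # 1 = death observed, 0 = sacrificed (censored)

fit <- survreg(Surv(time, event) ~ 1, dist = "weibull")
gamma_hat  <- 1 / fit$scale                     # shape, should be close to 1.94
lambda_hat <- exp(-as.numeric(coef(fit)))       # rate, should be close to 0.079
S8 <- exp(-(lambda_hat * 8)^gamma_hat)          # estimated S(8), close to 0.663
```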
LOG-LOGISTIC
DISTRIBUTION
The Log-Logistic Distribution in Survival Analysis
The log-logistic distribution is a continuous probability distribution often used in survival analysis and reliability engineering. It has two parameters: the shape parameter (α), which affects the tail behavior and hazard function peak, and the scale parameter, here denoted (γ) and often written (β), which stretches or compresses the distribution along the time axis. This distribution is versatile for modeling data with various hazard rates and is particularly useful for lifetimes and failure times, accommodating the heavy-tailed data commonly found in practice.
Density function: f(t) = (α/γ)(t/γ)^(α−1) / [1 + (t/γ)^α]², t ≥ 0, α > 0, γ > 0
Survivor function: S(t) = 1 / [1 + (t/γ)^α]
Cumulative hazard function: H(t) = ln[1 + (t/γ)^α]
Estimation of γ,α for Data without Censored Observations:
Suppose that there are n persons in the study and everyone is followed to death or failure
Let t1 , t2 ,..., tn be the exact survival times of the n people.
The likelihood
function:
the log-likelihood function:
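Substituting the log-logistic density (shape α, scale γ, as defined above) gives, in standard form:

```latex
\ell(\alpha,\gamma) = n\log\alpha - n\log\gamma
  + (\alpha-1)\sum_{i=1}^{n}\log\!\left(\frac{t_i}{\gamma}\right)
  - 2\sum_{i=1}^{n}\log\!\left[1 + \left(\frac{t_i}{\gamma}\right)^{\alpha}\right]
```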
Given the following remission times (in weeks) for 5 patients
with acute leukemia: t1=1, t2=2 , t3=4 , t4=6 , t5=8.
Step 1: Write Down the Likelihood Function
Step 2: get the log-likelihood function
Step 3: Substitute the Density Function into the
Log-Likelihood Function
Step 4: Differentiate the Log-Likelihood Function
The previous equations are nonlinear and need to be solved
numerically using methods such as the Newton-Raphson method
Step 1: Organize the data: 2, 3.5, 5, 7, 9, 10, 15, 20, 30, 40
Step 2: Calculate the Kaplan-Meier estimator of S(t).
Step 3: Substitute the density function into the log-likelihood function.
Step 4: Choose trial values for α and γ, compare the resulting fit, update α and γ, and repeat with new trial values until convergence.
library(survival)
library(car)

# Simulate exponential survival times with roughly 30% censoring
set.seed(42)
survival_time <- rexp(100, rate = 0.2)
censoring <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))
surv_data <- Surv(time = survival_time, event = censoring)

# Q-Q plots are drawn on the uncensored times only
uncensored_data <- survival_time[censoring == 1]

qqPlot(uncensored_data, distribution = "exp", rate = 0.2,
       main = "QQ Plot - Exponential Distribution (Uncensored Data Only)",
       ylab = "Ordered Survival Times (Uncensored)",
       xlab = "Theoretical Quantiles")

qqPlot(uncensored_data, distribution = "weibull", shape = 1.5, scale = 5,
       main = "QQ Plot - Weibull Distribution (Uncensored Data Only)",
       ylab = "Ordered Survival Times (Uncensored)",
       xlab = "Theoretical Quantiles")
set.seed(42)
lambda <- 0.2
survival_time <- rexp(100, rate = lambda)
censoring <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))

# Likelihood: density terms for events, survival terms for censored times
likelihood_function <- function(lambda, survival_time, censoring) {
  uncensored_likelihood <- lambda * exp(-lambda * survival_time[censoring == 1])
  censored_likelihood <- exp(-lambda * survival_time[censoring == 0])
  total_likelihood <- prod(uncensored_likelihood) * prod(censored_likelihood)
  return(total_likelihood)
}

# Evaluate the likelihood over a grid of lambda values
lambda_values <- seq(0.01, 1, by = 0.01)
likelihood_values <- sapply(lambda_values, function(lambda)
  likelihood_function(lambda, survival_time, censoring))

plot(lambda_values, likelihood_values, type = "l", col = "blue", lwd = 2,
     main = "Likelihood vs Lambda for Exponential Distribution (Censored Data)",
     xlab = "Lambda", ylab = "Likelihood")
# Mark the maximizing lambda (the MLE on this grid)
abline(v = lambda_values[which.max(likelihood_values)], col = "red", lwd = 2)
set.seed(42)
n <- 100
shape_param <- 1.5
scale_param <- 2
survival_time <- rweibull(n, shape = shape_param, scale = scale_param)
censoring <- sample(c(0, 1), n, replace = TRUE, prob = c(0.3, 0.7))

# Weibull likelihood: density terms for events, survival terms for censored times
likelihood_function_weibull <- function(shape, scale, survival_time, censoring) {
  uncensored_likelihood <- (shape / scale) *
    (survival_time[censoring == 1] / scale)^(shape - 1) *
    exp(-(survival_time[censoring == 1] / scale)^shape)
  censored_likelihood <- exp(-(survival_time[censoring == 0] / scale)^shape)
  total_likelihood <- prod(uncensored_likelihood) * prod(censored_likelihood)
  return(total_likelihood)
}

# Evaluate the likelihood on a (shape, scale) grid
shape_values <- seq(0.1, 3, by = 0.1)
scale_values <- seq(0.5, 5, by = 0.1)
likelihood_values <- outer(shape_values, scale_values, Vectorize(function(shape, scale) {
  likelihood_function_weibull(shape, scale, survival_time, censoring)
}))

# Plot the likelihood surface
persp(shape_values, scale_values, likelihood_values, theta = 30, phi = 30,
      col = "lightblue", ltheta = 120, shade = 0.5,
      xlab = "Shape Parameter (k)", ylab = "Scale Parameter (λ)", zlab = "Likelihood",
      main = "Likelihood Surface for Weibull Distribution (Censored Data)")
set.seed(42)
n <- 100
shape_param <- 1.5
scale_param <- 2

# Simulate log-logistic survival times via the inverse-CDF method
# (use a single uniform draw per observation; calling runif() twice is a bug)
u <- runif(n)
survival_time <- scale_param * (u / (1 - u))^(1 / shape_param)
censoring <- sample(c(0, 1), n, replace = TRUE, prob = c(0.3, 0.7))

# Log-logistic likelihood: density f(t) for events, survival S(t) for censored times
loglogistic_likelihood <- function(alpha, beta, survival_time, censoring) {
  t_event <- survival_time[censoring == 1]
  uncensored_likelihood <- (alpha / beta) * (t_event / beta)^(alpha - 1) /
    (1 + (t_event / beta)^alpha)^2
  censored_likelihood <- 1 / (1 + (survival_time[censoring == 0] / beta)^alpha)
  total_likelihood <- prod(uncensored_likelihood) * prod(censored_likelihood)
  return(total_likelihood)
}

# Evaluate the likelihood on an (alpha, beta) grid
alpha_values <- seq(0.5, 3, by = 0.1)
beta_values <- seq(0.5, 5, by = 0.1)
likelihood_values <- outer(alpha_values, beta_values, Vectorize(function(alpha, beta) {
  loglogistic_likelihood(alpha, beta, survival_time, censoring)
}))
likelihood_values[1:5, 1:5]

persp(alpha_values, beta_values, likelihood_values, theta = 30, phi = 30,
      col = "lightblue", ltheta = 120, shade = 0.5,
      xlab = "Shape Parameter (α)", ylab = "Scale Parameter (β)", zlab = "Likelihood",
      main = "Likelihood Surface for Log-Logistic Distribution (Censored Data)")
Any Questions?