Final-Review BIOSTATS PHD

The document is a review of key statistical tools and concepts in biostatistics, focusing on methods for analyzing continuous and categorical data. It covers correlation, regression, t-tests, ANOVA, non-parametric tests, binomial tests, and chi-square tests, providing guidelines on when to use each method with example scenarios. The review emphasizes the importance of study design and data distribution in making accurate statistical inferences about populations.

Uploaded by

seamusmcfitz3091

Final Exam Review

Introduction to Biostatistics 171:161

The Main Idea
• Statistics are a tool for us to take information from a smaller group
and use it to reach conclusions/answer questions about a population.
• Most of what we do is based off of this:
• We look at study design to make sure that the smaller group is reflective of
the population we are interested in.
• We look at the distribution of our data and choose methods that will give us
the most accurate answers.
• We also look at the questions we are trying to answer and choose methods
that best address them.
Our Tools and When to Apply them
• The rest of this review will be a summary of the main statistical tools
and concepts we have learned in the class.
• We will look at all the tools for continuous data first, and categorical
data second.
• The format will look like:
• An introduction of a tool/method
• When to use it/when not to use it
• An example question that hints at that method being appropriate.
Correlation
• Correlation is a point estimate of the association between two
continuous (or nearly continuous) variables
• The correlation coefficient is represented by the symbol “r” and takes
on values between -1 and 1.
• -1 is a perfect negative correlation, 0 is no correlation, +1 is perfect
positive correlation.
• Correlation is symmetric (the r measuring height vs. weight is the
same as the r measuring weight vs. height)
Visual of Different Correlation Coefficients
When to use Correlation
• Correlation is used as a summary statistic for the association between
two continuous variables, so use it if you are asked to measure the
strength of association between two continuous variables.

Example Scenario:
Each day a local ice cream shop keeps track of its sales and the
temperature that day. The store manager is interested in seeing if
there is a relationship between sales and temperature. How might the
manager assess the possibility of a relationship?
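As a minimal sketch of how the manager might compute r, the snippet below applies the usual Pearson formula to a week of temperature and sales figures. The data values and the function name `pearson_r` are hypothetical and only serve the illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: daily high temperature (degrees F) and sales (dollars)
temperature = [62, 68, 71, 75, 80, 84, 88, 91]
sales = [210, 250, 265, 310, 340, 390, 410, 450]

r = pearson_r(temperature, sales)
print(f"r = {r:.3f}")

# Symmetry: swapping the two variables gives the same coefficient
print(math.isclose(r, pearson_r(sales, temperature)))
```

A value of r near +1 would suggest that sales tend to rise with temperature; it does not by itself predict sales for a given day.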
Regression
• Regression shares a lot of similarities with correlation
• Both are often shown visually through scatter plots
• Both are often used to assess the relationship between two continuous variables
• However, regression is focused on predicting/modeling one variable
based on knowing another.
• The equation has the form of a line:
$Y = \alpha + \beta X$, where $X$ is the predictor, $\beta$ is the slope,
$\alpha$ is the intercept, and $Y$ is the response
• Regression equations are not symmetric (the equation where X
represents height, Y represents weight will not have the same
alpha/beta as the reverse)
When to Use Regression
• Use regression when you want to predict a continuous outcome
based off of knowing a related variable.

Example Scenario:
Each day a local ice cream shop keeps track of its sales and the
temperature that day. The store manager is interested in projecting his
sales over the next week based off of the weather forecast. Is there a
way to do this? How?
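One way to approach this scenario is to fit the line Y = α + βX and plug in the forecast temperatures. The sketch below assumes the line is fit by ordinary least squares (the slides do not name the fitting method); the data, the forecast values, and the function name `fit_line` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept (alpha) and slope (beta) for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical past data: temperature (degrees F) and sales (dollars)
temperature = [62, 68, 71, 75, 80, 84, 88, 91]
sales = [210, 250, 265, 310, 340, 390, 410, 450]

alpha, beta = fit_line(temperature, sales)

# Hypothetical forecast for three upcoming days
forecast = [70, 78, 85]
predicted_sales = [alpha + beta * t for t in forecast]
print(predicted_sales)

# Not symmetric: regressing temperature on sales gives a different line
alpha_rev, beta_rev = fit_line(sales, temperature)
print((alpha, beta), (alpha_rev, beta_rev))
```

Predictions are most trustworthy for temperatures inside the range seen in the past data.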
The One-Sample t-test
• The t-distribution is used over the Normal distribution when
population standard deviation is unknown (we use the sample
standard deviation as its estimate)
• For hypothesis testing a One-Sample t-statistic has the form:
$t_{obs} = \frac{\bar{x} - \mu_0}{S / \sqrt{n}}$

where $\bar{x}$ is the sample average, $\mu_0$ is the hypothesized value (zero in paired cases), $S$ is the sample standard deviation, and $n$ is the sample size.
When to use a One-Sample t-test
• Use this t-test when you have 1 group (or 2 paired groups) with
continuous, normally distributed data that you wish to compare to a
hypothesis.
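The one-sample statistic can be sketched in a few lines. The example below uses a paired setting, so the test is run on the differences with a hypothesized value of zero. The measurements and the function name `one_sample_t` are hypothetical; the degrees of freedom are n - 1.

```python
import math
import statistics

def one_sample_t(data, mu0=0.0):
    """Return (t_obs, degrees of freedom) for a one-sample t-test."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    t_obs = (xbar - mu0) / (s / math.sqrt(n))
    return t_obs, n - 1

# Hypothetical paired measurements on the same six subjects
before = [142, 138, 150, 145, 160, 155]
after = [135, 136, 144, 140, 151, 150]

# Paired case: analyze the differences and test against zero
differences = [b - a for b, a in zip(before, after)]
t_obs, df = one_sample_t(differences, mu0=0.0)
print(f"t = {t_obs:.3f} on {df} degrees of freedom")
```

The resulting t is then compared to a t-distribution with the stated degrees of freedom to obtain a p-value.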
The Two-Sample t-test
• The two-sample t-test is an extension of the one-sample test. It is used to
assess the difference between the means of two unpaired groups.

• $t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SD_{pooled}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, with degrees of freedom = $n_1 + n_2 - 2$

• Use the two-sample t-test when comparing the means of two groups of
continuous, normally distributed data. Use Student’s test when you are
willing to assume equal standard deviations for each group; use Welch’s
test when you are not.
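To make the Student/Welch distinction concrete, the sketch below computes both statistics for the same two groups. It assumes the standard pooled standard deviation (a weighted average of the two sample variances) and the Welch-Satterthwaite degrees of freedom, neither of which is spelled out on the slide. The data and function names are hypothetical.

```python
import math
import statistics

def student_t(g1, g2):
    """Pooled (equal-SD) two-sample t-statistic and degrees of freedom."""
    n1, n2 = len(g1), len(g2)
    var1, var2 = statistics.variance(g1), statistics.variance(g2)
    # Pooled SD: weighted average of the two sample variances
    sd_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    diff = statistics.mean(g1) - statistics.mean(g2)
    t_obs = diff / (sd_pooled * math.sqrt(1 / n1 + 1 / n2))
    return t_obs, n1 + n2 - 2

def welch_t(g1, g2):
    """Welch (unequal-SD) t-statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(g1), len(g2)
    v1 = statistics.variance(g1) / n1
    v2 = statistics.variance(g2) / n2
    t_obs = (statistics.mean(g1) - statistics.mean(g2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_obs, df

# Hypothetical measurements for a treatment and a control group
treatment = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7]
control = [6.2, 6.8, 5.9, 7.1, 6.5, 6.6]

print(student_t(treatment, control))
print(welch_t(treatment, control))
```

When the group sizes and sample standard deviations are similar, the two versions give nearly the same answer; they diverge as the spreads or sample sizes become unequal.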
ANOVA (analysis of variance)
• Think of ANOVA as an extension of the t-test used to assess the difference
between the means of 3 or more treatment groups.
• The test statistic is a ratio of variability being explained by a model to the
variability that remains unexplained.
• The test statistic follows the F-Distribution, and has the form:
$F_{obs} = \frac{(SSR_0 - SSR_1)/(d_1 - d_0)}{\sigma^2}$
• Here $\sigma^2$ is found by summing the squares of the residuals and then dividing
by (n - #groups)
• $SSR_1$ is the sum of squares of the residuals in the full model
• $SSR_0$ is the sum of squares of the residuals in the null model
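The F statistic can be worked through directly from these definitions. The sketch below makes two assumptions that the slide leaves implicit: the null model fits a single grand mean to everyone, the full model fits a separate mean per group, and d0 and d1 count the parameters in those models (1 and the number of groups). The data and the function name `anova_f` are hypothetical.

```python
def anova_f(groups):
    """Return (F_obs, numerator df, denominator df) for one-way ANOVA."""
    all_values = [y for g in groups for y in g]
    n = len(all_values)
    k = len(groups)

    # Null model: every observation is predicted by the grand mean
    grand_mean = sum(all_values) / n
    ssr0 = sum((y - grand_mean) ** 2 for y in all_values)

    # Full model: every observation is predicted by its own group mean
    ssr1 = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    # Residual variance estimate from the full model
    sigma2 = ssr1 / (n - k)

    # Assumed parameter counts: 1 for the null model, k for the full model
    d0, d1 = 1, k
    f_obs = ((ssr0 - ssr1) / (d1 - d0)) / sigma2
    return f_obs, d1 - d0, n - k

# Hypothetical responses under three treatments
groups = [
    [23.1, 24.5, 22.8, 25.0],
    [26.2, 27.9, 25.5, 27.1],
    [22.0, 21.4, 23.3, 22.6],
]
f_obs, df_num, df_den = anova_f(groups)
print(f"F = {f_obs:.2f} on {df_num} and {df_den} degrees of freedom")
```

A large F means the group means explain much more variability than is left over within groups.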
Non-Parametric Tests
• Non-parametric tests are done by ranking observations, and
comparing the ranks. They are used on continuous data that is highly
skewed or contains extremely influential outliers.
• The Wilcoxon Signed Rank test is the non-parametric version of the one-sample
t-test on paired data
• The Wilcoxon Rank Sum test is the non-parametric version of a two-sample
t-test on unpaired data
• Spearman Correlation is a non-parametric approach to quantifying
association with correlation.
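The ranking idea can be illustrated with Spearman correlation, which is the Pearson correlation computed on ranks instead of raw values. The sketch below assumes tied values receive the average of their ranks; the data and function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data with one extreme outlier in y
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 1000]
print(pearson_r(x, y), spearman_rho(x, y))
```

Because only the ordering matters, the outlier in this example does not pull the Spearman coefficient the way it pulls the Pearson coefficient.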
Binomial Test
The binomial distribution is used for data with binary outcomes.

$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$

- The probability of a certain number of successes (k) out of a certain
number of trials (n) is found using the above formula.
- Often we need to sum many of these to find a p-value.
- The complement rule is commonly used with binomial probabilities.
Binomial Example
A study at Johns Hopkins estimated the survival chances of infants born
prematurely by surveying the records of all premature babies born at their
hospital in a three-year period. In their study, they found 39 babies who
were born at 25 weeks gestation, 31 of whom survived at least 6 months. If
the true survival probability is 50%, how likely is it that 31 or more babies
would survive?

This is a binomial setting because:

#1) The outcome is binary (survived/died)
#2) We are assessing a number of successes out of a number of trials (31
survivals out of 39 possibilities)
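The question in this example can be answered by summing binomial probabilities for k = 31 through 39 with n = 39 and p = 0.5. The sketch below does that directly and also through the complement rule; the function name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 39, 0.5

# Direct sum over the upper tail: P(X >= 31)
p_upper = sum(binom_pmf(k, n, p) for k in range(31, n + 1))

# Complement rule: P(X >= 31) = 1 - P(X <= 30)
p_complement = 1 - sum(binom_pmf(k, n, p) for k in range(0, 31))

print(f"P(X >= 31) = {p_upper:.6f}")
print(f"via complement = {p_complement:.6f}")
```

The probability comes out to roughly 0.00015, so 31 or more survivors would be very unlikely if the true survival probability were 50%.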
Pearson’s Chi-Square and Fisher’s Exact Tests
• Pearson’s Chi-Square Test is a way of assessing association between
categorical variables, typically used on 2x2 contingency tables
• Fisher’s Exact Test is an exact approach used on contingency tables, and is
the usual choice when cell counts are small
• See the last review session or the last practice quiz for examples.
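As a rough sketch of both tests on a 2x2 table with cells a, b (first row) and c, d (second row): the chi-square version compares observed counts to the counts expected under no association, and the exact version sums hypergeometric probabilities over tables with the same margins. The assumptions here are no continuity correction, a p-value formula that only holds for 1 degree of freedom, and a two-sided Fisher p-value defined as the total probability of tables no more likely than the observed one. The counts and function names are hypothetical.

```python
import math
from math import comb

def pearson_chi_square_2x2(a, b, c, d):
    """Chi-square statistic and p-value (1 df, no continuity correction)."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical table: rows = exposed / unexposed, columns = disease / no disease
print(pearson_chi_square_2x2(12, 8, 5, 15))
print(fisher_exact_2x2(12, 8, 5, 15))
```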
Example #1
• A team from Yale School of Medicine took a look at 1,433 people
diagnosed with intracranial meningioma, the most commonly
diagnosed brain tumor in the United States. Researchers compared
these patients to a test group of 1,350 people without tumors.
Participants offered self-reported lifetime dental X-ray histories.
Researchers then analyzed the different types of X-rays these two
groups had undergone. Patients with tumors were more than twice
as likely to have had "bitewing" X-rays at least once per year.
Bitewings, in which a patient bites down on X-ray film, take photos of
the upper and lower back teeth.
Example #1 – Key points
• The outcome here was categorical
• This was a retrospective study
• The article uses an odds ratio to measure an association
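In a retrospective design like this one, the numbers of cases and controls are fixed by the researchers, so the risk of disease cannot be estimated directly and the odds ratio is the natural measure. The sketch below shows the calculation on a 2x2 table; the counts are hypothetical and are not taken from the study, and the function name `odds_ratio` is illustrative.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical counts (NOT from the article): yearly bitewing X-rays vs. not
or_estimate = odds_ratio(
    exposed_cases=200, unexposed_cases=800,
    exposed_controls=100, unexposed_controls=900,
)
print(f"odds ratio = {or_estimate:.2f}")
```

An odds ratio above 1 indicates that exposure was more common among cases than among controls.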
Example #2
• In a study of 16 overweight young adults in India, participants were
given, in turns, a dose of an extract made from unroasted coffee
beans and a placebo, three times a day over 22 weeks. Their diet
throughout the study was unchanged, and they were physically
active. Between trials, the participants were given a two-week break
for their bodies to reset. Though a few participants given the extract
only lost 7 pounds, others lost as much as 26 pounds. On average, the
subjects lost 17.5 pounds each, and reduced their body weight by
10.5 percent. Body fat also declined by 16 percent, even though the
participants were eating an average of 2,400 calories and burning
roughly 400.
Example #2 – Key points
• This was a cross-over study design
• These data could be analyzed using a 1-sample t-test testing whether
the weight loss could be zero.
• No evidence of outliers: the mean (17.5 pounds) falls close to the midpoint
of the minimum and maximum (7 and 26 pounds).
Example #3
• Researchers at University College London surveyed nearly
8,000 participants over the age of 52. Using a fake aspirin bottle
complete with instructions as the testing instrument, researchers
asked participants to answer four basic questions, including "What is
the maximum number of days you may take this medicine?" and "List
three situations for which you should consult a doctor." All the
answers could be found on the label. One third of the adults failed to
correctly answer all four questions, and one in eight got two or more
wrong. Researchers then monitored the volunteers' health for five
years. During that time, 621 of the participants died, and people who
missed two or more questions were more than twice as likely to have
died as those who got the answers correct.
Example #3 – Key points
• Researchers subdivided the participants into 2 groups and assessed
the differences between these two groups.
• The outcome was survival, measured as a binary outcome. The
researchers used relative risk as a measure of association.
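Because participants were followed forward in time, the risk of death can be estimated in each group and compared as a ratio. The sketch below shows the relative risk calculation; the counts are hypothetical and are not taken from the study, and the function name `relative_risk` is illustrative.

```python
def relative_risk(events_group1, total_group1, events_group2, total_group2):
    """Risk in group 1 divided by risk in group 2."""
    risk1 = events_group1 / total_group1
    risk2 = events_group2 / total_group2
    return risk1 / risk2

# Hypothetical counts (NOT from the article):
# group 1 missed two or more questions, group 2 answered all four correctly
rr = relative_risk(
    events_group1=160, total_group1=1000,
    events_group2=320, total_group2=5000,
)
print(f"relative risk = {rr:.2f}")
```

A relative risk of 2 would mean the first group died at twice the rate of the second over the follow-up period.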
Example #4
• Researchers from Penn State found that increasing the amount of
spices in your diet may lower the level of potentially harmful fat in
your bloodstream. The experiment compared two groups of healthy,
overweight men. One group ate meals seasoned with the special
spice blend; the other ate the same meals prepared without the
spices. Men who ate the spicy food saw a decrease of one-third in the
level of triglycerides (a type of fat linked to heart disease) in their
bloodstreams, and 20 percent lower insulin levels overall — even
when the meals were high in fat and made with heavy oils.
Example #4 – Key points
• The outcomes measured here are continuous (the difference in
triglycerides/insulin)
• There are two groups, a treatment and a control group
• The data could be tested using a two-sample t-test; the researchers
reported using a ratio of means.
