BM005IU - Statistics for Health Science
Lab 4: Hypothesis Testing
LEARNING OUTCOMES
• Understand how to conduct t-tests (one-sample, two-sample, paired) using RStudio
• Be able to calculate the power and sample size for a study that uses a t- or z-test.
1. Student t-Test
In the base version of R, the function t.test() is a very versatile tool for hypothesis
testing. An example of the syntax is
t.test(x, y, alternative = "two.sided", mu = 0, paired =
FALSE, conf.level = 0.95)
Argument      Meaning
x             The first numeric data vector
y             The second numeric data vector (optional; only for two-sample tests)
alternative   Type of alternative hypothesis: "two.sided", "less", or "greater"
mu            The hypothesized population mean (one-sample test) or difference in means (two-sample or paired test); default 0
paired        Type of two-sample comparison: TRUE (paired samples) or FALSE (independent samples)
conf.level    Confidence level (default 0.95); only used to compute the confidence interval
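As a rough illustration of how these arguments combine, the sketch below runs the three common variants of the test. The vectors before and after are made-up values chosen only to show the syntax; they are not lab data.

```r
# Hypothetical measurements, invented only to illustrate the syntax
before <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.5)
after  <- c(11.6, 11.0, 12.8, 12.1, 11.7, 12.0)

# One-sample test: does the mean of 'before' differ from 12?
one_sample <- t.test(before, mu = 12)

# Paired test: element i of 'before' is matched with element i of 'after'
paired <- t.test(before, after, paired = TRUE)

# Independent two-sample test (Welch's test, since var.equal = FALSE by default)
independent <- t.test(before, after, paired = FALSE)

# Each result is a list; 'method' records which test R actually ran
print(one_sample$method)
print(paired$method)
print(independent$method)
```

Checking the method element is a quick way to confirm that the arguments selected the test you intended.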
Example 1: The heights of 10 people are given below. Height in the original population is
normally distributed with a known mean of 165 cm and unknown variance.
186 169 177 180 178 173 187 173 192 173
a. Create a vector containing the heights of the group
height1 <- c(186, 169, 177, 180, 178, 173, 187, 173, 192, 173)
b. Perform a hypothesis test to determine whether the mean height of this group of 10 people
differs from the population mean.
t.test(height1, mu = 165)
You should obtain the following output:
data: height1
t = 5.8808, df = 9, p-value = 0.0002346
alternative hypothesis: true mean is not equal to 165
95 percent confidence interval:
173.4916 184.1084
sample estimates:
mean of x
178.8
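The numbers in this output can be reproduced by hand from the one-sample t formula, t = (mean − mu) / (s / sqrt(n)). The following is a minimal sketch using only base R functions; the variable names are chosen for illustration.

```r
height1 <- c(186, 169, 177, 180, 178, 173, 187, 173, 192, 173)
mu0 <- 165                          # hypothesized population mean

n  <- length(height1)
se <- sd(height1) / sqrt(n)         # standard error of the sample mean
t_stat <- (mean(height1) - mu0) / se
df <- n - 1                         # degrees of freedom

# Two-sided p-value: probability of a |t| at least this large under H0
p_value <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# 95% confidence interval for the population mean
ci <- mean(height1) + c(-1, 1) * qt(0.975, df = df) * se

print(c(t = t_stat, df = df, p = p_value))
print(ci)
```

The printed values should match the t, df, p-value, and confidence interval reported by t.test() above.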
Note that the output does not directly state whether the difference is statistically significant,
but a p-value is always reported. Recall from the lecture that a result is considered
statistically significant when the p-value is lower than the chosen significance level (in this
case, 0.05).
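If you prefer to make this comparison in code rather than by reading the printout, the p-value can be extracted from the object returned by t.test(). A small sketch, where alpha and result are names chosen for this illustration:

```r
height1 <- c(186, 169, 177, 180, 178, 173, 187, 173, 192, 173)
alpha <- 0.05                        # chosen significance level

result <- t.test(height1, mu = 165)  # store the test instead of printing it

# TRUE when the p-value falls below the significance level
significant <- result$p.value < alpha
print(significant)
```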
Task 1:
a. The dataset height2 contains the heights of the twin siblings of each individual in
height1. Perform a hypothesis test to determine whether the heights of twin siblings
differ significantly at the 0.05 significance level.
height1 <- c(186, 169, 177, 180, 178, 173, 187, 173, 192, 173)
height2 <- c(179, 186, 160, 168, 169, 175, 168, 151, 153, 179)
b. Repeat the test in question (a), but treat the two samples as independent.
What do you notice about the result?
c. The datasets height3 and height4 contain the heights of a small sample of people
born in the west district and the south district, respectively. Determine whether the
mean heights of the two districts differ significantly at the 0.05 significance level.
height3 <- c(183, 163, 173, 175, 173, 168, 184)
height4 <- c(164, 182, 164, 176, 184, 154, 163, 164, 170, 163,
143)
Task 2: Two groups of mice are said to come from the same population. The weights of
the mice in each group are given below.
Mice1 <- c(30.6, 29.2, 27.3, 29.6, 25.9, 28.8, 29.1, 28.3,
29.1, 30, 31.9, 27.2, 26.9, 29.8, 29.7, 30.2, 30.8, 30.9,
33.9, 31.5, 29.8, 28.4, 28.7, 30.8, 31.3, 31.3, 31.1, 25.6,
31.9, 30.6, 31.5, 28.5, 27.9, 28.7, 26.3, 29, 31.5, 28.1,
30.5, 29.7, 28.8, 30.2, 29.4, 29.3, 28.9)
Mice2 <- c(30.4, 28.9, 25.5, 30.5, 27.5, 24.1, 27.5, 28.1,
25.6, 26, 30.6, 26.6)
a. Produce a graphical representation of the distribution of each group.
b. Using appropriate statistics, describe the location and the spread of each group.
c. Calculate the difference of the means 𝑑. Assuming the two groups come from the
same population, what is the probability of obtaining two groups whose means
differ by 𝑑 or more?
d. Compute the 95% confidence interval of 𝜇1 − 𝜇2. Based on this information, what
can you conclude about the claim that these two groups come from the same
population (i.e. have the same weight distribution)?
2. Power Calculation
The power of a test is the probability of correctly rejecting the null hypothesis, i.e.
Pr(reject H0 | H0 false). It is related to the type II error (false negative) rate β:
β = 1 − Power
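For a two-sided one-sample z-test, the power has a closed form that can be evaluated with base R alone. The sketch below is one way to write it; z_power is a helper defined here for illustration (it is not part of any package), and the values d = 0.5 and n = 20 are hypothetical.

```r
# Power of a two-sided one-sample z-test.
# d     : scaled effect size (mu1 - mu0) / sigma
# n     : sample size
# alpha : significance level
z_power <- function(d, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha / 2)   # two-sided critical value
  shift  <- d * sqrt(n)            # mean of the test statistic when H0 is false
  # Probability that the statistic lands in either rejection region
  pnorm(shift - z_crit) + pnorm(-z_crit - shift)
}

power <- z_power(d = 0.5, n = 20)  # hypothetical effect size and sample size
beta  <- 1 - power                 # type II error rate

print(c(power = power, beta = beta))
```

When d = 0 the function returns the significance level itself, which is a useful sanity check: with no true effect, the only rejections are false positives.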
The power of a test can be calculated using the functions in the package “pwr”.
pwr.norm.test()
Argument      Meaning
d             Scaled effect size (𝜇1 − 𝜇0)/𝜎
n             Sample size (leave as NULL to calculate the sample size)
sig.level     Significance level, usually set to 0.05
power         Power, usually set to 0.8 (leave as NULL to calculate the power)
alternative   Type of alternative hypothesis: "two.sided", "less", or "greater"
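The sketch below shows the two typical ways of calling pwr.norm.test(): once to obtain the power for a given sample size, and once to obtain the sample size for a target power. The effect size and sample size are hypothetical, and the requireNamespace() check is only there to avoid an error if the package has not been installed yet.

```r
if (requireNamespace("pwr", quietly = TRUE)) {
  library(pwr)

  # Power for a given sample size: 'power' is left out, so it is the unknown
  res_power <- pwr.norm.test(d = 0.5, n = 20, sig.level = 0.05,
                             alternative = "two.sided")
  print(res_power$power)

  # Sample size for a target power: 'n' is left out, so it is the unknown
  res_n <- pwr.norm.test(d = 0.5, power = 0.8, sig.level = 0.05,
                         alternative = "two.sided")
  print(ceiling(res_n$n))   # round up to a whole number of observations
} else {
  message("Install the package first with install.packages(\"pwr\")")
}
```

The returned sample size is usually not an integer, so it is rounded up with ceiling() to guarantee at least the requested power.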
pwr.t.test() is applicable to both one-sample and two-sample tests. However, in the
two-sample case it assumes that both groups have the same size (n is the number of
observations in each group).
Argument      Meaning
d             Scaled effect size (𝜇1 − 𝜇0)/𝑠 (for a two-sample test, use the pooled standard deviation)
n             Sample size (leave as NULL to calculate the sample size)
sig.level     Significance level, usually set to 0.05
power         Power, usually set to 0.8 (leave as NULL to calculate the power)
type          Type of test: "one.sample", "two.sample", or "paired"
alternative   Type of alternative hypothesis: "two.sided", "less", or "greater"
Example 2: Calculate the power of a one-sample t-test at the 0.05 significance level with a
sample size of 12, where 𝜇1 = 256, 𝜇0 = 250, and the sample standard deviation is 9.5.
library(pwr)
pwr.t.test(d = (256 - 250) / 9.5, n = 12, sig.level = 0.05,
           type = "one.sample")
Your output should be:
One-sample t test power calculation
n = 12
d = 0.6315789
sig.level = 0.05
power = 0.5142725
alternative = two.sided
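As a cross-check that does not require the pwr package, the same power can be computed from the noncentral t distribution in base R. This is a sketch of the standard calculation for a two-sided one-sample t-test, with variable names chosen for illustration.

```r
n     <- 12
d     <- (256 - 250) / 9.5          # scaled effect size from Example 2
alpha <- 0.05

df     <- n - 1
ncp    <- d * sqrt(n)               # noncentrality parameter when H0 is false
t_crit <- qt(1 - alpha / 2, df = df)

# Probability that the t statistic falls in either rejection region
power <- pt(t_crit, df = df, ncp = ncp, lower.tail = FALSE) +
  pt(-t_crit, df = df, ncp = ncp)

print(power)
```

The result should agree closely with the power reported by pwr.t.test() above.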
Task 3: The dissolution time of pharmaceutical tablet X is normally distributed with
a well-established variance of 47 min² and a mean of 30 min. Tablets were sampled at
random from three batches for quality assessment:
Batch A: 15 tablets were taken with an average dissolution time of 34 min
Batch B: 18 tablets were taken with an average dissolution time of 29 min
Batch C: 25 tablets were taken with an average dissolution time of 35 min
a. Calculate the p-value when comparing each batch with the standard 30 min
dissolution time. Which batches differ from the standard with statistical significance?
b. Determine the power of each of the three tests.
You may notice that in the previous calculations the power of a test was determined AFTER
the data had been collected (i.e. post hoc). Post hoc power analysis estimates the
probability that the test would detect the observed effect; its complement, 1 − Power, is
the probability of a false negative result.
Power can also be fixed in advance at the planning stage of a study (i.e. a priori)
and used to calculate the sample size required to achieve it.
To do this, set n to NULL and specify the target power, e.g. power = 0.8. The difference
between 𝜇0 and 𝜇1 might be chosen based on clinical, biological, or legal relevance.
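An a priori calculation of this kind might look like the sketch below. The planning values delta and s are hypothetical and are not taken from any task in this lab; the requireNamespace() check only guards against the package being missing.

```r
if (requireNamespace("pwr", quietly = TRUE)) {
  library(pwr)

  # Hypothetical planning values, chosen only for illustration
  delta <- 4     # smallest difference considered worth detecting
  s     <- 10    # assumed standard deviation

  # n = NULL marks the sample size as the quantity to solve for
  plan <- pwr.t.test(d = delta / s, n = NULL, power = 0.8,
                     sig.level = 0.05, type = "one.sample")

  n_required <- ceiling(plan$n)   # round up to a whole number of subjects
  print(n_required)
} else {
  message("Install the package first with install.packages(\"pwr\")")
}
```

Rounding up rather than to the nearest integer ensures that the achieved power is not below the target.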
c. A difference of 5 minutes or lower is considered acceptable. Calculate the
minimum number of tablets needed to be sampled so that the power to detect a
difference of 5 minutes is 0.8.
3. Questions
Question 1: An elevated serum level of the enzyme aspartate aminotransferase (AST) is an
indicator of liver disease. A new nanoparticle formulation N is supposed to reduce the
serum AST level in patients with liver inflammation. To test this claim, a group of
individuals is given N for 6 months. The researchers want to detect a change of 5 units
per litre over the study with a significance level of 0.05 and a power of 0.8.
a. Assume the difference in serum AST is normally distributed with standard
deviation 𝜎𝐷 = 13 units per litre. Calculate the minimum sample size
required.
b. Due to logistical constraints, you are only allowed to recruit up to 30 patients
in your study. Calculate the power if you recruit this number of participants.
Question 2: Triceps Skinfold Thickness (TST) (cm) is an indicator of the amount of fat
stored in the human body. One study compares the TST in group 1 (people without
chronic respiratory diseases) and group 2 (people with chronic respiratory disease). For
this question, you will need to import or enter the data in the dataset “Lab4.csv” on
Blackboard.
a. Find a 95% confidence interval estimate of the mean TST in each group.
b. From previous studies, the population mean TST is assumed to be 1.3 cm.
Perform an appropriate test to compare the mean TST of each group with this value
at a significance level of 0.05. Please state the test statistic, p-value, and the
conclusion of the test in your response.
c. Perform an appropriate hypothesis test to compare the mean TST of the two groups
at a significance level of 0.05. Please state the test statistic, p-value, and the
conclusion of the test in your response.
4. Submission
Please answer ALL questions in this lab. Submit your work as a PDF
file via this link by 13:00 Tuesday 19/05/2025.
_______________________________________________________________
This is the end of Lab 4! Have a great week!