THE T-TEST 1
Hypothesis Testing
The t-test
Hypothesis testing: calculations
from the normal distribution
• Consider: 2 varieties of lettuce, Gigantic and Enormous
• Question:
Is one variety heavier than the other or is there
no noticeable difference?
Comparing between 2 varieties
1. Compare one sample from each?
2. Weigh all lettuce plants from both plots?
3. Take representative samples and use normal
distribution statistics?
• When comparing between the 2 means, how would
we know that the difference we see is not just due to
chance?
• If we can show that this possibility is not likely, then
we can say that the difference between the 2 means is
statistically significant.
Weight (g) of sampled plants:

          Gigantic   Enormous
            787        567
            890        671
            657        540
            765        680
            745        698
            789        520
            459        488
mean        727.4      594.9
s.d.         75.4       79.0

[Figure: bar chart of mean weight (g) for Gigantic and Enormous, with error bars]

Quick check: an overlap in error bars is an early indication that means
are not significantly different from each other.

Gigantic = Enormous
Thus, from a cursory look at the error bars, the considerable overlap
might indicate that the 2 means are not significantly different from
each other (P > 0.05).
Weight (g) of sampled plants:

          Gigantic   Enormous
            741        567
            714        671
            717        540
            765        680
            745        698
            710        520
            700        488
mean        727.4      594.9
s.d.         21.8       79.0

[Figure: bar chart of mean weight (g) for Gigantic and Enormous, with error bars]

Quick check: no overlap.

Gigantic > Enormous
In this particular case, the lack of a considerable overlap in the ranges
of the error bars might indicate that the 2 means are significantly
different from each other (P < 0.05).
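The quick check on the two slides above can be sketched in a few lines of Python. This is only an illustration: it assumes the error bars are drawn as mean ± 1 s.d. (the slides do not say how they were drawn), and the function names are invented here.

```python
def error_bar(mean, sd):
    """Return the (low, high) ends of a mean ± 1 s.d. error bar (assumed)."""
    return (mean - sd, mean + sd)


def bars_overlap(bar_a, bar_b):
    """True if two (low, high) intervals share any values."""
    return bar_a[0] <= bar_b[1] and bar_b[0] <= bar_a[1]


# Means and s.d. values as given in the two lettuce tables.
enormous = error_bar(594.9, 79.0)
gigantic_first = error_bar(727.4, 75.4)    # first data set
gigantic_second = error_bar(727.4, 21.8)   # second data set

print(bars_overlap(gigantic_first, enormous))   # True
print(bars_overlap(gigantic_second, enormous))  # False
```

The overlap test is only a rough screen; it says nothing about how likely the observed difference is.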
Who will believe us?
Although this visual inspection (i.e., checking whether the error bars
overlap considerably) might be useful, it is no guarantee that our
analysis is correct, nor is it an acceptable basis for a conclusion.
What we need is to present an objective basis for
our conclusion through computation.
Hypothesis Testing
• A major goal of statistical testing is to draw inferences
about a population by examining a sample from that
population.
• This can be carried out by testing hypotheses (rejecting or failing to
reject them) that we declare prior to the analysis.
• The null hypothesis, H0
H0: µ1 = µ2
– the means of the 2 populations are equal, OR
– there is no significant difference between the 2 populations, OR
– the mean of population 1 is not significantly different from the mean
of population 2
• The alternate hypothesis, HA
HA: µ1 ≠ µ2
– the means of the 2 populations are not equal, OR
– the mean of population 1 is either less than or greater than the mean
of population 2
The t-test
• probably the most important statistical test in relation to
biological variability
• Objective: to determine if 2 sample means are estimates
of the same population mean
• Method: compares the means with a value called the
standard error of differences between means (s.e.d.m.)
– H0: the 2 means are equal (they are estimates of the same
population mean)
– If we get a value of P > 0.05, then we accept H0 (strictly, we fail to
reject it): the data give no evidence that the means differ
– If the value of P < 0.05, then the means most likely did not come
from the same population (or we can say that there is a
statistically significant difference between the means)
Why P ≤ 0.05?
• Why is 0.05 our threshold value? It is an arbitrary, conventional choice.
• P ≤ 0.05 simply means that, if the 2 means really came from the same
pool of possible outcomes (sample space), a difference at least this
large would turn up by chance in no more than 1 in 20 sets of samples.
• If we are able to show that P is indeed ≤ 0.05, we still cannot claim
with certainty that there is a true difference between the means. Why?
Because nothing is 100% sure in statistics.
• It simply means that, when we declare a result statistically
significant at P < 0.05, we accept a 1 in 20 chance (5%) of drawing the
wrong conclusion in cases where there is in fact no difference between
the means.
The t-test
In the simplest terms, the t-test answers the question:
Is (difference between the means) / s.e.d.m. big enough?
What is “big enough”?
For very large samples (n ≥ 60), the difference should be > 1.96 (or
roughly 2) × s.e.d.m.
95% confidence limits*: Mean ± 1.96 standard deviations/errors
*Note that anything beyond these confidence limits totals 5%; thus the
1-in-20 chance of being wrong mentioned earlier (P < 0.05).
[Figure: normal distribution curve, horizontal axis marked −2s, −1s, Mean, +1s, +2s]
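The link between the factor 1.96 and the 1-in-20 chance can be checked numerically. The sketch below is an illustration, not part of the original slides. It first uses Python's `statistics.NormalDist` to confirm that ±1.96 standard deviations enclose about 95% of a normal distribution. It then simulates many pairs of large samples drawn from the same population and counts how often the ratio exceeds 1.96 anyway. The population mean (600), s.d. (80), sample size, number of trials and seed are all arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

# Share of a normal distribution lying within ±1.96 standard deviations.
z = NormalDist()  # standard normal: mean 0, s.d. 1
coverage = z.cdf(1.96) - z.cdf(-1.96)
print(f"within ±1.96 s.d.: {coverage:.4f}")                     # 0.9500
print(f"z leaving 2.5% in each tail: {z.inv_cdf(0.975):.3f}")   # 1.960


def ratio(a, b):
    """Difference between the means divided by the s.e.d.m."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss = sum((v - mean_a) ** 2 for v in a) + sum((v - mean_b) ** 2 for v in b)
    pooled_variance = ss / ((n_a - 1) + (n_b - 1))
    sedm = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / sedm


def false_alarm_rate(trials=2000, n=60, threshold=1.96, seed=1):
    """Fraction of same-population sample pairs whose ratio exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(600, 80) for _ in range(n)]
        b = [rng.gauss(600, 80) for _ in range(n)]
        if abs(ratio(a, b)) > threshold:
            hits += 1
    return hits / trials


rate = false_alarm_rate()
print(f"'significant' by chance alone: {rate:.3f}")  # expected to be near 0.05
```

The simulated rate will wobble around 0.05 from run to run if the seed is changed, which is the point: the 5% is a long-run frequency, not a property of any single pair of samples.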
For increasingly smaller sample sizes, the factor 1.96 is amplified
to compensate for the increasingly poor estimate of the true
variance. By implication, it becomes increasingly difficult to disprove
the null hypothesis with smaller and smaller sample sizes.
Amplification factors for different degrees of freedom (n − 1) can be
found in the table of “t” values (Student’s t), compiled by
William Sealy Gosset, who published under the pseudonym “Student”.
Student’s t distribution

One-tailed P   0.25   0.20   0.15   0.10   0.05   0.025  0.01   0.005  0.0005
Two-tailed P   0.50   0.40   0.30   0.20   0.10   0.05   0.02   0.01   0.001
df
 1             1.000  1.376  1.963  3.078  6.314  12.71  31.82  63.66  636.6
 2             0.816  1.061  1.386  1.886  2.920  4.303  6.965  9.925  31.60
 3             0.765  0.978  1.250  1.638  2.353  3.182  4.541  5.841  12.92
 4             0.741  0.941  1.190  1.533  2.132  2.776  3.747  4.604  8.610
 5             0.727  0.920  1.156  1.476  2.015  2.571  3.365  4.032  6.869
 6             0.718  0.906  1.134  1.440  1.943  2.447  3.143  3.707  5.959
 7             0.711  0.896  1.119  1.415  1.895  2.365  2.998  3.499  5.408
 8             0.706  0.889  1.108  1.397  1.860  2.306  2.896  3.355  5.041
 9             0.703  0.883  1.100  1.383  1.833  2.262  2.821  3.250  4.781
10             0.700  0.879  1.093  1.372  1.812  2.228  2.764  3.169  4.587
 ∞             0.674  0.842  1.036  1.282  1.645  1.960  2.326  2.576  3.291
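To make the idea of an "amplification factor" concrete, the sketch below copies the two-tailed P = 0.05 column of the table into a Python dictionary and divides each entry by 1.960, the value for an infinitely large sample. The dictionary and function names are illustrative choices, and only the df values printed in the table are covered.

```python
# Two-tailed P = 0.05 column of the t table, keyed by degrees of freedom.
T_TWO_TAILED_05 = {
    1: 12.71, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}
Z_TWO_TAILED_05 = 1.960  # bottom row of the table: infinitely large sample


def critical_t(df):
    """Tabulated t for a two-tailed test at P = 0.05 (df 1 to 10 only)."""
    if df not in T_TWO_TAILED_05:
        raise ValueError("df is outside the portion of the table shown")
    return T_TWO_TAILED_05[df]


for df in (2, 5, 10):
    factor = critical_t(df) / Z_TWO_TAILED_05
    print(f"df = {df:2d}   t = {critical_t(df):.3f}   amplification x{factor:.2f}")
```

With only 2 degrees of freedom the threshold is more than twice the large-sample value, which is why small samples make the null hypothesis hard to disprove.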
2-tailed vs 1-tailed
• 1-tailed test:
H0: µ1 = µ2
HA: µ1 > µ2
Notice that the alternative hypothesis in the 1-tailed test only
considers one possibility.
• 2-tailed test:
H0: µ1 = µ2
HA: µ1 ≠ µ2
– µ1 > µ2
– µ1 < µ2
The alternate hypothesis of the 2-tailed test considers both
possibilities.
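The practical consequence of choosing 1 or 2 tails is that a different column of the t table is used. The sketch below is a hypothetical illustration: the critical values are taken from a Student's t table at P = 0.05 for a few df, while the function name and the t value of 2.0 are made up for the example.

```python
# Critical t values at P = 0.05 for selected df, from a Student's t table.
ONE_TAILED_05 = {4: 2.132, 8: 1.860, 10: 1.812}
TWO_TAILED_05 = {4: 2.776, 8: 2.306, 10: 2.228}


def is_significant(t_calc, df, tails=2):
    """Compare t_calc with the tabulated value for a 1- or 2-tailed test."""
    if tails == 2:
        # HA: µ1 ≠ µ2, so a large difference in either direction counts.
        return abs(t_calc) > TWO_TAILED_05[df]
    if tails == 1:
        # HA: µ1 > µ2, so only a positive difference counts.
        return t_calc > ONE_TAILED_05[df]
    raise ValueError("tails must be 1 or 2")


t_calc = 2.0  # hypothetical calculated t, for illustration only
print(is_significant(t_calc, df=8, tails=1))    # True
print(is_significant(t_calc, df=8, tails=2))    # False
print(is_significant(-t_calc, df=8, tails=1))   # False
```

The same t value can therefore pass a 1-tailed test and fail a 2-tailed one, so the choice of test has to be made before looking at the data.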
The standard t-test
• most common
• For sample sizes of 30 or less
The standard t-test: procedure
1. For each treatment (A and B), calculate the mean of its column of
figures.
2. Calculate the sum of squares of the figures in column A and,
separately, of the figures in column B.
3. Add the two sums of squares together.
4. Divide by the pooled df to obtain the pooled variance.
5. Take the square root to obtain the pooled standard deviation, s.
6. Compute the s.e.d.m.:
s.e.d.m. = s × √(1/nA + 1/nB)
TEST: is the observed difference between the means divided by the s.e.d.m.
> t at P = 0.05 and (nA − 1) + (nB − 1) degrees of freedom?
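The procedure can be written as a short function. This is a minimal sketch rather than a reference implementation: the function names are invented here, the sum of squares is computed as squared deviations from the mean, and the two treatment columns are hypothetical numbers chosen so the arithmetic is easy to follow by hand.

```python
import math


def sum_of_squares(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def standard_t(a, b):
    """Return (t, pooled df) for two independent samples."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    pooled_ss = sum_of_squares(a) + sum_of_squares(b)  # add the two SS
    pooled_df = (n_a - 1) + (n_b - 1)
    pooled_variance = pooled_ss / pooled_df
    s = math.sqrt(pooled_variance)                     # pooled s.d.
    sedm = s * math.sqrt(1 / n_a + 1 / n_b)
    return abs(mean_a - mean_b) / sedm, pooled_df


# Hypothetical data: means 2 and 5, each column with a sum of squares of 2.
treatment_a = [1.0, 2.0, 3.0]
treatment_b = [4.0, 5.0, 6.0]
t, df = standard_t(treatment_a, treatment_b)
print(f"t = {t:.3f} with {df} df")  # t = 3.674 with 4 df
```

For these made-up numbers the tabulated t at P = 0.05 (two-tailed) and 4 df is 2.776, so the difference would be declared significant.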
t-test Decision Guide*
t statistic     P value     decision                conclusion
ttab > tcalc    P > 0.05    Accept H0, reject HA    2 groups are not significantly
                                                    different from each other
ttab < tcalc    P < 0.05    Reject H0, accept HA    2 groups are significantly
                                                    different from each other

*Save yourself precious calories in thinking; use this decision guide as
soon as you obtain your calculated t value or the P value.
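The decision guide amounts to a single comparison, sketched below. The function name and the two calculated t values are hypothetical; 2.306 is the tabulated two-tailed value at P = 0.05 for 8 degrees of freedom.

```python
def decide(t_calc, t_tab):
    """Apply the decision guide to a calculated and a tabulated t value."""
    if abs(t_calc) > t_tab:
        return "reject H0, accept HA: groups are significantly different (P < 0.05)"
    return "accept H0, reject HA: groups are not significantly different (P > 0.05)"


T_TAB = 2.306  # two-tailed, P = 0.05, 8 df

print(decide(1.20, T_TAB))  # hypothetical tcalc below the tabulated value
print(decide(3.10, T_TAB))  # hypothetical tcalc above the tabulated value
```

The absolute value is taken so that the order in which the two means are subtracted does not change the outcome of a 2-tailed test.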
Sample problem
We have a sample of 12 men and 14 women who have been subjected to the
horror of a statistics exam and want to test whether there is any evidence
that the difference of 2.71% in the mean percent mark awarded (59.08 for
men as against 61.79 for women) is statistically significant. Exam marks,
especially those between 80% and 20%, can be expected to be normally
distributed, so transformation is not necessary.

Men (x series)   Women (y series)
     75               80
     72               76
     68               74
     66               70
     65               68
     65               68
     60               66
     58               65
     50               62
     48               58
     42               56
     40               43
                      40
                      39
Sample problem
• H0: x̄ = ȳ   the mean score of the men is not significantly different
from the mean score of the women
• HA: x̄ ≠ ȳ   the mean score of the men is significantly different
from the mean score of the women
Note that we are using x̄ and ȳ here to represent the mean scores of the
men and women because these were explicitly identified in the given data
(i.e., the x and y series).
Sample problem
                                      Men (x series)   Women (y series)
n                                          12               14
Total, Σx                                 709              865
Mean                                     59.08            61.79
Correction factor, (Σx)² / n          41,890.08        53,444.64
Added squares, Σx²                       43,371           55,695
Sum of squares, Σx² − (Σx)² / n         1480.92          2250.36
Pooled sum of squares                           3731.27
Pooled df: (12 − 1) + (14 − 1) = 11 + 13          24
Pooled variance: (SS/df)                         155.47
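The table can be reproduced row by row from the raw marks. The sketch below follows the correction-factor route used in the table (sum of squares = added squares minus correction factor). The marks are those listed for the sample problem; the function and key names are illustrative.

```python
men = [75, 72, 68, 66, 65, 65, 60, 58, 50, 48, 42, 40]
women = [80, 76, 74, 70, 68, 68, 66, 65, 62, 58, 56, 43, 40, 39]


def summarise(marks):
    """Build one column of the worked table from a list of marks."""
    n = len(marks)
    total = sum(marks)
    correction_factor = total ** 2 / n          # (Σx)² / n
    added_squares = sum(m ** 2 for m in marks)  # Σx²
    return {
        "n": n,
        "total": total,
        "mean": total / n,
        "correction_factor": correction_factor,
        "added_squares": added_squares,
        "sum_of_squares": added_squares - correction_factor,
    }


x, y = summarise(men), summarise(women)
pooled_ss = x["sum_of_squares"] + y["sum_of_squares"]
pooled_df = (x["n"] - 1) + (y["n"] - 1)
pooled_variance = pooled_ss / pooled_df

for key in x:
    print(f"{key:18s} {x[key]:12.2f} {y[key]:12.2f}")
print(f"pooled SS = {pooled_ss:.2f}, df = {pooled_df}, "
      f"pooled variance = {pooled_variance:.2f}")
```

Subtracting the correction factor from the added squares gives the same result as summing squared deviations from the mean, but needs only running totals.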
Sample problem
Pooled standard deviation, s = √155.47 = 12.47
s.e.d.m. = s × √(1/nx + 1/ny) = 12.47 × √(1/12 + 1/14) = 4.91
t = (difference between x̄ and ȳ) / s.e.d.m. = (61.79 − 59.08) / 4.91 = 0.55
Note: for comparison with the tabulated value, the sign of the difference
is ignored; use the absolute value of this difference.
tcalc = 0.55
ttab,0.05,24 = 2.06
0.55 < 2.06
tcalc < ttab
Thus, accept H0: there is no significant difference in the mean test
scores of the 2 groups.
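The last steps of the sample problem need only the summary figures, as the sketch below shows. It starts from the rounded values quoted in the worked example (means, pooled variance, sample sizes and tabulated t); the variable names are illustrative.

```python
import math

# Summary figures from the worked sample problem.
n_men, n_women = 12, 14
mean_men, mean_women = 59.08, 61.79
pooled_variance = 155.47
t_tab = 2.06  # two-tailed, P = 0.05, 24 df

s = math.sqrt(pooled_variance)                    # pooled standard deviation
sedm = s * math.sqrt(1 / n_men + 1 / n_women)     # s.e. of difference between means
t_calc = abs(mean_women - mean_men) / sedm

print(f"s = {s:.2f}")            # s = 12.47
print(f"s.e.d.m. = {sedm:.2f}")  # s.e.d.m. = 4.91
print(f"tcalc = {t_calc:.2f}")   # tcalc = 0.55
print("accept H0" if t_calc < t_tab else "reject H0")  # accept H0
```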
Practice
Determine if there are significant differences in
the mean height and weight of the 5 female
and 5 male students. Use ttab = t0.05(2),8 =2.306
Height in meters Weight in kg
females males females males
1.62 1.68 52 64
1.50 1.54 45 46
1.55 1.72 47 68
1.49 1.63 47 45
1.65 1.70 47 49
                                        A        B
n
Total, Σx
Mean, x̄
Correction factor, (Σx)² / n
Added squares, Σx²
Sum of squares, Σx² − (Σx)² / n
Pooled sum of squares
Pooled df: dfA + dfB
Pooled variance: SS/df
Pooled sd
s.e.d.m. = sd × √(1/nA + 1/nB)
t = (difference of 2 means) / s.e.d.m.
[Figure: “Data Set”, bar chart of height and weight for males and females]
H0: height of males = height of females
HA: height of males ≠ height of females
H0: weight of males = weight of females
HA: weight of males ≠ weight of females
Height
                                     Females     Males
n                                       5          5
Total, Σx                              7.81       8.27
Mean, x̄                               1.562      1.654
Correction factor, (Σx)² / n          12.199     13.679
Added squares, Σx²                    12.219     13.699
Sum of squares, Σx² − (Σx)² / n        0.020      0.021
Pooled sum of squares                       0.041
Pooled df                                     8
Pooled variance: SS/df                      0.005
Height (females vs males)
Pooled standard deviation, s = 0.0716
s.e.d.m. = s × √(1/n + 1/n) = 0.0716 × √(1/5 + 1/5) = 0.0453
t = (difference between the means) / s.e.d.m.
  = |1.562 − 1.654| ÷ 0.0453 = 2.03
tcalc = 2.03
ttab = t0.05(2),8 = 2.306
ttab > tcalc, P > 0.05 (actual P = 0.0766; this can be calculated using
Excel or JASP)
Thus, accept H0: mean height of females = mean height of males,
or the 0.092 m difference between males and females is not statistically
significant.
Weight
                                     Females     Males
n                                       5          5
Total, Σx                              238        272
Mean, x̄                               47.6       54.4
Correction factor, (Σx)² / n         11328.8    14796.8
Added squares, Σx²                    11356      15262
Sum of squares, Σx² − (Σx)² / n        27.2       465.2
Pooled sum of squares                       492.4
Pooled df                                     8
Pooled variance: SS/df                      61.55
Weight (females vs males)
Pooled standard deviation, s = 7.85
s.e.d.m. = s × √(1/n + 1/n) = 7.85 × √(1/5 + 1/5) = 4.96
t = (difference between the means) / s.e.d.m.
  = |47.6 − 54.4| ÷ 4.96 = 1.37
tcalc = 1.37
ttab = t0.05(2),8 = 2.306
ttab > tcalc, P > 0.05 (actual P = 0.208)
Thus, accept H0: mean weight of females = mean weight of males,
or the 6.8 kg difference between males and females is not statistically
significant.
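Both practice answers can be cross-checked with a few lines of Python. The sketch below is one possible check, not the method prescribed by the slides: instead of the correction-factor route it rebuilds each sum of squares from `statistics.variance` (sample variance × (n − 1)), which gives the same pooled variance. The data are the heights and weights from the practice table; the function and variable names are illustrative.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t, df) for two independent samples using a pooled variance."""
    n_a, n_b = len(a), len(b)
    df = (n_a - 1) + (n_b - 1)
    # Sample variance × (n − 1) gives back each group's sum of squares.
    pooled_variance = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    sedm = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return abs(mean(a) - mean(b)) / sedm, df


height_f = [1.62, 1.50, 1.55, 1.49, 1.65]
height_m = [1.68, 1.54, 1.72, 1.63, 1.70]
weight_f = [52, 45, 47, 47, 47]
weight_m = [64, 46, 68, 45, 49]
T_TAB = 2.306  # t0.05(2),8

results = {
    "height": pooled_t(height_f, height_m),
    "weight": pooled_t(weight_f, weight_m),
}
for label, (t, df) in results.items():
    verdict = "significant" if t > T_TAB else "not significant"
    print(f"{label}: t = {t:.2f}, df = {df}, {verdict}")
```

Because the sketch carries full precision through every step, its t values may differ in the second decimal place from hand calculations that round intermediate results.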