Statistical Packages
FOUR
Dr Mahi
Statistical Inference about the Difference between Two Population Means (𝝁𝟏 − 𝝁𝟐)
Independent Samples: Parametric or Non-parametric
Dependent Samples: Parametric or Non-parametric
Statistical Inference about (𝝁𝟏 − 𝝁𝟐)
To compare two different (independent) groups, we take the difference between their means.
Today we introduce a new parameter, the difference between two population means (𝝁𝟏 − 𝝁𝟐). It is an unknown parameter, and we make inference about it using the difference between the two sample means (x̄₁ − x̄₂).
Group 1 (𝝁𝟏)                Group 2 (𝝁𝟐)
Men's salaries              Women's salaries
Production in firm 1        Production in firm 2
Scores of girls             Scores of boys
Time taken by men           Time taken by women
Profit in firm 1            Profit in firm 2
Loss in factory A           Loss in factory B
Performance of males        Performance of females
Income of women             Income of men
Efficiency of bank A        Efficiency of bank B
Testing Hypotheses for Two Samples
Testing hypotheses about (𝝁𝟏 − 𝝁𝟐)
A test about the difference between two population means takes one of the following three forms:
Left-tailed test (H1: less than)
Two-tailed test (H1: not equal)
Right-tailed test (H1: greater than)
Introduction to Statistical Hypotheses Testing:
Parametric Tests (normal population)
One Sample
Two Samples: Independent or Dependent
K Samples: Independent or Dependent
Non-Parametric Tests (population not normal)
One Sample
Two Samples: Independent or Dependent
K Samples: Independent or Dependent
Parametric tests:
1) Require specific assumptions about the distribution of the data.
2) Used with quantitative data only.
3) Have greater power.
Non-parametric tests:
1) Do not require assumptions about the distribution of the data.
2) Used with quantitative and qualitative data.
3) Have less power.
One Sample Tests
Parametric tests (population is normal), testing for the mean:
1-sample Z: variance of the population is known
1-sample t: variance of the population is unknown
Non-parametric tests (population is not normal), testing for the median:
1-sample Wilcoxon: distribution is symmetric
1-sample sign: distribution is not symmetric
Two Sample Tests
Independent samples:
Parametric test: 2-independent-samples t-test (testing for the mean)
Non-parametric test: Mann-Whitney test (testing for the median)
Dependent samples:
Parametric test: Paired t-test (testing for the mean)
Non-parametric test: Wilcoxon test (testing for the median)
Testing hypotheses for 2-independent samples (Parametric case)
2-independent-samples t-test
1) This test is used to test the equality of the means 𝝁𝟏 and 𝝁𝟐 of two independent populations.
2) The null and alternative hypotheses take one of the following forms:
Left-tailed test: Ho: 𝝁𝟏 ≥ 𝝁𝟐, H1: 𝝁𝟏 < 𝝁𝟐
Two-tailed test: Ho: 𝝁𝟏 = 𝝁𝟐, H1: 𝝁𝟏 ≠ 𝝁𝟐
Right-tailed test: Ho: 𝝁𝟏 ≤ 𝝁𝟐, H1: 𝝁𝟏 > 𝝁𝟐
Assumptions of the 2-independent samples t-test:
1) The data of each sample come from a normal population.
2) The data values must be independent and random.
3) The variances of the two independent groups are equal.
Both are carried out together.
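For reference, the statistic behind this test under the equal-variance assumption can be written as follows. The symbols are notation introduced here, not taken from the slides: x̄ᵢ, sᵢ and nᵢ are the sample mean, standard deviation and size of group i, and s_p is the pooled standard deviation that appears in the Minitab output as "Pooled StDev".

```latex
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
\mathrm{DF} = n_1 + n_2 - 2
```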
Example 1:
Test the hypothesis that the mean heights of men and women are the same, that is, that there is no difference between the mean height of men and the mean height of women, at α = 5%.
Solution:
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)   OR   Ho: 𝝁𝟏 − 𝝁𝟐 = 0 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐           OR   H1: 𝝁𝟏 − 𝝁𝟐 ≠ 0
Let's first check the assumptions of the test:
1) The data of each sample come from a normal population:
Stat > Basic Statistics > Normality Test
[Normality plots: Sample 1, Sample 2]
The Anderson-Darling normality test gives P-values > 0.05 for both samples, so we accept Ho of normality: both samples come from normal populations.
2) The data values must be independent and random:
Runs Test
Stat > Nonparametrics > Runs Test
Columns: C1 Male, C2 Female
Runs Test: Male; Female
Runs test for Male
Runs above and below K = 69.9
The observed number of runs = 6
The expected number of runs = 5.8
6 observations above K; 4 below
* N is small, so the following approximation may be invalid.
P-value = 0.888
Runs test for Female
Runs above and below K = 63.6667
The observed number of runs = 7
The expected number of runs = 6.83333
7 observations above K; 5 below
* N is small, so the following approximation may be invalid.
P-value = 0.917
The P-values of both tests lead us to accept the null hypothesis Ho of randomness and independence of the data observations.
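The numbers in the runs test output can be checked by hand. The sketch below is a minimal illustration, not Minitab's own code: it assumes that a run is a maximal stretch of consecutive observations on the same side of K, that values equal to K count as "below", and that the P-value comes from the usual normal approximation. The function name `runs_test` and the height values are hypothetical; they were chosen only so that 6 observations lie above K = 69.9 and 4 below, in 6 runs.

```python
import math

def runs_test(values, k):
    """Runs test about the reference value k, using the normal approximation."""
    # Classify each observation as above k (True) or not above k (False).
    above = [v > k for v in values]
    n_above = sum(above)
    n_below = len(above) - n_above
    n = n_above + n_below
    # A new run starts every time the classification changes.
    observed = 1 + sum(a != b for a, b in zip(above, above[1:]))
    # Mean and variance of the number of runs under randomness.
    expected = 2 * n_above * n_below / n + 1
    variance = (2 * n_above * n_below * (2 * n_above * n_below - n)
                / (n ** 2 * (n - 1)))
    z = (observed - expected) / math.sqrt(variance)
    # Two-sided P-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return observed, expected, p_value

# Hypothetical heights (not the lecture data): 6 values above 69.9, 4 below.
heights = [72, 71, 68, 73, 70, 66, 67, 74, 72, 65]
observed, expected, p_value = runs_test(heights, k=69.9)
print(observed, f"{expected:.1f}", f"{p_value:.3f}")
```

With the same counts as the Male sample (6 above, 4 below, 6 runs), the sketch gives an expected number of runs of 5.8 and a P-value of about 0.888, in line with the output above.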
3) The variances of the two independent groups are equal:
Stat > Basic Statistics > 2 Variances
Columns: C1 Male, C2 Female
Graphs > Histogram
The histograms of both Male and Female are nearly symmetric, so we accept normality of both groups.
Graphs > Individual value plot
[Plot annotations: mean of male, mean of female]
The individual value plot for Male and Female shows that the dispersion of the two groups is nearly the same, so we accept that the variances are equal for male and female. The mean of male, however, is greater than the mean of female.
Graphs > Summary Plot
1) Both the Bonett and Levene tests have large P-values, so we accept Ho of equal variances of the two groups.
2) The confidence intervals of both tests include 1, so we again accept Ho of equal variances of the two groups.
3) The confidence intervals for the standard deviation of male and female show no significant difference between the dispersion of the two groups.
4) The box plots of the two groups are nearly symmetric, but the median of male is greater than the median of female.
Test and CI for Two Variances: Male; Female
Method
Null hypothesis          σ(Male) / σ(Female) = 1
Alternative hypothesis   σ(Male) / σ(Female) ≠ 1
Significance level       α = 0.05
Statistics
Variable    N   StDev   Variance   95% CI for StDevs
Male       10   3.247     10.544   (2.068; 6.342)
Female     12   3.367     11.333   (2.144; 6.317)
Ratio of standard deviations = 0.965
Ratio of variances = 0.930
95% Confidence Intervals
Method   CI for StDev Ratio   CI for Variance Ratio
Bonett   (0.454; 2.294)       (0.206; 5.261)
Levene   (0.358; 2.564)       (0.128; 6.572)
Test
Method   DF1   DF2   Statistic   P-Value
Bonett     -     -           -     0.915
Levene     1    20        0.04     0.837
1) The 95% confidence intervals above are for the ratio of standard deviations and the ratio of variances from the Bonett and Levene tests. Both intervals contain 1, so we accept Ho of equal variances.
2) The P-values of both tests are greater than 0.05, so we accept Ho of equal variances.
After checking the 3 assumptions, let's apply the main test:
Stat > Basic Statistics > 2-Sample t
Columns: C1 Male, C2 Female
Graphs
Both graphs show that the means of the two groups (male and female) are different, so we expect to reject Ho of equal means.
Options
Two-Sample T-Test and CI: Male; Female
Two-sample T for Male vs Female
          N    Mean   StDev   SE Mean
Male     10   69.90    3.25       1.0
Female   12   63.67    3.37      0.97
Difference = μ (Male) - μ (Female)
Estimate for difference: 6.23 (= 69.90 − 63.67)
95% CI for difference: (3.27; 9.19)
T-Test of difference = 0 (vs ≠): T-Value = 4.39 P-Value = 0.000 DF = 20 (= n1 + n2 − 2)
Both use Pooled StDev = 3.3134
1) The P-value of the test is 0.000, which is less than 0.05, so we reject Ho of equal mean height for male and female and accept H1. The claim is rejected.
2) The confidence interval (3.27; 9.19) does not contain zero (the hypothesized value), which also supports rejecting Ho.
3) Both limits are positive (+, +), that is, 𝝁𝟏 − 𝝁𝟐 > 0, so the test should be a right-tailed test rather than a two-tailed test.
Example 2:
A healthcare consultant wants to compare the patient satisfaction ratings of two hospitals. The consultant collects ratings from 20 patients for each hospital and obtains the following information: the mean rating for hospital A is 80.30 with a standard deviation of 8.18, while the mean rating for hospital B is 59.3 with a standard deviation of 12.4.
Test the hypothesis of a significant difference between the two hospitals' mean ratings (the two means are different/not equal) at the 5% significance level.
Solution:
Ho: 𝝁𝟏 = 𝝁𝟐           OR   Ho: 𝝁𝟏 − 𝝁𝟐 = 0
H1: 𝝁𝟏 ≠ 𝝁𝟐 (claim)   OR   H1: 𝝁𝟏 − 𝝁𝟐 ≠ 0 (claim)
Here we do not have the raw data, only summarized data, so we cannot check the assumptions; we simply apply the test:
Stat > Basic Statistics > 2-Sample t
Sample     Sample size   Mean   Standard deviation
Sample 1            20   80.3                 8.18
Sample 2            20   59.3                 12.4
Options > OK
Two-Sample T-Test and CI
Sample    N    Mean   StDev   SE Mean
1        20   80.30    8.18       1.8
2        20    59.3    12.4       2.8
Difference = μ (1) - μ (2)
Estimate for difference: 21.00
95% CI for difference: (14.28; 27.72)
T-Test of difference = 0 (vs ≠): T-Value = 6.32 P-Value = 0.000 DF = 38
Both use Pooled StDev = 10.5041
1) Because the P-value is 0.000, which is less than the significance level 0.05, the consultant rejects the null hypothesis of equal mean ratings and concludes that the ratings for the two hospitals differ.
2) The confidence interval (14.28; 27.72) does not contain zero (the hypothesized value), which also supports rejecting Ho.
3) Both limits are positive (+, +), that is, 𝝁𝟏 − 𝝁𝟐 > 0, so the test should be a right-tailed test rather than a two-tailed test.
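Because only summary statistics are available here, the Minitab figures can be reproduced with a few lines of arithmetic. The following is a minimal sketch under the equal-variance (pooled) assumption; the function name `pooled_t_from_summary` is hypothetical, and the critical value 2.024 (two-sided 5% point of the t distribution with 38 degrees of freedom) is taken from a t table rather than from the slides.

```python
import math

def pooled_t_from_summary(n1, mean1, sd1, n2, mean2, sd2, t_crit):
    """Pooled two-sample t statistic and confidence interval from summary data."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    # Standard error of the difference between the two sample means.
    se = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    t_value = diff / se  # hypothesized difference is 0
    ci = (diff - t_crit * se, diff + t_crit * se)
    return df, pooled_sd, t_value, ci

# Summary data of Example 2; t_crit = 2.024 is an assumed table value for DF = 38.
df, pooled_sd, t_value, ci = pooled_t_from_summary(
    20, 80.30, 8.18, 20, 59.3, 12.4, t_crit=2.024
)
print(df, round(pooled_sd, 4), round(t_value, 2), [round(x, 2) for x in ci])
```

The sketch returns DF = 38, a pooled standard deviation of about 10.5041, a T-value of about 6.32 and an interval of about (14.28, 27.72), matching the output above. It does not compute the P-value, which needs the t distribution itself.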
Example 3:
An analyst wants to compare the mean number of operations performed over the last 20 months at two hospitals: Alesraa Hospital and Albatraa Hospital.
The analyst claims that there is no significant difference between the mean number of operations at the two hospitals (the two means are equal/the same). Test the analyst's claim at the 5% significance level.
Solution:
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐
First, check the assumptions of the 2-independent samples t-test:
1) The data of each sample come from a normal population:
Stat > Basic Statistics > Normality Test
The Anderson-Darling normality test gives P-values > 0.05 for both samples, so we accept Ho of normality: both samples come from normal populations.
2) The data values must be independent and random:
Runs Test
Stat > Nonparametrics > Runs Test
3) The variances of the two independent groups are equal:
Stat > Basic Statistics > 2 Variances
Columns: C1 Alesraa, C2 Albatraa
After checking the 3 assumptions, let's apply the main test:
Stat > Basic Statistics > 2-Sample t
Columns: C1 Alesraa, C2 Albatraa
Both graphs show that the means of the two groups (Alesraa and Albatraa) are the same, so we expect to accept Ho of equal means.
Applying the main test:
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐
1) Because the P-value of the 2-sample t-test is 1, which is greater than the significance level 0.05, the null hypothesis of equal means is accepted. We conclude that the mean number of operations is the same at the two hospitals, so the analyst's claim is accepted.
2) The confidence interval (-613; 613) contains zero (the hypothesized value), which also supports accepting Ho that the mean number of operations is the same at the two hospitals.
3) The limits have opposite signs (−, +), which is consistent with a two-tailed test.
Example 4:
[Problem statement and Minitab output: figure not recovered]
Solution:
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐
1) Because the P-value of the 2-sample t-test is 0.085, which is greater than the significance level 0.05, the null hypothesis of equal means is accepted. We conclude that the two means are equal, so the analyst's claim is accepted.
2) The confidence interval (-6.56; 0.56) contains zero (the hypothesized value), which also supports accepting Ho.
3) The limits have opposite signs (−, +), which is consistent with a two-tailed test and with accepting Ho of equal means.
Example 5:
The following data on male mass and female mass are given in one of two forms: each sample in a separate column, or stacked in one column with a subscripts column indicating gender (male = 1, female = 2).
Perform a 2-sample t-test with all its steps to verify the claim that the male and female masses are the same at the 5% significance level.
Solution:
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐
1) Because the P-value of the 2-sample t-test is 0.000, which is smaller than the significance level 0.05, the null hypothesis of equal means is rejected. We conclude that the two means are not equal, so the claim is rejected.
2) The confidence interval (-5.122; -2.373) does not contain zero (the hypothesized value), which also supports rejecting Ho.
3) Both limits are negative (−, −), that is, 𝝁𝟏 − 𝝁𝟐 < 0, which indicates a left-tailed test and supports rejecting Ho of equal means.
Testing hypotheses for 2-independent samples (Non-parametric case)
Mann-Whitney Test
1) This test is used to test the equality of the medians η₁ and η₂ of two independent populations.
2) The null and alternative hypotheses take one of the following forms:
Ho: η₁ = η₂   OR   Ho: η₁ − η₂ = 0
H1: η₁ ≠ η₂   OR   H1: η₁ − η₂ ≠ 0
Assumptions of the Mann-Whitney test:
1) The data values must be independent and random:
Runs Test
Stat > Nonparametrics > Runs Test
2) The dependent variable is ordinal or numeric (continuous).
Let's apply the main test:
Stat > Nonparametrics > Mann-Whitney
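To see what the Mann-Whitney test measures, the sketch below counts, for every pair of observations, which sample gives the larger value. It is a minimal illustration under stated assumptions: the function names and the data are hypothetical, ties are counted as one half, and no P-value is computed. The rank-sum statistic, usually labelled W in software output, is the sum of the ranks of the first sample in the combined data.

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2): pairs won by sample1 and by sample2 (ties count 0.5)."""
    u1 = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5
    u2 = len(sample1) * len(sample2) - u1
    return u1, u2

def rank_sum_w(sample1, sample2):
    """Sum of the ranks of sample1 in the combined sample."""
    n1 = len(sample1)
    u1, _ = mann_whitney_u(sample1, sample2)
    # Rank-sum and pair-count forms are linked by W = U1 + n1 (n1 + 1) / 2.
    return u1 + n1 * (n1 + 1) / 2

# Hypothetical durability values in months (not the lecture data).
brand_a = [10.0, 11.5, 9.8, 12.1]
brand_b = [12.5, 13.0, 11.9, 14.2, 13.3]
u1, u2 = mann_whitney_u(brand_a, brand_b)
w = rank_sum_w(brand_a, brand_b)
print(u1, u2, w)
```

Here brand A beats brand B in only 1 of the 20 pairs, so U1 is small compared with U2; values this unbalanced are what lead the test to reject equal medians. The two samples need not have the same size.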
Example 6:
A state highway department uses two brands of paint for painting stripes on roads. A highway official wants to know whether the durability of the two brands of paint is different. For each paint, the official records the number of months the paint persists on the highway.
The official performs a Mann-Whitney test to determine whether the median number of months that the paint persists differs between the two brands at the 5% significance level.
Solution:
Ho: η₁ = η₂
H1: η₁ ≠ η₂ (claim)
Observe that the sample sizes are different.
Stat > Nonparametrics > Mann-Whitney
1) Because the P-value of the Mann-Whitney test is 0.002, which is smaller than the significance level 0.05, the null hypothesis of equal medians is rejected. We conclude that the two medians are different; the official rejects Ho.
2) The confidence interval (-3; -0.9) does not contain zero (the hypothesized value), which also supports rejecting Ho.
3) Both limits are negative (−, −), which indicates that the test is not a two-tailed test and leads us to reject Ho of equal medians and accept H1, the claim.
Example 7:
[Problem statement and Minitab output: figure not recovered]
Solution:
Ho: η₁ = η₂ (claim)
H1: η₁ ≠ η₂
1) Because the P-value of the Mann-Whitney test is 0.1489, which is greater than the significance level 0.05, the null hypothesis of equal medians is accepted, and we conclude that the two medians are equal.
2) The confidence interval (-7.501; 2.001) contains zero (the hypothesized value), which also supports accepting Ho.
3) The limits have opposite signs (−, +), which is consistent with a two-tailed test and with accepting Ho of equal medians.
Homework Four
Exercise 1
The following data are for a group of men and women who worked out at a gym three times a week for a year. Their trainer measured their body fat. The data for men and women are given below:
Men:
13.3 6 20 8 14 18 19 25 16 24 15 10 15
Women:
22 16 21.7 21 30 26 12 23.2 28 23
The trainer states that the mean body fat for men is greater than the mean body fat for women. Do you agree with the trainer's claim at the 5% significance level? (Use a 2-sample t-test.)
Ho: 𝝁𝟏 ≤ 𝝁𝟐           OR   Ho: 𝝁𝟏 − 𝝁𝟐 ≤ 0
H1: 𝝁𝟏 > 𝝁𝟐 (claim)   OR   H1: 𝝁𝟏 − 𝝁𝟐 > 0 (claim)
Exercise 2
The following data are for a group of men and women who worked out at a gym three times a week for a year. Their trainer measured their body fat. The summarized data are given below:
Men:
Sample size = 10, mean = 22.29, standard deviation = 5.32
Women:
Sample size = 13, mean = 14.95, standard deviation = 6.84
The trainer states that the mean body fat for men is equal to the mean body fat for women. Do you agree with the trainer's claim at the 5% significance level? (Use a 2-sample t-test.)
Ho: 𝝁𝟏 = 𝝁𝟐 (claim)   OR   Ho: 𝝁𝟏 − 𝝁𝟐 = 0 (claim)
H1: 𝝁𝟏 ≠ 𝝁𝟐           OR   H1: 𝝁𝟏 − 𝝁𝟐 ≠ 0
Exercise 3
The analyst analyzes this data using the Mann-Whitney test, and the results were as follows:
Comment on these results in detail and write the null and alternative hypotheses. Is this test a parametric or a non-parametric test, and what are the assumptions of this test? Justify your answer.
Exercise 4
The analyst analyzes this data using a two-sample test, and the results were as follows:
What is the name of the test?
Comment on these results in detail and write the null and alternative hypotheses. Is this test a parametric or a non-parametric test, and what are the assumptions of this test? Justify your answer.
Exercise 5
Perform a 2-sample t-test to confirm whether the claim of this study is correct or not at the 5% significance level, given the following information for the two samples:
Exercise 6
Two independent sampling stations, station 1 and station 2, were chosen for a study on population. For 12 monthly samples collected at station 1, the species diversity index had a mean value of 3.11 and a standard deviation of 0.771, while for another 10 monthly samples collected at station 2 it had a mean value of 2.04 and a standard deviation of 0.448. Assume that the two populations are approximately normally distributed with equal variances. Construct a 90% confidence interval for the difference between the population means for the two locations and test whether the two means are equal. (Use a 2-sample t-test.)
Thank You