CHAPTER 3: STATISTICAL INFERENCES (WEEK 10)
3.1 Introduction
3.2 Sampling distribution                 WEEK 8
3.3 Inference for single population
3.4 Inference for two populations       WEEK 9
3.5 Nonparametric statistics  WEEK 10
NONPARAMETRIC STATISTICS (NPS)
                              INTRODUCTION TO NPS
NPS are the alternative to parametric tests when the data come from a non-normal distribution, such as a
@ flat distribution
@ peaked distribution
@ skewed distribution
(refer to Figure 3.9, page 77)
NPS are also referred to as distribution-free tests.
NPS use the median in making inferences about a population (parametric tests use the mean).
NPS are also used for data that are non-numerical or that require a ranking approach, such as
1. Nominal data
2. Ordinal data
3. Interval-scale or ratio-scale data for which no assumption is made about the probability
   distribution of the population from which the sample is selected.
Normally distributed data ➔ Parametric test (CI, HT, t-test, F-test, ANOVA)
Non-normally distributed data ➔ Nonparametric test.
NPS are:
❑ Sign Test
❑ Mann-Whitney Test
❑ Kruskal-Wallis Test
❑ Wilcoxon Signed Rank Test
❑ Spearman’s Rank Correlation Test
                               Sign Test
➔ The simplest nonparametric test.
➔ Tests the value of the median of a single sample.
➔ Converts each data value into a + or - sign.
Sign Test Procedures
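1. State the hypotheses (determine the type of test: two-tailed test,
    left-tailed or right-tailed test).
    H0: median = M0
    H1: median ≠ M0 (two-tailed), median < M0 (left-tailed) or median > M0 (right-tailed)
Note: M0 is the hypothesized median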
2. Determine the signs
Put a + sign for each value greater than the hypothesized median
Put a - sign for each value less than the hypothesized median
Put a 0 for each value equal to the hypothesized median
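3. Compute the test statistic, k
   Two-tailed test: k = the smaller of the number of + signs and the number of - signs
   Right-tailed test: k = number of - signs
   Left-tailed test: k = number of + signs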
4. Find the critical value from the sign test table.
   Information needed:
   1. significance level, α
   2. sample size, n
     where n = total number of + and - signs (values marked 0 are not counted)
5. Make a decision
   Reject H0 if test statistic, k ≤ critical value
6. Conclusion
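Steps 2 and 3 of the procedure can be sketched in Python as below. This is only an illustration: the function name sign_test_statistic and the tail labels "two", "right" and "left" are hypothetical, and the left-tailed rule (k = number of + signs) is assumed by symmetry with the right-tailed rule used in the worked examples.

```python
def sign_test_statistic(data, hypothesized_median, tail):
    """Return (k, n, plus, minus) for the sign test.

    tail is "two", "right" or "left" (labels assumed for this sketch).
    """
    # Step 2: count + signs (above the median) and - signs (below it).
    plus = sum(1 for x in data if x > hypothesized_median)
    minus = sum(1 for x in data if x < hypothesized_median)
    # Values equal to the hypothesized median get a 0 and are not counted.
    n = plus + minus

    # Step 3: choose the test statistic according to the type of test.
    if tail == "two":
        k = min(plus, minus)
    elif tail == "right":
        k = minus
    elif tail == "left":
        k = plus  # assumed by symmetry with the right-tailed case
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return k, n, plus, minus
```

The returned k and n are the two quantities needed to carry out steps 4 and 5 with the sign test table.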
Example 1:
The following data constitute a random sample of 15 measurements of the octane rating of a
certain kind of gasoline:
                       99.0 102.3 99.8 100.5 99.7 96.2 99.1 102.5
                          103.3 97.4 100.4 98.9 98.3 98.0 101.6
Test the null hypothesis median = 98 against the alternative hypothesis median > 98 at the 0.05
level of significance.
Solution:
1. H0: median = 98
     H1: median > 98 (Claim) ➔ Right-tailed test
2. Value:   99.0   102.3    99.8   100.5    99.7    96.2    99.1   102.5
   Sign:      +       +       +       +       +       -       +       +
   Value:  103.3    97.4   100.4    98.9    98.3    98.0   101.6
   Sign:      +       -       +       +       +       0       +
   Number of + signs = 12
   Number of - signs = 2
   Total number of + and - signs = 14
3. This is a right-tailed test, so the test statistic is k = number of - signs = 2
4. significance level, α = 0.05
   sample size, n = 14
   ➔ critical value = 3
5. Since test statistic, k = 2 ≤ critical value = 3, we reject H0.
6. There is enough evidence to support the claim that the median octane rating of
   this kind of gasoline is greater than 98.
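The critical value used in this example can be cross-checked without the table. The sketch below assumes the sign test table lists the largest c for which the cumulative probability P(X ≤ c) of a Binomial(n, 0.5) variable does not exceed α (or α/2 for a two-tailed test); that is an assumption about how the table was built, and the function name sign_test_critical_value is hypothetical.

```python
from math import comb

def sign_test_critical_value(n, alpha, two_tailed=False):
    """Largest c with P(X <= c) <= alpha (alpha/2 if two-tailed), X ~ Binomial(n, 0.5).

    Returns None when even c = 0 is too probable.
    """
    tail_alpha = alpha / 2 if two_tailed else alpha
    cumulative = 0.0
    critical = None
    for c in range(n + 1):
        cumulative += comb(n, c) / 2 ** n
        if cumulative <= tail_alpha:
            critical = c
        else:
            break
    return critical

octane = [99.0, 102.3, 99.8, 100.5, 99.7, 96.2, 99.1, 102.5,
          103.3, 97.4, 100.4, 98.9, 98.3, 98.0, 101.6]
plus = sum(1 for x in octane if x > 98)
minus = sum(1 for x in octane if x < 98)
n = plus + minus          # the value equal to 98 is not counted
k = minus                 # right-tailed test: k = number of - signs
cv = sign_test_critical_value(n, 0.05)
reject_h0 = k <= cv
print(plus, minus, n, k, cv, reject_h0)
```

Under that assumption the computed critical value agrees with the table value used in step 4.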
Example 2:
The owner of a souvenir shop hypothesizes that the median number of items sold per day is 40.
A random sample of 20 days yields the following data for the number of items sold each day.
At α = 0.05, test the owner's hypothesis.
                                                18   43   40   16   22
                                                30   29   32   37   36
                                                39   34   39   45   28
                                                36   40   34   39   52
Solution:
1.   H0: median = 40      (Claim)
     H1: median ≠ 40     ➔ two-tailed test
2.   Signs (in the same order as the data):
                                                -    +    0    -    -
                                                -    -    -    -    -
                                                -    -    -    +    -
                                                -    0    -    -    +
     Number of + signs = 3
     Number of - signs = 15
     Total number of + and - signs = 18
3. This is a two-tailed test, so the test statistic is k = the smaller of the numbers of + and - signs = 3
4. significance level, α = 0.05
   sample size, n = 18
   ➔ critical value = 4
5. Since test statistic, k = 3 ≤ critical value = 4, we reject H0.
6. There is enough evidence to reject the claim that the median number of items sold per
   day is 40.
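As a complement to the table lookup, the same decision can be reached with an exact p-value. The sketch below is not part of the procedure above: it assumes that, under H0, the number of + signs follows a Binomial(n, 0.5) distribution, and it doubles the one-tailed probability for the two-tailed test. The variable names are illustrative.

```python
from math import comb

sales = [18, 43, 40, 16, 22,
         30, 29, 32, 37, 36,
         39, 34, 39, 45, 28,
         36, 40, 34, 39, 52]
hypothesized_median = 40

plus = sum(1 for x in sales if x > hypothesized_median)
minus = sum(1 for x in sales if x < hypothesized_median)
n = plus + minus          # days with exactly 40 items are not counted
k = min(plus, minus)      # two-tailed test statistic

# P(X <= k) for X ~ Binomial(n, 0.5), then doubled for a two-tailed test.
p_one_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
p_two_tail = min(1.0, 2 * p_one_tail)

alpha = 0.05
reject_h0 = p_two_tail <= alpha
print(k, n, p_two_tail, reject_h0)
```

The two-tailed p-value is roughly 0.0075, well below α = 0.05, which is consistent with rejecting H0 in step 5.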
                             Mann-Whitney Test (MWT)
• Determines whether a difference exists between the medians of two populations with
  non-normal distributions.
• Sometimes called the Wilcoxon rank sum test.
• The equivalent parametric test to the MWT is the t-test for two independent samples.
Mann-Whitney Test Procedures
1. State the hypotheses (determine the type of test)
2. Rank the data values
    1. Combine all the data from the two samples (treat them as one sample).
    2. Rank the data from smallest to largest (starting from 1; tied values each
      receive the average of the ranks they occupy).
3. Calculate the test statistic
    1. Label sample 1 and sample 2. Let:
        - sample 1 ➔ the sample with the smaller size of the two independent samples.
        - if both samples have the same size, either one can be regarded as sample 1.
    2. List the ranks of data values ranked in step 2 for both sample 1 and sample 2.
    3. Calculate the sum of ranks for both samples.
Test statistic, T is based on:
       T1 = ƩR1
       T1* = n1(n1 + n2 + 1) - T1
where ƩR1 = sum of ranks from sample 1
       n1 = sample size of sample 1
       n2 = sample size of sample 2
For a two-tailed test, T = min(T1 , T1*).
                      Summary of test statistic for Mann-Whitney test
4. Find and calculate critical value, Tcv
                                            Tcv = [TL , TU ]
    where TL ➔ find from table of MWT for given α, n1 and n2.
           TU = n1(n1+ n2+1) - TL
5. Make a decision based on the critical interval Tcv = [TL , TU ].
   For a two-tailed test, reject H0 when T ∉ [TL , TU ].
                               Note:
                                       ∉ means not included.
6. Conclusion
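Steps 2 to 4 of the procedure can be sketched in Python as follows. The function names are hypothetical, T1* = n1(n1 + n2 + 1) - T1 is taken from the worked example in these notes, and TL still has to be read from the MWT table, so it is passed in as an argument.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = ((i + 1) + (j + 1)) / 2   # mean of ranks i+1 .. j+1
        for position in range(i, j + 1):
            ranks[order[position]] = average
        i = j + 1
    return ranks

def rank_sum_statistics(sample1, sample2):
    """Return (T1, T1*); sample1 should be the smaller of the two samples."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # combined ranking
    t1 = sum(ranks[:n1])                                  # sum of ranks of sample 1
    t1_star = n1 * (n1 + n2 + 1) - t1
    return t1, t1_star

def upper_critical_value(t_lower, n1, n2):
    """TU = n1(n1 + n2 + 1) - TL, with TL taken from the MWT table."""
    return n1 * (n1 + n2 + 1) - t_lower
```

For example, average_ranks([10, 20, 20, 30]) gives the two tied values the rank 2.5 each, the average of ranks 2 and 3.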
Example:
Data below show the marks obtained by electrical engineering students in an
examination:
                                Gender                  Marks
                                 Male                     60
                                 Male                     62
                                 Male                     78
                                 Male                     83
                                Female                    40
                                Female                    65
                                Female                    70
                                Female                    88
                                Female                    92
Can we conclude that the achievements of male and female students are identical at significance
level α = 0.1?
      Solution:
      1. H0: There is no difference in the achievements of male and female students
         H1: There is a difference in the achievements of male and female students ➔ two-tailed test
      2.              Gender                        Marks                         Rank
                       Male                           60                            2
                       Male                           62                            3
                       Male                           78                            6
                       Male                           83                            7
                      Female                          40                            1
                      Female                          65                            4
                      Female                          70                            5
                      Female                          88                            8
                      Female                          92                            9
         The male sample has n = 4 and the female sample has n = 5. Thus n1 is the
         sample from male students and n2 is the sample from female students.
    3. We have n1 = 4, n2 = 5; T1 = ƩR1 = 2 + 3 + 6 + 7 = 18; T1* = 4(4 + 5 + 1) - 18 = 22
       Since this is a two-tailed test, the test statistic is T = minimum (T1 , T1*), thus
       T = min (T1 ,T1* ) = min (18, 22 ) = 18
4. Critical value, Tcv = [TL , TU ]
    α = 0.1 ➔ α/2 = 0.05 ; n1 = 4, n2 = 5, thus from table   TL = 13
   Calculate TU = n1(n1+ n2+1) - TL = 4(4+5+1) – 13 = 27
   ➔ Tcv = [13,27]
5. Make a decision
   For a two-tailed test, we reject H0 when T ∉ [TL , TU ].
   Since T = 18 ∈ Tcv = [13,27], we fail to reject H0.
6. Conclusion
   There is not enough evidence to support the claim that there is a difference in the
   achievement between male and female students.
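The arithmetic of this example can be replayed with a short Python sketch. It relies on the fact that the marks contain no ties, so each rank is simply the position in the sorted combined list, and it takes TL = 13 from the table as quoted in step 4. The variable names are illustrative.

```python
male = [60, 62, 78, 83]          # sample 1 (smaller sample)
female = [40, 65, 70, 88, 92]    # sample 2

# Combine both samples and rank from smallest to largest.
combined = sorted(male + female)
rank = {mark: position for position, mark in enumerate(combined, start=1)}
if len(rank) != len(combined):
    raise ValueError("tied marks need average ranks; this shortcut does not apply")

n1, n2 = len(male), len(female)
t1 = sum(rank[mark] for mark in male)      # sum of ranks of sample 1
t1_star = n1 * (n1 + n2 + 1) - t1
t = min(t1, t1_star)                       # two-tailed test statistic

t_lower = 13                               # TL from the MWT table (alpha/2 = 0.05)
t_upper = n1 * (n1 + n2 + 1) - t_lower     # TU
reject_h0 = not (t_lower <= t <= t_upper)  # reject only if T lies outside [TL, TU]
print(t1, t1_star, t, (t_lower, t_upper), reject_h0)
```

The sketch reproduces T1 = 18, T1* = 22 and Tcv = [13, 27], and since T lies inside the interval it does not reject H0.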