WST121 PRACTICAL TEST – 2006
TOTAL: 30          MARKS:
    INITIALS AND SURNAME
    STUDENT NUMBER
    SIGNATURE
INSTRUCTIONS:
•      Use ONLY EXCEL for Section A and ONLY SAS for Section B.
•      Answer all questions for Section A by using the data in the spreadsheet.
•      Give all answers correct to at least 3 DECIMAL PLACES.
•      NO CALCULATORS OR ANY OTHER CALCULATION DEVICES may be used.
•      Although the test is open-book, NO SHARING OF NOTES IS ALLOWED.
•      All other exam regulations apply.
SECTION A
NUMBER of EXCEL WORKBOOK
A common belief among observers in the business world is that taller men earn more money than shorter men.
In Spreadsheet No. 1 the height in cm (x) and the salary in R1 000 (y) of 100 MBA graduates are given.
Fit a straight line to the data and write down the following values:
            a                                          b
           R²
                                                                                                         (3)
SECTION B
Question 1 to Question 3 are written questions. Do only Question 4 in SAS.
Question 1
National Paper Company must purchase a new machine for producing cardboard boxes. The company must
choose between two machines, machine 1 and machine 2. Since the machines produce boxes of equal
quality, the company will choose the machine that produces the most boxes in a one-hour period. It is known
that there are substantial differences in the abilities of the company’s machine operators. Therefore National
Paper has decided to compare the machines using a paired difference experiment. The results for eight
randomly selected machine operators, producing boxes for an hour on one machine and then for an hour on
the other machine are as follows.
Operator        1      2      3     4    5        6    7   8
Machine 1      53      60    58    48    46      54   62   49
Machine 2      50      55    56    44    45      50   57   47
SAS Output 1
                                       The MEANS Procedure
                                     Analysis Variable : DIFF
                     N            Mean       Std Error    t Value    Pr > |t|
                     --------------------------------------------------------
                     8       3.2500000       0.5261043       6.18      0.0005
                     --------------------------------------------------------
SAS Output 2
                                      The TTEST Procedure
                                            Statistics
                                   Lower CL            Upper CL   Lower CL                Upper CL
 Variable                      N       Mean     Mean       Mean    Std Dev      Std Dev    Std Dev   Std Err
 MACHINE1-MACHINE2             8      2.006     3.25      4.494     0.9839        1.488     3.0286    0.5261
                                                      T-Tests
                             Variable            DF   t Value     Pr > |t|
                             MACHINE1-MACHINE2    7      6.18       0.0005
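The summary figures in SAS Output 1 and SAS Output 2 can be cross-checked by hand from the operator table. The following is a minimal sketch in Python (not SAS, and not a completion of the SAS program asked for below); the function name `paired_summary` is a hypothetical helper introduced only for this illustration. It computes the mean, standard deviation, standard error and t statistic of the differences MACHINE1 - MACHINE2.

```python
import math
import statistics

# Boxes produced in one hour by each of the eight operators (table above).
machine1 = [53, 60, 58, 48, 46, 54, 62, 49]
machine2 = [50, 55, 56, 44, 45, 50, 57, 47]


def paired_summary(x, y):
    """Return (mean, std dev, std error, t statistic, df) of the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)   # sample standard deviation (divisor n - 1)
    se = sd / math.sqrt(n)         # standard error of the mean difference
    return mean, sd, se, mean / se, n - 1


mean, sd, se, t, df = paired_summary(machine1, machine2)
print(f"mean={mean:.4f}  sd={sd:.4f}  se={se:.4f}  t={t:.2f}  df={df}")
```

The printed values should agree with the Mean, Std Dev, Std Err, t Value and DF columns of the outputs above; the p-value and confidence limits need the t distribution and are not computed here.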
(a) Complete the following SAS program so that it will produce EITHER SAS output 1 OR SAS output 2
    (α=0.05) given above.
   SAS program
   DATA Q1;
   INPUT MACHINE1 MACHINE2 @@;
   CARDS;
   53 50 60 55 58 56 48 44
   46 45 54 50 62 57 49 47
   ;
                                                                                                               (3)
(b) Use SAS Output 2 given above to write down a 95% confidence interval for µD, the
    population mean difference.
                                                                                                              (1)
(c) Test the claim that machine 1 produces significantly more boxes than machine 2. Use a significance
    level of 5%. In your answer, only give the hypotheses, the p-value for the test and the rejection criteria.
                                                                                                               (3)
Question 2
A study was undertaken in 2005 regarding the efficiency of recruitment companies. For each of 4 agencies,
the number of positions filled per month is given in the table below.
                                          Month
   Agency   Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
        1    50    55    51    60    59    57    55    53    61    62    50    49
        2    25    26    21    30    31    29    26    27    28    28    30    20
        3    25    26    27    28    30    31    35    25    26    27    31    25
        4    15    17    10     9    12    15    19     7    20     6     7    10
SAS Output
                                        The GLM Procedure
                                     Class Level Information
                                 Class         Levels    Values
                                 agency             4    1 2 3 4
                                   Number of observations    48
                                       The GLM Procedure
Dependent Variable: f
                                               Sum of
   Source                         DF          Squares      Mean Square   F Value    Pr > F
   Model                           3      11541.75000       3847.25000    231.20    <.0001
   Error                          44        732.16667         16.64015
   Corrected Total                47      12273.91667
                      R-Square      Coeff Var       Root MSE         f Mean
                      0.940348       13.35629       4.079234       30.54167
   Source                         DF        Type I SS      Mean Square   F Value    Pr > F
   agency                          3      11541.75000       3847.25000    231.20    <.0001
   Source                         DF      Type III SS      Mean Square   F Value    Pr > F
   agency                          3      11541.75000       3847.25000    231.20    <.0001
                                        The GLM Procedure
                                       Scheffe's Test for f
                  NOTE: This test controls the Type I experimentwise error rate.
                                Alpha                              0.05
                                Error Degrees of Freedom             44
                                Error Mean Square              16.64015
                                Critical Value of F             2.81647
                                Minimum Significant Difference   4.8408
                    Means with the same letter are not significantly different.
                  Scheffe Grouping          Mean      N    agency
                                 A        55.167     12    1
                                 B        28.000     12    3
                                 B
                                 B        26.750     12    2
                                 C        12.250     12    4
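The ANOVA table and the Scheffe minimum significant difference in the output above can be reproduced from the monthly counts. The following is a minimal sketch in Python (not SAS), assuming equal group sizes; `one_way_anova` is a hypothetical helper name, and the critical F value is copied from the output rather than computed.

```python
import math

# Positions filled per month (Jan to Dec) for each agency, from the table above.
filled = {
    1: [50, 55, 51, 60, 59, 57, 55, 53, 61, 62, 50, 49],
    2: [25, 26, 21, 30, 31, 29, 26, 27, 28, 28, 30, 20],
    3: [25, 26, 27, 28, 30, 31, 35, 25, 26, 27, 31, 25],
    4: [15, 17, 10, 9, 12, 15, 19, 7, 20, 6, 7, 10],
}


def one_way_anova(groups):
    """Return (SS model, SS error, df model, df error, F) for a one-way layout."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_model = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_model = len(groups) - 1
    df_error = len(values) - len(groups)
    f_value = (ss_model / df_model) / (ss_error / df_error)
    return ss_model, ss_error, df_model, df_error, f_value


ss_model, ss_error, df_model, df_error, f_value = one_way_anova(list(filled.values()))
mse = ss_error / df_error

# Scheffe minimum significant difference for equal group sizes n:
# sqrt((k - 1) * F_crit * MSE * 2 / n), with F_crit taken from the output above.
F_CRIT = 2.81647
n = 12
msd = math.sqrt(df_model * F_CRIT * mse * 2 / n)

print(f"SS model={ss_model:.5f}  SS error={ss_error:.5f}  F={f_value:.2f}  MSD={msd:.4f}")
```

Two agency means belong to different Scheffe groups when their difference exceeds the minimum significant difference.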
(a) Complete the following SAS program so that it will produce (only) the output given above.
SAS program
data q2;
input agency f @@;
cards;
1 50 1 55 1 51 1 60 1 59 1 57 1 55 1 53 1 61 1 62 1 50 1 49
2 25 2 26 2 21 2 30 2 31 2 29 2 26 2 27 2 28 2 28 2 30 2 20
3 25 3 26 3 27 3 28 3 30 3 31 3 35 3 25 3 26 3 27 3 31 3 25
4 15 4 17 4 10 4  9 4 12 4 15 4 19 4  7 4 20 4  6 4  7 4 10
;
                                                                                                           (3)
(b) Test at a 5% level of significance whether the average number of positions filled per month differs
    significantly for the four agencies. Only give the hypotheses, p-value and the decision that can be made.
                                                                                                           (3)
(c) When doing a Scheffe pairwise comparison at a 5% level of significance, answer the following:
   The average number of positions filled per month by agency 2 differs significantly from the average
   number of positions filled per month by agency(s):
                                                                                                         (1)
(d) Assume that the assumption of normality does not hold. Complete the following SAS program to test at
    a 5% level of significance whether the median number of positions filled per month by the four agencies
    differs significantly.
SAS program
data q2;
input agency f    @@;
cards;
1 50 1 55 1 51 1 60 1 59 1 57 1 55 1 53 1 61 1 62 1 50 1 49
2 25 2 26 2 21 2 30 2 31 2 29 2 26 2 27 2 28 2 28 2 30 2 20
3 25 3 26 3 27 3 28 3 30 3 31 3 35 3 25 3 26 3 27 3 31 3 25
4 15 4 17 4 10 4  9 4 12 4 15 4 19 4  7 4 20 4  6 4  7 4 10
;
                                                                                                         (3)
Question 3
The quality control manager of an automobile parts factory would like to know whether the quality of parts
produced (defective=def or acceptable=acc) depends on the day of the workweek. Random samples of 100
parts produced on each day of the week were selected and for each part it was determined whether it is
defective or acceptable. The results of this test can be seen in the SAS output below.
(a) Complete the following SAS Program to produce the output given below.
   SAS program
   data q3;
   input day$ result$ freq @@;
   cards;
   Mon def 12 Tue def 7 Wed def 7 Thu def 10 Fri def 14
   Mon acc 88 Tue acc 93 Wed acc 93 Thu acc 90 Fri acc 86
  ;
                                                                                                        (4)
SAS Output
                                    The FREQ Procedure
                                  Table of result by day
              result     day
              Frequency|
              Percent  |
              Row Pct  |
              Col Pct  |Fri     |Mon     |Thu     |Tue     |Wed     |  Total
              ---------+--------+--------+--------+--------+--------+
              acc      |     86 |     88 |     90 |     93 |     93 |    450
                       |  17.20 |  17.60 |  18.00 |  18.60 |  18.60 |  90.00
                       |  19.11 |  19.56 |  20.00 |  20.67 |  20.67 |
                       |  86.00 |  88.00 |  90.00 |  93.00 |  93.00 |
              ---------+--------+--------+--------+--------+--------+
              def      |     14 |     12 |     10 |      7 |      7 |     50
                       |   2.80 |   2.40 |   2.00 |   1.40 |   1.40 |  10.00
                       |  28.00 |  24.00 |  20.00 |  14.00 |  14.00 |
                       |  14.00 |  12.00 |  10.00 |   7.00 |   7.00 |
              ---------+--------+--------+--------+--------+--------+
              Total         100      100      100      100      100      500
                          20.00    20.00    20.00    20.00    20.00   100.00
                          Statistics for Table of result by day
                  Statistic                     DF       Value      Prob
                  ------------------------------------------------------
                  Chi-Square                     4      4.2222    0.3768
                  Likelihood Ratio Chi-Square    4      4.2331    0.3754
                  Mantel-Haenszel Chi-Square     1      4.0031    0.0454
                  Phi Coefficient                       0.0919
                  Contingency Coefficient               0.0915
                  Cramer's V                            0.0919
                                    Sample Size = 500
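The Chi-Square line of the output above can be checked directly from the observed frequencies. The following is a minimal sketch in Python (not SAS); `pearson_chi_square` and `chi2_upper_tail_even_df` are hypothetical helper names, and the tail probability uses a closed form that holds only when the degrees of freedom are even, as they are here (DF = 4).

```python
import math

# Observed frequencies per day (Mon, Tue, Wed, Thu, Fri), from the data lines above.
defective = [12, 7, 7, 10, 14]
acceptable = [88, 93, 93, 90, 86]


def pearson_chi_square(table):
    """Return (statistic, df) of Pearson's chi-square test for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable with an EVEN number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form is valid only for positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


stat, df = pearson_chi_square([acceptable, defective])
p_value = chi2_upper_tail_even_df(stat, df)
print(f"chi-square={stat:.4f}  df={df}  p={p_value:.4f}")
```

The expected count in every cell is the row total times the column total divided by 500, which gives 90 acceptable and 10 defective parts per day.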
(b) Use α=0.10 and test whether there is evidence of a significant relationship between quality of the items
    produced and the day of the week.
    In your answer, only give the hypotheses, the p-value for the test and the decision made.
                                                                                                           (3)
Question 4
Use SAS to determine the answer to the following questions. Only write down your answer in the space
provided.
(a) Suppose Y ~ χ²(10). P(Y ≥ 8) =
(b) Suppose T ~ t(10). The 80th percentile of T =
(c) Suppose X ~ n(1500, 160²). P(1400 ≤ X ≤ 1700) =
                                                                                                           (3)
                                                                                                          [30]