Applied Statistics
Desktop Reference
Phil Crewson
Preface
The approach of the Guide is to connect classically taught statistics with statistical
software output. The Guide presents commonly used hypothesis tests and formulas,
provides an example of how to do the calculations, and relates the hypothesis test to
output generated by statistical software.
For many, the most valuable portions of the Guide will be the Glossary and the anno-
tated presentation of output from statistical software.
The software output examples provided in the text and the Glossary should be famil-
iar to anyone who has used one of the popular statistical software packages. Most of
the software output presented in the Guide is created with AcaStat statistical soft-
ware. AcaStat is an easy-to-use statistical software system that analyzes raw data
(electronic data not aggregated).
Some examples also use StatCalc. StatCalc analyzes summary data (multiple re-
cords reduced to counts, means, proportions). StatCalc is especially useful for verify-
ing hand calculations and creating z-scores and confidence intervals for means and
proportions.
Glossary
The Glossary contains over 200 definitions of statistical terms and concepts and in-
cludes a quick ‘Applied Stat’ review of 26 common statistical techniques with anno-
tated statistical output examples.
Copyright
Copyright © 2014 by Philip E. Crewson. All rights reserved under International and
Pan-American Copyright Conventions. By payment of the required fees, you have
been granted the non-exclusive, non-transferable right to access and read the text of
this e-book on-screen. No part of this text may be reproduced, transmitted, down-
loaded, decompiled, reverse engineered, or stored in or introduced into any informa-
tion storage and retrieval system, in any form or by any means, whether electronic or
mechanical, now known or hereinafter invented, without the express written permis-
sion of the author.
1
Research Design and Reporting
1.1 Research Design
The scientific model guides research design to ensure findings are quantifiable (meas-
ured in some fashion), verifiable (others can substantiate our findings), replicable (oth-
ers can repeat the study), and defensible (provides results that are credible to
others--this does not mean others have to agree with the results). For some the sci-
entific model may seem too complex to follow, but it is often used in everyday life
and should be evident in any research report, paper, or published manuscript. The
corollaries of common sense and proper paper format with the scientific model are
given below.
Overview of first four elements of the Scientific Model
The following discussion provides a very brief introduction to the first four elements
of the scientific model. Although these four elements are not the primary focus of
this text, they are the foundations for systematic research and data analysis and
should be carefully considered before, during, and after statistical analyses. The re-
maining elements that pertain to testing hypotheses and evaluating results are the pri-
mary focus for the remainder of the Desktop Reference Guide.
1. Research Question
The research question should be a clear statement about what the researcher intends
to investigate. It should be specified before research is conducted and openly stated
in reporting the results. One conventional approach is to put the research question in
writing in the introduction of a report starting with the phrase "The purpose of this
study is . . . .“ This approach forces the researcher to clearly identify the research ob-
jective, allow others the to benchmark how well the study design answers the pri-
mary goal of the research, and identify the key abstract concepts involved in the re-
search
Abstract concepts: The starting point for measurement. Abstract concepts are best
understood as general ideas in linguistic form that help us describe reality. They
range from the simple (hot, long, heavy, fast) to the more difficult (responsive, effec-
tive, fair). Abstract concepts should be evident in the research question and/or pur-
pose statement. An example of a research question is given below along with how it
might be reflected in a purpose statement.
Research question: Is the quality of public sector and private sector employees differ-
ent?
Purpose statement: The purpose of this study is to determine if the quality of public
and private sector employees is different.
2. Develop Theory
A theory is one or more propositions that suggest why an event occurs. It is the re-
searcher’s explanation for how the world works. These theoretical propositions pro-
vide a framework for analysis that is predisposed to determine “What is reality,” not
“What should reality be.” A sound theory must have logical integrity and should con-
sider current knowledge on the event being explored. In other words, a thorough lit-
erature review before the design and implementation of a study is the hallmark of
good research.
3. Identify Variables
Variables are measurable abstract concepts that help describe relationships. This
measuring of abstract concepts is referred to as operationalization. In the previous
research question "Is the quality of public sector and private sector employees differ-
ent?” the key abstract concepts are employee quality and employment sector. To
measure "quality" a measurable representation of employee quality will need to be
found. Possible quality variables may be performance on a standardized intelligence
test, attendance at work, performance evaluations, etc. The variable for employment
sector seems to be fairly self-evident, but a good researcher must be very clear on
how they define and measure the concepts of public and private sector employment
(i.e., how is non-profit employment handled?).
Variables represent empirical indicators of an abstract concept. This does not mean
there is complete congruence between our measure and the abstract concept. Vari-
ables used in statistical analysis are unlikely to measure all aspects of an abstract
concept. Put simply, all variables have an error component.
One would expect that as more valid indicators of an abstract concept
are used, the effect of the error term would decline.
The appropriate design and use of multiple indicators is beyond this text, but this ap-
proach is evident in many tools commonly used today such as psychological assess-
ments, employee satisfaction, socio-economic status, and work motivation meas-
ures.
Levels of Data
There are four levels of variables. It is essential to be able to identify the levels of
data used in a research design. The level of data used in a research design directly im-
pacts which statistical methods are most appropriate for testing research hypothe-
ses. The four levels of data are listed below in order of their precision.
Nominal: Classifies objects by type or characteristic (sex, race, religion).
Properties:
• categories are mutually exclusive (an object or characteristic can only be contained
  in one category of a variable)
• no logical order
Ordinal: Classifies objects by type or kind but also has some logical order (military
rank, letter grades).
Properties:
• categories are mutually exclusive
• logical order exists
Interval: Classifies by type, logical order, but also requires that differences between
levels of a category are equal (temperature in degrees Celsius, distance in kilometers,
age in years).
Properties:
• categories are mutually exclusive
• logical order exists
• differences between each level are equal
• no true zero starting point
Ratio: Classification is similar to interval but ratio data has a true zero starting point
(total absence of the characteristic). For most purposes, the analyses presented in
this text assume interval/ratio are the same.
Association
Statistical techniques are used to explore connections between independent and de-
pendent variables. This connection between or among variables is often referred to
as association. Association is also known as covariation and can be defined as
measurable changes in one variable that occur concurrently with changes in another
variable. A positive association is represented by change in the same direction (in-
come rises with education level). Negative association is represented by concurrent
change in opposite directions (hours spent exercising and % body fat). Spurious as-
sociations are associations between two variables that can be better explained by a
third variable. As an example, if after taking cold medication for seven days the symp-
toms disappear, one might assume the medication cured the illness. Most of us, how-
ever, would probably agree that the change experienced in cold symptoms is proba-
bly better explained by the passage of time rather than pharmacological effect (i.e.,
the cold would resolve itself in seven days regardless of whether the medication
was taken or not).
Causation
Establishing causation requires more than association. Key criteria include:
1. Association: Do the independent and dependent variables covary?
2. Precedence: Does the independent variable vary before the dependent variable?
3. Plausibility: Is the expected outcome consistent with theory/prior knowledge?
The accuracy of our measurements is affected by their reliability and validity. Reliabil-
ity is the extent to which the repeated use of a measure obtains the same values
when no change has occurred (can be evaluated empirically). Validity is the extent to
which the operationalized variable accurately represents the abstract concept it in-
tends to measure (cannot be confirmed empirically--it will always be in question). Un-
reliability negatively impacts all studies and is very much a part of any methodology/
operationalization of concepts. As an example, reliability can depend on who per-
forms the measurement and when, where, and how data are collected (from whom,
written, verbal, time of day, season, current public events).
Study Design
Experimental: The experimental design uses a control group and applies treatment
to a second group. It provides the strongest evidence of causation through extensive
controls and random assignment to remove other differences between groups. Using
the evaluation of a job training program as an example, one could carefully select and
randomly assign two groups of unemployed welfare recipients. One group would be
provided job training and the other would not. If the two groups are similar in all
other relevant characteristics, one could assume any differences in employment be-
tween the groups one year later were caused by the job training.
Whenever an experimental design is used, both internal and external validity can be-
come very important factors.
Internal validity: The extent to which an accurate and unbiased association between
the IV and DV was obtained in the study group.
External validity: The extent to which the association between the IV and DV is accu-
rate and unbiased in populations outside the study group.
Quasi-experimental: The quasi-experimental design does not have the level of con-
trols employed in an experimental design (most social science research). Although in-
ternal validity is lower than can be obtained with an experimental design, external va-
lidity is generally better (assuming a good random sample) and a well designed study
should allow for the use of statistical controls to compensate for extraneous variables.
Panel study: Repeated cross-sectional studies over time with the same participants
(cohort study).
1.2 Reporting Results
The following provides an outline for presenting the results of systematic research.
Regardless of the specific format used, the key points to consider in reporting the re-
sults of research are:
2. Completely explain study assumptions and method of inquiry so that others may du-
   plicate the study.
4.1. Accurate
4.3. Titled and documented so that they could stand on their own without a report.
Paper Format
The following outline may be a useful guide in formatting a research report. It incorpo-
rates elements of the research design and steps for hypothesis testing (in italics). It
may be helpful to refer back to this outline after reviewing the sections on hypothesis
testing. The report should clearly identify each section so the reader does not get lost
or confused.
I. Introduction: The introduction section should include a definition of the central re-
search question (purpose of the paper), why it is of interest, and a review of the litera-
ture related to the subject and how it relates to the hypotheses.
Elements:
• Purpose statement
II. Method: Describe the source of the data, sample characteristics, statistical tech-
nique(s) applied, level of significance necessary to reject the null hypotheses, and
how the abstract concepts were operationalized.
Elements:
• Assumptions
1. Random sampling
2. Independent subgroups
• Hypotheses
• Rejection criteria
        1. Indicate alpha (amount of error the researcher is willing to accept)
III. Results: Describe the results of the data analysis and the implications for the hy-
potheses. The results section should include such elements as univariate, bivariate,
and multivariate analyses; significance test statistics; the probability of error; and the
related status of the hypothesis tests (were they rejected?).
Elements:
• Decide results
IV. Conclusion: Summarize and evaluate the results. Put in plain words what the re-
search found concerning the central research question. Identify alternative variables,
discuss implications for further study, and identify the weaknesses of the research and
findings.
Elements:
VI. Appendix: Additional tables and information not included in the body of the re-
port.
Table Format
Survey of Customers
                                                        Customer Characteristics
                                           Sex                              Age
                        Total (1)   Female       Male        18-29 30-39 40-49 50-59      60+
     Sample Size ->      3686       1466         2081        1024   1197    769    493     98
    % of Total (1) ->    100%        41%         59%         29%     33%    22%    14%    3%
   Margin of Error ->    1.6%       2.6%         2.2%        3.1%   2.8%    3.5%   4.4%   9.9%
Staff Professional?
                  Yes      89.4%     89.6%       89.5%       89.5% 89.4% 89.5% 89.6% 90.0%
                   No      10.6%     10.4%       10.5%       10.5% 10.6% 10.5% 10.4% 10.0%
Treated Fairly?
                  Yes      83.1%     82.8%       83.8%       82.6% 83.0% 83.5% 84.3% 87.9%
                  No       16.9%     17.2%       16.2%       17.4% 17.0% 16.5% 15.7% 12.1%
Served Quickly?
                  Yes      71.7%     69.3%       74.1%       73.3% 69.3% 72.1% 74.2% 78.0%
                   No      28.3%     30.7%       25.9%       26.7% 30.7% 27.9% 25.8% 22.0%
(1) Total number of cases is based on responses to the question concerning being served
quickly. Non-responses to survey items cause the sample sizes to vary.
Critical Review Checklist
The following checklist may help when critically evaluating research prepared by others.
Is there anything else that should be added to the list?
4. Hypotheses evident
14. Other?
15. _________________________________________
16. _________________________________________
2
Data File Basics
There are two general sources of data: primary and secondary. Primary data is
designed and collected specifically to answer one or more research questions. Secon-
dary data was collected by others for purposes that may or may not match the re-
search questions being explored. An example of a primary data source would be an
employee survey the researcher designs and implements for an organization to evalu-
ate job satisfaction. An example of a secondary data source would be the use of cen-
sus data or other publicly available data such as the General Social Survey to explore
research questions that may not have been specifically envisioned when the data de-
sign was created.
The best way to envision a data file is to use the analogy of the common spreadsheet
software. In spreadsheets, there are columns and rows. For many data files, a
spreadsheet provides an easy means of organizing and entering data. In a rectangu-
lar data file, columns represent variables and rows represent observations. Variables
are commonly formatted as either numerical or string. A numerical variable is used
whenever the researcher plans to manipulate the data mathematically. Examples would
be age, income, temperature, and job satisfaction rating. A string variable is used
whenever the researcher plans to treat the data entries like words. Examples would be
names, cities, case identifiers, and race. Most variables that could be considered
string are coded as a numeric variable. As an example, data for the variable "sex"
might be coded 1 for male and 2 for female instead of using a string variable that
would require letters (e.g., "Male" and "Female"). This has two benefits. First, numeri-
cal entries are easier and quicker to enter. Second, manipulation of numerical data
with statistical software is generally much easier than manipulating string variables.
There are several common data file formats. As a general rule, there are data files
that are considered system files and data files that are text files. System files are cre-
ated by and for specific software applications. Examples would be Microsoft Excel,
SAS, STATA, and SPSS. Text files contain data in ASCII format and are almost univer-
sal in that they can be imported into most statistical programs.
Text files
Fixed: In fixed formatted data, each variable will have a specific column location.
When importing fixed formatted data into a statistical package, these column locations
must be identified for the software program. The following is an example:
++++|++++|++++|++++|++++|
10123HoustonTX12Female1
Reading from left to right, the variables and their location in the data file are:
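As a rough illustration (a Python sketch, not part of the Guide), the record above can
be split by column position. The boundaries used here are inferred from the comma-
separated version of this same record shown in the next section:

record = "10123HoustonTX12Female1"

# Column boundaries inferred from the CSV version of this record
# (101, 23, Houston, TX, 12, Female, 1) -- an assumption, not the
# Guide's own layout specification.
fields = [(0, 3), (3, 5), (5, 12), (12, 14), (14, 16), (16, 22), (22, 23)]
values = [record[a:b] for a, b in fields]
print(values)   # ['101', '23', 'Houston', 'TX', '12', 'Female', '1']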
Free: In free formatted data, either a space or special value separates (delimits) each
variable. Common delimiters are tabs or commas. When importing free formatted
data into a statistical package, the software assumes that when a delimiter value is
found that it is the end of the previous variable and the next character will begin an-
other variable.
The following is an example of a comma separated value data file (known as a CSV file):
101,23,Houston,TX,12,Female,1,
When reading either fixed or free data, statistical software counts the number of vari-
ables and assumes when it reaches the last variable that the next variable will be the
beginning of another observation (case).
Data dictionary
A data dictionary defines the variables contained in a data file (and sometimes the for-
mat of the data file). To properly define and document a data file, the researcher
should record information on the coding of each variable. The following is an exam-
ple of a description of variable coding for a three-question survey.
     Q1: How satisfied are you with the current pay system?
          a. Very satisfied
          b. Somewhat satisfied
          c. Satisfied
          d. Somewhat dissatisfied
          e. Very dissatisfied
     Q2: How many years have you been employed here?
     Q3: Please fill in your department name:
To properly define and document a data file, record the following information:
Variable name:       the name used for the variable in the data file
Variable type:       numerical or string
Variable location:   if a fixed data set, the column location (and possibly row) in the
                     data file; if free format, the variable's position in each record
Variable label:      a descriptive label for the variable
Value labels:        labels for each coded value
For the employee survey, the data dictionary for a comma separated data file will look
like the following:
Variable name:	        Q1
Variable type:	        Numerical (categorical)
Variable location:	    Second variable
Variable label:	       Satisfaction with pay system
Value labels:	         1=Very satisfied
	                      2=Somewhat satisfied
	                      3=Satisfied
	                      4=Somewhat dissatisfied
	                      5=Very dissatisfied
Variable name:        Q2
Variable type:        Numerical
Variable location:    Third variable
Variable label:       Years employed
Value labels:         None
Variable name:        Q3
Variable type:        String
Variable location:    Fourth variable
Variable label:       Department name
Value labels:         None
If the data for four completed surveys are entered into a spreadsheet, it will look like
the following:
            A              B            C             D
    1      CASEID              Q1            Q2           Q3
    2        1001               2             5        Admin
    3        1002               5            10          MIS
    4        1003               1            23    Accounting
    5        1004               4             3         Legal
The data will look like the following if saved in a text file as comma separated (note:
the total number of commas for each record equals the total number of variables):
Some statistical software will not read the first row in the data file as variable names.
In that case, the first row must be deleted before saving to avoid computation errors.
The data file without variable names will look like the following:
1001, 2, 5, Admin,
1002, 5, 10, MIS,
1003, 1, 23, Accounting,
1004, 4, 3, Legal,
This is a very efficient way to store and analyze large amounts of data, but it should
be apparent at this point that a data dictionary would be necessary to understand what
the aggregated data represent. Documenting data files is very important. Although
this was a simple example, many research data files have hundreds of variables and
thousands of observations that will be uninterpretable without careful documentation.
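As a minimal Python/pandas sketch (not from the Guide; the file name survey.csv is
an assumption), the comma-separated file above could be read and documented in
code using the data dictionary:

import pandas as pd

# survey.csv is assumed to hold the four records shown above
df = pd.read_csv("survey.csv", header=None, skipinitialspace=True)
df = df.dropna(axis=1, how="all")   # drop the empty column left by trailing commas
df.columns = ["caseid", "q1", "q2", "q3"]

# Attach the value labels for Q1 (satisfaction with pay system)
labels = {1: "Very satisfied", 2: "Somewhat satisfied", 3: "Satisfied",
          4: "Somewhat dissatisfied", 5: "Very dissatisfied"}
df["q1_label"] = df["q1"].map(labels)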
3
Descriptive Statistics
This section presents techniques for using counts and proportions to create univari-
ate descriptive statistics. Use the techniques presented in this section if the analysis
variable meets the following characteristics.
                 Nominal Data                                      Ordinal Data
   Classifies objects by type or characteristic    Classifies objects by type or kind but also
                                                   has some logical order
   ★ Categories are mutually exclusive
   ★ No logical order                              ★ Categories are mutually exclusive
                                                   ★ Logical order exists
                                                   ★ Scaled according to amount of a
                                                     particular characteristic they possess
                                           Examples
               sex, race, ethnicity                        income (low, moderate, high)
              government agency                           job satisfaction (5 point scale)
                     religion                               course letter grade (A to F)
          neighborhood (urban/suburb)
               yes/no responses
Univariate descriptive statistics are used to describe one variable through the crea-
tion of a summary statistic. Common univariate statistics for nominal/ordinal data in-
clude ratios, rates, proportions, and percentages.
Ratios
A ratio measures the extent to which one category of a variable outnumbers another
category in the same variable.
Ratio Example: The community has 1370 Protestants and 930 Catholics.
Rates
A rate measures the number of actual occurrences out of all possible occurrences
per an established period of time.
Rate Example: Create the death rate per 1,000 given there are 100 deaths a year in a
population of 10,000.
Rate = (# per year / total population) × rate metric
Rate = (100 / 10,000) × 1,000 = 10 deaths per 1,000
Proportions
Proportion = occurrence frequency / total possible
Percent = (occurrence frequency / total possible) × 100
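As a minimal sketch in plain Python, the statistics above can be computed directly;
the numbers reuse the ratio and rate examples from this section:

# Ratio: 1370 Protestants to 930 Catholics
ratio = 1370 / 930                      # about 1.47 Protestants per Catholic

# Rate: 100 deaths per year in a population of 10,000, per 1,000 people
rate = (100 / 10_000) * 1_000           # 10.0 deaths per 1,000

# Proportion and percent: 2 occurrences out of 50 possible
proportion = 2 / 50                     # 0.04
percent = proportion * 100              # 4.0
print(ratio, rate, proportion, percent)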
Software Output: An example using survey data.
Frequencies
Interpretation
[b] 4.00% have 9 years of education (2/50 = .04; .04 * 100 = 4.00%)
[c] 6.00% have 9 years or less education (3/50 = .06; .06*100 = 6.00%)
3.2 Univariate Statistics – Mean, Median, and Mode
This section presents techniques for using interval/ratio data to create descriptive sta-
tistics. For the methods reviewed in this section, the analysis variable must be inter-
val or ratio level data.
                 Interval Data                                    Ratio Data
 Classifies objects by type or kind but also     Same as interval, but also has a true zero
 has some logical order in consistent            starting point (total absence of the
 intervals                                       characteristic)
 ★ Categories are mutually exclusive             ★ Same as interval but also has a zero
 ★ Logical order exists                            starting point
 ★ Scaled according to amount of a particular
   characteristic they possess
 ★ Differences between each level are equal
 ★ No zero starting point
                                   Examples of Interval/Ratio
         Miles to work                      Income                                 IQ
           SAT score                  Education in years                      Temperature
              Age                           Weight                               Height
The following reviews common univariate descriptive statistics used to represent cen-
tral tendency and univariate statistics that describe variation in the data.
Mode: The mode is the most frequently occurring score. A distribution of scores can
be unimodal (one score occurred most frequently), bimodal (two scores tied for most
frequently occurring), or multimodal. In the table that follows the mode is 32. If there
were also two scores with the value of 60, the distribution would be bimodal (32 and
60).
Median: The median is the point on a rank ordered list of scores below which 50% of
the scores fall. It is especially useful as a measure of central tendency when there
are very extreme scores in the distribution, such as would be the case if we had
someone in the age distribution provided below who was 120. If the number of
scores is odd, the median is the score located in the position represented by (n+1)/2.
In the table below the median is located in the 4th position (7+1)/2 and would be re-
ported as a median of 42. If the number of scores is even, the median is the aver-
age of the two middle scores.
As an example, if the last score (65) is dropped from the following table, the median
would be represented by the average of the 3rd (6/2) and 4th score, or (32+42)/2=37.
Always remember to order the scores from low to high before determining the me-
dian.
24
32      Mode
32
42      Median
55
60
65
44.29   Mean
Mean:
X̄ = ΣXi / n
The mean is the sum of the scores (ΣXi) divided by the number of scores (n) to com-
pute an arithmetic average of the scores in the distribution. The mean is the most of-
ten used measure of central tendency. It has two properties: 1) the sum of the devia-
tions of the individual scores (Xi) from the mean is zero, 2) the sum of squared devia-
tions from the mean is smaller than what can be obtained from any other value cre-
ated to represent the central tendency of the distribution. In the above table the mean
age is (310/7) or 44.29.
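A short Python sketch using the standard library's statistics module reproduces the
mode, median, and mean for the ages in the table above:

import statistics

ages = [24, 32, 32, 42, 55, 60, 65]     # scores from the table above
print(statistics.mode(ages))            # 32
print(statistics.median(ages))          # 42  (the 4th of 7 ordered scores)
print(statistics.mean(ages))            # 44.2857... (310/7)

# Dropping the last score (65) leaves an even number of scores; the
# median becomes the average of the two middle scores: (32 + 42) / 2
print(statistics.median(ages[:-1]))     # 37.0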
Weighted Mean:
When two or more means are combined to develop an aggregate or grand mean, the
influence of each mean must be weighted by the number of cases in its subgroup.
Example
Wrong method (ignores subgroup sizes):
X̄ = (X̄1 + X̄2) / 2
Correct method (weights each mean by its subgroup size):
X̄w = (n1X̄1 + n2X̄2) / (n1 + n2)
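A small Python sketch of the two methods; the subgroup sizes and means below are
hypothetical stand-ins, since the Guide's own worked numbers appear in a figure:

# Hypothetical subgroups: n1 = 10 with mean 20, n2 = 40 with mean 50
ns, means = [10, 40], [20.0, 50.0]

wrong = sum(means) / len(means)                             # 35.0 (ignores n)
correct = sum(n * m for n, m in zip(ns, means)) / sum(ns)   # 44.0 (weighted)
print(wrong, correct)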
Measures of Variation
Range: The range is the difference between the highest and lowest score (high-low).
It describes the span of scores but cannot be compared to distributions with a differ-
ent number of observations. In the table below, the range is (65-24) or 41.
Variance: The variance represents the average of the squared deviations between the
individual scores and the mean. The larger the variance the more variability there is
among the scores in a given distribution. When comparing two samples with the
same unit of measurement (e.g., age), the variances are comparable even though the
sample sizes may be different. Generally, smaller samples have greater variability
among the scores than larger samples. The formula is almost the same for estimat-
ing population variance.
Population variance:   σ² = Σ(Xi − μ)² / N
Sample variance:       S² = Σ(Xi − X̄)² / (n − 1)
Standard deviation: The standard deviation represents the square root of variance. It
provides a representation of the variation among scores that is directly comparable to
the raw scores.
Population standard deviation:   σ = √( Σ(Xi − μ)² / N )
Sample standard deviation:       S = √( Σ(Xi − X̄)² / (n − 1) )
Example
     Xi        Xi − X̄       (Xi − X̄)²
     24        -20.29         411.68
     32        -12.29         151.04
     32        -12.29         151.04
     42         -2.29           5.24
     55         10.71         114.70
     60         15.71         246.80
     65         20.71         428.90
               Sum of squared deviations = 1509.40
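The following Python sketch reproduces the population and sample variance and
standard deviation for the ages above; the results match the software output that
follows:

ages = [24, 32, 32, 42, 55, 60, 65]
n = len(ages)
mean = sum(ages) / n                       # 44.2857

ss = sum((x - mean) ** 2 for x in ages)    # sum of squared deviations, ~1509.43
pop_var = ss / n                           # 215.63
sam_var = ss / (n - 1)                     # 251.57
pop_std = pop_var ** 0.5                   # 14.68
sam_std = sam_var ** 0.5                   # 15.86
print(pop_var, sam_var, pop_std, sam_std)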
Software Example: Uses data from previous example.
Descriptive Statistics
Variable: Age
----------------------------------------------------------------------
Count             7            Pop Var                    215.6327 [b]
Sum             310.0000       Sam Var                    251.5714
Mean             44.2857       Pop Std                     14.6844
Median           42.0000       Sam Std                     15.8610
Min              24.0000       Std Error                    5.9949
Max              65.0000       CV%                         35.8152 [c]
Range            41.0000       95% CI (+/-)                14.6690 [d]
Skewness          0.0871 [a]   One sample t-test (mu=0)|p<| 0.0003 [e]
----------------------------------------------------------------------
Missing Cases 0
Interpretation
[a] 	 Skewness provides an indication of how asymmetric the distribution is for a given sam-
ple. A negative value indicates a negative skew. Values greater than 1 or less than -1 indicate a
non-normal distribution.
[b]	 Population variance (Pop Var) and population standard deviation (Pop Std) will always be
less than sample variance and sample standard deviation, since the sum of squares is divided
by n instead of n-1.
[c]	 The coefficient of variation (CV) is the ratio of the sample standard deviation to the sample
mean: (sample standard deviation/sample mean)*100 gives CV%. It is used as a measure
of relative variability; CV is not affected by the units of a variable.
[d] Add and subtract this value to the mean to create a 95% confidence interval.
[e]	 This represents the results of a one-sample t-test that compares the sample mean to a hy-
pothesized population value of 0 years of age. In this example, the sample mean age of 44 is sta-
tistically significantly different from zero.
3.3 Standardized Z-Scores
To obtain a standardized score, subtract the mean from the individual score and di-
vide by the standard deviation: z = (Xi − X̄) / S.
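A quick Python sketch using the age data from the previous section (mean 44.29,
sample standard deviation 15.86); taking "Ed" to be the oldest subject (age 65) is an
assumption, but it is consistent with the 1.31 value reported below:

ages = [24, 32, 32, 42, 55, 60, 65]
mean = sum(ages) / len(ages)
sd = (sum((x - mean) ** 2 for x in ages) / (len(ages) - 1)) ** 0.5

z_scores = [(x - mean) / sd for x in ages]
print(round(z_scores[-1], 2))   # 1.31 -- the oldest subject, age 65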
Interpretation
[a] Ed is 1.31 standard deviations above the mean age for those represented in the sample.
Software Output: Data from the previous example.
The following displays z-scores created by statistical software when conducting a de-
scriptive analysis. The statistical software calculates the mean and standard devia-
tion for the sample data and then calculates a z-score for each observation and out-
puts the result to the data file thereby creating a new variable (in this case Z-Age).
The following displays the descriptive statistics for the Z-Age variable. Notice that
the mean is zero and sample variance and sample standard deviation are one.
Descriptive Statistics
Variable: Z-Age
--------------------------------------------------------------------
Count                  7        Pop Var                       0.8571
Sum                    0.0000   Sam Var                       1.0000
Mean                   0.0000   Pop Std                       0.9258
Median               -0.1441    Sam Std                       1.0000
Min                  -1.2790    Std Error                     0.3780
Max                    1.3060   CV%                              ERR
Range                  2.5850   95% CI (+/-)                  0.9248
Skewness               0.0871   One sample t-test (mu=0) |p<| 1.0000
--------------------------------------------------------------------
Missing Cases 0
Z-scores may also be used to compare similar individuals in different groups. As an
example, to compare students with similar abilities taking two different classes with
the same level of course content but different instructors, the exam scores from the
two courses could be “normalized.” To more properly compare student A's perform-
ance in class A with Student B's performance in class B, the numerical test scores
could be adjusted by the variation (standard deviation) of test scores in each class
and the distance of each student's test score from the average (mean) for the class.
The student with the greater z-score performed relatively better, as measured by the
test score, than the student with the lower z-score.
Transformed Z-Scores
Z-scores have two major disadvantages for a lay audience. First, negative scores
have negative connotations. Second, z-scores and their associated decimal point
presentation can be difficult for others to interpret. These disadvantages can be miti-
gated by transforming the z-scores into another metric that removes decimals and
negative values. This is a common practice in standardized testing.
Procedure: multiply the z-score by an arbitrary standard deviation and add an arbi-
trary mean:
Transformed score = M + (z × SD)
where
M = arbitrary mean
SD = arbitrary standard deviation
Fred took an IQ test and correctly answered 70 out of 100 points. The mean score
on this test is 75 with a standard deviation 5. The following transforms Fred’s z-score
(-1) to a scale where the mean is 100 and a standard deviation is 15.
Sample statistics: mean = 75, standard deviation = 5
Fred’s z-score: z = (70 − 75) / 5 = −1
Transformation: 100 + (−1 × 15) = 85
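A tiny Python sketch of the same transformation applied to Fred's score:

def transform(z, new_mean, new_sd):
    """Rescale a z-score to an arbitrary mean and standard deviation."""
    return new_mean + z * new_sd

z_fred = (70 - 75) / 5                 # -1.0
print(transform(z_fred, 100, 15))      # 85.0 -- Fred's transformed score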
3.4 Bivariate Statistics – Counts and Percents
Column %: Of those who are female, 75% (75/100) support the tax.
Column Total %: Overall 50% (100/200) are female and 50% (100/200) are male.
Software Output: An example using survey data.
         Count |
         Row % |
         Col % |
       Total % |    Male | Female     |   Total
     -------------------------------------------
       <12 yrs |       3 |       4 [a]|       7
               |   42.86 |   57.14 [b]|
               |   11.54 |   16.67 [c]|   14.00
               |    6.00 |    8.00 [d]|
     -------------------------------------------
       HS Grad |      10 |      11    |      21
               |   47.62 |   52.38    |
               |   38.46 |   45.83    |   42.00
               |   20.00 |   22.00    |
     -------------------------------------------
       College |      13 |       9    |      22
               |   59.09 |   40.91    |
               |   50.00 |   37.50    |   44.00
               |   26.00 |   18.00    |
     -------------------------------------------
               |      26 |      24    |      50
         Total |   52.00 |   48.00    | 100.00
Interpretation
[a] Count: There are 4 females who have less than 12 years of education.
[b] Row %: Of those who have less than a high school education (<12 yrs), 57.14% (4/7) are fe-
male.
[c] Column %: Of those who are female, 16.67% (4/24) have less than a high school education.
[d] Total %: 8% (4/50) of the sample is females with less than a high school education.
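For readers working in Python, a pandas sketch (not from the Guide) produces the
same kinds of percentages; the toy data below reproduce only the <12 yrs row of the
crosstabulation above:

import pandas as pd

# Toy data: the <12 yrs row only (3 males, 4 females)
df = pd.DataFrame({"sex":  ["Male"] * 3 + ["Female"] * 4,
                   "educ": ["<12 yrs"] * 7})

counts  = pd.crosstab(df["educ"], df["sex"], margins=True)       # cell counts
row_pct = pd.crosstab(df["educ"], df["sex"], normalize="index") * 100
print(row_pct)   # Male 42.86, Female 57.14 -- matches [b] above

# With the full 50-case data loaded, normalize="columns" and
# normalize="all" would reproduce the column % [c] and total % [d].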
Summary Table
The following summary table is based on column percent from the crosstabulation
shown on the previous page. Counts are from the column marginals (Totals) for male
and female.
Education level for all subjects in the sample and by sex subgroups
                        All      Male     Female
Observations (n) =       50        26        24
<12 yrs               14.0%     11.5%     16.7%
HS Grad               42.0%     38.5%     45.8%
College               44.0%     50.0%     37.5%
Interpretation
There are slightly more men in this sample than women. Using percentages to adjust for differ-
ences in sample size, men are more likely (50%) to have a college education than women
(37.5%).
3.5 Bivariate Statistics – Means
The following example provides separate income means for males and females.
Category:     All
--------------------------------------------------------------------
Count                      1439              Pop Std Dev                  19498.0888
Min                           0.0000         Sample Std Dev               19504.8672
Max                      75000.0000          Standard Error                  514.1769
Sum                  24060750.0000           95% CI (+/-)                   1008.6397
Median                   11250.0000          95% Lower Limit              15711.8259
Mean                     16720.4656          95% Upper Limit              17729.1053
Range                    75000.0000          Coeff of Variation              116.6527
By Subgroup Variable: SEX
Category:    Male
--------------------------------------------------------------------
Count                      620               Pop Std Dev           22407.0499
Min                            0.0000        Sample Std Dev        22425.1420
Max                      75000.0000          Standard Error          901.3426
Sum                   14445000.0000          95% CI (+/-)           1770.0598
Median                   18750.0000          95% Lower Limit       21528.3273
Mean                     23298.3871          95% Upper Limit       25068.4469
Range                    75000.0000          Coeff of Variation       96.2519
Category:    Female
--------------------------------------------------------------------
Count                      819               Pop Std Dev           15177.0254
Min                            0.0000        Sample Std Dev        15186.2995
Max                      75000.0000          Standard Error          530.9765
Sum                    9615750.0000          95% CI (+/-)           1042.2369
Median                    4500.0000          95% Lower Limit       10698.6056
Mean                     11740.8425          95% Upper Limit       12783.0794
Range                    75000.0000          Coeff of Variation      129.3459
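For raw data in Python, a pandas groupby produces comparable subgroup summaries;
the records below are toy data, not the survey used above:

import pandas as pd

# Toy data standing in for the income survey records
df = pd.DataFrame({"sex":    ["Male", "Male", "Female", "Female"],
                   "income": [30000, 20000, 15000, 10000]})

stats = df.groupby("sex")["income"].agg(["count", "mean", "median", "std"])
print(stats)   # one row of summary statistics per subgroup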
4
Hypothesis Testing Basics
The chain of reasoning and systematic steps used in hypothesis testing are the back-
bone of every statistical test regardless of whether one writes out each step in a class-
room exercise or uses statistical software to conduct statistical tests on variables
stored in a data file.
★    The sample estimate is compared to the underlying sampling distribution for
     samples of the same size.
★    The probability that a sample estimate reflects the population parameter is deter-
     mined from the sampling distribution.
Reasonable doubt is based on probability sampling distributions. The benchmark for
reasonable doubt (defined by “alpha”) is established by the researcher. Alpha .05 is a
common benchmark for reasonable doubt. At alpha .05 we know from the sampling
distribution that a test statistic at or beyond this benchmark will only occur by ran-
dom chance five times out of 100 (5% probability). Since a test statistic with a p-value
of .05 could only occur by random chance 5% of the time, we assume that the test sta-
tistic resulted because there are true differences between the population parameters,
not because we unwittingly drew a biased random sample.
The following table summarizes the possible outcomes of hypothesis testing as they
relate to “truth” in the underlying population. It is important to remember that in hy-
pothesis testing we are using sample statistics to make predictions about the un-
known population values. The orange cells represent erroneous conclusions from hy-
pothesis testing. One error (Type I) is to reject the null hypothesis when there is no dif-
ference in the population (i.e., this is being wrong when we conclude significance).
The other error (Type II) is to not reject the null hypothesis when there is an actual dif-
ference. The blue cells represent correct decisions from hypothesis testing.
                            Null true in population     Null false in population
 Rejected Null Hyp.         Type I error (alpha)        Correct Decision
 Failed to Reject Null      Correct Decision            Type II error (beta)
When learning statistics we generally conduct statistical tests by hand. In these situa-
tions, we establish before the test is conducted what test statistic is needed (called
the critical value) to claim statistical significance. So, if we know for a given sampling
distribution that a test statistic of plus or minus 1.96 would only occur 5% of the time
randomly, any test statistic that is 1.96 or greater in absolute value would be statisti-
cally significant. In an analysis where a test statistic was exactly 1.96, there would be
a 5% chance of being wrong if statistical significance is claimed. If the test statistic
was 3.00, statistical significance could also be claimed but the probability of being
wrong would be much less (about .003 if using a 2-tailed test, or three-tenths of one
percent; 0.3%). Both .05 and .003 are known as alpha; the probability of a Type I er-
ror.
When conducting statistical tests with computer software, the exact probability of a
Type I error is calculated. It is presented in several formats but is most commonly re-
ported as "p <" or "Sig." or "Signif." or "Significance." Using "p <" as an example, if
a priori the threshold for statistical significance is established at alpha .05, any test
statistic with significance at or less than .05 would be considered statistically signifi-
cant and the null hypothesis of no difference must be rejected. The following table
links p values with a constant alpha benchmark of .05 (note: alpha remains constant
while p-values are a direct result of the analysis on sample data):
p-value   Alpha   Meaning                                        Conclusion
  .10      .05    10% chance the difference is not significant   Not statistically significant
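A short Python sketch (using scipy, which the Guide itself does not use) confirms
these probabilities and the 1.96 critical value:

from scipy.stats import norm

# Two-tailed p-values for test statistics on the standard normal curve
for z in (1.96, 3.00):
    p = 2 * norm.sf(z)          # sf is the upper-tail area, 1 - cdf
    print(z, round(p, 4))       # 1.96 -> 0.05, 3.00 -> 0.0027

# Critical value for alpha .05, 2-tailed
print(norm.ppf(0.975))          # 1.96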
4.2 The Normal Distribution
Although there are numerous sampling distributions used in hypothesis testing, the
normal distribution is the most common example of how data would appear if we cre-
ated a frequency histogram where the x axis represents the values of scores in a dis-
tribution and the y axis represents the frequency of scores for each value. Most
scores will be similar and therefore will group near the center of the distribution.
Some scores will have unusual values and will be located far from the center or apex
of the distribution. In hypothesis testing, we must decide whether the unusual val-
ues are simply different because of random sampling error or they are in the extreme
tails of the distribution because they are truly different from others. Sampling distribu-
tions have been developed that tell us exactly what the probability of this sampling er-
ror is when data originate from a random sample collected from a population that is
normally distributed.
Using theoretical sampling probability distributions
Sampling distributions approximate the probability that a particular value would occur
by chance alone. If means were collected from an infinite number of repeated ran-
dom samples of the same size from the same population, most means would be very
similar in value; in other words, they would group around the true population mean.
In a normal distribution, most means will collect about a central value or midpoint of a
sampling distribution. The frequency of means will decrease as the value of the ran-
dom sample mean increases its distance from the center of a normal sampling distri-
bution toward the tails. In a normal probability distribution, about 95% of the means
resulting from an infinite number of repeated random samples will fall between 1.96
standard errors above and below the midpoint of the distribution which represents
the true population mean and only 5% will fall beyond (2.5% in each tail of the distri-
bution).
The following are commonly used points on a distribution for deciding statistical sig-
nificance.
Standard error: The standard error is a mathematical adjustment to the sample stan-
dard deviation to account for the effect sample size has on the underlying sampling
distribution. It represents the standard deviation of the sampling distribution.
The percentage of scores beyond a particular point along the x axis of a sampling dis-
tribution represent the percent of the time during an infinite number of repeated sam-
ples one would expect to have a score at or beyond that value on the x axis. This
value on the x axis is known as the critical value when used in hypothesis testing.
The midpoint represents the actual population value. Most scores will fall near the ac-
tual population value but will exhibit some variation due to sampling error. If a score
from a random sample falls 1.96 standard errors or farther above or below the mean
of the sampling distribution, we know from the probability distribution that there is
only a 5% or less chance of randomly selecting a set of scores that would produce a
sample mean that far from the true population mean. This area above and below
1.96 standard errors is the region of rejection.
When conducting significance testing, if we have a test statistic that is at least 1.96
standard errors above or below the mean of the sampling distribution, we assume we
have a statistically significant difference between our sample mean and the expected
mean for the population. Since we know a value that far from the population mean
will only occur randomly 5% or less of the time, we assume the difference is the re-
sult of a true difference between the sample and the population mean, and is not the
result of random sampling error. The 5% is also known as the probability of being
wrong when we conclude statistical significance.
A 2-tailed test is used when the researcher cannot determine a priori whether a differ-
ence between population parameters will be positive or negative. A 1-tailed test is
used when it is reasonable to expect a difference will be positive or negative.
4.3 Steps to Hypothesis Testing
I. General Assumptions
Random sampling
     Note: The alternative hypothesis will indicate whether a 1-tailed or a 2-tailed test
     is utilized to reject the null hypothesis.
Ha for a 1-tailed test: The __ of __ is greater (or less) than the __ of __.
     This determines how different the parameters and/or statistics must be before
     the null hypothesis can be rejected. This "region of rejection" is based on alpha (α)
     -- the error associated with the confidence level. The point of rejection is
     known as the critical value.
     The collected data are converted into standardized scores for comparison with
     the critical value.
     If the test statistic equals or exceeds the region of rejection bracketed by the criti-
     cal value(s), the null hypothesis is rejected. In other words, the chance that the
     difference exhibited between the sample statistics is due to sampling error is
     remote--there is an actual difference in the population.
5
Confidence Intervals
For interval estimation, the data must be from a random sample. The following pre-
sents interval estimation techniques for proportions (nominal/ordinal data) and also
for means (interval/ratio data).
Interval estimation (margin of error) uses sample data to determine a range (interval)
that, at an established level of confidence, is expected to contain the population pro-
portion.
Steps
Use the z-distribution table to find the critical value for a 2-tailed test given the se-
lected confidence level (alpha).
where
p = sample proportion
q = 1 - p
n = sample size
Sp = standard error of the proportion = √(pq/n)
CV = critical value
CI = p ± (CV)(Sp)
Interpret
Based on alpha .05, the researcher is 95% confident that the proportion in the popula-
tion from which the sample was obtained is between __ and __.
Note: Given the sample data and level of error, the confidence interval provides an es-
timated range of proportions that is most likely to contain the population proportion.
The term "most likely" is measured by alpha (i.e., in most cases there is a 5% chance
--alpha .05-- that the confidence interval does not contain the true population propor-
tion).
More About the Standard Error of the Proportion
The standard error of the proportion will vary as sample size and the proportion
changes. As the standard error increases, so will the margin of error.
As a proportion approaches 0.5 the error will be at its greatest value for a given sam-
ple size. Proportions close to 0 or 1 will have the lowest error. The error above a pro-
portion of .5 is a mirror reflection of the error below a proportion of .5.
As sample size increases the error of the proportion will decrease for a given propor-
tion. However, the reduction in error of the proportion as sample size increases is not
constant. Using a proportion of 0.9 as an example, increasing the sample size from
100 to 300 cut the standard error by about half (from .03 to .017). Increasing the sam-
ple size by another 200 only reduced the standard error by about one quarter (.017 to
.013).
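A small Python sketch reproduces these standard errors:

def se_prop(p, n):
    """Standard error of a proportion: sqrt(pq/n)."""
    return (p * (1 - p) / n) ** 0.5

print(round(se_prop(0.9, 100), 3))   # 0.030
print(round(se_prop(0.9, 300), 3))   # 0.017
print(round(se_prop(0.9, 500), 3))   # 0.013
print(round(se_prop(0.5, 100), 3))   # 0.050 -- the maximum for n = 100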
Example: Interval Estimation for Proportions
Problem: A random sample of 500 employed adults found that 23% had traveled to a
foreign country. Based on these data, what is the estimate for the entire employed
adult population?
Compute interval:
Sp = √((.23 × .77) / 500) = .0188
CI = .23 ± 1.96(.0188) = .23 ± .0369
Interpret
Based on a 95% confidence level, the actual proportion of all employed adults who
have traveled to a foreign country is between 19.3% and 26.7%.
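A minimal Python sketch of the same interval:

p, n, cv = 0.23, 500, 1.96
sp = (p * (1 - p) / n) ** 0.5          # 0.0188
moe = cv * sp                          # 0.0369
print(round(p - moe, 3), round(p + moe, 3))   # 0.193 0.267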
Software output: Summary data from the previous example.
                                            Margin Limits
Confidence      Error +/-            Lower            Upper
------------------------------------------------------------
    90%            0.0311           0.1989           0.2611
    95%            0.0369           0.1931           0.2669
    99%            0.0486           0.1814           0.2786
Interpretation
[a] 	 At the 95% confidence level, the percent of all employees in the population who believe
the promotion system is not fair is between 12.2% to 27.8%.
[b]	 The margin of error is different for each response category due to the change in propor-
tions. Proportions at 0.5 (50%) will have the highest level of error.
Using One Margin of Error for Multiple Comparisons
To avoid calculating a separate margin of error for each response category, it is com-
mon to calculate the most conservative standard error of a proportion (p=0.5) and
use this to represent the margin of error for all response options within a specific sub-
group. In the table below, a separate margin of error is calculated for the total sam-
ple, the male sample, and the female sample.
Summary Table: Using one confidence interval (CI) for all data in one subgroup
Interpretation
[a]	 Based on the 95% confidence level, the percent of U.S. adults with a junior college educa-
tion level is between 3.5% and 8.5% (6.0% +/- 2.5%).
[b]	 Based on the 95% confidence level, the percent of Female U.S. adults with a junior college
education level is between 2.8% and 9.6% (6.2% +/- 3.4%).
[c]	 Based on the 95% confidence level, the percent of U.S. adults with a graduate education
level is between 5.1% and 10.1% (7.6% +/- 2.5%).
[d]	 Based on the 95% confidence level, the percent of Female U.S. adults with a graduate edu-
cation level is between 2.0% and 8.8% (5.4% +/- 3.4%).
Interval Estimation for the Difference Between Two Proportions
This approach uses sample data to determine a range (interval) that, at an estab-
lished level of confidence, will contain the difference between two population propor-
tions.
Steps
Use the z distribution table to find the critical value for a 2-tailed test (at alpha .05 the
critical value would equal 1.96).

$CI = (p_1 - p_2) \pm (CV)\left(s_{p_1 - p_2}\right)$

where $s_{p_1 - p_2} = \sqrt{\dfrac{p_1 q_1}{n_1} + \dfrac{p_2 q_2}{n_2}}$ and CV = critical value
Interpret
Based on alpha .05, the researcher is 95% confident that the difference between the
proportions of the two subgroups in the population from which the sample was ob-
tained is between __ and __.
Note: Given the sample data and level of error, the confidence interval provides an es-
timated range of proportions that is most likely to contain the difference between the
population subgroups. The term "most likely" is measured by alpha; in most cases
there is a 5% chance (alpha .05) that the confidence interval does not contain the
true difference between the subgroups in the population.
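As a sketch, the interval can be computed directly in Python. The proportions and sample sizes below are borrowed from the breakfast example later in this chapter; the unpooled standard error matches the formula above:

import math

p1, n1 = 0.23, 80     # school A
p2, n2 = 0.07, 180    # school B
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled SE for a CI
moe = 1.96 * se                                          # 95% confidence
diff = p1 - p2
print(round(diff - moe, 3), round(diff + moe, 3))        # ~0.061 to ~0.259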
Interval Estimation for a Mean
Interval estimation involves using sample data to determine a range (interval) that, at
an established level of confidence, is expected to contain the mean of the population.
Steps
2. Use either the z distribution (if n>120) or the t distribution (for all sizes of n).
3. Use the appropriate table to find the critical value for a 2-tailed test
4. Multiple hypotheses can be compared with the estimated interval for the popula-
   tion to determine their significance. In other words, differing values of population
   means can be compared with the interval estimation to determine if the hypothe-
   sized population means fall within the region of rejection.
Estimation Formula

$CI = \bar{x} \pm (CV)\left(s_{\bar{x}}\right)$

where
$\bar{x}$ = sample mean
CV = critical value (consult z or t distribution table for df = n - 1 and chosen alpha, commonly .05)

Standard error of the mean

$s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Estimation
df = n - 1 = 29
CV = 2.045

Standard error
$s_{\bar{x}} = .219$

$CI_{95} = 19.5 \pm 2.045(.219)$
$CI_{95} = 19.5 \pm .448$
Interpretation
Based on a 95% confidence level, the actual mean age of all incoming freshmen will
be somewhere between 19 years (the lower limit) and 20 years (the upper limit) of
age.
                                         Margin Limits
Confidence      Error +/-            Lower             Upper
------------------------------------------------------------
    90%            0.3723          19.1277          19.8723
    95%            0.4481          19.0519          19.9481[a]
    99%            0.6039          18.8961          20.1039
Interpretation
[a]  Based on a 95% confidence level, the actual mean age of all incoming freshmen will be
somewhere between 19 years (the lower limit) and 20 years (the upper limit) of age. Note that
both measures are rounded.
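A minimal Python sketch of the interval, assuming the sample standard deviation of 1.2 implied by the reported standard error of .219 (scipy supplies the t critical value):

import math
from scipy import stats

xbar, s, n = 19.5, 1.2, 30
se = s / math.sqrt(n)                    # standard error, ~.219
cv = stats.t.ppf(0.975, df=n - 1)        # 2.045 for df = 29
print(round(xbar - cv * se, 2), round(xbar + cv * se, 2))  # ~19.05 to ~19.95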
6
Z-Tests of Proportions and Chi-
Square
This section presents techniques for using counts and proportions to conduct hy-
pothesis testing. Use the techniques presented in this section if the analysis or de-
pendent variable meets the following characteristics.
6.1 Z-test of Proportions
For hypothesis testing in this section both the independent and dependent variable
must be nominal or ordinal. In addition, the data are assumed to be from a random
sample. The relevant sampling distribution for this section is the standard normal (Z)
distribution. Tables of critical values are included in the Appendix.
Example
Problem: Historical data indicates that about 10% of the agency's clients believe
they were given poor service. Now under new management for six months, a random
sample of 110 clients found that 15% believe they were given poor service.
I.	   Assumptions
If 2-tailed test
      Ho: There is no difference between the historical proportion of clients reporting
      poor service and the current proportion of clients reporting poor service.
      Ha: There is a difference between the historical proportion of clients reporting
      poor service and the current proportion of clients reporting poor service.
If 1-tailed test
      Ha: The proportion of current clients reporting poor service is significantly greater
      than the historical proportion of clients reporting poor service.
IV. Compute the Test Statistic

$z = \dfrac{\hat{p} - p}{\sqrt{pq / n}}$

where
$\hat{p}$ = sample proportion
p = population proportion
q = 1 - p
n = sample size
Test Statistic

$z = \dfrac{.15 - .10}{\sqrt{(.10)(.90) / 110}} = \dfrac{.05}{.029} = 1.724$

      2-tailed: Since the test statistic of 1.724 did not meet or exceed the critical value of
      1.96, there is insufficient evidence to conclude there is a statistically signifi-
      cant difference between the historical proportion of clients reporting poor
      service and the current proportion of clients reporting poor service.
      1-tailed: Since the test statistic of 1.724 exceeds the critical value of 1.65, conclude
      the proportion of current clients reporting poor service is significantly greater
      than the historical proportion of clients reporting poor service.
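The z statistic and its 2-tailed p-value can be verified with a short Python sketch (standard library only); note that the text's 1.724 reflects rounding the standard error to .029:

import math
from statistics import NormalDist

p0, phat, n = 0.10, 0.15, 110
se = math.sqrt(p0 * (1 - p0) / n)        # uses the hypothesized proportion
z = (phat - p0) / se                     # ~1.75 unrounded
p_two = 2 * (1 - NormalDist().cdf(z))    # 2-tailed p-value, ~.08
print(round(z, 3), round(p_two, 4))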
Software Output: Summary data from the previous example.
Interpretation
[a]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error (conclud-
ing there is statistical significance when there is none). Since there is about 8% chance of mak-
ing a type 1 error, which exceeds the 5% error limit established in the rejection criteria of the hy-
pothesis testing process (alpha), do not conclude there is statistical significance.
Comparing Proportions From Two Independent Samples (Two-Sample Test)
Example
Problem: A survey was conducted of students from the Princeton public school sys-
tem to determine if the incidence of hungry children was consistent in two schools lo-
cated in lower-income areas. A random sample of 80 elementary students from
school A found that 23% did not have breakfast before coming to school. A random
sample of 180 elementary students from school B found that 7% did not have break-
fast before coming to school. Before putting more resources into school A, the
school superintendent wants to verify there is a statistically significant difference be-
tween the schools (i.e., the difference is beyond just random error created from two
small samples of students).
I. Assumptions
    Ho: There is no difference between the proportion of students in school A not
    eating breakfast and the proportion of students in school B not eating breakfast.
    Ha: There is a statistically significant difference between the proportion of stu-
    dents in school A not eating breakfast and the proportion of students in school B
    not eating breakfast.
$z = \dfrac{p_1 - p_2}{s_{p_1 - p_2}}$

where $s_{p_1 - p_2} = \sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$, $\hat{p} = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$, and $\hat{q} = 1 - \hat{p}$
Test Statistic

$z = \dfrac{.23 - .07}{.043} = 3.721$
    Since the test statistic 3.721 exceeds the critical value of 1.96, conclude there is
    a statistically significant difference between the proportion of students in school
     A not eating breakfast and the proportion of students in school B not eating
     breakfast.
Interpretation
[a]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error (conclud-
ing there is statistical significance when there is none). Since there is far less than a 5% chance
of making a type 1 error, conclude there is statistical significance.
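A sketch of the pooled two-sample calculation (the text's 3.721 reflects rounding the standard error to .043):

import math
from statistics import NormalDist

p1, n1 = 0.23, 80
p2, n2 = 0.07, 180
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)      # pooled proportion under Ho
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                            # ~3.67 unrounded
print(round(z, 2), 2 * (1 - NormalDist().cdf(z)))   # p far below .05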
Appropriate Sample Size
Sample size is an important issue when using statistical tests that rely on the stan-
dard normal distribution (z-distribution). As a rule of thumb, a sample is generally
considered large enough to use the z-distribution when both n(p) and n(q) are
greater than 5 (where q = 1 - p); in practice, only the smaller of p and q needs to be
checked. Otherwise, use the binomial distribution.
Example A
n=60, p=.10
n(p) 60(.10) = 6
Example B
n=50, p=.80
q=1-p or q=.20
n(q) 50(.20) = 10
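The rule can be wrapped in a small helper; a minimal sketch (the function name is illustrative):

def z_ok(n, p):
    """Rule of thumb: both n*p and n*q must exceed 5 to use the z-distribution."""
    q = 1 - p
    return n * p > 5 and n * q > 5

print(z_ok(60, 0.10))   # True:  n(p) = 6
print(z_ok(50, 0.80))   # True:  n(q) = 10
print(z_ok(30, 0.05))   # False: n(p) = 1.5 -> use the binomial distribution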
6.2 Chi-Square
Chi-square is the most common technique used to compare two nominal/ordinal vari-
ables, especially when at least one of the variables has more than two categories or
the sample size is small. In most cases, chi-square is preferred over a difference of
proportions z-test. The relevant sampling distribution for this section is the chi-
square distribution.
★ Goodness of Fit
★ Measuring Association
The chi-square goodness of fit test is used to compare frequencies (counts) among
multiple categories of nominal or ordinal level data for one-sample (univariate analy-
sis).
Problem: Evaluate variations in the proportion of defects produced from five assem-
bly lines. A random sample of 100 defective parts from the five assembly lines pro-
duced the following frequency table.
I. Assumptions
    Nominal or Ordinal level data
    Ho: There is no significant difference among the assembly lines in the observed
    frequencies of defective parts.
    Ha: There is a significant difference among the assembly lines in the observed fre-
    quencies of defective parts.
    At alpha .05 and 4 degrees of freedom, the critical value from the chi-square dis-
    tribution is 9.488

$\chi^2 = \sum \dfrac{\left(F_o - F_e\right)^2}{F_e}$

where
n = sample size
$F_o$ = observed frequency
$F_e$ = expected frequency (n ÷ number of categories)
                    Line A    Line B    Line C    Line D    Line E    Total
Fo                    24        15        22        20        19       100
Fe (100/5 = 20)       20        20        20        20        20       100
Chi-Square           0.80      1.25      0.20      0.00      0.05      2.30
    Since the chi-square test statistic 2.30 does not meet or exceed the critical value
    of 9.488, do not conclude there is a statistically significant difference among the
    assembly lines in the observed frequencies of defective parts.
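The same goodness-of-fit test can be reproduced with scipy (a sketch, not the Guide's software):

from scipy import stats

observed = [24, 15, 22, 20, 19]     # defects from lines A through E
expected = [20] * 5                 # 100 defects spread evenly over 5 lines
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(round(chi2, 2), round(p, 4))  # 2.30, p ~ .68 (not significant)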
Chi-Square Test of Independence
Problem: Evaluate the association between a person's sex and their attitudes toward
school spending on athletic programs. A random sample of adults in a school district
produced the following table (counts).
I. Assumptions
No more than 20% of the cells have an expected frequency less than 5
No empty cells
      Ho: There is no association between a person's sex and their attitudes toward
      spending on athletic programs.
      Ha: There is an association between a person's sex and their attitudes toward
      spending on athletic programs.
III.  Set the Rejection Criteria
      At alpha .05 and df = (rows - 1)(columns - 1) = 2, the critical value from the
      chi-square distribution is 5.991
IV.  Compute the Test Statistic

$\chi^2 = \sum \dfrac{\left(F_o - F_e\right)^2}{F_e}$

where $F_e$ = (row total × column total) ÷ n
Chi-square Calculations            Female                     Male
Spend more              (15 - 20.952)²/20.952     (25 - 19.048)²/19.048
Spend same               (5 - 10.476)²/10.476      (15 - 9.524)²/9.524
Spend less              (35 - 23.571)²/23.571     (10 - 21.429)²/21.429
    Since the chi-square test statistic 21.2 exceeds the critical value of 5.991, con-
    clude there is a statistically significant association between a person's sex and
    their attitudes toward spending on athletic programs. As is apparent in the contin-
    gency table, males are more likely to support spending on athletic programs than
    females.
Standardized Residuals
Standardized residuals are used to determine what categories (cells) were major con-
tributors to rejecting the null hypothesis. When the absolute value of the residual (R)
is greater than 2.00, the researcher can conclude it was a major influence on a signifi-
cant chi-square test statistic.
Example using the observed and expected frequencies in the previous example:

$R = \dfrac{F_o - F_e}{\sqrt{F_e}}$

For the two spend-less cells: female R = (35 - 23.571)/√23.571 = 2.35 and male
R = (10 - 21.429)/√21.429 = -2.47; both absolute values exceed 2.00.
Interpretation
[a]  Attitudes toward spending less for both females and males had a major contribution to the
chi-square result.
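A sketch of the test of independence and the standardized residuals using scipy and numpy:

import numpy as np
from scipy import stats

# rows: spend more, spend same, spend less; columns: female, male
table = np.array([[15, 25],
                  [ 5, 15],
                  [35, 10]])
chi2, p, df, expected = stats.chi2_contingency(table, correction=False)
print(round(chi2, 1), df, round(p, 5))               # ~21.2 with df = 2

residuals = (table - expected) / np.sqrt(expected)   # standardized residuals
print(np.round(residuals, 2))                        # |R| > 2 flags major cells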
6.3!Coefficients for Measuring Association
The following are a few of several measures of association used with chi-square and
other contingency table analyses. When using the chi-square statistic, these coeffi-
cients can be helpful in interpreting the strength of the relationship between two vari-
ables once statistical significance has been established. The logic for using measures
of association is as follows:
Even though a chi-square test may show statistical significance between two vari-
ables, the relationship between those variables may not be substantively important.
Measures of association are available to help evaluate the relative strength of a statis-
tically significant relationship. In most cases, they are not used in interpreting the
data unless the chi-square statistic first shows there is statistical significance (i.e., it
doesn't make sense to say there is a strong relationship between two variables when
the statistical test shows this relationship is not statistically significant).
Phi
Phi is only used on 2x2 contingency tables. It is interpreted as a measure of the rela-
tive strength of an association between two variables, ranging from 0 to 1.
Cramer's V Coefficient (V)
Characterizations
Proportional Reduction of Error (PRE)
Lambda
Lambda is a PRE measure for nominal variables, ranging from 0 to 1, interpreted as
the proportional reduction in error when the independent variable is used to predict
the dependent variable.
Gamma
Gamma is another PRE measure ranging from -1 to 1 that estimates the extent errors
are reduced in predicting the order of paired cases. Gamma ignores ties.
Kendall’s Tau b
Tau b is similar to Gamma but includes ties. It can range from -1 to 1 but since stan-
dardization is different from Gamma, it provides no clear explanation of PRE.
Inter-rater Agreement
Cohen’s Kappa
Cohen’s Kappa measures agreement between two raters beyond the agreement ex-
pected by chance, ranging from 0 (chance-level agreement) to 1 (perfect agreement).
Software Output: Survey of U.S. adults.
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Measures of Association
-------------------------------------
Cramer's V                  .196 [c]
Pearson C                   .192
Lambda Symmetric            .082 [d]
Lambda Dependent=Column     .000
Lambda Dependent=Row        .115 [e]
Interpretation
[a]	 The association between opinion on gun control and respondent sex is statistically signifi-
cant. This is the most common measure used for chi-square significance.
[b]	 When sample sizes are small, the continuous chi-square value tends to be too large. The
Yates continuity correction adjusts for this bias in 2x2 contingency tables. Regardless of sample
size, it is a preferred measure for chi-square tests on 2x2 tables.
[c]  Cramer's V of .196 indicates a relatively weak association between opinion and respon-
dent sex.
[d]  A symmetric lambda is used when identification of independent and dependent variables is
not useful.
[e]  With the row variable (opinion) treated as dependent, lambda indicates that knowing a
respondent's sex reduces prediction errors by about 11.5%.
Summary Table
This table includes descriptive statistics for nominal level data using counts and per-
centages. It also includes inferential statistics using confidence intervals for propor-
tions, chi-square, and measures of association. Five calculations were required to cre-
ate the margin of errors. Four crosstabulations are presented in one table using col-
umn percent (Sex Education by Sex, Sex Education by Race, Gun Control by Sex,
Gun Control by Race).
Table 1: Attitudes toward sex education and gun permit policy issues by sex and race of respondent
(percent).
                               Sex                                      Race
           Total    Male    Female    P <** (Cramer's V)    White    Non-White    P <** (Cramer's V)
Favor     48% [a]   33%      62%       .015 [c] (.282) [d]   48%       50%          .888 (.017)
Oppose      52%     67%      38%                             52%       50%
Interpretation
[a]	 Based on a 95% confidence level, the proportion of all U.S. adults who favor requiring gun
permits is 48% plus or minus 11.3% or between 37% to 59%.
[b]	 A statistically significant relationship between sex and attitudes toward sex education in
school is not evident (p > .05). Based on this random sample, there is a lack of convincing evi-
dence for a relationship in the underlying population.
[c]	 There is a statistically significant relationship between sex and attitudes toward gun per-
mits (p < .05). Women are significantly more likely (62%) to favor gun permits than men (33%).
[d]	 There is a weak to moderate association between sex and attitudes toward gun permits
based on a Cramer’s V of .282.
7
T-Tests of Means and One-Way
ANOVA
This section presents hypothesis testing techniques for interval/ratio data. The type
of analytical technique used will depend on the level of the independent variable.
This chapter focuses on t-tests and ANOVA. Later chapters will address correlation
and regression.
7.1 T-test of Means
For inferential statistics (confidence intervals and significance tests), the data are as-
sumed to be from a random sample. To conduct significance tests, the test statistic
is compared to a critical value from a sampling distribution. Tables of critical values
are included in the Appendix. The relevant sampling distributions for this section are
the t distribution and the F distribution.
Statistical Techniques
A one-sample t-test is used to compare a mean from a random sample to the mean
(mu) of a population. It is especially useful to compare a mean from a random sam-
ple to an established data source such as census data to determine if a sample is un-
biased (i.e., representative of the underlying population). Examples include compar-
ing mean age and education levels of survey respondents to known values in the
population.
Problem: Compare the mean age of incoming students to the known mean age for all
previous incoming students. A random sample of 30 incoming college freshmen re-
vealed the following statistics: mean age 19.5 years, standard deviation 1 year. The
college database shows the mean age for previous incoming students was 18.
I.	   Assumptions
Random sampling
      Ho: There is no significant difference between the mean age of past college stu-
      dents and the mean age of current incoming college students.
      Ha: There is a significant difference between the mean age of past college stu-
      dents and the mean age of current incoming college students.
      Test statistic

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{19.5 - 18}{1 / \sqrt{30}} = \dfrac{1.5}{.183} = 8.197$
     Given that the test statistic (8.197) exceeds the critical value (2.045), the null hy-
     pothesis is rejected in favor of the alternative. There is a statistically significant
     difference between the mean age of the current class of incoming students and
     the mean age of freshman students from past years. In other words, this year's
     freshman class is on average older than freshmen from prior years.
     If the results had not been significant, the null hypothesis would not have been
     rejected. This would be interpreted as the following: There is insufficient evi-
     dence to conclude there is a statistically significant difference in the ages of cur-
     rent and past freshman students.
Interpretation
[a]  For a 2-tailed test, the p-value represents the probability of making a type 1 error (conclud-
ing there is statistical significance when there is none). Since there is less than a 0.01% (p <
.0001) chance of making a type 1 error, there is statistical significance.
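A sketch of the one-sample t-test from the summary statistics (the text's 8.197 reflects rounding the standard error to .183):

import math
from scipy import stats

xbar, s, n, mu = 19.5, 1.0, 30, 18.0
t = (xbar - mu) / (s / math.sqrt(n))      # ~8.22 unrounded
p = 2 * stats.t.sf(abs(t), df=n - 1)      # 2-tailed p-value
print(round(t, 2), p)                     # p far below .0001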
Comparing Two Independent Sample Means (Two-Sample Test)
A two-sample t-test is used to compare two sample means. The independent vari-
able is nominal level data and the dependent variable is interval/ratio level data.
Problem: The number of years of education were collected from one random sample
of 38 police officers from City A and a second random sample of 30 police officers
from City B. The average years of education for the sample from City A is 15 years
with a standard deviation of 2 years. The average years of education for the sample
from City B is 14 years with a standard deviation of 2.5 years. Is there a statistically
significant difference between the education levels of police officers in City A and City
B?
I. Assumptions
Random sampling
Independent samples
Organize data
    For a 1-tailed hypothesis test
    Ha: The mean education level of police officers working in City A is significantly
    greater than the mean education level of police officers working in City B.
Standard error

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = .545$

where $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

      Test Statistic

$t = \dfrac{15 - 14}{.545} = 1.835$
V. Decide Results
    Police officers in City A have significantly more years of education than police offi-
    cers in City B. The test statistic 1.835 exceeds the critical value of 1.668 for a 1-
    tailed test.
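A sketch of the pooled (equal-variance) two-sample t-test from the summary statistics:

import math
from scipy import stats

x1, s1, n1 = 15.0, 2.0, 38    # City A
x2, s2, n2 = 14.0, 2.5, 30    # City B
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
t = (x1 - x2) / se                                 # ~1.83
print(round(t, 3), stats.t.sf(t, df=n1 + n2 - 2))  # 1-tailed p-value (< .05)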
Software Output: Uses survey data to compare the mean incomes of males and fe-
males.
                         Sample 1         Sample 2
                         Female           Male
-----------------------------------------------------------
Sample Mean             12279.4667       15266.4706
Std Deviation            4144.6323        3947.7352
Sample Size (n)            15               17
                        Homogeneity of Variance
------------------------------------------------------------
              F-ratio   1.09    DF (14, 16)   p < 0.4283 [a]
Interpretation
[a] The F-ratio is not statistically significant. Therefore, use the equal variance test statistic.
[b]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error (conclud-
ing there is statistical significance when there is none). Since there is about 4.6% (p < .0455)
chance of making a type 1 error, which does not exceed the 5% error limit (p=.05) established in
the rejection criteria of the hypothesis testing process (alpha), conclude there is statistical signifi-
cance.
Computing F-ratio
The F-ratio is used to determine whether the variances in two independent samples
are equal. If the F-ratio is not statistically significant, assume there is homogeneity of
variance and employ the standard t-test for the difference of means. If the F-ratio is
statistically significant, use an alternative t-test computation such as the Cochran
and Cox method.
Problem: Given the following summary statistics, are the variances equal?
Numerator of the ratio is the sample with the larger variance (Sample B).
Denominator of the ratio is the sample with the smaller variance (Sample A).
Fcv= 2.70
Compute the Test Statistic

$F = \dfrac{s_B^2}{s_A^2} = 1.50$

where $s_B^2$ = the larger sample variance (Sample B) and $s_A^2$ = the smaller sample
variance (Sample A)
Decide Results
Compare the test statistic with the F critical value (Fcv) listed in the F distribution. If
the F-ratio equals or exceeds the critical value, the null hypothesis (Ho: there is no
difference between the sample variances) is rejected. If there is a difference in the
sample variances, the comparison of two independent means should involve the use
of the Cochran and Cox method or one of several alternative techniques.
The test statistic (1.50) did not meet or exceed the critical value (2.70). Therefore,
there is no statistically significant difference between the variance exhibited in Sam-
ple A and the variance exhibited in Sample B. Assume homogeneity of variance for
tests of the difference between sample means.
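A sketch of the variance-ratio calculation. The standard deviations and sample sizes here are assumptions chosen to reproduce the F of 1.50 (the Guide's own summary statistics, and hence its critical value of 2.70, are not shown above):

from scipy import stats

s_a, n_a = 2.00, 30    # sample A: smaller variance (assumed values)
s_b, n_b = 2.45, 25    # sample B: larger variance goes in the numerator
F = s_b**2 / s_a**2                            # ~1.50
p = stats.f.sf(F, dfn=n_b - 1, dfd=n_a - 1)    # upper-tail p for the ratio
print(round(F, 2), round(p, 3))                # p well above .05 for these df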
Software Output: Comparing the mean incomes of high school graduates and high
school dropouts from survey data.
                         Sample 1         Sample 2
                         <12 yrs          HS Grade
------------------------------------------------------------
Sample Mean             23042.5000       31000.8235
Std Deviation            4450.5413        9442.4477
Sample Size (n)             6               17
                        Homogeneity of Variance
------------------------------------------------------------
              F-ratio   5.08    DF (16, 5)   p < 0.0408 [a]
Interpretation	 	
[a] 	 The F-ratio is statistically significant. Use the unequal variance test statistic.
[b]	 If homogeneity of variance had not been considered, the research may have erroneously
failed to find statistical significance.
[c]	 Since there is only a 1.2% (p < .0124) chance of making a type 1 error, which does not ex-
ceed the 5% error limit (p=.05, alpha), conclude there is statistical significance.
7.2 One-Way Analysis of Variance (ANOVA)
One-way analysis of variance (ANOVA) is used to evaluate means from two or more
subgroups. A statistically significant ANOVA indicates there is more variation be-
tween subgroups than would be expected by chance. It does not identify which sub-
group pairs are significantly different from each other. ANOVA is used to evaluate mul-
tiple means of one independent variable to avoid conducting multiple t-tests (see the
multiple comparison problem in section 7.3).
Problem: The number of years of education were obtained from one random sample
of 38 police officers from City A, a second random sample of 30 police officers from
City B, and a third random sample of 45 police officers from City C. The average
years of education for the sample from City A is 15 years with a standard deviation of
2 years. The average years of education for the sample from City B is 14 years with a
standard deviation of 2.5 years. The average years of education for the sample from
City C is 16 years with a standard deviation of 1.2 years.
Is there a statistically significant difference between the education levels of police offi-
cers in City A, City B, and City C?
I.	   Assumptions
      Ho: There is no statistically significant difference among the three cities in the
      mean years of education for police officers.
Ha: There is a statistically significant difference among the three cities in the
mean years of education for police officers.
At alpha .05, df = (2, 110)
    Consult the F-distribution: Fcv = 3.072

$F = \dfrac{\sum n_k \left(\bar{X}_k - \bar{X}_g\right)^2 / (k - 1)}{\left[\sum \left(X_i - \bar{X}_1\right)^2 + \sum \left(X_i - \bar{X}_2\right)^2 + \sum \left(X_i - \bar{X}_3\right)^2\right] / (n - k)}$

where
$\bar{X}_k$ = the mean of subgroup k
$\bar{X}_g$ = the grand mean
k = the number of subgroups
n = the total sample size

Estimate F Statistic
Applying the formula to the three samples yields F = 9.931.
    Compare the F statistic to the F critical value. If the F statistic equals or exceeds
    the Fcv, the null hypothesis is rejected. This suggests that the population means
    of the groups sampled are not equal -- there is a difference between the group
    means.
    Since the F-statistic (9.931) exceeds the F critical value (3.072), we reject the null
    hypothesis and conclude there is a statistically significant difference between the
    three cities in the mean years of education for police officers.
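A sketch of the one-way ANOVA from the summary statistics. Note that the Guide's hand calculation effectively treats each reported standard deviation as a population value (so each group's sum of squares is n·s² rather than (n - 1)·s²); the sketch follows that convention to reproduce F = 9.93:

groups = [(38, 15.0, 2.0), (30, 14.0, 2.5), (45, 16.0, 1.2)]  # (n, mean, sd)
N = sum(n for n, _, _ in groups)
k = len(groups)
grand = sum(n * m for n, m, _ in groups) / N        # grand mean

ss_between = sum(n * (m - grand) ** 2 for n, m, _ in groups)
ss_within = sum(n * s ** 2 for n, _, s in groups)   # n * s**2, as in the text

F = (ss_between / (k - 1)) / (ss_within / (N - k))
print(round(F, 2))                                  # ~9.93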
Software Output: Comparing amount of time spent watching TV per day by age
group.
Analysis of Variance
                            F Statistic              p <
                          --------------------------------
                                   5.94 [f]     0.0005 [g]
Interpretation
[e] n-1
[f]	   Between Groups Mean Squares ÷ Within Groups Mean Squares
[g]  The F-Statistic is statistically significant. The p-value represents the probability of making
a type 1 error (concluding there is statistical significance when there is none). Since there is
about a .05% (p < .0005) chance of making a type 1 error, conclude there is statistical signifi-
cance; i.e., there is variation among the groups.
7.3 Multiple Comparison Problem
(Post hoc comparisons)
As statistical tests are repeatedly run between subgroups within the same variable,
the probability of making a type 1 error increases (i.e., if 100 t-tests were con-
ducted, by random chance 5 out of 100 may be statistically significant when in fact
they are not). Post hoc comparisons adjust for the problem of multiple comparisons
and provide information regarding which subgroup pairs have a statistically signifi-
cant difference. There are several alternative approaches to conducting post hoc
comparisons. Bonferroni is one of the most conservative (least likely to find statistical
significance).
Software Output: The following presents post hoc comparisons for the preceding
ANOVA output.
Interpretation	 	
[a]  The difference between the mean of those 30-39 (G1) and the mean of those 50 or older
(G2) is -0.666. A negative difference indicates those 50 or older have a greater mean number of
hours watching television than those 30-39.
[b]	 There is a statistically significant difference between the number of hours watching televi-
sion for those between the ages of 30 and 39 and the number of hours watching television for
those 50 and older.
[c]	 There is a statistically significant difference between the number of hours watching televi-
sion for those between the ages of 40 and 49 and the number of hours watching television for
those 50 and older.
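A minimal sketch of the Bonferroni idea: multiply each raw pairwise p-value by the number of comparisons and cap the result at 1. The p-values below are illustrative, not the values behind the output above:

pairwise_p = {("30-39", "40-49"): 0.210,   # illustrative raw p-values
              ("30-39", "50+"):   0.004,
              ("40-49", "50+"):   0.009}
m = len(pairwise_p)                        # number of comparisons
for pair, p in pairwise_p.items():
    adj = min(1.0, p * m)                  # Bonferroni-adjusted p-value
    print(pair, round(adj, 3), "significant" if adj < 0.05 else "ns")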
Summary Report
This table uses descriptive statistics (means), t-tests, and ANOVA to compare city
manager characteristics. The independent variables are sex and city size (nominal/
ordinal). The dependent variables are age, education, and years as a manager
(interval/ratio).
Table 2: City manager characteristics (means) by sex and city size category.

                                    Sex                            City Size
Manager Characteristics   Total    Male    Female    P <       Small    Medium    Large    P <
Age                       46.80   49.68    41.82     .028 [a]  41.71    46.86     54.67    .004 [b]
Years of Education        16.63   16.68    16.55     .838      16.14    17.71     16.56    .152
Years as Manager          16.08   18.40    12.09     .044 [c]  15.18    16.57     17.11    .861
Interpretation
[a]	 There is a significant difference between male and female city managers in mean age.
Male city managers (49.7 years) are significantly older than female city managers (41.8 years).
[b]	 There is a significant difference among small, medium, and large cities in the mean ages of
city managers. It appears that larger size cities are likely to have older city managers than
smaller cities. This conclusion should be confirmed with a post hoc analysis.
[c]  Males have significantly greater years of experience (about 6 years more on average) as city
managers than females.
[d]  There is a significant difference among small, medium, and large cities in the mean skill
preferences of city managers. It appears that large cities are likely to have city managers who
place more importance on negotiation skills than analytical skills. In contrast, small and medium
size cities prefer analytical skills over negotiation. This conclusion should be confirmed with a
post hoc analysis.
8
Correlation
8.1 Pearson r Coefficient
Example: The following data were collected to estimate the correlation between
years of formal education and income at age 35.
The following scattergram assists in evaluating linearity and also
helps to identify problems related to extreme values.
Y: Income
       44.0|                             *
           |
           |
           |
       34.5|
           |                       *
           |
           |
           | *         *
       25.0| *
             ---------------|--------------|
           12.0           15.0          18.0
X: Education
Compute Pearson's r
                   Education    Income
                    (Years)     ($1000)
Name                   X           Y            XY        X2      Y2
         Susan        12           25          300       144     625
            Bill      14           27          378       196     729
           Bob        16           32          512       256    1024
          Tracy       18           44          792       324    1936
          Joan        12           26          312       144     676
            Σ=        72          154          2294     1064    4990
            n=         5
$r_{xy} = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$

$r_{xy} = \dfrac{5(2294) - 72(154)}{\sqrt{\left[5(1064) - 72^2\right]\left[5(4990) - 154^2\right]}} = \dfrac{382}{409.7} = .93$
Interpret
A positive coefficient indicates the values of variable A vary in the same direction as
variable B. A negative coefficient indicates the values of variable A and variable B
vary in opposite directions.
Characterizations of Pearson r
.7 to .9 high correlation
.5 to .7 moderate correlation
.3 to .5 low correlation
In this example, there is a very high positive correlation between the variation of edu-
cation and the variation of income. Individuals with higher levels of education earn
more than those with comparably lower levels of education.
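A sketch that reproduces the hand calculation above from the raw data:

import math

educ = [12, 14, 16, 18, 12]
income = [25, 27, 32, 44, 26]
n = len(educ)
sx, sy = sum(educ), sum(income)
sxy = sum(x * y for x, y in zip(educ, income))
sxx = sum(x * x for x in educ)
syy = sum(y * y for y in income)
r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
print(round(r, 3))   # ~0.932, a very high positive correlation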
Coefficient of Determination

Hypothesis Testing for Pearson r
I. Assumptions
    Ho: There is no association between annual income and education for employed
    adults.
    Ha: There is an association between annual income and education for employed
    adults.
V. Decide Results
    Since the test statistic 11.022 exceeds the critical value 2.101, there is a statisti-
    cally significant association in the national population between an employed
    adult's education and their annual income.
Software Output: Example of one bivariate comparison with a scattergram. Compar-
ing individual income in U.S. dollars to years of education.
Pearson's Correlation
INCOME: [a]
   42779.0|                    *     *
          |                 *        * *
          |                 *     * * *
          |              *     *     *
   28030.0|        *        *
          |     * *         * * * *
          | *            *
          |        *        *
          |     *        *
   12281.0| * *
            ---------------|--------------|
            4.0          12.0          20.0
          EDUC: [b]
Number of cases: 32
Missing: 0 [c]
Pearson Correlation: 0.751 [d]
p < (2-tailed signif.): 0.0000 [e]
Interpretation
[a]	 The Y axis of the scattergram. If theory suggests cause and effect, the Y axis is commonly
used for the dependent (response) variable.
[b]	 The X axis of the scattergram. If theory suggests cause and effect, the X axis is commonly
used for the independent variable.
[c]	 Since each observation (case) must have values for both income and education, any obser-
vations where one or both of these variables have no data will be removed from the analysis.
[d]  Pearson correlation coefficient representing a high positive correlation between education
and income. Interpretation: As years of education increases so does personal income.
[e]  The p-value for the 2-tailed significance test: the correlation between education and in-
come is statistically significant (p < .05).
Software Output: Example of bivariate comparisons displayed in a correlation ma-
trix. Comparing individual income, education, and months of work experience.
-----------------------------------------------
    Coeff |          Correlation Matrix        |
    Cases |-----------------------------------|
      p < |    INCOME |      EDUC | WORKEXP |
-----------------------------------------------
   INCOME | 1.000 [a]| 0.751 [d]| -0.160 [f]|
          |     32 [b]|     32    |      32    |
          |      . [c]| 0.000 [e]| 0.381 [g]|
-----------------------------------------------
     EDUC |    0.751 |     1.000 |    -0.520 |
          |       32 |         32 |         32 |
          |    0.000 |          . | 0.002 [h]|
-----------------------------------------------
 WORKEXP |    -0.160 |    -0.520 |      1.000 |
          |       32 |         32 |         32 |
          |    0.381 |     0.002 |           . |
-----------------------------------------------
        2-tailed significance tests
         '.' p-value not computed
Interpretation
[a]  The diagonal in the matrix represents Pearson correlations between the same variable,
which will always be 1 since the variables are identical. The correlations above the diagonal are
a mirror reflection of those below the diagonal, so only interpret half of the matrix (in this case
three correlation coefficients).
[b]  The number of cases (observations) used to compute each correlation.
[c]  A statistical test is not performed when comparing a variable to itself.
[d]	 Pearson correlation coefficient representing a high positive correlation between education
and income. Interpretation: As years of education increases so does personal income.
[f]  Pearson correlation coefficient representing a very weak negative correlation between work
experience and income. Interpretation: As years of work experience increases, personal income
decreases.
[g] There is not a statistically significant association between work experience and income.
[h] There is a statistically significant association between work experience and education.
8.2 Spearman Rho Coefficient
Problem: Five college students have the following rankings in math and philosophy
courses. Is there an association between student rankings in math and philosophy
courses?

$r_s = 1 - \dfrac{6\sum d^2}{n\left(n^2 - 1\right)}$

where d = the difference between the two ranks for each observation and n = the
number of paired observations
Note: When two or more observations of one variable are the same, ranks are as-
signed by averaging positions occupied in their rank order.
Example:
Score 2          3       4       4       5       6       6      6            8
Rank     1       2       3.5     3.5     5       7       7      7            9
Interpret Coefficient
There is a moderate negative correlation between the math and philosophy course
rankings of students. Students who rank high as compared to other students in their
math course generally have lower philosophy course ranks and those with low math
rankings have higher philosophy course rankings than those with high math rankings.
Note: The formulas for Pearson r and Spearman rho are equivalent when there are no
tied ranks.
Hypothesis Testing for Spearman Rho
Problem: Based on a Spearman Rho of .70 and a sample size of 20, is the associa-
tion between the music and physics rankings of students statistically significant?
I. Assumptions
Sample size is ≥ 10
      Null Hypothesis (Ho): There is no association between music and physics class
      rankings for all students in the population.
    Note: The t distribution should only be used when the sample size is 10 or more.
Convert $r_s$ into a t test statistic:

$t = r_s \sqrt{\dfrac{n - 2}{1 - r_s^2}} = .70\sqrt{\dfrac{18}{1 - .49}} = 4.159$
    Reject if the t-observed is equal to or greater than the critical value. The variables
    are/are not related in the population. In other words, the association displayed
    from the sample data can/cannot be inferred to the population.
    Since the test statistic 4.159 exceeds the critical value 2.101, there is a statisti-
    cally significant association in the national population between student music
    and physics rankings. Students who rank high in musical ability will also likely
    rank high in physics.
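A sketch of the conversion and critical value lookup:

import math
from scipy import stats

rho, n = 0.70, 20
t = rho * math.sqrt((n - 2) / (1 - rho**2))   # ~4.159
cv = stats.t.ppf(0.975, df=n - 2)             # 2.101
print(round(t, 3), round(cv, 3), t > cv)      # reject Ho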
9
Simple OLS Regression
Linear ordinary least squares regression involves predicting the score for a dependent
variable (Y) based on the score of an independent variable (X). Data are tabulated for
two variables X and Y. If a linear relationship exists between the variables, it is appro-
priate to use linear regression to base predictions of the Y variable from values of the
X variable. See the multiple regression chapter for further discussion of assumptions.
9.1 Procedure
1.   Compute the slope (b) and intercept (a) of the regression line from the data.
2.   Use the completed equation to predict a Y score from a selected X score.
3.   Determine the "standard error of the estimate" (Sxy) to evaluate the distribution
     of scores around the predicted Y score.
Data are tabulated for two variables X and Y. Use Pearson’s r to help determine if
there is a linear relationship between the variables. If a significant linear relationship
exists between the variables, it is appropriate to use linear regression to base predic-
tions of the Y variable on the relationship developed from the original data.
$\hat{y} = a + bX$

where
$\hat{y}$ = predicted score

$b = \dfrac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$   and   $a = \bar{Y} - b\bar{X}$

where
$\bar{X}$ = mean of variable X
$\bar{Y}$ = mean of variable Y

Insert the values for the intercept (a) and slope (b) of the completed equation for a
straight line and select an individual X score to predict a Y score.
Determine the Standard Error of the Estimate
The standard error of the estimate is used to provide a confidence interval around a
point estimate for predicted Y in the underlying population. In relative terms, larger
standard errors indicate less prediction accuracy for the population value than equa-
tions with smaller standard errors.
Problem: The following data were collected to estimate the correlation between
years of formal education and income at age 35 and are the same data used in an ear-
lier example to estimate Pearson r.
                    Education                Income
                     (Years)                 ($1000)
Name                    X                       Y               XY         X2
         Susan          12                      25              300       144
             Bill       14                      27              378       196
            Bob         16                      32              512       256
          Tracy         18                      44              792       324
           Joan         12                      26              312       144
          Sum=          72                     154             2294      1064
             n=          5
         Mean=         14.4                   30.8
            	
$s_e = \sqrt{\dfrac{\sum \left(y_i - \hat{y}\right)^2}{n - 2}}$

where
$y_i$ = observed value of Y, $\hat{y}$ = predicted value of Y, and n = sample size
Confidence interval for predicted Y

$CI = \hat{y} \pm (CV)(s_e) = 32.485 \pm 3.182(3.276)$

Given a 5% chance of error, the estimated income for a person with 15 years of edu-
cation will be $32,485 plus or minus $10,424 or somewhere between $22,060 and
$42,909 (the interval estimate).
Standard error of the slope

$s_b = \sqrt{\dfrac{\sum \left(y_i - \hat{y}\right)^2 / (n - 2)}{\sum \left(x_i - \bar{X}\right)^2}}$
$CI = b \pm (CV)(s_b) = 2.8 \pm 3.182(.625)$

where CV = critical value (t-distribution, 2-tailed, .05 alpha, df = n - 2) and 2.8 is the
point estimate for the slope.

or simplify to $CI_{95} = 2.8 \pm 1.99$

Interpretation
There is a probability of .95 that the slope of education in the population is between
.81 and 4.79 (roughly $810 and $4,790 per year of education).
9.2 Hypothesis testing
The null hypothesis states that X is not related to Y (i.e., the slope is 0).
tcv = 3.182

$t = \dfrac{b}{s_b} = \dfrac{2.8}{.625} = 4.48$
Decision
Reject the null hypothesis. The slope for education is statistically significant (i.e., t
test statistic of 4.48 is beyond the critical value of 3.182). As education in years in-
creases so will income.
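The full chain of calculations for this example can be reproduced with a short Python sketch (small differences from the text reflect its intermediate rounding):

import math

x = [12, 14, 16, 18, 12]    # education (years)
y = [25, 27, 32, 44, 26]    # income ($1000s)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx  # slope, ~2.81
a = ybar - b * xbar                                               # intercept
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
se = math.sqrt(sum(r ** 2 for r in resid) / (n - 2))   # std error of estimate, ~3.28
sb = se / math.sqrt(sxx)                               # std error of slope, ~.628
print(round(b, 2), round(a, 2), round(b / sb, 2))      # t = b / sb, ~4.47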
9.3 Evaluating the power of the regression model
$\hat{y}_i - \bar{Y}$ = deviation explained by X      (Explained Dev.)

$y_i - \hat{y}_i$ = deviation not explained by X      (Unexplained Dev.)
Example of Explained and Unexplained Deviations Using One Observation
$x_i$ = Tracy's education, 18 years
$\hat{y}$ = Tracy's predicted income, $40.9 (thousands)
$y_i$ = Tracy's observed income, $44; $\bar{Y}$ = mean income, $30.8
Explained deviation: $\hat{y} - \bar{Y}$ = 40.9 - 30.8 = 10.1
Unexplained deviation: $y_i - \hat{y}$ = 44 - 40.9 = 3.1
The formula for estimating deviations for all observations is as follows:
$\sum \left(Y_i - \bar{Y}\right)^2$ = the total deviation of Y

$\sum \left(\hat{Y}_i - \bar{Y}\right)^2$ = deviation explained by X

$\sum \left(Y_i - \hat{Y}_i\right)^2$ = deviation not explained by X

$R^2 = \dfrac{\sum \left(\hat{Y}_i - \bar{Y}\right)^2}{\sum \left(Y_i - \bar{Y}\right)^2}$
Example:
Impact of R2 on predictions:
R2 is sample specific:
Two samples with the same variables, slope, and intercept could have different R2 be-
cause of the fit between the data and the regression line (different variation in Y; see
formula).
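A sketch of the R² calculation for the five-case example, using the predicted values from the fitted line:

y = [25, 27, 32, 44, 26]
yhat = [24.06, 29.68, 35.29, 40.91, 24.06]   # predicted values from the line
ybar = sum(y) / len(y)
explained = sum((yh - ybar) ** 2 for yh in yhat)
total = sum((yi - ybar) ** 2 for yi in y)
print(round(explained / total, 3))           # R-squared ~ .87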
Software Output: Regressing individual income on education from survey data.
ANOVA Statistics
Coefficients
Estimated Model
Interpretation	 	
[a]	 The correlation coefficient.
[b]	 Education explains 56% of the variation in income.
[c]	 Standard error of the estimate. Used for creating confidence intervals for predicted Y.
[d]	 The regression model is statistically significant.
[e]  The Y intercept: the mean value of income when education is equal to 0.
[f]	 The impact of a one-unit change in education on income. Interpretation: One additional
year of education will on average result in an increase in income of $732.
[g]	 Standard error of the slope coefficient. Used for creating confidence intervals for b and for
calculating a t-statistic for significance testing.
[h]	 Standardized regression partial slope coefficients. Used only in multiple regression models
(more than one independent variable), beta-weights in the same regression model can be com-
pared to one another to evaluate relative effect on the dependent variable. Beta-weights will indi-
cate the direction of the relationship (positive or negative). Regardless of whether negative or
positive, larger beta-weight absolute values indicate a stronger relationship.
[i]	 The t-test statistic. Calculated by dividing the b coefficient by the standard error of the
slope.
[j]	   Education has a statistically significant positive association with income.
10
Multiple OLS Regression
Multiple regression is an extension of the simple regression model that allows the in-
corporation of more than one independent variable into the explanation of Y. Multiple
regression helps clarify the relationship between each independent variable and Y by
holding constant the effects of the other independent variables in the model.
General equation:
Interpretation:
We continue to use the least squares approach for fitting the data based on the for-
mula for the straight line; that is, the least sum of the squared differences between ob-
served Y and predicted Y.

$ESS = \sum \left(Y_i - \hat{Y}_i\right)^2$
Partial slope equation (Three variable example)
Process:
2. Measure the portion of X1 not explained by X2, which is a simple application of the bi-
variate model through the following process:

$\hat{X}_1 = a + bX_2$, analogous to $\hat{y} = a + bX$

    Compute the prediction error $u = X_1 - \hat{X}_1$, where u represents the portion of X1 which X2 can-
    not explain.

$\hat{Y} = a + bX_2$

    Compute the prediction error $v = Y - \hat{Y}$, where v represents the portion of Y which X2 can-
    not explain.
3. The above is incorporated into the formula for b1:

$b_1 = \dfrac{\sum \left(X_1 - \hat{X}_1\right)\left(Y - \hat{Y}\right)}{\sum \left(X_1 - \hat{X}_1\right)^2}$
This equation results in a measure of the slope for X1 that is independent of the linear
effect of X2 on X1 and Y. Put in more applied terms, a multiple regression model al-
lows us to evaluate the spuriousness of relationships. As an example, we found in
our bivariate model that education has a significant causal relationship with income.
Given the simplicity of this model, we don't know if the relationship might disappear if
we controlled for employee experience. In adding experience to our model, we face
four possible outcomes:
1.   Education is significant but experience is not (evidence the bivariate relationship
     holds after controlling for experience).
2.   Experience is significant but education is not (evidence the bivariate relationship
     between education and income is spurious).
3.   Both education and experience are significant (evidence bivariate relationship be-
     tween education and income is not spurious).
4.   Neither are significant (evidence of high correlation between education and expe-
     rience).
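The two-stage logic can be sketched directly with numpy. The experience values below are hypothetical (the Guide's raw data table is not reproduced here); the point is that the residual-on-residual slope equals the multiple-regression partial slope:

import numpy as np

y  = np.array([25.0, 27.0, 32.0, 44.0, 26.0])  # income
x1 = np.array([12.0, 14.0, 16.0, 18.0, 12.0])  # education
x2 = np.array([ 5.0,  9.0,  8.0, 10.0,  3.0])  # experience (hypothetical)

def residuals(target, predictor):
    """Residuals from a bivariate OLS regression of target on predictor."""
    slope, intercept = np.polyfit(predictor, target, 1)
    return target - (intercept + slope * predictor)

u = residuals(x1, x2)                 # portion of X1 that X2 cannot explain
v = residuals(y, x2)                  # portion of Y that X2 cannot explain
b1 = np.sum(u * v) / np.sum(u ** 2)   # partial slope for X1
print(round(b1, 3))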
Partial slope example
Data
ANOVA Statistics
                  Sum of Sqs      df         Mean Sq          F       p <
--------------------------------------------------------------------------
  Regression         233.915       2         116.957     18.153    0.0522
    Residual          12.885       2           6.443
       Total         246.800
Coefficients
Variable         b Coeff.     std. error      BetaWgt         t       p <
--------------------------------------------------------------------------
    Constant     -14.981           7.739                 -1.936    0.1925
   Education       2.900           0.490        0.962     5.925    0.0273
  Experience       0.692           0.400        0.281     1.732    0.2255
Interpreting the Parameter Estimates
Intercept (a0):	 	   The average value of income when all independent variables are
equal to "0" (-14.981 or -$14,981).
Education (b1):	     When we hold experience constant, a one unit change in education
results in an average change in income of 2.9k ($2,900).
Experience (b2):	 When we hold education constant, a one unit change in experience
results in an average change in income of .692k ($692).
Significance tests and confidence intervals for the partial slope coefficients use the
same general procedure as the bivariate (simple) regression model.
          n = sample size
          k = number of independent variables
df = n - k - 1 = 5 - 2 - 1 = 2
The equation for the 95% confidence interval for the education slope is the following:

$CI_{95} = 2.9 \pm 4.303(.490) = 2.9 \pm 2.1$

This is interpreted as there is a probability of .95 that the slope of education in the
population is between .800 and 5 ($800 and $5,000).
The significance test for the two slope coefficients uses the same degrees of free-
dom, sampling distribution, and critical value (tcv=4.303) as applied to the confidence
interval calculation.
Predicting Y

$\hat{Y} = a + b_1 X_1 + b_2 X_2 = -14.981 + 2.9(13) + .692(10) = 29.639$

using the estimated model and data for Jim, who has 13 years of education and 10
years of experience. Jim's predicted average income is $29,639 (this is the predicted
point estimate for Jim).
Again using the same degrees of freedom, sampling distribution, and critical value.
There is a probability of .95 that someone in the population with 13 years of educa-
tion and 10 years experience would earn between $18,720 and $40,560. Obviously
this model would not be very useful for predicting incomes in the population. This is
not surprising given the very small sample size (n=5).
Adjusted R2
It has become standard practice to report the adjusted R2, especially when there are
multiple models presented with varying numbers of independent variables.
Adjusted R2 formula

$\bar{R}^2 = \left(R^2 - \dfrac{k}{n - 1}\right)\left(\dfrac{n - 1}{n - k - 1}\right)$

$\bar{R}^2 = \left(.948 - \dfrac{2}{5 - 1}\right)\left(\dfrac{5 - 1}{5 - 2 - 1}\right) = .896$
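As a sketch, the adjustment is a one-liner:

def adjusted_r2(r2, n, k):
    """Adjusted R-squared using the Guide's formula above."""
    return (r2 - k / (n - 1)) * ((n - 1) / (n - k - 1))

print(round(adjusted_r2(0.948, n=5, k=2), 3))   # 0.896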
Partial slope coefficients are not directly comparable to one another. This is because
each variable is likely based on a different metric (age, income, sex). Beta weights
(standardized slope coefficients) are used to compare the relative importance of re-
gression coefficients in a model. They are not useful for comparisons among models
from other samples.
Formula:
$\beta_i = b_i \left( \frac{s_{x_i}}{s_y} \right)$
or partial slope * (std dev. of Xi ÷ std dev. of Y)
Example:
Education standard deviation: $s_{x_i} = 2.88$
Begin salary standard deviation: $s_y = 7870.64$

$\beta_i = b_i \left( \frac{s_{x_i}}{s_y} \right) = 1878.2 \left( \frac{2.88}{7870.64} \right) = .687$
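The same conversion in Python (a minimal sketch using only the slope and standard deviations above; the function name is ours):

# Beta weight: rescale a partial slope by the ratio of the standard
# deviation of the predictor to the standard deviation of Y.
def beta_weight(b, sd_x, sd_y):
    return b * (sd_x / sd_y)

# Values from the example: b = 1878.2, s_x = 2.88, s_y = 7870.64
print(round(beta_weight(1878.2, 2.88, 7870.64), 3))  # prints 0.687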
F-statistic
I. Assumptions
Ho: The regression coefficients taken together simultaneously are equal to zero.
Ha: The regression coefficients taken together simultaneously are not all equal to zero.

$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
$H_a:$ at least one $\beta_i \neq 0$
    Formula (A): This shows that the F-ratio is related to the explanatory power of
    the entire regression equation
        "           %" n − k −1 %
                    '$ (       )
              2
            R
     F =$                        '
        $ (1− R 2 ) '#    k      &
        #           &
Formula (B): This formula is more relevant for common regression output.
    $F = \frac{\sum (\hat{Y}_i - \bar{Y})^2 / k}{\sum (Y_i - \hat{Y}_i)^2 / (n-k-1)}$
    The mean sum of squares for what the model explains divided by the mean sum
    of squares for what the model does not explain.
    Example:
    $F = \left( \frac{R^2}{1-R^2} \right) \left( \frac{n-k-1}{k} \right) = \left( \frac{.446}{1-.446} \right) \left( \frac{471}{2} \right) = 189.59$
or
    $F = \frac{\sum (\hat{Y}_i - \bar{Y})^2 / k}{\sum (Y_i - \hat{Y}_i)^2 / (n-k-1)}$
V. Decision
    Note: It is possible to have independent variables that individually are not statisti-
    cally significant but as a group do have a significant relationship with the depend-
    ent variable. Lack of significance for each independent variable may be due to
    high correlation with other independent variables in the model (multicollinearity).
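The F-ratio itself is easy to verify in Python (a minimal sketch; the function name is ours, and n = 474 follows from n − k − 1 = 471 with k = 2 in the example above):

# F-ratio from the model R-squared (Formula A above).
def f_from_r2(r2, n, k):
    return (r2 / (1 - r2)) * ((n - k - 1) / k)

# Values from the example: R2 = .446, k = 2, n - k - 1 = 471
print(round(f_from_r2(0.446, 474, 2), 2))  # prints 189.59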
Dummy Variables
         The partial slope coefficient represents the average change in the
         dependent variable between the coded value of 1 for the inde-
         pendent variable and the reference group of 0.
Always use k − 1 dummy variables in a regression model, where k is equal to the num-
ber of categories in the variable. In the following example, a dummy variable called
“Male” is created where 1=Male and 0=Female, the reference group. There is no
need to create a dummy variable for Female (k = 2, so k − 1 = 1 dummy variable).
Original Variable
Example B: Creating a dummy variable from a three-category variable
For a three-category variable, two dummy variables are created to represent three
categories.
Original Variable
Clerical Dummy: recoded into 0/1 variable (0=Custodial and Manager; 1=Clerical)
Custodial Dummy: recoded into 0/1 variable (0=Clerical and Manager; 1=Custodial)
Managers are the reference group, represented in the model when the clerical and
custodial dummy variables are both equal to zero in the data.
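Statistical software automates this recoding. A minimal sketch in Python with pandas (assumed available; the data are hypothetical):

import pandas as pd

# k-1 dummy coding for a three-category variable; dropping the
# Manager column makes managers the reference group.
jobs = pd.Series(["Clerical", "Custodial", "Manager", "Clerical"])
dummies = pd.get_dummies(jobs, dtype=int).drop(columns="Manager")
print(dummies)  # Clerical and Custodial columns of 0/1 values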
Software Output: A multiple regression example regressing individual income on
education in years, work experience in months, and a dummy variable representing
sex (0=Female, 1=Male).
ANOVA Statistics
               Sum of Sqs   df         Mean Sq        F     p <
-----------------------------------------------------------------
Regression 419605192.680     3   139868397.560   27.709 0.0000 [c]
  Residual 141339012.195    28     5047821.864
     Total 560944204.875    31
Coefficients
Variable     b Coeff.    std. error   BetaWgt       t        p <
-----------------------------------------------------------------
Constant -338.10507      2027.91657               -0.167   0.8688
EDUC       870.14643 [d] 108.50630    0.89240 [e] 8.019    0.0000 [f]
WORKEXP    193.70166 [g]   82.88935   0.26208      2.337   0.0268
SEX       2821.21837 [h] 803.62244    0.33626      3.511   0.0015
Estimated Model
INCOME = -338.105 + 870.146(EDUC) + 193.702(WORKEXP)
                  + 2821.218(SEX)
Interpretation
[a] 	 Adjusted R2. Education, experience, and a person’s sex explain 72% of the variation in indi-
vidual income.
[b]	 Standard error of the estimate. Used for creating confidence intervals for predicted Y.
[c]	 The F-test is statistically significant; taken together, the independent variables explain a
significant portion of the variation in income.
[d]	 The impact of a one-unit change in education on income. One additional year of education
will on average result in an increase in income of $870 holding experience and sex constant.
[e]	 Standardized regression partial slope coefficients. Used only in multiple regression models
(more than one independent variable), beta-weights in the same regression model can be com-
pared to one another to evaluate relative effect on the dependent variable. Beta-weights will indi-
cate the direction of the relationship (positive or negative). Regardless of whether negative or
positive, larger beta-weight absolute values indicate a stronger relationship. In this model, edu-
cation has the greatest effect on income followed in order by sex and experience.
[f]	 Education has a statistically significant association with income. The probability of a type
one error is less than .0001, which is far below the alpha of .05. Also, experience and sex are
statistically significant.
[g]	 One additional month of experience will on average result in an increase in income of $194
holding education and sex constant.
[h] 	 Males will on average earn an income of $2,821 more than women holding education and
experience constant.
11
Regression Assumptions
Although ordinary least squares multiple regression is a very robust statistical tech-
nique, careful consideration should be given to the assumptions underlying obtaining
the Best Linear Unbiased Estimates (BLUE) from a regression model. In practice, it is
very difficult to completely satisfy all the assumptions, but there are techniques to
help adjust for their effects on the model. A few suggested corrections are provided in
the following brief review.
1. No measurement error
The independent (X) and dependent (Y) variables are accurately measured. Non-
random error (e.g., measuring the wrong income for a defined group of people) will se-
riously bias the model. Random error is the more common assumption violation. Ran-
dom error in an independent variable may lower R2, and the partial slope coefficients
can vary dramatically depending on the amount of error. The coefficients of other inde-
pendent variables that do not have random measurement error will be biased if those
variables are correlated with an independent variable that does have measurement
error. Error in measuring the dependent variable may not bias the estimates if the
error is random.
2. No specification error
The theoretical model is assumed to be a) linear, b) additive, and c) includes the cor-
rect variables.
Non-Linearity
Linear implies that the average change in the dependent variable associated with a
one unit change in an independent variable is constant regardless of the level of the
independent variable. If the partial slope for X is not constant for differing values of X,
X has a nonlinear relationship with Y and results in biased partial slopes. There are
two basic types of nonlinearity discussed in the following.
Not linear and the slope does not change direction
In this example the variable’s slope is positive but the steepness of the curve in-
creases as the value of X increases.
Correction: A log-log model takes a nonlinear specification where the slope changes
as the value of X increases and makes it linear in terms of interpreting the parameter
estimates. It accomplishes this by transforming the dependent and all independent
variables by taking their log and replacing the original variables with the logged vari-
ables. The result is coefficients that are interpreted as a % change in Y given a 1%
change in X.
Example:
Note: All data must have positive values. The log of zero or a negative value is unde-
fined, which as a result biases the model. After modeling, the antilog of the coefficients
can be used to estimate y.
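A minimal sketch of the transformation in Python (statsmodels assumed available; the data are hypothetical positive values):

import numpy as np
import statsmodels.api as sm

# Log-log model: regress log(y) on log(x); the slope is then read as
# the % change in y associated with a 1% change in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 8.3, 19.0, 33.5, 52.0])  # curve steepens as x increases
X = sm.add_constant(np.log(x))
model = sm.OLS(np.log(y), X).fit()
print(model.params)  # [intercept, elasticity of y with respect to x]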
Not linear and the slope changes direction (pos or neg)
In this example the variable’s slope changes direction depending on the value of X.
Correction: A polynomial model may be used to correct for changes in the sign of the
slope (positive or negative). This is accomplished by adding variables that are incre-
mental powers of the independent variable to model bends in the slope.
Example:
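A minimal sketch in Python (numpy assumed available; the data are hypothetical and rise then fall):

import numpy as np

# Polynomial model: add X squared as a second predictor so the
# fitted slope can change direction as X increases.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.5, 4.2, 4.0, 3.1, 1.5])
X = np.column_stack([np.ones_like(x), x, x**2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # [intercept, slope on X, slope on X squared (negative here)]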
Non-Additivity
Additive implies that the average change in the dependent variable associated with a
one unit change in an independent variable (X1) is constant regardless of the value of
another independent variable (X2) in the model. If this assumption is violated, we can
no longer interpret the slope by saying "holding other variables constant" since the
values of the other variables may possibly change the slope coefficient and therefore
its interpretation.
The figure above displays a non-additive relationship when (X1) is interval/ratio and
(X2) is a dummy variable. If the partial slope for (X1) is not constant for differing values
of (X2), (X1) and (X2) do not have an additive relationship with Y.
Correction: An interaction term may be added to the model using a dummy variable
where the slope of X1 is thought to depend on the value of a dummy variable X2. The
final model will look like the following:
Model:
$\hat{Y} = a + b_1 X_1 + b_2 (X_1 \times X_2) + b_3 X_2$
Interpretation:
b1 is interpreted as the slope for X1 when the dummy variable (X2) is 0
b1 + b2 is interpreted as the slope for X1 when the dummy variable (X2) is 1
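A minimal sketch in Python (numpy assumed available; the data are hypothetical and built so the slope of X1 doubles when X2 = 1):

import numpy as np

# Interaction term: include X1 * X2 so the slope of X1 can
# depend on the value of the dummy variable X2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # dummy variable
y = np.array([2.0, 3.0, 4.0, 5.0, 2.5, 4.5, 6.5, 8.5])
X = np.column_stack([np.ones_like(x1), x1, x1 * x2, x2])
a, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b1, b1 + b2)  # slope of X1 when X2 = 0, and when X2 = 1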
[Figure: model without interaction term]
Including the correct independent variables implies that an irrelevant variable has not
been included in the model and all theoretically important relevant variables are in-
cluded. Failing to include the correct variables in the model will bias the slope coeffi-
cients and may increase the likelihood of improperly finding statistical significance.
Including irrelevant variables will make it more difficult to find statistical significance.
Correction: Remove irrelevant variables and, if possible, include missing relevant vari-
ables.
3. Mean error equals zero
When the mean error (reflected in residuals) is not equal to zero, the y intercept may
be biased. Violation of this assumption will not affect the slope coefficients. The par-
tial slope coefficients will remain Best Linear Unbiased Estimates (BLUE).
4. Error term is normally distributed
The distribution of the error term closely reflects the distribution of the dependent vari-
able. If the dependent variable is not normally distributed, the error term may not be
normally distributed. Violation of this assumption will not bias the partial slope coeffi-
cients but may affect significance tests.
Correction: Always correct other problems first and then re-evaluate the residuals.
★ If the distribution of residuals is skewed to the right (higher values), try using the
   natural log of the dependent variable
★ If the distribution of residuals is skewed to the left (lower values), try squaring the
   dependent variable.
Always check the distribution of the residuals again after trying a correction.
5. Homoskedasticity
The variance of the error term is constant (homoskedastic) for all values of the inde-
pendent variables. Heteroskedasticity occurs when the variance of the error term is
not constant. The parameter estimates for partial slopes and the intercept are not
biased if this assumption is violated; however, the standard errors are biased and
hence significance tests may not be valid.
Diagnosis Of Heteroskedasticity
Plot the regression residuals against the values of the independent variable(s). If the
residuals form an even pattern about a horizontal axis, heteroskedasticity is unlikely.
For small samples there may be some tapering at each end of the horizontal distribu-
tion.
If there is a cone or bow tie shaped pattern, heteroskedasticity is suspected.
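A minimal sketch of the diagnostic plot in Python (numpy and matplotlib assumed available; the data are hypothetical, with error variance that grows with X):

import numpy as np
import matplotlib.pyplot as plt

# Plot residuals against X; a widening (cone-shaped) spread
# suggests heteroskedasticity.
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 100)
y = 2 + 3 * x + rng.normal(0, x)      # error variance grows with x
b, a = np.polyfit(x, y, 1)            # slope, intercept
residuals = y - (a + b * x)
plt.scatter(x, residuals)
plt.axhline(0)
plt.xlabel("X")
plt.ylabel("Residual")
plt.show()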
6. No autocorrelation
No autocorrelation assumes that the error terms are not correlated across observa-
tions. Violation of this assumption is likely to be a problem with time-series data
where the value of one observation is not completely independent of another observa-
tion. (Example: A simple two-year time series of the same individuals is likely to find
that a person's income in year 2 is correlated with their income in the prior year.) If
there is autocorrelation, the parameter estimates for the partial slopes and the inter-
cept are not biased but the standard errors are biased and hence significance tests
may not be valid.
Diagnosis
Use the Durbin-Watson (d) statistic; values near 2 suggest no autocorrelation, while
values approaching 0 or 4 suggest positive or negative autocorrelation, respectively.
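A minimal sketch of the d statistic in Python (numpy assumed available; the residual series is hypothetical):

import numpy as np

# Durbin-Watson d: sum of squared successive residual differences
# divided by the sum of squared residuals.
def durbin_watson(resid):
    resid = np.asarray(resid)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

resid = np.array([0.5, -0.3, 0.8, -0.6, 0.2, -0.4])
print(round(durbin_watson(resid), 2))  # near 2 suggests no autocorrelation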
Correction: Use generalized least squares (GLS) or weighted least squares (WLS)
models to create coefficients that are BLUE.
7. No Multicollinearity
Multicollinearity occurs when two or more independent variables are highly corre-
lated with one another.
Diagnosis
★ Failing to find any variables statistically significant yet the F-statistic shows the
   model is significant.
★ Examine covariation among the independent variables by calculating all possible
   bivariate combinations of the Pearson correlation coefficient, as in the sketch be-
   low. Generally a high correlation coefficient (say .80 or greater) suggests a prob-
   lem. This is imperfect since multicollinearity may not be reflected in a bivariate
   correlation matrix.
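A minimal sketch of the correlation-matrix diagnosis in Python (numpy assumed available; the predictors are simulated so that two of them are nearly collinear):

import numpy as np

# Correlations among the independent variables only; off-diagonal
# values of .80 or greater suggest a multicollinearity problem.
rng = np.random.default_rng(7)
x1 = rng.normal(size=50)
x2 = 0.95 * x1 + rng.normal(scale=0.2, size=50)  # nearly collinear with x1
x3 = rng.normal(size=50)
print(np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False).round(2))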
Correction:
★ Increase sample size to lower standard errors. This doesn't always work and is
   normally not feasible since adding more cases is not a simple exercise in most
   studies.
★ Combine two or more variables that are highly correlated into a single indicator of
   an abstract concept.
★ Delete one of the variables that are highly correlated. This may result in a poorly
   specified model.
★ Leave all the variables in the model and rely on the joint hypothesis F-test to evaluate
   the significance of the model. This is especially useful if multicollinearity is causing
   most if not all of the independent variables to be not significant.
Summary
As noted at the beginning of this section, it is very difficult to completely satisfy all of
the assumptions required in regression modeling. The best starting point in design-
ing an adequate model is theory. Theory should drive what variables are included in
the regression model and it will also help inform the researcher on the possible inter-
actions among the independent variables. A model that does not perform as antici-
pated by theory suggests either the theory was wrong or that the regression esti-
mates and significance tests are being biased by one or more assumption violations.
The following page has a brief summary of the assumption violations, their effects on
the model, and possible corrective action.
BLUE Check List: Diagnosing and correcting violations of regression assumptions
   Violation                   b    Se   R2   Diagnosis                         Correction
   ---------------------------------------------------------------------------------------
   Random error in DV               +    -    data errors; unreliable measure  clean errors; find better measure
   Random error in IV          B         -    data errors; unreliable measure  clean errors; find better measure
   Residuals not normally           B         plot histogram of residuals      positive skew > log DV;
   distributed                                                                 negative skew > sqr DV
B=Biased; ‘+’ =Increases; ‘-’ =Decreases; b=slope coefficient; Se=Standard Error; DV=Dependent Variable;
IV=Independent Variable
12
Logistic Regression
Logistic regression is used for multiple regression models where the dependent vari-
able is dichotomous (0 or 1 values); multinomial logistic regression is used when the
dependent variable has more than two nominal categories. By convention, the depend-
ent variable should be coded 0 and 1 where 1 represents the value of greatest interest.
Assumptions:
See multiple regression for a more complete list of ordinary least squares (OLS) re-
gression assumptions. In addition to the key characteristic that the dependent vari-
able is discrete rather than continuous, the following assumptions for logistic regres-
sion differ from those for OLS regression:
Linearity between the independent and dependent variables is not required in logistic
regression. It does require that the independent variables be linearly related to the
logit of the dependent variable. If the relationship is not linear, the model may not find
statistical significance when it actually exists (Type II error).
Unlike OLS regression, large sample size is needed for logistic regression. If there is
difficulty in converging on a solution, a large number of iterations, or very large regres-
sion coefficients, there may be insufficient sample size.
Software Output: Evaluating the effect of age, education, sex, and race on whether
or not a person voted in the presidential election.
Explanatory Model
VOTE92 = Constant + AGE + EDUC + FEMALE + WHITE + e
Coefficients
Variable   b Coeff.          SE       Wald         p <         OR
------------------------------------------------------------------
Constant    -4.1620      0.4184    98.9755      0.0000
     AGE     0.0335 [d] 0.0040     69.1386 [e] 0.0000 [f] 1.0340 [g]
    EDUC     0.2809      0.0244   132.4730      0.0000      1.3243
  FEMALE    -0.1249      0.1269     0.9686      0.3250      0.8826
   WHITE     0.1000      0.1671     0.3581      0.5495      1.1052
------------------------------------------------------------------
Estimated Model
VOTE92 = -4.1620 + .0335(AGE) + .2809(EDUC) + -.1249(FEMALE)
                 + .1000(WHITE)
                       Predicted Y
Observed Y             0           1        % Correct
                +-----------+-----------+
           0    |        103 |        314 |     24.70 [h]
                +-----------+-----------+
           1    |         64 |        964 |     93.77 [i]
                +-----------+-----------+----------
                                     Total      73.84 [j]
Interpretation
[a] 	 This ranges from 0 to less than 1 and indicates the explanatory value of the model. The
closer the value is to 1 the greater the explanatory value. Analogous to R2 in OLS regression.
[b]	 Indicates explanatory value of the model. Ranges from 0 to 1 and adjusts for the number
of independent variables. Analogous to Adjusted R2 in OLS regression. In this model, the ex-
planatory value is relatively low.
[c]	 Tests the combined significance of the model in explaining the dependent (response) vari-
able. The significance test is based on the Chi-Square sampling distribution. Analogous to the
F-test in OLS regression. The independent variables in this model have a statistically significant
combined effect on voting behavior.
[d]	 A one unit increase in age (1 year) results in an average increase of .0335 in the log odds of
vote equaling 1 (voted in election).
[e]	 The test statistic for the null hypothesis that the b coefficient is equal to 0 (no effect). It is
calculated by (B/SE)².
[f]	 Based on the Chi-Square sampling distribution, there is a statistically significant relation-
ship between age and voting behavior. The probability of voting increases with age, holding educa-
tion, sex, and race constant.
[g]	 OR=Odds Ratio. For each one unit increase in age, the odds of the dependent variable
(vote) equaling 1 are multiplied by 1.03. Note: if the OR is >1, odds increase; if OR <1, odds
decrease; if OR =1, odds are unchanged by this variable. In this example, the odds of voting
increase with a one unit increase in age.
[h]	 Of those who did not vote, the model correctly predicted non-voting in 24.7% of the non-
voters included in the sample data.
[i]	 Of those who did vote, the model correctly predicted voting in 93.8% of the voters in-
cluded in the sample data.
[j]	 Of those who did or did not vote, the model correctly predicted 73.8% of the voting deci-
sions.
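The odds ratios in the output are simply the exponentiated b coefficients, which is easy to verify (a minimal sketch using the coefficients above; small differences from the OR column are rounding):

import math

# Odds ratio = exp(b) for each logit coefficient.
coefficients = {"AGE": 0.0335, "EDUC": 0.2809, "FEMALE": -0.1249, "WHITE": 0.1000}
for name, b in coefficients.items():
    print(name, round(math.exp(b), 4))
# AGE 1.0341, EDUC 1.3243, FEMALE 0.8826, WHITE 1.1052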
13
Tables
Z Distribution Critical Values
The area beyond Z is the proportion of the distribution beyond the critical value (region of rejec-
tion). Example: If a test at .05 alpha is conducted, the area beyond Z for a two-tailed test is .025
(Z-value 1.96). For a one-tailed test the area beyond Z would be .05 (Z-value 1.65).
T Distribution Critical Values (2-Tailed)
Chi-square Distribution Critical Values
                                                 Alpha
    df       .10               .05                .02                  .01      .001
    1        2.706            3.841              5.412                 6.635   10.827
    2        4.605            5.991              7.824                 9.210   13.815
    3        6.251            7.815              9.837                11.345   16.266
    4        7.779            9.488             11.688                13.277   18.467
    5        9.236           11.070             13.388                15.086   20.515
    6       10.645           12.592             15.033                16.812   22.457
    7       12.017           14.067             16.622                18.475   24.322
    8       13.362           15.507             18.168                20.090   26.125
    9       14.684           16.919             19.679                21.666   27.877
   10       15.987           18.307             21.161                23.209   29.588
   11       17.275           19.675             22.618                24.725   31.264
   12       18.549           21.026             24.054                26.217   32.909
   13       19.812           22.362             25.472                27.688   34.528
   14       21.064           23.685             26.873                29.141   36.123
   15       22.307           24.996             28.259                30.578   37.697
   16       23.542           26.296             29.633                32.000   39.252
   17       24.769           27.587             30.995                33.409   40.790
   18       25.989           28.869             32.346                34.805   42.312
   19       27.204           30.144             33.687                36.191   43.820
   20       28.412           31.410             35.020                37.566   45.315
   21       29.615           32.671             36.343                38.932   46.797
   22       30.813           33.924             37.656                40.289   48.268
   23       32.007           35.172             38.968                41.638   49.728
   24       33.196           36.415             40.270                42.980   51.179
   25       34.382           37.652             41.566                44.314   52.620
   26       35.563           38.885             42.856                45.642   54.052
   27       36.741           40.113             44.140                46.963   55.476
   28       37.916           41.337             45.419                48.278   56.893
   29       39.087           42.557             46.693                49.588   58.302
   30       40.256           43.773             47.962                50.892   59.703
F Distribution Critical Values
                                            Alpha .05
Denominator DF
                                                Numerator DF
 DF       1        2        3           4          5        10        30      60      120     500
   5    6.608    5.786    5.409      5.192       5.050    4.735     4.496    4.431   4.398   4.373
   7    5.591    4.737    4.347      4.120       3.972    3.637     3.376    3.304   3.267   3.239
  10    4.965    4.103    3.708      3.478       3.326    2.978     2.700    2.621   2.580   2.548
  13    4.667    3.806    3.411      3.179       3.025    2.671     2.380    2.297   2.252   2.218
  15    4.543    3.682    3.287      3.056       2.901    2.544     2.247    2.160   2.114   2.078
  20    4.351    3.493    3.098      2.866       2.711    2.348     2.039    1.946   1.896   1.856
  25    4.242    3.385    2.991      2.759       2.603    2.236     1.919    1.822   1.768   1.725
  30    4.171    3.316    2.922      2.690       2.534    2.165     1.841    1.740   1.683   1.637
  35    4.121    3.267    2.874      2.641       2.485    2.114     1.786    1.681   1.623   1.574
  40    4.085    3.232    2.839      2.606       2.449    2.077     1.744    1.637   1.577   1.526
  45    4.057    3.204    2.812      2.579       2.422    2.049     1.713    1.603   1.541   1.488
  50    4.034    3.183    2.790      2.557       2.400    2.026     1.687    1.576   1.511   1.457
  55    4.016    3.165    2.773      2.540       2.383    2.008     1.666    1.553   1.487   1.431
  60    4.001    3.150    2.758      2.525       2.368    1.993     1.649    1.534   1.467   1.409
 120    3.920    3.072    2.680      2.447       2.290    1.910     1.554    1.429   1.352   1.280
 500    3.860    3.014    2.623      2.390       2.232    1.850     1.482    1.345   1.255   1.159
                           This is only a portion of a much larger table.
                                            Alpha .01
Denominator DF
                                                 Numerator DF
 DF       1         2        3           4          5        10        30     60      120     500
   5   16.258    13.274   12.060      11.392     10.967   10.051     9.379   9.202   9.112   9.042
   7   12.246     9.547    8.451       7.847      7.460    6.620     5.992   5.824   5.737   5.671
  10   10.044     7.559    6.552       5.994      5.636    4.849     4.247   4.082   3.996   3.930
  13    9.074     6.701    5.739       5.205      4.862    4.100     3.507   3.341   3.255   3.187
  15    8.683     6.359    5.417       4.893      4.556    3.805     3.214   3.047   2.959   2.891
  20    8.096     5.849    4.938       4.431      4.103    3.368     2.778   2.608   2.517   2.445
  25    7.770     5.568    4.675       4.177      3.855    3.129     2.538   2.364   2.270   2.194
  30    7.562     5.390    4.510       4.018      3.699    2.979     2.386   2.208   2.111   2.032
  35    7.419     5.268    4.396       3.908      3.592    2.876     2.281   2.099   2.000   1.918
  40    7.314     5.179    4.313       3.828      3.514    2.801     2.203   2.019   1.917   1.833
  45    7.234     5.110    4.249       3.767      3.454    2.743     2.144   1.958   1.853   1.767
  50    7.171     5.057    4.199       3.720      3.408    2.698     2.098   1.909   1.803   1.713
  55    7.119     5.013    4.159       3.681      3.370    2.662     2.060   1.869   1.761   1.670
  60    7.077     4.977    4.126       3.649      3.339    2.632     2.028   1.836   1.726   1.633
 120    6.851     4.787    3.949       3.480      3.174    2.472     1.860   1.656   1.533   1.421
 500    6.686     4.648    3.821       3.357      3.054    2.356     1.735   1.517   1.377   1.232
                            This is only a portion of a much larger table.
14
Appendix
Basic Formulas
Basic Formulas
Population Mean
$\mu = \frac{\sum X}{N}$
Sample Mean
$\bar{X} = \frac{\sum X}{n}$
Population Variance
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Sample Variance
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Z Score
$Z = \frac{X - \bar{X}}{s}$
Standard Error--Mean
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
Standard Error--Proportion
$s_p = \sqrt{\frac{pq}{n}}$
Glossary of Abbreviated Terms
$\bar{X}$ Mean of a sample
F The F ratio
Fe Expected frequency
Fo Observed frequency
Ho Null Hypothesis
Ha Alternate Hypothesis
P Proportion
r2 Coefficient of determination
s2 Sample variance
$\alpha$ Alpha-- probability of Type I error
$\sigma^2$ Population variance
$\Sigma$ "Summation of"
$\chi^2$ Chi-square statistic
$\infty$ Infinity
Order of Mathematical Operations
First
All operations in parentheses, brackets, or braces, are calculated from the inside out
Second
All exponents, including squares and square roots, working from left to right
Third
All multiplication and division, then all addition and subtraction, working from left to right
A priori
Decisions that are made before the results of an inquiry are known. In hy-
pothesis testing, testable hypotheses and decisions regarding alpha and
whether a one-tailed or two-tailed test will be used should be established be-
fore collecting or analyzing the data.
Abstract concept
The starting point for measurement. Abstract concepts are best understood
as general ideas in linguistic form that help us describe reality. They range
from the simple (hot, heavy, fast) to the more difficult (responsive, effective,
fair). Abstract concepts should be evident in the research question and/or
purpose statement.
Research Question: Is the quality of public sector and private sector employ-
ees different?
Addition Rule: When two events, A and B, are mutually exclusive, the prob-
ability that A or B will occur is the sum of the probability of each event.
$P(A \text{ or } B) = P(A) + P(B)$
Example
$P(\text{Ace or Jack}) = \frac{4}{52} + \frac{4}{52} = \frac{8}{52} = .1538$
There is about a 15% chance of pulling an Ace or Jack with one draw from
a deck of playing cards.
Alpha
The probability of a Type I error. Alpha represents the threshold for claiming
statistical significance.
When conducting statistical tests with computer software, the exact prob-
ability of a Type I error is calculated. It is presented in several formats but is
most commonly reported as "p <" or "Sig." or "Signif." or "Significance."
Using "p <" as an example, if a priori you established a threshold for statisti-
cal significance at alpha .05, any test statistic with significance at or less
than .05 would be considered statistically significant and you would be re-
quired to reject the null hypothesis of no difference.
The following table links p values with a constant alpha benchmark of .05.
Example
________________
         Count |
         Row % |
         Col % |
       Total % |    Male | Female     |   Total
     -------------------------------------------
       <12 yrs |       3 |       4 [a]|       7
               |   42.86 |   57.14 [b]|
               |   11.54 |   16.67 [c]|   14.00
               |    6.00 |    8.00 [d]|
     -------------------------------------------
       HS Grad |      10 |      11    |      21
               |   47.62 |   52.38    |
               |   38.46 |   45.83    |   42.00
               |   20.00 |   22.00    |
     -------------------------------------------
       College |      13 |       9    |      22
               |   59.09 |   40.91    |
               |   50.00 |   37.50    |   44.00
               |   26.00 |   18.00    |
     -------------------------------------------
               |      26 |      24    |      50
         Total |   52.00 |   48.00    | 100.00
Interpretation
[a] Count: There are 4 females who have less than 12 years of education.
[b] 	 Row %: Of those who have less than a high school education (<12 yrs), 57.14%
(4/7) are female.
[c] 	 Column %: Of those who are female, 16.67% (4/24) have less than a high
school education.
[d] 	 Total %: 8% (4/50) of the sample is female with less than a high school educa-
tion.
________________
Income by EduCat
Assumptions
Probability Distribution
T Distribution where . . .
Degrees of Freedom
df= n-2
Formula
$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[ n\sum X^2 - \left( \sum X \right)^2 \right] \left[ n\sum Y^2 - \left( \sum Y \right)^2 \right]}}$
________________
Pearson's Correlation
INCOME: [a]
        21779.0|                    *     *
               |                 *        * *
               |                 *     * * *
               |              *     *     *
        14030.0|        *        *
               |     * *         * * * *
               | *            *
               |        *        *
               |     *        *
         6281.0| * *
                 ---------------|--------------|
                 4.0          12.0          20.0
EDUC: [b]
Number of cases: 32
Missing: 0 [c]
Pearson Correlation: 0.751 [d]
p < (2-tailed signif.): 0.0000 [e]
Interpretation
[a]	 The Y axis of the scattergram. If theory suggests cause and effect, the Y axis is
commonly used for the dependent (response) variable.
[b]	 The X axis of the scattergram. If theory suggests cause and effect, the X axis is
commonly used for the independent variable.
[c]	 Since each observation (case) must have values for both income and education,
any observations where one or both of these variables have no data will be removed
from the analysis.
________________
-----------------------------------------------
    Coeff |          Correlation Matrix        |
    Cases |-----------------------------------|
      p < |    INCOME |      EDUC | WORKEXP |
-----------------------------------------------
   INCOME | 1.000 [a]| 0.751 [d]| -0.160 [f]|
          |     32 [b]|     32    |      32    |
          |      . [c]| 0.000 [e]| 0.381 [g]|
-----------------------------------------------
     EDUC |    0.751 |     1.000 |    -0.520 |
          |       32 |         32 |         32 |
          |    0.000 |          . |     0.002 |
-----------------------------------------------
 WORKEXP |    -0.160 |    -0.520 |      1.000 |
          |       32 |         32 |         32 |
          |    0.381 |     0.002 |           . |
-----------------------------------------------
        2-tailed significance tests
         '.' p-value not computed
Interpretation
[a] 	 The diagonal in the matrix represents Pearson correlations between the same
variable, which will always be 1 since the variables are identical. The correlations
above the diagonal are a mirror reflection of those below the diagonal, so you only in-
terpret half of the matrix (in this case three correlation coefficients).
[c]	 A statistical test is not performed when you are comparing a variable to itself,
so no p-value is computed.
[f]	 Pearson correlation coefficient representing a very weak negative correlation be-
tween work experience and income. Interpretation: As years of work experience in-
crease, personal income decreases.
[g]	 There is not a statistically significant association between work experience and
income.
Assumptions
Probability Distribution
Chi-square distribution
Degrees of Freedom
df = (rows − 1)(columns − 1)
Formula
$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$
where
$F_e = \frac{\text{row total} \times \text{column total}}{n}$
________________
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Measures of Association
-------------------------------------
Cramer's V                  .196 [c]
Pearson C                   .192
Lambda Symmetric            .082 [d]
Lambda Dependent=Column     .000
Lambda Dependent=Row        .115 [e]
Interpretation
[b]	 When sample sizes are small, the continuous chi-square value tends to be too
large. The Yates continuity correction adjusts for this bias in 2x2 contingency tables.
Regardless of sample size, it is a preferred measure for chi-square tests on 2x2 ta-
bles.
Formula
________________
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Measures of Association
-------------------------------------
Cramer's V                  .196 [a]
Pearson C                   .192
Lambda Symmetric            .082
Lambda Dependent=Column     .000
Lambda Dependent=Row        .115
Interpretation
Also used in logistic regression to estimate the overall performance of the re-
gression model.
Assumptions
Probability Distribution
Chi-square distribution
Degrees of Freedom
df = k − 1
Formula
$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$
and . . .
$F_e = \frac{n}{k}$
where
n = sample size
k = number of categories or cells
Fo = observed frequency
________________
Frequencies
Variable: EduCat              Education Level
Interpretation
________________
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Measures of Association
-------------------------------------
Cramer's V                  .196
Pearson C                   .192
Lambda Symmetric            .082 [a]
Lambda Dependent=Column     .000
Lambda Dependent=Row        .115 [b]
Interpretation
When sample sizes are small, the continuous chi-square tends to be too
large. The continuity correction adjusts for this bias in 2x2 contingency ta-
bles. Regardless of sample size, it is a preferred measure for chi-square
tests on 2x2 tables.     If a cell has zero observations, Fisher's exact test is
more appropriate for chi-square significance tests.
________________
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Interpretation
[a]	 When sample sizes are small, the continuous chi-square value tends to be too
large. The Yates continuity correction adjusts for this bias in 2x2 contingency tables.
Regardless of sample size, it is a preferred measure for chi-square tests on 2x2 ta-
bles.
Frequency distribution
Used to represent the relative distribution of values in one variable. The cate-
gories can be any level of measurement and are presented as counts, per-
cents, and cumulative percents.
________________
Interpretation
Kappa
Assumptions
Elements being rated (images, diagnoses, clinical indications, etc.) are inde-
pendent of each other.
One rater’s classifications are made independently of the other rater’s classi-
fications.
The same two raters provide the classifications used to determine kappa.
Formula
$\hat{k} = \frac{p_0 - p_e}{1 - p_e}$
where
$p_0$ = observed proportion of agreement
$p_e$ = proportion of agreement expected by chance
________________
       Count |
       Row % |
       Col % |
     Total % |Not Dise |Diseased |    Total
--------------------------------------------
Not Diseased |       41|       11|       52
             |    78.85|    21.15|
             |    82.00|    22.00|    52.00
             |    41.00|    11.00|
--------------------------------------------
Diseased     |        9|       39|       48
             |    18.75|    81.25|
             |    18.00|    78.00|    48.00
             |     9.00|    39.00|
--------------------------------------------
             |       50|       50|      100
       Total |    50.00|    50.00|   100.00
Agreement Statistics
------------------------------------------
Proportion Agree                 0.800 [a]
Kappa Statistic                  0.600 [b]
Kappa Standard Error             0.100
Kappa Significance (p<)          0.000 [c]
Interpretation
[a]	 The radiologist interpretations agree in 80% of the patients (note they are review-
ing imaging of the same patients independently).
[b]	 Kappa adjusts the observed agreement for the agreement expected by chance;
a kappa of .600 indicates moderate agreement beyond chance.
[c]	 The kappa statistic is statistically significant; the observed agreement is unlikely
to be due to chance alone.
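The kappa calculation is easy to reproduce from the table above (a minimal sketch in Python):

# Observed agreement p0 versus chance agreement pe from the marginals
# of the 2x2 agreement table above.
table = [[41, 11],
         [9, 39]]
n = sum(sum(row) for row in table)                            # 100
p0 = (table[0][0] + table[1][1]) / n                          # .800 observed agreement
rows = [sum(r) for r in table]                                # [52, 48]
cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]] # [50, 50]
pe = sum(r * c for r, c in zip(rows, cols)) / n**2            # .500 chance agreement
print(round((p0 - pe) / (1 - pe), 3))                         # prints 0.6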
Assumptions
Random sampling
Probability Distribution
T- Distribution
Formula
$t = \frac{\bar{X} - \mu}{s_{\bar{x}}}$
where
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
________________
Interpretation
[a]	 Represents the historical mean starting income of all graduates of two year tech-
nical colleges in the United States (in constant dollars).
[b]	 Represents the mean starting income of a random sample of 30 graduates from
this year’s technical colleges in the United States.
[c]	 The standard deviation for the sample of 30 graduates (you generally use the
sample standard deviation unless the population standard deviation is known).
[d]	 There is a statistically significant difference between the starting income of this
year’s graduating class and the historical average starting pay. This year’s graduates
have a mean starting pay that is significantly less than the mean pay for prior gradu-
ates.
Compares two means from two random samples; such as mean incomes of
males and females, Republicans and Democrats, Urban and Rural residents.
If there are more than two comparable groups, ANOVA is a more appropri-
ate technique.
Assumptions
Random sampling
Independent samples
Homogeneity of variance
Probability Distribution
T distribution
Formula
$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}}$
________________
                         Sample 1         Sample 2
                         Female           Male
-----------------------------------------------------------
Sample Mean             12279.4667       15266.4706
Std Deviation            4144.6323        3947.7352
Sample Size (n)            15               17
                        Homogeneity of Variance
------------------------------------------------------------
              F-ratio   1.09    DF (14, 16)   p < 0.4283 [a]
Interpretation
[a]	 The F-ratio is not statistically significant. Therefore, use the equal variance test
statistic.
[b]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error
(concluding there is statistical significance when there is none). Since there is about
4.6% (p < .0455) chance of making a type 1 error, which does not exceed the 5% er-
ror limit (p=.05) established in the rejection criteria of the hypothesis testing process
(alpha), you conclude there is statistical significance.
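The test is easy to reproduce from the summary statistics above (a minimal sketch; scipy assumed available):

from scipy import stats

# Equal-variance t-test computed from summary statistics.
result = stats.ttest_ind_from_stats(
    mean1=12279.4667, std1=4144.6323, nobs1=15,
    mean2=15266.4706, std2=3947.7352, nobs2=17,
    equal_var=True)
print(result)  # t of about -2.09, two-tailed p of about .0455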
Assumptions
Probability Distribution
F-Distribution where . . .
Formula
F-Test Statistic
$F = \frac{\sum n_k (\bar{X}_k - \bar{X}_g)^2 / (k - 1)}{\left[ \sum (X_i - \bar{X}_1)^2 + \sum (X_i - \bar{X}_2)^2 + \sum (X_i - \bar{X}_3)^2 \right] / (n - k)}$
where $n_k$ = the number of cases in group k, $\bar{X}_k$ = the mean of group k,
$\bar{X}_g$ = the grand mean, k = the number of groups, and n = the total sample size
________________
Analysis of Variance
                              F Statistic             p <
                            ---------------------------------
                                     5.94 [f]      0.0005 [g]
Interpretation	 	
[a]	 k-1, where k = 4 group means
[e] n-1
[g]	 The F-Statistic is statistically significant. The p-value represents the probability
of making a type 1 error (concluding there is statistical significance when there is
none). Since there is about a .05% (p < .0005) chance of making a type 1 error, you
conclude there is statistical significance; i.e., there is variation among the groups.
The Bonferroni post hoc test adjusts alpha to compensate for multiple com-
parisons. In this situation, one may falsely conclude a significant effect
where there is none. In particular, with a .05 cut-off value for significance
and 20 pairwise comparisons, on average one comparison will appear sig-
nificant at the .05 level by chance alone. The
Bonferroni adjustment is often offered as an additional test for ANOVA mod-
els to guide post ANOVA exploration and interpretation regarding which
paired mean comparisons are likely to have statistically significant differ-
ences. There are other corrections for this problem. Some are useful for un-
ordered groups, such as patient height versus sex, while others are applied
to ordered groups (to evaluate a trend), such as patient height versus sex
when stratified by age. Consequently, when multiple statistical tests are con-
ducted between the same variables, the significance cut-off value is often
adjusted to represent a more conservative estimate of statistical signifi-
cance. There is a debate about which post hoc method to use, and some
hold the view that this correction is overused. The Bonferroni correction ad-
justs the threshold for significance, which is equal to the desired p-value
(e.g., .05, .01) divided by the number of pairwise comparisons being ex-
amined. One limitation of the Bonferroni correction is that by reducing the
level of significance associated with each test, we have reduced the power
of the test, thereby increasing the chance of incorrectly retaining the null hy-
pothesis.
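The adjustment itself is a one-line calculation (a minimal sketch; the group count is hypothetical):

# Bonferroni: per-comparison threshold = desired alpha / number of comparisons.
alpha = 0.05
comparisons = 6  # e.g., all pairs among k = 4 groups: k(k-1)/2
print(alpha / comparisons)  # about .0083 per comparison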
________________
Analysis of Variance
                          F Statistic             p <
                        -----------------------------
                                 5.94         0.0005
Interpretation
[a]	 The difference between the mean of those 30-39 (G1) and the mean of those 50
or older (G2) is -0.666. A negative difference indicates those 50 or older have a greater
mean number of hours watching television than those 30-39.
[b]	 There is a statistically significant difference between the number of hours watch-
ing television for those between the ages of 30 and 39 and the number of hours
watching television for those 50 and older.
[c]	 There is a statistically significant difference between the number of hours watch-
ing television for those between the ages of 40 and 49 and the number of hours
watching television for those 50 and older.
Steps
1. Compute the sample mean and the standard error of the mean.
2. Use either z distribution (if n>120) or t distribution (for all sizes of n).
3. Use the appropriate table to find the critical value for a 2-tailed test.
Estimation Formula
$CI = \bar{X} \pm (CV)(s_{\bar{x}})$
where
$\bar{X}$ = sample mean
CV = critical value (consult distribution table for df=n-1 and chosen alpha--
commonly .05)
Estimation
df = n − 1 = 29
CV = 2.045
Standard error: $s_{\bar{x}} = .219$
$CI_{95} = 19.5 \pm (2.045)(.219) = 19.5 \pm .448$
________________
                                       Margin Limits
Confidence      Error +/-          Lower             Upper
----------------------------------------------------------
    90%            0.3723        19.1277          19.8723
    95%            0.4481        19.0519          19.9481
    99%            0.6039        18.8961          20.1039
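The interval is easy to reproduce from the summary values (a minimal sketch; scipy assumed available):

from scipy import stats

# 95% confidence interval for the mean: 19.5, standard error .219, df 29.
mean, se, df = 19.5, 0.219, 29
cv = stats.t.ppf(0.975, df)            # about 2.045
print(mean - cv * se, mean + cv * se)  # about 19.052 and 19.948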
________________
Interpretation
Assumptions
Probability Distribution
Formula
$z = \frac{p_s - p_u}{\sqrt{p_u q_u / n}}$
where
$p_s$ = sample proportion
$p_u$ = population proportion
$q_u = 1 - p_u$
n = sample size
________________
Interpretation
[a]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error
(concluding there is statistical significance when there is none). Since there is about
8% chance of making a type 1 error, which exceeds the 5% error limit established in
the rejection criteria of the hypothesis testing process (alpha), you would not con-
clude there is statistical significance.
Compares two proportions from two random samples; such as males and
females, Republicans and Democrats, Urban and Rural residents. If there
are more than two comparable groups, chi-square is a more appropriate
technique.
Assumptions
Probability Distribution
Use z-distribution
Formula
$z = \frac{p_1 - p_2}{\sqrt{\hat{p}\hat{q}\left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
where
$\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ 	 and 	 $\hat{q} = 1 - \hat{p}$
________________
Interpretation
[a]	 For a 2-tailed test, the p-value represents the probability of making a type 1 error
(concluding there is statistical significance when there is none). Since there is far less
than a 5% chance of making a type 1 error, you would conclude there is statistical sig-
nificance between the schools in the proportion of children who need nutrition sup-
port at school.
Steps
Use the z-distribution table to find the critical value for a 2-tailed test given
the selected confidence level (alpha)
Estimation Formula
$CI = p \pm (CV)(s_p)$
where
p = sample proportion
q = 1 − p
CV = critical value
$s_p = \sqrt{pq/n}$ = standard error of the proportion
Interpret
Based on alpha .05, you are 95% confident that the proportion in the popula-
tion from which the sample was obtained is between __ and __.
Note: Given the sample data and level of error, the confidence interval pro-
vides an estimated range of proportions that is most likely to contain the
population proportion. The term "most likely" is measured by alpha (i.e., in
most cases there is a 5% chance --alpha .05-- that the confidence interval
does not contain the true population proportion).
The standard error of the proportion will vary as sample size and the propor-
tion changes. As the standard error increases, so will the margin of error.
 	                      	
                                            Sample Size (n)
 Proportion (p)              100     300       500     1000    5000    10000
        0.9                 0.030   0.017    0.013     0.009   0.004   0.003
        0.8                 0.040   0.023    0.018     0.013   0.006   0.004
        0.7                 0.046   0.026    0.020     0.014   0.006   0.005
        0.6                 0.049   0.028    0.022     0.015   0.007   0.005
        0.5                 0.050   0.029    0.022     0.016   0.007   0.005
        0.4                 0.049   0.028    0.022     0.015   0.007   0.005
        0.3                 0.046   0.026    0.020     0.014   0.006   0.005
        0.2                 0.040   0.023    0.018     0.013   0.006   0.004
        0.1                 0.030   0.017    0.013     0.009   0.004   0.003
As sample size increases the error of the proportion will decrease for a given
proportion. However, the reduction in error of the proportion as sample size
increases is not constant. As an example, at a proportion of 0.9, increasing
the sample size from 100 to 300 cut the standard error by about half (from
.03 to .017). Increasing the sample size by another 200 only reduced the
standard error by about one quarter (.017 to .013).
Problem: A random sample of 500 employed adults found that 23% had
traveled to a foreign country. Based on these data, what is your estimate for
the entire employed adult population?
Compute Interval
$CI_{95} = .23 \pm (1.96)(.0188) = .23 \pm .037$
Interpret
You are 95% confident that the actual proportion of all employed adults who
have traveled to a foreign country is between 19.3% and 26.7%.
________________
                                            Margin Limits
Confidence      Error +/-            Lower            Upper
------------------------------------------------------------
    90%            0.0311           0.1989           0.2611
    95%            0.0369           0.1931           0.2669
    99%            0.0486           0.1814           0.2786
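The interval is easy to reproduce (a minimal sketch using only p = .23 and n = 500 from the problem above):

import math

# 95% confidence interval for a proportion.
p, n, cv = 0.23, 500, 1.96
se = math.sqrt(p * (1 - p) / n)  # about .0188
print(p - cv * se, p + cv * se)  # about .193 and .267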
Rate Example: Create death rate per 1000 given there are 100 deaths a
year in a population of 10,000: (100 ÷ 10,000) × 1,000 = 10 deaths per 1,000.
The number of cases in one category divided by the number of cases in an-
other category.
Ratio = f1/f2 or frequency in one group (larger group) divided by the fre-
quency in another group.
Ratio Example: Your community has 1370 Protestants and 930 Catholics
1370/930 = 1.47 or for every one Catholic there are 1.47 Protestants in the
given population.
Used for multiple regression modeling when the dependent variable is nomi-
nal level data. Binomial logistic is used when the dependent variable is a di-
chotomy (0,1). Multinomial logistic regression is used when there are more
than two nominal categories.
Assumptions
See multiple regression for a more complete list of ordinary least squares
(OLS) regression assumptions. In addition to the key characteristic that the
dependent variable is discrete rather than continuous, the following assump-
tions for logistic regression differ from those for OLS regression:
Unlike OLS regression, large sample size is needed for logistic regression. If
there is difficulty in converging on a solution, a large number of iterations, or
very large regression coefficients, there may be insufficient sample size.
________________
Dependent Y: VOTE92
Explanatory Model
Coefficients
Variable b Coeff.           SE      Wald         p <        OR
----------------------------------------------------------------
Constant   -4.1620      0.4184   98.9755      0.0000
     AGE    0.0335 [d] 0.0040    69.1386 [e] 0.0000 [f] 1.0340 [g]
    EDUC    0.2809      0.0244 132.4730       0.0000     1.3243
  FEMALE   -0.1249      0.1269    0.9686      0.3250     0.8826
   WHITE    0.1000      0.1671    0.3581      0.5495     1.1052
----------------------------------------------------------------
Estimated Model
VOTE92 = -4.1620 + .0335(AGE) + .2809(EDUC) + -.1249(FEMALE)
                 + .1000(WHITE)
                        Predicted Y
Observed Y              0           1        % Correct
                 +-----------+-----------+
           0     |        103 |        314 |     24.70 [h]
                 +-----------+-----------+
           1     |         64 |        964 |     93.77 [i]
                 +-----------+-----------+----------
                                      Total      73.84 [j]
Interpretation
[a] 	 This ranges from 0 to less than 1 and indicates the explanatory value of the
model. The closer the value is to 1 the greater the explanatory value. Analogous to
R2 in OLS regression.
[b]	 Indicates explanatory value of the model. Ranges from 0 to 1 and adjusts for the
number of independent variables. Analogous to Adjusted R2 in OLS regression. In
this model, the explanatory value is relatively low.
[c]	 Tests the combined significance of the model in explaining the dependent (re-
sponse) variable. The significance test is based on the Chi-Square sampling distribu-
tion. Analogous to the F-test in OLS regression. The independent variables in this
model have a statistically significant combined effect on voting behavior. 	
[d]	 A one unit increase in age (1 year) results in an average increase of .0335 in the
log odds of vote equaling 1 (voted in election).
[e]	The test statistic for the null hypothesis that the b coefficient is equal to 0 (no effect). It is calculated by (b/SE)².
[f]	The p-value for the test in [e]. Values below the chosen alpha (e.g., .05) indicate the coefficient is statistically significant; here AGE is significant.
[g]	OR = Odds Ratio. The multiplicative change in the odds of the dependent variable (vote) given a one unit change in age is 1.03. Note: if the OR is >1, odds increase; if OR <1, odds decrease; if OR = 1, odds are unchanged by this variable. In this example, the odds of voting increase with a one unit increase in age.
[h]	 Of those who did not vote, the model correctly predicted non-voting in 24.7% of
the non-voters included in the sample data.
[i]	 Of those who did vote, the model correctly predicted voting in 93.8% of the vot-
ers included in the sample data.
[j]	 Of those who did or did not vote, the model correctly predicted 73.8% of the vot-
ing decisions.
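Output like this can be generated with any of the major packages. As a minimal sketch, the Python code below fits the same model specification with statsmodels on synthetic data (the survey data behind this output are not included, so the estimates will differ from those shown):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 1445                                  # 417 non-voters + 1028 voters above
age = rng.integers(18, 90, n)
educ = rng.integers(8, 21, n)
female = rng.integers(0, 2, n)
white = rng.binomial(1, 0.8, n)

# Assumed data-generating coefficients, loosely based on the table above
logit = -4.16 + 0.0335 * age + 0.28 * educ - 0.12 * female + 0.10 * white
vote92 = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([age, educ, female, white]))
result = sm.Logit(vote92, X).fit(disp=False)
print(result.params)                      # b coefficients
print(np.exp(result.params))              # odds ratios, as in the OR column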
Assumptions
1. No measurement error
2. No specification error
5. Homoskedasticity
6. No autocorrelation
7. No multicollinearity
________________
ANOVA Statistics
Coefficients
Estimated Model
Interpretation
[a] 	 Adjusted R2. Education, experience, and a person’s sex explain 72% of the
variation in individual income.
[b]	 Standard error of the estimate. Used for creating confidence intervals for pre-
dicted Y.
[d]	 The impact of a one-unit change in education on income. One additional year of
education will on average result in an increase in income of $870 holding experience
and sex constant.
[e]	 Standardized regression partial slope coefficients. Used only in multiple regres-
sion models (more than one independent variable), beta-weights in the same regres-
sion model can be compared to one another to evaluate relative effect on the depend-
ent variable. Beta-weights will indicate the direction of the relationship (positive or
negative). Regardless of whether negative or positive, larger beta-weight absolute val-
ues indicate a stronger relationship. In this model, education has the greatest effect
on income followed in order by sex and experience.
[h]	Education has a statistically significant association with income. The probability of a Type I error is less than .0001, well below the alpha level of .05. Experience and sex are also statistically significant.
[i]	 One additional month of experience will on average result in an increase in in-
come of $194 holding education and sex constant.
[j] 	 Males will on average earn an income of $2,821 more than women holding edu-
cation and experience constant.
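As a sketch of how such a model is estimated, the Python code below fits income on education, experience, and sex with statsmodels using synthetic data (the slopes are seeded from the values interpreted above, an assumption for illustration only):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
educ = rng.integers(10, 21, n)            # years of education
exper = rng.integers(0, 240, n)           # months of experience
male = rng.integers(0, 2, n)              # 1 = male, 0 = female

# Synthetic income built from slopes like those in [d], [i], and [j]
income = 870 * educ + 194 * exper + 2821 * male + rng.normal(0, 5000, n)

X = sm.add_constant(np.column_stack([educ, exper, male]))
fit = sm.OLS(income, X).fit()
print(fit.params)                         # intercept and partial slopes
print(fit.rsquared_adj)                   # adjusted R-squared, as in [a]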
Assumptions
1. Both the independent (X) and the dependent (Y) variables are interval or
ratio data.
Formula
$\hat{Y} = a + bX$
where $\hat{Y}$ = predicted score, $a$ = the Y intercept, and $b$ = the slope,
$b = \frac{\Sigma(X_i - \bar{X})(Y_i - \bar{Y})}{\Sigma(X_i - \bar{X})^2}$ and $a = \bar{Y} - b\bar{X}$
where $\bar{X}$ = mean of variable X and $\bar{Y}$ = mean of variable Y
________________
ANOVA Statistics
Coefficients
Estimated Model
Interpretation	 	
[a]	 The correlation coefficient.
[b]	 Education explains 56% of the variation in income.
[c]	 Standard error of the estimate. Used for creating confidence intervals for pre-
dicted Y.
[d]	 The regression model is statistically significant.
[e]	 The Y intercept and mean value of income if education is equal to 0.
[f]	 The impact of a one-unit change in education on income. Interpretation: One ad-
ditional year of education will on average result in an increase in income of $732.
[g]	 Standard error of the slope coefficient. Used for creating confidence intervals
for b and for calculating a t-statistic for significance testing.
[h]	 Standardized regression partial slope coefficients. Used only in multiple regres-
sion models (more than one independent variable), beta-weights in the same regres-
sion model can be compared to one another to evaluate relative effect on the depend-
ent variable. Beta-weights will indicate the direction of the relationship (positive or
negative). Regardless of whether negative or positive, larger beta-weight absolute val-
ues indicate a stronger relationship.
[i]	 The t-test statistic. Calculated by dividing the b coefficient by the standard error
of the slope.
[j]	     Education has a statistically significant positive association with income.
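Items [g] and [i] can be made concrete with a short calculation. The sketch below takes the slope of $732 from [f] and pairs it with a hypothetical standard error and sample size (neither survives in the output shown here):

from scipy.stats import t

b, se_b = 732.0, 60.0        # slope from [f]; the SE here is hypothetical
df = 1612                    # n - 2 for a bivariate model; n is hypothetical

t_stat = b / se_b            # item [i]: b divided by its standard error
crit = t.ppf(0.975, df)      # two-tailed critical value at alpha = .05
print(t_stat, (b - crit * se_b, b + crit * se_b))  # t and the 95% CI for b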
Diagnosis
Correction:
Use generalized least squares (GLS) or weighted least squares (WLS) mod-
els.
Partial slope coefficients are not directly comparable to one another. This is
because each variable is often based on a different metric. Beta weights
(standardized slope coefficients) are used to compare the relative impor-
tance of regression coefficients in a model. They are not useful for compari-
sons among models from other samples.
Formula:
$\beta_i = b_i \left(\frac{s_{x_i}}{s_y}\right)$, or partial slope × (std. dev. of Xi ÷ std. dev. of Y)
Example:
Education standard deviation $s_{x_i}$ = 2.88
Begin salary standard deviation $s_y$ = 7870.64
$\beta_i = 1878.2\left(\frac{2.88}{7870.64}\right) = .687$
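A minimal sketch of the same arithmetic in Python:

def beta_weight(b, sd_x, sd_y):
    # Standardized partial slope: b scaled by the ratio of standard deviations
    return b * (sd_x / sd_y)

print(round(beta_weight(1878.2, 2.88, 7870.64), 3))   # 0.687, as above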
A variable that has only two values or attributes. Sex is a common example
(male or female).
This assumption specifies that two comparison groups must be normally dis-
tributed.
A table that displays the joint frequency distributions of two variables. Gen-
erally referred to as a contingency table or crosstabulation. By convention,
the independent variable is usually represented in the columns and the de-
pendent variable is represented in the rows.
Example:
       Count |
       Row % |
       Col % |
     Total % |Male      |Female   |   Total
--------------------------------------------
No           |        21|       15|      36
             |     58.33|    41.67|
             |     84.00|    71.43|   78.26
             |     45.65|    32.61|
--------------------------------------------
Yes          |         4|        6|      10
             |     40.00|    60.00|
             |     16.00|    28.57|   21.74
             |      8.70|    13.04|
--------------------------------------------
             |        25|       21|      46
       Total |     54.35|    45.65|  100.00
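The cell percentages in a table like this are simple to reproduce. A minimal Python sketch using the counts above:

counts = [[21, 15],          # No:  male, female
          [4, 6]]            # Yes: male, female

total = sum(sum(row) for row in counts)
col_totals = [sum(col) for col in zip(*counts)]

for label, row in zip(("No", "Yes"), counts):
    row_total = sum(row)
    for cell, col_total in zip(row, col_totals):
        print(label, round(100 * cell / row_total, 2),   # row %
              round(100 * cell / col_total, 2),          # column %
              round(100 * cell / total, 2))              # total %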
Precedence: Does the independent variable vary before the effect exhibited
in the dependent variable?
________________
Descriptive Statistics
Variable: REALINC           FAMILY INCOME IN CONSTANT $
--------------------------------------------------------------------
Count                        1614                     Pop Var           297866464.1298
Sum                      37950010.0000                Sam Var           298051130.2576
Mean                        23513.0173                Pop Std                 17258.8083
Median                      18375.0000                Sam Std                 17264.1574
Min                            245.0000               Std Error                 429.7280
Max                         68600.0000                CV%                        73.4238
Range                       68355.0000                95% CI (+/-)              842.8885
Skewness                          0.7979              t-test(mu=0)   |p<|         0.0001
Kurtosis                          2.8317
--------------------------------------------------------------------
-----------------------------------------------------
| Results of Random Sampling                        |
|                                                   |
| Variable Randomized:       REALINC                |
| Size of Each Sample:       30                     |
| Number of Random Samples:  1000                   |
|                                                   |
| This is the mean of means produced from 1000      |
| repeated random samples.                          |
-----------------------------------------------------
Descriptive Statistics
Variable: Mean
---------------------------------------------------------------------
Count                        1000                     Pop Var               10491553.2947
Sum                      19247808.4165                Sam Var               10502055.3500
Mean                        19247.8084                Pop Std                   3239.0667
Median                      19230.4583                Sam Std                   3240.6875
Min                          7231.5833                Std Error                  102.4795
Max                         30788.3333                CV%                         16.8367
Range                       23556.7500                95% CI (+/-)               201.0998
Skewness                          0.0753              t-test(mu=0)   |p<|          0.0001
Kurtosis                          3.1711
---------------------------------------------------------------------
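The resampling exercise above is easy to reproduce in simulation. Since the REALINC data file is not included here, the Python sketch below draws from a skewed, income-like synthetic population; the point is the pattern (the mean of means tracks the population mean, with a much smaller spread), not the exact figures:

import numpy as np

rng = np.random.default_rng(0)
population = rng.gamma(shape=2.0, scale=11750.0, size=100_000)  # skewed values

# 1000 repeated random samples of size 30, keeping each sample mean
means = [rng.choice(population, size=30, replace=False).mean()
         for _ in range(1000)]

print(np.mean(means))           # close to the population mean
print(np.std(means, ddof=1))    # close to population sd / sqrt(30)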
A document that tells the location and meaning of variables and values in a
data file.
The Coefficient of Variation (CV) is the ratio of the sample standard deviation
to the sample mean: (sample standard deviation/sample mean)*100 to calcu-
ate CV%. Used as a measure of relative variability, CV is not affected by the
units of a variable. CV is useful for comparing the variation in two series of
data which are measured in two different units. Examples include a compari-
son between variation in height and variation in weight or comparing differ-
ent experiments involving the same units of measure but conducted by dif-
ferent persons.
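A minimal sketch of the CV% calculation, using made-up height and weight data to show the unit-free comparison:

from statistics import mean, stdev

def cv_percent(values):
    # Sample standard deviation as a percentage of the sample mean
    return stdev(values) / mean(values) * 100

heights_cm = [162, 170, 175, 168, 181]    # hypothetical data
weights_kg = [55, 82, 74, 61, 90]         # hypothetical data
print(cv_percent(heights_cm), cv_percent(weights_kg))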
This is a construct system that connects our view of the world with data. It
connects theory, hypotheses, and data with our vision of reality.
The boundary values that contain the interval estimate. An interval estimate is an interval of values within which we can state with a degree of confidence that the true population parameter falls. Used in conjunction with the point estimate. Also known as the confidence interval and is often expressed as a margin of error.
The degree to which one measure correlates with other measures of the
same abstract concept.
The extent to which the indicator reflects the full domain of interest.
Used to represent the relationship between two variables. The variables are
usually nominal or ordinal level and the relationship is described by counts
and row and column percent. Also referred to as a crosstabulation or cros-
stab.
A measure that can take on any value within a given interval or set of intervals. There are an infinite number of possible values, such as distance in kilometers or loan interest rates.
A variable that is held to one value to help clarify the relationship between
other variables. As an example, sex may be controlled to investigate the re-
lationship between education level and income (i.e., a separate analysis for
males and females).
The percentage of cases within an interval and all preceding intervals. Gen-
erally displayed in a frequency distribution table.
A document that tells the meaning of variables and values in a data file.
________________
The number of observations in the data that are free to vary after the sample statistics have been calculated. Once all but one of the values entering a statistic have been determined, the last value is fixed rather than free to vary, so generally one is subtracted from the count of observations (n − 1).
A measure not under the control of the researcher that reflects responses
caused by variations in another measure (the independent variable). Also
called the response variable.
This is a variable that is classified by two values. These variables are also
referred to as binary variables or dummy variables. Dummy variables are
qualitative variables that measure presence or absence of a characteristic.
As an example, a dummy variable representing the characteristic "male"
could be represented as 0=female (non-male) and 1=male.
The extent to which observations differ (vary) from one another. Also known
as variation. Common summary measures of dispersion are range, vari-
ance, and standard deviation.
The extent to which multiple sample outcomes are clustered around the mean of the sampling distribution. The efficiency of a statistic is the degree to which the statistic is stable from sample to sample. If one statistic has a smaller standard error than another, the statistic with the smaller standard error is the more efficient one.
The extent to which the association between the independent and depend-
ent variable is accurate and unbiased in populations outside the study
group.
________________
A method of judging the value of a program while the program activities are
forming or happening. Formative evaluation focuses on the process (Bhola
1990).
Occurs when the variance of the error term is not constant across the values of an independent variable.
The variance of the Y scores in a correlation is uniform across the values of the X scores. In other words, the Y scores are equally spread above and below the regression line.
Random samples collected so that the selection of a particular case for one
sample has no effect on the probability that any particular case will be se-
lected for the other samples.
A measure that can take on different values which are subject to manipula-
tion by the researcher.
A procedure for stating the degree of confidence that the measures we have
of our environment are accurate. Inferential measures are based on less
than the entire population and represent an estimate of the true but un-
known value of a population characteristic. Inference depends on the con-
cept of random sampling.
Represents the point where the regression line intercepts the Y axis and indi-
cates the average value of Y when X is equal to zero (0).
The extent to which accurate and unbiased association between the inde-
pendent variable and dependent variable was obtained in the study group.
Errors in data collection that occur when knowledge of the results of one test affects the interpretation of a second test.
The distance representing the range of scores spanning the middle 50% of a distribution, from the first quartile (25th percentile) to the third quartile (75th percentile).
Objects classified by type or characteristic, with logical order and equal dif-
ferences between levels of data. The meaning between each level of data is
the same. Also known as scale data.
The mathematical characteristic of a variable and the major criterion for selecting statistical techniques. Variables can be measured at the nominal, ordinal, interval, or ratio level.
The relationship between the independent variables and the dependent vari-
able is conducive to being fit with a straight line.
A survey that collects data at different points in time on the same objects.
Used to express an interval of values within which we can state with a degree of confidence that the true population parameter falls. Used in conjunction with the point estimate. Also known as the confidence interval. The margin of error is commonly expressed as plus or minus a set value that represents half of the confidence interval. As an example, 50% with a margin of error of +/- 3% indicates that the true population percentage is estimated to be between 47% and 53%. There is always a chance this interval is wrong. If the margin of error is based on a 95% confidence level, this means there is a 5% chance the true population value will not fall within the margin of error.
$\bar{X} = \frac{\sum X_i}{n}$
Example:
Variable Age
24
32
32
42
55
60
65
44.29 Mean
The point on a scale of measurement below and above which fifty percent
of the observations fall. To obtain the median, the observations must be
rank ordered by their values.
Example:
Variable Age
24
32
32
42 Median
55
60
65
Example:
Variable Age
24
32 Mode
32
42
55
60
65
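All three measures for the Age example can be checked with Python's standard library:

from statistics import mean, median, mode

age = [24, 32, 32, 42, 55, 60, 65]
print(round(mean(age), 2))   # 44.29
print(median(age))           # 42
print(mode(age))             # 32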
Occurs when one of the independent variables has a substantial linear relationship with another independent variable in a multiple regression model. It occurs to some degree in any model; multicollinearity is more a matter of degree than of whether it exists or not.
The probability of the joint occurrence of two or more outcomes across two
or more events. The probability of the joint occurrence of outcome A and
outcome B will be equal to the product of the probabilities of A and B.
Example:
Multiplication Rule: When two events, A and B, are independent, the probability that A and B will both occur is the product of the probability of each event. When the events are dependent, as here, the probability of the second event must be conditioned on the first. (The example assumes a card is not returned to the deck after selection.)
$P(\text{Ace and Jack}) = \frac{4}{52} \times \frac{4}{51} = .006$
There is about a 0.6% chance of pulling an Ace and a Jack with two draws from a deck of playing cards.
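A quick check of the arithmetic with exact fractions:

from fractions import Fraction

p_ace = Fraction(4, 52)              # P(first card is an Ace)
p_jack_given_ace = Fraction(4, 51)   # P(Jack second | Ace drawn, no replacement)

p_both = p_ace * p_jack_given_ace
print(p_both, float(p_both))         # 4/663, about .006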
No two outcomes can both occur in any given trial. As an example, males
and females are mutually exclusive groups in the characteristic sex. The ran-
dom selection of one individual from a population cannot (normally) produce
a person who is both male and female.
2. 50% of the scores lie above and 50% below the midpoint of the distribution.
3. The curve is asymptotic to the x axis (i.e., it never touches the x axis).
4. The mean, median, and mode are located at the midpoint of the x axis.
The odds ratio is used to determine whether the odds of an event are the same for two groups. The odds of an outcome represent the probability that the outcome does occur divided by the probability that the outcome does not occur. An odds ratio of 1 means the event is equally likely in both groups. An odds ratio greater than one suggests the event is more likely in one group; a ratio less than one suggests the event is less likely in one group as compared to another.
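A minimal sketch of the calculation from a 2x2 table (the counts are hypothetical):

def odds_ratio(a, b, c, d):
    # Rows are groups, columns are event / no event: OR = (a/b) / (c/d)
    return (a * d) / (b * c)

# Group 1: 20 events, 80 non-events; group 2: 10 events, 90 non-events
print(odds_ratio(20, 80, 10, 90))    # 2.25: the event is more likely in group 1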
A hypothesis test used when the direction of the difference can be predicted
with reasonable confidence or the focus of the test is only one tail of the
sampling distribution.
Objects classified by type or characteristic with some logical order. Also re-
ferred to as ordinal scale data. Examples of ordinal variables include military
rank, letter grade, class standing.
The probability of a Type I error. Type I error is rejecting a true null hypothe-
sis. Commonly interpreted as the probability of being wrong when conclud-
ing there is statistical significance.
________________
       Count |
       Row % |
       Col % |
     Total % |    Male | Female |     Total
--------------------------------------------
       Favor |      314|      497|      811
             |    38.72|    61.28|
             |    73.88|    88.91|    82.42
             |    31.91|    50.51|
--------------------------------------------
      Oppose |      111|       62|      173
             |    64.16|    35.84|
             |    26.12|    11.09|    17.58
             |    11.28|     6.30|
--------------------------------------------
             |      425|      559|      984
       Total |    43.19|    56.81|   100.00
Measures of Association
-------------------------------------
Cramer's V                  .196 [c]
Pearson C                   .192
Lambda Symmetric            .082 [d]
Lambda Dependent=Column     .000
Lambda Dependent=Row        .115 [e]
Interpretation
[b]	When sample sizes are small, the chi-square value, which is based on a continuous distribution, tends to be too large. The Yates continuity correction adjusts for this bias in 2x2 contingency tables. Regardless of sample size, it is a preferred measure for chi-square tests on 2x2 tables.
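The chi-square test for the table above, with the Yates correction applied, can be reproduced with scipy (a sketch; the chi-square statistics block this interpretation refers to did not survive above):

from scipy.stats import chi2_contingency

table = [[314, 497],     # Favor:  male, female
         [111, 62]]      # Oppose: male, female

chi2, p, dof, expected = chi2_contingency(table, correction=True)  # Yates
print(chi2, p, dof)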
Commonly used for nominal data. It is a circular figure with slices that repre-
sent the relative frequency (proportion) of a value of a variable.
Two variables vary in the same direction. As the values of one variable in-
crease the values of a second variable also increase. High values in one
variable are associated with high values in the second variable.
This study design does not have the ability to employ the controls employed
in an experimental design. Although internal validity is less than the experi-
mental design, external validity is generally better and statistical controls are
used to compensate for extraneous variables.
Error that can randomly occur in the measurement and data collection proc-
ess.
A variable whose values are not pre-determined. A measure where any par-
ticular value is based on chance through random sampling. A random vari-
able requires that a researcher have no influence on a particular observa-
tion's value.
Objects classified by type or characteristic, with logical order and equal dif-
ferences between levels, and having a true zero starting point. A ratio vari-
able includes more information than nominal, ordinal, or interval scale data.
Measures using this scale are considered to be quantitative variables. Exam-
ples include time, distance, income, age. Other examples include frequency
counts or percentage scales.
A straight line that best displays the relationship between two variables.
The equation for a straight line is used to fit the line to the data points.
"Best" is defined as the regression line that minimizes the error exhibited be-
tween the observed data points and those predicted with the regression
line.
The ratio of two conditional probabilities. Relative risk is defined as the risk in the exposed group divided by the risk in the unexposed group. Relative risk equals 1 when an event is equally probable in both groups. A relative risk greater than 1 suggests the event is more likely in one group as compared to another. A relative risk less than 1 suggests the event is less likely.
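A minimal sketch with hypothetical exposure counts:

def relative_risk(a, b, c, d):
    # Risk in the exposed (a of a+b) over risk in the unexposed (c of c+d)
    return (a / (a + b)) / (c / (c + d))

# 30 of 100 exposed had the event vs. 10 of 100 unexposed
print(relative_risk(30, 70, 10, 90))   # 3.0: three times as likely if exposed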
The extent to which a measure obtains similar results over repeat trials.
Defines the purpose of the study by clearly identifying the relationship(s) the
researcher intends to investigate.
A measure of association for two ordinal level variables that have numerous
rank ordered categories that are similar in form to a continuous variable.
Spearman's rho can range from -1 (strong negative association) to 1 (strong
positive association). Zero represents no association.
Population Standard Deviation: $\sigma = \sqrt{\frac{\Sigma(X_i - \mu)^2}{N}}$
Sample Standard Deviation: $S = \sqrt{\frac{\Sigma(X_i - \bar{X})^2}{n - 1}}$
Example (Variable Age, mean = 44.29):

  Xi      Xi − mean    squared deviations
  24        -20.29           411.68
  32        -12.29           151.04
  32        -12.29           151.04
  42         -2.29             5.24
  55         10.71           114.70
  60         15.71           246.80
  65         20.71           428.90

Sum of squared deviations = 1509.40
$S = \sqrt{1509.40 / (7 - 1)} = \sqrt{251.57} = 15.86$
$s_{\bar{x}} = \frac{s}{\sqrt{n}}$
Interpretation
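As a check on the example above (the interpretation text for this entry appears truncated), the Age values give:

from math import sqrt
from statistics import mean, stdev

age = [24, 32, 32, 42, 55, 60, 65]
s = stdev(age)                  # sample standard deviation (n - 1 denominator)
se = s / sqrt(len(age))         # standard error of the mean
print(round(s, 2), round(se, 2))    # 15.86 and 5.99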
A field of inquiry that is concerned with the collection, organization, and in-
terpretation of data according to well-defined procedures. It is the key to
every scientific discipline that studies the behavioral and biological charac-
teristics of human beings or our physical environment.
Interpreted as the probability of a Type I error. Test statistics that meet or exceed a critical value are interpreted as evidence that the differences exhibited in the sample statistics are not due to random sampling error, and therefore as evidence supporting the conclusion that there is a real difference in the populations from which the sample data were obtained.
The sum of the squared deviations between each observed value and the
group mean. Used to calculate variance and standard deviation.
A method of judging the value of a program at the end of the program activi-
ties. The focus is on the outcome.
One or more propositions that suggest why an event occurs and that provide a framework for further analysis.
A type of hypothesis test used when the direction of the difference cannot
be reasonably predicted a priori (before collecting or analyzing the data) or
the focus is on both tails of the sampling distribution. Most statistical test-
ing employs a two-tailed test. Statistical software typically reports p-values
based on a two-tailed test. Divide the p-value by 2 to get the one-tailed p-
value.
The object under study. This could be people, schools, cities, etc.
The study of one variable characteristic. Examples include test scores, vot-
ing record, salaries, education obtained, and medical test results.
A characteristic that can take on different values from one observation to another. If a characteristic does not vary, it is a constant.
Population Variance: $\sigma^2 = \frac{\Sigma(X_i - \mu)^2}{N}$
Sample Variance: $S^2 = \frac{\Sigma(X_i - \bar{X})^2}{n - 1}$
Example:
Wrong Method:
Correct Method:
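The body of this example did not survive in this copy. A common version of the wrong/correct contrast, offered here as an assumption, is dividing the sum of squared deviations by n for sample data instead of by n − 1:

age = [24, 32, 32, 42, 55, 60, 65]
m = sum(age) / len(age)
ss = sum((x - m) ** 2 for x in age)     # sum of squared deviations

wrong = ss / len(age)           # population divisor applied to a sample: biased
correct = ss / (len(age) - 1)   # sample variance with n - 1
print(round(wrong, 2), round(correct, 2))   # 215.63 vs 251.57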