
K. D. BROOTA
EXPERIMENTAL DESIGN IN BEHAVIOURAL RESEARCH
NEW AGE INTERNATIONAL PUBLISHERS


Experimental Design:
An Introduction

WHAT IS EXPERIMENTAL DESIGN?

EXPERIMENTAL DESIGN AS VARIANCE CONTROL
    Systematic Variance
    Extraneous Variance
        Randomization
        Elimination
        Matching
        Additional Independent Variable
        Statistical Control
    Error Variance
    Validity

TYPES OF EXPERIMENTAL DESIGNS

BASIC TERMINOLOGY OF STATISTICAL ANALYSIS
    Null Hypothesis (H₀)
    Level of Significance
    Type I and Type II Error
    Power of the Test
    Region of Rejection
This book is about basic principles of experimental design and analysis, and is addressed to the students
and researchers in behavioural sciences. More particularly, it is an attempt to set forth the principles of
designing experiments, methods of data collection, analysis, and interpretation of results. Before we
focus on the principles of designing experiments and methods of analysis, it will put the matter in
proper perspective if we first define experimental design.
The term experimental design has been used differently by different authors. A look at the literature
on the subject reveals that the term experimental design has been used to convey mainly two different,
though interrelated, meanings. In the first category are those who have used the term in a general sense to
include a wide range of basic activities for carrying out experiments, that is, everything from formulation
of hypotheses to drawing of conclusions. The second definition of the term is comparatively restricted.
The term is used in the “Fisher tradition”, that is, to state the statistical principles underlying experimental
designs and their analysis, wherein an experimenter can schedule treatments and measurements for
optimal statistical efficiency. It contains activities like procedure for selection of factors and their levels
for manipulation, identification of extraneous variables that need to be controlled, procedures for handling
experimental units, selection of criterion measure, selection of specific design (e.g., factorial design,
Latin square design) and analysis of data.
In this book, we shall be dealing with the designs that conform primarily to the latter definition of
the term, although the other aspects of designing, contained in the former definition, cannot be ignored
entirely. The reader will appreciate that research is an integrated activity, where one step out of a sequence
of steps cannot be effectively isolated from the rest. Thus, knowledge of the basic principles of
experimental design covered by both the definitions is a prerequisite for achieving the objectives of
research. However, in this book, we shall concentrate primarily on the second definition of the term,
that is, principles of experimental design and analysis, in the “Fisher tradition”.

WHAT IS EXPERIMENTAL DESIGN?

Winer (1971) has compared the design of an experiment to an architect's plan for the structure of a
building. The designer of experiments performs a role similar to that of the architect. The prospective
owner of a building gives his basic requirements to the architect, who then exercising his ingenuity
prepares a plan or a blue-print outlining the final shape of the structure. Similarly, the designer of the
experiment has to do the planning of the experiment so that the experiment on completion fulfils the
objectives of research. According to Myers (1980), the design is the general structure of the experiment,
not its specific content.

Though there are different objectives of designing an experiment, it may not be out of place
to state that the most important function of experimental design is to control variance. According to
Lindquist (1956), “Research design is the plan, structure, and strategy of investigation conceived so as
to obtain answer to research question and to control variance”. The first part of the statement emphasizes
only the objective of research, that is, to obtain an answer to the research question. The most important
function of the design is the strategy to control variance. This point will be elaborated in the discussion
that follows.

EXPERIMENTAL DESIGN AS VARIANCE CONTROL

Variance control, as we shall notice throughout this book, is the central theme of experimental design.
Variance is a measure of the dispersion or spread of a set of scores. It describes the extent to which the
scores differ from each other. Variance and variation, though used synonymously, are not identical
terms. Variation is a more general term which includes variance as one of the statistical methods of
representing variation. A lot more is discussed about variance in chapter 2. Here we shall confine the
discussion and only emphasize its importance and methods of its control.
The problem of variance control has three aspects that deserve full attention. The three aspects of
variance are: systematic variance, extraneous variance and error variance. Main functions of experimental
design are to maximize the effect of systematic variance, control extraneous source of variance¹, and
minimize error variance. The major function of experimental design is to take care of the second function,
that is, control of extraneous source of variance. Here we shall consider this aspect in comparatively
greater detail. It will be seen later on that various designs are available for controlling the extraneous
source of variance in different situations, and with the help of these designs, an experimenter can draw
valid inference.

SYSTEMATIC VARIANCE
Systematic variance is the variability in the dependent measure due to the manipulation of the experimental
variable by the experimenter. An important task of the experimenter is to maximize this variance. This
objective is achieved by making the levels of the experimental variable/s as unlike as possible. Suppose,
an experimenter is interested in studying the effect of intensity of light on visual acuity. The experimenter
decides to study the effect by manipulating three levels of light intensity, i.e., 10 mL, 15 mL, and
20 mL. As the difference between any two levels of the experimental variable is not substantial, there is
little chance of separating its effect from the total variance. Thus, in order to maximize systematic
variance, it is desirable to make the experimental conditions (levels) as different as possible. In this
experiment, it would be appropriate, then, to modify the levels of light intensity to 10 mL, 20 mL, and
30 mL, so that the difference between any two levels is substantial.
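The gain from spreading the levels can be sketched numerically. Assuming, purely for illustration, a linear response in which mean visual acuity grows in proportion to intensity (the 0.02 coefficient and the helper function are hypothetical, not from the text), the variance of the treatment means, which is the systematic variance the design tries to maximize, is four times larger for the spread levels:

```python
from statistics import pvariance

# Assumed (purely illustrative) linear response: mean visual acuity
# grows in proportion to light intensity in millilamberts.
def between_group_variance(levels_mL):
    group_means = [0.02 * x for x in levels_mL]   # hypothetical group means
    return pvariance(group_means)                 # systematic variance

close = between_group_variance([10, 15, 20])   # levels close together
spread = between_group_variance([10, 20, 30])  # levels spread apart
```

Doubling the spacing between levels quadruples the variance of the treatment means, which is why well-separated levels give the systematic variance a better chance to stand out against error.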

EXTRANEOUS VARIANCE
In addition to the independent variable and the dependent variable, which are main concerns in any
experiment, extraneous variables are encountered in all experimental situations that can influence the
dependent variable.
¹Extraneous source of variance is contributed by all the variables other than the independent variable whose effect is
being studied in the experiment. These variables have often been called extraneous variables, irrelevant variables,
secondary variables, nuisance variables etc. In this book, all variables in the experimental situation other than the
independent variable have been termed as extraneous variables or secondary variables.

There are five basic procedures for controlling the extraneous source of variance. These procedures
are:
(i) Randomization
(ii) Elimination
(iii) Matching
(iv) Additional Independent Variable
(v) Statistical Control

Randomization
An important method of controlling extraneous variable/s is randomization. It is considered to be the
most effective way to control the variability due to all possible extraneous sources. If thorough
randomization has been achieved, then the treatment groups in the experiment could be considered
statistically equal in all possible ways. Randomization is a powerful method of controlling secondary
variables. In other words, it is a procedure for equating groups with respect to secondary variables.
According to Cochran and Cox (1957), “Randomization is somewhat analogous to insurance in that it is
a precaution against disturbances that may or may not occur and that may or may not be serious if they
do occur”.
Randomization in the experiment could mean random selection of the experimental units from the
larger population of interest to the experimenter, and/or random assignment of the experimental units or
subjects to the treatment conditions. Random assignment means that every experimental unit has an
equal chance of being placed in any of the treatment conditions or groups. However, in making groups
equal in the experiment, we may have random assignment with constraints. The assignment is random,
except for our limitations on number of subjects per group or equal number of males and females, and
so on. Random selection and random assignment are different procedures. It is possible to select a
random sample from a population, but then assignment of experimental units to groups may get biased.
Random assignment of subjects is critical to internal validity. If subjects are not assigned randomly,
confounding² may occur.
An experimental design that employs randomization as a method of controlling extraneous variable
is called randomized group design. For example, in the randomized group design (chapter 3), extraneous
source of variance due to individual differences is controlled by assigning subjects randomly to, say, k
treatment conditions in the experiment. According to McCall (1923), “Just as representativeness can be
secured by the method of chance, ... . so equivalence may be secured by chance, provided the number
of subjects to be used is sufficiently numerous”. This refers to achieving comparable groups through the
principle of chance. It may, however, be noted that randomization is employed even when subjects are
matched. In repeated measures design (within subject design), where each subject undergoes all the
treatment conditions, the order in which treatments are administered to the subjects is randomized
independently for each subject (see chapters 6, 11 and 12).
Fisher’s most fundamental contribution has been the concept of achieving pre-experimental equation
of groups through randomization. Equating of the effects through random assignment of subjects to
groups in the experiment is considered to be the overall best tool for controlling various sources of

²Term is used to describe an operation of variables in an experiment that confuses the interpretation of data. If the
independent variable is confounded with a secondary variable, the experimenter cannot separate the effects of the two
variables on the dependent measure.

extraneous variation at the same time. Perhaps, the most important discriminating feature of the
experimental design, as compared to the quasi-experimental design’, is the principle of randomization.
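Random assignment with a constraint, as described above, can be sketched in a few lines. The function name and the pool of 30 subjects below are illustrative, not from the text; the constraint imposed is equal group sizes:

```python
import random

def assign_equal_groups(subjects, k, seed=None):
    """Randomly assign subjects to k equal-sized treatment groups.

    Random assignment with a constraint: every group receives exactly
    len(subjects) // k units, so the number of subjects must be a
    multiple of k.  Chance alone decides who lands in which group.
    """
    if len(subjects) % k != 0:
        raise ValueError("number of subjects must be a multiple of k")
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)                      # the randomization step
    n = len(pool) // k
    return [pool[i * n:(i + 1) * n] for i in range(k)]

# 30 hypothetical subjects assigned to k = 3 treatment conditions.
groups = assign_equal_groups(range(1, 31), k=3, seed=42)
```

Because every ordering of the pool is equally likely, each subject has an equal chance of being placed in any of the k treatment conditions, which is what equates the groups on all extraneous variables in the long run.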

Elimination
Another procedure for controlling the unwanted extraneous variance is elimination of the variable by so
choosing the experimental units that they become homogeneous, as far as possible, on the variable to be
controlled. Suppose, the sex of a subject, an unwanted secondary variable, is found to influence the
dependent measure in an experiment. Therefore, the variable of sex (secondary source of variance) has
to be controlled. The experimenter may decide to take either all males or all females in the experiment,
and thus, control through elimination the variability due to the sex variable. The procedure explained in this
particular example is also referred to as the method of constancy. Let us take another example to illustrate
the control of unwanted extraneous variance by elimination. Suppose, intelligence of the subjects in the
group is found to influence the scores of the subjects on achievement test. Its potential effect on the
dependent variable can be controlled by selecting subjects of nearly uniform intelligence. Thus, we can
control the extraneous variable by eliminating the variable itself. However, with this procedure we lose
the power of generalization of results. If we select subjects from a restricted range, then we can discuss
the outcome of experiment within this restricted range, and not outside it. Elimination procedure for
controlling the extraneous source of variance is primarily a non-experimental design control procedure.
Elimination as a procedure has the effect of accentuating the between group variance through decrease
in the within group or error variance.

Matching
Another procedure, which is also a non-experimental design procedure, is control of extraneous source
of variance through matching. The procedure is to match subjects on that variable which is substantially
related to the dependent variable. That is, if the investigator finds that the variable of intelligence is
highly correlated with the dependent variable, it is better to control the variance through matching on
the variable of intelligence. Suppose, an investigator is interested in studying the efficacy of method of
instruction on the achievement scores of the 10th grade children. The methods to be evaluated are:
lecture, seminar, and discussion. Here the method of instruction is the experimental variable of interest
to the investigator. The investigator discovers that the achievement scores (DV) are positively correlated
with the intelligence of the subjects, that is, subjects with high intelligence tend to score high on the
achievement test and those who are low on intelligence score are low on the achievement test. Thus, the
variable of intelligence (not of direct interest to the investigator) needs to be controlled because it is a
source of variance that will influence the achievement scores. In this experiment, the extraneous variable
(intelligence) can be controlled by matching the subjects in the three groups on intelligence (concomitant
variable).
However, matching as a method of control limits the availability of subjects for the experiment. If
the experimenter decides to match subjects on two or three variables, he may not find enough subjects
for the experiment. Besides this, the method of matching biases the principle of randomization. Further,
matching the subjects on one variable may result in their mismatching on other variables.
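One common way to operationalize matching, sketched here with assumed data (the IQ scores and the helper `matched_groups` are hypothetical), is to rank subjects on the concomitant variable, cut them into blocks of k near-equal scores, and randomly assign the k members of each block to the k treatment conditions. Note that randomization is still employed within each matched block, as the text points out:

```python
import random

def matched_groups(subjects_iq, k, seed=None):
    """Form k treatment groups matched on a concomitant variable (here IQ).

    Subjects are sorted on the matching variable, cut into blocks of k
    near-equal scores, and within each block the k members are assigned
    to the k treatment conditions at random.
    """
    rng = random.Random(seed)
    ranked = sorted(subjects_iq, key=lambda s: s[1])   # (id, iq) pairs
    groups = [[] for _ in range(k)]
    for i in range(0, len(ranked) - len(ranked) % k, k):
        block = ranked[i:i + k]
        rng.shuffle(block)                 # randomize within the block
        for group, subject in zip(groups, block):
            group.append(subject)
    return groups

# Twelve hypothetical subjects with IQ scores, matched into three groups.
subjects = list(enumerate([95, 110, 102, 120, 99, 130,
                           105, 118, 91, 125, 108, 114]))
lecture, seminar, discussion = matched_groups(subjects, k=3, seed=1)
```

Each of the three groups ends up with one member from every IQ block, so the groups are comparable on intelligence before the methods of instruction are applied.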
Additional Independent Variable
Sometimes the experimenter may consider elimination inexpedient or impractical. He may not eliminate
the extraneous variable (of not direct interest to the experimenter) from the experiment and, thus, build
³In this book, as stated earlier, the subject of experimental design is treated in the “Fisher tradition”. The reader is
advised to go through the other aspect of designing, referred to in the introductory paragraph with the first meaning of
the term, experimental design. Campbell and Stanley (1963) have presented an excellent treatment of the subject
experimental and quasi-experimental designs in the non-statistical tradition.

it right into the design as a second independent variable. Suppose, an experimenter is interested in
studying the efficacy of methods of instruction on achievement scores. He does not want to eliminate
the variable of intelligence. He introduces intelligence as an attribute⁴ variable. He creates three groups
on the basis of intelligence scores of the subjects. The three groups consist of subjects of superior
intelligence, average intelligence and low intelligence as levels of the second variable (intelligence).
With the help of analysis of variance, the experimenter can take out the variance due to intelligence
(main effect of intelligence) from the total variance. The experimenter may decide to study the influence
of intelligence on achievement, and also the interaction between intelligence and method of instruction.
Thus, the secondary source of variance is controlled by introducing the secondary variable as an
independent variable in the experiment, and the experimenter gets the advantage of isolating the effect
of intelligence on achievement and the interaction effect as additional information.
The outcome of such a control procedure is a factorial design. In the above example, it will be a
3 × 3 factorial design. Here, the first variable or factor is intelligence (having three levels) and the
second variable is the method of instruction (three levels). The first factor or independent variable is a
classification variable or a control variable and the second one is the experimental variable, which was
directly manipulated by the experimenter.
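The variance extraction described above can be made concrete. The sketch below, with hypothetical achievement scores, computes the sums of squares that the analysis of variance separates in a balanced two-factor design: the main effect of each factor, their interaction, and the within-cell error. It is an illustration of the idea, not the book's worked method:

```python
from statistics import mean

def two_way_ss(cells):
    """Sum-of-squares decomposition for a balanced two-factor design.

    `cells` maps (level of factor A, level of factor B) to an
    equal-length list of scores.  Returns the pieces the analysis of
    variance separates out of the total variation: SS for A, SS for B,
    SS for the A x B interaction, and the within-cell (error) SS.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))              # scores per cell
    grand = mean(x for scores in cells.values() for x in scores)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)])
              for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)])
              for b in b_levels}
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((mean(cells[(a, b)]) - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_error = sum((x - mean(scores)) ** 2
                   for scores in cells.values() for x in scores)
    return ss_a, ss_b, ss_ab, ss_error

# Hypothetical achievement scores: intelligence level x method, 2 per cell.
cells = {
    ("low", "lecture"): [52, 55],  ("low", "seminar"): [58, 54],
    ("low", "discussion"): [60, 57],
    ("avg", "lecture"): [61, 64],  ("avg", "seminar"): [66, 63],
    ("avg", "discussion"): [70, 68],
    ("high", "lecture"): [72, 70], ("high", "seminar"): [75, 78],
    ("high", "discussion"): [82, 80],
}
ss_iq, ss_method, ss_inter, ss_err = two_way_ss(cells)
```

The four pieces add up exactly to the total sum of squares, which is the sense in which the main effect of intelligence is "taken out" of the total variance rather than left to inflate the error term.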

Statistical Control
In this approach, no attempt is made to restrain the influence of secondary variables. In this technique,
one or more concomitant secondary variables (covariates) are measured, and the dependent variable is
statistically adjusted to remove the effects of the uncontrolled sources of variation. Analysis of covariance
is one such technique. It is used to remove statistically the possible amount of variation in the dependent
variable due to the variation in the concomitant secondary variable. The method has been presented in
chapter 14.
The extraneous source of variance can also be controlled with the help of various experimental
designs. For example, we can make the extraneous variable constant by “blocking” the experimental
units as in the randomized complete block design (chapter 5). In this design, the subjects pretested on
the concomitant secondary variable are grouped in blocks on the basis of their scores on the concomitant
variable so that the subjects within blocks are relatively homogeneous. The purpose is to create between
block differences. Later on, the variance between the blocks is taken out from the total variance. Thus,
the variability due to the extraneous variable is statistically held constant.
Let us take up an example to illustrate this point. Suppose, an investigator finds that anxiety level of
the subjects, an extraneous variable of no direct consequence to the purposes of the experiment, influences
the dependent variable in the experiment. The experimenter can control this secondary source of variation
through elimination, that is, by selecting subjects of low anxiety level only. However, this procedure
will limit the generality of the results. So the experimenter may decide to apply statistical technique to
control the extraneous variable (anxiety level). He can administer an anxiety test (to measure concomitant
variable) to all the subjects (selected randomly for the experiment), and then create blocks on the basis
of their anxiety scores such that within the blocks the subjects are as homogeneous as possible, and the
differences between the blocks are high. In such a design, the variability due to the block differences is
taken out from the total variation. Thus, the statistical control technique can be utilized by the experimenter
to control the variance contributed by an extraneous variable.
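The core of the covariance adjustment can be sketched as follows. This is a simplified, single-covariate illustration (the function and data are hypothetical, and a pooled overall slope is used rather than the full within-group machinery of the analysis of covariance presented in chapter 14): the dependent scores are adjusted by the regression of the DV on the covariate, so variation predictable from the covariate is removed before treatments are compared:

```python
from statistics import mean

def covariate_adjusted(dv, covariate):
    """Remove the part of each DV score predictable from a covariate.

    A simplified single-covariate sketch of statistical control: fit
    the least-squares slope of DV on the covariate over all subjects,
    then subtract the covariate's contribution so that the adjusted
    scores are freed of that source of variation.
    """
    x_bar, y_bar = mean(covariate), mean(dv)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(covariate, dv))
    sxx = sum((x - x_bar) ** 2 for x in covariate)
    slope = sxy / sxx
    return [y - slope * (x - x_bar) for x, y in zip(covariate, dv)]

# Hypothetical anxiety scores (covariate) and task scores (DV).
adjusted = covariate_adjusted(dv=[12, 15, 14, 20, 22],
                              covariate=[30, 35, 33, 42, 45])
```

If the DV were perfectly predictable from the covariate, every adjusted score would collapse to the grand mean; in practice the adjustment shrinks only the portion of the variance the covariate accounts for.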

ERROR VARIANCE
The results of experiments are affected by extraneous variables which tend to mask the effect of
experimental variable. The term experimental error or error variance is used to refer to all such
uncontrolled sources of variation in experiments. Error variance results from random fluctuations in the
experiment. Experimental errors can be controlled either through experimental procedures or some
statistical procedure. If we are not able to effectively control the extraneous source of variation, then it
will form part of the error variance. By controlling the secondary source of variation, one can reduce the
experimental error.

⁴A characteristic that can be identified and measured.
Two main sources of error variance may be distinguished. First is the inherent variability in the
experimental units to which treatments are applied. The second source of error variance is lack of uniformity
in the physical conduct of the experiment, or in other words, lack of a standardized experimental technique. This
refers to the errors of measurement.
Individuals vary a lot in respect of intelligence, aptitude, interests, anxiety, etc. All these person-
related variables tend to inflate the experimental error. The other source of error variance is associated
with errors of measurement and could be due to unreliable measuring instrument, fatigue on the part of
experimental units, transient emotional states of the subject, inattention by subjects at some point of
time, and so on.
Statistical controls can be applied to minimize such error variance. For example, repeated measures
design can be used to minimize the experimental error. By this technique the variability due to individual
differences is taken out from the total variability, and thus, the error variance is reduced. Analysis of
covariance is also a technique to reduce the error variance. Further, error variance can be controlled by
increasing the reliability of measurements by giving clear and unambiguous instructions, and by using
a reliable measuring instrument, etc.
It has been pointed out earlier that an important function of experimental design is to maximize the
systematic variance, control extraneous source of variance, and minimize error variance. The systematic
variance or variance due to experimental variable is tested against the error variance (F test is discussed
at length in chapter 2), therefore, the error variance should be minimized to give systematic variance a
chance to show the significance. In the next chapter, we shall learn that for the variability due to the
experimental variable (between group variance) to be accurately evaluated for significant departure
from chance expectations, the denominator, that is, the error variance, should be an accurate measure of the
error.
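The ratio described here, systematic variance tested against error variance, can be sketched for a single-factor design. This is a minimal illustration with hypothetical scores; the F test itself is taken up in chapter 2:

```python
from statistics import mean

def f_ratio(groups):
    """F ratio for a single-factor design: the systematic (between-group)
    variance estimate divided by the error (within-group) estimate."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)        # systematic variance estimate
    ms_within = ss_within / (n_total - k)    # error variance estimate
    return ms_between / ms_within

# Hypothetical scores for three well-separated treatment groups.
F = f_ratio([[12, 14, 13], [18, 17, 19], [25, 23, 24]])
```

Anything that shrinks the within-group term, such as a reliable instrument or clear instructions, raises F directly, which is the computational sense in which minimizing error variance gives the systematic variance a chance to show significance.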

VALIDITY
Validity is an important concept in measurement, may it be in a testing situation or in an experimental
situation. In an experimental situation, validity is related to the control of secondary variables. The more the
secondary variation that slips into an investigation, the greater is the possibility that the independent variable
was not wholly responsible for dependent variable changes. Secondary, or extraneous, variation may
influence the dependent variable to an extent where the conclusions drawn become invalid.
In experimental situations, the validity problem is divided into two parts—internal and external
validity. Internal validity is the basic minimum without which the outcome of any experiment is
uninterpretable. That is, it is concerned with making certain that the independent variable manipulated
in the experiment was responsible for the variation in dependent variable. On the other hand, external
validity is concerned with generalizability. That is, to what populations, settings, treatment variables,
etc., can the effect (obtained in an experiment) be generalized. For detailed discussion on internal and
external validity, the reader may refer to Campbell and Stanley (1963).

TYPES OF EXPERIMENTAL DESIGNS

In behavioural sciences, specially in education and social research, it is not always possible to exercise
full control over the experimental situation. For example, the experimenter may not have the liberty of
assigning subjects randomly to the treatment groups, or the experimenter may not be in a position to
apply the independent variable whenever or to whomever he wishes. Collectively, such experimental
situations form part of quasi-experimental designs.

In another research situation, the objective may be to study intensively a particular individual rather
than a group of individuals. In the former case the researcher may be interested in answering questions
about a certain person or about a person’s specific behaviour. For example, behaviour of a particular
individual may be observed to note changes over a period of time to study the effect of a behaviour
modification technique. All such designs in which observations or measurements are made on an individual
subject are categorized as single case experimental designs, in contrast to the designs in which groups
of subjects are observed and the experimenter has full control over the experimental situation (as in
experimental design).
The experimental situations in which experimenter can manipulate the independent variable/s and
has liberty to assign subjects randomly to the treatment groups and control the extraneous variables are
designated as true experiments. The designs belonging to this category are called experimental designs
and in this book we are concerned with such designs only.
Understanding the nature of experimental design will be easier if we fully comprehend the nature
and meaning of quasi-experimental design and single case experimental design. Let us consider the
three types of designs—single case experimental design, quasi-experimental design, and experimental
design, in some detail.

Single Case Experimental Design


Single case experimental designs are an outgrowth of applied clinical research, specially in the area of
behaviour modification. In this type of design, repeated measurements are taken across time on one
particular individual to note the subtle changes in behaviour. Single subject or single case experimental
designs are an extension of the before-after design.
The single case experimental designs do not lend themselves to clear statistical analysis and
hypothesis testing has not been formalized as in the case of experimental designs. The experimenter
relies on the convincingness of the data. In these designs, the experimenter cannot control the order
effects. Moreover, the designs do not provide a good basis for generalization. However, single case
experimental designs provide us such information about human behaviour that is not always obtainable
in group designs. It is specially useful in clinical research where individual’s behaviour is of paramount
importance. Here, we shall not deal with the subject of single case experimental design in detail. For
detailed treatment of the subject, the reader may refer to Hersen and Barlow (1976).

Quasi-Experimental Design
All such experimental situations in which experimenter does not have full control over the assignment
of experimental units randomly to the treatment conditions, or the treatment cannot be manipulated, are
collectively called quasi-experimental designs. For example, in an ex-post-facto study the independent
variable has already occurred and hence, the experimenter studies the effect after the occurrence of the
variable. In another situation, three intact groups are available for the experiment but the experimenter

cannot assign the subjects to the treatment conditions; only treatments can be applied randomly to the
three intact groups. There are various such situations in which the experimenter does not have full
control over the situation. The plan of such experiments constitutes the quasi-experimental design.
Let us take an example from research to distinguish quasi-experimental design from experimental
design. First, we give an example of an experimental design. Suppose, an investigator is interested in
evaluating the efficacy of three methods of instruction (lecture, seminar and discussion) on the
achievement scores of the students of 10th grade. The experimenter draws a random sample of kn subjects
from a large population of 10th grade students. Then n subjects are assigned randomly to each of the k
(here k = 3) treatment conditions. Each of the n subjects in each of the k treatment groups is given
instructions with a method for one month. Thereafter, a common achievement test is administered to all
the subjects. The outcome of the experiment is evaluated statistically in accordance with the design of
the experiment (randomized group design or single factor experiment).
Let us consider an example of the quasi-experimental design. Suppose, for the aforesaid problem,
the experimenter cannot draw a random sample of 10th grade students as the schools will not permit the
experimenter to regroup the classes to provide instructions with the methods he is interested in. Ideal
conditions being unavailable, the experimenter finds three schools, following the same curriculum and
each providing instructions by one of the three methods. He administers an achievement test to the
subjects from the three schools and compares the outcome to evaluate the effect of each method of
instruction (ex-post-facto) on achievement scores.
It is observed from the example that the experimenter in the second condition did not have control
over the selection of subjects and also over the assignment of subjects to the treatments. Further, the
experimenter could not manipulate the independent variable (providing instructions with the three
methods) as the independent variable had already occurred. This experiment constitutes what we call
quasi-experiment.
Notice that the objective of the experiment was the same in both the designs. However, random
assignment of subjects to the treatment groups was not possible in the quasi-experiment and it was,
therefore, a handicap in controlling secondary variables. These investigations are as sound as experimental
investigations, but are less powerful in drawing causal relationships between independent and dependent
variables. The statistical tests applied to the data obtained from quasi-experimental designs are the same as
those applied to data in experimental designs. It is possible to perform even analysis of covariance on
data of such studies. However, the conclusions cannot be drawn with as much confidence as from the
studies employing experimental designs because some of the assumptions (e.g., randomization)
underlying the statistical tests are violated in the quasi-experiments. Besides this, the experimenter does
not have full control over the secondary variables.
Though quasi-experimental investigations have limitations, nevertheless these are advantageous in
certain respects. It is possible to seek answers to several kinds of problems about past situations and
those situations which cannot be handled by employing experimental design.
Experimental Design
Included in this category are all those designs in which a large number of experimental units or subjects
are studied, the subjects are assigned randomly to the treatment groups, the independent variable is
manipulated by the experimenter, and the experimenter has complete control over the scheduling of
the independent variable(s). Fisher's statistical innovations had a tremendous influence on the growth of this
subject; his special contribution was the problem of induction or inference. After the invention by
Fisher of the technique of analysis of variance, it became possible to compare groups and study
simultaneously the influence of more than one variable.
There are three types of experimental designs—between subjects design, within subjects design
and mixed design. In the between subjects design, each subject is observed only under one of the
several treatment conditions. In the within subjects design or repeated measures design, each subject is
observed under all the treatment conditions involved in the experiment. Finally, in the mixed design,
some factors are between subjects and some within subjects.

BASIC TERMINOLOGY IN EXPERIMENTAL DESIGN

As in any area of study, experimental designs also have some terminology which we shall be using in
the chapters that follow. It is essential to get acquainted with the terminology for clear understanding of
the designs and analysis given in the following chapters.

Factor
A factor is a variable that the experimenter defines and controls so that its effect can be evaluated in the
experiment. The term factor is used interchangeably with the terms treatment or experimental variable.
A factor is also referred to as an independent variable. A factor may be an experimental variable which
is manipulated by the experimenter. For example, the experimenter manipulates the intensity of
illuminance to study its effect on visual acuity. Here, illuminance is an experimental variable and is
referred to as treatment factor. Then, there are subject related variables which cannot be directly
manipulated by the experimenter but can be manipulated through selection. For example, if the
experimenter is interested in studying the effect of age on RT (response time), he may manipulate the
age by selecting subjects of different age levels. When the variable is manipulated through selection, it
is generally referred to as classification factor. Variables of this category allow the researcher to assess
the extent of differences between the subjects.
The independent variable or factor that is directly manipulated by the experimenter is also known
as E type of factor and one that is manipulated through selection is known as S type of factor. S type of
factors are generally included to classify the subjects for the purposes of control. At times, the experimenter
may be interested in evaluating the effect of S type of factors. For example, an experimenter may
classify subjects into low, medium and high economic groups to assess the extent of differences between
the subjects in the three groups. However, most of the time the classification factor is built into the
design, not because of intrinsic interest in the effects but because the results are likely to be difficult to
interpret if these factors are not included. These factors are defined by their function in the design and
may be either classification or treatment factors.
Factors are denoted by the capital letters A, B, C, D and so on. For example, in an experiment
having two factors, Factor A refers to the variable of intensity of light and Factor B to the variable of
size.
Levels
Each specific variation in a factor is called a level of that factor. For example, the factor light intensity
may consist of three levels: 10 mL, 20 mL and 30 mL. The experimenter decides the number of levels
of a factor to be included. The number of potential levels of a factor is generally very large. The choice
of levels to be included, and the manner of selecting them from among the large number available to
the experimenter, is a major decision on the part of the experimenter. Some factors may have an infinite
number of potential levels (e.g., light intensity) and others may have few (e.g., sex of the subject).
In case the experimenter decides to select p levels from the potential P levels on the basis of some
systematic, non-random procedure, the factor is considered a fixed factor. In contrast to this systematic
selection procedure, when the experimenter decides to include p levels from the potential P levels
through a random procedure, the factor is considered a random factor. A detailed discussion of the
manner of selection of levels of factors and the statistical models involved is presented in chapter 3.
The potential levels of a factor are designated by the corresponding lower case (small) letter of the
factor symbol with a subscript. For example, the potential levels of factor A will be designated by the
symbols a1, a2, a3, ..., aP; similarly, the potential levels of factor B will be designated by the symbols
b1, b2, b3, ..., bQ.
Dimensions
The dimensions of a factorial experiment are indicated by the number of levels of each factor and the
number of factors. For example, a three-factor experiment in which the first factor has p levels, the
second q levels and the third r levels will be designated a p × q × r factorial experiment. This is the
general form, and the dimensions in a specific case may assume any value for p, q, and r. A factorial
experiment, for example, in which there are three factors, the first having 2 levels, the second 4 levels
and the third 4 levels, is called a 2 × 4 × 4 (read as two by four by four) factorial experiment. The
dimension of this experiment is 2 × 4 × 4.
Treatment Combinations
A treatment is an independent variable in the experiment. In this text, the term treatment will be used to
refer to a particular set of experimental conditions. For example, in a 2 × 4 factorial experiment, the
subjects are assigned to 8 treatments. The terms treatment and treatment combination will be used
interchangeably. In a single-factor experiment, the levels of the factor constitute the treatments. Suppose
in an experiment the investigator is interested in studying the effect of levels of illumination on visual
acuity and decides to have three levels of illuminance. There will then be three treatments and, in a
randomized group design, each of the n subjects will be assigned to one of the three treatments
randomly. Let us take another example to present a case of treatment combinations. In a 2 × 3 × 4
factorial experiment, there will be a total of 24 treatment combinations and each of the n subjects will
be assigned randomly to one of the 24 treatment combinations.
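The full set of treatment combinations is simply the Cartesian product of the factor levels. A minimal sketch for the 2 × 3 × 4 case (the level labels a1, b1, c1, ... are ours, chosen for illustration):

```python
from itertools import product

# Hypothetical level labels for a 2 x 3 x 4 factorial experiment
factor_a = ["a1", "a2"]
factor_b = ["b1", "b2", "b3"]
factor_c = ["c1", "c2", "c3", "c4"]

# Every treatment combination pairs one level from each factor
treatments = list(product(factor_a, factor_b, factor_c))
print(len(treatments))    # 24 treatment combinations
print(treatments[0])      # ('a1', 'b1', 'c1')
```

In a completely randomized design, each subject would then be assigned at random to one of these 24 tuples.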

Replication
The term replication refers to an independent repetition of the experiment under as nearly identical
conditions as possible, the experimental units in the repetitions being independent samples from the
population being studied. It may be pointed out that an experiment with n observations per cell is to be
distinguished from an experiment with n replications with one observation per cell. The total number of
observations per treatment in the two experiments is the same, but the manner in which the two
experiments are conducted differs. For example, a 2 × 2 × 2 factorial experiment having 8 treatment
combinations with 5 observations per treatment is different from an experiment with 5 replications with
one observation per cell. The total number of observations per treatment is the same, that is, five. The
purpose of a replicated experiment is to maintain more uniform conditions within each cell of the
experiment to eliminate possible extraneous sources of variation between cells. The partitioning of total
variation and df (degrees of freedom) in the replicated and non-replicated experiments will differ (see
chapter 7). It is quite important that the number of observations per cell for any single replication
should be the maximum so as to ensure uniform conditions within all cells of the experiment.

Main Effects
The difference in performance from one level to another for a particular factor, averaged over the other
factors, is called the main effect. In a factorial experiment, the mean squares (MS) for the levels of factors
are called the main effects of the factors. Let us consider the example of a 2 × 3 × 4 factorial experiment
in which factor A has two levels, factor B three and factor C four. The A sum of squares corresponds to
a comparison between levels a1 and a2, the B sum of squares to a comparison between levels b1, b2,
and b3, and the C sum of squares to a comparison between levels c1, c2, c3, and c4. The difference in
performance between levels a1 and a2, averaged over the levels of factors B and C, is called the main effect
of A. Similarly, the difference in performance among levels b1, b2, and b3, averaged over the levels of
factors A and C, is called the main effect of B, and so on.
Graphically, the main effect is the curve joining the points representing the mean performance on
the levels of a particular factor averaged over the other factors in the experiment. A significant main
effect will have a significant slope or, in other words, the curve will not be parallel to the X-axis.

Simple Effects
In a factorial experiment, the effect of a treatment on one factor at a given level of the other factor is
called the simple effect. Let us consider the example of a 2 × 2 factorial experiment in which factors A
and B have two levels each. The effect of treatment on the two levels of A under each of the two levels of B
is called the simple effect of A. Similarly, the effect of treatment on the two levels of B under each of the two
levels of A is called the simple effect of B.
Graphically, the simple effects are presented in the same manner as the two-factor interaction. For
example, the simple effect of A, graphically, is the AB interaction profile where the levels of factor A are
marked on the X-axis, and the two curves, represented by the levels b1 and b2, are the simple effects at
each of the two levels. Similarly, the simple effect of B is the AB interaction profile where the levels of
factor B are marked on the X-axis, and the two curves, represented by a1 and a2, are the simple effects at each
level.
Interaction Effect
Factorial designs are important because they allow the investigator to study the effects of more than one
variable at a time. Apart from the advantage of efficiency of the factorial design over the single-factor
experiment, it at the same time permits the investigator to evaluate the interaction among the independent
variables that are present. Interaction is an important concept in research. It can be evaluated in all
experiments having two or more independent variables.
Interaction between two variables is said to occur when a change in the values of one variable alters
the effects of the other. However, it may be noted that the presence of interaction, from a statistician's
point of view, destroys the additivity of the main effects. That is, what is added by one factor at the first
level of the other is different from what is added at another level. Absence of interaction, on the other
hand, means that the additive property applies to the main effects, that is, they are independent.
Let us explain the concept of interaction with the help of an example. An experimenter is interested
in evaluating the effect of two study hours (i.e., 4 hrs. and 8 hrs.) on the achievement scores of 10th
grade students. In order to control the influence of the secondary variable of intelligence, the experimenter
includes intelligence as a second independent variable in the experiment. Two groups of subjects are
included in the experiment, one of high intelligence and the other of low intelligence. The students are
assigned randomly to the two levels of the experimental variable (study hours). It is, thus, a 2² or 2 × 2
factorial experiment. The mean scores of the four groups are summarized in a 2 × 2 contingency table
below:

                             Study hours
                           4 hrs    8 hrs
Intelligence      High      7.1      9.8
level             Low       5.0      6.5
Difference for high intelligence group: 9.8 − 7.1 = 2.7
Difference for low intelligence group: 6.5 − 5.0 = 1.5
Interaction (Difference) = 1.2
Alternatively,

Difference for 8 hours group: 9.8 − 6.5 = 3.3
Difference for 4 hours group: 7.1 − 5.0 = 2.1
Interaction (Difference) = 1.2
Interaction is indicated by the failure of the differences to be equal. If the differences are equal, it
means, there is no interaction. Interaction is measured by the difference between the two differences. It
can be observed from the table that the increase in the mean scores of the high intelligence group under
8 hours is much more than the corresponding increase in the mean scores of the low intelligence group.
That is, what is added by the factor of intelligence at the level of 4 hours is different from what is added
at the level of 8 hours. Clearly, the two factors have a combined effect which is different from the
effects when the two are applied separately.
It may be noted that in the presence of significant interaction effect, the main effect should be
interpreted with caution. It is observed quite often that, despite questionable meaning, F values for the
main effects are reported when interaction is present. This practice is open to criticism, but is almost
unavoidable. However, one must be cautious in interpreting the outcome which should be in relation to
the interaction present.
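The arithmetic above can be reproduced mechanically from the four cell means; a small sketch (the dictionary layout is our own choice, the values are those of the example):

```python
# Cell means from the 2 x 2 example (rows: intelligence, columns: study hours)
means = {
    ("high", "4 hrs"): 7.1, ("high", "8 hrs"): 9.8,
    ("low", "4 hrs"): 5.0, ("low", "8 hrs"): 6.5,
}

# Simple effect of study hours at each level of intelligence
diff_high = means[("high", "8 hrs")] - means[("high", "4 hrs")]
diff_low = means[("low", "8 hrs")] - means[("low", "4 hrs")]

# Interaction is the difference between the two differences
interaction = diff_high - diff_low
print(round(diff_high, 1), round(diff_low, 1), round(interaction, 1))  # 2.7 1.5 1.2
```

Computing the differences the alternative way (between intelligence groups within each study-hour condition) gives 3.3 − 2.1, the same interaction of 1.2.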

BASIC TERMINOLOGY IN STATISTICAL ANALYSIS
We obtain a sample, conduct an experiment in accordance with the design of the experiment and finally
test the hypotheses, i.e., draw inferences beyond the data. Once the data have been obtained, the next
important problem for researcher is how to evaluate objectively the evidence provided by a set of
observations. In the tradition of experimental design presented in this book, the method of collecting
data, layout for the set of observations to be made, and statistical analysis, all are decided in advance of
the actual conduct of the experiment. Once a particular design is selected, all aspects of the experiment
from the initial to the final stage are taken care of.

The statistical tests are useful tools to draw conclusions from evidence provided by samples. It is
expected that the student or researcher using this book will have knowledge of elementary statistics.
However, for completeness of the volume, some of the statistical concepts occurring in the following
chapters are briefly recapitulated here.

Null Hypothesis (H0)


The first important step in the decision-making procedure is to state the null hypothesis (H0). The null
hypothesis is a hypothesis of no differences. It is a statistical hypothesis usually formulated for the
express purpose of being rejected. If H0 is rejected, we may accept the alternate hypothesis (H1).
Suppose a researcher is interested in investigating the effect of certain treatments on two groups of
subjects. On the basis of some theory, he predicts that the two groups will differ in their mean performance.
This prediction would be the research hypothesis (H1) and confirmation of H1 will lend support to the
theory from which it was derived. To test the research hypothesis, we state it in operational form as
the alternate hypothesis (H1), e.g., µ1 ≠ µ2, that is, the mean of the first group is not equal to the mean of the
second group (two-tailed). The null hypothesis (H0) would be µ1 = µ2, i.e., the means of the two groups
are equal. In other words, the null hypothesis states that the performance of the treatment groups is so
similar that the groups must belong to the same population, or implies that the experimental manipulation
had no effect on the groups.
After formulating the null hypothesis, a suitable statistical test is applied. If the test yields a value
whose associated probability of occurrence under H0 is equal to or less than α (the level of significance set
in advance of the collection of data), we decide to reject H0 in favour of H1. On the other hand, if the test
yields a value whose associated probability of occurrence under H0 is greater than α, we decide to
accept the null hypothesis, which would mean that the two groups were random samples from the same
population.
It may be noted that on the basis of experimental evidence a statistical test can lead to the rejection
of the null hypothesis but not to its acceptance. The null hypothesis can never be proved by any finite
amount of experimentation (Fisher, 1949). In an experiment, given a null hypothesis with no specific
alternate hypothesis (H1), the experimenter either rejects or does not reject the null hypothesis. The
term "does not reject" does not mean that the null hypothesis is accepted. It is, therefore, important that
the null hypothesis and alternate hypothesis be formulated precisely for each experiment.
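As an illustration of this decision procedure, the sketch below runs an independent-samples t test of H0: µ1 = µ2 against a two-tailed H1. The two groups of scores are hypothetical, and the critical value 2.101 (α = .05, df = 18) is taken from standard t tables:

```python
import statistics

def two_sample_t(x, y):
    """Independent-samples t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) \
          / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1/nx + 1/ny)) ** 0.5

# Hypothetical achievement scores for two treatment groups of 10 subjects each
group_1 = [3, 4, 6, 7, 8, 11, 12, 13, 16, 20]
group_2 = [8, 10, 12, 14, 15, 17, 19, 20, 22, 23]

t = two_sample_t(group_1, group_2)
CRITICAL_T = 2.101   # two-tailed critical t for alpha = .05, df = 18 (t tables)
decision = "reject H0" if abs(t) > CRITICAL_T else "do not reject H0"
print(round(t, 2), decision)   # -2.56 reject H0
```

Since |t| exceeds the critical value, the observed difference would occur with probability less than α under H0, and H0 is rejected in favour of H1.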

Level of Significance
Level of significance is our own decision-making criterion. In advance of the data collection, for the
requirement of objectivity, we specify the probability of rejecting the null hypothesis; this is called
the significance level of the test and is indicated by α. Conventionally, α = .05 and .01 have been chosen
as the levels of significance. We reject a null hypothesis whenever the outcome of the experiment has a
probability equal to or less than .05. The frequent use of the .05 and .01 levels of significance is a matter of
convention having little scientific basis.
In contemporary statistical decision theory, this convention of adhering rigidly to an arbitrary .05
level has been rejected. It is not uncommon to report the probability value even when the probability
associated with the outcome is greater than the conventional level .05. The reader can apply his own
judgement in making his decision on the basis of the reported probability level. In fact, the choice of
level of significance should be determined by the nature of the problem for which we seek an answer
and the consequences of the findings. For example, in medical research where the efficacy of a particular
medicine is being evaluated, the .05 level may be considered a lenient standard. Perhaps a stringent level
of significance, say .001, is more appropriate in this situation. However, if we select a very small value
of α, we decrease the probability of rejecting the null hypothesis when it is in fact false. The choice of
level of significance is related to the two types of errors in arriving at a decision about H0.

Type | and Type Il Error


In making tests of significance, we are liable to err in drawing an inference concerning the
hypothesis being tested. There are two types of errors which may be made while arriving at a decision
about the null hypothesis. The first, the Type I error, is to reject H0 when in fact it is true. The second, the
Type II error, is to accept H0 when in fact it is false.
The probability of committing a Type I error is associated with the level of significance, that is, α.
The larger the α, the greater the likelihood of H0 being rejected falsely. In other words, if the level of
significance for rejecting H0 is high, we are more likely to commit a Type I error.
The Type II error is usually represented by β. When H0 is false and we decide, on the basis of a test
of significance, not to reject H0, we commit a Type II error.
p of Type I error = α
p of Type II error = β
The probability of making a Type I error is controlled by the level of significance (α), which is at the
discretion of the experimenter. For the requirement of objectivity, the specific value of α should be
specified before beginning data collection.

Power of the Test


We have just noted that there is an inverse relation between the likelihoods of making the two types
of errors. That is, a decrease in α will increase β for a given sample of N elements. If we wish to reduce
both Type I and Type II errors, we must increase N.
Various statistical tests offer the possibility of different balances between the two types of errors.
For achieving this balance, the notion of the power function of a statistical test is relevant.
The power of a test is defined as the probability of rejecting the null hypothesis (H0) when it is in
fact false and, thus, must be rejected. That is,
Power = 1 − probability of Type II error = 1 − β
It may be noted that the power of a test increases with an increase in the size of the sample (N).
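The claim that power grows with N can be checked by simulation. The sketch below is our own construction, not the text's: it assumes a normal population with known σ = 1 and a two-tailed z test of H0: µ = 0 at α = .05, and estimates power by counting rejections when the true mean is actually 0.5 (so H0 is false):

```python
import random
import statistics

def estimated_power(n, true_mean, sims=2000, crit_z=1.96):
    """Monte Carlo power of a two-tailed z test of H0: mu = 0 at alpha = .05.

    Assumes a normal population with known sigma = 1, so z = x_bar * sqrt(n).
    """
    rng = random.Random(7)          # fixed seed for a reproducible estimate
    rejections = 0
    for _ in range(sims):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = statistics.mean(sample) * n ** 0.5
        if abs(z) > crit_z:
            rejections += 1         # H0 is false here, so each rejection is correct
    return rejections / sims

power_small = estimated_power(10, 0.5)
power_large = estimated_power(40, 0.5)
print(power_small < power_large)    # True: power increases with N
```

With these settings the estimated power roughly doubles in moving from N = 10 to N = 40, illustrating why increasing N is the way to reduce both types of error at once.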

Region of Rejection
Region of rejection of H0 is defined with reference to the sampling distribution. The decision rules
specify that H0 be rejected if an observed statistic has any value in the region of rejection. The probability
associated with any value in the region of rejection is equal to or less than α.

Fig. 1.1 (a) One-tailed test region of rejection, α = .05. (b) Two-tailed test region of rejection, α = .05.

The location of the region of rejection is affected by the nature of the experimental hypothesis (H1). If H1
predicts the direction of the difference, then a one-tailed test is applied. However, if the direction of the
difference is not indicated by H1, then a two-tailed test is applied. It may be noted that one-tailed and
two-tailed tests differ only in the location of the region of rejection; the size of the region is not affected.
The one-tailed and two-tailed regions are presented in Figs. 1.1a and 1.1b respectively.
It can be seen in Fig. 1.1a that the region of rejection is entirely at one end or tail of the sampling
distribution, 5 per cent of the entire area being under the curve. In a two-tailed test (Fig. 1.1b), the
region of rejection is located at both ends of the sampling distribution, 2.5 per cent of the total area on each
side of the distribution.
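The sizes of the two regions can be checked numerically from the normal curve; a sketch using the error-function form of the normal CDF (the function names are ours):

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def upper_tail(z):
    """Area under the normal curve beyond z in the upper tail."""
    return 1 - normal_cdf(z)

# One-tailed test, alpha = .05: the whole 5 per cent sits in one tail
print(round(upper_tail(1.645), 3))        # 0.05
# Two-tailed test, alpha = .05: 2.5 per cent in each of the two tails
print(round(2 * upper_tail(1.96), 3))     # 0.05
```

Both regions contain 5 per cent of the total area; only their location differs, exactly as stated above.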
Analysis of Variance: The Foundation of Experimental Design

ANALYSIS OF VARIANCE AND t TEST 19


THE CONCEPT OF VARIANCE 20
Numerical Example 20
Numerical Example 24
ONE-WAY ANALYSIS OF VARIANCE 26
Numerical Example 27
Partitioning of Total variation and df 27
Computation 28
Test of Significance 30
Strength of Association 31
General Comments 31
Summary of Steps 32

TWO-WAY ANALYSIS OF VARIANCE
Numerical Example 33
Partitioning of Total Variation and df
Computation 35
Geometric Representation of Interaction 39
Test of Significance 40
ASSUMPTIONS UNDERLYING ANALYSIS OF VARIANCE 40
Normality 41
Homogeneity of Variance 41
Independence 42
General Comments on Assumptions 43
TRANSFORMATIONS 43
Square-Root Transformation 44
Logarithmic Transformation 44
Reciprocal Transformation 44
Arcsin Transformation 45
General Comments on Transformations 45
Linear Transformations 45
Rie WB @ ExeerimMentat DESIGN IN BEHAVIOURAL RESEARCH

ANALYSIS OF VARIANCE BY RANKS 47

The Kruskal-Wallis One-Way Analysis of Variance 47


Numerical Example 48
Computation 48
Test of Significance 49
Tied Observations 49
The Friedman Two-Way Analysis of Variance 50
Numerical Example 51
Computation 52
Test of Significance 52
Tied Observations 52
Small Samples 52
The analysis of variance was developed by Sir Ronald A. Fisher, a renowned British statistician, and
the name F test was given to it by Snedecor in Fisher's honour. The variance ratios, designated as F,
were tabulated by Snedecor in 1946. This device has made a tremendous contribution to the designing of
experiments and their statistical analysis.
The analysis of variance deals with variances rather than with standard deviations and standard
errors. The technique is useful in testing differences between two or more means. Its special merit lies
in testing differences between all of the means at the same time. The analysis of variance is a powerful
aid to the researcher. It helps him in designing studies efficiently, and enables him to take account of the
interacting variables. It also aids in testing hypotheses. It provides the basis for nearly all the tests of
significance in the designs which we shall consider in the following chapters. The working of the
analysis of variance and its reasoning, therefore, should be thoroughly grasped. In this chapter, we will
try to present some of the concepts and the working of the analysis of variance, which is the foundation
of experimental design. It will help in understanding the principles of designing experiments and the
statistical analysis.

ANALYSIS OF VARIANCE AND t TEST


The t test of significance is adequate when we want to determine whether or not two means differ
significantly from each other. It is employed in the case of experiments involving only two groups. However,
for various reasons, the t test is not adequate for comparisons involving more than two means. The most
serious objection to the use of t, when more than two comparisons are to be made, is the large number
of computations involved. For example, for three groups, 3 (i.e., (3 × 2)/2) comparisons or combinations
taken two at a time are required to be made; for 5 groups, 10 ((5 × 4)/2); and for 10 groups, 45
((10 × 9)/2) comparisons are needed.* Thus, as the number of groups increases, the number of comparisons
to be made increases rapidly; that is, the computation work increases disproportionately. Further, if a few
comparisons turn out to be significant, it will be difficult to interpret the results. Let us take up an
example to elucidate this point.

* For k groups, the number of comparisons will be k(k − 1)/2.
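The count in the footnote is just the number of pairs of means, and can be verified directly (a trivial sketch):

```python
from math import comb

def n_comparisons(k):
    """Number of two-at-a-time comparisons among k group means: k(k - 1)/2."""
    return k * (k - 1) // 2

for k in (3, 5, 10):
    assert n_comparisons(k) == comb(k, 2)   # agrees with the binomial count C(k, 2)
print(n_comparisons(3), n_comparisons(5), n_comparisons(10))   # 3 10 45
```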

Suppose in an experiment the investigator is interested in studying the effect of 10 treatments.
Consequently, 45 possible t tests will have to be made for the 10 treatment conditions. That is, first test
H0: µ1 = µ2, then second test H0: µ1 = µ3, and so on, till we perform all the 45 t tests for the difference
between every pair of means. Out of the 45 t tests, we expect to find an average of 2 or 3 t's (.05 × 45)
significant at the 5 p.c. level by chance alone. Suppose we find that 5 differences are significant at the .05
level. When the t test is being applied, there is no way to know whether these differences are true differences
or within chance expectation. The more statistical tests we perform, for example several t tests, the
more likely it is that some more differences will be statistically significant purely by chance. Thus, the
t test is not an adequate procedure to simultaneously evaluate three or more means. We would like the
probability of Type I error in the experiment to be .05 or less.
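The inflation described above can be demonstrated by simulation. In the sketch below (our construction, not the text's), ten groups of n = 10 are repeatedly drawn from the same normal population, so every null hypothesis is true; yet the chance that at least one of the 45 pairwise t tests reaches the .05 critical value (2.101 for df = 18) turns out to be far larger than .05:

```python
import random
import statistics
from itertools import combinations

def pooled_t(x, y):
    """Independent-samples t with pooled variance (groups of equal size n)."""
    n = len(x)
    sp2 = (statistics.variance(x) + statistics.variance(y)) / 2
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * 2 / n) ** 0.5

rng = random.Random(3)
CRIT_T = 2.101           # two-tailed .05 critical value for df = 18
sims = 400
at_least_one = 0
for _ in range(sims):
    # Ten groups of n = 10, all drawn from the SAME population, so H0 is true
    groups = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(10)]
    if any(abs(pooled_t(a, b)) > CRIT_T for a, b in combinations(groups, 2)):
        at_least_one += 1

familywise = at_least_one / sims
print(familywise)   # well above .05, although each single test used alpha = .05
```

This estimated probability of at least one spurious "significant" difference is the experiment-wise Type I error rate that the analysis of variance is designed to keep at .05.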
hand, permits us to evaluate three or more means
The analysis of variance or the F test, on the other
involving more than two means, the equality breaks
at one time. In making comparisons in experiments test for
always be preferred. The F is also an adequate
down. Hence, the analysis of variance should
(df=1), JF = tor F = &. Therefore, in
determining the significance of two means. For two groups
two tests
it is a matter of choice which one of the
case of two treatment conditions or two groups, variance
me. This means that the one -way analysis of
(t or F) is used. Both yield exactly the same outco two means.
eably in comparing the differences between
and the two-tailed ¢ test can be used interchang than the f test.
situation, F test is easier to perform
However, it will be found that in the same
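The equality F = t² for two groups can be checked with a hand-rolled one-way analysis of variance and a pooled t on the same data (a sketch on hypothetical scores):

```python
import statistics

def one_way_f(groups):
    """F ratio for a one-way analysis of variance."""
    scores = [x for g in groups for x in g]
    grand = statistics.mean(scores)
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = len(groups) - 1, len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def pooled_t(x, y):
    """Independent-samples t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) \
          / (nx + ny - 2)
    return (statistics.mean(x) - statistics.mean(y)) / (sp2 * (1/nx + 1/ny)) ** 0.5

a = [3, 4, 6, 7, 8, 11, 12, 13, 16, 20]    # two hypothetical groups of scores
b = [8, 10, 12, 14, 15, 17, 19, 20, 22, 23]
F, t = one_way_f([a, b]), pooled_t(a, b)
print(round(F, 4), round(t ** 2, 4))       # the two values agree: F = t^2
```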

THE CONCEPT OF VARIANCE


Variance is the very foundation of experimentation and is an extremely useful concept. Let us, therefore,
try to understand its meaning and uses before handling simple analysis of variance.
Variance is a measure of the dispersion or spread of a set of scores. It describes the extent to which
the scores differ from each other. The square root of the variance is called the standard deviation (s).
However, because of its mathematical properties, the variance is more useful than the standard deviation in
research. Variance and variation, though used synonymously, are not identical terms. Variance is only
one of the several statistical methods of representing variation. Variation is, thus, a more general term
which includes variance as one of the methods of representing variation.

NUMERICAL EXAMPLE
The concept of variance will be explained with the help of a numerical example. Suppose an investigator
is interested in evaluating two different methods of instruction for 5th grade children. Two independent
groups of 10 children each are randomly selected from a large number of children in a 5th grade class.
The distribution of their achievement scores before administering the treatment (methods of instruction)
is as given in Table 2.1. The scores are arranged in ascending order.

Table 2.1 The Distribution of Scores in the two Subgroups (Before Treatment)

Subject Number    XA    XB
 1                 1     2
 2                 2     3
 3                 4     5
 4                 5     7
 5                 7     9
 6                 9    10
 7                10    12
 8                12    13
 9                14    14
10                16    15
Sum               80    90

XA and XB represent scores of subgroups A and B respectively.


In Figure 2.1, the distribution of the 10 scores in each of the two subgroups has been presented. Their
means are X̄A and X̄B and their variances sA² and sB² respectively.

Fig. 2.1 Distribution of the scores in the two subgroups before the treatment
(subgroup A: X̄A = 8.0, sA² = 25.78; subgroup B: X̄B = 9.0, sB² = 21.33).

Careful observation of Figure 2.1 reveals:
(i) That the scores vary about their subgroup means. Further, the variability (sA² = 25.78 and
sB² = 21.33) or spread of scores about the respective means is similar, within the limits of
chance variation.
(ii) That the subgroup means (X̄A = 8.0 and X̄B = 9.0) are similar, within the limits of chance
variation.

The above two observations in the two samples are in accordance with the expectations of random
sampling. That is, the scores in each subgroup vary about the respective means to a similar extent and
further, the subgroup means are also similar but not identical, as the two samples were selected randomly
from the same population.
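The statistics quoted above can be reproduced directly from Table 2.1; note that the variances 25.78 and 21.33 use the sample (n − 1) denominator, which is also what Python's statistics.variance computes:

```python
import statistics

x_a = [1, 2, 4, 5, 7, 9, 10, 12, 14, 16]    # subgroup A, Table 2.1
x_b = [2, 3, 5, 7, 9, 10, 12, 13, 14, 15]   # subgroup B, Table 2.1

mean_a, mean_b = statistics.mean(x_a), statistics.mean(x_b)
# statistics.variance divides the sum of squared deviations by n - 1
var_a, var_b = statistics.variance(x_a), statistics.variance(x_b)

print(mean_a, round(var_a, 2))   # mean 8, variance 25.78
print(mean_b, round(var_b, 2))   # mean 9, variance 21.33
```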
The investigator, then, administers the treatment to the two subgroups; treatments being assigned
randomly. That is, the two subgroups are given two different methods of instruction. After a period of
training, an achievement test is given to both the subgroups. The distribution of scores of the achievement
test, after the application of treatment is presented in Table 2.2. The scores are arranged in the ascending
order.
Table 2.2 Distribution of Scores in the two Subgroups (After Treatment)

Subject Number    XA    XB
 1                 3     8
 2                 4    10
 3                 6    12
 4                 7    14
 5                 8    15
 6                11    17
 7                12    19
 8                13    20
 9                16    22
10                20    23
Sum              100   160

XA and XB represent scores of subgroups A and B respectively.


The distribution of scores in the two subgroups has been presented in Fig. 2.2 with their means
(X̄A and X̄B) and variances (sA² and sB²).


The following observations can be made from Fig. 2.2, in comparison to Fig. 2.1:
(i) Within each subgroup, the scores vary about their subgroup means (X̄A = 10.0 and X̄B = 16.0).
However, the variability in each subgroup about the respective means is not much different
(sA² = 29.33 and sB² = 25.77), within the limits of chance variation.
(ii) The subgroup means after treatment (Fig. 2.2) have drifted apart in comparison to the closeness of the
means of the two subgroups observed before treatment (Fig. 2.1).
The comparative analysis, before and after the treatment, in Figs. 2.1 and 2.2, respectively is
particularly important for understanding the analysis of variance and the reasoning behind experimental
4
design. Let us examine.

Analysis of Variance: The Foundation of Experimental Design

[Fig. 2.2: scores of group A (X̄A = 10.0, s²A = 29.33) and group B (X̄B = 16.0, s²B = 25.77) plotted along a 0–24 score scale.]

Fig. 2.2 Distribution of scores in the two subgroups after the treatment.

The variability of subgroup means is of special importance in the analysis of variance, as it reflects the variation attributable to the treatment effect as well as other uncontrolled sources of variation. Let us again refer to Fig. 2.1. We find the two means are similar, within the limits of chance variation. However, in Fig. 2.2, we can observe the effect of treatment on the subgroup means; these have drifted apart. This shows that the treatment has caused variation in the subgroup means. This is called between group variation.
We have just seen that the treatment caused the subgroup means to drift apart. We have also observed
(Figs. 2.1 and 2.2) that the scores within each subgroup vary about their respective means (observe the
scattering of scores of the two groups around the arrow point, marking the subgroup means). This
variability is also of particular importance in the analysis of variance. The pooled variability of scores
about their respective subgroup means is called within group variation or “error”. It is free from the
influence of differential treatment.
Thus, we have been able to identify two sources of variation in the scores—one which reflects the effect of treatment is called "between groups" variation, and the one that reflects the variability within the subgroups is called "within groups" or "error" variation. An increase in the difference among the means results in an increase in the variance of means, and it is this variance that we evaluate relative to the error variance. The procedure adopted for this is called the analysis of variance. If the variability between the groups is considerably greater than the error variability, this is indicative of the treatment effect.
Perhaps the most general way of classifying variability is as systematic variation and unsystematic variation. Systematic variation causes the scores to lean more in one direction than another. We observed in Fig. 2.2 that the application of treatment resulted in systematic variation in the means of the two subgroups. The variable manipulated by the experimenter is associated with systematic variation. Unsystematic variation, on the other hand, is the fluctuation in the scores due to the operation of chance and other uncontrolled sources of variation in the experiment. Random assignment of subjects to different groups helps in reducing the unsystematic variation or error.
The most important function of experimental design is to maximize the systematic variation, control the extraneous sources of variation, and minimize the unsystematic or error variation. We will see in the later chapters that this objective is achieved in different ways in different designs.
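The between group and within group variation just described can be checked numerically. The following sketch (in Python, which is not part of the text but convenient for verification) recomputes the subgroup means and variances of Table 2.2:

```python
# Scores of subgroups A and B after treatment (Table 2.2).
group_a = [3, 4, 6, 7, 8, 11, 12, 13, 16, 20]
group_b = [8, 10, 12, 14, 15, 17, 19, 20, 22, 23]

def mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs):
    # Sum of squared deviations about the mean, divided by df = n - 1.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# The means have drifted apart (between group variation) ...
print(mean(group_a), mean(group_b))        # 10.0 16.0
# ... while the spread inside each group (within group variation) stays similar.
print(round(sample_variance(group_a), 2))  # 29.33
print(round(sample_variance(group_b), 2))  # 25.78
```

Note that s²B = 232/9 = 25.78 to two decimals; the 25.77 printed in the text truncates rather than rounds.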

A variance, in the terminology of the analysis of variance, is more frequently called a mean square:

Mean square (MS) = SS/df

In words, a mean square is the average variation per degree of freedom. It is also the basic definition of variance.
In the foregoing discussion, we have explored the concept of variance and its importance in the analysis of variance. Before we take up the computation of simple or one-way analysis of variance, let us understand some other important concepts used in the analysis of variance, like sum of squares, mean square, df, etc., and their computation.
NUMERICAL EXAMPLE
Suppose a group of 5 subjects is given a performance test and the distribution of their scores is as given in Table 2.3.

Table 2.3 The Performance Scores of 5 Subjects

Subject    (i) X    (ii) x = (X − X̄)    (iii) x²
1            2          −3                 9
2            4          −1                 1
3            5           0                 0
4            6           1                 1
5            8           3                 9

          ΣX = 25      Σx = 0          Σx² = 20
          X̄ = 5

Sum of squares or SS = Σx² = 20    ...(2.1)

Mean square or MS = Σx²/(N − 1) = 20/4 = 5    ...(2.2)
Comments
Step 1. The sum of the raw scores (ΣX) in column (i) is equal to 25. The mean of the scores (X̄) is equal to 5 (ΣX/N).
Step 2. In column (ii), x is the deviation (X − X̄) of each score from the group mean. Note that the sum of the deviations (Σx) from the exact mean is always zero.
Step 3. In column (iii), the deviations from the mean are squared. The sum of the squared deviations around the mean is called the sum of squares (Σx²) or SS in a shortened form. The sum of squares (SS) is equal to 20.

Mean square (MS) is obtained by dividing the sum of squares by the degrees of freedom (df), which in this case is 4 (N − 1). Thus, the mean square is equal to 5 [Σx²/(N − 1) = 20/4]. A variance, in the terminology of the analysis of variance, is called a mean square or MS. The square root of the variance is designated as the standard deviation (√(Σx²/(N − 1))).

In the foregoing example, we have encountered a number of concepts that will help in understanding the analysis of variance as well as the designs that follow. The two most important concepts are sum of squares (SS) and mean square (MS).
The mean deviation method employed in computing the sum of squares and the mean square is time consuming and requires more effort than the method we are just going to describe, called the raw score method or the direct method. Let us work out the sum of squares and the mean square from the data presented in Table 2.3 by the direct method. It does away with the need to compute X̄ and x (X − X̄).

Table 2.4 The Performance Scores of 5 Subjects

Subject    (i) X    (ii) X²
1            2          4
2            4         16
3            5         25
4            6         36
5            8         64

          ΣX = 25    ΣX² = 145

Sum of squares = Σx² = ΣX² − (ΣX)²/N = 145 − (25)²/5 = 145 − 125 = 20    ...(2.3)

Mean square = [ΣX² − (ΣX)²/N]/df = SS/df = 20/4 = 5    ...(2.4)
Comments
Step 1. In column (i), the sum of the scores (ΣX) has been worked out and is equal to 25.
Step 2. In column (ii), the scores have been squared and summed up. The sum of the squares of the scores (ΣX²) is equal to 145.
The sum of squares (Σx²), by the mean deviation method, was derived by adding up the squares of the deviations of the scores from the group mean. However, in the above computation (direct method), we have squared the raw scores. Therefore, in order to derive the sum of squares, we have to apply a correction (C). From the raw scores, the sum of squares can be derived directly by applying formula 2.3.

A correction term [C = (ΣX)²/N = (25)²/5 = 125] is subtracted from the sum of column (ii), i.e., ΣX². Thus, subtracting 125 (C) from 145 (ΣX²), we obtain the sum of squares, which is equal to 20, the value obtained by the mean deviation method also. The value of the mean square derived by both methods is the same; it is obtained in the same manner, that is, by dividing the sum of squares by the df (SS/df), as given in formula 2.4.
It is important to understand the working of the direct method, as in this book we shall always be following the direct or the raw score method. This method is preferred over the mean deviation method for its elegance and ease. This method comes handy if a calculator is available to the investigator.
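Both methods can be run side by side on the data of Table 2.3. The sketch below (Python, for checking only) computes the sum of squares by the mean deviation method and by the direct method of formula 2.3, and the mean square of formula 2.4:

```python
scores = [2, 4, 5, 6, 8]  # performance scores of the 5 subjects (Table 2.3)
N = len(scores)

# Mean deviation method: sum the squared deviations about the mean.
x_bar = sum(scores) / N
ss_deviation = sum((x - x_bar) ** 2 for x in scores)

# Direct (raw score) method: SS = sum(X^2) - (sum X)^2 / N   (formula 2.3).
C = sum(scores) ** 2 / N            # correction term, 125.0
ss_direct = sum(x ** 2 for x in scores) - C

ms = ss_direct / (N - 1)            # mean square, SS/df   (formula 2.4)

print(ss_deviation, ss_direct, ms)  # 20.0 20.0 5.0
```

Both routes give SS = 20 and MS = 5, as in Tables 2.3 and 2.4.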

ONE-WAY ANALYSIS OF VARIANCE

We have just explored the variance notion and learnt the methods employed for computing the sum of squares and the mean square. Now, we shall try to grasp the working of the one-way analysis of variance with the help of a numerical example. In shortened form, the analysis of variance is called ANOVA and sometimes ANOVAR.
The rationale of the analysis of variance is that the total variability of a set of measures, composed of several groups, can be partitioned into specific parts, each identifiable with a given source of variation. In the simple analysis of variance, the total sum of squares is partitioned into two parts: a sum of squares based upon the variation between the group means, and a sum of squares based upon the variations within the several groups. On dividing the sums of squares by df, we obtain mean squares, abbreviated as MS. Here the sample values are referred to as mean squares and not variances. The mean squares (sample values) are estimates of the variances (population values).
We have, thus, two estimates of the population variance—between groups and within groups. The F may, thus, be defined as

F = Between Groups Mean Square / Within Groups Mean Square    ...(2.5)
The principle involved in the analysis of variance is the comparison of the variability between the various groups with the sum of the variability found within the groups. If the former variability is sufficiently larger than the latter, then it is evidence of treatment effect, and we reject the null hypothesis (H₀) and accept the alternative hypothesis (H₁). However, if the difference between the sources of variability falls within the range expected from sampling error, the analysis of variance will lead to the decision of retaining the null hypothesis (H₀). We shall, then, conclude that there was no evidence of treatment effect and the differences between the group means were due to chance.
The null hypothesis which is tested by ANOVA is that the k means of the populations from which the samples were randomly drawn are all equal, that is, H₀: μ₁ = μ₂ = μ₃ = ... = μₖ. The rejection of H₀ tells us only that some inequality exists. To investigate the inequality, we test the means pairwise. The procedure will be explained in chapter 4.
It may be noted that the decision to reject or not to reject the null hypothesis is a probabilistic one. In the analysis of variance, the decision to reject or retain the null hypothesis is made on the basis of the F distribution tables, given in Appendix, Table B. In the F test, we need two sets of degrees of freedom (df): one for the numerator and the other for the denominator.

NUMERICAL EXAMPLE
An investigator is interested in exploring the most effective method of instruction in the classroom. He decides to try three methods: Lecture (1); Seminar (2); and Discussion (3). He randomly selects 5 subjects for each of the three groups from a class of 10th grade students. After three months of instruction, an achievement test is administered to the three groups. The distribution of achievement scores in the three groups is as given in Table 2.5.

Table 2.5 The Distribution of Achievement Scores of Subjects Treated by
the Three Methods of Instruction

Subject                    Method
Number      Lecture (1)    Seminar (2)    Discussion (3)
1               8              11               5
2              10              13               5
3              11              13               8
4              11              15               9
5              12              16              10

Σ              52              68              37        G = 157

Here n = 5; k = 3; N = kn = 5 × 3 = 15
Partitioning of Total Variation and df
In the simple analysis of variance, the total variation and df will have the following partitioning:

Total: kn − 1 = 14
  Between groups: k − 1 = 2
  Within groups: k(n − 1) = 12

Fig. 2.3 Schematic representation of the analysis

where n = number of subjects in each of the three subgroups (n = 5)
      k = number of subgroups (k = 3)
      kn = total number of subjects or observations in the experiment (N)

The left hand rectangles indicate the partitioning of the sum of squares and the adjoining rectangles indicate the partitioning of the total df, in the general form. The numerals outside the rectangles are the df associated with the numerical example. The double-line enclosed rectangles indicate the final partitioning.
In the equation form, the partitioning of the total sum of squares may be expressed as

SS_total = SS_bet.groups + SS_w.groups    ...(2.6)
where SS_total = total sum of squares generated from the deviation of the individual observations from the mean of the total observations in the experiment.
SS_bet.groups = between groups sum of squares generated from the deviation of the subgroup means from the mean of the total observations in the experiment.
SS_w.groups = within groups sum of squares generated from the pooled deviation of the individual observations from the respective subgroup means.

Computation

(i) Correction Term (C) = G²/N = (157)²/15 = 1643.27

(ii) Total SS = (ΣX²) − C
    = (8² + 10² + 11² + ... + 9² + 10²) − C
    = 1785.00 − 1643.27 = 141.73

(iii) Between Groups SS = Σ(ΣX)²/n − C = 52²/5 + 68²/5 + 37²/5 − C
    = 1739.40 − 1643.27 = 96.13

(iv) Within Groups SS = Total SS − Between Groups SS
    = 141.73 − 96.13 = 45.60

(v) Table 2.6 Summary of One-way Analysis of Variance

Source of Variation       SS       df    MS       F
Between Groups            96.13     2    48.07    12.65**
Within Groups (Error)     45.60    12     3.80
Total                    141.73    14

**F.01(2, 12) = 6.93    F = 48.07/3.80 = 12.65
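The whole computation, steps (i) through (v), can be verified in a few lines. The following Python sketch (not part of the text, but a direct transcription of the formulas above) reproduces the entries of Table 2.6 from the raw scores of Table 2.5:

```python
# Achievement scores under the three methods of instruction (Table 2.5).
lecture    = [8, 10, 11, 11, 12]
seminar    = [11, 13, 13, 15, 16]
discussion = [5, 5, 8, 9, 10]
groups = [lecture, seminar, discussion]

n, k = 5, 3
N = n * k
G = sum(sum(g) for g in groups)                      # grand total, 157
C = G ** 2 / N                                       # correction term (step i)

ss_total   = sum(x ** 2 for g in groups for x in g) - C        # step ii
ss_between = sum(sum(g) ** 2 / n for g in groups) - C          # step iii
ss_within  = ss_total - ss_between                             # step iv

ms_between = ss_between / (k - 1)        # df = 2
ms_within  = ss_within / (k * (n - 1))   # df = 12
F = ms_between / ms_within

print(round(ss_total, 2), round(ss_between, 2), round(ss_within, 2))
# 141.73 96.13 45.6
print(round(F, 2))  # 12.65
```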

Comments

Partitioning of Total Variation and df
In the one-way analysis of variance, the total sum of squares is partitioned into two component parts—one due to the variation between the groups and the other due to the variation within the groups. In the present problem, the total degrees of freedom, 14 (kn − 1), are partitioned into two component parts: 2 df [(k − 1)] attributable to the variation between the groups and 12 df [k(n − 1)] to the variation within the groups.
An important aspect of the analysis of variance is the partitioning of the total sum of squares and degrees of freedom. It will be observed later that the partitioning differs with the nature of the design.
Once the partitioning of the sum of squares and df of a design is understood, the computation part is mechanical. Therefore, before starting the actual analysis work, one should try to comprehend the schematic representation of the analysis, and follow the computations step by step.
Computation
Step 1: Correction Term: As explained earlier, for computing the sum of squares by the direct method, a correction is needed. The correction term (C) is the same for deriving all the sums of squares in the numerical example, with the exception of the within groups sum of squares, explained under the comments in Step 4.
The correction term is obtained by squaring the grand total (G = 52 + 68 + 37 = 157) and then dividing it by the total number of subjects or observations in the experiment (N = kn = 3 × 5 = 15). The correction term (G²/N) was found to be equal to 1643.27.
Step 2: Total SS: The total sum of squares is a measure of the total variation of the individual scores about the combined mean. It reflects all the sources of variation, that is, between groups variation and within groups variation in the present case.
The total sum of squares is obtained by combining the scores of the three groups and treating them as one set of scores. In Step 2, each of the 15 raw scores is first squared, then the squares are summed, and thereafter the correction term is subtracted from the obtained sum. The total sum of squares in the present example is equal to 141.73.
In Step 2, all scores have not been displayed; the omission of certain terms of the sequence has been indicated by dots. For example, 8² + 10² + ... + 9² + 10² indicates that the individual scores from the first to the last of the distribution have been squared and added.
Step 3: Between Groups SS: The sum of squares between groups is a measure of the variation of the group means about the combined mean. If the group means do not differ among themselves at all, the sum of squares between groups will be zero. Thus, the greater the variation in the group means, the larger is the sum of squares between groups.
In Step 3, the between groups sum of squares has been obtained by the direct method. The totals of each of the three subgroups (i.e., 52, 68, and 37) have been squared, divided by the number of observations in each subgroup, and summed [Σ(ΣX)²/n]. Finally, the correction term (C) has been subtracted from the sum. The between groups sum of squares is found to be equal to 96.13.
Step 4: Within Groups SS: The within groups sum of squares is the pooled sum of squares based on the variation within each group about its own mean. The within groups sum of squares is also called the error sum of squares. All the uncontrolled sources of variation are pooled in the within groups sum of squares.
In Step 4, the sum of squares within groups has been obtained by subtraction, taking advantage of the addition theorem characterizing this analysis. From equation 2.6, it is observed that

SS_total = SS_bet.groups + SS_w.groups
SS_w.groups = SS_total − SS_bet.groups

By substituting the obtained values of SS_total, or the total sum of squares, and SS_bet.groups, or the sum of squares between groups, we obtain the sum of squares within groups. It is equal to 45.6. However, there can be no verification of the computation of the within groups sum of squares by the subtraction method. Therefore, beginners would do well to calculate the within groups sum of squares independently. Let us carefully observe the computation of the within groups sum of squares by the direct method.
We have just learnt that the within groups sum of squares is the pooled sum of squares based on the variation of the individual observations about the mean of the particular subgroup. Therefore, the sum of squares within the groups is equal to

SS within subgroup 1 = (8² + 10² + ... + 12²) − 52²/5 = 550.0 − 540.8 = 9.2

SS within subgroup 2 = (11² + 13² + ... + 16²) − 68²/5 = 940.0 − 924.8 = 15.2

SS within subgroup 3 = (5² + 5² + ... + 10²) − 37²/5 = 295.0 − 273.8 = 21.2

SS within = 9.2 + 15.2 + 21.2 = 45.6

Note: The lower case letter n represents the number of observations or subjects in the subgroup (n = 5), the upper case letter N represents the total number of observations or subjects in the subgroups (N = 15), and k represents the number of groups or treatments (k = 3).
If no mistake is committed, the outcome by the direct method is exactly the same as that obtained by the subtraction method.
Note: (a) The correction factor for each subgroup is different; that is, it is the square of the respective subgroup total divided by n, the number of observations in each subgroup.
(b) SS_w.groups is the sum of the individual sums of squares within each of the three subgroups.
(c) The df associated with the sum of squares within groups is the pooled df of the subgroups, that is, 4 + 4 + 4 = 12.
Step 5: Analysis of Variance Summary Table: Preparing the analysis of variance table is the final step in the analysis. Note carefully the format of Table 2.6, which is of the standard form. The first column is for sources of variation, then SS, followed by df, MS, and finally F.
The reader will recall that the mean square (MS) or variance estimate is obtained by dividing the SS by the appropriate df. Dividing the SS by its df gives an estimate of the common population variance. Thus, dividing the between groups sum of squares by its df, i.e., 96.13 by 2, gives the MS which, in this example, is equal to 48.07. This value is the estimate of the common population variance independent of the variation within the groups. Similarly, dividing the within groups SS by its df, i.e., 45.6 by 12, gives the MS, which is found to be 3.8. Again, this value is the estimate of the common population variance, which is independent of the variation in the group means.
Then, the ratio (F) of the MS between groups and the MS within groups is obtained by dividing 48.07 by 3.80. Here the obtained value of F is equal to 12.65. It is entered in the first row under column F.

Test of Significance
The next step is to evaluate the obtained F value. We consult the F table in the Appendix, Table B, for 2 and 12 degrees of freedom. First we move along the top row, where degrees of freedom for the greater mean square are given, and pause at 2. Then, we proceed downwards in column 2 until we find the row entry corresponding to df 12. The values of F significant at the 5 p.c. point are given in light face type, and those significant at 1 p.c. in bold (dark) face type. The critical value of F corresponding to 2 and 12 df at α = .01 is 6.93. Since our obtained value of F, 12.65, far exceeds the critical or tabled value, 6.93, we reject the null hypothesis (H₀). The overall F indicates that the means of the three groups do not fall on a straight line with zero slope. Hence, the null hypothesis that the three groups are random samples from a common normal population is rejected. On the basis of the results of the experiment, we can conclude that the three methods of instruction produced significant differences in the three groups. As F is an overall index, further tests on means have to be carried out to compare the pairs of means. This aspect will be discussed in Chapter 4.
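Readers without the printed table can obtain the same critical values from software. Assuming the SciPy library is available, its `f.ppf` routine (SciPy's name, not the text's) inverts the F distribution:

```python
from scipy.stats import f

# Critical values of F for 2 and 12 degrees of freedom.
critical_01 = f.ppf(0.99, dfn=2, dfd=12)   # alpha = .01
critical_05 = f.ppf(0.95, dfn=2, dfd=12)   # alpha = .05
print(round(critical_01, 2))  # 6.93, the tabled value used above
print(round(critical_05, 2))  # 3.89
```

The obtained F of 12.65 exceeds both, so the decision to reject H₀ is the same at either level.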

STRENGTH OF ASSOCIATION
The significant F indicates that the observed differences between the treatment means are not likely to arise by chance. However, it does not indicate anything about the strength of the treatment effect. The statistic omega square (ω²) is a measure of the strength of the treatment effect. It gives us the proportion of the total variability in a set of scores that can be accounted for by the treatments. That is, what portion of the variance in the scores can be accounted for by the differences in the treatment groups. The formula for the strength of association is

ω² = [SS_between − (k − 1)MS_within] / [SS_total + MS_within]    ...(2.7)

Let us now compute the strength of treatment effects in our numerical example. The values of SS_between, SS_total, and MS_within have been obtained from Table 2.6. The steps in computing ω² are given below:

SS_between = 96.13
SS_total = 141.73
MS_within = 3.8
k = 3 (treatments)

ω² = [96.13 − (3 − 1)(3.8)] / [141.73 + 3.8] = 88.53/145.53 = .61
Thus, approximately 61 per cent of the variance in the dependent variable is accounted for by the differences in the method of instruction. In other words, there is a fairly strong relationship between the methods of instruction and the achievement scores of the subjects.
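Formula 2.7 is straightforward to verify; the Python sketch below plugs in the values from Table 2.6:

```python
# Values taken from the one-way analysis of variance summary (Table 2.6).
ss_between = 96.13
ss_total   = 141.73
ms_within  = 3.8
k = 3  # number of treatments

# Strength of association (omega square), formula 2.7.
omega_sq = (ss_between - (k - 1) * ms_within) / (ss_total + ms_within)
print(round(omega_sq, 2))  # 0.61
```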

GENERAL COMMENTS
One may wonder why we keep the between groups variance in the position of the numerator and the within groups variance in the denominator. The logic is simple. If the group means are significantly different, then the mean square between groups should be larger than the mean square within groups (error). It is rare that small values of F (F < 1) indicate anything but sampling variation. It is only large values of F that suggest treatment effects. Therefore, we refer to the F table only when the ratio is
greater than one. If the mean square between groups is smaller than the mean square within groups, then the F value will be less than one. In that case, we simply ignore the obtained value of F in the analysis of variance summary table, and there is no need to refer to the F tables, as the data offer no evidence against the null hypothesis.
The significant F indicates that the three methods of instruction did produce differences in the achievement scores of the groups. However, F does not indicate which of the differences among the pairs of means are significant. To find this, post hoc comparisons between the subgroup means are done. There are a variety of methods for comparing the individual means. Some of these will be discussed in chapter 4.

SUMMARY OF STEPS
We have just completed the computation of one-way or simple analysis of variance with detailed comments on the various steps involved. Let us summarize the steps involved:
1. Observe carefully the partitioning of the total variation (SS) and df.
2. Calculate the correction term (C).
3. Calculate the total sum of squares (SS_total).
4. Calculate the between groups sum of squares (SS_between).
5. Calculate the within groups sum of squares (SS_within) by subtraction or by the direct method.
6. Determine the between groups df.
7. Determine the within groups df by subtraction or directly.
8. Prepare the analysis of variance summary table.
9. Enter the obtained values of SS_between and SS_within and their respective df.
10. Compute the between groups mean square (MS_between) by dividing SS_between by its df.
11. Compute the within groups mean square (MS_within) by dividing SS_within by its df.
12. Calculate the F ratio (MS_between/MS_within).
13. Compare the obtained F ratio with the critical F value from the F table.
After presenting a detailed discussion and computation of one-way analysis of variance, we now proceed a step forward to understand the rationale and computation of two-way analysis of variance.
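The thirteen steps can also be cross-checked against a library routine. Assuming SciPy is available, its `f_oneway` function carries out the same one-way analysis directly from the raw scores of Table 2.5:

```python
from scipy.stats import f_oneway

lecture    = [8, 10, 11, 11, 12]
seminar    = [11, 13, 13, 15, 16]
discussion = [5, 5, 8, 9, 10]

result = f_oneway(lecture, seminar, discussion)
print(round(result.statistic, 2))  # 12.65, agreeing with the hand computation
print(result.pvalue < 0.01)        # True: significant at the .01 level
```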

TWO-WAY ANALYSIS OF VARIANCE

The one-way analysis of variance was applied to the data of the previous example, in which the researcher investigated the effect of three methods of instruction on the achievement scores of the subjects. The subjects in each of the three groups, who were treated with one of the three methods, were selected independently and at random. In that investigation, only one variable (method) was studied, which had three levels—lecture, seminar and discussion. The two-way analysis of variance permits the simultaneous study of two factors or variables. While the one-way analysis of variance and the t test do not permit the evaluation of interaction between two or more variables, the two-way analysis of variance permits such evaluation.

In the single variable example, the investigator considered only the effect of the three methods of instruction on the achievement scores. However, it is expected that the method of instruction will have a different effect depending upon the level of intelligence of the subjects (interaction). It may be hypothesized that the children of superior intelligence will gain more from the discussion and the seminar methods, whereas the children of inferior intelligence will gain more from the lecture method. On the basis of this hypothesis, the investigator designs a study in which the effects of two variables are studied simultaneously, that is, the effect of level of intelligence and method of instruction on the achievement scores of the children. There is greater generality to the outcome of this investigation than that of the first, in which the effect of only one variable was studied. Further, it has an added advantage in that the interaction effect can also be studied.
The first variable (Factor A) has two levels, that is, superior intelligence and inferior intelligence, represented by a₁ and a₂, respectively. The second variable (Factor B) has three levels, that is, the lecture, seminar and discussion methods, represented by b₁, b₂, and b₃, respectively. The levels of the factors are fixed¹ and do not represent a random sampling from a larger population of levels. It means that the levels of the factors were chosen arbitrarily by the experimenter. It may be noted that the levels of factor A are manipulated through selection and those of factor B are directly manipulated by the experimenter. The total number of treatments in the experiment (k = 2 × 3 = 6) is presented in Table 2.7.
Table 2.7 The Six Treatment Conditions in a Two-way Analysis of Variance

Intelligence                        Method
                            (b₁)         (b₂)         (b₃)
                            (Lecture)    (Seminar)    (Discussion)
a₁ (Superior Intelligence)   ab₁₁         ab₁₂         ab₁₃
a₂ (Inferior Intelligence)   ab₂₁         ab₂₂         ab₂₃

In Table 2.7, we observe that the experiment will have 6 treatment conditions, and the investigator has decided to consider all the treatments and have n = 5 observations for each treatment condition. In Table 2.7, the first subscript refers to the first letter, representing the level of factor A, and the second subscript refers to the second letter, representing the level of factor B. For example, treatment ab₁₁ represents a treatment condition in which a subject of superior intelligence (a₁) is given instructions by the lecture method (b₁). Similarly, treatment ab₂₃ represents a treatment condition in which a subject of inferior intelligence is given instructions by the discussion method.
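The six treatment conditions of Table 2.7 arise from crossing the levels of the two factors. A short sketch (the labels are illustrative, not the text's notation) makes the crossing explicit with Python's itertools:

```python
from itertools import product

# Levels of the two fixed factors in the 2 x 3 design.
factor_a = ["a1 (superior)", "a2 (inferior)"]                    # intelligence
factor_b = ["b1 (lecture)", "b2 (seminar)", "b3 (discussion)"]   # method

# Crossing the factors yields the k = 2 x 3 = 6 treatment conditions.
treatments = list(product(factor_a, factor_b))
for a, b in treatments:
    print(a, "x", b)
print(len(treatments))  # 6
```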

NUMERICAL EXAMPLE
A total of 30 subjects were selected, 15 of superior intelligence and 15 of inferior intelligence. Five
subjects from each of the two groups were randomly assigned to each of the three methods of instruction.
That is, 5 subjects were randomly assigned to each of the six treatments. After three months of instruction,
an achievement test was administered to all the subjects. The outcome of the hypothetical experiment is
given in Table 2.8.

¹ See chapter 3 for a discussion of fixed effect and random effect models.

Table 2.8 Achievement Scores of Subjects Instructed by Lecture, Seminar
and Discussion Methods in each of the Two Groups

Intelligence              Method
                    b₁        b₂        b₃
a₁     [n = 5 scores per cell]
           ΣX =     51        71        74
a₂     [n = 5 scores per cell]
           ΣX =     67        49        53

k = 6, N = kn = 30

Table 2.9 AB Interaction Table

          b₁      b₂      b₃      Total
a₁        51      71      74      196
a₂        67      49      53      169
Total    118     120     127      G = 365
Partitioning of Total Variation and df
The two-way analysis of variance will have the partitioning of the total sum of squares and df as follows:

Total: kn − 1 = 29
  Between groups: k − 1 = 5
    A: r − 1 = 1
    B: c − 1 = 2
    A × B: (r − 1)(c − 1) = 2
  Within groups: k(n − 1) = 24

Fig. 2.4 Schematic representation of the analysis


where n = number of subjects or observations in each of the six treatment groups (n = 5)
      k = number of treatment groups (k = 6)
      r = number of rows representing levels of Factor A (r = 2)
      c = number of columns representing levels of Factor B (c = 3)

The left hand rectangles represent the partitioning of the sum of squares and the adjoining rectangles indicate the partitioning of the total df, in the general form. The numerals outside the rectangles are the dfs associated with the present numerical example.
In the equation form, the partitioning of the total sum of squares may be represented as

SS_total = SS_A + SS_B + SS_AB + SS_within    ...(2.8)
where SS_total = total sum of squares generated from the deviation of each score from the mean of the total scores.
SS_A = sum of squares of Factor A generated from the deviation of the a₁ and a₂ means from the mean of the total scores.
SS_B = sum of squares of Factor B generated from the deviation of the b₁, b₂, and b₃ means from the mean of the total scores.
SS_AB = sum of squares for interaction generated from the deviation of each subgroup mean from the value predicted for that subgroup on the assumption of no interaction.
SS_within = pooled sum of squares within the six treatment groups generated from the deviation of the individual scores from the means of the respective subgroups, representing error.
Computation

(i) Correction Term (C) = G²/N = (365)²/30 = 4440.83

(ii) Total SS = (7² + 9² + ... + 11² + 12²) − C = 4661.00 − 4440.83 = 220.17

(iii) Between Groups SS = 51²/5 + 67²/5 + 71²/5 + 49²/5 + 74²/5 + 53²/5 − C
    = 4563.40 − 4440.83 = 122.57

(iv) Within Groups SS = Total SS − Between Groups SS = 220.17 − 122.57 = 97.60

(v) A SS = 196²/15 + 169²/15 − C = 4465.13 − 4440.83 = 24.30

(vi) B SS = 118²/10 + 120²/10 + 127²/10 − C = 4445.30 − 4440.83 = 4.47

(vii) AB SS = Between Groups SS − (A SS + B SS) = 122.57 − (24.30 + 4.47) = 93.80

or directly from Table 2.9

AB SS = (51²/5 + 67²/5 + ... + 53²/5 − C) − [A SS + B SS]
    = (4563.40 − 4440.83) − (24.30 + 4.47) = 93.80



(viii) Table 2.10 Summary of Two-way Analysis of Variance

Source of Variation       SS       df    MS       F
A (Intelligence)          24.30     1    24.30     5.97*
B (Method)                 4.47     2     2.24     0.55
AB                        93.80     2    46.90    11.52**
Within Groups (Error)     97.60    24     4.07
Total                    220.17    29

*F.05(1, 24) = 4.26    **F.01(2, 24) = 5.61

F for Intelligence = 24.30/4.07 = 5.97
F for Method = 2.24/4.07 = 0.55 or < 1
F for Interaction = 46.90/4.07 = 11.52
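The two-way partitioning can be verified from the cell totals of Table 2.9 together with the raw sum of squares ΣX² = 4661 reported in computation step (ii). The following Python sketch (not the text's, and working from totals rather than individual scores) reproduces the sums of squares of Table 2.10:

```python
# Cell totals from Table 2.9 (rows a1, a2; columns b1, b2, b3), n = 5 per cell.
cell_totals = [[51, 71, 74],
               [67, 49, 53]]
n, r, c = 5, 2, 3
k = r * c
N = k * n
sum_x_sq = 4661.0     # sum of the squared raw scores, from computation step (ii)

G = sum(sum(row) for row in cell_totals)    # grand total, 365
C = G ** 2 / N                              # correction term

ss_total   = sum_x_sq - C
ss_between = sum(t ** 2 / n for row in cell_totals for t in row) - C
ss_within  = ss_total - ss_between
ss_a = sum(sum(row) ** 2 / (c * n) for row in cell_totals) - C
col_totals = [sum(row[j] for row in cell_totals) for j in range(c)]
ss_b = sum(t ** 2 / (r * n) for t in col_totals) - C
ss_ab = ss_between - ss_a - ss_b

ms_within = ss_within / (k * (n - 1))       # df = 24
f_a  = (ss_a / (r - 1)) / ms_within
f_b  = (ss_b / (c - 1)) / ms_within
f_ab = (ss_ab / ((r - 1) * (c - 1))) / ms_within

print(round(ss_a, 2), round(ss_b, 2), round(ss_ab, 2), round(ss_within, 2))
# 24.3 4.47 93.8 97.6
print(round(f_a, 2), round(f_b, 2), round(f_ab, 2))
# 5.98 0.55 11.53
```

With the unrounded error mean square (97.60/24 = 4.0667) the F ratios come to 5.98 and 11.53; the 5.97 and 11.52 shown above result from dividing by the rounded MS of 4.07. The statistical decisions are the same either way.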

Comments

Partitioning of Total Variation and df

In the two-way analysis of variance, the total sum of squares is first partitioned into two components: one due to the variation between the groups and the other due to the variation within the six groups (also called error SS).

The between groups sum of squares is further partitioned into three components: one due to the variation in the levels of factor A (i.e., a1 and a2), the second due to the variation in the levels of factor B (i.e., b1, b2 and b3), and the third due to the interaction of factors A and B.

Similarly, the total df of 29 (kn − 1) is first partitioned into two components: 5 df attributable to the variation between the groups (k − 1) and 24 df to the variation within the groups [k(n − 1)]. The latter is the df associated with the variation due to the error.

Further, the between groups df (5) is partitioned into three components: (i) df attributable to the variation due to factor A (r − 1), (ii) df due to factor B (c − 1), and (iii) df due to the interaction of A and B [(r − 1)(c − 1)].
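The df bookkeeping described above can be sketched in a few lines of code, using r, c, n, and k as defined in the text:

```python
# Partition of degrees of freedom for the 2 x 3 design with n = 5 per cell.
r, c, n = 2, 3, 5      # levels of A, levels of B, observations per cell
k = r * c              # number of treatment groups

df_total = k * n - 1           # kn - 1 = 29
df_between = k - 1             # 5
df_within = k * (n - 1)        # 24
df_a, df_b = r - 1, c - 1      # 1 and 2
df_ab = (r - 1) * (c - 1)      # 2

# The two partitions must balance exactly.
assert df_total == df_between + df_within
assert df_between == df_a + df_b + df_ab
print(df_total, df_between, df_within, df_a, df_b, df_ab)  # → 29 5 24 1 2 2
```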

Computation
Step 1: Correction Term: The correction term (C) is found by squaring the grand total (G) and then dividing it by the total number of observations or cases (N = kn = 6 × 5 = 30) in the experiment. Thus, the correction term (G²/N) is found to be equal to 4440.83. The correction term is the same throughout the analysis, except for computing the within groups sum of squares. However, the formula is the same for calculating the C in case of the within groups sum of squares.

Analysis of Variance: The Foundation of Experimental Design

Step 2: Total Sum of Squares: The total sum of squares is obtained by squaring each of the 30 scores or the entire set of observations in the experiment, adding them up and subtracting the correction term from the sum. The total sum of squares in this example is equal to 220.17.

Step 3: Between Groups Sum of Squares: The sum of each of the six subgroups (i.e., 51, 67, 71, 49, 74, and 53) is first squared and then divided by the number of observations in each subgroup and summed [Σ(ΣX)²/n]. Finally, the correction term is subtracted from the sum. The between groups sum of squares is found to be equal to 122.57. This reflects the variation of the subgroup means about the combined mean. We can observe that out of the total variation of 220.17, 122.57 is contributed by the variation between the groups. As can be observed from Fig. 2.4, the between groups sum of squares, with df = 5, is further partitioned into three components: A sum of squares with 1 df, B sum of squares with 2 df, and A × B sum of squares with 2 df. That is, the two main effects (A and B) and the first order interaction (A × B) are the components of between group variation.
Step 4: Within Groups Sum of Squares: In step 4, the within groups sum of squares has been derived by the subtraction method. Referring to the schematic representation of the analysis in Fig. 2.4, it is observed that the total sum of squares has been partitioned into two components: one due to the variation between groups and the other due to the variation within groups. That is,

SS_total = SS_between + SS_within

SS_within = SS_total − SS_between

Substituting the values of the total sum of squares and between groups sum of squares, we get

SS_within = 220.17 − 122.57 = 97.6

However, the sum of squares within groups can also be found directly. Let us find the within groups sum of squares directly also.
We know that the within groups sum of squares is the pooled sum of squares based on the variation of the individual observations about the mean of the respective subgroup to which it belongs; therefore,

SS within ab11 treatment group = (7² + 9² + ... + 12²) − 51²/5 = 539.0 − 520.2 = 18.8

SS within ab21 treatment group = (11² + 12² + ... + 16²) − 67²/5 = 915.0 − 897.8 = 17.2

SS within ab12 treatment group = (11² + 13² + ... + 17²) − 71²/5 = 1029.0 − 1008.2 = 20.8

SS within ab22 treatment group = (8² + 8² + ... + 12²) − 49²/5 = 493.0 − 480.2 = 12.8

SS within ab13 treatment group = (12² + 13² + ... + 18²) − 74²/5 = 1118.0 − 1095.2 = 22.8

SS within ab23 treatment group = (9² + 10² + ... + 12²) − 53²/5 = 567.0 − 561.8 = 5.2

SS_within = 18.8 + 17.2 + 20.8 + 12.8 + 22.8 + 5.2 = 97.6

The outcome of both the methods is the same.

Note: The correction factor for each of the six subgroups is different.
Step 5: A Sum of Squares: The A sum of squares, also called the row effect in this experiment or main effect of factor A, is a component of between groups sum of squares, generated from the deviations of a1 and a2 means from the mean of the total scores. The A sum of squares is obtained by squaring the sums of a1 (196) and a2 (169) (see Table 2.9), dividing each by 15, the number of observations under each level of factor A, then summing up and finally subtracting the correction term (4440.83). The A sum of squares is equal to 24.30. Thus, the variation contributed by factor A is 24.30.

Step 6: B Sum of Squares: Similarly, the B sum of squares, also called the column effect in this experiment or main effect of factor B, is a component of between groups sum of squares, generated from the deviations of b1, b2, and b3 means from the mean of the total scores. The B sum of squares is obtained by squaring the sums of b1 (118), b2 (120), and b3 (127) (see Table 2.9), dividing each by 10, the number of observations under each level of factor B, then summing up and finally subtracting the correction term (4440.83). The B sum of squares is equal to 4.47. Thus, the variation contributed by factor B is 4.47.
Step 7: AB Sum of Squares: Interaction measures the effects attributable neither to intelligence nor to method acting singly, but rather to both acting simultaneously.

In Step 7, the AB interaction sum of squares has been obtained by two different methods. First, the interaction has been obtained by the subtraction method. Referring to Fig. 2.4, we find that the between groups sum of squares is partitioned into three components: one due to the variation of factor A, the second due to factor B, and the third due to the interaction of factors A and B. Thus, by subtracting the sums of squares of A and B from the between groups sum of squares, we obtain the interaction sum of squares, which is found to be equal to 93.8.

To obtain the AB interaction sum of squares directly, we first prepare the interaction table or two-way table of A and B as shown in Table 2.9. (This table is also useful for computing the main effects of A and B. For example, the row totals give us the levels of factor A and the column totals give us the levels of factor B.)


In order to obtain the AB interaction sum of squares, refer to Table 2.9. We first square the cell entries and divide each by 5, the number of observations contributing to the sums entered in the cells of this table. Then take the sum and subtract the correction term (4440.83). Finally, the interaction sum of squares is obtained by subtracting the sums of squares for the main effects of A and B from the cells sum of squares.

The six cell entries in Table 2.9, i.e., 51, 67, 71, 49, 74, and 53, represent the sums of the six treatment groups (see Table 2.8). In each of the six treatment groups, 5 observations have been taken. After summing, the correction term is subtracted from the sum. Here, the cell sum of squares is found to be equal to 122.57. Now, subtract the A sum of squares (24.3) and B sum of squares (4.47). Thus, the AB sum of squares is found to be equal to 93.8. The outcome of both the methods is the same.

Geometric Representation of Interaction


The graphical presentation is useful in examining the nature of interaction and is also helpful in interpreting
the results. First, we work out the means from the totals given in the cells of Table 2.9. Each cell total is
based on five observations, thus, dividing by 5 we obtain the means presented in Table 2.11. As an aid
in the interpretation of interaction, the profiles corresponding to the means are better than the totals.
Table 2.11 Two-way Table of Means

        b1      b2      b3
a1     10.2    14.2    14.8
a2     13.4     9.8    10.6

We take the levels of one factor on the X-axis. Let us take the levels of factor B on the X-axis. It is purely for convenience that we have taken factor B on the X-axis; otherwise it is alright to take any factor on the X-axis. Now, plot the means for each level of factor A. That is, first plot all the means in row a1 (i.e., ab11 = 10.2, ab12 = 14.2, and ab13 = 14.8) corresponding to the levels of factor B, join the points and label the resulting curve a1. It represents achievement scores of subjects of superior intelligence. Similarly, plot all the means in row a2 (i.e., ab21 = 13.4, ab22 = 9.8, and ab23 = 10.6), join the points and label the resulting curve a2. It represents achievement scores of subjects of inferior intelligence.

[Interaction profile: mean scores on the Y-axis, levels b1, b2, b3 on the X-axis, with one curve for a1 and one for a2]

Fig. 2.5 AB interaction profile


In Fig. 2.5, it is observed that the curves marked a1 and a2 are not parallel; therefore, the AB interaction is significant (SS_AB ≠ 0). It is observed from Table 2.10 that the AB interaction is highly significant. It is obvious from the graph also. It should be noted that the interaction is non-significant (SS_AB = 0) when the curves are parallel to each other.
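The link between parallelism and SS_AB can be checked numerically from the cell means of Table 2.11: each cell's deviation from its no-interaction prediction (row mean + column mean − grand mean) is squared, summed over cells, and multiplied by n = 5. This is a sketch of the standard interaction identity, not the book's computing formula, but it yields the same 93.8; for perfectly parallel curves every deviation would be zero.

```python
# SS_AB from the cell means of Table 2.11: n times the sum of squared
# deviations of each cell mean from its additive (no-interaction) prediction.
means = [[10.2, 14.2, 14.8],   # a1 row
         [13.4,  9.8, 10.6]]   # a2 row
n = 5

grand = sum(sum(row) for row in means) / 6
row_means = [sum(row) / 3 for row in means]
col_means = [sum(col) / 2 for col in zip(*means)]

ss_ab = n * sum((means[i][j] - row_means[i] - col_means[j] + grand) ** 2
                for i in range(2) for j in range(3))
print(round(ss_ab, 1))  # → 93.8
```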
Note: (i) We can plot the means of the observations or their sums. However, it is convenient to take means. By this, the drawing of curves becomes easy.
(ii) We can take any one of the two factors on the X-axis. However, it is convenient to take the factor that has the larger number of levels, as it reduces the number of curves in the graph and thus helps in the interpretation.
Step 8: Summary of Analysis of Variance: In the one-way analysis of variance, we obtained only one F ratio. In the present example, we are interested in testing the significance of factors A, B, and the AB interaction effects. Thus, we have to compute three F-ratios. The denominator in each case shall be the variance estimate or mean square of within groups (Error).

In Table 2.10, we have divided each of the sums of squares (SS) by the corresponding degrees of freedom (df) to obtain the mean squares (MS). In the column headed F, the mean squares of A, B, and AB have been divided by the within group mean square (Error).

Test of Significance

The obtained values of F in Table 2.10 are to be evaluated. The F ratio in respect of factor A has been found to be 5.97. We consult the F table, given in the Appendix, Table B, for 1 and 24 degrees of freedom and observe that the critical value is 7.82 at α = .01 and 4.26 at α = .05. The observed value of 5.97 exceeds the critical value at α = .05 and is less than the critical value at α = .01. Thus, the obtained value of F = 5.97 is significant beyond the .05 level. Therefore, we reject the null hypothesis (H0) that the two groups, selected on the basis of intelligence, are random samples from the same normally distributed population.

Further, we observe that the F ratio in respect of factor B is <1; hence it is not significant. Therefore, we retain the null hypothesis (H0) and cannot conclude that the three methods of instruction differentially affect the achievement scores.
The AB interaction F, based on 2 and 24 df, is found to be 11.52. The critical value is 5.61 at α = .01. The observed value of F far exceeds the critical value. Thus, the F associated with the interaction of factors A and B is significant beyond the .01 level. It indicates that the effectiveness of a particular method of instruction depends upon the level of intelligence. While the children of superior intelligence do better with seminar and discussion methods of instruction, the children of inferior intelligence gain more from the lecture method. This can be verified from the interaction profile (Fig. 2.5).

ASSUMPTIONS UNDERLYING ANALYSIS OF VARIANCE


The ratio of between groups (treatments) to within groups (error) mean squares is distributed as F if the assumptions underlying the analysis of variance are satisfied. If the assumptions are violated, the sampling distribution of mean square ratios may differ from the F distribution. If the assumptions are not sufficiently approximated, the conclusions based on the F test may not be valid. These assumptions are:
1. Normality of the Distribution of Criterion Measures
2. Homogeneity of Variance

3. Independence of the Dependent Score


Let us consider the assumptions underlying the F test in some detail.

NORMALITY
The assumption of normality states that the distribution of scores within each treatment population is
normal. This assumption is satisfied when the scores within the treatment groups are from normally
distributed population.
In an empirical study, Norton (1952) found that the F distribution is practically unaffected by lack of symmetry in the distribution of criterion measures; however, it is slightly affected if the distribution of the criterion measures is roughly symmetrical but either leptokurtic or platykurtic. In such situations, it is desirable that scores are appropriately transformed and the analysis of variance is carried out on the transformed scores (see next section on transformations). Symmetric, non-normal distributions cause slight inflation of the Type I error probability.
In general, the F distribution is insensitive to the form of the distribution of criterion measures and, therefore, there is no need to apply any statistical test to detect non-normality. One can detect extreme departures by mere inspection. In case of extreme departures in the form of the distribution, an appropriate transformation should be carried out.

HOMOGENEITY OF VARIANCE
One of the basic assumptions underlying the F test is that the variances of scores in each of the k treatment groups are homogeneous (i.e., σ1² = σ2² = ... = σj² = ... = σk²). That is, the variances of the individual groups are equal. We know that the within groups variance is the sum of the variations within each of the many treatment groups. Such pooling can only be done if there is homogeneity of variance within the treatment groups.

This assumption can be tested by means of Bartlett's (1937) test for homogeneity of variance. The experimental evidence, however, indicates that moderate departures from this assumption, even variances two or three times as great as those of another group, do not seriously affect the appropriateness of the analysis of variance. Mathematical derivations by Box (1954) indicate that the α level is inflated by heterogeneity of variance. However, if all treatment populations are approximately normally distributed and all groups have equal ns, the inflation is slight. The F test is considered robust with respect to departures from homogeneity of variance.
A simple check on the equality of sample variances can be made for the purpose. Let us take the numerical example presented in Table 2.5, in which three treatment groups have been taken. The sum of squares of each group has been calculated under the comments section (step 4). The variance estimate (MS) can be obtained by dividing the sum of squares by the df, that is, by 4 (n − 1). The variances of the three groups in the example are 2.3 (9.2/4), 3.8 (15.2/4), and 5.3 (21.2/4). Now, let us take the smallest and the largest variance and divide the largest variance by the smallest. The observed F is equal to 2.3 (5.3/2.3) and the associated degrees of freedom are 4 and 4. From the F table in the Appendix, Table B, we observe that the critical value for 4 and 4 df is 6.39 at α = .05. Our observed value of F = 2.3 is not significant. This shows that the two extreme variances do not differ significantly and, hence, the experimental groups are homogeneous.
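This rough check is easy to script. The sketch below takes the three within-group sums of squares (9.2, 15.2, 21.2) and df = 4 from the Table 2.5 example:

```python
# Rough homogeneity check: ratio of the largest to the smallest group variance.
ss = [9.2, 15.2, 21.2]   # within-group sums of squares, one per group
df = 4                   # n - 1 = 4 for each group of n = 5

variances = [s / df for s in ss]           # 2.3, 3.8, 5.3
f_ratio = max(variances) / min(variances)  # 5.3 / 2.3
print(round(f_ratio, 2))  # → 2.3
```

Since 2.3 is below the tabled critical value of 6.39 for 4 and 4 df at α = .05, the extreme variances do not differ significantly.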
Several methods are available for detecting heterogeneity of variance, and some of these are overly sensitive to departures from normality (Bartlett, 1937; Cochran, 1947; Hartley, 1950). These tests may be used when a sensitive test is required to detect heterogeneity of variance. The use of Hartley's and Bartlett's tests will be explained in Chapter 7.
If the variances for two treatment groups differ significantly (heterogeneity of variance), it may reflect in the form of non-additivity of treatment effects. It is assumed that the factors which account for the deviations in an individual's scores are additive. By "additive" we mean that if X_c is the score of a given observation under the control condition, then under the experimental condition the score would be X_e = X_c + a_t, where a_t is a constant treatment effect due to the experimental treatment. If the effect of the treatment is additive, then s_e² = s_c² (as the variances of the two conditions should be equal), because addition or subtraction of a constant does not affect the variance; it affects only the mean. In other words, the experimental treatment should not affect the within groups variation; it should change only the mean, which ultimately will affect the between groups variation.

If the treatment, instead of acting in an additive manner, acts in a multiplicative manner, then X_e = X_c·a_t, and consequently s_e² = s_c²·a_t², because multiplying each value of a variable by a constant results in multiplying the original variance by the square of the constant. Thus, if the treatment effect is multiplicative, the variance of the treatment group may differ significantly from the variance of the control group, thus resulting in heterogeneity of variance. Further, if one treatment effect is additive and the other is multiplicative, this condition can also result in generating differences in the variances of the treatment groups.
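A tiny numerical illustration of the point, using hypothetical control-condition scores and a hypothetical constant a_t = 3: an additive effect leaves the variance unchanged, while a multiplicative one scales it by the square of the constant.

```python
from statistics import pvariance

control = [9, 7, 5, 4, 2]          # hypothetical control-condition scores
a_t = 3                            # hypothetical constant treatment effect

additive = [x + a_t for x in control]        # X_e = X_c + a_t
multiplicative = [x * a_t for x in control]  # X_e = X_c * a_t

# Adding a constant shifts the mean only; the variance is untouched.
assert pvariance(additive) == pvariance(control)
# Multiplying by a constant scales the variance by a_t squared.
assert abs(pvariance(multiplicative) - pvariance(control) * a_t ** 2) < 1e-9
```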

INDEPENDENCE

The assumption of independence states that the score for any particular subject is independent of the scores of all other subjects. This assumption is essential. If we take only one observation from each subject and subjects are assigned at random to the different treatments, the assumption of independence of scores will, generally, be met.

Let us explain the point with the help of an example. Suppose we have obtained 5 scores from subjects in a group and the scores are: 9, 7, 5, 4, and 2. The determination that the second score is 7 must in no way be influenced by or related to the first score, 9, and so on for the other scores in the group.

This condition is often violated in experimental studies. Let us explain one of the most common violations committed by investigators. Suppose, in an experiment on RT, an investigator takes 5 responses (RTs) for each treatment condition from each subject. The RTs of a subject under one treatment condition, say, are 180, 192, 178, 179, and 195 msecs. Someone may erroneously treat each one of the five responses as a value of X in computing ΣX. These five scores are not independent, for the second response is related to the first response, the third to the first, and so on. That is, all the responses are from the same subject and are related to each other. In other words, if the subject is fast in responding, he will be fast throughout the set of responses and the scores will be correlated. Positive correlations can result in inflating the Type I error rate and negative correlations in its deflation (Cochran, 1947; Scheffé, 1959). This also artificially inflates N. Therefore, the appropriate procedure would be to obtain one single score for each subject, say, the mean of the five responses, and n will be equal to 1, and not 5.
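The remedy just described, collapsing the five correlated responses into a single score per subject, amounts to the following (using the five RTs from the text):

```python
from statistics import mean

# Five RTs from ONE subject under one treatment condition (msec).
# They are correlated, so they contribute one score, not five observations.
rts = [180, 192, 178, 179, 195]
score = mean(rts)          # single score for this subject; n counts as 1
print(score)  # → 184.8
```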
In the case of repeated measures designs or within subjects designs, where each subject is tested under different treatment conditions, we can expect that the scores will be correlated. However, the assumption of independence in such designs, generally, is not valid. We shall elaborate this point later in Chapter 6.

GENERAL COMMENTS ON ASSUMPTIONS


It has been observed in the foregoing discussion that even extreme non-normality does not affect the Type I or Type II error rate much. Also, moderate departures from the assumption of homogeneity of variance do not seriously affect the appropriateness of the F test. However, problems arise when heterogeneous variances are accompanied by unequal n's. Unequal n's should be avoided wherever possible. If we have an equal number of subjects under each treatment group, the homogeneity of variance assumption can be violated without appreciably affecting the F test. However, the assumption of independence of scores is essential.
Randomization is necessary to ensure the validity of the independence assumption. In practice, it is generally difficult to follow the dictates set forth by the theory of random sampling. In experimental work, usually we include, as subjects, those members of the population that are easily accessible to us. For example, a research worker in psychology may include in his sample male students from a course in Psychology, because these subjects are easily available.
The experimenter should draw his experimental subjects at random from those subjects that are
easily accessible to him. If this is not possible, he can randomly assign the available subjects to the
different treatment conditions in his experiment. Having done this, he may contend that his experimental
groups are all random samples from the same population. Having employed the technique of assigning
subjects randomly to the experimental conditions, the experimenter should restrict the statistical inferences
to this hypothetical parent population.
It is necessary to assign the subjects at random to the treatment conditions as each subject has his
own unique effect which will interact with the treatment condition. If the subjects are randomly distributed
over the treatment conditions, these unique effects will be distributed evenly among the different treatment
conditions.

TRANSFORMATIONS
In research investigations, it is not unusual to encounter situations in which one or more assumptions
underlying the F test are violated. One way to deal with such situations is to use some suitable non-
parametric test. Another way is to change the scale of measurement by suitable transformation.
Transformation is, thus, a change in the scale of measurement. For example, rather than time in seconds,
the scale of measurement may be reciprocal of time in seconds as criterion score.
There are different reasons for making transformations in the scale of measurement. Ordinarily, the three assumptions, that is, additivity, normality and homogeneity, are violated together (Snedecor, 1956, p. 315). Ideally, the transformation should be able to remedy all the problems, but in practice it is not often possible. Additivity is the most essential requirement, and next is the homogeneity of variance.
Empirical studies have shown that even when the distribution departs appreciably from normality,
it has very little effect on the results. Box (1953) has shown that distribution of F ratio in the analysis of
variance is affected relatively little by inequalities in the variances which are pooled into the experimental
error. Further, Box has shown that the sampling distribution of F-ratio is relatively insensitive to moderate
departures from normality. For skewed or very leptokurtic or platykurtic distributions, scores should be
appropriately transformed.
Let us consider some transformations that are appropriate for different conditions. These are called monotonic transformations, as the transformations leave the ordinal relationship¹ unchanged.

Square Root Transformation

When we consider counts of observations, the distribution is likely to be Poisson in nature; for example, the number of times the rat presses the bar in a fixed period of time. For variables that have a Poisson distribution, the mean is equal to the variance. That is, μ = σ².

For the Poisson distribution, Bartlett (1936) suggests a square root transformation. That is, we should transform each value of X to √(X + 0.5). Freeman and Tukey (1950) suggest that transformations are improved by taking √X + √(X + 1) for each value of X. Mosteller and Bush (1954) have provided a table of values of √X + √(X + 1).

If we observe that the means of the treatments are equal to their respective variances, then this suggests that the square root transformation is required. If we have two treatment groups and we observe that the mean and the variance of one group is, say, 3.8 and the mean and the variance of the other group is 6.3, then the distribution is Poisson and hence, the square root transformation is required.

Logarithmic Transformation

When the treatment standard deviations vary directly as the means, the transformation of the observations to a logarithmic scale is suggested by Bartlett (1947). That is, if the means are plotted against their standard deviations, there will be a clear linear relationship with considerable gradient; the standard deviations increase proportionately as the means increase. If such a relationship exists, a logarithmic transformation of the data renders the variances independent of the means and puts the data in a form suitable for the application of the analysis of variance. Apart from stabilizing the variance for changes in the mean, it also makes the data more normal. It may correct more serious cases of non-additivity where the square root transformation fails (Bartlett, 1947).

In logarithmic transformation, we transform each value of X to log X. For X equal to zero, the transformation may take the form log (1 + X). Logarithmic transformation is used when the effects are known to be proportional instead of additive. The proportions of ranges and means may be usefully checked instead of computing proportions of standard deviations and means. Empirically, it has been found to work satisfactorily.

Morgan (1945), in a study of the hoarding behaviour of rats, found that the logarithm of the number of pellets hoarded resulted in distributions that were approximately normal and homogeneous with respect to the variances.
Reciprocal Transformation

When the treatment standard deviations are proportional to the square of the means, the appropriate transformation is the reciprocal of the observations, that is, transformation of each value of X into 1/X.

A reciprocal transformation is useful in experiments where time is the dependent variable. For example, the transformation may be useful in RT experiments and where the time taken to solve the problem is the dependent variable. By this transformation, the variances get stabilized.
¹Greater than, equal to, or less than relationship.

Arcsin Transformation
When the observations have a binomial distribution, that is, when the observations are proportions or percentages, the transformation is done to the angle whose sine is the square root of the proportion or percentage. That is, sin⁻¹ √p (read sine inverse) or arcsin √p, where p is the percentage or proportion of correct responses in a fixed number of trials. Values of the transformation have been tabulated by Bliss (1937). These tables have been reproduced by Snedecor (1956) and Guilford (1954). This transformation stabilizes the variance. The table weighs more heavily the small percentages or proportions, which have small variance. The arcsin transformation table is given in the Appendix, Table I.

General Comments on Transformations


We have briefly discussed some of the transformations that can be applied to data to rectify situations in
which the assumptions underlying the analysis of variance get violated. Transformations help in changing
the scale of measurements and are monotonic in nature. However, the major problem is the choice of
the transformation. The chosen transformation should be appropriate for a specific distribution.
In deciding which of the several possible transformations to use in a specific case, one may have to
try different types of transformations. The choice should be a transformation which puts the data in a
form that satisfies the basic assumptions underlying the analysis of variance. It is desirable to carefully
observe the data and have a rough estimate about the means, standard deviation or variance. The use of
range statistics, in place of standard deviation, provides a relatively simple method for checking which
of the possible transformations could be applied to the data with minimum of computational effort.
The transformations commonly used are summarized below.
(i) When the variance is proportional to the mean, take the square root of the observations.
(ii) When the standard deviation is proportional to the mean, take the logarithms of the observations.
(iii) When the standard deviation is proportional to the square of the mean, take the reciprocal of the
observations.
(iv) When the observations have a binomial distribution, the transformation is done to the angle whose sine is the square root of the proportion or percentage.
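The four rules above translate directly into code. This is only a sketch: the Bartlett √(X + 0.5) form is used for counts, log10 is an arbitrary choice of base, and the function names are illustrative, not standard.

```python
import math

def sqrt_transform(x):
    """Rule (i): counts / Poisson-like data."""
    return math.sqrt(x + 0.5)

def log_transform(x):
    """Rule (ii): SD proportional to the mean; log(1 + X) when X can be 0."""
    return math.log10(x) if x > 0 else math.log10(1 + x)

def reciprocal_transform(x):
    """Rule (iii): SD proportional to the square of the mean."""
    return 1.0 / x

def arcsin_transform(p):
    """Rule (iv): p is a proportion in [0, 1]; result is an angle in radians."""
    return math.asin(math.sqrt(p))

print(round(sqrt_transform(8), 3),
      round(reciprocal_transform(4), 3),
      round(arcsin_transform(0.25), 3))
# → 2.915 0.25 0.524
```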
When the transformation is required, it should be done before the test of homogeneity of variance
is carried out, as transformations generally obviate the need for testing homogeneity. The use of
transformations to obtain additivity should be considered more important than to attain homogeneity of
variance and normality. Tukey’s (1949) test for non-additivity is used for deciding between alternative
possible transformations. However, it is not possible to eliminate non-additivity in all cases.
Thus, if we have the data that needs transformation, first we take a decision regarding the nature of
transformation that will be appropriate for the data. The analysis of variance is carried out on the
transformed data. After the analysis, the outcome is retransformed for interpretation.

Linear Transformations
Sometimes we obtain data that can be handled better by computing mean or variance from arbitrary
values or observations, derived by subtracting a convenient number from each value, that is, by shifting
the origin. As an example, consider the RT of 10 subjects: 213, 211, 210, 209, 208, 207, 205, 204, 203
and 201 msecs. It would be easier to handle the data if we subtract 200 msecs from each response time
before computing the mean and the variance. The original and transformed observations, labelled X and
X’ respectively, are being presented in Table 2.12.
Table 2.12 The Original and Transformed Observations

Subject      X        X²      X′ (X − 200)    X′²
1           213     45369         13          169
2           211     44521         11          121
3           210     44100         10          100
4           209     43681          9           81
5           208     43264          8           64
6           207     42849          7           49
7           205     42025          5           25
8           204     41616          4           16
9           203     41209          3            9
10          201     40401          1            1
        ΣX = 2071  ΣX² = 429035  ΣX′ = 71   ΣX′² = 635

X̄ = 2071/10 = 207.1        X̄′ = 71/10 = 7.1

SS = 429035 − (2071)²/10 = 130.9        SS′ = 635 − (71)²/10 = 130.9
In Table 2.12, the X observations and the squares of the X observations have been presented, followed by X′ (X − 200). It can be observed that a number (200) has been subtracted from each of the 10 observations and then the X′ observations have been squared. We notice in the example that the mean of the original scores (X̄) is equal to 207.1, and that of the transformed observations (X̄′) is equal to 7.1. Thus, if we add the subtracted number (constant) to the mean of the transformed observations, the sum becomes equal to the mean of the actual observations, i.e., X̄ = X̄′ + 200 = 207.1. However, the sum of squares, and consequently the variance, is not affected by shifting of the origin. This shows that subtraction of a constant from the observations reduces the mean by the same value but does not affect the standard deviation or the variance. Evidently, subtraction of a constant results in moving down the scale of measurement with no change in the sample standard deviation or variance. It merely shifts the origin of measurement without changing the unit of measurement. The procedure makes the computation work much easier and the chances of committing error are minimized. Note that subtraction or addition of a constant affects the mean by the same amount, but leaves the variance or standard deviation unaltered. Subtraction of a constant results in the reduction, and addition in the increase, of the mean value by the same constant.
Similarly, the scores can be divided by a constant to reduce the effort of computation. However, the standard deviation and the variance are affected by the division. In this case, the unit of measurement is changed and, thus, must be compensated for by applying the inverse operation to the obtained standard deviation. That is, the standard deviation is multiplied by the constant to recapture the original unit. Note that multiplication or division of the observed values by a constant will affect the mean as well as the standard deviation. Multiplying a set of values by a constant greater than one will increase the standard deviation and the mean in the same ratio. Similarly, dividing a set of values by a constant greater than one will decrease the standard deviation and the mean in the same ratio. For example, halving a set of values will reduce their standard deviation and mean to one half, whereas the variance will be reduced to one fourth, as V = SD².

ANALYSIS OF VARIANCE: THE FOUNDATION OF EXPERIMENTAL DESIGN
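These effects are easy to verify numerically. The sketch below (plain Python; the helper names are ours, not the book's) recomputes the mean and the deviation sum of squares of the RT data of Table 2.12 before and after subtracting 200, and shows the effect of halving the values.

```python
# Effect of shifting the origin and changing the scale on the mean
# and the sum of squares (RT data from Table 2.12, in msecs).

def mean(xs):
    return sum(xs) / len(xs)

def sum_of_squares(xs):
    # Deviation sum of squares: SS = sum(x^2) - (sum(x))^2 / n
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

rt = [213, 211, 210, 209, 208, 207, 205, 204, 203, 201]
shifted = [x - 200 for x in rt]   # shift of origin: mean drops by 200
halved = [x / 2 for x in rt]      # change of scale: SS drops to one fourth

print(mean(rt), mean(shifted))                      # 207.1 and 7.1
print(sum_of_squares(rt), sum_of_squares(shifted))  # both ~130.9
print(sum_of_squares(halved))                       # ~32.725 = 130.9 / 4
```

Shifting leaves the sum of squares at 130.9, exactly as in the worked example, while halving divides it by four.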

ANALYSIS OF VARIANCE BY RANKS

Sometimes, in research investigations, we have observations on more than two groups and the measurements, for one reason or another, cannot be assumed to be normally distributed, or the measurements are on an ordinal scale. The investigator may be interested in finding out if the k samples are from different populations. If the assumptions of the F test cannot be met, then the null hypothesis (H0) of whether the k independent samples are from different populations can be tested by the Kruskal-Wallis one-way analysis of variance by ranks. Further, if the data have been cast in a two-way table (rows and columns), the null hypothesis can be tested by the Friedman two-way analysis of variance by ranks. These two tests are useful when the F test is not applicable.

The Kruskal-Wallis One-way Analysis of Variance


If we have k independent groups and we want to test whether the samples are from different populations, the Kruskal-Wallis one-way analysis of variance by ranks is a useful technique. The test assumes that the variable under study has an underlying continuous distribution and that the measurement of the variable is at least on an ordinal scale. Kruskal and Wallis (1952) have developed a generalized test for comparing several sets of ranks, which they call H. The Kruskal-Wallis test is applicable to two or more groups. Further, it is not necessary for the number of observations in each group to be equal. Compared with the most powerful parametric F test, under conditions where the F test is applicable, the Kruskal-Wallis test has a power efficiency of 95.5 p.c. (Andrews, 1955). The H statistic is given by

H = [12 / N(N + 1)] Σ (Rj² / nj) − 3(N + 1)          (2.9)

where k = number of groups
      nj = number of observations in the jth group
      N = total number of observations
      Rj = sum of the ranks in the jth sample
      Σ directs one to sum Rj²/nj over the k samples (j = 1, 2, …, k)

Kruskal and Wallis (1952) have shown that if the null hypothesis is true and if the number of observations in each group is not too small (when there are more than 5 cases in the various groups), then H is distributed as χ² (chi-square) with k − 1 degrees of freedom. Thus, we can determine whether or not the null hypothesis is tenable by comparing the table value of χ², for df equal to one less than the number of groups, with the value of H obtained from the data by the above formula.
Numerical Example
Broota and Ganguli (1975) investigated the effect of monetary reward and punishment on perceptual selectivity in three cultural groups—Indian Hindus, Indian Muslims, and U.S. Whites. They selected 8 children in each group. The experiment was conducted in two stages. In the first, the perceptual learning stage, the children were made to learn the names of the profiles in association with monetary rewards and punishments. That is, one half of the ambiguous figure was learnt in association with reward and the other half in association with punishment. Later, in the second stage, the testing stage, the two halves were combined to yield an ambiguous situation in a complete circle such that in tachistoscopic exposure, the subjects could perceive one profile as figure and the other as background. Two such ambiguous figures were used. The dependent variable constituted the number of times the subjects reported having perceived the rewarded profile. Total trials were 40.

The experimenters were interested in finding whether reward and punishment differentially affected the three groups. The outcome of the experiment, in terms of the number of times each of the 24 subjects, organised in the three groups, reported rewarded profiles (which were regarded as reward scores), is presented in Table 2.13.
Table 2.13 Reward Scores of the Subjects

                         Cultural Groups
Subject
number    Hindu    R1    Muslim    R2     U.S. White    R3
  1        10       5      22     16.5        24        20
  2         9       4      18     13          29        23
  3        16      10.5    14      9          23        18.5
  4        17      12      20     14          33        24
  5         4       1       8      3          22        16.5
  6        12       6       5      2          23        18.5
  7        21      15      16     10.5        25        21
  8        13       7.5    27     22          13         7.5
               R1 = 61.0       R2 = 90.0          R3 = 149.0


Computation
We compute the value of H, uncorrected for ties, by formula 2.9:

H = [12 / (24)(24 + 1)] [(61)²/8 + (90)²/8 + (149)²/8] − 3(24 + 1)
  = (12/600) [465.125 + 1012.5 + 2775.125] − 75 = 10.055

Comments
Step 1: The number of times each of the 24 subjects reported the rewarded profiles, regarded as reward
scores, have been presented in Table 2.13. First, we rank all the 24 scores from the lowest to the highest
number of rewarded profiles. That is, treating all the three groups as one, rank the responses giving rank
1 to the subject who gave minimum number of reward-associated responses, and rank 24 to the one who
gave the maximum. It is immaterial whether the ranking is from lowest to highest or from highest to
lowest. In the experiment, subject number 5 in the Hindu group reported the least number of rewarded
profiles, thus, he has been assigned rank 1. The 4th subject in the U.S. White group reported maximum
number of rewarded profiles, thus, he has been assigned rank 24. These ranks are, then, separately
added for the three groups to obtain R, = 61, R, = 90, and R, = 149, as has been shown in Table 2.13.
Observations tied for a rank have been assigned the average value of the ranks they would have ordinarily occupied. For example, there are two observations having a score of 13. These two observations would have ordinarily occupied ranks 7 and 8. So, the average rank of 7.5, i.e. (7 + 8)/2, is assigned to each.
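The midrank rule used in Step 1 can be sketched as a small Python helper (an illustration of ours, not part of the text): each observation receives the average of the positions it would occupy in the sorted pooled sample, so two tied scores of 13 sitting at positions 7 and 8 each receive 7.5.

```python
def midranks(scores):
    """Assign ranks 1..n, giving tied scores the average of the
    ranks they would ordinarily occupy."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with scores[order[i]]
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2        # average of positions i+1 .. j+1
        for m in range(i, j + 1):
            ranks[order[m]] = avg
        i = j + 1
    return ranks

print(midranks([4, 13, 8, 13]))   # [1.0, 3.5, 2.0, 3.5]
```

Applied to the pooled 24 reward scores, this rule reproduces midranks such as the 7.5 assigned to each score of 13.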

Step 2: Formula 2.9 is applied to obtain the value of H. Here N is equal to 24. Further, Σ (Rj²/nj) is obtained by taking the sum of the ranks of each of the three samples (R1 = 61, R2 = 90, and R3 = 149), squaring each, and dividing by nj (8), the number of observations in each subgroup. In this way, the value of H is obtained, which in this case is equal to 10.055.
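Steps 1 and 2 can be checked numerically. The sketch below (plain Python; the function name is ours) computes H from the rank sums and group sizes of Table 2.13, reproducing the uncorrected value of 10.055.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    # H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1)   (formula 2.9)
    N = sum(group_sizes)
    s = sum(R * R / n for R, n in zip(rank_sums, group_sizes))
    return 12.0 / (N * (N + 1)) * s - 3 * (N + 1)

# Rank sums for the Hindu, Muslim and U.S. White groups (Table 2.13)
H = kruskal_wallis_h([61.0, 90.0, 149.0], [8, 8, 8])
print(round(H, 3))   # 10.055
```

The same function handles unequal group sizes, since each rank sum is divided by its own nj.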

Test of Significance
The next step is to determine the significance of the obtained H. Kruskal and Wallis (1952) show that if the null hypothesis is true and if the number of observations in each group is not too small, then H is distributed as χ² with k − 1 degrees of freedom. Consequently, we consult the χ² table for the value of H obtained, that is, 10.055, with 2 df (3 − 1). We observe from the χ² table (Appendix, Table C) that, for 2 degrees of freedom, the probability of obtaining a value of H equal to 10.055 is < .01. The null hypothesis (H0) that the three samples have come from identical populations is, thus, rejected. As the null hypothesis is rejected, we accept the experimental hypothesis (H1) that the three groups differ in their perceptual responses. In general, we conclude that the three means are not equal.
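The table look-up can also be verified directly: for 2 degrees of freedom the χ² distribution is exponential with mean 2, so P(χ² ≥ x) = e^(−x/2). A quick numerical check (ours, not a step in the book):

```python
import math

# For df = 2, P(chi-square >= x) = exp(-x / 2) exactly.
p = math.exp(-10.055 / 2)
print(round(p, 4))   # about 0.0066 -- well below .01, so H0 is rejected
```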
Tied Observations
In the above experiment, whenever ties occur between two or more observations, each score is given the mean of the ranks for which it is tied. As long as the number of ties is not too large, the correction introduced for the tied observations will have relatively little influence. However, if the number of tied ranks is large, the value of H is somewhat influenced by ties. Thus, it is desirable to apply a correction for the ties in computing H. To correct the value of H obtained by formula 2.9 for ties, it is divided by

1 − ΣT / (N³ − N)          (2.10)

where T = t³ − t, t being the number of tied observations in a tied group
      N = total number of observations in the experiment
In the reward and punishment experiment, there were 4 groups of ties in the data. The first group had ties in 2 scores (13 & 13), the second group had ties in 2 scores (16 & 16). Similarly, the third and fourth groups had ties in 2 scores (22 & 22 and 23 & 23, respectively). Here, t is equal to 2, and therefore, T = 6 (i.e. 2³ − 2). Thus, ΣT = 24 (i.e. 6 + 6 + 6 + 6). The total number of observations in the experiment is 24 (N = 24). Now, applying formula 2.10,

1 − 24 / (24³ − 24) = .9983

Now, we divide the obtained value of H, 10.055, by .9983. Thus, the value of H corrected for ties is given by

H = 10.055 / .9983 = 10.072

We observe that the correction increases the value of H and, thus, makes the result more significant than if uncorrected. In the present case, we observe that the correction has increased the value of H by .017. The correction has not made much difference in the significance level. It should be remembered that if no more than 25 per cent of the observations are involved in ties, then the probability associated with H computed without the correction for ties is not changed by more than 10 p.c. on making the correction for ties (Kruskal and Wallis, 1952, p. 587). The magnitude of the correction depends on the size of the ties (value of t) as well as on the percentage of the observations involved in the ties. In the present case, the length of ties in all the groups was 2 and the percentage of the observations involved in the ties was 33 p.c. [(8 × 100)/24].
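The correction can be reproduced in a few lines of Python (the function name is ours). Note that dividing by the divisor rounded to .9983, as in the text, gives 10.072; carrying full precision gives 10.0725, the same value to two decimals.

```python
def tie_correction(tie_lengths, N):
    # Divisor of formula 2.10: 1 - sum(t^3 - t) / (N^3 - N)
    return 1 - sum(t ** 3 - t for t in tie_lengths) / (N ** 3 - N)

# Four tie groups, each of length 2, among N = 24 observations
D = tie_correction([2, 2, 2, 2], 24)
H_corrected = 10.055 / D
print(round(D, 4), round(H_corrected, 2))   # 0.9983 10.07
```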
The Friedman Two-way Analysis of Variance

The Friedman two-way analysis of variance is a useful technique for testing the null hypothesis (H0) that k matched samples have been drawn from the same population. The Kruskal-Wallis one-way analysis of variance is useful when we have to compare several means derived from k independent samples. In the Friedman two-way analysis of variance, the k samples are matched on certain variables, or the same group of subjects is studied under each of the k conditions. "Two-way" only refers to the casting of data in which there are r rows and k columns. The rows may represent the groups matched on certain variables and the columns various experimental conditions. Our interest is to find the differences among the k conditions only. The data should be at least on an ordinal scale.
representing conditions differ significantly and
The Friedman test determines whether the columns representing conditions differ significantly, and the statistic used is χr², which is given by

χr² = [12 / rk(k + 1)] Σ (Rj)² − 3r(k + 1)          (2.11)

where r = number of rows
      k = number of columns
      Rj = sum of the ranks in the jth column
      Σ (Rj)² directs one to sum the squares of the sums of ranks over all k conditions (j = 1, 2, …, k)
Friedman (1937) shows that when the number of rows and/or columns is not too small, χr² is distributed approximately as chi-square with df = k − 1. Thus, we can determine whether or not the null hypothesis is tenable by consulting the table of χ² for the value of χr² obtained from the data by formula 2.11, for df equal to one less than the number of columns. An empirical study by Friedman (1937, p. 686) has shown very favourable results for the χr² test as compared with the most powerful parametric F test.

Numerical Example
Suppose an experimenter interested in evaluating the effect of three types of reinforcement upon the extent of discrimination learning took 20 sets of rats, 3 in each set. The 3 rats in each of the sets were matched. The three kinds of reinforcement were designated as RE1, RE2, and RE3. In each set, the 3 rats were assigned randomly to the three reinforcement conditions. After the training with the three types of reinforcement, the extent of learning was measured in terms of latency of correct response. In such experiments, the null hypothesis (H0) that the different types of reinforcement have no differential effect is evaluated through the Friedman two-way analysis of variance. In this experiment, the latency of correct response in each of the 20 groups was ranked, giving rank 1 to the rat that had the fastest latency and 3 to the one that had the slowest latency. Rank 1, thus, signifies strong learning. Remember that in the present example, the scores are ranked in each row from 1 to 3. The ranks of the 20 matched groups in respect of the extent of learning under the three kinds of reinforcement are given in Table 2.14.

Table 2.14 Ranks of Twenty Matched Groups in Respect of the Extent of Learning
Under the Three Kinds of Reinforcement

Group      Kind of Reinforcement
           RE1      RE2      RE3
  1         2        3        1
  2         1        2        3
  3         3        1        2
  4         3        1        2
  5         3        2        1
  6         2.5      2.5      1
  7         1        3        2
  8         2        3        1
  9         3        1        2
 10         3        2        1
 11         2        3        1
 12         3        1        2
 13         2        3        1
 14         1        3        2
 15         3        1        2
 16         2        1        3
 17         2.5      2.5      1
 18         2        3        1
 19         3        2        1
 20         2        3        1
Rj         46       43       31
Computation
Applying formula 2.11,
χr² = [12 / (20)(3)(3 + 1)] [(46)² + (43)² + (31)²] − (3)(20)(3 + 1)
    = (12/240)(4926) − 240 = 6.3

Comments
Step 1: The observations have been ranked in each row from 1 to 3 (1 to k) and presented in Table 2.14. The sums of the ranks in each of the three columns (Rj) have been determined. The sums of the ranks in columns RE1, RE2 and RE3 are 46, 43 and 31 respectively. If the data are in terms of scores, remember to rank the scores in each row from 1 to k. Generally, rank 1 is assigned to the subject that has most of the attribute in which we are interested, and the subject with the least amount of the attribute is assigned the last rank, k (in the present example, 3). The rat with the fastest latency was assigned rank 1 and rank 3 to the slowest.
l to 20, the number of rows
2: The value of x? is comp uted usin g formula 2.11. Here,’ is equa
Step k

Further, SiR iy is
to 3, the num ber of col umn s or types of reinforcement.
or groups, k is equal j=l

RE, = 43, and


addi ng up the sums of the rank s of the three columns (RE, = 46,
obtained by squaring and
ined value of x2 is equal to 6.3.
RE, = 31). In this way, the obta
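Steps 1 and 2 can be verified numerically. The sketch below (plain Python; the function name is ours) evaluates formula 2.11 from the column rank sums of Table 2.14, and checks the tail probability using the fact that for 2 df, P(χ² ≥ x) = e^(−x/2).

```python
import math

def friedman_chi_r(rank_sums, r):
    # chi_r^2 = 12 / (r k (k+1)) * sum(R_j^2) - 3 r (k+1)   (formula 2.11)
    k = len(rank_sums)
    return 12.0 / (r * k * (k + 1)) * sum(R * R for R in rank_sums) - 3 * r * (k + 1)

chi_r = friedman_chi_r([46, 43, 31], r=20)   # column rank sums, 20 matched groups
p = math.exp(-chi_r / 2)                     # exact tail probability for df = 2
print(round(chi_r, 1), round(p, 3))          # 6.3 and about 0.043, between .05 and .02
```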

Test of Significance
The next step is to determine the significance of the obtained χr². The method of determining the probability of occurrence under the null hypothesis (H0) associated with the observed value of χr² depends on the size of r and/or k. We determine the associated probability by referring to the χ² table in the Appendix, Table C, for df = k − 1 = 2. We observe that for 2 degrees of freedom, the probability of obtaining a value of χr² equal to 6.3 is between .05 and .02. Thus, we reject the null hypothesis. The conclusion is that the three different kinds of reinforcement had differential effect on discrimination learning in rats.

Tied Observations
tied for ranks 2 and
rats had the same latencies and were, thus,
In group 6 as well as 17, the RE, and RE, (1937) states that
3/2), the average of the tied ranks. Friedman
3. Both were assigned the ranks 2.5 (2 + t the validity of the x; test.
vations does not affec
the substitution of the average rank for tied obser
any correction for tied observations.
Therefore, in this test there is no need to apply
Small Samples
When the number of observations in each group is small, the values can be found from tables provided by Siegel (1956). The same formula (2.11) is used for determining the χr² value for large and small samples.
