Broota Sir Book
Winer (1971) has compared the design of an experiment to an architect's plan for the structure of a
building. The designer of experiments performs a role similar to that of the architect. The prospective
owner of a building gives his basic requirements to the architect, who then, exercising his ingenuity, prepares a plan or blue-print outlining the final shape of the structure. Similarly, the designer of the
experiment has to do the planning of the experiment so that the experiment on completion fulfils the
objectives of research. According to Myers (1980), the design is the general structure of the experiment,
not its specific content.
EXPERIMENTAL DESIGN: AN INTRODUCTION
Though there are different objectives in designing an experiment, it would not be an overstatement to say that the most important function of experimental design is to control variance. According to
Lindquist (1956), “Research design is the plan, structure, and strategy of investigation conceived so as
to obtain answers to research questions and to control variance”. The first part of the statement emphasizes
the objective of research, that is, to obtain answers to research questions. The most important
function of the design is the strategy to control variance. This point will be elaborated in the discussion
that follows.
Variance control, as we shall notice throughout this book, is the central theme of experimental design.
Variance is a measure of the dispersion or spread of a set of scores. It describes the extent to which the
scores differ from each other. Variance and variation, though used synonymously, are not identical
terms. Variation is a more general term which includes variance as one of the statistical methods of
representing variation. A lot more is discussed about variance in chapter 2. Here we shall confine the
discussion to its importance and the methods of its control.
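As a minimal illustration, the variance of a small set of scores can be computed with Python's standard library. The scores below are invented for illustration; the `statistics.variance` function uses the sample formula, dividing by n − 1.

```python
import statistics

# Scores of five subjects on a dependent measure (illustrative data)
scores = [12, 15, 9, 14, 10]

mean = statistics.mean(scores)       # 12.0
# Variance: the average squared deviation of the scores from the mean.
# Deviations are 0, 3, -3, 2, -2; squared sum is 26; 26 / (5 - 1) = 6.5.
var = statistics.variance(scores)    # 6.5
```

A wider spread of scores around the mean yields a larger variance; identical scores yield a variance of zero.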
The problem of variance control has three aspects that deserve full attention. The three aspects of
variance are: systematic variance, extraneous variance and error variance. The main functions of experimental
design are to maximize the systematic variance, control the extraneous source of variance¹, and
minimize the error variance. The major function of experimental design is to take care of the second of
these, that is, control of the extraneous source of variance. Here we shall consider this aspect in comparatively
greater detail. It will be seen later on that various designs are available for controlling the extraneous
source of variance in different situations, and with the help of these designs, an experimenter can draw
valid inferences.
SYSTEMATIC VARIANCE
Systematic variance is the variability in the dependent measure due to the manipulation of the experimental
variable by the experimenter. An important task of the experimenter is to maximize this variance. This
objective is achieved by making the levels of the experimental variable/s as unlike as possible. Suppose,
an experimenter is interested in studying the effect of intensity of light on visual acuity. The experimenter
decides to study the effect by manipulating three levels of light intensity, i.e., 10 mL, 15 mL, and
20 mL. As the difference between any two levels of the experimental variable is not substantial, there is
little chance of separating its effect from the total variance. Thus, in order to maximize systematic
variance, it is desirable to make the experimental conditions (levels) as different as possible. In this
experiment, it would be appropriate, then, to modify the levels of light intensity to 10 mL, 20 mL, and
30 mL, so that the difference between any two levels is substantial.
EXTRANEOUS VARIANCE
In addition to the independent variable and the dependent variable, which are the main concerns in any
experiment, extraneous variables that can influence the dependent variable are encountered in all
experimental situations.
¹Extraneous source of variance is contributed by all the variables other than the independent variable whose effect is
being studied in the experiment. These variables have often been called extraneous variables, irrelevant variables,
secondary variables, nuisance variables, etc. In this book, all variables in the experimental situation other than the
independent variable have been termed extraneous variables or secondary variables.
EXPERIMENTAL DESIGN IN BEHAVIOURAL RESEARCH
There are five basic procedures for controlling the extraneous source of variance. These procedures
are:
(i) Randomization (ii) Elimination (iii) Matching (iv) Additional independent variable (v) Statistical control
Randomization
An important method of controlling extraneous variable/s is randomization. It is considered to be the
most effective way to control the variability due to all possible extraneous sources. If thorough
randomization has been achieved, then the treatment groups in the experiment could be considered
statistically equal in all possible ways. Randomization is a powerful method of controlling secondary
variables. In other words, it is a procedure for equating groups with respect to secondary variables.
According to Cochran and Cox (1957), “Randomization is somewhat analogous to insurance in that it is
a precaution against disturbances that may or may not occur and that may or may not be serious if they
do occur”.
Randomization in the experiment could mean random selection of the experimental units from the
larger population of interest to the experimenter, and/or random assignment of the experimental units or
subjects to the treatment conditions. Random assignment means that every experimental unit has an
equal chance of being placed in any of the treatment conditions or groups. However, in making groups
equal in the experiment, we may have random assignment with constraints. The assignment is random,
except for our limitations on number of subjects per group or equal number of males and females, and
so on. Random selection and random assignment are different procedures. It is possible to select a
random sample from a population, but then assignment of experimental units to groups may get biased.
Random assignment of subjects is critical to internal validity. If subjects are not assigned randomly,
confounding² may occur.
An experimental design that employs randomization as a method of controlling extraneous variables
is called a randomized group design. For example, in the randomized group design (chapter 3), the extraneous
source of variance due to individual differences is controlled by assigning subjects randomly to, say, k
treatment conditions in the experiment. According to McCall (1923), “Just as representativeness can be
secured by the method of chance, ... so equivalence may be secured by chance, provided the number
of subjects to be used is sufficiently numerous”. This refers to achieving comparable groups through the
principle of chance. It may, however, be noted that randomization is employed even when subjects are
matched. In repeated measures design (within subject design), where each subject undergoes all the
treatment conditions, the order in which treatments are administered to the subjects is randomized
independently for each subject (see chapters 6, 11 and 12).
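The two procedures described above, random assignment of subjects to groups and randomized treatment order for repeated measures, can be sketched in a few lines of Python. The subject counts and treatment labels here are hypothetical.

```python
import random

subjects = list(range(1, 31))        # 30 subjects, numbered 1..30
k = 3                                # number of treatment conditions
n = len(subjects) // k               # constraint: equal group sizes (n = 10)

# Random assignment: shuffle, then slice into k groups of n subjects each,
# so every subject has an equal chance of landing in any group.
random.shuffle(subjects)
groups = [subjects[i * n:(i + 1) * n] for i in range(k)]

# For a repeated measures design, randomize the order of treatments
# independently for each subject instead (here, for subjects 1..5):
treatments = ["T1", "T2", "T3"]
orders = {s: random.sample(treatments, len(treatments)) for s in range(1, 6)}
```

The slicing after a full shuffle implements "random assignment with constraints": assignment is random except that each group receives exactly n subjects.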
Fisher’s most fundamental contribution has been the concept of achieving pre-experimental equation
of groups through randomization. Equating of the effects through random assignment of subjects to
groups in the experiment is considered to be the overall best tool for controlling various sources of
²Term used to describe an operation of variables in an experiment that confuses the interpretation of data. If the
independent variable is confounded with a secondary variable, the experimenter cannot separate the effects of the two
variables on the dependent measure.
extraneous variation at the same time. Perhaps, the most important discriminating feature of the
experimental design, as compared to the quasi-experimental design³, is the principle of randomization.
Elimination
Another procedure for controlling the unwanted extraneous variance is elimination of the variable by so
choosing the experimental units that they become homogeneous, as far as possible, on the variable to be
controlled. Suppose, the sex of a subject, an unwanted secondary variable, is found to influence the
dependent measure in an experiment. Therefore, the variable of sex (secondary source of variance) has
to be controlled. The experimenter may decide to take either all males or all females in the experiment,
and thus, control through elimination the variability due to the sex variable. The procedure explained in this
particular example is also referred to as the method of constancy. Let us take another example to illustrate
the control of unwanted extraneous variance by elimination. Suppose, intelligence of the subjects in the
group is found to influence the scores of the subjects on an achievement test. Its potential effect on the
dependent variable can be controlled by selecting subjects of nearly uniform intelligence. Thus, we can
control the extraneous variable by eliminating the variable itself. However, with this procedure we lose
the power of generalization of results. If we select subjects from a restricted range, then we can discuss
the outcome of experiment within this restricted range, and not outside it. Elimination procedure for
controlling the extraneous source of variance is primarily a non-experimental design control procedure.
Elimination as a procedure has the effect of accentuating the between group variance through decrease
in the within group or error variance.
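Control by elimination (the method of constancy) amounts to restricting selection on the secondary variable, as this small Python sketch with invented intelligence scores shows:

```python
# Hypothetical subject pool: subject id -> intelligence score
pool = {1: 96, 2: 118, 3: 104, 4: 131, 5: 101, 6: 99, 7: 122, 8: 103}

# Method of constancy: keep only subjects within a narrow band of the
# extraneous variable, eliminating its variability from the experiment.
selected = [s for s, iq in pool.items() if 95 <= iq <= 105]
```

Note the trade-off the text describes: the results then generalize only to subjects within this restricted range of intelligence.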
Matching
Another procedure, which is also a non-experimental design procedure, is control of extraneous source
of variance through matching. The procedure is to match subjects on that variable which is substantially
related to the dependent variable. That is, if the investigator finds that the variable of intelligence is
highly correlated with the dependent variable, it is better to control the variance through matching on
the variable of intelligence. Suppose, an investigator is interested in studying the efficacy of method of
instruction on the achievement scores of 10th grade children. The methods to be evaluated are:
lecture, seminar, and discussion. Here the method of instruction is the experimental variable of interest
to the investigator. The investigator discovers that the achievement scores (DV) are positively correlated
with the intelligence of the subjects, that is, subjects with high intelligence tend to score high on the
achievement test and those with low intelligence tend to score low on it. Thus, the
variable of intelligence (not of direct interest to the investigator) needs to be controlled because it is a
source of variance that will influence the achievement scores. In this experiment, the extraneous variable
(intelligence) can be controlled by matching the subjects in the three groups on intelligence (concomitant
variable).
However, matching as a method of control limits the availability of subjects for the experiment. If
the experimenter decides to match subjects on two or three variables, he may not find enough subjects
for the experiment. Besides this, the method of matching biases the principle of randomization. Further,
matching the subjects on one variable may result in their mismatching on other variables.
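Matching on a concomitant variable can be sketched as follows. The subject pool and intelligence scores are hypothetical; note that within each matched set, assignment to the three methods is still randomized.

```python
import random

# Hypothetical pool: (subject_id, intelligence score)
pool = [(1, 112), (2, 98), (3, 130), (4, 101), (5, 115),
        (6, 99), (7, 127), (8, 111), (9, 131)]

# Sort on the matching variable, then slice into matched triplets of
# subjects with nearly equal intelligence.
pool.sort(key=lambda s: s[1])
triplets = [pool[i:i + 3] for i in range(0, len(pool), 3)]

# Within each matched triplet, assign the three subjects to the three
# methods at random, so matching is combined with randomization.
methods = ["lecture", "seminar", "discussion"]
assignment = {}
for triplet in triplets:
    for (sid, _), method in zip(random.sample(triplet, 3), methods):
        assignment[sid] = method
```

Each method thus receives one member of every matched triplet, equating the groups on intelligence.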
Additional Independent Variable
Sometimes the experimenter may consider elimination inexpedient or impractical. He may not eliminate
the extraneous variable (not of direct interest to the experimenter) from the experiment and, thus, build
³In this book, as stated earlier, the subject of experimental design is treated in the “Fisher tradition”. The reader is
advised to go through the other aspect of designing, referred to in the introductory paragraph with the first meaning of
the term, experimental design. Campbell and Stanley (1963) have presented an excellent treatment of the subject of
experimental and quasi-experimental designs in the non-statistical tradition.
it right into the design as a second independent variable. Suppose, an experimenter is interested in
studying the efficacy of methods of instruction on achievement scores. He does not want to eliminate
the variable of intelligence. He introduces intelligence as an attribute⁴ variable. He creates three groups
on the basis of intelligence scores of the subjects. The three groups consist of subjects of superior
intelligence, average intelligence and low intelligence as levels of the second variable (intelligence).
With the help of analysis of variance, the experimenter can take out the variance due to intelligence
(main effect of intelligence) from the total variance. The experimenter may decide to study the influence
of intelligence on achievement, and also the interaction between intelligence and method of instruction.
Thus, the secondary source of variance is controlled by introducing the secondary variable as an
independent variable in the experiment, and the experimenter gets the advantage of isolating the effect
of intelligence on achievement and the interaction effect as additional information.
The outcome of such a control procedure is a factorial design. In the above example, it will be a
3 x 3 factorial design. Here, the first variable or factor is intelligence (having three levels) and the
second variable is the method of instruction (three levels). The first factor or independent variable is a
classification variable or a control variable and the second one is the experimental variable, which was
directly manipulated by the experimenter.
Statistical Control
In this approach, no attempt is made to restrain the influence of secondary variables. In this technique,
one or more concomitant secondary variables (covariates) are measured, and the dependent variable is
statistically adjusted to remove the effects of the uncontrolled sources of variation. Analysis of covariance
is one such technique. It is used to remove statistically the possible amount of variation in the dependent
variable due to the variation in the concomitant secondary variable. The method has been presented in
chapter 14.
The extraneous source of variance can also be controlled with the help of various experimental
designs. For example, we can make the extraneous variable constant by “blocking” the experimental
units as in the randomized complete block design (chapter 5). In this design, the subjects pretested on
the concomitant secondary variable are grouped in blocks on the basis of their scores on the concomitant
variable so that the subjects within blocks are relatively homogeneous. The purpose is to create between
block differences. Later on, the variance between the blocks is taken out from the total variance. Thus,
the variability due to the extraneous variable is statistically held constant.
Let us take up an example to illustrate this point. Suppose, an investigator finds that anxiety level of
the subjects, an extraneous variable of no direct consequence to the purposes of the experiment, influences
the dependent variable in the experiment. The experimenter can control this secondary source of variation
through elimination, that is, by selecting subjects of low anxiety level only. However, this procedure
will limit the generality of the results. So the experimenter may decide to apply a statistical technique to
control the extraneous variable (anxiety level). He can administer an anxiety test (to measure the concomitant
variable) to all the subjects (selected randomly for the experiment), and then create blocks on the basis
of their anxiety scores such that within the blocks the subjects are as homogeneous as possible, and the
differences between the blocks are high. In such a design, the variability due to the block differences is
taken out from the total variation. Thus, the statistical control technique can be utilized by the experimenter
to control the variance contributed by an extraneous variable.
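The blocking procedure just described can be sketched in Python. The anxiety scores below are invented; subjects are ranked on the concomitant variable and consecutive subjects form a block.

```python
# Hypothetical anxiety scores (the concomitant variable), one per subject
anxiety = {1: 22, 2: 35, 3: 18, 4: 40, 5: 25, 6: 31, 7: 19, 8: 38, 9: 27}

# Rank subjects on the concomitant variable, then group consecutive
# subjects into blocks of three: subjects are homogeneous within a
# block, and the differences between blocks are high.
ranked = sorted(anxiety, key=anxiety.get)
blocks = [ranked[i:i + 3] for i in range(0, len(ranked), 3)]
# blocks -> [[3, 7, 1], [5, 9, 6], [2, 8, 4]]
```

The first block holds the three least anxious subjects, the last block the three most anxious; the analysis then removes the between-block variability from the total variance.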
ERROR VARIANCE
The results of experiments are affected by extraneous variables which tend to mask the effect of
experimental variable. The term experimental error or error variance is used to refer to all such
⁴A characteristic that can be identified and measured.
uncontrolled sources of variation in experiments. Error variance results from random fluctuations in the
experiment. Experimental errors can be controlled either through experimental procedures or some
statistical procedure. If we are not able to effectively control the extraneous source of variation, then it
will form part of the error variance. By controlling the secondary source of variation, one can reduce the
experimental error.
Two main sources of error variance may be distinguished. The first is the inherent variability in the
experimental units to which treatments are applied. The second source of error variance is lack of uniformity
in the physical conduct of the experiment, or in other words, lack of a standardized experimental technique. This
refers to the errors of measurement.
Individuals vary a lot in respect of intelligence, aptitude, interests, anxiety, etc. All these person-related
variables tend to inflate the experimental error. The other source of error variance is associated
with errors of measurement and could be due to unreliable measuring instrument, fatigue on the part of
experimental units, transient emotional states of the subject, inattention by subjects at some point of
time, and so on.
Statistical controls can be applied to minimize such error variance. For example, repeated measures
design can be used to minimize the experimental error. By this technique the variability due to individual
differences is taken out from the total variability, and thus, the error variance is reduced. Analysis of
covariance is also a technique to reduce the error variance. Further, error variance can be controlled by
increasing the reliability of measurements by giving clear and unambiguous instructions, and by using
a reliable measuring instrument, etc.
It has been pointed out earlier that an important function of experimental design is to maximize the
systematic variance, control extraneous source of variance, and minimize error variance. The systematic
variance or variance due to experimental variable is tested against the error variance (F test is discussed
at length in chapter 2), therefore, the error variance should be minimized to give systematic variance a
chance to show significance. In the next chapter, we shall learn that for the variability due to the
experimental variable (between group variance) to be accurately evaluated for significant departure
from chance expectations, the denominator, that is, the error variance, should be an accurate measure of the
error.
VALIDITY
Validity is an important concept in measurement, may it be in a testing situation or in an experimental
situation. In an experimental situation, validity is related to the control of secondary variables. The more the
secondary variation that slips into an investigation, the greater is the possibility that the independent variable
was not wholly responsible for changes in the dependent variable. Secondary, or extraneous, variation may
influence the dependent variable to an extent where the conclusions drawn become invalid.
In experimental situations, the validity problem is divided into two parts—internal and external
validity. Internal validity is the basic minimum without which the outcome of any experiment is
uninterpretable. That is, it is concerned with making certain that the independent variable manipulated
in the experiment was responsible for the variation in the dependent variable. On the other hand, external
validity is concerned with generalizability. That is, to what populations, settings, treatment variables,
etc., can the effect (obtained in an experiment) be generalized. For detailed discussion on internal and
external validity, the reader may refer to Campbell and Stanley (1963).
In behavioural sciences, especially in education and social research, it is not always possible to exercise
full control over the experimental situation. For example, the experimenter may not have the liberty of
assigning subjects randomly to the treatment groups, or the experimenter may not be in a position to
apply the independent variable whenever or to whomever he wishes. Collectively, such experimental
situations form part of quasi-experimental designs.
In another research situation, the objective may be to study intensively a particular individual rather
than a group of individuals. In such a case the researcher may be interested in answering questions
about a certain person or about a person’s specific behaviour. For example, behaviour of a particular
individual may be observed to note changes over a period of time to study the effect of a behaviour
modification technique. All such designs in which observations or measurements are made on an individual
subject are categorized as single case experimental designs, in contrast to the designs in which groups
of subjects are observed and the experimenter has full control over the experimental situation (as in
experimental design).
The experimental situations in which experimenter can manipulate the independent variable/s and
has liberty to assign subjects randomly to the treatment groups and control the extraneous variables are
designated as true experiments. The designs belonging to this category are called experimental designs,
and in this book we are concerned with such designs only.
Understanding the nature of experimental design will be easier if we fully comprehend the nature
and meaning of quasi-experimental design and single case experimental design. Let us consider the
three types of designs—single case experimental design, quasi-experimental design, and experimental
design, in some detail.
Quasi-Experimental Design
All such experimental situations in which experimenter does not have full control over the assignment
of experimental units randomly to the treatment conditions, or the treatment cannot be manipulated, are
collectively called quasi-experimental designs. For example, in an ex-post-facto study the independent
variable has already occurred and hence, the experimenter studies the effect after the occurrence of the
variable. In another situation, three intact groups are available for the experiment but the experimenter
cannot assign the subjects to the treatment conditions; only treatments can be applied randomly to the
three intact groups. There are various such situations in which the experimenter does not have full
control over the situation. The plan of such experiments constitutes the quasi-experimental design.
Let us take an example from research to distinguish quasi-experimental design from experimental
design. First, we give an example of an experimental design. Suppose, an investigator is interested in
evaluating the efficacy of three methods of instruction (lecture, seminar and discussion) on the
achievement scores of the students of 10th grade. The experimenter draws a random sample of kn subjects
from a large population of 10th grade students. Then n subjects are assigned randomly to each of the k
(here k = 3) treatment conditions. Each of the n subjects in each of the k treatment groups is given
instruction with a method for one month. Thereafter, a common achievement test is administered to all
the subjects. The outcome of the experiment is evaluated statistically in accordance with the design of
the experiment (randomized group design or single factor experiment).
Let us consider an example of the quasi-experimental design. Suppose, for the aforesaid problem,
the experimenter cannot draw a random sample of 10th grade students as the schools will not permit the
experimenter to regroup the classes to provide instruction with the methods he is interested in. Ideal
conditions being unavailable, the experimenter finds three schools, following the same curriculum and
each providing instructions by one of the three methods. He administers an achievement test to the
subjects from the three schools and compares the outcome to evaluate the effect of each method of
instruction (ex-post-facto) on achievement scores.
It is observed from the example that the experimenter in the second condition did not have control
over the selection of subjects and also over the assignment of subjects to the treatments. Further, the
experimenter could not manipulate the independent variable (providing instructions with the three
methods) as the independent variable had already occurred. This experiment constitutes what we call a
quasi-experiment.
Notice that the objective of the experiment was the same in both the designs. However, random
assignment of subjects to the treatment groups was not possible in the quasi-experiment and it was,
therefore, a handicap in controlling secondary variables. These investigations can be as sound as experimental
investigations, but are less powerful in drawing causal relationships between independent and dependent
variables. The statistical tests applied to the data obtained from quasi-experimental designs are the same as
those applied to data in experimental designs. It is possible to perform even analysis of covariance on
data of such studies. However, the conclusions cannot be drawn with as much confidence as from the
studies employing experimental designs because some of the assumptions (e.g., randomization)
underlying the statistical tests are violated in the quasi-experiments. Besides this, the experimenter does
not have full control over the secondary variables.
Though quasi-experimental investigations have limitations, nevertheless they are advantageous in
certain respects. They make it possible to seek answers to several kinds of problems about past situations
and those situations which cannot be handled by employing experimental design.
Experimental Design
Included in this category are all those designs in which a large number of experimental units or subjects
are studied, the subjects are assigned randomly to the treatment groups, the independent variable is
manipulated by the experimenter, and the experimenter has complete control over the scheduling of
independent variable/s. Fisher’s statistical innovations had tremendous influence on the growth of this
subject; his special contribution was to the problem of induction or inference. After Fisher’s invention
of the technique of analysis of variance, it became possible to compare groups and study
simultaneously the influence of more than one variable.
There are three types of experimental designs—between subjects design, within subjects design
and mixed design. In the between subjects design, each subject is observed only under one of the
several treatment conditions. In the within subjects design or repeated measures design, each subject is
observed under all the treatment conditions involved in the experiment. Finally, in the mixed design,
some factors are between subjects and some within subjects.
As in any area of study, experimental designs also have some terminology which we shall be using in
the chapters that follow. It is essential to get acquainted with the terminology for clear understanding of
the designs and analysis given in the following chapters.
Factor
A factor is a variable that the experimenter defines and controls so that its effect can be evaluated in the
experiment. The term factor is used interchangeably with the terms treatment or experimental variable.
A factor is also referred to as an independent variable. A factor may be an experimental variable which
is manipulated by the experimenter. For example, the experimenter manipulates the intensity of
illuminance to study its effect on visual acuity. Here, illuminance is an experimental variable and is
referred to as treatment factor. Then, there are subject related variables which cannot be directly
manipulated by the experimenter but can be manipulated through selection. For example, if the
experimenter is interested in studying the effect of age on RT (response time), he may manipulate the
age by selecting subjects of different age levels. When the variable is manipulated through selection, it
is generally referred to as classification factor. Variables of this category allow the researcher to assess
the extent of differences between the subjects.
The independent variable or factor that is directly manipulated by the experimenter is also known
as E type of factor and one that is manipulated through selection is known as S type of factor. S type of
factors are generally included to classify the subjects for the purposes of control. At times, the experimenter
may be interested in evaluating the effect of S type of factors. For example, an experimenter may
classify subjects into low, medium and high economic groups to assess the extent of differences between
the subjects in the three groups. However, most of the time the classification factor is built into the
design, not because of intrinsic interest in the effects but because the results are likely to be difficult to
interpret if these factors are not included. These factors are defined by their function in the design and
may be either classification or treatment factors.
Factors are denoted by the capital letters A, B, C, D and so on. For example, in an experiment
having two factors, Factor A refers to the variable of intensity of light and Factor B to the variable of
size.
Levels
Each specific variation in a factor is called a level of that factor. For example, the factor light intensity
may consist of three levels: 10 mL, 20 mL and 30 mL. The experimenter may decide to choose the number
of levels of a factor. The number of potential levels of a factor is generally very large. The choice of
levels to be included, and the manner of selection of the levels from among the large number available to
the experimenter, is a major decision on the part of the experimenter. Some factors may have an
infinite number of potential levels (e.g., light intensity) and others may have few (e.g., sex of the subject).
In case the experimenter decides to select p levels from the potential P levels available on the basis of some
systematic, non-random procedure, the factor is considered a fixed factor. In contrast to this systematic
selection procedure, when the experimenter decides to include p levels from the potential P levels
through a random procedure, then the factor is considered a random factor. A detailed discussion on the
manner of selection of levels of factors and the statistical models involved has been presented in
chapter 3.
The potential levels of a factor are designated by the corresponding lower case (small) letters of the
factor symbol with a subscript. For example, the potential levels of factor A will be designated by the
symbols a1, a2, a3, ..., ap. Similarly, the potential levels of factor B will be designated by the symbols
b1, b2, b3, ..., bq.
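The distinction between fixed and random selection of levels can be sketched as follows. The potential levels here are hypothetical light intensities.

```python
import random

# Suppose the factor has P = 20 potential levels (light intensities in mL)
potential = list(range(5, 105, 5))   # 5, 10, ..., 100 mL

# Fixed factor: p levels chosen by a systematic, non-random procedure,
# e.g., the experimenter deliberately picks well-separated intensities.
fixed_levels = [10, 20, 30]

# Random factor: p levels drawn at random from the potential P levels.
random_levels = random.sample(potential, 3)
```

With a fixed factor, conclusions apply only to the levels actually used; with a random factor, they may be generalized to the population of P potential levels, a point taken up with the statistical models in chapter 3.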
Dimensions
The dimensions of a factorial experiment are indicated by the number of levels of each factor and the
number of factors. For example, a three factor experiment in which the first factor has p levels, the second
q levels and the third r levels, will be designated as a p x q x r factorial experiment. This is the general
form, and the dimensions in a specific case may assume any value for p, q, and r. A factorial experiment,
for example, in which there are three factors, the first having 2 levels, the second having 4 levels and the third
having 4 levels, is called a 2 x 4 x 4 (read as two by four by four) factorial experiment. The dimension of
this experiment is 2 x 4 x 4.
.
Treatment Combinations
A treatment is an independent variable in the experiment. In this text, the term treatment will be used to
refer to a particular set of experimental conditions. For example, in a 2 × 4 factorial experiment, the
subjects are assigned to 8 treatments. The terms treatment and treatment combination will be used
interchangeably. In a single factor experiment, the levels of the factor constitute the treatments. Suppose,
in an experiment the investigator is interested in studying the effect of levels of illumination on visual
acuity and the experimenter decides to have three levels of illuminance. Thus, there will be three treatments
and in a randomized group design, n subjects will be assigned randomly to each of the three treatments.
Let us take another example to present a case of treatment combination. In a 2 × 3 × 4
factorial experiment, there will be a total of 24 treatment combinations and each of the n subjects will
be assigned randomly to one of the 24 treatment combinations.
Replication
The term replication refers to an independent repetition of the experiment under as nearly identical
conditions as possible, the experimental units in the repetitions being independent samples from the
population being studied. It may be pointed out that an experiment with n observations per cell is to be
distinguished from an experiment with n replications with one observation per cell. The total number of
observations per treatment in the two experiments is the same, but the manner in which the two
experiments are conducted differs. For example, a 2 × 2 × 2 factorial experiment having 8 treatment
combinations with 5 observations per treatment is different from an experiment with 5 replications with
one observation per cell. The total number of observations per treatment is the same, that is, five. The
purpose of a replicated experiment is to maintain more uniform conditions within each cell of the
experiment to eliminate possible extraneous sources of variation between cells. The partitioning of total
variation and df (degrees of freedom) in the replicated and non-replicated experiments will differ (see
chapter 7). It is quite important that the number of observations per cell for any single replication
should be the maximum so as to ensure uniform conditions within all cells of the experiment.
Main Effects
The difference in performance from one level to another for a particular factor, averaged over other
factors, is called the main effect. In a factorial experiment, the mean squares (MS) for the levels of factors
are called the main effects of the factors. Let us consider an example of a 2 × 3 × 4 factorial experiment
in which factor A has two levels, factor B three and factor C four. The A sum of squares corresponds to
a comparison between levels a1 and a2, the B sum of squares to a comparison between levels b1, b2,
and b3, and the C sum of squares to a comparison between levels c1, c2, c3, and c4. The difference in
performance between levels a1 and a2, averaged over levels of factors B and C, is called the main effect
of A. Similarly, the difference in performance among levels b1, b2, and b3, averaged over levels of
factors A and C, is called the main effect of B, and so on.
The main effect, graphically, is the curve joining the points representing the mean performance on
the levels of a particular factor averaged over the other factors in the experiment. A significant main
effect will have a significant slope or, in other words, the curve will not be parallel to the X-axis.
Simple Effects
In a factorial experiment, the effect of a treatment on one factor at a given level of the other factor is
called the simple effect. Let us consider an example of a 2 × 2 factorial experiment in which factors A
and B have two levels each. The effect of treatment on the two levels of A under each of the two levels
of B is called the simple effect of A. Similarly, the effect of treatment on the two levels of B under each
of the two levels of A is called the simple effect of B.
Graphically, the simple effects are presented in the same manner as the two factor interaction. For
example, the simple effect of A, graphically, is the AB interaction profile where the levels of factor A are
marked on the X-axis, and the two curves represented by the levels b1 and b2 are the simple effects at
each of the two levels. Similarly, the simple effect of B is the AB interaction profile where the levels of
factor B are marked on the X-axis and the two curves, represented by a1 and a2, are the simple effects at each
level.
Interaction Effect
Factorial designs are important because they allow the investigator to study the effects of more than one
variable at a time. Apart from the advantage of efficiency of the factorial design over the single factor
experiment, it at the same time permits the investigator to evaluate the interaction among the independent
variables that are present. Interaction is an important concept in research. It can be evaluated in all
experiments having two or more independent variables.
Interaction between two variables is said to occur when change in the values of one variable alters
the effects on the other. However, it may be noted that the presence of interaction, from a statistician's
point of view, destroys the additivity of the main effects. That is, what is added by one factor at the first
level of the other is different from what is added at another level. Absence of interaction, on the other
hand, means that the additive property applies to the main effects, that is, they are independent.
Let us explain the concept of interaction with the help of an example. An experimenter is interested
in evaluating the effect of two study hours (i.e., 4 hrs. and 8 hrs.) on the achievement scores of the 10th
grade students. In order to control the influence of the secondary variable of intelligence, the experimenter
includes intelligence as a second independent variable in the experiment. Two groups of subjects are
included in the experiment, one of high intelligence and the other of low intelligence. The students are
assigned randomly to the two levels of the experimental variable (study hours). It is, thus, a 2² or 2 × 2
factorial experiment. The mean scores of the four groups are summarized in a 2 × 2 contingency table
below:

                                  Study hours
                               4 hrs       8 hrs
Intelligence level   High       7.1         9.8
                     Low        5.0         6.5

Difference for high intelligence group: 9.8 - 7.1 = 2.7
Difference for low intelligence group: 6.5 - 5.0 = 1.5
Interaction (Difference) = 1.2
Alternatively,
STATISTICAL ANALYSIS
We obtain a sample, conduct an experiment in accordance with the design of the experiment and finally
test the hypotheses, i.e., draw inferences beyond the data. Once the data have been obtained, the next
important problem for researcher is how to evaluate objectively the evidence provided by a set of
observations. In the tradition of experimental design presented in this book, the method of collecting
data, layout for the set of observations to be made, and statistical analysis, all are decided in advance of
the actual conduct of the experiment. Once a particular design is selected, all aspects of the experiment
from the initial to the final stage are taken care of.
The statistical tests are useful tools to draw conclusions from evidence provided by samples. It is
expected that the student or researcher using this book will have knowledge of elementary statistics.
However, for completeness of the volume, some of the statistical concepts occurring in the following
chapters are briefly recapitulated here.
Level of Significance
Level of significance is our own decision making procedure. In advance of the data collection, for the
requirement of objectivity, we specify the probability of rejecting the null hypothesis, which is called
the significance level of the test and is indicated by α. Conventionally, α = .05 and .01 have been chosen
as the levels of significance. We reject a null hypothesis whenever the outcome of the experiment has a
probability equal to or less than .05. The frequent use of .05 and .01 levels of significance is a matter of
convention having little scientific basis.
In contemporary statistical decision theory, this convention of adhering rigidly to an arbitrary .05
level has been rejected. It is not uncommon to report the probability value even when probability
associated with the outcome is greater than the conventional level .05. The reader can apply his own
judgement in making his decision on the basis of the reported probability level. In fact, the choice of
level of significance should be determined by the nature of the problem for which we seek an answer
and the consequences of the findings. For example, in medical research where the efficacy of a particular
medicine is being evaluated, .05 level may be considered a lenient standard. Perhaps, a stringent level
of significance, say .001 is more appropriate in this situation. However, if we select a very small value
of α, we decrease the probability of rejecting the null hypothesis when it is in fact false. The choice of
level of significance is related to the two types of errors in arriving at a decision about H0.
Region of Rejection
Region of rejection of H0 is defined with reference to the sampling distribution. The decision rules
specify that H0 be rejected if an observed statistic has any value in the region of rejection. The probability
associated with any value in the region of rejection is equal to or less than α.
The location of the region of rejection is affected by the nature of the experimental hypothesis (H1). If H1
predicts the direction of the difference, then a one-tailed test is applied. However, if the direction of the
difference is not indicated by H1, then the two-tailed test is applied. It may be noted that one-tailed and
two-tailed tests differ only in the location of the region of rejection; the size of the region is not affected.
The one-tailed region and two-tailed region are presented in Figs. 1.1 a and b respectively.
It can be seen in Fig. 1.1a that the region of rejection is entirely at one end or tail of the sampling
distribution, 5 per cent of the entire area being under the curve. In a two-tailed test (Fig. 1.1b), the
region of rejection is located at both ends of the sampling distribution, 2.5 per cent of the total area on each
side of the distribution.
Analysis of Variance: The Foundation of Experimental Design
of computations involved. For example, for three groups, 3* comparisons or combinations taken two at
a time are needed. Thus, as the number of groups increases, the number of comparisons to be
made increases rapidly, that is, the computation work increases disproportionately. Further, if a few
comparisons turn out to be significant, it will be difficult to interpret the results. Let us take up an
example to elucidate this point.

* For k groups, the number of comparisons will be k(k - 1)/2.
Suppose in an experiment the investigator is interested in studying the effect of 10 treatments.
Consequently, 45 possible t tests will have to be made for 10 treatment conditions. That is, first test
H0: μ1 = μ2, then second test H0: μ1 = μ3, and so on, till we perform all the 45 t tests for the difference
between every pair of means. Out of the 45 t tests, we expect to find an average of 2 or 3 t's (.05 × 45)
significant at the 5 p.c. level by chance alone. Suppose we find that 5 differences are significant at the .05
level. When the t test is being applied, there is no way to know whether these are true differences
or within chance expectation. The more statistical tests we perform, for example several t tests, the
more likely it is that some more differences will be statistically significant purely by chance. Thus, the
t test is not an adequate procedure to simultaneously evaluate three or more means. We would like the
probability of Type I error in the experiment to be .05 or less.
The analysis of variance or the F test, on the other hand, permits us to evaluate three or more means
at one time. In making comparisons in experiments involving more than two means, the equality breaks
down. Hence, the analysis of variance should always be preferred. The F is also an adequate test for
determining the significance of two means. For two groups (df = 1), √F = t or F = t². Therefore, in
case of two treatment conditions or two groups, it is a matter of choice which one of the two tests
(t or F) is used. Both yield exactly the same outcome. This means that the one-way analysis of variance
and the two-tailed t test can be used interchangeably in comparing the differences between two means.
However, it will be found that in the same situation, the F test is easier to perform than the t test.
NUMERICAL EXAMPLE
The concept of variance will be explained with the help of a numerical example. Suppose an investigator
is interested in evaluating two different methods of instruction on 5th grade children. Two independent
groups of 10 children each are randomly selected from a large number of children in a 5th grade class.
The distribution of their achievement scores before administering the treatment (methods of instruction)
is as given in Table 2.1. The scores are arranged in the ascending order.
Table 2.1 Distribution of Scores in the two Subgroups
(Before Treatment)

Subject         A       B
1               1       2
2               2       3
3               4       5
4               5       7
5               7       9
6               9      10
7              10      12
8              12      13
9              14      14
10             16      15
ΣX             80      90
Fig. 2.1 Distribution of the scores in the two subgroups before the treatment
(i) That the scores vary about their subgroup means. Further, the variability (sA² = 25.78 and
sB² = 21.33) of the spread of scores about their respective means is similar, within the limits of
chance variation.
(ii) That the subgroup means (X̄A = 8.0 and X̄B = 9.0) are similar, within the limits of chance
variation.
The above two observations in the two samples are in accordance with the expectations of random
sampling. That is, the scores in each subgroup vary about the respective means to a similar extent and
further, the subgroup means are also similar but not identical, as the two samples were selected randomly
from the same population.
The investigator, then, administers the treatment to the two subgroups; treatments being assigned
randomly. That is, the two subgroups are given two different methods of instruction. After a period of
training, an achievement test is given to both the subgroups. The distribution of scores of the achievement
test, after the application of treatment is presented in Table 2.2. The scores are arranged in the ascending
order.
Table 2.2 Distribution of Scores in the two Subgroups
(After Treatment)

Subject         A       B
1               3       8
2               4      10
3               6      12
4               7      14
5               8      15
6              11      17
7              12      19
8              13      20
9              16      22
10             20      23
[Figure: score distributions of the two subgroups after the treatment, plotted on a 0-24 score scale:
group B with X̄B = 16.0, sB² = 25.77, and group A with X̄A = 10.0, sA² = 29.33.]
Fig. 2.2 Distribution of scores in the two subgroups after the treatment.
The variability of subgroup means is of special importance in the analysis of variance, as it reflects
the variation attributable to the treatment effect as well as other uncontrolled sources of variation. Let
us again refer to Fig. 2.1. We find the two means are similar, within the limits of chance variation.
However, in Fig. 2.2, we can observe the effect of treatment on the sub-group means; these have drifted
apart. This shows that the treatment has caused variation in the subgroup means. This is called between
group variation.
We have just seen that the treatment caused the subgroup means to drift apart. We have also observed
(Figs. 2.1 and 2.2) that the scores within each subgroup vary about their respective means (observe the
scattering of scores of the two groups around the arrow point, marking the subgroup means). This
variability is also of particular importance in the analysis of variance. The pooled variability of scores
about their respective subgroup means is called within group variation or “error”. It is free from the
influence of differential treatment.
Thus, we have been able to identify two sources of variation in the scores—one which reflects the
effect of treatment is called “between groups” variation and the one that reflects the variability within
the subgroups is called “within groups” or “error” variation. An increase in the difference among the
means results in an increase in the variance of means, and it is this variance that we evaluate relative to
the error variance. The procedure adopted for this is called the analysis of variance. If the variability
between the groups is considerably greater than the error variability, this is indicative of the treatment
effect.
Perhaps, the most general way of classifying variability is as systematic variation and unsystematic
variation. Systematic variation causes the scores to lean more in one direction than another. We observed
in Fig. 2.2 that the application of treatment resulted in systematic variation in the means of the two
subgroups. The variable manipulated by the experimenter is associated with systematic variation.
Unsystematic variation, on the other hand, is the fluctuation in the scores due to the operation of
chance and other uncontrolled sources of variation in the experiment. Random assignment of subjects
in different groups helps in reducing the unsystematic variation or error.
The most important function of experimental design is to maximize the systematic variation, control
the extraneous source of variation, and minimize the unsystematic or error variation. We will see in the
later chapters that this objective is achieved in different ways in different designs.
Table 2.3 Computation of the Sum of Squares and the Mean Square
(Mean Deviation Method)

Subject        X      x = (X - X̄)      x²
1              2          -3            9
2              4          -1            1
3              5           0            0
4              6           1            1
5              8           3            9
           ΣX = 25      Σx = 0      Σx² = 20
           X̄ = 5

Sum of squares or SS = Σx² = 20                          ...(2.1)

Mean square or MS = Σx²/(N - 1) = 20/4 = 5               ...(2.2)
Comments
Step 1. The sum of the raw scores (ΣX) in column (i) is equal to 25. The mean of the scores (X̄) is
equal to 5 (ΣX/N).
Step 2. In column (ii), x is the deviation (X - X̄) of each score from the group mean. Note that the
sum of the deviations (Σx) from the exact mean is always zero.
Step 3. In column (iii), the deviations from the mean are squared. The sum of the squared deviations
around the mean is called the sum of squares (Σx²) or SS in a shortened form. The sum of squares (SS) is
equal to 20.
Step 4. Mean square (MS) is obtained by dividing the sum of squares by the degrees of freedom (df),
which in this case is 4 (N - 1). Thus, the mean square is equal to 5 [Σx²/(N - 1) = 20/4]. A variance, in the
terminology of the analysis of variance, is called a mean square or MS. The square root of the variance is
the standard deviation.
In the foregoing example, we have encountered a number of concepts that will help in understanding
the analysis of variance as well as the designs that follow. The two most important concepts are sum of
squares (SS) and mean square (MS).
The mean deviation method employed in computing the sum of squares and the mean square is
time consuming and requires more effort than the method we are just going to describe. It is called the
raw score method or the direct method. Let us work out the sum of squares and mean square from the data
presented in Table 2.3 by the direct method. It does away with the need to compute x (X - X̄) and x².
Table 2.4 Computation of the Sum of Squares and the Mean Square
(Direct Method)

Subject        X        X²
1              2         4
2              4        16
3              5        25
4              6        36
5              8        64
           ΣX = 25   ΣX² = 145

Sum of squares or SS = ΣX² - (ΣX)²/N                               ...(2.3)

Mean square or MS = Σx²/df = [ΣX² - (ΣX)²/N]/df = SS/df = 20/4 = 5 ...(2.4)
Comments
Step 1. In column (i), the sum of the scores (ΣX) has been worked out and is equal to 25.
Step 2. In column (ii), the scores have been squared and summed up. The sum of the squares of the
scores (ΣX²) is equal to 145.
The sum of squares (Σx²), by the mean deviation method, was derived by adding up the squares of
the deviations of the scores from the group mean. However, in the above computation (direct method),
we have squared the raw scores. Therefore, in order to derive the sum of squares, we have to apply a
correction (C). From the raw scores, the sum of squares can be derived directly by applying formula 2.3.
A correction term [C = (ΣX)²/N = (25)²/5 = 125] is subtracted from the sum of column (ii), i.e., ΣX².
Thus, subtracting 125 (C) from 145 (ΣX²), we obtain the sum of squares that is equal to 20, the value
obtained by the mean deviation method also. The value of the mean square derived by both methods
is the same, which is obtained in the same manner, that is, by dividing the sum of squares by the
df (SS/df), as given in formula 2.4.
It is important to understand the working of the direct method, as in this book we shall always be
following the direct or the raw score method. This method is preferred over the mean deviation method
for its elegance and ease. This method comes handy if a calculator is available to the investigator.
NUMERICAL EXAMPLE
An investigator is interested in exploring the most effective method of instruction in the classroom. He
decides to try three methods: Lecture (1); Seminar (2); and Discussion (3). He randomly selects 5
subjects for each of the three groups from a class of 10th grade students. After three months of instruction,
an achievement test is administered to the three groups. The distribution of achievement scores in the
three groups is as given in Table 2.5.
Table 2.5 Distribution of Achievement Scores in the three Groups

Subject               Method
Number     Lecture    Seminar    Discussion
              (1)        (2)         (3)
1              8         11           5
2             10         13           5
3             11         13           8
4             11         15           9
5             12         16          10
ΣX            52         68          37        G = 157

Here n = 5; k = 3; N = kn = 5 × 3 = 15
Partitioning of Total Variation and df
In the simple analysis of variance, the total variation and df will have the following partitioning:

Computation
(i) Correction Term (C) = G²/N = (157)²/15 = 1643.27

Table 2.6 Summary of the Analysis of Variance

Source of Variation       SS        df       MS         F
Between Groups           96.13       2      48.07     12.65*
Within Groups (Error)    45.60      12       3.80
Total                   141.73      14

*F.01 (2, 12) = 6.93;  F = 48.07/3.80 = 12.65
Comments
The computation of the analysis of variance is quite mechanical. Therefore, before starting the actual
analysis work, one should try to comprehend the schematic representation of the analysis, and follow
the computations step by step.
Computation
Step 1: Correction Term: As explained earlier, for computing the sum of squares by the direct method,
a correction is needed. The correction term (C) was, however, the same for deriving all the sum of
squares in the numerical example, with the exception of within groups sum of squares, explained under
the comments in Step 4.
The correction term is obtained by squaring the grand total (G = 52 + 68 + 37 = 157) and then
dividing it by the total number of subjects or observations in the experiment (N = kn = 3 × 5 = 15). The
correction term (G²/N) was found to be equal to 1643.27.
Step 2: Total SS: The total sum of squares is a measure of the total variation of the individual scores
about the combined mean. It reflects all the sources of variation, that is, between groups variation and
within groups variation in the present case.
The total sum of squares is obtained by combining the scores of the three groups and treating them
as one set of scores. In Step 2, each of the 15 raw scores is first squared, then the squares are summed,
and thereafter, correction term is subtracted from the obtained sum. The total sum of squares in the
present example is equal to 141.73.
In Step 2, all scores have not been displayed; the omission of certain terms of the sequence has
been indicated by dots. For example, 8² + 10² + ... + 9² + 10² indicates that the individual scores from
the first to the last of the distribution have been squared and added.
Step 3: Between Groups SS: The sum of squares between groups is a measure of the variation of the
group means about the combined mean. If the group means do not differ among themselves at all, the
sum of squares between groups will be zero. Thus, greater the variation in the group means, the larger
is the sum of squares between groups.
In Step 3, between groups sum of squares has been obtained by the direct method. The totals of
each of the three subgroups (i.e., 52, 68, and 37) have been squared and divided by the number of
observations in each subgroup and summed [Σ(ΣX)²/n]. Finally, the correction term (C) has been
subtracted from the sum of squares. The between groups sum of squares is found to be equal to 96.13.
Step 4: Within Groups SS: The within group sum of squares is the pooled sum of squares based on the
variation within each group about its own mean. The within groups sum of squares is also called error
sum of squares. All the uncontrolled sources of variation are pooled in the within groups sum of squares.
In Step 4, the sum of squares within groups has been obtained by subtraction, taking advantage of
the addition theorem characterizing this analysis. From equation 2.6, it is observed that

SS total = SS between groups + SS within groups
We have just learnt that the within groups sum of squares is the pooled sum of squares based on the
variation of the individual observations about the mean of the particular subgroup. Therefore, the sum
of squares within groups is equal to

SS within subgroup 1 = (8² + 10² + ... + 12²) - (52)²/5
                     = 550.0 - 540.8 = 9.2

Note: The lower case letter n represents the number of observations or subjects in the subgroup (n = 5),
the upper case letter N represents the total number of observations or subjects in the subgroups (N = 15),
and k represents the number of groups or treatments (k = 3).

SS within subgroup 2 = (11² + 13² + ... + 16²) - (68)²/5
                     = 940.0 - 924.8 = 15.2

SS within subgroup 3 = (5² + 5² + ... + 10²) - (37)²/5
                     = 295.0 - 273.8 = 21.2

SS within = 9.2 + 15.2 + 21.2 = 45.6
If no mistake is committed, the outcome by the direct method is exactly the same as that obtained by the
subtraction method.

Note: (a) The correction factor for each subgroup is different, that is, the square of the respective subgroup
total divided by n, the number of observations in each subgroup.
(b) SS within groups is the sum of the individual sums of squares within each of the three subgroups.
(c) The df associated with the sum of squares within groups is also the pooled df of the subgroups:
4 + 4 + 4 = 12.
Step 5: Analysis of Variance Summary Table: Preparing the analysis of variance summary table is the
final step in the analysis. Note carefully the format of Table 2.6, which is of the standard form. The first
column is for sources of variation, then SS, followed by df, MS, and finally F.
The reader will recall that the mean square (MS) or variance estimate is obtained by dividing the SS
by the appropriate df. Dividing the SS by its df gives an estimate of the common population variance.
Thus, dividing the between groups sum of squares by its df, i.e., 96.13 by 2, gives the MS which, in this
example, is equal to 48.07. This value is the estimate of the common population variance that is
independent of the variation within groups. Similarly, dividing the within groups SS by its df, i.e., 45.6
by 12, gives the MS, which is found to be 3.8. Again, this value is the estimate of the common population
variance which is independent of the variation in the group means.
Then, the ratio (F) of the MS between groups and the MS within groups is obtained by dividing
48.07 by 3.80. Here the obtained value of F is equal to 12.65. It is entered in the first row under
column F.
Test of Significance
The next step is to evaluate the obtained F value. We consult the F table in the Appendix, Table B,
for 2 and 12 degrees of freedom. First we move along the top row, where degrees of freedom for the greater
mean squares are given, and pause at 2. Then, we proceed downwards in column 2 until we find the row
entry corresponding to df 12. The values of F significant at the 5 p.c. point are given in light face type,
and those significant at 1 p.c. in bold (dark) face type. The critical value of F corresponding to 2 and 12
df at α = .01 is 6.93. Since our obtained value of F, 12.65, far exceeds the critical or tabled value, 6.93,
we reject the null hypothesis (H0). The overall F indicates that the means of the three groups do not fall
on a straight line with zero slope. Hence, the null hypothesis that the three groups are random samples
from a common normal population is rejected. On the basis of the results of the experiment, we can
conclude that the three methods of instruction produced significant differences in the three groups. As F
is an overall index, further tests on means have to be carried to compare the pairs of means. This aspect
will be discussed in Chapter 4.
STRENGTH OF ASSOCIATION
The significant F indicates that the observed differences between the treatment means are not likely to
arise by chance. However, it does not indicate anything about the strength of the treatment effect. The
statistic omega square (ω²) is a measure of the strength of the treatment effect. It gives us the proportion
the total variability in a set of scores that can be accounted for by the treatments. That is, what portion
of the variance in the scores can be accounted for by the differences in the treatment groups. The
formula for strength of association is

ω² = [SS between - (k - 1) MS within] / (SS total + MS within)
Let us now compute the strength of treatment effects in our numerical example. The values of
SS between, SS total, and MS within have been obtained from Table 2.6. The steps in computing ω² are given
below:
SS between = 96.13
SS total = 141.73
MS within = 3.8
k = 3 (treatments)

ω² = [96.13 - (3 - 1)(3.8)] / (141.73 + 3.8) = 88.53/145.53 = .61
Thus, approximately 61 per cent of the variance in the dependent variable is accounted for by the
difference in the method of instruction. In other words, there is fairly strong relationship between methods
of instruction and achievement scores of the subjects.
GENERAL COMMENTS
One may wonder why we keep the between groups variance in the position of the numerator and the
within groups variance in the denominator. The logic is simple. If the group means are significantly
different, then the mean square between groups should be larger than the mean square within groups
(error). It is rare that small values of F (F < 1) indicate anything but sampling variation. It is only large
values of F that suggest treatment effects. Therefore, we refer to the F table only when the ratio is
greater than one. If the mean square between groups is smaller than the mean square within groups, then
the F value will be less than one. In the analysis of variance summary table, we simply ignore the value
of the obtained F and there is no need to refer to the F tables, as the data offers no evidence against the null
hypothesis.
The significant F indicates that the three methods of instruction did produce differences in the
achievement scores of the groups. However, F does not indicate which of the three differences among
the pairs of means are significant. To find this, post hoc comparisons between the subgroup means are
done. There are a variety of methods for comparing the individual means. Some of these will be discussed
in chapter 4.
SUMMARY OF STEPS
We have just completed the computation of one-way or simple analysis of variance with detailed
comments on the various steps involved. Let us summarize the steps:
1. Observe carefully the partitioning of the total variation (SS) and df.
2. Calculate the correction term (C).
3. Calculate the total sum of squares (SS total).
The one-way analysis of variance was applied to the data of the previous example, in which the researcher investigated the effect of three methods of instruction on the achievement scores of the subjects. The subjects in each of the three groups, who were treated with one of the three methods, were selected independently and at random. In that investigation, only one variable (method) was studied, which had three levels—lecture, seminar and discussion. The two-way analysis of variance permits the simultaneous study of two factors or variables. While the one-way analysis of variance and the t test do not permit the evaluation of interaction between two or more variables, the two-way analysis of variance permits such evaluation.
In the single variable example, the investigator considered only the effect of three methods of
instruction on the achievement scores. However, it is expected that the method of instruction will have a different effect depending upon the level of intelligence of the subjects (interaction). It may be hypothesized that the children of superior intelligence will gain more from the discussion and the seminar methods, whereas the children of inferior intelligence will gain more from the lecture method. On the basis of this hypothesis, the investigator designs a study in which the effects of two variables are studied simultaneously, that is, the effect of level of intelligence and method of instruction on the achievement
scores of the children. There is greater generality to the outcome of this investigation than that of the
first in which the effect of only one variable was studied. Further, it has an added advantage in that the
interaction effect can also be studied.
The first variable (Factor A) has two levels, that is, superior intelligence and inferior intelligence, represented by a₁ and a₂, respectively. The second variable (Factor B) has three levels, that is, the lecture, seminar and discussion methods, represented by b₁, b₂, and b₃, respectively. The levels of the factors are fixed¹ and do not represent a random sampling from a larger population of levels. It means that the levels of the factors were chosen arbitrarily by the experimenter. It may be noted that the levels of factor A are manipulated through selection and those of factor B are directly manipulated by the experimenter. The total number of treatments in the experiment (k = 2 × 3 = 6) is presented in Table 2.7.
Table 2.7 The Six Treatment Conditions in a Two-way Analysis of Variance

                                         Method
Intelligence                 b₁ (Lecture)   b₂ (Seminar)   b₃ (Discussion)
a₁ (Superior intelligence)      ab₁₁           ab₁₂            ab₁₃
a₂ (Inferior intelligence)      ab₂₁           ab₂₂            ab₂₃
In Table 2.7, we observe that the experiment will have 6 treatment conditions and the investigator
has decided to consider all the treatments and have n = 5 observations for each treatment condition. In
Table 2.7, the first subscript refers to the level of factor A and the second subscript refers to the level of factor B. For example, treatment ab₁₁ represents a treatment condition in which a subject of superior intelligence (a₁) is given instructions by the lecture method (b₁). Similarly, treatment ab₂₃ represents a treatment condition in which a subject of inferior intelligence is given instructions by the discussion method.
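The crossing of factor levels into the k = p × q treatment cells can be generated mechanically; a small Python sketch (the level labels and variable names are ours, purely illustrative):

```python
from itertools import product

# Factor A (intelligence) has p = 2 levels; factor B (method) has q = 3 levels.
a_levels = {"a1": "superior intelligence", "a2": "inferior intelligence"}
b_levels = {"b1": "lecture", "b2": "seminar", "b3": "discussion"}

# k = p * q treatment conditions ab_ij: every level of A crossed with every level of B
treatments = [f"ab{a[-1]}{b[-1]}" for a, b in product(a_levels, b_levels)]
print(len(treatments), treatments)
# 6 ['ab11', 'ab12', 'ab13', 'ab21', 'ab22', 'ab23']
```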
NUMERICAL EXAMPLE
A total of 30 subjects were selected, 15 of superior intelligence and 15 of inferior intelligence. Five
subjects from each of the two groups were randomly assigned to each of the three methods of instruction.
That is, 5 subjects were randomly assigned to each of the six treatments. After three months of instruction,
an achievement test was administered to all the subjects. The outcome of the hypothetical experiment is
given in Table 2.8.
¹ See chapter 3 for discussion of fixed effect and random effect models.
[Table 2.8: achievement scores of the 30 subjects under the six treatment conditions.]
Table 2.9 AB Interaction

           b₁      b₂      b₃     Total
a₁         51      71      74      196
a₂         67      49      53      169
Total     118     120     127    G = 365
Partitioning of Total Variation and df

The two-way analysis of variance will have the partitioning of the total sum of squares and df as follows:

Total: kn − 1 = 29
    Between groups: k − 1 = 5
        A: p − 1 = 1
        B: q − 1 = 2
        A × B: (p − 1)(q − 1) = 2
    Within groups: k(n − 1) = 24
The scheme shows the partitioning of the sum of squares and of the total df in the general form. The numerals are the dfs associated with the present numerical example.
In the equation form, the partitioning of the total sum of squares may be represented as

SS_total = SS_A + SS_B + SS_AB + SS_within          (2.8)
where SS_total = total sum of squares generated from the deviation of each score from the mean of the total scores.
SS_A = sum of squares of factor A generated from the deviation of the a₁ and a₂ means from the mean of the total scores.
SS_B = sum of squares of factor B generated from the deviation of the b₁, b₂, and b₃ means from the mean of the total scores.
SS_AB = sum of squares for interaction generated from the deviation of each subgroup mean from the value predicted for that subgroup on the assumption of no interaction.
SS_within = pooled sum of squares within the six treatment groups generated from the deviation of the individual scores from the means of the respective subgroups, representing error.
Computation

(i) Correction Term (C) = G²/N = (365)²/30 = 4440.83

(ii) Total SS = (7² + 9² + … + 11² + 12²) − C = 4661.0 − 4440.83 = 220.17

(v) A SS = (196² + 169²)/15 − C = 4465.13 − 4440.83 = 24.30

AB SS = (51²/5 + 67²/5 + … + 53²/5 − C) − [A SS + B SS] = 122.57 − (24.30 + 4.47) = 93.80
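The same computation can be sketched in Python, working from the cell totals of Table 2.9; the variable names are ours:

```python
# Cell totals from Table 2.9; n = 5 observations per cell.
cells = {("a1", "b1"): 51, ("a1", "b2"): 71, ("a1", "b3"): 74,
         ("a2", "b1"): 67, ("a2", "b2"): 49, ("a2", "b3"): 53}
n = 5
G = sum(cells.values())        # grand total, 365
N = n * len(cells)             # 30 observations in all
C = G ** 2 / N                 # correction term, 4440.83

a_totals, b_totals = {}, {}
for (a, b), total in cells.items():
    a_totals[a] = a_totals.get(a, 0) + total
    b_totals[b] = b_totals.get(b, 0) + total

ss_a = sum(t ** 2 for t in a_totals.values()) / (n * 3) - C  # each A total spans 3 cells of 5 scores
ss_b = sum(t ** 2 for t in b_totals.values()) / (n * 2) - C  # each B total spans 2 cells of 5 scores
ss_cells = sum(t ** 2 for t in cells.values()) / n - C       # cell sum of squares
ss_ab = ss_cells - ss_a - ss_b                               # interaction by subtraction
print(round(ss_a, 2), round(ss_b, 2), round(ss_ab, 2))  # 24.3 4.47 93.8
```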
Table 2.10 Summary of the Two-way Analysis of Variance

Source of Variation        SS       df      MS        F
A (Intelligence)          24.30      1     24.30     5.97
B (Method)                 4.47      2      2.23      <1
A × B (Interaction)       93.80      2     46.90    11.52
Within groups (error)     97.60     24      4.07
Total                    220.17     29

F for Intelligence = 24.30/4.07 = 5.97
F for Interaction = 46.90/4.07 = 11.52
Comments
Computation
Step 1: Correction Term: The correction term (C) is found by squaring the grand total (G) and then dividing it by the total number of observations or cases (N = kn = 6 × 5 = 30) in the experiment. Thus, the correction term (G²/N) is found to be equal to 4440.83. The same correction term is used throughout the analysis, except for computing the within groups sum of squares, where a separate correction term is computed for each treatment group by the same formula.
SS within ab₁₁ treatment group = (7² + 9² + … + 12²) − 51²/5

SS within ab₂₁ treatment group = (11² + 12² + … + 16²) − 67²/5 = 915.0 − 897.8 = 17.2

SS within ab₂₃ treatment group = (11² + 13² + … + 17²) − 53²/5
In order to obtain the AB interaction sum of squares, refer to Table 2.9. We first square the cell entries
and divide each by 5, the number of observations contributing to the sums entered in the cells of this
table. Then take the sum and subtract the correction term (4440.83). Finally, the interaction sum of
squares is obtained by subtracting the sum of squares for the main effects of A and B from the cells sum
of squares.
The six cell entries in Table 2.9, i.e., 51, 67, 71, 49, 74, and 53, represent the sums of the six
treatment groups (see Table 2.8). In each of the six treatment groups, 5 observations have been taken.
After summing, the correction term is subtracted from the sum. Here, the cell sum of squares is found to
be equal to 122.57. Now, subtract the A sum of squares (24.3) and the B sum of squares (4.47). Thus, AB
sum of squares is found to be equal to 93.8. The outcome of both the methods is the same.
We take the levels of one factor on the X-axis. Let us take the levels of factor B on the X-axis. It is purely for convenience that we have taken factor B on the X-axis; otherwise it is all right to take either factor on the X-axis. Now, plot the means for each level of factor A. That is, first plot all the means in row a₁ (i.e., ab₁₁ = 10.2, ab₁₂ = 14.2, and ab₁₃ = 14.8) corresponding to the levels of factor B, join the points and label the resulting curve a₁. It represents the achievement scores of the subjects of superior intelligence. Similarly, plot all the means in row a₂ (i.e., ab₂₁ = 13.4, ab₂₂ = 9.8, and ab₂₃ = 10.6), join the points and label the resulting curve a₂. It represents the achievement scores of the subjects of inferior intelligence.
[Figure: interaction profile, with the mean achievement scores plotted against the levels of factor B (b₁, b₂, b₃), one curve for each level of factor A.]
Test of Significance

The obtained values of F in Table 2.10 are to be evaluated. The F ratio in respect of factor A has been found to be 5.97. We consult the F table, given in the Appendix, Table B, for 1 and 24 degrees of freedom and observe that the critical value is 7.82 at α = .01 and 4.26 at α = .05. The observed value of 5.97 exceeds the critical value at α = .05 and is less than the critical value at α = .01. Thus, the obtained value of F = 5.97 is significant at the .05 level. Therefore, we reject the null hypothesis (H₀) that the two groups, selected on the basis of intelligence, are random samples from the same normally distributed population.

Further, we observe that the F ratio in respect of factor B is < 1; hence it is not significant. Therefore, we retain the null hypothesis (H₀) and cannot conclude that the three methods of instruction differentially affect the achievement scores.

The AB interaction F, based on 2 and 24 df, is found to be 11.52. The critical value is 5.61 at α = .01. The observed value of F far exceeds the critical value. Thus, the F associated with the interaction of factors A and B is significant beyond the .01 level. It indicates that the effectiveness of a particular method of instruction depends upon the level of intelligence. While the children of superior intelligence do better with the seminar and discussion methods of instruction, the children of inferior intelligence gain more from the lecture method. This can be verified from the interaction profile (Fig. 9.5).
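If a printed F table is not at hand, the critical values quoted above can be reproduced with scipy's F distribution (assuming scipy is installed; `f.ppf` gives the percent point, i.e. the tabled critical value):

```python
from scipy.stats import f

# Critical value of F at level alpha is the 100*(1 - alpha) percent point of F(df1, df2).
print(round(f.ppf(0.95, 1, 24), 2))  # 4.26 at alpha = .05 for 1 and 24 df
print(round(f.ppf(0.99, 1, 24), 2))  # 7.82 at alpha = .01 for 1 and 24 df
print(round(f.ppf(0.99, 2, 24), 2))  # 5.61 at alpha = .01 for 2 and 24 df
```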
NORMALITY
The assumption of normality states that the distribution of scores within each treatment population is
normal. This assumption is satisfied when the scores within the treatment groups are from a normally distributed population.
In an empirical study, Norton (1952) found that the F distribution is practically unaffected by lack of symmetry in the distribution of the criterion measures; however, it is slightly affected if the distribution of the criterion measures is roughly symmetrical but either leptokurtic or platykurtic. In such situations, it is desirable that the scores are appropriately transformed and the analysis of variance is carried out on the transformed scores (see next section on transformations). Symmetric, non-normal distributions cause slight inflation of the Type I error probability.
In general, the F distribution is insensitive to the form of the distribution of criterion measures and therefore, there is no need to apply any statistical test to detect non-normality. One can detect extreme departures by mere inspection. In case of extreme departures in the form of the distribution, an appropriate transformation should be carried out.
HOMOGENEITY OF VARIANCE
One of the basic assumptions underlying the F test is that the variances of scores in each of the k treatment groups are homogeneous (i.e., σ₁² = σ₂² = … = σₖ²). That is, the variances of the individual groups are equal. We know that the within groups variance is the pooled sum of the variations within each of the many groups. This pooling can only be done if there is homogeneity of variance within the treatment groups.
This assumption can be tested by means of Bartlett's (1937) test for homogeneity of variance. The experimental evidence, however, indicates that moderate departures from this assumption, even variances two or three times as great as those of another group, do not seriously affect the appropriateness of the analysis of variance. Mathematical derivations by Box (1954) indicate that the α level is inflated by heterogeneity of variance. However, if all treatment populations are approximately normally distributed and all groups have equal ns, the inflation is slight. The F test is considered robust with respect to departures from homogeneity of variance.
A simple check on the equality of sample variances can be made for the purpose. Let us take the numerical example presented in Table 2.5, in which three treatment groups have been taken. The sum of squares of each group has been calculated under the comments section (step 4). The variance estimate (MS) can be obtained by dividing the sum of squares by the df, that is, by 4 (n − 1). The variances of the three groups in the example are 2.3 (9.2/4), 3.8 (15.2/4), and 5.3 (21.2/4). Now, let us take the smallest and the largest variance and divide the largest variance by the smallest. The observed F is equal to 2.3 (5.3/2.3) and the associated degrees of freedom are 4 and 4. From the F table in the Appendix, Table B, we observe that the critical value for 4 and 4 df is 6.39 at α = .05. Our observed value of F = 2.3 is not significant. This shows that the two extreme variances do not differ significantly and hence, the experimental groups are homogeneous.
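The rough check described above, dividing the largest group variance by the smallest, can be sketched as:

```python
# Quick homogeneity check for the Table 2.5 example: largest variance over smallest,
# compared with the tabled F for (df, df) degrees of freedom.
ss = [9.2, 15.2, 21.2]            # within-group sums of squares
df = 4                            # n - 1 = 4 in each group
variances = [s / df for s in ss]  # 2.3, 3.8, 5.3
f_obs = max(variances) / min(variances)
print(round(f_obs, 1))            # 2.3, well below the critical value 6.39 for 4 and 4 df
```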
Several methods are available for detecting heterogeneity of variance, and some of these are overly sensitive to departures from normality (Bartlett, 1937; Cochran, 1947; Hartley, 1950). These tests may be used when a sensitive test is required to detect heterogeneity of variance. The use of Hartley's and Bartlett's tests will be explained in chapter 7.
If the variances for two treatment groups differ significantly (heterogeneity of variance), it may reflect in the form of non-additivity of treatment effects. It is assumed that the factors which account for the deviations in an individual's scores are additive. By "additive" we mean that if X₁ is the score of a given observation under the control condition, then under the experimental condition the score would be X₂ = X₁ + a, where a is a constant treatment effect due to the experimental treatment. If the effect of the treatment is additive, then s₁² = s₂² (as the variances of the two conditions should be equal), because addition or subtraction of a constant does not affect the variance; it affects only the mean. In other words, the experimental treatment should not affect the within groups variation; it should change only the mean, which ultimately will affect the between groups variation.

If the treatment, instead of acting in an additive manner, acts in a multiplicative manner, then X₂ = X₁a, and consequently s₂² = s₁²a², because multiplying each value of a variable by a constant results in multiplying the original variance by the square of the constant. Thus, if the treatment effect is multiplicative, then the variance of the treatment group may differ significantly from the variance of the control group, thus resulting in heterogeneity of variance. Further, if one treatment effect is additive and the other is multiplicative, this condition can also result in generating differences in the variances of the treatment groups.
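The effect of additive versus multiplicative treatment effects on the variance can be verified numerically; the control scores below are invented purely for illustration:

```python
from statistics import pvariance

x1 = [7, 9, 10, 12, 13]                 # hypothetical control-condition scores
additive = [x + 5 for x in x1]          # X2 = X1 + a, with a = 5
multiplicative = [x * 3 for x in x1]    # X2 = X1 * a, with a = 3

# Adding a constant leaves the variance unchanged...
print(pvariance(additive) == pvariance(x1))                  # True
# ...while multiplying by a constant multiplies the variance by a**2 = 9.
print(round(pvariance(multiplicative) / pvariance(x1), 6))   # 9.0
```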
TRANSFORMATIONS
In research investigations, it is not unusual to encounter situations in which one or more assumptions
underlying the F test are violated. One way to deal with such situations is to use some suitable non-
parametric test. Another way is to change the scale of measurement by suitable transformation.
Transformation is, thus, a change in the scale of measurement. For example, rather than time in seconds,
the scale of measurement may be reciprocal of time in seconds as criterion score.
There are different reasons for making transformations in the scale of measurement. Ordinarily, the three assumptions, that is, additivity, normality and homogeneity, are violated together (Snedecor, 1956, p. 315). Ideally, the transformation should be able to remedy all the problems, but in practice this is not often possible. Additivity is the most essential requirement, and next is the homogeneity of variance.
Empirical studies have shown that even when the distribution departs appreciably from normality,
it has very little effect on the results. Box (1953) has shown that distribution of F ratio in the analysis of
variance is affected relatively little by inequalities in the variances which are pooled into the experimental
error. Further, Box has shown that the sampling distribution of F-ratio is relatively insensitive to moderate
departures from normality. For skewed or very leptokurtic or platykurtic distributions, scores should be
appropriately transformed.
Let us consider some transformations that are appropriate for different conditions. These are called monotonic transformations, as the transformations leave the ordinal relationship unchanged.
Arcsin Transformation
When the observations have a binomial distribution, that is, when the observations are proportions or percentages, the transformation is made to the angle whose sine is the square root of the proportion or percentage. That is, sin⁻¹ √p (read sine inverse) or arcsin √p, where p is the proportion or percentage of correct responses in a fixed number of trials. Values of the transformation have been tabulated by Bliss (1937). These tables have been reproduced by Snedecor (1956) and Guilford (1954). This transformation stabilizes the variance. The table weighs more heavily the small percentages or proportions, which have small variance. The arcsin transformation table is given in the Appendix, Table I.
Linear Transformations
Sometimes we obtain data that can be handled better by computing mean or variance from arbitrary
values or observations, derived by subtracting a convenient number from each value, that is, by shifting
the origin. As an example, consider the RT of 10 subjects: 213, 211, 210, 209, 208, 207, 205, 204, 203
and 201 msecs. It would be easier to handle the data if we subtract 200 msecs from each response time
before computing the mean and the variance. The original and transformed observations, labelled X and X′ respectively, are presented in Table 2.12.
Table 2.12 The Original and Transformed Observations

Subject       X        X²      X′ (X − 200)     X′²
1           213    45369           13           169
2           211    44521           11           121
3           210    44100           10           100
4           209    43681            9            81
5           208    43264            8            64
6           207    42849            7            49
7           205    42025            5            25
8           204    41616            4            16
9           203    41209            3             9
10          201    40401            1             1
Sum        2071   429035           71           635
deviation. That is, the standard deviation is multiplied by the constant to recapture the original unit.
Note that multiplication or division of the observed values by a constant will affect the mean as well as
the standard deviation. Multiplying a set of values by a constant greater than one will increase the
standard deviation and mean in the same ratio. Similarly, dividing a set of values by a constant greater
than one will decrease the standard deviation and the mean in the same ratio. For example, halving a set of values will reduce their standard deviation and mean to one half, whereas the variance will be reduced to one fourth, as V = SD².
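The claims above can be verified on the data of Table 2.12: shifting the origin moves the mean by the constant but leaves the variance untouched.

```python
from statistics import mean, pvariance

x = [213, 211, 210, 209, 208, 207, 205, 204, 203, 201]  # original RTs in msec
x_prime = [v - 200 for v in x]                          # origin shifted by 200

print(sum(x), sum(x_prime))                         # 2071 71, as in Table 2.12
print(round(mean(x), 1), round(mean(x_prime), 1))   # 207.1 7.1 -> the mean shifts by the constant
print(pvariance(x) == pvariance(x_prime))           # True -> the variance is unchanged
```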
Sometimes, in research investigations, we have observations on more than two groups and the measurements, for one reason or another, cannot be assumed to be normally distributed, or the measurements are on the ordinal scale. The investigator may be interested in finding if the k samples are from different populations. If the assumptions of the F test cannot be met, then the null hypothesis (H₀) whether the k independent samples are from different populations can be tested by the Kruskal-Wallis one-way analysis of variance by ranks. Further, if the data have been cast in a two-way table (rows and columns), the null hypothesis can be tested by the Friedman two-way analysis of variance by ranks. These two tests are useful when the F test is not applicable.
The Kruskal-Wallis statistic H is given by

H = [12/(N(N + 1))] Σ(Rⱼ²/nⱼ) − 3(N + 1)          (2.9)

where k = number of groups
nⱼ = number of observations in the jth group
N = total number of observations
Rⱼ = sum of the ranks in the jth sample
Kruskal and Wallis (1952) have shown that if the null hypothesis is true and if the number of observations in each group is not too small (when there are more than 5 cases in the various groups), then H is distributed as χ² (chi-square) with k − 1 degrees of freedom. Thus, we can determine whether or not the null hypothesis is tenable by comparing the table value of χ², for df equal to one less than the number of groups, with the value of H obtained from the data by the above formula.
Numerical Example

Broota and Ganguli (1975) investigated the effect of monetary reward and punishment on perceptual selectivity on three cultural groups—Indian Hindus, Indian Muslims, and U.S. Whites. They selected 8 children in each group. The experiment was conducted in two stages. In the first, the perceptual learning stage, the children were made to learn the names of the profiles in association with monetary rewards and punishments. That is, one half of the ambiguous figure was learnt in association with reward and the other half in association with punishment. Later, in the second, testing stage, the two halves were combined in a complete circle to yield an ambiguous situation such that in tachistoscopic exposure, the subjects could perceive one profile as figure and the other as background. Two such ambiguous figures were used. The dependent variable constituted the number of times the subjects reported having perceived the rewarded profile. Total trials were 40.

The experimenters were interested in finding whether reward and punishment differentially affected the three groups. The outcome of the experiment, in terms of the number of times each of the 24 subjects, organised in the three groups, reported rewarded profiles, which were regarded as reward scores, is presented in Table 2.13.
Table 2.13 Reward Scores of the Subjects

[The table gives, for each of the 8 subjects in the three cultural groups (Hindu, Muslim, and U.S. White), the reward score and its rank, R, among all 24 scores. The sums of ranks are R₁ = 61 (Hindu), R₂ = 90 (Muslim), and R₃ = 149 (U.S. White).]
Comments
Step 1: The number of times each of the 24 subjects reported the rewarded profiles, regarded as reward
scores, have been presented in Table 2.13. First, we rank all the 24 scores from the lowest to the highest
number of rewarded profiles. That is, treating all the three groups as one, rank the responses giving rank
1 to the subject who gave minimum number of reward-associated responses, and rank 24 to the one who
gave the maximum. It is immaterial whether the ranking is from lowest to highest or from highest to
lowest. In the experiment, subject number 5 in the Hindu group reported the least number of rewarded
profiles, thus, he has been assigned rank 1. The 4th subject in the U.S. White group reported maximum
number of rewarded profiles, thus, he has been assigned rank 24. These ranks are, then, separately
added for the three groups to obtain R, = 61, R, = 90, and R, = 149, as has been shown in Table 2.13.
Observations tied for a rank have been assigned the average value of the ranks they would have ordinarily
occupied. For example, there are two observations having a score 13. These two observations would
have ordinarily occupied ranks 7 and 8. So, the average rank of 7.5 [(7 + 8)/2] is assigned to each.
Step 2: Formula 2.9 is applied to obtain the value of H. Here N is equal to 24. Further, Σ(Rⱼ²/nⱼ) is obtained by taking the sums of the ranks of the three samples (R₁ = 61, R₂ = 90, and R₃ = 149), squaring each and dividing by n (8), the number of observations in each subgroup. In this way, the value of H is obtained, which in this case is equal to 10.055.
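The computation of H (formula 2.9) from the rank sums can be sketched as follows; the function name is ours:

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """H = 12 / (N(N + 1)) * sum(Rj**2 / nj) - 3(N + 1), formula 2.9."""
    N = sum(group_sizes)
    s = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return 12 / (N * (N + 1)) * s - 3 * (N + 1)

# Rank sums for the Hindu, Muslim and U.S. White groups, 8 subjects each
H = kruskal_wallis_h([61, 90, 149], [8, 8, 8])
print(round(H, 3))  # 10.055
```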
Test of Significance
The next step is to determine the significance of the obtained H. Kruskal and Wallis (1952) show that if the null hypothesis is true and if the number of observations in each group is not too small, then H is distributed as χ² with k − 1 degrees of freedom. Consequently, we consult the χ² table for the value of H obtained, that is, 10.055, with 2 df (3 − 1). We observe from the χ² table (Appendix, Table C) that for 2 degrees of freedom, the probability of obtaining a value of H equal to 10.055 is < .01. The null hypothesis (H₀) that the three samples have come from identical populations is, thus, rejected. As the null hypothesis is rejected, we accept the experimental hypothesis (H₁) that the three groups differ in their perceptual responses. In general, we conclude that the three means are not equal.
Tied Observations
In the above experiment, whenever ties occur between two or more observations, each score is given
the mean of the ranks for which it is found to be tied. As long as the number of ties is not too large, the
correction introduced for the tied observations will have relatively little influence. However, if the
number of tied ranks is large, the value of H is somewhat influenced by ties. Thus, it is desirable to
apply a correction for the ties in computing H. To correct the value of H obtained by formula 2.9 for the ties, it is divided by

1 − ΣT/(N³ − N)          (2.10)

where T = t³ − t, t being the number of tied observations in a tied group
N = total number of observations in the experiment
1 − 24/(24³ − 24) = .9983

Now, we divide the obtained value of H, 10.055, by .9983. Thus, the value of H corrected for ties is given by

H = 10.055/.9983 = 10.072
We observe that the correction increases the value of H and, thus, makes the result more significant than if uncorrected. In the present case, we observe that the correction has increased the value of H by .017. The correction has not made much difference in the significance level. It should be remembered that if no more than 25 per cent of the observations are involved in ties, then the probability associated with H computed without the correction for ties is not changed by more than 10 p.c. on making the correction for ties (Kruskal and Wallis, 1952, p. 587). The magnitude of the correction depends on the size of the ties (value of t) as well as on the percentage of the observations involved in the ties. In the present case, the length of ties in all the groups was 2 and the percentage of the observations involved in the ties was 33 p.c. [(8 × 100)/24].
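The tie correction of formula 2.10 for this example, with four tied pairs (t = 2 each), works out as follows:

```python
# Correction divisor of formula 2.10: 1 - sum(T) / (N**3 - N), where T = t**3 - t.
ties = [2, 2, 2, 2]   # four pairs of tied observations (t = 2 each), 8 scores in all
N = 24
divisor = 1 - sum(t ** 3 - t for t in ties) / (N ** 3 - N)
print(round(divisor, 4))            # 0.9983
print(round(10.055 / divisor, 2))   # 10.07, H corrected for ties
```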
The Friedman Two-way Analysis of Variance

The Friedman two-way analysis of variance is a useful technique for testing the null hypothesis (H₀) that the k matched samples have been drawn from the same population. The Kruskal-Wallis one-way analysis of variance is useful when we have to compare several means derived from k independent samples. In the Friedman two-way analysis of variance, the k samples are matched on certain variables or the same group of subjects is studied under each of the k conditions. 'Two-way' refers only to the casting of data in which there are r rows and k columns. The rows may represent the groups matched on certain variables and the columns various experimental conditions. Our interest is to find the differences among the k conditions only. The data should be at least on an ordinal scale.

The Friedman test determines whether the columns representing conditions differ significantly, and the statistic used is χ²ᵣ, which is given by

χ²ᵣ = [12/(rk(k + 1))] ΣRⱼ² − 3r(k + 1)          (2.11)

Friedman (1937) shows that when the number of rows and/or columns is not too small, χ²ᵣ is distributed approximately as chi-square with df = k − 1. Thus, we can determine whether or not the null hypothesis is tenable by consulting the table of χ² for the value of χ²ᵣ obtained from the data by formula 2.11, for df equal to one less than the number of columns. An empirical study by Friedman (1937, p. 686) has shown very favourable results with the χ²ᵣ test as compared with the most powerful parametric F test.
Numerical Example
Suppose, an experimenter interested in evaluating the effect of three types of reinforcement upon extent
of discrimination learning took 20 sets of rats; 3 in each set. The 3 rats in each of the sets were matched.
The three kinds of reinforcement were designated as RE₁, RE₂, and RE₃. In each set, the 3 rats were assigned randomly to the three reinforcement conditions. After the training with the three types of reinforcement, the extent of learning was measured in terms of the latency of the correct response. In such experiments, the null hypothesis (H₀) that the different types of reinforcement have no differential effect is evaluated through the Friedman two-way analysis of variance. In this experiment, the latency of correct response in each of the 20 groups was ranked, giving rank 1 to the rat that had the fastest latency and 3 to the one that had the slowest latency. Rank 1, thus, signifies strong learning. Remember that in the present example, the scores are ranked in each row from 1 to 3. The
ranks of the 20 matched groups in respect of the extent of learning under the three kinds of reinforcement
are given in Table 2.14.
Table 2.14 Ranks of Twenty Matched Groups in Respect of the Extent of Learning
Under the Three Kinds of Reinforcement

Group     RE₁     RE₂     RE₃
1          2       3       1
2          1       2       3
3          3       1       2
4          3       1       2
5          3       2       1
6          2.5     2.5     1
7          1       3       2
8          2       3       1
9          3       1       2
10         3       2       1
11         2       3       1
12         3       1       2
13         2       3       1
14         1       3       2
15         3       1       2
16         2       1       3
17         2.5     2.5     1
18         2       3       1
19         3       2       1
20         2       3       1
(i) Rⱼ    46      43      31
i ®@ Experimentat Desicn In BEHAVIOURAL RESEARCH
Computation

Applying formula 2.11,

(ii) χ²ᵣ = [12/((20)(3)(3 + 1))] [46² + 43² + 31²] − (3)(20)(3 + 1) = 6.3
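The χ²ᵣ computation (formula 2.11) can be sketched from the column rank sums; the function name is ours:

```python
def friedman_chi_sq(rank_sums, r):
    """chi_r**2 = 12 / (r * k * (k + 1)) * sum(Rj**2) - 3 * r * (k + 1), formula 2.11."""
    k = len(rank_sums)
    return 12 / (r * k * (k + 1)) * sum(R ** 2 for R in rank_sums) - 3 * r * (k + 1)

# Column rank sums of Table 2.14, with r = 20 matched groups
chi_sq = friedman_chi_sq([46, 43, 31], r=20)
print(round(chi_sq, 1))  # 6.3
```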
Comments

Step 1: The observations have been ranked in each row from 1 to 3 (1 to k) and presented in Table 2.14. The sums of the ranks in each of the three columns (Rⱼ) have been determined. The sums of the ranks in columns RE₁, RE₂ and RE₃ are 46, 43 and 31 respectively. If the data are in terms of scores, remember to rank the scores in each row from 1 to k. Generally, rank 1 is assigned to the subject that has most of the attribute in which we are interested and the subject with the least amount of the attribute is assigned the last rank, k (in the present example, 3). The rat with the fastest latency was assigned rank 1 and rank 3 the slowest.
Step 2: The value of χ²ᵣ is computed using formula 2.11. Here, r is equal to 20, the number of rows or groups, and k is equal to 3, the number of columns or types of reinforcement. Further, ΣRⱼ² is obtained by squaring the sum of the ranks in each column and adding these squares.
Tied Observations

In group 6 as well as 17, the rats under RE₁ and RE₂ had the same latencies and were, thus, tied for ranks 2 and 3. Both were assigned the rank 2.5 [(2 + 3)/2], the average of the tied ranks. Friedman (1937) states that the substitution of the average rank for tied observations does not affect the validity of the χ²ᵣ test. Therefore, in this test there is no need to apply any correction for tied observations.
Small Samples
When the number of observations in each group is small, the values can be found from the tables provided by Siegel (1956). The same formula (2.11) is used for determining the χ²ᵣ value for large and small samples.