The document discusses the theory of psychological tests, focusing on the relationship between item discrimination, test reliability, and the impact of time constraints on performance. It highlights the importance of item intercorrelation and the optimal composition of test items for effective discrimination, while also addressing the complexities of speed versus power in testing. Additionally, it examines how motivation influences test performance and the distinct factors of speed and power in time-limited assessments.
Speed and Power
THEORY OF PSYCHOLOGICAL TESTS 365
Ferguson also presents a coefficient of discrimination, which expresses the
ratio between the number of discriminations a test actually provides and the
maximum number it could provide (3, p. 67). The formula for it is
$$\delta = \frac{(n + 1)\left(N^2 - \sum f^2\right)}{nN^2} \tag{13.40}$$
where N = number of individuals in sample
f = frequency at each score
n = number of items in test
This coefficient varies from 0, when all individuals obtain the same score,
to 1.0 when the distribution is rectangular. While the index
depends upon the degree of intercorrelation of items, it also depends on
other properties of the test. We saw previously that when item intercorrela-
tion is moderately high the score distribution becomes rectangular. Further
increase in item intercorrelation brings bimodal and finally U-shaped distri-
butions as well as higher reliability. The coefficient of discrimination is
greatest when the distribution is rectangular, and it decreases with bimodal
and U-shaped distributions. Discriminations are admittedly poorer in the
latter instances, except at restricted score levels. Thus, maximal discrimina-
tion all along the line is best only when the test is short of perfect reliability.
It is interesting that the theory from this approach comes to much the same
conclusions concerning the optimal item composition for a test as those from
other approaches. These conclusions agree that the best items are those of
medium difficulty and of a narrow range of difficulty. There is agreement
on the desirability of high item intercorrelations, though the discrimination
approach warns us more clearly against carrying this goal too far. Thus,
something short of perfect reliability of the internal-consistency variety is
optimal for numerous and widespread discriminations.
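The behavior of the coefficient of discrimination can be checked numerically. The following is a minimal Python sketch, assuming the coefficient takes the form (n + 1)(N² − Σf²) / (nN²), with N examinees, n items, and f the frequency at each total score. The function name `ferguson_delta` and the three small score samples are illustrative assumptions, not data from the text.

```python
from collections import Counter


def ferguson_delta(scores, n_items):
    """Coefficient of discrimination for integer total scores in 0..n_items.

    Sketch of delta = (n + 1)(N^2 - sum f^2) / (n N^2), where N is the
    number of examinees, n the number of items, and f the frequency
    observed at each score.
    """
    n_examinees = len(scores)
    if n_examinees == 0 or n_items < 1:
        raise ValueError("need at least one examinee and one item")
    frequencies = Counter(scores)  # f at each obtained score
    sum_f_squared = sum(f * f for f in frequencies.values())
    numerator = (n_items + 1) * (n_examinees ** 2 - sum_f_squared)
    return numerator / (n_items * n_examinees ** 2)


# Hypothetical samples on a 4-item test (possible scores 0..4), N = 10.
rectangular = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]   # equal frequency at every score
u_shaped = [0, 0, 0, 0, 2, 2, 4, 4, 4, 4]      # piled up at the extremes
identical = [2] * 10                           # no discrimination at all

for name, sample in [("rectangular", rectangular),
                     ("U-shaped", u_shaped),
                     ("identical", identical)]:
    print(name, round(ferguson_delta(sample, n_items=4), 3))
```

In this sketch the rectangular sample reaches the maximum, the U-shaped sample falls below it, and identical scores give zero, which matches the ordering described above.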
A word of warning is perhaps necessary lest too much weight be placed on
discriminations as such. There is also the question as to what the discrimina-
tions mean and whether they are likely to be stable, that is, in the same direc-
tions, in a parallel form of the test or in a repetition of the same test. It is
reassuring that the high number of discriminations depends upon high item
intercorrelations, a condition that should tend to guarantee a similar kind of
discrimination along the line. The question of the kind of discrimination and
whether it is what we want takes us over into the problem of validity on
which the theory has nothing to say.
SPEED AND POWER PROBLEMS
In the practice of group testing it is sometimes essential, or at least it is very
convenient, for every examinee to be allotted the same limited working time
on a test. This fact, particularly, has raised many questions concerning
the effect of working time upon scores and upon measurement. There is the
practical question concerning the optimal time limit to be adopted for the
366 PSYCHOMETRIC METHODS
test, where optimal is defined in terms of some value judgment in the light
of some psychological-measurement goal. There are the more fundamental
questions concerning what psychological qualities are measured when time
is liberal versus when time is short. The problems of time cannot be divorced
from the problems of difficulty, as we shall see. There has been some theory
aimed at these problems and very recently some experimental and factor-
analytical investigations have been aimed at their solution.
Theory of Relationships of Speed and Power. There seems to have been
a working hypothesis that speed and power, as descriptive variables of psy-
chological performance, are relatively interchangeable, that we can measure
the same abilities either by determining how many units of work can be pro-
duced per unit of time or by determining the level of difficulty that can be
mastered in liberal time. This would mean that an individual could obtain
the same score (number of successful acts—responses to items) under different
combinations of time and difficulty levels so long as the product of time times
difficulty were constant. This picture of the relationships is undoubtedly
much too simple.
Probability of Success in Relation to Difficulty and Time. Thurstone made
the first attempt at a rationale of this problem in 1937 (26), an attempt which
does not seem to have been followed up but which would seem to be promis-
ing. His reasoning was essentially as follows. He defined the power of an
individual as that level of difficulty of tasks at which his probability of suc-
cess is .5 when given infinite time. In practice, infinite time is all the time
the examinee will take. It is true that the amount of time he will take
depends upon his motivation, but we will assume a high level of desire to com-
plete the task, such as we usually assume in connection with testing of abili-
ties. Along with this definition of power is the assumption that the proba-
bility of success is a descending ogive function of difficulty. This is not a
new assumption, for we saw it represented in Fig. 13.1. Here, however,
we may extend the assumption by adding “within a constant time
interval.”
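The definition of power as the difficulty at which the probability of success is .5 can be illustrated with a minimal Python sketch. The normal-ogive form, the parameter `sigma` (the dispersion of the curve, that is, the inverse of its precision), the function names, and all numeric values are assumptions made for illustration; the text specifies only that the curve is a descending ogive.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_success(difficulty, limen, sigma=1.0):
    """Descending ogive: probability of success falls as difficulty rises.

    `limen` is the difficulty at which the probability is exactly .5
    within a given time interval; with unlimited time it is the
    individual's power. `sigma` is assumed constant here.
    """
    return 1.0 - normal_cdf((difficulty - limen) / sigma)


# Hypothetical individual: power (limen at unlimited time) of 10 units,
# and a lower limen of 8 units under some shorter time limit.
power = 10.0
limen_short_time = 8.0

for d in (6.0, 8.0, 10.0, 12.0):
    print(d,
          round(p_success(d, power), 3),
          round(p_success(d, limen_short_time), 3))
```

Lowering the limen while holding `sigma` fixed moves the whole curve to the left on the difficulty scale, so every task becomes less likely to be passed within the shorter interval.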
Figure 13.4 (first diagram) represents the assumed relationship of proba-
bility of success to difficulty level when time varies ($T_1, T_2, \ldots, T_\infty$). At
infinite time the median of the psychometric curve comes at difficulty level
A, which by definition is the measure of this individual’s power in this kind
of task. At decreasing time limits, the ogive moves to the left on the diffi-
culty scale. One may assume that the precision of the curve remains the
same for this kind of task and this individual. The slope and shape of the
curve are matters for experimental determination. There may be changes
in both slope and reflection point [...]. The diagram assumes symmetry, that the
[...].
For the same individual in the same kind of task (second diagram of Fig. 13.4),
at different difficulty levels ($D_1, D_2, \ldots, D_n$) we may
assume that the regression of probability of success on time is of the
ascending ogive form. For easy items the curve has very high
precision, but as difficulty increases, the precision of the curves decreases.
The main reason for this is that zero time
imposes a limit. At zero time the
probability of success is zero. Some difficult tasks are mastered with proba-
bility less than .5 even after long time intervals. There are some tasks so
[...] mastering them, even in infinite time.
Fig. 13.4. Hypothetical probability of success on a task as a function of difficulty at different
time limits and as a function of time at different difficulty levels.
[...] deductions concerning the effect of limited time [...] measure-
ment. Would the ordinary summation scores yield the [...]
of individuals when there is a time limit [...]
the individual’s power in the [...]
time. We saw that under [...] per-
fect correlation between summation scores and limen scores in the same
test. In Fig. 13.4, diagram 1, we see that the limen score decreases systemati-
cally as time decreases. The question would be whether all individuals’
limen scores will decrease in the same ratio as time decreases. If the psy-
chometric functions involved have equal precisions for all persons and at all
difficulty levels, we might expect this question to be answered in the affirma-
tive. But equality of precision is probably not the case, since there is much
empirical evidence that obtained scores measure different factors as speed
versus power is emphasized. The individual’s power in a task, as defined, is
a quantity independent of his speed of work. It is best measured under
power conditions, and estimates of it from scores obtained under conditions
that limit time are made with some risk, depending upon how severe the time
limitation is.
There have been a few studies bearing directly or indirectly on Thurstone’s
theory of the difficulty-time manifold in relation to success, most of them
providing some support. In a study by Philip (20), for example, difficulty
of a psychophysical judgment was varied by making color-spot patterns
more or less alike, the observer being asked to judge differences. Exposure
times were varied from .133 to .668 sec. At each level of difficulty, errors
were S-shaped functions of exposure time. A study by Hunter and Sigler
(12) demonstrated similar effects in an investigation of span of apprehension,
in which difficulty was controlled by varying the illumination level.
Relation of Speed and Power to Motivation. Thurstone (26) also specu-
lated concerning the effects of motivation upon probability of success as
difficulty and time vary. One conclusion was that increased motivation has
no effect upon power but may increase speed. Trying hard to master an
item will not increase the probability of success in infinite time, but if the
individual can master it at all, he can do so in shorter time. It is likely that
the easier the task, the greater relatively the effect of increased motivation.
In the easiest tests the task becomes essentially one of reaction time. The
effects of motivation on reaction time are notable, but even with simple tasks
the relation of speed of response to degree of motivation is probably one of
sharply negative acceleration, the greatest increases of speed being noticed
from changes at low levels of motivation. There is also probably an optimal
motivation level for difficult tasks, for there is empirical evidence that one
may “try too hard,” as in learning experiments where effort is varied. Thus,
the relations of rate of successful performance to motivation are not very
well known or very simple. It would be best to say that the rationale stated
for relations of time and difficulty to performance, discussed above, applies
when there are moderately high, but not maximal, degrees of motivation.
Definitions of Speed and Power Tests. A speed test is often defined as
one in which no examinee has time to attempt all items. A power test is
often defined as one in which every examinee has a chance to attempt every
item. Most tests fall somewhere between these two extremes; some rela-
tively more speeded and some relatively less.
Gulliksen has given more rigorous definitions of speed and power tests, in
terms of statistical criteria. He first defines the following symbols (11,
p. 230):
W = number of wrong answers
U = number of items unattempted
X = W + U = total error score (items not correctly answered)
In a pure speed test W = 0 so that X = U, $M_x = M_u$, and $\sigma_x = \sigma_u$. Any
test approaches a pure speed test to the extent that $M_w$ and $\sigma_w$ approach
zero and $M_u$ and $\sigma_u$ approach $M_x$ and $\sigma_x$, respectively. In a pure power test
$M_u$ and $\sigma_u$ equal zero, and $M_w$ and $\sigma_w$ equal $M_x$ and $\sigma_x$, respectively. To the
extent that any test approaches these conditions it is a power test. A hard-
and-fast line between speed and power tests is not possible to fix. Gulliksen
(11, p. 233) offers the criterion that if the ratio $\sigma_w/\sigma_x$ is very small the test is
essentially a speed test and if the ratio $\sigma_u/\sigma_x$ is very small the test is essentially
a power test. These statements assume that there have been essentially no
omitted items.
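Gulliksen's two ratios can be computed directly from per-examinee counts of wrong and unattempted items. The following is a minimal Python sketch under stated assumptions: the function name `speed_power_ratios` and the five-examinee data are hypothetical, the population standard deviation is used (the choice of divisor does not affect the ratios), and no numeric cutoff for "very small" is implied, since the text gives none.

```python
from statistics import pstdev


def speed_power_ratios(wrong, unattempted):
    """Return (sd_w / sd_x, sd_u / sd_x) for per-examinee error counts.

    wrong[i]       = W, number of wrong answers for examinee i
    unattempted[i] = U, number of items examinee i did not reach
    X = W + U is the total error score. Omitted (skipped) items are
    assumed to be absent, as the criterion requires.
    """
    if len(wrong) != len(unattempted):
        raise ValueError("wrong and unattempted must have the same length")
    total_errors = [w + u for w, u in zip(wrong, unattempted)]
    sd_x = pstdev(total_errors)
    if sd_x == 0:
        raise ValueError("total error scores do not vary; ratios undefined")
    return pstdev(wrong) / sd_x, pstdev(unattempted) / sd_x


# Hypothetical data: few wrong answers, many items not reached.
wrong = [0, 1, 0, 0, 1]
unattempted = [5, 9, 2, 12, 7]

ratio_w, ratio_u = speed_power_ratios(wrong, unattempted)
print(round(ratio_w, 3), round(ratio_u, 3))
```

Here the variation in total error comes almost entirely from unattempted items, so the first ratio is small and the test would lean toward the speed end of the continuum.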
Speed and Power Factors in Time-limit Tests. Some investigations of
time-limit tests throw considerable light on what is measured as time is
varied. All show that there are speed factors distinct from power
factors.
Davidson and Carroll (2) factor-analyzed separately speed scores and
“level” or power scores derived from the same tests. They also related the
same factors to time-limit scores as obtained under the standard administra-
tion of the same tests. The speed score for a test was the number of items
attempted, disregarding errors. The power score was number of correct
answers when all items were attempted. The results showed a general
speed factor common to the speed scores and related to some extent to some
of the time-limit scores. Some of the well-known common factors were found
in the analysis of the power scores. Corresponding to one of these factors, a
reasoning factor, there was a speed-of-reasoning factor in the analysis of speed
scores. The time-limit scores were related to different degrees to the various
speed and power factors, some more to the one kind and some more to the
other.
Myers (19) administered three forms of a figure-analogies test, each form
being given in three ways, with 10, 20, and 30 items, respectively, adminis-
tered in 12 min. In the power tests (10 items), 97 to 100 per cent of the
examinees completed the items. In the speed tests (30 items) 30 to 41 per
cent completed the items. The effect of speed conditions on various scores
was studied: the number of items attempted, the number of right answers,
the number of omissions, and the number of wrong answers. A factor
analysis of each form of the test showed two “factors,” one recognized as
speed and the other as power.¹ Both the number-attempted and the num-
ber-right scores under speed conditions measured the speed “factor.” The
number-attempted score means different things for different individuals.
Some examinees make no responses until they feel confident of the answer,
while others record an answer even when they know they are guessing. In a
speed test, too, differences in motivation level may have an important bearing
on the number of answers recorded. Thus, speed conditions where items are
not very easy open the door to many uncontrolled determiners of individual
differences in scores.
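The four scores studied under speed conditions can be derived from a single answer record. The following is a minimal Python sketch under stated assumptions: each item is recorded as `True` (right), `False` (wrong), or `None` (no answer); "attempted" counts items with a recorded answer; a blank before the last answered item counts as an omission, and a blank after it as unattempted. These conventions and the name `score_record` are illustrative and are not taken from the studies cited.

```python
def score_record(responses):
    """Score one examinee's answer record on a time-limit test.

    responses: list with True (right), False (wrong), or None (blank)
    for each item, in the order the items appear in the test.
    """
    answered_positions = [i for i, r in enumerate(responses) if r is not None]
    last_answered = answered_positions[-1] if answered_positions else -1

    right = sum(1 for r in responses if r is True)
    wrong = sum(1 for r in responses if r is False)
    # Blanks up to the last answered item are treated as skipped items.
    omitted = sum(1 for r in responses[:last_answered + 1] if r is None)
    # Blanks after the last answered item are treated as never reached.
    unattempted = len(responses) - (last_answered + 1)

    return {
        "attempted": right + wrong,
        "right": right,
        "wrong": wrong,
        "omitted": omitted,
        "unattempted": unattempted,
    }


# Hypothetical record on a 7-item test: one skipped item, two not reached.
record = [True, False, None, True, True, None, None]
print(score_record(record))
```

Two examinees with the same number right can differ in every other count, which is one way of seeing why the number-attempted score means different things for different individuals.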
Tate (23) approached the speed versus power problem experimentally.
His major problem of investigation was to determine whether there is a fac-
tor of mental speed that is independent of power and independent of the kind
of task in which it is measured. Tate used four kinds of tasks, including
tests of arithmetic reasoning, number series, sentence completion, and spatial
relations. The items were at three difficulty levels. At the easy level, 3 to
11 per cent failed the items; at the moderate level, 17 to 40 per cent failed;
and at the difficult level, 42 to 61 per cent failed. The items were adminis-
tered individually, and the response time to each item was recorded sepa-
rately. This is an important condition, since in the group administration of
tests, although total working time is controlled, each examinee regulates his
own timing within a test. One interesting feature of Tate’s treatment of
results was that conversion of his working-time scores into log-time measures
resulted in normal distributions.
Several of Tate’s findings are noteworthy (23, p. 373). With difficulty
and accuracy (proportion correct) controlled, there were still very large
¹ The term “factors” is in quotation marks here because each factor is probably a com-
posite of several.