Guiding Principle
Selected-response tests require learners to choose the correct answer or best alternative from
several choices. While they can cover a wide range of learning materials very efficiently and
measure a variety of learning outcomes, they are limited when assessing learning outcomes
that involve more complex and higher-level thinking skills. Selected-response tests include the
multiple-choice test, the true-false or alternative-response test, and the matching test. In the
following sections, the nature of the three selected-response tests and the rules for writing
them are discussed.
Multiple-Choice Test
Strengths
1. Learning outcomes from simple to complex can be measured.
2. Highly structured and clear tasks are provided.
3. A broad sample of achievement can be measured.
4. Incorrect alternatives provide diagnostic information.
5. Scores are less influenced by guessing than with true-false items.
6. Scoring is easy, objective, and reliable.
Limitations
1. Constructing good items is time-consuming.
2. It is frequently difficult to find plausible distracters.
3. This item type is ineffective for measuring some types of problem-solving and the ability
to organize and express ideas.
4. Scores can be influenced by reading ability.
Rules for Writing Multiple-Choice Items (Waugh & Gronlund, 2013)
An effective multiple-choice item presents students with a task that is both important and
clearly understood and one that can be answered correctly by anyone who has achieved
the intended learning outcome. Nothing in the content or structure of the item should
prevent an informed student from responding correctly. Similarly, nothing in the content
or structure of the item should enable an uninformed student to select the correct answer.
The following rules for item writing are intended as guides for the preparation of multiple-
choice items that function as intended.
1. Design each item to measure an important learning outcome. The problem situation
around which an item is to be built should be important and should be related to the
intended learning outcome to be measured. When writing the item, focus on the
functioning content of the item and resist the temptation to include irrelevant material
or more obscure and less significant content to increase item difficulty.
2. Present a single, clearly formulated problem in the stem of the item. The task outlined
in the stem of the item should be so clear that a student can understand it without reading
the alternatives. A good check on the clarity and completeness of a multiple-choice stem
is to cover the alternatives and determine whether it could be answered without the
choices.
Example
Poor: A table of specifications
a. indicates how a test will be used to improve learning.
b. provides a more balanced sampling of content.*
c. arranges the instructional objectives in order of their importance.
d. specifies the method of scoring to be used on a test.
Better: What is the main advantage of using a table of specifications when preparing
an achievement test?
a. It reduces the amount of time required.
b. It improves the sampling of content.*
c. It makes the construction of test items easier.
d. It increases the objectivity of the test.
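For teachers who plan tests with a spreadsheet or a script, a table of specifications can be treated as a simple two-way grid of content areas by cognitive levels. The sketch below is only an illustration of that idea, not an example from Waugh and Gronlund; the content areas, levels, counts, and the helper coverage_gaps are all invented.

```python
# A hypothetical table of specifications: rows are content areas, columns are
# cognitive levels, and each cell holds the planned number of test items.
blueprint = {
    "Test planning":        {"Remember": 2, "Understand": 3, "Apply": 1},
    "Item writing":         {"Remember": 3, "Understand": 4, "Apply": 3},
    "Score interpretation": {"Remember": 1, "Understand": 2, "Apply": 1},
}

# Each draft item is tagged with the content area and level it is meant to measure.
draft_items = [
    ("Item writing", "Understand"),
    ("Test planning", "Remember"),
]

def coverage_gaps(blueprint, items):
    """Return the (area, level) cells where the draft falls short of the plan."""
    counts = {}
    for cell in items:
        counts[cell] = counts.get(cell, 0) + 1
    return {
        (area, level): planned - counts.get((area, level), 0)
        for area, levels in blueprint.items()
        for level, planned in levels.items()
        if counts.get((area, level), 0) < planned
    }

print(coverage_gaps(blueprint, draft_items))
```

Checking a draft test against such a grid is what gives the "more balanced sampling of content" that the item above refers to.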
3. State the stem of the item in simple, clear language. The problem in the stem of a
multiple-choice item should be stated as precisely as possible and should be free of
unnecessarily complex wording and sentence structure. Poorly stated item stems
frequently introduce sufficient ambiguity to prevent a knowledgeable student from
responding correctly. Also, complex sentence structure may make the item a measure
more of reading comprehension than of the intended outcome.
Example
Poor: The paucity of plausible, but incorrect, statements that can be related to a central
idea poses a problem when constructing which one of the following types of test
items?
a. Short answer.
b. True-false.
c. Multiple choice.*
d. Essay.
Better: The lack of plausible, but incorrect, alternatives will cause the greatest
difficulty when constructing
a. short-answer items.
b. true-false items.
c. multiple-choice items.*
d. essay items.
Another common fault in stating multiple-choice items is to load the stem with irrelevant
and, thus, nonfunctioning material. This is probably caused by the instructor’s desire to
continue to teach the students – even while testing them. The following example illustrates
the use of an item stem as “another chance to inform students.”
Example
Poor: Testing can contribute to the instructional program of the school in many
important ways. However, the main function of testing in teaching is:
Better: The main function of testing in teaching is:
4. Put as much of the wording as possible in the stem of the item. Avoid repeating the same
material in each of the alternatives. By moving all the common content to the stem, it is
usually possible to clarify the problem further and to reduce the time the student needs
to read the alternatives.
Example
Poor: In objective testing, the term objective
a. refers to the method of identifying the learning outcomes.
b. refers to the method of selecting the test content.
c. refers to the method of presenting the problem.
d. refers to the method of scoring the answers.*
Better: In objective testing, the term objective refers to the method of
a. identifying the learning outcomes.
b. selecting the test content.
c. presenting the problem.
d. scoring the answers.*
In many cases, the problem is not simply to move the common words to the stem but to
reword the entire item. The following examples illustrate how an item can be improved by
revising the stem and shortening the alternatives.
Example
Poor: Instructional objectives are most apt to be useful for test-construction purposes
when they are stated in such a way that they show
a. the course content to be covered during the instructional period.
b. the kinds of performance students should demonstrate upon reaching the goal.*
c. the things the teacher will do to obtain maximum student learning.
d. the types of learning activities to be participated in during the course.
Better: Instructional objectives are most useful for test-construction purposes when
they are stated in terms of
a. course content.
b. student performance.*
c. teacher behavior.
d. learning activities.
5. State the stem of the item in positive form, wherever possible. A positively phrased item
tends to measure more important learning outcomes than a negatively stated item. This
is because knowing such things as the best method or the most relevant argument
typically has greater educational significance than knowing the poorest method or the
least relevant argument. The use of negatively stated items results all too
frequently from the ease with which such items can be constructed rather than from the
importance of the learning outcomes measured. The test maker who becomes frustrated
by the inability to think of a sufficient number of plausible distracters for an item, as in
the first following example, suddenly realizes how simple it would be to construct the
second version.
Example
Item One: Which one of the following is a category in the revised taxonomy of the
cognitive domain?
a. Understand.*
b. (distracter needed)
c. (distracter needed)
d. (distracter needed)
Item Two: Which one of the following is not a category in the revised taxonomy of
the cognitive domain?
a. Understand
b. Apply
c. Analyze
d. (answer needed)*
Note in the second version that the categories of the taxonomy serve as distracters and that
all that is needed to complete the item is a correct answer. This could be any term that
appears plausible but is not one of the categories listed in the taxonomy. Although such
items are easily constructed, they are apt to have a low level of difficulty and are likely to
measure relatively unimportant learning outcomes. Being able to identify answers that do
not apply provides no assurance that the student possesses the desired knowledge.
This solution to the lack of sufficient distracters is most likely to occur when the test
maker is committed to the use of multiple-choice items only. A more desirable procedure
for measuring the “ability to recognize the categories in the taxonomy of the cognitive
domain” is to switch to a modified true-false form, as in the following example.
Example
Directions: Indicate which of the following are categories in the taxonomy of the
cognitive domain by circling Y for yes and N for no.
*Y N Understand
Y *N Critical Thinking
Y *N Reasoning
*Y N Create
6. If negative wording is used in the stem of an item, emphasize it. Negative words are easily
overlooked, so they should be underlined or capitalized and, where possible, placed near the
end of the statement.
Example
Poor: Which one of the following is not a desirable practice when preparing multiple-
choice items?
a. Stating the stem in positive form.
b. Using a stem that could function as a short-answer item.
c. Underlining certain words in the stem for emphasis.
d. Shortening the stem by lengthening the alternatives.*
Better: All of the following are desirable practices when preparing multiple-choice
items EXCEPT
a. stating the stem in positive form.
b. using a stem that could function as a short-answer item.
c. underlining certain words in the stem for emphasis.
d. shortening the stem by lengthening the alternatives.*
7. Make certain that the intended answer is correct or clearly best. When the correct-
answer form of a multiple-choice item is used, there should be only one correct answer
and it should be unquestionably correct. With the best-answer form, the intended answer
should be one that competent authorities would agree is clearly the best. In the latter
case, it may also be necessary to include “of the following” in the stem of the item to
allow for equally satisfactory answers that have not been included in the item.
Example
Poor: What is the best method of selecting course content for test items?
Better: Which one of the following is the best method of selecting course content for
test items?
The proper phrasing of the stem of an item can also help avoid equivocal answers
when the correct-answer form is used. In fact, an inadequately stated problem frequently
makes the intended answer only partially correct or makes more than one alternative
suitable.
Example
Poor: What is the purpose of classroom testing?
Better: "One purpose of classroom testing is . . ." or "The main purpose of classroom testing is . . ."
8. Make all alternatives grammatically consistent with the stem of the item and parallel in
form. The correct answer is usually carefully phrased so that it is grammatically
consistent with the stem. The distracters, however, may be inconsistent with the stem
in tense, article, or grammatical form unless care is taken to check them against the
wording of the stem and of the correct answer. This, of course, could provide a clue to
the correct answer, or at least make some of the distracters ineffective. A general step
that can be taken to prevent grammatical inconsistency is to avoid using the articles
"a" or "an" at the end of the stem of the item.
Example
Poor: The recall of factual information can be measured best with a
a. matching item.
b. multiple-choice item.
c. short-answer item.*
d. essay question.
Better: The recall of factual information can be measured best with
a. matching items.
b. multiple-choice items.
c. short-answer items.*
d. essay questions.
The indefinite article "a" in the "poor" version makes the last distracter obviously
wrong. By simply changing the alternatives from singular to plural, it is possible to omit
the article. In other cases, it may be necessary to add an article ("a" or "an," as appropriate)
to each alternative or to rephrase the entire item.
Stating all the alternatives in parallel form also tends to prevent unnecessary clues
from being given to students. When the grammatical structure of one alternative differs
from that of the others, some students may more readily detect that alternative as a correct
or an incorrect response.
Example
Poor: Why should negative terms be avoided in the stem of a multiple-choice item?
a. They may be overlooked.*
b. The stem tends to be longer.
c. The construction of alternatives is more difficult.
d. The scoring is more difficult.
Better: Why should negative terms be avoided in the stem of a multiple-choice item?
a. They may be overlooked.*
b. They tend to increase the length of the stem.
c. They make the construction of alternatives more difficult.
d. They may increase the difficulty of the scoring.
9. Avoid verbal clues that enable students to select the correct answer or to eliminate an
incorrect alternative. One of the most common sources of extraneous clues in multiple-
choice items is the wording of the item. Some such clues are rather obvious and are
easily avoided. Others require the constant attention of the test maker to prevent them
from slipping in unnoticed. Let’s review some of the verbal clues commonly found in
multiple-choice items.
a. Similarity of wording in both the stem and the correct answer is one of the most obvious
clues. Keywords in the stem may unintentionally be repeated verbatim in the correct
answer, a synonym may be used, or the words may simply sound or look alike.
Poor: Which one of the following would you consult first to locate research articles
on achievement testing?
a. Journal of Educational Psychology
b. Journal of Educational Measurement
c. Journal of Consulting Psychology
d. Review of Educational Research*
The word “research” in both the stem and the correct answer is apt to provide a clue to the
correct answer to the uninformed but testwise student. Such obvious clues might better be
used in both the stem and an incorrect answer, to lead the uninformed away from the correct
answer.
b. Stating the correct answer in textbook language or stereotyped phraseology may
cause students to select it because it looks better than the other alternatives, or because
they vaguely recall having seen it before.
Example
Poor: Learning outcomes are most useful in preparing tests when they are
a. clearly stated in performance terms.*
b. developed cooperatively by teachers and students.
c. prepared after the instruction has ended.
d. stated in general terms.
The pat phrasing of the correct answer is likely to give it away. Even the most poorly
prepared student is apt to recognize the often-repeated phrase “clearly stated in
performance terms,” without having the foggiest notion of what it means.
c. Stating the correct answer in greater detail may provide a clue. Also, when the answer
is qualified by modifiers that are typically associated with true statements (for example,
“sometimes,” “may,” “usually”), it is more likely to be chosen.
Example
Poor: Lack of attention to learning outcomes during test preparation
a. will lower the technical quality of the items.
b. will make the construction of test items more difficult.
c. will result in the greater use of essay questions.
d. may result in a test that is less relevant to the instructional program.*
The term “may” is rather obvious in this example, but this type of error is common and
appears frequently in a subtler form.
d. Including absolute terms in the distracters enables students to eliminate them as possible
answers because such terms (“always,” “never,” “all,” “none,” “only,”) are commonly
associated with false statements. This makes the correct answer obvious or at least
increases the chances that the students who do not know the answer will guess it.
Example
Poor: Achievement tests help students improve their learning by
a. encouraging them all to study hard.
b. informing them of their progress.*
c. giving them all a feeling of success.
d. preventing any of them from neglecting their assignments.
Such absolutes tend to be used by the inexperienced test maker to ensure that the incorrect
alternatives are clearly wrong. Unfortunately, they are easily recognized by the student as
unlikely answers, making them ineffective as distracters.
e. Including two responses that are all-inclusive makes it possible to eliminate the other
alternatives since one of the two must obviously be the correct answer.
Example
Poor: Which of the following types of test items measures learning outcomes at the
recall level?
a. Supply-type items.*
b. Selection-type items.
c. Matching items.
d. Multiple-choice items.
Since the first two alternatives include the only two major types of test items, even poorly
prepared students are likely to limit their choices to these two. This, of course, gives them
a fifty-fifty chance of guessing the correct answer.
f. Including two responses that have the same meaning makes it possible to eliminate them
as potential answers. If two alternatives have the same meaning and only one answer is
to be selected, it is fairly obvious that both alternatives must be incorrect.
Example
Poor: Which of the following is the most important characteristic of achievement-test
results?
a. Consistency.
b. Reliability.
c. Relevance.*
d. Objectivity.
In this item, both “consistency” and “reliability” can be eliminated because they mean
essentially the same thing. Extraneous clues to the correct answer must be excluded from test
items if the items are to function as intended. It is frequently good practice, however, to use
such clues to lead the uninformed away from the correct answer. If not overdone, this can
contribute to the plausibility of the incorrect alternatives.
10. Make the distracters plausible and attractive to the uninformed. The distracters in a
multiple-choice item should be so appealing to the students who lack the knowledge
called for by the item that they select one of the distracters in preference to the correct
answer. This is the ideal, of course, but one toward which the test maker must work
continually. The art of constructing a good multiple-choice item depends heavily on
the development of effective distracters. You can do several things to increase the plausibility
and attractiveness of distracters; one of the most effective is to use homogeneous alternatives.
The importance of homogeneous alternatives can be seen in the following item, where the
heterogeneous distracters make the correct answer easy to identify.
Example
Poor: Obtaining a dependable ranking of students is of major concern when using
a. norm-referenced summative tests.*
b. behavior descriptions.
c. checklists.
d. questionnaires.
11. Make certain that the relative length of the alternatives does not provide a clue to the
answer. Because the correct answer usually must be carefully qualified, it tends to be longer
than the distracters unless a special effort is made to equalize the alternatives in length and
parallel form, as in the following example.
Example
Poor: One advantage of multiple-choice items over essay questions is that they
a. measure more complex outcomes.
b. depend more on recall.
c. require less time to score.
d. provide for a more extensive sampling of course content.*
Better: One advantage of multiple-choice items over essay questions is that they
a. provide for the measurement of more complex learning outcomes.
b. place greater emphasis on the recall of factual information.
c. require less time for test preparation and scoring.
d. provide for a more extensive sampling of course content.*
12. Avoid using the alternative “all of the above,” and use “none of the above” with
extreme caution. When test makers are having difficulty in locating a sufficient number
of distracters, they frequently resort to the use of “all of the above” or “none of the
above” as the final option. These special alternatives are seldom used appropriately
and almost always render the item less effective than it would be without them.
The inclusion of “all of the above” as an option makes it possible to answer the item
based on partial information. Since students are to select only one answer, they can
detect “all of the above” as the correct choice simply by noting that two of the
alternatives are correct. They can also detect it as a wrong answer by recognizing that
at least one of the alternatives is incorrect; of course, their chance of guessing the
correct answer from the remaining choices then increases proportionally. Another
difficulty with this option is that some students, recognizing that the first choice is correct,
will select it without reading the remaining alternatives. Obviously, the use of “none of the
above” is not possible with the best-answer type of multiple-choice item since the alternatives
vary in appropriateness and the criterion of absolute correctness is not applicable. When used
as the right answer in a correct-answer type of item, this option may be measuring nothing
more than the ability to detect incorrect answers. Recognizing that certain answers are wrong
is no guarantee that the student knows what is correct. For example, a student may be able to
answer the following item correctly without being able to name the categories in the
taxonomy.
Example
Poor: Which of the following is a category in the revised taxonomy of the cognitive
domain?
a. Critical Thinking.
b. Scientific Thinking.
c. Reasoning Ability.
d. None of the above.*
13. Vary the position of the correct answer in a random manner. The correct answer should
appear in each alternative position about the same number of times, but its placement
should not follow a pattern that may be apparent to the person taking the test.
Sufficient variation without a discernible pattern might also be obtained by simply
placing the responses in alphabetical order, based on the first letter in each, and letting
the correct answer fall where it will.
When the alternative responses are numbers, they should always be listed in order
of size, preferably in ascending order. This will eliminate the possibility of a clue, such
as the correct answer being the only one that is not in numerical order.
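Rule 13 is mechanical enough to automate. The following sketch, a hypothetical helper rather than anything prescribed by the source, shuffles an item's alternatives while tracking the keyed answer, and leaves numeric alternatives in ascending order as the rule recommends.

```python
import random

def arrange_alternatives(alternatives, correct, rng=random):
    """Order an item's alternatives per the rule: numeric alternatives are
    listed in ascending order; all others are shuffled so the keyed answer
    falls in a random position with no discernible pattern."""
    if all(isinstance(a, (int, float)) for a in alternatives):
        ordered = sorted(alternatives)
    else:
        ordered = list(alternatives)
        rng.shuffle(ordered)
    return ordered, ordered.index(correct)  # alternatives plus key position

# Verbal alternatives: the keyed answer lands in a random position.
print(arrange_alternatives(
    ["course content", "student performance", "teacher behavior", "learning activities"],
    "student performance"))

# Numeric alternatives: always presented in ascending order.
print(arrange_alternatives([12, 4, 8, 16], 8))
```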
14. Control the difficulty of the item either by varying the problem in the stem or by
changing the alternatives. It is usually preferable to increase item difficulty by
increasing the level of knowledge called for by making the problem more complex.
However, it is also possible to increase the difficulty by making the alternatives more
homogeneous. When this is done, care must be taken that the finer discriminations
called for are educationally significant and are in harmony with the learning outcomes
to be measured.
15. Make certain each item is independent of the other items in the test. Occasionally
information given in the stem of one item will help the students answer another item.
This can be remedied easily by a careful review of the items before they are assembled
into a test. A different type of problem occurs when the correct answer to an item depends on
knowing the correct answer to the item preceding it. The student who is unable to
answer the first item, of course, has no basis for
responding to the second. Such chains of interlocking items should be avoided. Each
item should be an independently scorable unit.
16. Use an efficient item format. The alternatives should be listed in separate lines, under
one another, like the examples provided here. This makes the alternatives easy to read
and compare. It also contributes to ease of scoring since the letters of the alternatives
all appear on the left side of the paper.
17. Follow the normal rules of grammar. If the stem is in question form, begin each
alternative with a capital letter and end with a period or other appropriate punctuation
mark. Omit the period with numerical answers, however, to avoid confusion with
decimal points. When the stem is an incomplete statement, start each alternative with
a lowercase letter and end with whatever terminal punctuation mark is appropriate.
18. Break (or bend) any of these rules if it will improve the effectiveness of the items. These
rules for constructing multiple-choice items are stated rather dogmatically as an aid to
the beginner. As experience in item writing is obtained, situations are likely to occur
where ignoring or modifying a rule may be desirable.
True-False or Alternative-Response Test
In its most common form, the true-false item presents the student with a declarative
statement that must be judged true or false.
Example
T *F True-false items are classified as supply-type items.
In some cases, the student is asked to judge each statement as true or false, and then
to change the false statements so that they are true. When this is done, a portion of each
statement is underlined to indicate the part that can be changed. In the example given, for
instance, the words “supply type” would be underlined. The key parts of true statements,
of course, must also be underlined.
Another variation is the cluster-type true-false format. In this case, a series of items
is based on a common stem.
Example
Which of the following terms indicate observable student performance? Circle Y for yes
and N for no.
*Y N 1. Explains
*Y N 2. Identifies
Y *N 3. Learns
*Y N 4. Predicts
Y *N 5. Realizes
This cluster-type format is especially useful for replacing multiple-choice items that have
more than one correct answer. Such items are impossible to score satisfactorily. This is
avoided with the cluster-type item because it makes each alternative a separate scoring unit
of one point.
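Because each alternative in a cluster-type item is a separate one-point scoring unit, scoring reduces to counting matches against the key. A minimal sketch, reusing the key of the example above (the function name is ours):

```python
def score_cluster(key, responses):
    """Score a cluster-type item: each alternative is a one-point scoring unit."""
    return sum(1 for k, r in zip(key, responses) if k == r)

# Key for the cluster item above (Explains, Identifies, Learns, Predicts, Realizes).
key     = ["Y", "Y", "N", "Y", "N"]
student = ["Y", "Y", "Y", "Y", "N"]  # incorrectly marked "Learns" as observable
print(score_cluster(key, student))   # 4 of 5 points
```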
Strengths
1. The item is useful for outcomes where there are only two possible alternatives (e.g.,
fact or opinion, valid or invalid).
2. Less demand is placed on reading ability than in multiple-choice items.
3. A relatively large number of items can be answered in a typical testing period.
4. A complex outcome can be measured when used with interpretive exercises.
5. Scoring is easy, objective, and reliable.
Limitations
1. It is difficult to write items beyond the knowledge level that are free from ambiguity.
2. Making an item false provides no evidence that the student knows what is correct.
3. No diagnostic information is provided by the incorrect answers.
4. Scores are more influenced by guessing than with any other item type.
Rules for Writing True-False Items (Waugh & Gronlund, 2013)
The purpose of a true-false item, as with all item types, is to distinguish between
those who have and those who have not achieved the intended learning outcome. Achievers
should be able to select the correct alternative without difficulty, while nonachievers should
find the incorrect alternative at least as attractive as the correct one. The rules for writing
true-false items are directed towards this end.
1. Include only one central idea in each statement. The main point of the item should occupy
a prominent position in the statement. The true-false decision should not depend on some
subordinate point or trivial detail. The use of several ideas in each statement should
generally be avoided because these tend to be confusing, and the answer is more apt to
be influenced by reading ability than the intended outcome.
Example
Poor: T F* The true-false item, which is favored by test experts, is also called an
alternative-response item.
Better: T* F The true-false item is also called an alternative-response item.
The “poor” example must be marked false because test experts do not favor the true-
false item. Such subordinate points are easily overlooked when reading the item. If the point
is important, it should be included as the main idea in a separate item.
2. Keep the statement short and use simple vocabulary and sentence structure. A short,
simple statement will increase the likelihood that the point of the item is clear. All
students should be able to grasp what the statement is saying.
Example
Poor: T* F The true-false item is more subject to guessing but it should be
used in place of a multiple-choice item, if well-constructed,
when there is a dearth of plausible distracters.
Better: T* F The true-false item should be used in place of a multiple-choice item when only
two alternatives are possible.
3. Word the statement so precisely that it can unequivocally be judged true or false. True
statements should be true under all circumstances and yet free of qualifiers (“may,”
“possible,” and so on), which might provide clues. This requires the use of precise words
and the avoidance of such vague terms as “seldom,” “frequently,” and “often.” The same
care, of course, must also be given to false statements so that their falsity is not too
readily apparent from differences in wording.
Example
Poor: T F* Lengthening a test will increase its reliability.
Better: T* F Lengthening a test by adding items like those in the test will increase its
reliability.
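The qualifier "by adding items like those in the test" matters because the classical result behind the "better" statement is the Spearman-Brown formula, which is not given in the text but predicts the reliability of a test lengthened n times, under the assumption that the added items are parallel to the existing ones. A quick sketch:

```python
def spearman_brown(reliability, n):
    """Predicted reliability of a test lengthened n times with parallel items."""
    return n * reliability / (1 + (n - 1) * reliability)

# Doubling a test whose reliability is .60 predicts a reliability of .75.
print(round(spearman_brown(0.60, 2), 2))  # 0.75
```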
4. Use negative statements sparingly and avoid double negatives. The “no” and/or “not” in
negative statements are frequently overlooked and are read as positive statements. Thus,
negative statements should be used only when the learning outcome requires it (e.g., in
avoiding a harmful practice), and then the negative words should be emphasized by
underlining or by using capital letters. Statements including double negatives tend to be
so confusing that they should be restated in positive form.
Example
Poor: T* F Correction-for-guessing is not a practice that should never be used in testing.
Better: T* F Correction-for-guessing is a practice that should sometimes be used in testing.
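The text does not define correction-for-guessing, but the conventional formula is S = R − W/(k − 1), where R is the number of right answers, W the number of wrong answers, and k the number of choices per item; omitted items are counted neither right nor wrong. A brief sketch:

```python
def corrected_score(rights, wrongs, choices):
    """Conventional correction for guessing: R - W / (k - 1).
    Omitted (blank) items are counted neither right nor wrong."""
    return rights - wrongs / (choices - 1)

# A 50-item true-false test (k = 2): 40 right, 10 wrong.
print(corrected_score(40, 10, 2))  # 30.0
```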
5. Attribute statements of opinion to some source, unless the ability to identify opinions is
being specifically measured. A statement of opinion, by itself, cannot be judged true or false;
attributing it to a source turns the item into a measure of how well the student knows the
beliefs or values of that source.
Example
Poor: T F Testing should play a major role in the teaching-learning process.
Better: T* F Gronlund believes that testing should play a major role in the teaching-learning
process.
In some cases, it is useful to use a series of opinion statements that pertain to the same
individual or organization. This permits a more comprehensive measure of how well the
student understands a belief or value system.
Example
Would the author of your textbook agree or disagree with the following statements?
Circle A for agree, D for disagree.
A* D 1. The first step in achievement testing is to state the intended learning outcomes in
performance terms.
A D* 2. True-false tests are superior to multiple-choice tests for measuring achievement.
Using about 10 items like those listed here would provide a good indication of the
students’ grasp of the author’s point of view. Another valuable use of opinion statements
is to ask students to distinguish between statements of fact and statements of opinion.
Example
Read each of the following statements and circle F if it is a fact and circle O if it is an
opinion.
F* O 1. The true-false item is a selection-type item.
F O* 2. The true-false item is difficult to construct.
F O* 3. The true-false item encourages student guessing.
F* O 4. The true-false item can be scored objectively.
In addition to illustrating the use of opinion statements in test items, the last two
examples illustrate variations from the typical true-false format. These are more logically
called alternative-response items.
6. When cause-effect relationships are being measured, use only true propositions. The
true-false item can be used to measure the “ability to identify cause-effect relationships,”
and this is an important aspect of understanding. When used for this purpose, both
propositions should be true and only the relationship between them judged true or false.
Example
Poor: T F* True-false items are classified as objective items because students must supply the
answer.
Better: T F* True-false items are classified as objective items because there are only two
possible answers.
7. Avoid extraneous clues to the answer. There are several specific determiners that provide
verbal clues to the truth or falsity of an item. Statements that include such absolutes as
“always,” “never,” “all,” “none,” and “only” tend to be false; statements with qualifiers
such as “usually,” “may,” and “sometimes” tend to be true. Either these verbal clues
must be eliminated from the statements, or their use must be balanced between true items
and false items.
Example
Poor: T F* A statement of opinion should never be used in a true-false item.
Poor: T* F A statement of opinion may be used in a true-false item.
Better: T* F A statement of opinion, by itself, cannot be marked true or false.
The length and complexity of the statement might also provide a clue. True statements
tend to be longer and more complex than false ones because of their need for qualifiers.
Thus, a special effort should be made to equalize true and false statements in these respects.
A tendency to use a disproportionate number of true statements, or false statements,
might also be detected and used as a clue. Having approximately, but not exactly, an equal
number of each seems to be the best solution. When assembling the test, it is, of course,
also necessary to avoid placing the correct answers in some discernible pattern (for
instance, T, F, T, F). Random placement will eliminate this possible clue.
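Both recommendations, approximate balance and random placement, can be satisfied by generating the answer key at random and rejecting any key that is too lopsided. A small sketch (the tolerance of two items is an illustrative choice, not a rule from the source):

```python
import random

def random_tf_key(n_items, max_imbalance=2, rng=random):
    """Generate a random T/F answer key whose counts of T and F are
    approximately, but not necessarily exactly, equal."""
    while True:
        key = [rng.choice("TF") for _ in range(n_items)]
        if abs(key.count("T") - key.count("F")) <= max_imbalance:
            return key

print("".join(random_tf_key(20)))  # e.g., TFFTTFTFFTTFTTFFTFTF
```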
8. Base items on introductory material to measure more complex learning outcomes. True-
false or alternative-response items are frequently used in interpreting written materials,
tables, graphs, maps, or pictures. The use of introductory material makes it possible to
measure various types of complex learning outcomes.
Matching Test
The matching item is simply a variation of the multiple-choice form. A good practice
is to switch to the matching format only when it becomes apparent that the same alternatives
are being repeated in several multiple-choice items.
Below are the strengths and limitations of matching items.
Strengths
1. A compact and efficient form is provided where the same set of responses fit a series
of item stems (i.e., premises).
2. Reading and response time are short.
3. This item type is easily constructed if converted from multiple-choice items having a
common set of alternatives.
4. Scoring is easy, objective, and reliable.
Limitations
1. This item type is largely restricted to simple knowledge outcomes based on
association.
2. It is difficult to construct items that contain enough homogeneous responses.
3. Susceptibility to irrelevant clues is greater than in other item types.
Example
Which test item is least useful for educational diagnosis?
a. Multiple-choice item.
b. True-false item.*
c. Short-answer item.
Example
Directions: Column A contains a list of characteristics of test items. On the line to the
left of each statement, write the letter of the test item in Column B that best fits the
statement. Each response in Column B may be used once, more than once, or not at all.
Column A (characteristics of test items)                    Column B (types of test items)
Rules for Writing Matching Items (Waugh & Gronlund, 2013)
A good matching item should function the same as a series of multiple-choice items.
As each premise is considered, all the responses should serve as plausible alternatives. The
rules for item writing are directed toward this end.
1. Include only homogenous material in each matching item. In our earlier example of a
matching item, we included only types of test items and their characteristics. Similarly,
an item might include only authors and their works, inventors and their inventions,
scientists and their discoveries, or historical events and their dates. This homogeneity is
necessary if all responses are to serve as plausible alternatives (see earlier example).
2. Keep the list of items short and place the brief responses on the right. A short list of
items (say fewer than 10) will save reading time, make it easier for the student to locate
the answer, and increase the likelihood that the responses will be homogeneous and
plausible. Placing the brief responses on the right also saves reading time.
3. Use a larger, or smaller, number of responses than premises, and permit the responses
to be used more than once. Both an uneven match and the possibility of using each
response more than once reduce the guessing factor. As we noted earlier, proper use of
the matching form requires that all responses be plausible alternatives for each premise.
This, of course, dictates that each response be eligible for use more than once.
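The guessing arithmetic behind this rule can be made concrete. With five premises matched one-to-one to exactly five responses, blind guessing amounts to a random permutation and still yields one correct match on average; with seven reusable responses, each premise becomes an independent guess with probability 1/7, for an expected score of only 5/7. The simulation below, an illustration rather than part of the source, confirms both figures:

```python
import random

def expected_correct(n_premises, n_responses, reuse, trials=100_000, rng=random):
    """Estimate the expected number of correct matches under blind guessing."""
    key = list(range(n_premises))  # answer key: premise i matches response i
    total = 0
    for _ in range(trials):
        if reuse:   # responses may be used more than once: independent guesses
            guess = [rng.randrange(n_responses) for _ in range(n_premises)]
        else:       # one-to-one match: the guesses form a random permutation
            guess = rng.sample(range(n_responses), n_premises)
        total += sum(g == k for g, k in zip(guess, key))
    return total / trials

print(expected_correct(5, 5, reuse=False))  # ~1.0 correct by chance
print(expected_correct(5, 7, reuse=True))   # ~0.71 correct by chance
```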
4. Place the responses in alphabetical or numerical order. This will make the selection of
the responses easier and avoid possible clues due to placement.
5. Specify in the directions the basis for matching and indicate that each response may be
used once, more than once, or not at all. This will clarify the task for all students and
prevent any misunderstanding. Take care, however, not to make the directions too long
and involved. The previous example illustrates adequate detail for directions.
6. Put all the matching items on the same page. This will prevent the distraction of flipping
pages back and forth and prevent students from overlooking responses on another page.
Short-Answer Test
The short-answer item requires the student to supply a word, phrase, number, or symbol in
response to a direct question or an incomplete statement. This item type also includes
computational problems and any other simple item form that requires supplying the answer
rather than selecting it. Except for its use in computational problems, the short-answer item
is used primarily to measure the simple recall of knowledge.
The short-answer item appears to be easy to write and use, but there are two major
problems in constructing short-answer items. First, it is extremely difficult to phrase the
question or incomplete statement so that only one answer is correct. Second, there is the
problem of spelling. If credit is given only when the answer is spelled correctly, the poor
spellers will be prevented from showing their true level of achievement, and the test scores
will become an uninterpretable mixture of knowledge and spelling skills.
Below are the strengths and limitations of short-answer items.
Strengths
1. It is easy to write test items.
2. Guessing is less likely than in selection-type items.
3. This item type is well suited to computational problems and other learning outcomes
where supplying the answer is important.
4. A broad range of knowledge outcomes can be measured.
Limitations
1. It is difficult to phrase statements so that only one answer is correct.
2. Scoring is contaminated by spelling ability when responses are verbal.
3. Scoring is tedious and time-consuming.
4. This item type is not very adaptable to measuring complex learning outcomes.
Rules for Writing Short-Answer Items (Waugh & Gronlund, 2013)
1. State the item so that only a single, brief answer is possible. This requires great skill in
phrasing and the use of precise terms. What appears to be a simple, clear question to the
test maker can frequently be answered in many ways.
2. Start with a direct question and switch to an incomplete statement only when greater
conciseness is possible by doing so. The use of a direct question increases the likelihood
that the problem will be stated clearly and that only one answer will be appropriate.
Also, incomplete statements tend to be less ambiguous when they are based on problems
that were first stated in question form.
Example
What is another name for true-false items? (alternative-response items)
True-false items are also called (alternative-response items)
In some cases, it is best to leave the item in question form. This may make the item clearer,
especially to younger students.
3. It is best to leave only one blank, and it should relate to the main point of the statement.
Leaving several blanks to be filled in is often confusing and the answer to one blank
may depend on the answer in another.
Example
Poor: In terms of the type of response, the (matching) item is most like the (multiple-choice)
item.
Better: In terms of the type of responses, which item is most like the matching item?
(multiple choice)
In the “poor” version, several different responses would have to be given credit,
such as “short answer,” “essay,” “true-false,” and “multiple choice.” Obviously,
the item would not function as originally intended. It is also important to avoid asking
students to respond to unimportant or minor aspects of a statement. Focus on the main idea of
the item and leave a blank only for the key response.
4. Place the blanks at the end of the statement. This permits the student to read the complete
problem before coming to the blank to be filled. With this procedure, confusion and
rereading of the item are avoided, and scoring is simplified. Constructing incomplete
statements with blanks at the end is more easily accomplished when the item is first
stated as a direct question, as suggested earlier. In some cases, it may be a matter of
rewording the item and changing the response to be made.
Example
Poor: (Reliability) is likely to increase when a test is lengthened.
Better: When a test is lengthened, reliability is likely to (increase).
With this particular item, the “better” version also provides a more clearly focused
item. The “poor” version could be answered by “validity,” “time for testing,” “fatigue,”
and other unintended but clearly correct responses. This again illustrates the great care
needed in phrasing short-answer items.
5. Avoid extraneous clues to the answer. One of the most common clues in short-answer items
is the length of the blank. If a long blank is used for a long word and a short blank for a short
word, this is obviously a clue. Thus, all blanks should be uniform in length. Another common clue
is the use of the indefinite article “a” or “an” just before the blank. It sometimes gives
away the answer or at least rules out some possible incorrect answers.
Example
Poor: The supply item used to measure the ability to organize and integrate material
is called an (essay item).
Better: Supply-type items used to measure the ability to organize and integrate
material are called (essay items).
The “poor” version rules out “short-answer item,” the only other supply-type item,
because it does not follow the article “an.” One solution is to include both articles, using
a(an). Another solution is to eliminate the article by switching to plural, as shown in the
“better” version.
6. For numerical answers, indicate the degree of precision expected and the units in which
they are to be expressed. Indicating the degree of precision (e.g., to the nearest whole number)
will clarify the task for students and prevent them from spending more time on an item
than is required. Indicating the units in which to express the answer will aid scoring by
providing a more uniform set of responses (e.g., minutes rather than fractions of an
hour). When the learning outcome requires knowing the type of unit in common use and
the degree of precision expected, this rule must then be disregarded.
Essay Test
The essay question requires students to select, organize, and express their own ideas in an
extended written response, as the following examples illustrate.
Example
Describe the relative merits of selection-type test items and essay questions for measuring
learning outcomes at the understanding level. Confine your answers to one page.
Example
Mr. Rogers, a ninth-grade science teacher, wants to measure his students’ “ability to
interpret scientific data” with a paper-and-pencil test.
Example
Evaluation Outcome: (The student is given a complete achievement test that includes
errors or flaws in the directions, in the test items, and in the
arrangement of the items.) Write a critical evaluation of this test
using as evaluative criteria the rules and standards for test
construction described in your textbook. Include a detailed
analysis of the test’s strengths and weaknesses and an evaluation
of its overall quality and probable effectiveness.
The following are the strengths and limitations of essay questions:
Strengths
1. The highest-level learning outcomes (analyzing, evaluating, creating) can be
measured.
2. Preparation time is less than that for selection-type items.
3. The integration and application of ideas are emphasized.
Limitations
1. There is an inadequate sampling of achievement due to the time needed for answering
each question.
2. It is difficult to relate to intended learning outcomes because of the freedom to select,
organize, and express ideas.
3. Scores are raised by writing skill and bluffing and lowered by poor handwriting,
misspelling, and grammatical errors.
4. Scoring is time-consuming and subjective and tends to be unreliable.
Rules for Writing Essay Questions (Waugh & Gronlund, 2013)
1. Use essay questions to measure complex learning outcomes only. Most recall-of-
knowledge outcomes profit little from being measured by essay questions. These
outcomes can usually be measured more effectively by objective items that lack the
sampling and scoring problems that essay questions introduce. There may be a few
exceptions, as when supplying the answer is a basic part of the learning outcome. But for
most recall-of-knowledge outcomes, essay questions simply provide a less reliable
measure with no compensating benefits.
2. Relate the questions as directly as possible to the learning outcomes being measured.
Essay questions will not measure complex learning outcomes unless they are carefully
constructed to do so. Each question should be specifically designed to measure one or
more well-defined outcomes. Thus, the place to start, as is the case with objective items,
is with a precise description of the performance to be measured. This will help determine
both the content and form of the item and will aid in the phrasing of it.
4. Do not permit a choice of questions unless the learning outcome requires it. In most tests
of achievement, it is best to have all students answer the same questions. If they are
permitted to write on only a fraction of the questions, such as three out of five, their
answers cannot be evaluated on a comparative basis. Also, since the students will tend
to choose those questions they are best prepared to answer, their responses will provide
a sample of their achievement that is less representative than that obtained without
optional questions. As discussed earlier, one of the major limitations of the essay test is
the limited and unrepresentative sampling it provides. Giving students a choice among
questions simply complicates the sampling problem further and introduces greater
distortion into the test results. In some situations, the use of optional questions might be
defensible. For example, if the essay is to be used as a measure of writing skill only,
some choice of topics on which to write may be desirable. This might also be the case
if the essay is used to measure some aspects of creativity, or if the students have pursued
individual interests through independent study.
5. Provide ample time for answering and suggest a time limit on each question. Since essay
questions are designed most frequently to measure intellectual skills and abilities, time
must be allowed for thinking as well as for writing. Thus, generous time limits should
be provided. Informing students of the appropriate amount of time they should spend on
each question will help them use their time efficiently; ideally, it will also provide a
more adequate sample of achievement. If the length of the answer is not clearly defined
by the problem, as in some extended-response questions, it might also be desirable to
indicate page limits.
Rules for Scoring Essay Answers (Waugh & Gronlund, 2013)
1. Evaluate answers to essay questions in terms of the learning outcomes being measured.
The essay test, like the objective test, is used to obtain evidence concerning the extent
to which clearly defined learning outcomes have been achieved. Thus, the desired
student performance specified in these outcomes should serve as a guide both for
constructing the question and for evaluating the answers.
2. Score restricted-response answers by the point method, using the model answer as a
guide. Scoring with the aid of a previously prepared scoring key is possible with the
restricted-response item because of the limitations placed on the answer. The procedure
involves writing a model answer to each question and determining the number of points
to be assigned to it and the parts within it. The distribution of points within an answer
must, of course, consider all scorable units indicated in the learning outcomes being
measured.
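In practice, the point method amounts to distributing the total points over the scorable units of the model answer and crediting each unit the reader finds. A minimal sketch of such a scoring key follows; the question, units, and point values are invented for illustration.

```python
# Hypothetical scoring key for one restricted-response question worth 10 points.
scoring_key = {
    "defines reliability as consistency of scores": 3,
    "names one factor that increases reliability (e.g., test length)": 3,
    "explains why objective scoring improves reliability": 4,
}

def point_score(scoring_key, units_credited):
    """Total the points for the scorable units the reader credited."""
    return sum(scoring_key[unit] for unit in units_credited)

print(point_score(scoring_key, [
    "defines reliability as consistency of scores",
    "explains why objective scoring improves reliability",
]))  # 7 of 10 points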
5. Evaluate answers to essay questions without knowing the identity of the writer. This is
another way to control personal bias during scoring. Answers to essay questions should
be evaluated in terms of what is written, not in terms of what is known about the writers
from other contacts with them. The best way to prevent prior knowledge from biasing
our judgment is to evaluate each answer without knowing the identity of the writer.
6. Whenever possible, have two or more persons grade each answer. The best way to check
on the reliability of the scoring of essay answers is to obtain two or more independent
judgments. Although this may not be a feasible practice for routine classroom testing, it
might be done periodically with a fellow teacher (one who is equally competent in the
area).
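One simple way to act on this rule is to correlate the two readers' scores across a set of papers: a high correlation supports the reliability of the scoring, while a low one signals that the scoring procedure needs tightening. A short sketch with invented scores:

```python
from statistics import correlation  # available in Python 3.10+

# Hypothetical scores assigned independently by two readers to six essays.
reader_a = [18, 14, 20, 9, 15, 12]
reader_b = [17, 15, 19, 11, 14, 13]

# A high correlation between readers suggests the scoring is reliable;
# a low one signals that the scoring rules need to be tightened.
print(round(correlation(reader_a, reader_b), 2))
```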