11 Testing reading

This chapter begins by considering how we should specify what candidates
can be expected to do, and then goes on to make suggestions for setting
appropriate test tasks.

Specifying what the candidate should be able to do


Operations
The testing of reading ability seems deceptively straightforward when it
is compared to, say, the testing of speaking ability. You take a passage, ask
some questions about it, and there you are. But while it is true that you
can very quickly construct a reading test, it may not be a very good test,
and it may not measure what you want it to measure.
The basic problem is that the exercise of receptive skills does not
necessarily, or usually, manifest itself directly in overt behaviour. When
people write and speak, we see and hear; when they read and listen, there
will often be nothing to observe. The challenge for the language tester is
to set tasks which will not only cause the candidate to exercise reading (or
listening) skills, but will also result in behaviour that will demonstrate the
successful use of those skills. There are two parts to this problem. First,
there is uncertainty about the skills which may be involved in reading and
which, for various reasons, language testers are interested in measuring;
many have been hypothesised but few have been unequivocally
demonstrated to exist. Second, even if we believe in the existence of a
particular skill, it is still difficult to know whether an item has succeeded
in measuring it.
The proper response to this problem is not to resort to the simplistic
approach to the testing of reading outlined in the first paragraph, while we
wait for confirmation that the skills we think exist actually do. We believe
these skills exist because we are readers ourselves and are aware of at least
some of them. We know that, depending on our purpose in reading and the
kind of text we are dealing with, we may read in quite different ways. On
one occasion we may read slowly and carefully, word by word, to follow,
say, a philosophical argument. Another time we may flit from page to page,
pausing only a few seconds on each, to get the gist of something. At yet
another time we may look quickly down a column of text, searching for
a particular piece of information. There is little doubt that accomplished
readers are skilled in adapting the way they read according to purpose and
text. This being so, we see no difficulty in including these different kinds
of reading in the specifications of a test.
https://doi.org/10.1017/9781009024723.011 Published online by Cambridge University Press
If we reflect on our reading, we become conscious of other skills we have.
Few of us will know the meaning of every word we ever meet, yet we can
often infer the meaning of a word from its context. Similarly, as we read,
we are continually making inferences about people, things and events. If,
for example, we read that someone has spent an evening in a pub and that
he then staggers home, we may infer that he staggers because of what he
has drunk. (We realise that he could have been an innocent footballer who
had been kicked on the ankle in a match and then gone to the pub to drink
lemonade, but we didn’t say that all our inferences were correct.)
It would not be helpful to continue giving examples of the reading skills
we know we have. The point is that we do know they exist. The fact that
not all of them have had their existence confirmed by research is not a
reason to exclude them from our specifications, and thereby from our
tests. The question is: Will it be useful to include them in our test? The
answer might be thought to depend at least to some extent on the purpose
of the test. If it is a diagnostic test which attempts to identify in detail
the strengths and weaknesses in learners’ reading abilities, the answer is
certainly yes. If it is an achievement test, where the development of these
skills is an objective of the course, the answer must again be yes. If it is a
placement test, where a rough-and-ready indication of reading ability is
enough, or a proficiency test where an ‘overall’ measure of reading ability
is sufficient, one might expect the answer to be no. But the answer ‘no’
invites a further question. If we are not going to test these skills, what
are we going to test? Each of the questions that were referred to in the
first paragraph must be testing something. If our items are going to test
something, surely on grounds of validity, in a test of overall ability, we
should try to test a sample of all the skills that are involved in reading and
are relevant to our purpose. This is what we would recommend.
Of course, the weasel words in the previous sentence are ‘relevant to
our purpose’. For beginners, there may be an argument for including
in a diagnostic test items which test the ability to distinguish between
letters (e.g. between b and d). But normally this ability will be tested
indirectly through higher-level items. The same is true for grammar and
vocabulary. They are both tested indirectly in every reading test, but the
place for grammar and vocabulary items is, we would say, in grammar
and vocabulary tests. For that reason we will not discuss them further in
this chapter.
To be consistent with our general framework for specifications, we will
refer to the skills that readers perform when reading a text as operations.
In the boxes that follow are checklists (not meant to be exhaustive) which
it is thought the reader of this book may find useful. Note the distinction,
based on differences of purpose, between expeditious (quick and efficient)
reading and slow and careful reading. There has been a tendency in the
past for expeditious reading to be given less prominence in tests than it
deserves. The backwash effect of this is that many students have not been
trained to read quickly and efficiently. This is a considerable disadvantage
when, for example, they study overseas and are expected to read
extensively in very limited periods of time. Another example of harmful
backwash!

EXPEDITIOUS READING OPERATIONS


Surveying
The candidate can decide the relevance of a text (or part of a text) to their
needs by looking at the author, sub-headings, graphics, etc.
Skimming
The candidate can:
• obtain main ideas and discourse topics quickly and efficiently;
• establish quickly the structure of a text.
Search reading
The candidate can quickly find information on a predetermined topic.
Scanning
The candidate can quickly find:
• specific words or phrases;
• figures, percentages;
• specific items in an index;
• specific names in a bibliography or a set of references.

Note that any serious testing of expeditious reading will require candidates
to respond to items without having time to read the full contents of a
passage.

CAREFUL READING OPERATIONS


• identify pronominal reference;
• identify discourse markers;
• interpret complex sentences;
• interpret topic sentences;
• outline logical organisation of a text;
• outline the development of an argument;
• distinguish general statements from examples;
• identify explicitly stated main ideas;
• identify implicitly stated main ideas;
• recognise writer’s intention;
• recognise the attitudes and emotions of the writer;
• identify addressee or audience for a text;
• identify what kind of text is involved (e.g. editorial, diary, etc.);
• distinguish fact from opinion;
• distinguish hypothesis from fact;
• distinguish fact from rumour or hearsay.

Make inferences:
• infer the meaning of an unknown word from context;
• make propositional informational inferences, answering questions
beginning with who, when, what;
• make propositional explanatory inferences concerned with motivation,
cause, consequence and enablement, answering questions beginning
with why, how;
• make pragmatic inferences.

The different kinds of inference described above deserve comment.
Propositional inferences are those which do not depend on information
from outside the text. For example, if John is Mary’s brother, we can
infer that Mary is John’s sister (if it is also clear from the text that Mary
is female). Another example: if we read the following, we can infer that
Harry was working at her studies, not at the fish and chip shop:
‘Harry worked as hard as she had ever done in her life. When the exam
results came out, nobody was surprised that she came top of the class.’
Pragmatic inferences are those where we have to combine information
from the text with knowledge from outside the text. We may read, for
example: ‘It took them twenty minutes by road to get from Reading to
Heathrow Airport.’ In order to infer that they travelled very quickly, we
have to know that Reading and Heathrow Airport are not close by each other.
The fact that many readers will not know this allows us to make the point
that where the ability to make pragmatic inferences is to be tested, the
knowledge that is needed from outside the text must be knowledge which
all the candidates can be assumed to have.¹

Texts
Texts that candidates are expected to be able to deal with can be specified
along a number of parameters: type, form, graphic features, topic, style,
intended readership, length, readability or difficulty, range of vocabulary
and grammatical structure.

1. It has to be admitted that the distinction between propositional and pragmatic inferences
is not watertight. In a sense all inferences are pragmatic: even being able to infer, say, that a
man born in 1941 will have his ninetieth birthday in 2031 (if he lives that long) depends on
knowledge of arithmetic, it could be argued. However, the distinction remains useful when
we are constructing reading test items. Competent readers integrate information from the text
into their knowledge of the world.

Text types include: textbooks, handouts, articles (in newspapers, journals
or magazines), poems/verse, encyclopaedia entries, text messages, tweets,
dictionary entries, web pages, blogs, leaflets, letters, forms, diary entries,
maps or plans, advertisements, postcards, social media posts, timetables,
novels (extracts) and short stories, reviews, manuals, computer Help
systems, notices and signs.
Text forms include: description, exposition, argumentation, instruction,
narration. (These can be broken down further if it is thought appropriate:
e.g. expository texts could include outlines, summaries, etc.)
Graphic features include: tables, charts, diagrams, cartoons, illustrations,
infographics.
Topics may be listed or defined in a general way (such as non-technical,
non-specialist) or in relation to a set of candidates whose background is
known (such as those familiar to the students).
Style may be specified in terms of formality.
Intended readership can be quite specific (e.g. expert-speaking science
undergraduate students) or more general (e.g. young expert speakers).
Length is usually expressed in number of words. The specified length will
normally vary according to the level of the candidates and whether one is
testing expeditious or careful reading (although a single long text could be
used for both).
Readability is an objective, but not necessarily very valid, measure of
the difficulty of a text. Where this is not used, expert judgements may be
relied on.
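No particular readability formula is named here. Purely as an illustration, the sketch below computes the Flesch Reading Ease score, one widely used formula based on average sentence length and average syllables per word. The function name and the sample counts are hypothetical, and the word, sentence and syllable counts are assumed to have been obtained separately.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Return the Flesch Reading Ease score; higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences
    average_syllables_per_word = syllables / words
    return 206.835 - 1.015 * average_sentence_length - 84.6 * average_syllables_per_word


# Hypothetical counts for a 100-word passage (assumed, not taken from any real text).
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1))
```

A score of this kind is objective in the sense that anyone applying the formula to the same counts gets the same figure; whether it reflects the difficulty candidates actually experience is a separate question, which is why expert judgement remains an alternative.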
Range of vocabulary may be indicated by a complete list of words
(as for the Cambridge tests for young learners), or by reference either to a
word list or to indications of frequency in a learners’ dictionary. The free
online resource English Vocabulary Profile (EVP) is particularly useful here.
Range may be expressed more generally (e.g. non-technical, except where
explained in the text).
Range of grammar may be a list of structures, or a reference to those to
be found in a course book or (possibly parts of) a grammar of the language.
The reason for specifying texts in such detail is that we want the texts
included in a test to be representative of the texts candidates should be
able to read successfully. This is partly a matter of content validity but also
relates to backwash. The appearance in the test of only a limited range of
texts will encourage the reading of a narrow range by potential candidates.
It is worth mentioning authenticity at this point. Whether or not authentic
texts (intended for expert speakers) are to be used will depend at least in
part on what the items based on them are intended to measure.

Speed

Reading speed may be expressed in words per minute. Different speeds
will be expected for careful and expeditious reading. In the case of the
latter, the candidate is, of course, not expected to read all of the words.
The expected speed of reading will combine with the number and
difficulty of items to determine the amount of time needed for the test,
or part of it. While research has suggested that 250 words per minute
is a reasonable target reading speed for fluent second language reading,
expectations for particular groups of learners will vary according to their
general level of proficiency, the nature of the text and the tasks which
they are asked to perform. Observation of learners reading texts is the best
guide to setting a reading speed.
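How reading speed combines with the number of items to give a time allowance can be sketched as simple arithmetic. This is an illustration only: the function name, the 750-word passage and the allowance of 45 seconds per item are assumptions made for the example. Only the figure of 250 words per minute comes from the text above.

```python
import math


def estimated_minutes(passage_words: int, words_per_minute: int,
                      n_items: int, seconds_per_item: int) -> int:
    """Estimate the time needed for one passage and its items, rounded up."""
    reading_minutes = passage_words / words_per_minute
    answering_minutes = n_items * seconds_per_item / 60
    return math.ceil(reading_minutes + answering_minutes)


# Assumed values: a 750-word passage read at 250 wpm, with 10 items at 45 seconds each.
print(estimated_minutes(750, 250, 10, 45))  # 3 + 7.5 minutes, rounded up to 11
```

In practice the per-item allowance would itself be set by observing learners, in the same way as the reading speed.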

Criterial level of performance


In norm-referenced testing our interest is in seeing how candidates
perform by comparison with each other. There is no need to specify
criterial levels of performance before tests are constructed, or even
before they are administered. This book, however, encourages a broad
criterion-referenced approach to language testing. In the case of the testing
of writing, as we saw in Chapter 9, it is possible to describe levels of
writing ability that candidates have to attain. While this would not satisfy
everyone’s definition of criterion-referencing, it is very much in the spirit
of that form of testing, and would promise to bring the benefits claimed for
criterion-referenced testing.
Setting criterial levels for receptive skills is more problematical. Traditional
pass marks expressed in percentages (40 percent? 50 percent? 60 percent?)
are hardly helpful, since there seems no way of providing a direct
interpretation of such a score. To our minds, the best way to proceed is to
use the test tasks themselves to define the level. All of the items (and so
the tasks that they require the candidate to perform) should be within the
capabilities of anyone to whom we are prepared to give a pass. In other
words, in order to pass, a candidate should be expected, in principle, to
score 100 percent. But since we know that human performance is not so
reliable, we can set the actual cutting point rather lower, say at the 80
percent level. In order to distinguish between candidates of different levels
of ability, more than one test may be required.
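The cutting point described above amounts to a simple pass/fail rule. The sketch below is a minimal illustration: the function name and the 25-item test are hypothetical, and the 80 percent default is the example figure suggested in the text rather than a fixed standard.

```python
def has_passed(correct: int, total_items: int, cut_percent: int = 80) -> bool:
    """Return True if the candidate's score reaches the cutting point."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    # Integer comparison avoids rounding problems with percentages.
    return correct * 100 >= cut_percent * total_items


# Assumed example: a 25-item test with the cutting point at 80 percent (20 items).
print(has_passed(20, 25))  # True
print(has_passed(19, 25))  # False
```

The rule only makes sense if every item is within the capabilities of a candidate who deserves to pass; the allowance below 100 percent is for unreliability in performance, not for items that are too hard.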
As part of the development (and validation) of a reading test, one might
wish to compare performance on the test with the rating of candidates’
reading ability using scales like those of ACTFL or the ILR. This would
be most appropriate where performance in the productive skills is being
assessed according to those scales and some equivalence between tests of
the different skills is being sought. Similarly, performance on the test may
be compared with candidates’ ability assessed in terms of CEFR/ALTE ‘Can
do’ statements.

Setting the tasks

Selecting texts
Successful choice of texts depends ultimately on experience, judgement
and a certain amount of common sense. Clearly these are not qualities that
a handbook can provide; practice is necessary. It is nevertheless possible to
offer useful advice. While these points may seem rather obvious, they are
often overlooked.
1. Keep specifications constantly in mind and try to select as
representative a sample as possible. Do not repeatedly select texts of a
particular kind simply because they are readily available.
2. Choose texts of appropriate lengths. Expeditious reading tests may call
for passages of up to 2,000 words or more. Detailed reading can be
tested using passages of just a few sentences.
3. In order to obtain both content validity and acceptable reliability,
include as many passages as possible in a test, thereby giving
candidates a good number of fresh starts. Considerations of practicality
will inevitably impose constraints on this, especially where scanning or
skimming is to be tested.
4. In order to test search reading, look for passages which contain plenty
of discrete pieces of information.
5. For scanning, find texts which have the specified elements that have to
be scanned for.
6. To test the ability to quickly establish the structure of a text, make sure
that the text has a clearly recognisable structure. (It’s surprising how
many texts lack this quality.)
7. Choose texts that will interest candidates but which will not over-excite
or disturb them. A text about cancer, for example, is almost certainly
going to be distressing to some candidates.
8. Avoid texts made up of information that may be part of candidates’
general knowledge. It may be difficult not to write items to which
correct responses are available to some candidates without reading the
passage. On a reading test we encountered once, one of us was able to
answer eight out of 11 items without reading the text on which they
were based. The topic of the text was rust in cars, an area in which we
had had extensive experience.
9. Assuming that it is only reading ability that is being tested, do not
choose texts that are too culturally laden.

10. Do not use texts that students have already read (or even close
approximations to them). This happens surprisingly often.

Writing items
The aim must be to write items that will measure the ability in which we
are interested, that will elicit reliable behaviour from candidates, and that
will permit highly reliable scoring. Since the act of reading does not in
itself demonstrate its successful performance, we need to set tasks that will
involve candidates in providing evidence of successful reading.

Possible techniques
It is important that the techniques used should interfere as little as possible
with the reading itself, and that they should not add a significantly difficult
task on top of reading. This is one reason for being wary of requiring
candidates to write answers, particularly in the language of the text. They
may read perfectly well but difficulties in writing may prevent them
demonstrating this. Possible solutions to this problem include:
Multiple choice
The candidate provides evidence of successful reading by making a mark
against one out of a number of alternatives. The superficial attraction
of this technique is outweighed in institutional testing by the various
problems enumerated in Chapter 8. This is true whether the alternative
responses are written or take the form of illustrations, as in the following:
Choose the picture (A, B, C or D) that the following sentence describes:
The man with the child was shouted at by the woman on the bike.

[Four pictures, labelled A, B, C and D]

It has already been pointed out that True/False items, which are to be
found in many tests, are simply a variety of multiple choice, with only one
distractor and a 50 percent probability of choosing the correct response by
chance! Having a ‘not applicable’ or ‘we don’t know’ category adds a second
‘distractor’ and reduces the likelihood of guessing correctly to 33 percent.
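The guessing probabilities mentioned above follow directly from the number of options. As a small sketch (the function names and the 20-item test are hypothetical, and blind random guessing is assumed):

```python
from fractions import Fraction


def chance_of_correct_guess(n_options: int) -> Fraction:
    """Probability of picking the correct option by blind guessing."""
    return Fraction(1, n_options)


def expected_chance_score(n_items: int, n_options: int) -> Fraction:
    """Expected number of items answered correctly by guessing alone."""
    return n_items * chance_of_correct_guess(n_options)


print(chance_of_correct_guess(2))    # True/False: 1/2
print(chance_of_correct_guess(3))    # with a third category: 1/3
print(expected_chance_score(20, 2))  # 10 of 20 True/False items by chance
```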
Short answer
The best short answer questions are those with a unique correct response,
for example:
In which city do the people described in the ‘Urban Villagers’ live?
to which there is only one possible correct response, e.g. Bombay.
The response may be a single word or something slightly longer (e.g. China
and Japan; American women).
The short answer technique works well for testing the ability to identify
referents. An example (based on the newspaper article about the re-
creation of ancient foods on page 152) is:
What does the word ‘she’ (line 53) refer to?
Care has to be taken that the precise referent is to be found in the text. It
may be necessary on occasion to change the text slightly for this condition
to be met.
The technique also works well for testing the ability to predict the
meaning of unknown words from context. An example (also based on the
ancient foods article) is:
Find a single word in the passage (between lines 10 and 20) which has
the same meaning as ‘minute opening or passage’. (The word in the
passage may have an ending like -s, -tion, -ing, -ed, etc.)
The short answer technique can be used to test the ability to make various
distinctions, such as that between fact and opinion. For example:
Basing your answers on the text, mark each of the following sentences
as FACT or OPINION by writing F or O in the correct space on your
answer sheet. You must get all three correct to obtain credit.
1. Farm owners are deliberately neglecting their land.
2. The majority of young men who move to the cities are successful.
3. There are already enough farms under government control.
Because of the requirement that all three responses are correct, guessing
has a limited effect in such items.
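The arithmetic behind this can be sketched as follows, assuming (hypothetically) that a candidate guesses each statement independently and at random. The function name is an assumption made for the illustration.

```python
from fractions import Fraction


def chance_of_full_credit(n_statements: int, n_categories: int = 2) -> Fraction:
    """Probability of guessing every statement in the set correctly."""
    return Fraction(1, n_categories) ** n_statements


# Three FACT/OPINION statements, all of which must be correct for credit.
print(chance_of_full_credit(3))  # 1/8
```

Compared with a one-in-two chance on a single FACT/OPINION decision, the all-or-nothing requirement reduces the chance of unearned credit to one in eight.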
Scanning can be tested with the short answer technique:
Which town listed in Table 4 has the largest population?

According to the index, on which page will you learn about Nabokov’s
interest in butterflies?
The short answer technique can also be used to write items related to the
structure of a text. For example:
There are five sections in the paper. In which section do the writers
deal with:
a. choice of language in relation to national unity [Section …..]
b. the effects of a colonial language on local culture [Section …..]
c. the choice of a colonial language by people in their fight for
liberation [Section …..]

d. practical difficulties in using local languages for education [Section …..]
e. the relationship between power and language [Section …..]
Again, guessing is possible here, but the probabilities are lower than with
straightforward multiple choice.
A similar example is shown below from Cambridge Complete First 2nd
edition Student’s Book²:

Exam folder
Reading and Use of English Part 7
Gapped text
In this part of the Reading and Use of English test, you read an article from which six
paragraphs have been removed. The paragraphs are placed in a jumbled order after the
main text. You need to decide where in the text the paragraphs have been taken from.
This tests that you can recognise how a text is structured, and how a text creates
meaning across paragraphs.

1 You are going to read an extract from a magazine article. Six paragraphs have been removed from the extract.
Choose from the paragraphs A–G the one which fits each gap 1–6. There is one extra paragraph which you do
not need to use.

2 Work in pairs. Discuss the words/phrases which helped you to decide what fits where.

Is your glass half full or half empty?


Are you happy? Did you open the curtains this morning,
see that it’s yet another day of sunshine and bounce out of
bed? Or are you the kind of person who sees the sun and
starts worrying about getting sunburnt and the problems
it may cause for gardeners?

1

But a television documentary, which is to be broadcast
next week, suggests that in fact they play only a very
small part and that you can, in fact, train yourself to have
a more sunny attitude to life. It argues that it may indeed
be simple to change negative people into positive ones.

2

Next week’s programme is timely, because the happiness
of individuals is something that policymakers have
started to take very seriously indeed. Indeed, yesterday,
a new charity called MindFull suggested that mental
health should be taught in schools. And later this month,
the Office for National Statistics (ONS) will publish its
National Well-being report. This will draw on a number of
studies which suggest that our positivity has an impact on
our health and our educational achievements.

3

In other words, being happy could add years to your life.
It doesn’t just benefit your health, either. Educational
attainment, too, seems to be linked to attitude. Nick
Baylis, a consultant psychologist, works with the pupils
at a school in London that, five years ago, had very poor
academic results. Now, 87% of its pupils are leaving school
with good qualifications. Baylis believes that teaching
both the staff and pupils ‘well-being’ and coping strategies
was key to this success.

4

‘Through monkeys, humans and lots of
animals, the amount of activity in the
front cortex does seem to be a good
marker for positivity and negativity.’
Positive people have a more active left
frontal cortex; the presenter was found
to have a substantially more active right
frontal cortex – proving his assertion
that he is one of life’s pessimists. ‘When
I look into the future, I see all the things
that are going to go wrong, rather than
the things that will probably go right,’ he says. He also
suffers from insomnia. Professor Fox is among a growing
number of psychologists, however, who believe that he
and others like him can change this brain asymmetry and
thus their personality through a series of exercises.

5

It seems simple. But surely, trying to pick out a smiling
expression isn’t going to make me more optimistic.
Professor Fox tells me: ‘I was very sceptical when I got
into this initially. But the task we used in the show has
been used with kids with self-esteem issues. And it does
seem to have very powerful effects. It’s early days, but the
signs are that it is definitely effective.’

6

Of course, many psychologists argue that relentless
happiness is neither normal nor healthy. Professor Fox
says: ‘There are situations when things go wrong, and
having a healthy dose of pessimism can be good. But the
evidence shows that, broadly, having a positive attitude
really does boost your well-being.’

2. Note that this example is taken from an exam preparation book, hence the instruction to
work in pairs, which of course would not be appropriate in a test proper.
A
The most striking example comes from Oxford, Ohio,
which in the 1970s conducted a study of its inhabitants,
then aged over 50. So who has survived in good health?
Those who had a positive outlook on their life and
impending old age have lived, on average, 7.6 years
longer than those with negative views.

B
It worked for the presenter, who over a couple of
months of exercising was able to recalibrate his brain.
He says that he is sleeping better ‘though I wouldn’t
call myself a heavy sleeper yet’, and that he is more
optimistic. So should we all be doing the exercises? ‘I
think anyone could do them, but I suspect a fair number
who start then let it slide,’ he says.

C
If the show touches a nerve in the same way as last
autumn’s documentary by the same director about
fasting – which kick-started the phenomenally popular
5:2 diet – many of us could soon be undertaking mental
workouts in our lunch hour.

D
Professor Fox gives her views on the subject in next
week’s programme, pointing out that the research has
very significant implications for schools and for health
professionals. ‘However, more work needs to be done
before the results can be considered conclusive.’

E
The most basic one is called Cognitive Bias Modification.
To do it, you look at a screen for 10 minutes every day
over several weeks. During those minutes, a series of 15
faces are flashed up. All (except one) are either angry,
upset or unhappy. You have to spot, and click on, the
one happy face.

F
For years, many scientists believed that your
personality was predetermined. They were of the
opinion that it was your genes which were responsible
for whether you were an optimist or a pessimist.

G
Next week’s documentary will try to provide a
physiological explanation for their achievements. For
the programme, the presenter had his brain scanned by
Professor Elaine Fox, a neuroscientist at Oxford and
author of Rainy Brain, Sunny Brain. She says brain
asymmetry is very closely linked to our personalities.

EXAM ADVICE
● Read the whole of the text first.
● Read through paragraphs A–G and notice the differences
between them.
● Pay careful attention to connecting words throughout
the text and paragraphs, as well as at the beginnings and
ends of paragraphs.
● Consider each paragraph for every gap. Don’t assume
you have been correct in your previous answers as you
go along!
● Read the whole of the text again when you have
completed the task.
● Don’t rely on matching up names, dates or numbers in
the text and paragraphs just because they are the same
or similar.
● Don’t rely on matching up individual words or phrases
in the text and the paragraphs just because they are the
same or similar.

It should be noted that the scoring of ‘sequencing’ items of this kind
can be problematical. If a candidate puts one element of the text out
of sequence, it may cause others to be displaced and require complex
decision-making on the part of the scorers.
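The displacement problem can be made concrete with a small sketch. Two scoring rules are compared: crediting each element only when it is in exactly the right position, and crediting each adjacent pair that appears in the correct order. The second rule is only one possible alternative and is an assumption of this sketch, not a method proposed in the text; the function names and the five-element key are likewise hypothetical.

```python
def exact_position_score(key: list[str], response: list[str]) -> int:
    """One point for each element placed in exactly the right position."""
    return sum(k == r for k, r in zip(key, response))


def adjacent_pair_score(key: list[str], response: list[str]) -> int:
    """One point for each adjacent pair that also occurs adjacently in the key."""
    key_pairs = set(zip(key, key[1:]))
    return sum(pair in key_pairs for pair in zip(response, response[1:]))


key = ["A", "B", "C", "D", "E"]
# The candidate misplaces only A; B, C, D and E are still in the right order.
response = ["B", "C", "D", "E", "A"]

print(exact_position_score(key, response))  # 0 out of 5
print(adjacent_pair_score(key, response))   # 3 out of 4
```

A single misplaced element costs the candidate every point under the first rule but only one under the second, which is why scorers need an agreed rule before marking begins.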
One should be wary of writing short answer items where correct responses
are not limited to a unique answer. Thus:
According to the author, what does the increase in divorce rates show
about people’s expectations of marriage and marriage partners?
might call for an answer like:
(They/Expectations) are greater (than in the past).
The danger is of course that a student who has the answer in his or
her head after reading the relevant part of the passage may not be able
to express it well (equally, the scorer may not be able to tell from the
response that the student has arrived at the correct answer).

Gap filling
This technique is particularly useful in testing reading. It can be used any
time that the required response is so complex that it may cause writing
(and scoring) problems. If one wanted to know whether the candidate had
grasped the main idea(s) of the following paragraph, for instance, the item
might be:
Complete the following, which is based on the paragraph below.
‘Many universities in Europe used to insist that their students
speak and write only ________. Now many of them accept
________ as an alternative, but not a ________ of
the two.’

Until recently, many European universities and colleges not only
taught EngEng but actually required it from their students; i.e.
other varieties of standard English were not allowed. This was the
result of a conscious decision, often, that some norm needed to
be established and that confusion would arise if teachers offered
conflicting models. Lately, however, many universities have come to
relax this requirement, recognising that their students are as likely (if
not more likely) to encounter NAmEng as EngEng, especially since
some European students study for a time in North America. Many
universities therefore now permit students to speak and write either
EngEng or NAmEng, so long as they are consistent.

(Trudgill and Hannah 2017)


A possible weakness in this particular item is that the candidate has to
provide one word (mixture or combination) which is not in the passage. In
practice, however, it worked well.
Gap filling can be used to test the ability to recognise detail presented to
support a main idea:
To support his claim that the Mafia is taking over Russia, the author
points out that the sale of __________ in Moscow has increased
by __________ percent over the last two years.
Gap filling can also be used for scanning items:
According to Figure 1, __________ percent of faculty members
agree with the new rules.
Gap filling is also the basis for what has been called ‘summary cloze’. In
this technique, a reading passage is summarised by the tester, and then gaps
are left in the summary for completion by the candidate. This is really an
extension of the gap filling technique and shares its qualities. It permits the
setting of several reliable but relevant items. Here is an extended reading
example based on a newspaper article, with higher-level students in mind:
Below, you will find a newspaper article about the modern re-creation of
ancient food, followed by a summary of the article.
The summary contains gaps. You must fill the gaps using only words
from the article. There must be ONLY ONE WORD in each gap.


Ancient foods
During a 1954 BBC documentary about Tollund Man, the mysterious
body of a hanged man discovered in a peat bog in Denmark, the
noted archaeologist Sir Mortimer Wheeler ate a reconstruction of the
2,000-year-old’s last meal. After tasting the porridge of barley, linseed
5 and mustard seeds, he dabbed at his moustache and declared the
mystery was solved: Tollund Man had killed himself rather than eat
another spoonful.
Food reconstruction has come a long way since then. Last week
Seamus Blackley, a scientist more famous for creating the Xbox, baked
10 a sourdough loaf using yeast cultured from scrapings off 4,500-year-
old Egyptian pottery at his home in California. The results, said one of
his collaborators, Dr Serena Love, an Egyptologist from the University
of Queensland, were “tangy and delicious”. “I met Seamus for the first
time today,” she said. “As soon as I walked in the door he gave me a
15 plate of bread.” Blackley extracted samples from inside the ceramic
pores of a clay pot from the Peabody Museum at Harvard University
three weeks ago. Most are being examined by the third member of the
team, Richard Bowman, a molecular biologist, but Blackley kept one
to turn it into yeast to make bread. “Food puts you in touch with the
20 humanity of the past,” Love said. “That’s a tactile thing, something
that’s visceral – you can actually experience the ancients, with at least
one of the actual ingredients.”
Ancient and historical foods are having a bit of a moment. The
growing interest can be seen in the number of cookbooks available
25 including An Early Meal, a Viking Age Cookbook by Daniel Serra and
Hanna Tunberg and Khazana by Saliha Mahmood Ahmed with recipes
inspired by the Mughal empire, as well as in the increasing number
of food re-enactments. Graham Taylor’s Potted History firm makes
amphoras and Neolithic pottery for experimental archaeologists such
30 as Sally Grainger who has investigated and made versions of garum,
a Roman fish sauce, as well as Jill Hatch who cooks authentic Roman
food for the Ermine Street Guard enthusiasts and similar groups. But
those looking for original ingredients to recreate tastes of the past need
to be cautious, says Professor Dorian Fuller, an archaeobotanist from
35 University College London. “Yeast is everywhere. It’s hard to know if
something wasn’t contaminated when it was dug out of the ground, or
when it was put on a ship to Boston collecting yeasts along the way.
These things haven’t been kept in sterile conditions.”
Because human diets have been founded on grains for millennia, beer,
40 bread and porridge are the main focus of attempts to recreate truly
ancient foods. “The latest study that came out in the ‘80s said grain
made up about 70% of the daily diet of Romans, although I think
that’s a little high,” said Farrell Monaco, an archaeologist specialising
in Roman culture who has worked in Pompeii and Herculaneum.
45 “Although I think that’s a little high, bread and pulses were the
two vehicles to get calories into the Roman daily diet.” Pompeii has
commercial bakeries on every street corner, she said. “And religion as
well – bread was so valuable that you would offer it to the gods.”
Monaco uses replicas of Roman and Greek kitchen tools to make
50 dishes described by ancient writers such as Columella, Pliny and
Cato: fig vinegar, moretum (salads), hypotrimma (a sweet paste) and
defrutum (a grape syrup) as well as panis quadratus, a round loaf
that has been excavated at many sites around Vesuvius. She believes
making ancient food with original techniques is a vital archaeological
55 tool. “To use your hand, your eyes, nose, tastebuds, to labour
over something, to use a handmill to make a loaf of bread, so you
understand how much labour and sweat went into making it – you
start to understand how much value it had.”

Summary
In a television documentary in 1954, an archaeologist made a joke,
saying that a man had killed himself 2,000 years ago rather than eat
any more of his __________, the remains of which had been
found in his body.
Times have changed. Recently, scrapings were taken from 4,500
year old Egyptian __________. Most were kept for study by
a molecular biologist, but one was retained to culture yeast, which
was then used to bake a __________ loaf. An Egyptologist who
tasted it said that it was tangy and delicious.
Growing interest in ancient foods is evidenced by the number of
__________ which are being written, including two which
provide recipes for Viking and Mughal empire inspired food. A firm
called ‘Potted History’ makes amphoras and Neolithic pottery for
archaeologists who want to make authentic ancient Roman food.
At the same time, one archaeobotanist has warned that care should
be exercised in such cookery, since yeast is everywhere and may
__________ whatever is dug out of the ground.
The main focus of attempts to recreate ancient foods has been on
beer, bread and porridge. This is because human diets have been
based on __________ for thousands of years. A study in
the 1980s claimed that about 70% of the __________ diet
consisted of grain. Although she thinks that estimate to be a little
high, Farrell Monaco, an archaeologist, admits that bread and pulses
were what provided Romans with their __________. Pompeii
had bakeries on every street corner, she added. Monaco uses replica
__________ to make dishes described by ancient writers. She
believes that making bread in this way helps one understand the
__________ it had for ancient peoples.

Information transfer
One way of minimising demands on candidates’ writing ability is to
require them to show successful completion of a reading task by supplying
simple information in a table, following a route on a map, labelling a
picture, and so on. As can be seen in the example below, from the IELTS
Academic module, a single text may be used for more than one task (in this
case, completing a table and labelling a picture).

[Note: This is an extract from an Academic Reading passage on the subject of dung beetles. The text
preceding this extract gave some background facts about dung beetles, and went on to describe a
decision to introduce non-native varieties to Australia.]

Introducing dung¹ beetles into a pasture is a simple process: approximately 1,500 beetles are released, a
handful at a time, into fresh cow pats² in the cow pasture. The beetles immediately disappear beneath the
pats digging and tunnelling and, if they successfully adapt to their new environment, soon become a
permanent, self-sustaining part of the local ecology. In time they multiply and within three or four years
the benefits to the pasture are obvious.

Dung beetles work from the inside of the pat so they are sheltered from predators such as birds and
foxes. Most species burrow into the soil and bury dung in tunnels directly underneath the pats, which are
hollowed out from within. Some large species originating from France excavate tunnels to a depth of
approximately 30 cm below the dung pat. These beetles make sausage-shaped brood chambers along the
tunnels. The shallowest tunnels belong to a much smaller Spanish species that buries dung in chambers
that hang like fruit from the branches of a pear tree. South African beetles dig narrow tunnels of
approximately 20 cm below the surface of the pat. Some surface-dwelling beetles, including a South
African species, cut perfectly-shaped balls from the pat, which are rolled away and attached to the bases
of plants.

For maximum dung burial in spring, summer and autumn, farmers require a variety of species with
overlapping periods of activity. In the cooler environments of the state of Victoria, the large French
species (2.5 cms long), is matched with smaller (half this size), temperate-climate Spanish species. The
former are slow to recover from the winter cold and produce only one or two generations of offspring
from late spring until autumn. The latter, which multiply rapidly in early spring, produce two to five
generations annually. The South African ball-rolling species, being a sub-tropical beetle, prefers the
climate of northern and coastal New South Wales where it commonly works with the South African
tunneling species. In warmer climates, many species are active for longer periods of the year.

Glossary
1. dung: the droppings or excreta of animals

2. cow pats: droppings of cows

© UCLES 2009. This material may be photocopied (without alteration) and distributed for classroom
use provided no charge is made. For further information see our Terms and Conditions
IELTS Academic Reading Task Type 10 (Diagram Label Completion Activity) –
Student Worksheet

Questions 6 – 8
Label the tunnels on the diagram below using words from the box.
Write your answers in boxes 6-8 on your answer sheet.

[Diagram: cross-section of the ground beneath a cow pat (dung), with a scale of approximate depth in cms below surface (0, 10, 20, 30) and three tunnels to be labelled 6, 7 and 8]

Dung Beetle Types
French                 Spanish
Mediterranean          South African
Australian native      South African ball roller

1. What does this diagram show? What features can you explain from the information given? Compare
your ideas with a partner.
2. Look at the instructions and the answer spaces 6, 7 and 8. What kind of information is required for the
answers?
3. Which are the key words in the diagram?
4. In what order would you do the following with the reading text? Why?
- detailed reading
- scanning
- skimming

Academic Reading sample task – Table completion

Questions 9 – 13
Complete the table below.
Choose NO MORE THAN THREE WORDS from the passage for each answer.
Write your answers in boxes 9-13 on your answer sheet.

Species | Size | Preferred climate | Complementary species | Start of active period | Number of generations per year
French | 2.5 cm | cool | Spanish | late spring | 1-2
Spanish | 1.25 cm | 9 ............ | | 10 ............ | 11 ............
South African ball roller | | 12 ............ | 13 ............ | |


Relatively few techniques have been presented in this section. This
is because, in our view, few basic techniques are needed, and
non-professional testers will benefit from concentrating on developing their
skills within a limited range, always allowing for the possibility of
modifying these techniques for particular purposes and in particular
circumstances. Many professional testers appear to have got by with
just one – multiple choice! The more usual varieties of cloze and the
C-Test technique (see Chapter 14) have been omitted because, while they
obviously involve reading to quite a high degree, it is not clear that reading
ability is all that they measure. This makes it all the harder to interpret
scores on such tests in terms of criterial levels of performance.

Which language for items and responses?


The wording of reading test items is not meant to cause candidates any
difficulties of comprehension. It should always be well within their
capabilities, and less demanding than the text itself. In the same way,
responses should make minimal demands on writing ability. Where
candidates share a single native language, this can be used both for items
and for responses. There is a danger, however, that items may provide
some candidates with more information about the content of the text than
they would have obtained from items in the foreign language.

Procedures for writing items


The starting point for writing items is a careful reading of the text, having the
specified operations in mind. One should be asking oneself what a competent
reader should derive from the text. Where relevant, a note should be taken of
main points, interesting pieces of information, stages of argument, examples,
and so on. The next step is to decide what tasks it is reasonable to expect
candidates to be able to perform in relation to these. It is only then that draft
items should be written. Paragraph numbers and line numbers should be
added to the text if items need to make reference to these. The text and items
should be presented to colleagues for moderation. Items and even the text
may need modification. A moderation checklist follows:

MODERATION CHECKLIST                                              YES   NO
1. Is the English of text and item grammatically correct?
2. Is the English natural and acceptable?
3. Is the item in accordance with specified parameters?
4. Is the specified reading sub-skill necessary in order to
   respond correctly?
5. (a) Multiple choice: Is there just one correct response?
   (b) Gap filling and summary cloze: Are there just one or
       two correct responses for each gap?
   (c) Short answer: Is the answer within productive
       abilities? Can it be scored validly and reliably?
   (d) Unique answer: Is there just one clear answer?
6. Multiple choice: Are all the distractors likely to distract?
7. Is the item economical?
8. Is the key complete and correct?

Practical advice on item writing

1. In a scanning test, present items in the order in which the answers
can be found in the text. Not to do this introduces too much random
variation and so lowers the test’s reliability.
2. Do not write items for which the correct response can be found without
understanding the text (unless that is an ability that you are testing!).
Such items usually involve simply matching a string of words in the
question with the same string in the text. Thus (around line 50 in the
ancient foods passage, on page 153):
Who uses replicas of Roman and Greek kitchen tools to make dishes
described by ancient writers such as Columella, Pliny and Cato?
Better might be:
Name the archaeologist who makes food described by Pliny and others.
Items that demand simple arithmetic can be useful here. We may learn
in one sentence that before 2004 there had only been three hospital
operations of a particular kind; in another sentence, that there have
been 45 since. An item can ask how many such operations there have
been to date, according to the article.
3. Do not include items that some candidates are likely to be able to
answer from general knowledge without reading the text. For example:
Yeast is used in the making of __________.
It is not necessary, however, to choose esoteric topics.
4. Make the items independent of each other; do not make a correct response
on one item depend on another item being responded to correctly.
In the following example, the candidate who does not respond correctly
to the first item is unlikely to be able to respond to the following two
parts (the second of which uses the Yes/No technique). For such a
candidate, b) and c) might as well not be there.
a) Which man is suspected by the detective?
b) What was the man wearing?
c) Did the man attempt to escape?
However, complete independence is just about impossible in items
that are related to the structure of a text.
5. Be prepared to make minor changes to the text to improve an item.
If you do this and are not an expert speaker, ask an expert speaker to
look at the changed text.
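
The check described in point 2 can be roughed out mechanically. The following sketch assumes a hypothetical helper, longest_shared_word_run, and a deliberately crude tokeniser; neither comes from this chapter. It reports the longest string of consecutive words that an item shares with the text, so that an item writer can spot questions that might be answered by matching alone.

```python
import re


def words(text):
    """Lower-case the text and split it into words (a crude tokeniser)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_shared_word_run(item, passage):
    """Return the longest run of consecutive words found in both item and passage."""
    a, b = words(item), words(passage)
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous word in both texts.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


passage = ("Monaco uses replicas of Roman and Greek kitchen tools to make "
           "dishes described by ancient writers such as Columella, Pliny and Cato")
weak_item = ("Who uses replicas of Roman and Greek kitchen tools to make dishes "
             "described by ancient writers such as Columella, Pliny and Cato?")
better_item = "Name the archaeologist who makes food described by Pliny and others."

print(len(longest_shared_word_run(weak_item, passage)))    # 21
print(len(longest_shared_word_run(better_item, passage)))  # 2
```

A long shared run is only a warning sign, not a verdict. Whether an item can really be answered without understanding remains a judgement for the writer and the moderating colleagues.
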

A note on scoring

General advice on obtaining reliable scoring has already been given in
Chapter 5. It is worth adding here, however, that in a reading test (or a
listening test), errors of grammar, spelling or punctuation should not be
penalised, provided that it is clear that the candidate has successfully
performed the reading task which the item set. The function of a reading
test is to test reading ability. To test productive skills at the same time
(which is what happens when grammar, etc. are taken into account) simply
makes the measurement of reading ability less valid.
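
A scoring key can be made to reflect this principle. The sketch below is an illustration only: the function name is_acceptable and the similarity threshold of 0.8 are assumptions, and any automatic tolerance of this kind would still need checking by a human scorer. It accepts a response that is close enough to one of the keyed answers for the intended word to be clear.

```python
from difflib import SequenceMatcher


def is_acceptable(response, accepted_answers, threshold=0.8):
    """True if the response is close enough to any keyed answer.

    The threshold is an arbitrary illustrative value, not an established standard.
    """
    cleaned = response.strip().lower()
    return any(
        SequenceMatcher(None, cleaned, answer.lower()).ratio() >= threshold
        for answer in accepted_answers
    )


# Key for the last gap of the EngEng/NAmEng item: two acceptable words.
key = ["mixture", "combination"]

print(is_acceptable("mixture", key))      # True
print(is_acceptable("combinasion", key))  # True: misspelt, but the word is clear
print(is_acceptable("choice", key))       # False
```

The point of the tolerance is that the misspelt response still shows that the reading task was performed. A response that is merely similar in form to a keyed answer but different in meaning would have to be caught by the scorer.
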

READER ACTIVITIES
1. Following the procedures and advice given in the chapter, construct a
six-item reading test based on the extract ‘The secrets of happiness’
on pages 159–160. (The passage comes from Cambridge Complete
First 2nd edition.)
a. For each item, make a note of the skill(s) (including sub-skills) you
believe it is testing. If possible, have colleagues take the test and provide
critical comment. Try to improve the test. Again, if possible, administer the
test to an appropriate group of students. Score the tests. Interview a few
students as to how they arrived at correct responses. Did they use the
particular sub-skills that you predicted they would?
b. Compare your questions with the ones in Appendix 3. Can you explain
the differences in content and technique? Are there any items in the
appendix that you might want to change? Why? How?
2. Do the sequencing item that is based on the text ‘Is your glass half full or
half empty?’ (from Cambridge Complete First 2nd edition) on pages 149 and
150. Do you have any difficulties? If possible, get a number of students of
appropriate ability to do the item, and then score their responses. Do you
have any problems in scoring?
3. Write a set of short answer items with unique correct responses to replace
the sequencing items that appear with the ‘Is your glass half full or half
empty?’ text.
4. The following is an exercise designed to help students learn to cope with
complex sentences. How successful would this form of exercise be as part
of a reading test? What precisely would it test? Would you want to change
the exercise in any way? If so, why and how? Could you make it non-
multiple choice? If so, how?
The refusal of the government to consider alternatives to its policy on
prisons, which was criticised by various human rights groups, both
within the country and abroad, led to its downfall.
What is the subject of ‘led to its downfall’?
a. the refusal
b. policy on prisons
c. human rights groups
d. the government

1 You are going to read an article by a psychologist
about happiness. Read the article quickly to find out
what he thinks makes people happy.

The secrets of happiness

Mihaly Csikszentmihalyi has devoted his life to
studying happiness. He believes he has found the key.

I’ve been fascinated by happiness most of my life.
When I was a small boy, I noticed that though
many of the adults around me were wealthy and
educated, they were not always happy and this
5 sometimes led them to behave in ways which I,
as a child, thought strange. As a result of this, I
decided to understand what happiness was and
how best to achieve it. It was not surprising,
then, that I decided to study psychology.

10 On arrival at the University of Chicago 50 years
ago, I was disappointed to find that academic
psychologists were trying to understand human
behaviour by studying rats in a laboratory. I felt
that there must be other more useful ways of
15 learning how we think and feel. Although my
original aim had been to achieve happiness for
myself, I became more ambitious. I decided to
build my career on trying to discover what made
others happy also. I started out by studying
20 creative people such as musicians, artists and
athletes because they were people who devoted
their lives to doing what they wanted to do,
rather than things that just brought them
financial rewards.

25 Later, I expanded the study by inventing a system
called ‘the experience sampling method’. Ordinary
people were asked to keep an electronic pager
for a week which gave out a beeping sound
eight times a day. Every time it did so, they
30 wrote down where they were, what they were
doing, how they felt and how much they were
concentrating. This system has now been used
on more than 10,000 people, and the answers
are consistent: as with creative people, ordinary
35 people are happiest when concentrating hard.

After carrying out 30 years of research and
writing 18 books, I believe I have proved that
happiness is quite different from what most
people imagine. It is not something that can
40 be bought or collected. People need more than
just wealth and comfort in order to lead happy
lives. I discovered that people who earn less
than £10,000 are not generally as happy as
people whose incomes are above that level. This
45 suggests that there is a minimum amount of
money we need to earn to make us happy, but
above that dividing line, people’s happiness has
very little to do with how much poorer or richer
they are. Multi-millionaires turn out to be only
50 slightly happier than other people who are not
so rich. What is more, people living below the
dividing line and in poverty are often quite happy
too.

I found that the most obvious cause of happiness
55 is intense concentration. This must be the
main reason why activities such as music, art,
literature, sports and other forms of leisure have
survived. In order to concentrate, whether you’re
reading a poem or building a sandcastle, what
60 you need is a challenge that matches your ability.
The way to remain continually happy, therefore,
is to keep finding new opportunities to improve
your skills. This may mean learning to do your
job better or faster, or doing other more difficult
65 jobs. As you grow older, you have to find new
challenges which are more appropriate to your
age. I have spent my life studying happiness and
now, as I look back, I wonder if I have achieved it.
Overall, I think I have, and my belief that I have
70 found the keys to its secret has increased my
happiness immeasurably.

Adapted from The Times

5. Subject the following True/False exercise from a student coursebook to the same
considerations as the previous exercise type.

7 Natural solutions
B Articles

1 Look at the two photos and describe what they show.
What do you think the connection is between them?

2 Read a lecture handout about Velcro. Check your answers
to 1 and answer the true or false sentences.
a The seeds George de Mestral found had a special quality. T/F
b Velcro is a natural product. T/F
c Biomimicry is a complicated idea. T/F
d Plants and animals can help us solve design problems. T/F

The invention of Velcro
One day in 1941, Swiss engineer George de Mestral
went for a walk with his dog. When he got back, he
noticed some plant seeds stuck to the dog’s fur. He
inspected the seeds more closely to see how they
stuck to things so effectively. Using a microscope he
saw that each seed had a hook and the hook allowed
the seed to stick to anything it touched. De Mestral
decided to use the same idea to invent a material
which could fasten and attach to things. As a result,
Velcro was invented.
The story of Velcro is probably the most famous
example of ‘biomimicry’, the science of copying
nature to solve design challenges. The idea behind
biomimicry is simple – nature is the best engineer
and the plants and animals around us are the perfect
models for product designers and scientists to copy.

FOCUS
Articles
The articles the, a and an come at the beginning
of a noun phrase. In some cases we do not use
an article.
We use the:
• when both the speaker/writer and the listener/reader
  know the thing being referred to
• when there can only be one thing we are referring to
• before a superlative.
Examples Where’s Jim? He’s in the kitchen.
         Neil Armstrong was the first man on the moon.
         You’re the greatest!
We use a and an:
• to refer to something for the first time
• to classify or define something
• after there is when referring to a single noun.
Examples I saw a man outside the house.
         Velcro is a type of material.
         There’s a spider in the bath.
We don’t use an article with plural and uncountable
nouns when we are talking about things or people
in general.
Example Scientists sometimes copy nature.

English for the 21st Century • Unit 7 93

(Hughes and Scott-Barrett 2017)

FURTHER READING

General
Alderson (2000) provides a very full treatment of the testing of reading.
Hubley (2012) is a very accessible summary of the issues related to the
testing of reading. Weir et al. (2002) describe the development of the
specifications of a reading test in China.

Sub-skills

Issues in the testing of reading sub-skills are addressed in Weir et al. (1993),
Weir and Porter (1995), Alderson (1990a, 1990b, 1995) and Lumley (1993,
1995). Aryadoust and Zhang (2016) identify two subgroups of readers –
one with high lexico-grammatical knowledge, the other with skimming and
scanning skills.

Texts in reading tests


Kobayashi (2002) reports on a study which shows how the organisation
of a text in a reading test influences the performance of test-takers. Green
et al. (2010) use automated textual analysis to compare the appropriacy
of texts in tests of academic English.

Multiple choice
Rupp et al. (2006) suggest that multiple choice items prompt test-takers to
respond differently from how they would read in a non-testing context. In’nami
and Koizumi (2009) compare multiple choice and open-ended formats
in reading tests. Shizuka et al. (2006) investigate the merits of reducing the
number of options in multiple choice items in a reading test from four to three.

Other item types


Alderson et al. (2000) explore sequencing as a test technique. Freedle and
Kostin (1993) investigate the variables that affect the difficulty of reading
items. Trites and McGroarty (2005) report on attempts to design more
complex reading tests.

Non-linguistic factors in test performance


Krekeler (2006) investigates the effect of background knowledge on
reading test performance. Allan (1992) reports on the development of a
scale to measure ‘test-wiseness’ of people taking reading tests.
