
Indexing

The document discusses the history and evolution of indexing and abstracting, detailing early methods of information retrieval from ancient civilizations to modern indexing systems. It outlines the roles of indexers, the process of indexing, and the various types of indexes, including alphabetical, classified, and concordance indexes. Additionally, it highlights the functions and purposes of an index in aiding users to efficiently locate relevant information.

Uploaded by

Reichilee Unabia

1

Indexing and Abstracting
Janny S. Surmieda

2
History of information retrieval

• The papyrus scroll used by the ancient Greeks and Romans was not the most efficient way of storing information in written form and retrieving it

• Greek and Roman scholars devised various means of organizing the materials to make locating certain passages easier for the reader

3
History of indexing

• 2000 BC - 100 AD
- Inventories and finding lists written on clay tablets or sheets of parchment are the first evidence of catalogues of books
- 2000 BC: clay envelopes enclosed Mesopotamian cuneiform documents. The clay envelopes were used to preserve the documents from tampering. Documents would be written in full or abstracted on the envelope

Note: During this period, kings and various rulers utilised clay, bone, prepared skins, papyrus and parchment to record historical events, trade transactions and administrative details. Records were collected by libraries and were tagged and catalogued.
4
History of information retrieval

• Table of Contents

- Pliny the Elder (died 79 A.D.) wrote a massive work called The Natural History in 37 books

- A kind of encyclopedia that comprised information on a wide range of subjects

- The entire first book is a gigantic table of contents that lists, book by book, the subjects discussed

5
History of information retrieval

• Table of Contents

- Pliny the Elder even added a list of the Greek and Roman authors used in compiling the information for each book

- He also indicated in the last part of his preface that the practice was first employed in Latin literature by Valerius Soranus, who lived in the last part of the second century B.C. and the first part of the first century B.C.

6
History of information retrieval

• Alphabetization

- Probably first devised by Greek scholars of the third century B.C.

- Probably first implemented in the library of Alexandria in Egypt to help organise the growing collection of Greek literary works
7
History of information retrieval

• Hierarchies of information

- Collection of memorable deeds and sayings written by Valerius Maximus ca. 30 A.D.
  - Divided into nine books
  - Each book is subdivided into chapters
  - Each chapter has its own heading
  - All entries within each chapter contain anecdotes taken from ancient literature and history which illustrate that theme

8
History of information retrieval

• Hierarchies of information
- A work on military stratagems in four books written by Sextus Julius Frontinus
  - Each book covers a specific area of warfare
  - Each book is divided into chapters covering one specific area of the book’s major theme
  - Each chapter has a heading to clue the reader
  - Each chapter consists of extracts from historical works that illustrate the practical application of the topic

Note: Sextus Julius Frontinus was a Roman senator of the late first century A.D. and early second century A.D.

9
History of information retrieval

• Hierarchies of information

- The Attic Nights, ca. 160 A.D., in 20 books, written by Aulus Gellius
  - Contains items regarding Greek and Roman history, philosophy, grammar, rhetoric, and antiquarian material
  - No specific order; each chapter is specifically about a certain subject
10
Information retrieval systems

• Organization of information may be manual, computerised, or both

• An information retrieval system (IRS) is a tool that carries out the information retrieval process

• A modern IRS may fall into one of the following categories:
- data retrieval
- reference retrieval
- text retrieval

11

Basic Concepts

12

Information System

• Collection, processing, storage, dissemination and use of information
13

Information Retrieval

• The process of searching for information resources relevant to an information need

• Differs from data retrieval, because information retrieval implies satisfaction of a request for information by providing the information as a direct answer to a question

14
Index

• systematic arrangement of entries designed to enable users to locate information in a document (British indexing standard BS 3700:1988)

• an alphabetically arranged list of headings consisting of the personal names, places, and subjects treated in a written work, with page numbers to refer the reader to the point in the text at which information pertaining to the heading is found (ODLIS)

• A systematically arranged tool that directs a user to the source of information or to the information itself

• A guide to the contents of a written knowledge record

Note: The word comes from the Latin indicare, which means “to point out”.

15

Indexer

• A person who performs indexing

• A professional who creates a systematic tool to direct users to the information and/or information source

• Responsible for analysing and tagging a document with subjects and/or other designators, taking into consideration what is important to the user
16
Indexing

• comes from the Latin word indicare meaning “to point out”
• A process of creating an index
• the process of compiling one or more indexes for a single publication, such as a monograph or multivolume reference work, or adding entries for new documents to an open-end index covering a particular publication format (example: newspapers), works of a specific literary form (biography, book reviews, etc.), or the literature of an academic field, discipline, or group of disciplines. (ODLIS)

Note: Indexing may also refer to the process of analyzing the information content of a document and expressing or converting that content in the language of an indexing system. Indexing involves selecting indexable concepts and expressing these concepts in the language of the indexing system as index entries.

17

Indexing System

• A set of prescribed procedures intended for organizing the contents of records of knowledge for purposes of retrieval and dissemination

18
Index entry

• Guides researchers to the content of the work

• The representation of a documentary unit in a displayed index (NISO Guidelines for Indexes)

• At a minimum, an entry should contain a heading and a locator

• May contain a multi-level heading and a document surrogate in addition to the required locator

Note: More than one locator may be found under a heading within an index entry, but the combination of a heading and its locators represents a single entry.
19
Index entry

• Elements of an index entry
1. Index heading
- the term chosen to represent the item in an index
- concept derived from the material being indexed
- ex. Smith, John William, 19, 24, 30
2. Modification / Modifier / Qualifier
- one or more components that narrow the focus of the heading to one of its subclasses, increasing its specificity
- ex. Sucrose (biological studies)
3. Locator (Reference)
- part of an index entry that leads to the actual information sought by the researcher
4. Scope Note
- a short explanation on how to use a descriptor or index heading
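
The four elements of an index entry can be pictured as a small record. The sketch below is only an illustration: the class name, the field names and the locator "112" for the Sucrose entry are assumptions made for the example, not part of any indexing standard.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """One index entry: a heading plus its locators (field names are illustrative)."""
    heading: str                                        # term chosen to represent the item
    locators: list[str] = field(default_factory=list)   # e.g. page numbers
    modification: str = ""                              # qualifier that narrows the heading
    scope_note: str = ""                                # short explanation of how to use the heading

    def display(self) -> str:
        """Render the entry roughly as it might appear in a printed index."""
        text = self.heading
        if self.modification:
            text += f" ({self.modification})"
        if self.locators:
            text += ", " + ", ".join(self.locators)
        return text


smith = IndexEntry("Smith, John William", ["19", "24", "30"])
# The locator "112" is hypothetical; the slide gives no page for this entry.
sucrose = IndexEntry("Sucrose", ["112"], modification="biological studies")

print(smith.display())    # Smith, John William, 19, 24, 30
print(sucrose.display())  # Sucrose (biological studies), 112
```

Keeping several locators inside one record mirrors the point that a heading together with all of its locators counts as a single entry.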

20 An example of a Scope Note

CULTURAL BACKGROUND
SN: The total social heritage and experience of an
individual or group including institutions,
folkways, literature, mores, and communal
experience.

21

Who does indexing?


• In the US, non-fiction books are indexed by authors (though most authors don’t actually do it themselves)

• Few publishers have in-house indexers

• Freelance indexers do most of the indexing work. They are hired by authors, publishers, and/or packagers

• A packager is an independent business which manages the production of a book by hiring freelancers to accomplish various tasks, which may include researching, writing, illustrating, copyediting, proofreading and indexing
22

Who does indexing?

• In the Philippines indexing is done by

- Librarians

- Subject experts

23

How is indexing done?

1. The indexer receives a set of page proofs for the book

2. The indexer reads the page proofs and makes a list of headings and subheadings and the location of each pertinent reference

3. The indexer edits the index for structure, clarity and consistency, formats it to specifications, proofreads it and submits it to the client in hard-copy form, on disk, or by email.

24

Steps in indexing

1. Analyze the content of the information source

2. Express the aboutness of an information item

3. Indicate the location of the information


25
Skills needed by an indexer

• Excellent language skill

• High clerical aptitude

• Accuracy and attention to details

26

• Cleveland lists four things that can happen when you use an index

• You do not find the information although it is there *

• The information you find is not what you thought it would be *

• You find only a portion of the available information *

• You find the exact information you need

27
Information retrieval process: the role of indexing
• The information is created and acquired for the system

• Knowledge records are analysed and tagged with index terms

• Knowledge records are stored physically and the index terms are stored in a structured file, either manually or using a computer

• The user’s query is tagged with sets of index terms and is then matched against the tagged records

• Matched documents are retrieved for review

• Feedback may lead to several reiterations of the search
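
The tagging, storing and matching steps above can be sketched in a few lines. This is a minimal illustration, assuming a toy collection: the record identifiers, the index terms and the `search` helper are all made up for the example, and the "structured file" is modelled here as an inverted index held in memory.

```python
from collections import defaultdict

# Hypothetical knowledge records, already analysed and tagged with index terms.
records = {
    "doc1": {"indexing", "thesauri"},
    "doc2": {"abstracting", "indexing"},
    "doc3": {"cataloguing"},
}

# Store the index terms in a structured file: here, an inverted index
# mapping each term to the set of records tagged with it.
inverted = defaultdict(set)
for doc_id, terms in records.items():
    for term in terms:
        inverted[term].add(doc_id)


def search(query_terms):
    """Return the records tagged with every query term, sorted by identifier."""
    if not query_terms:
        return []
    matches = set.intersection(*(inverted.get(t, set()) for t in query_terms))
    return sorted(matches)


print(search(["indexing"]))                 # ['doc1', 'doc2']
print(search(["indexing", "abstracting"]))  # ['doc2']
```

If the first result set is too large or too small, the user changes the query terms and searches again, which is the feedback loop described in the last bullet.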


28

Purpose of an Index

• Minimize the time and effort in finding the information

• Maximize the searching success of a user

• Construct representations of documents in a form that is suitable for the users to browse through in different forms

29
Function of an index

According to the NISO guidelines for indexes, the following are the functions of an index:

• Provide users with an effective and systematic means for locating documentary units that are relevant to their information needs

• An index should therefore:

a. identify documentary units that treat particular topics or possess particular features

30
Function of an index
• An index should therefore:

b. indicate all important topics or features of documentary units in accordance with the level of exhaustivity appropriate for the index

c. discriminate between major and minor treatments of particular topics or manifestations of particular features

d. provide access to topics or features using the terminology of prospective users

d. provide access to topics or features using the


terminology of prospective users
31
Function of an index
• An index should therefore:

e. provide access to topics or features using the terminology of verbal texts being indexed whenever possible

f. use terminology that is as specific as documentary units warrant and the indexing language permits

g. provide access through synonymous and equivalent terms

32
Function of an index
• An index should therefore:

h. guide users to terms representing related concepts (narrower terms, other related terms, broader terms)

i. provide for the combination of terms to facilitate the identification of particular types or aspects of topics or features and to eliminate unwanted types or aspects

33
Function of an index

• An index should therefore:

j. provide means for searching for particular topics or features by means of a systematic arrangement of entries in displayed indexes or, for non-displayed indexes, by means of a clearly documented and displayed method for entering, combining, and modifying terms to create search statements and for reviewing retrieved items
34

Types of index

35

By arrangement

• Alphabetical Index

• Classified Index

• Concordance

• Alphabetico-classed

36
Alphabetical

• based on the orderly principle of the letters of the alphabet and is used for the arrangement of subject headings, cross-references, and qualifying terms, as well as main headings

• all entry items are in one alphabetical order, including subject terms, author names, names of places, and even chemical formulas

Note: An alphabetical index has one drawback, namely synonymy and the scattering of entries. Ex. Pecan Trees: should we look under Pecan Trees or Trees?
37
Alphabetical

• Advantages

‣ Easy and convenient to use and follow because of user familiarity

• Disadvantages

‣ Problem of synonymy and scattering of entries

- Scattering means that subcategories of a subject are not drawn together but are cross-referenced from the wrong term to the preferred term

38

Classified Indexes

• an index where the contents are arranged systematically by classes or subject headings

• have an important role to play, especially in scientific indexing, but mystify general users when used in general indexes

39

Classified Indexes
• Advantages

‣ Useful for generic searching when retrieval is aiming for classes of documents

‣ Groups similar things together

• Disadvantages

‣ Alienates users because they are not familiar with how it is constructed
40

Concordance
• also known as Word and Name Indexes

• refers to an index of the individual names and words that the author used

• closely represents the information and ideas the author had in mind when creating the manuscript

• uses the exact term or word within the context of the document and pinpoints the subject discussed and its location

41

Concordance
• Paved the way for the preparation of the Concordance of the Bible by Alexander Cruden in 1737.

• Uses

‣ Locate a partly or completely remembered passage

‣ Assemble subject matter

‣ Compare and analyze word meaning and word usage
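
Because a concordance lists the author's exact words together with their locations, a simple one can be derived mechanically from the text. The sketch below is only an illustration: the two sample lines are invented, the function name is hypothetical, and a real concordance would normally also show each word in its surrounding context.

```python
import re
from collections import defaultdict


def build_concordance(lines):
    """Map each word to the sorted line numbers (1-based) where it occurs."""
    concordance = defaultdict(set)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            concordance[word].add(number)
    # Return the words in alphabetical order, each with sorted locators.
    return {word: sorted(nums) for word, nums in sorted(concordance.items())}


# Invented sample text, one string per line.
text = [
    "In the beginning was the word",
    "and the word was with the reader",
]
concordance = build_concordance(text)
print(concordance["word"])       # [1, 2]
print(concordance["beginning"])  # [1]
```

Looking up a half-remembered word such as "beginning" leads straight to the line that contains it, which is the first use listed above.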

42

Alphabetico-classed

• Broad headings arranged alphabetically

• Narrower headings are grouped under broader headings and arranged alphanumerically or relationally on the basis of hierarchy, inclusion, chronology, or other association
43
Numerical or serial order

• Index entries are arranged numerically

‣ Ex.

- Patent-number index

- Table index

44
Numerical or serial order
• Example

‣ https://courts.michigan.gov/Administration/SCAO/Forms/Pages/Probate-Court-Index.aspx

45
By Type or Form of Material Indexed

• Book Index

• Periodical Index

• Newspaper Index

• Audiovisual Materials Index


46

Book Index
• Alphabetical list of words at the back of a book that gives
the exact page location of the subject or name associated
with each word

• The components of a book index are:


- main index term
- subdivision under the main index term
- page locator
- cross references

47

Example of a
book index

48

Book Index
• Why prepare a book index?

‣ To reduce the frustration of information overload

‣ Enable users to compare books prior to purchase

‣ Collects the different ways of wording the same concept

‣ Provides well-worded sub-entries

‣ Guides the users directly to a specific aspect of a topic

‣ Filters information for the reader


49

Book Index
• Components of a book index

‣ Entry

- Principal subdivision of an index

‣ Heading

- Identifies the subject

‣ Locator

- Tells the users where to find the specified subject

‣ Subheading

- Represent some aspect of the main heading

50

Periodical Index
• An alphabetical list of topics, names, and/or titles of works that are discussed in the articles of a journal

• Open-ended projects usually performed by a group of people

• Two types of periodical indexes

- Individual indexes - index to an individual journal

- Broad indexes - index to a group of journals


53
Book Indexes vs Periodical Indexes

Book | Periodical
Compiled only once and done by a single person | Continuous process, most of the time performed by a team
Deals with a more or less defined central topic | Great variety of topics
Indexing terms are derived from the book | Indexing terms are derived from a controlled vocabulary
Specificity is governed by the text itself | Terms are prescribed by a controlled vocabulary
Must be read page by page | Articles are scanned for indexable items, and indexing may rely on an abstract or summary
Entire text is subject to indexing | Depends on a number of policy decisions
Always bound with the indexed text | Compiled separately

54

Newspaper Index

• an alphabetical list of names, titles of works, and/or subjects of works which are discussed in the news items, columns, and articles of one newspaper title or several titles
55

Newspaper Index

• Problems

‣ Vocabulary control - a newspaper may contain names, places or subjects that may not occur again

‣ Some stories are added, dropped or shifted to another page

56

Computerized Index to Philippine Periodical Articles

57
Audiovisual Material Index

• alphabetical list of topics and names which refer to images found in audiovisual materials

• Textual labelling is needed (index terms or descriptive narrative)
58
By Type or Form of Material Indexed

• Other
- Poetry
- Fiction
- Short Stories
- Illustrations
- Pictures / Paintings
- Artifacts
- Software
- Computer-readable text
- Maps
- Sound Recordings

Index to short stories
- may contain author, title, and subject

Index to plays
- may contain author, title, subject
- may also include list of cast or cast analysis

Index to poems
- may include first line, author, subject and title
- provides a guide to locating a poem in a particular anthology

Index to essays
- may contain author, title, and subject

59
Indexing scheme for fiction
and other imaginative works
• Subject matter

‣ Action and course of events

‣ Psychological development and description

‣ Social patterns

• Frame

‣ Time: past, present, future

‣ Place: geographical, social environment, profession


60
Indexing scheme for fiction
and other imaginative works
• Author’s intention

‣ Emotional experience

‣ Cognition and information

• Accessibility

‣ Readability

‣ Physical characteristics

‣ Literary form

61
Indexes by medium/physical form
• printed or written (indexer’s marking on the item, use of bibliographic worksheets)

• microform

• electronic media including online, CD-ROM

• braille

• Card index (card catalog)

• Computerized index

‣ Automatic indexing - uses a computer to construct indexes

‣ Computer-assisted - a human does the intellectual task while the computer does the mundane work

62
Other types of index

• Indexes by type of object referred to

- authors - all types of document creators, such as writers, composers, illustrators, translators, editors, choreographers, artists, sculptors, painters, inventors

- subjects - topics treated in documents and/or features of documentary units (for example genre, format, methodological approach)
63

Other types of index


• Indexes by type of term used for headings

- name - proper nouns, such as names of persons, places, and/or corporate bodies

- number or notations - numerical or coded designations, such as classification notation, patent number, ISBN, date.

- words and phrases - common words and phrases (as opposed to names or proper nouns)

64

Other types of index


• Indexes by type or extent of indexable matter on which an index is based

- full text of documents

- abstracts

- titles only

- first lines only (ex. first lines of poems)

- citations - reference to other works; consists of a list of articles, with a sublist under each article of subsequently published papers that cite the article

65

Other types of index


• Indexes by method of document analysis

- human intellectual analysis and identification of topics and concepts expressed and/or features manifested

- computer algorithms designed to identify useful terms, phrases, or features

- combination of computer-based and human analysis


66

Other types of index

• Indexes by method of term selection

- assignment of terms to represent topics and features (whether or not the term is in the documentary unit being indexed)

- extraction of terms from the documentary unit

- combination of assignment and extraction

67

Other types of index


• Indexes by method of term coordination

- pre-coordinated combination (ex. subject heading indexes, string indexes, chain indexes, keyword indexes (including KWIC, KWOC, KWAC), rotated and permuted indexes)

- post-coordinated combination (includes the use of Boolean operators, proximity measures, and combinations of weighted terms)
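
Keyword indexes of the KWIC family are built by rotating a title so that each significant word comes to the front in turn. The sketch below is a simplified illustration in that spirit, not a faithful reproduction of any particular KWIC format: the stop-word list, the two sample titles and the "/" end-of-title marker are assumptions made for the example.

```python
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}  # assumed list


def kwic(titles):
    """Return (keyword, rotated title) pairs sorted by keyword."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue  # insignificant words do not become entry points
            # Rotate the title so the keyword comes first; "/" marks the
            # original end of the title.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    return sorted(entries)


for keyword, line in kwic(["Indexing of Fiction", "Principles of Indexing"]):
    print(f"{keyword:<12}{line}")
```

Each title appears once for every significant word it contains, so the terms are combined before the search takes place, which is what makes this a pre-coordinated approach.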

68

Other types of index

• Indexes by proximity of documentary units

- indexes published together with the documentary units (ex. back-of-the-book index, full-text databases)

- indexes published separately from the documentary units
69
Other types of index

• Indexes by periodicity of the index

- one-time, closed-end indexes

- continuing, open-end indexes

• Indexes by authorship

- authored - created by one or more persons through intellectual analysis of text

- automatically generated - created electronically through algorithmic analysis of text

70

Principles of Indexing

71

Principle of Indexing

• Exhaustivity

- Summarization

- Depth indexing

• Specificity

• Consistency
72
Exhaustivity

• the extent to which concepts or topics are represented by means of broad and specific index terms
• implies that all possible terms have been exhausted for a particular document
• an item that is exhaustively indexed is more likely to be discovered because of the wider range of subject terms
• means that every facet of the item is described
• refers to the detail with which topics of a documentary unit are analysed and described

Summarization
- identifies only the dominant or general subject that would describe the item

Depth Indexing
- extracts all the main concepts described within a document, recognizing many subthemes and subtopics

Note: In indexing, two scenarios may occur: (1) too much is indexed, or (2) too little is indexed. Indexing too much is considered the better of the two; it increases the triviality level. Exhaustive indexing results in high recall but low precision.

73
Specificity

• extent to which a topic is identified by a precise term in a hierarchical tree

• the precision with which we describe the document

• a more specific term means a more precise result

• refers to the closeness of fit between index terms and the topics they represent

Note: Specificity results in low recall but high precision.
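
The slides on exhaustivity and specificity refer to recall and precision without defining them. Under the usual definitions (recall is the share of relevant documents that are retrieved; precision is the share of retrieved documents that are relevant), the trade-off can be checked with a small calculation. The document identifiers and result sets below are invented purely for illustration.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for two sets of document identifiers."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


relevant = {"d1", "d2", "d3", "d4"}

# Broad, exhaustive indexing: many documents match, including irrelevant ones.
exhaustive = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
# Narrow, specific indexing: few documents match, but all of them are relevant.
specific = {"d1", "d2"}

print(recall_precision(exhaustive, relevant))  # (1.0, 0.5)
print(recall_precision(specific, relevant))    # (0.5, 1.0)
```

The first result shows the high recall and low precision associated with exhaustive indexing; the second shows the opposite pattern associated with specificity.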

74
Consistency

• the extent of agreement on the terms to be used in indexing a document

• two types of consistency

1. inter-indexer consistency - the extent of agreement among different indexers regarding the indexing terms to be used

2. intra-indexer consistency - the extent to which an indexer is consistent with himself or herself in assigning index terms
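
The slide does not say how agreement is measured. One simple possibility, used here only as an assumption, is an overlap ratio: the number of terms both indexers assigned divided by the number of distinct terms either of them assigned. The term lists are invented for the example.

```python
def consistency(terms_a, terms_b):
    """Shared terms divided by all distinct terms assigned (0.0 to 1.0)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # two empty term lists are treated as full agreement
    return len(a & b) / len(a | b)


# Hypothetical terms assigned to the same document by two indexers.
indexer_1 = ["indexing", "thesauri", "recall"]
indexer_2 = ["indexing", "thesauri", "precision", "vocabulary"]
print(consistency(indexer_1, indexer_2))  # 0.4
```

The same function could be applied to one indexer's terms for the same document on two different occasions to estimate intra-indexer consistency.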
75

Indexing method

• Derived Indexing - descriptors are extracted from the text or document itself. Also known as free-text indexing or natural language indexing

• Assigned Indexing - descriptors from the text or document are translated into standard index terms using a standard authority or controlled vocabulary.
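
The difference between the two methods can be sketched with one title indexed both ways. Everything in the example is an assumption made for illustration: the stop-word list, the sample title, and the small authority list (including the choice of which term counts as the preferred one).

```python
import re

STOP_WORDS = {"a", "and", "for", "in", "of", "on", "the", "to"}  # assumed list

# Hypothetical authority list mapping natural-language words to preferred terms.
CONTROLLED_VOCABULARY = {
    "abaca": "Manila hemp",
    "metre": "Meter",
    "un": "United Nations",
}


def derived_terms(text):
    """Derived indexing: take descriptors directly from the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if w not in STOP_WORDS})


def assigned_terms(text):
    """Assigned indexing: translate words into controlled index terms."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({CONTROLLED_VOCABULARY[w] for w in words if w in CONTROLLED_VOCABULARY})


title = "UN report on the abaca trade"
print(derived_terms(title))   # ['abaca', 'report', 'trade', 'un']
print(assigned_terms(title))  # ['Manila hemp', 'United Nations']
```

Derived indexing keeps the author's own wording, while assigned indexing yields fewer terms but in a standard form that every indexer and searcher can share.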

76

Indexing language

77

Indexing language

• refers to any vocabulary, controlled or uncontrolled, used for indexing together with the rules of usage

• contains mechanisms for structuring and using those terms

• may also refer to the system for naming or identifying subjects contained in a document
78
Indexing language: purpose

• Minimize the ambiguity of isolated vocabulary terms that may be totally out of context

• Reduce the obscurity and redundancy of a general vocabulary

• Indexing languages do not reduce the effectiveness of the user’s personal vocabulary

79

Indexing language: types

• Derived-term - all descriptors are taken from the document itself. May also be called indexing by extraction

• Assigned-term - the indexer assigns terms or descriptors, based on a controlled vocabulary, on the basis of a subjective interpretation of the concepts implied in the document.

80
Indexing language: features

• Vocabulary - a list of terms, arranged alphabetically or in a classified manner, that is selected for use in indexing

• Syntax - the combination and modification of terms to form simple or multilevel index terms

• Semantics - refers to the permanent relations between index terms; class relations among index terms

Note: Syntax is concerned with clarity of expression and efficient, unambiguous communication.
Syntactic relationships:
- Order of sequence (ex. Philippines, Republic of / Republic of the Philippines)
- Use of markers, which may take the form of inflections or prepositions (ex. Bird’s Nest / Nest of a bird)
81
Indexing language: features

• Three types of semantic relationship

1. Equivalence Relationship - use of a term denoting the same concept (USE)
- synonyms (ex. Feminism <-> Women’s Rights)
- quasi-synonyms (ex. Economics <-> Cost and Financing)
- preferred spelling (ex. Meter <-> Metre)
- acronyms and abbreviations (ex. UN <-> United Nations)
- current and established terms (ex. Third world <-> Developing countries)
- translations (ex. Manila hemp <-> Abaca)

Note: Quasi-synonyms are words that seemingly have the same meaning.

82
Indexing language: features

• Three types of semantic relationship

2. Hierarchical Relationship - tree type of relationship between terms
- genus/species (ex. Agroindustry -> Food Industry -> Meat Industry)
- whole/part (ex. Hand -> Thumb)

83
Indexing language: features

• Three types of semantic relationship

3. Associative or non-hierarchical relationship
- denotes a relationship of terms where one term can be related or associated with other terms primarily because of experience. Can be identified by the label “Related Terms”
- (ex. Men-Women, Education-Teaching, Maintenance-Repair)
84
Types of indexing language

• Natural/Free language
‣ use of terms used by humans. In this indexing language the terms or keywords are extracted from the text or document. Commonly used in derived indexing or free-text indexing
‣ Tends to improve recall by providing more access points but reduces precision
‣ Greater level of redundancy
‣ Uses more current terms
‣ Tends to be favored by subject specialists or end users

Note: Controlled vocabulary - in this indexing language only an approved list of words can be used as index terms. It is also used to manage synonyms and near-synonyms and to bring together semantically related terms. Types:
- Classification schemes - generally hierarchical with secondary alphabetical indexes
- Subject headings - Library of Congress Subject Headings (LCSH), Sears List of Subject Headings, MeSH
- Thesaurus - provides for the controlled vocabulary of one subject or field of interest, presents equivalence of terms, homographs, hierarchies and affinities, aims to promote consistency in indexing

85
Types of indexing language

• Controlled vocabulary

‣ in this indexing language only an approved list of words can be used as index terms. It is also used to manage synonyms and near-synonyms and to bring together semantically related terms

‣ Authority list that enables an indexer to establish a standard description for each concept

86
Types of indexing language

• Controlled vocabulary

‣ Functions
- To control synonyms by choosing one form as the
standard term
- To make distinctions among homographs
- To bring or link together closely related terms
- Establishes its size or scope
- Records hierarchical and affinitative/associative
relations
- Controls variant spellings
87
Types of indexing language

• Controlled vocabulary

‣ Syndetic devices used in a controlled vocabulary

- USE and UF for synonyms

❖ USE indicates that another term is to be used in preference

❖ UF (Used For) indicates that a term is used instead of another

88
Types of indexing language

• Controlled vocabulary

‣ Syndetic devices used in a controlled vocabulary

❖ BT, NT, RT references for differing levels of specificity and certain near-synonyms and antonyms

❖ Parenthetical qualifiers to resolve semantic ambiguity
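
The syndetic devices USE, UF, BT (broader term), NT (narrower term) and RT (related term) can be pictured as labelled links between thesaurus records. The sketch below is a minimal illustration: the record layout, the function names, the choice of "Developing countries" as the preferred term and the reading of Agroindustry, Food Industry and Meat Industry as a three-level hierarchy are all assumptions.

```python
# Hypothetical thesaurus records using the reference codes named above.
THESAURUS = {
    "Developing countries": {"UF": ["Third world"]},
    "Third world": {"USE": "Developing countries"},
    "Food Industry": {
        "BT": ["Agroindustry"],
        "NT": ["Meat Industry"],
        "RT": [],
    },
}


def preferred_term(term):
    """Follow a USE reference to the preferred term, if there is one."""
    record = THESAURUS.get(term, {})
    return record.get("USE", term)


def expand(term):
    """Return the preferred term plus its narrower terms for a wider search."""
    term = preferred_term(term)
    return [term] + THESAURUS.get(term, {}).get("NT", [])


print(preferred_term("Third world"))  # Developing countries
print(expand("Food Industry"))        # ['Food Industry', 'Meat Industry']
```

A searcher who enters a non-preferred term is redirected by USE, and the NT links let a search on a broad heading pick up material indexed under its narrower terms.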

89
Types of indexing language

• Controlled vocabulary

‣ Advantages
- Ensures inter-indexer consistency
- Indexers and users are led to the desired topic by syndetic features
- Helps searchers to focus their thoughts when they approach the information system without a full and precise realization of what information they need
90
Types of indexing language

• Controlled vocabulary

‣ Disadvantages
- High input cost
- Human error in interpreting a document’s subject
matter
- Incompatibility with different indexing languages
- Possibility of out-of-date vocabulary
- Possibility of inadequate vocabulary

91
Types of indexing language

• Controlled vocabulary
‣ Types of controlled vocabulary
- Subject Heading List
❖ Follows an alphabetical arrangement of terms
❖ Covers a broad area of knowledge
❖ Ex. Library of Congress Subject Headings (LCSH), Sears List of Subject Headings, MeSH

92
Types of indexing language
• Controlled vocabulary
‣ Types of controlled vocabulary
- Thesaurus
❖ Latin word that means treasure
❖ Used to control indexing vocabulary in one subject or field of interest
❖ Provides for the controlled vocabulary of one subject or field of interest, presents equivalence of terms, homographs, hierarchies and affinities, aims to promote consistency in indexing
93
Indexing systems
• Coordinate Indexes
- Pre-coordinate indexes
- Post-coordinate indexes

Coordinate Indexes
* Coordinate indexes allow terms to be combined or coordinated.
* Modern-era coordinate indexes began with the idea of punching or notching a card and using a mechanical device, such as long needles, to drop out cards containing the combination of index terms of interest.
* Combining two or more single index terms to create a new class creates a coordinate index.
* Ex. Pecan and Trees -> Pecan Trees
Pre-coordinate indexes
* A pre-coordinate index is an index type in which search terms are combined prior to searching. The combination is not under the control of the user.
Post-coordinate indexes
* A post-coordinate index is an index in which search terms are combined at the time of searching by the user.
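
The post-coordinate idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `postings` data and the function name `post_coordinate_search` are hypothetical and do not come from the slides. Each single index term keeps the set of documents it was assigned to, and the user coordinates the terms at search time by intersecting those sets.

```python
# Hypothetical postings file: each single index term maps to the set of
# document numbers that were indexed under it (illustrative data only).
postings = {
    "pecan": {1, 2, 5},
    "trees": {2, 3, 5, 8},
    "diseases": {5, 8},
}


def post_coordinate_search(postings, *terms):
    """Return the documents indexed under every one of the given terms.

    The combination of terms is chosen by the searcher at search time,
    which is what makes the index post-coordinate.
    """
    sets = [postings.get(term, set()) for term in terms]
    if not sets:
        return set()
    return set.intersection(*sets)


# The searcher coordinates "pecan" AND "trees" at search time.
print(post_coordinate_search(postings, "pecan", "trees"))
```

In a pre-coordinate index the same combination would instead be fixed by the indexer in advance, for example as a single ready-made heading such as "Pecan Trees".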
94
Indexing Systems
• Classified indexes
- Enumerative indexes
- Faceted indexes

Classified indexes
* A classified index has its contents arranged systematically by classes or subject headings.
* Classification is based on an existing classification scheme such as DDC, LCSH, etc.

Enumerative indexes are indexes in which all the elements are named and placed in a fixed relationship prior to use.

Faceted indexes are based on any definable aspect that makes up a subject. Composite concepts are then created to provide access to each facet, or notation, contained in the subject composite.

95
Indexing Systems
• Classified indexes
- Enumerative indexes
❖ An index in which all the elements are named and placed in a fixed relationship prior to use

96
Indexing Systems
• Classified indexes
- Faceted indexes
❖ Based on any definable aspect that makes up a subject. Composite concepts are then created to provide access to each facet, or notation, contained in the subject composite
❖ A type of synthetic classification, also called an analytico-synthetic system
❖ Pre-coordinated at the time of indexing and arranged in classification order rather than a straight alphabetical order
97

Indexing Systems
• Chain indexes

- A technique for constructing an organized set of entries for an alphabetical subject index of a classified catalog

- Provides that every concept becomes linked or chained to its directly related concept in the hierarchy

- Alphabetically arranged indexes with a separately provided entry for each term or link, for all terms used in a classification or subject heading scheme

- Developed by S.R. Ranganathan

98
Chain indexing: example
Using DDC

214.8 -

200 - RELIGION

210 - NATURAL RELIGION

214 - THEODICY

214.8 - PROVIDENCE

http://www.vanbellenet.be/mywikis/chain_indexing.html#Instructions
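
The chain procedure for the DDC example above can be sketched as follows. This is a minimal illustration, assuming the usual chain-indexing convention of reading the hierarchy upward from each link; the function name `chain_index_entries` and the `Caption: Caption notation` entry layout are assumptions made for the sketch, not a prescribed format.

```python
def chain_index_entries(chain):
    """Build chain index entries from a classification hierarchy.

    chain: list of (notation, caption) pairs ordered from the broadest
    class to the most specific one.
    """
    entries = []
    for depth in range(len(chain), 0, -1):
        notation = chain[depth - 1][0]
        # Read the captions upward, from the current link to the top.
        captions = [caption for _, caption in reversed(chain[:depth])]
        entries.append(f"{': '.join(captions)} {notation}")
    # The finished index is filed alphabetically.
    return sorted(entries)


ddc_chain = [
    ("200", "Religion"),
    ("210", "Natural religion"),
    ("214", "Theodicy"),
    ("214.8", "Providence"),
]

for entry in chain_index_entries(ddc_chain):
    print(entry)
```

Each link of the chain yields one entry, so a searcher who starts from any level of the hierarchy is led to the class number.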

99

Indexing Systems
• Permuted Title Indexes

- Rely heavily on the titles of the documents

- Work effectively only if the title of the document is highly specific and expressive

- A stoplist is constructed, which contains words that are of no value for indexing

- KWIC, KWAC, KWOC


100 Permuted Title Indexes: example

101
Indexing Systems
• KWIC
- Keyword in Context
- Introduced by Hans Peter Luhn in 1959
- Rotated index commonly derived from the titles of documents
- Arranged alphabetically down the centre, with the surrounding title or text segment preserved on both sides

KWIC indexing employs the following principles:
1. Titles are informative
2. Words extracted from the title can be used to effectively guide users to the source of information
3. Although the meaning of a word viewed in isolation may be ambiguous or too general, the context surrounding the word helps to define and explain its meaning
102

KWIC: example

• Title: Classification of Books in a University Library

Classification of BOOKS in a University Library


University Library CLASSIFICATION of Books in a
in a University LIBRARY/Classification of Books
of Books in a UNIVERSITY LIBRARY/Classification
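
A simplified KWIC generator can be sketched in Python. The stoplist contents, the column width and the function name `kwic_entries` are assumptions for illustration. Unlike the printed example above, this sketch does not wrap the end of the title around to the left margin; it only lines the keywords up in one column and files the lines by keyword.

```python
# Hypothetical stoplist: words of no value for indexing.
STOPLIST = {"a", "an", "and", "for", "in", "of", "the", "their"}


def kwic_entries(title, stoplist=STOPLIST, width=30):
    """Return one KWIC line per keyword, with keywords aligned in a column."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stoplist:
            continue
        left = " ".join(words[:i])  # context before the keyword
        right = " ".join([word.upper()] + words[i + 1:])  # keyword and what follows
        # Right-justify the left context so every keyword starts at the same column.
        entries.append((word.lower(), f"{left:>{width}} {right}"))
    entries.sort(key=lambda pair: pair[0])  # file alphabetically by keyword
    return [line for _, line in entries]


for line in kwic_entries("Classification of Books in a University Library"):
    print(line)
```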

103

KWIC: example

• Create a KWIC entry for the following

‣ Title: The Cat and the Fiddle

104

KWIC: example

• Create a KWIC entry for the following

‣ Title: The Cat and the Fiddle

‣ Answer:

The Cat and the Fiddle


the Fiddle/ The Cat and
105

KWIC: example

• Create a KWIC entry for the following

‣ Title: Dogs and Cats and their Diseases

106

KWIC: example
• Create a KWIC entry for the following

‣ Title: Dogs and Cats and their Diseases

‣ Answer:

Dogs and Cats and their Diseases


their Diseases/ Dogs and Cats and
their Diseases Dogs and Cats and

107

Indexing Systems

• KWAC

- Keyword Alongside Context

- provides for the enrichment of the keyword of the title


108

KWAC: example

• Title: Cataloging and classification for Croatians

Cataloging and classification for Croatians.


classification for Croatians. Cataloging and
Croatians . Cataloging and classification for

109

KWAC: example

• Create a KWAC entry for the following

‣ Title: The Cat and the Fiddle

110

KWAC: example

• Create a KWAC entry for the following

‣ Title: The Cat and the Fiddle

‣ Answer:

Cat and the Fiddle. The


Fiddle. The Cat and the
111

KWAC: example

• Create a KWAC entry for the following

‣ Title: Dogs and Cats and their Diseases

112

KWAC: example
• Create a KWAC entry for the following

‣ Title: Dogs and Cats and their Diseases

‣ Answer:

Cats and their Diseases. Dogs and


Diseases. Dogs and Cats and their
Dogs and Cats and their Diseases.

113

Indexing Systems

• KWOC

- Keyword out of Context

- The keyword or access point is shifted to the extreme left, to its normal place at the beginning of the line, and is followed by the complete title
114

KWOC: example

• Title: Cataloging and classification for Croatians

Cataloging Cataloging and classification for Croatians

classification Cataloging and classification for Croatians

Croatians Cataloging and classification for Croatians
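
KWAC and KWOC entries of the kind shown in these examples can be produced with a short Python sketch. The stoplist and the function names `kwac_entries` and `kwoc_entries` are assumptions for illustration only; the entry layout follows the examples on these slides (rotated title with a full stop marking the original end for KWAC, keyword followed by the complete title for KWOC).

```python
import string

# Hypothetical stoplist: words of no value for indexing.
STOPLIST = {"a", "an", "and", "for", "in", "of", "the", "their"}


def _is_keyword(word, stoplist):
    """A word is a keyword when, stripped of punctuation, it is not a stopword."""
    return word.strip(string.punctuation).lower() not in stoplist


def kwac_entries(title, stoplist=STOPLIST):
    """Rotate the title so each keyword leads; a full stop marks the original end."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if _is_keyword(word, stoplist):
            rotated = " ".join(words[i:]) + ". " + " ".join(words[:i])
            entries.append(rotated.strip())
    return sorted(entries, key=str.lower)


def kwoc_entries(title, stoplist=STOPLIST):
    """Pair each keyword, pulled out to the left, with the complete title."""
    keywords = [w.strip(string.punctuation) for w in title.split() if _is_keyword(w, stoplist)]
    return sorted(((k, title) for k in keywords), key=lambda entry: entry[0].lower())


for entry in kwac_entries("Cataloging and classification for Croatians"):
    print(entry)
for keyword, full_title in kwoc_entries("Cataloging and classification for Croatians"):
    print(keyword, "|", full_title)
```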

115

KWOC: example

• Create a KWOC entry for the following

‣ Title: The Cat and the Fiddle

116

KWOC: example

• Create a KWOC entry for the following

‣ Title: The Cat and the Fiddle

‣ Answer:

Cat The Cat and the Fiddle


Fiddle The Cat and the Fiddle
117

KWOC: example

• Create a KWOC entry for the following

‣ Title: Dogs and Cats and their Diseases

118

KWOC: example
• Create a KWOC entry for the following

‣ Title: Dogs and Cats and their Diseases

‣ Answer:

Cats Dogs and Cats and their Diseases


Diseases Dogs and Cats and their Diseases
Dogs Dogs and Cats and their Diseases

119

Exercises
120
Create KWIC, KWAC and KWOC entry for the following titles
1. The Lord of the Rings

KWIC

The Lord of the Rings


the Rings/ The Lord of
KWAC

Lord of the Rings. The


Rings. The Lord of the

KWOC

Lord The Lord of the Rings


Rings The Lord of the Rings

121
Create KWIC, KWAC and KWOC entry for the following titles
2. The Catcher in the Rye

KWIC

The Catcher in the Rye


the Rye/ The Catcher in
KWAC

Catcher in the Rye. The


Rye. The Catcher in the

KWOC

Catcher The Catcher in the Rye


Rye The Catcher in the Rye

122

Create KWIC, KWAC and KWOC entry for the following titles
3. The Great Gatsby
KWIC

Great Gatsby/ The


The Great Gatsby
KWAC

Gatsby. The Great


Great Gatsby. The

KWOC

Gatsby The Great Gatsby


Great The Great Gatsby
123
Create KWIC, KWAC and KWOC entry for the following titles
4. The Lion, the Witch and the Wardrobe

KWIC
The Lion, the Witch and the Wardrobe
the Wardrobe/ The Lion, the Witch and
the Witch and the Wardrobe/ The Lion,

KWAC
Lion, the Witch and the Wardrobe. The
Wardrobe. The Lion, the Witch and the
Witch and the Wardrobe. The Lion, the

KWOC
Lion The Lion, the Witch and the Wardrobe
Wardrobe The Lion, the Witch and the Wardrobe
Witch The Lion, the Witch and the Wardrobe

124
Create KWIC, KWAC and KWOC entry for the following titles
5. Adventures of Huckleberry Finn
KWIC
Huckleberry Finn Adventures of
Huckleberry Finn/ Adventures of
Adventures of Huckleberry Finn
of Huckleberry Finn / Adventures

KWAC
Adventures of Huckleberry Finn
Finn. Adventures of Huckleberry
Huckleberry Finn. Adventures of
Huckleberry Finn. Adventures of

KWOC
Adventures Adventures of Huckleberry Finn
Finn Adventures of Huckleberry Finn
Huckleberry Adventures of Huckleberry Finn
Huckleberry Finn Adventures of Huckleberry Finn

125

Indexing Systems
• Citation Indexes

- Their development can be traced back more than 100 years to Shepard’s Citations, an account of legal decisions and legal citations created by legal professionals

- Developed by Eugene Garfield for bibliographic purposes and as a research tool for studying the behavioral characteristics of the literature
126

Indexing Systems

• Citation Indexes

- An ordered list of cited articles along with a list of citing articles

- The cited article serves as the reference and the citing article as the source
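
The structure of a citation index can be sketched as an inversion of reference lists. The article labels and the function name `build_citation_index` below are hypothetical placeholders, not real publications or an established API: each source (citing) article lists its references, and the index turns that around so that each cited article leads to its citing sources.

```python
from collections import defaultdict


def build_citation_index(reference_lists):
    """Invert reference lists into a citation index.

    reference_lists maps each source (citing) article to the list of
    articles it cites. The result maps each cited article to the sorted
    list of sources that cite it.
    """
    index = defaultdict(list)
    for source, references in reference_lists.items():
        for cited in references:
            index[cited].append(source)
    # Order the cited articles and, under each one, its citing sources.
    return {cited: sorted(sources) for cited, sources in sorted(index.items())}


# Placeholder data for illustration only.
reference_lists = {
    "Article C": ["Article A", "Article B"],
    "Article D": ["Article A"],
}

for cited, sources in build_citation_index(reference_lists).items():
    print(cited, "<-", ", ".join(sources))
```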

127
Indexing Systems
• String Indexes
‣ PRECIS
‣ POPSI
‣ NEPHIS
‣ CIFT

String indexing dates back to the works of Farradane, Ranganathan, Cutter and other classificationists. It aims to display a series of rotating index entries from a basic list of index terms that make up a string.
PRECIS or Preserved Context Index System. This system was developed by Derek Austin in 1968 as an indexing system for the British National Bibliography. In this system, indexers write the subject as a string of terms forming a title-like statement. A role operator is allocated to each term to represent term relationships.
PRECIS has the following important features:
The system derives headings that are co-extensive with the subject at all access points.
It is not bound to any classification scheme.
The terms are context dependent in nature, which enables the users to identify the entries correctly.
The entries are generated automatically by the computer.
It also provides an adequate arrangement of references between semantically related terms.
It is a flexible system, as it is able to incorporate newly emerging terms accordingly.
It has introduced the PRECIS table, which puts forth a set pattern for the preparation of entries, thus bringing about consistency in work.
128
POPSI
POPSI (Postulate-based Permuted Subject Indexing) was developed by G. Bhattacharya at the Documentation Research and Training Centre in Bangalore, India. It derives its postulates from Ranganathan’s theories of classification.

POPSI is specifically based on:
a. A set of postulated Elementary Categories of the elements fit to form components of a subject proposition
Elementary Categories:
Discipline (D) - covers conventional fields of study, e.g. Chemistry, Physics, etc.
Entity (E) - e.g. Plant, Lens, Eye, Book, etc.
Action (A) - e.g. Treatment, Migration, etc.
Property (P) - includes ideas denoting the concept of ‘attribute’, qualitative or quantitative, e.g. Power, Capacity, Property, etc.
b. A set of rules of syntax with reference to the Elementary Categories (based on Ranganathan’s general theory of classification)
c. A set of indicator digits or notations to denote the Elementary Categories and their subdivisions
d. A vocabulary control device designated as “Classaurus”

129
NEPHIS
NEPHIS (Nested Phrase Indexing System) was developed by Timothy Craven to provide a simple way of generating strings from which index entries could be generated by a computer.

Symbols used in NEPHIS:
@ sign to mark a term not to be used as an access point
? question mark to introduce a connective
< > left and right angular brackets mark the beginning and end of a phrase embedded or nested within a larger phrase

130
PRECIS
• Preserved Context Index System

• This system was developed by Derek Austin in 1968 as an indexing system for the British National Bibliography.

• In this system, indexers write the subject as a string of terms forming a title-like statement. A role operator is allocated to each term to represent term relationships.
131
PRECIS
• Preserved Context Index involves:

‣ Determining the subject content of the document
‣ Analyzing the subject statement to determine the role of each significant term
‣ Having the computer manipulate the coded string to create the index entries
‣ Determining the relationship of a term to other terms in the database and how all these terms should be linked

132
PRECIS
• Important features:
‣ The system derives headings that are co-extensive with the subject at all access points.
‣ It is not bound to any classification scheme.
‣ The terms are context dependent in nature, which enables the users to identify the entries correctly.
‣ The entries are generated automatically by the computer.
‣ It also provides an adequate arrangement of references between semantically related terms.
‣ It is a flexible system, as it is able to incorporate newly emerging terms accordingly.
‣ It has introduced the PRECIS table, which puts forth a set pattern for the preparation of entries, thus bringing about consistency in work.

133
PRECIS

Primary Operators
Environment of core concepts
• 0 - Locations
Core concepts
• 1 - Key system; Object of transitive action; Agent of intransitive action
• 2 - Action; Effect of action
• 3 - Performer of transitive action
Extra-core concepts
• 4 - Viewpoint-as-form
• 5 - Selected instance, e.g. study region, sample population
• 6 - Form of document; target user

Secondary Operators
Coordinate concepts
• f - "Bound" coordinate concept
• g - Standard coordinate concept
Dependent elements
• p - Part; property
• q - Member of quasi-generic group
• r - Assembly
Special class of action
• s - Role identifier
• t - Author-attributed association
• u - Two-way interaction

Primary Codes
Theme interlinks
• $x - 1st concept in coordinate theme
• $y - 2nd/subsequent concept in theme
• $z - Common concept
Term codes
• $a - Common concept
• $c - Proper name
• $d - Place name
134

PRECIS: example

Example: Computerisation of libraries in India


(0) India
(1) Libraries
(2) Computerisation

1. INDIA
Libraries. Computerisation

2. LIBRARIES India
Computerisation

3. COMPUTERISATION Libraries. India
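
The way the three entries above are derived from the string can be sketched in Python. This is a simplified illustration that assumes a plain string of primary operators in order (0, 1, 2) and ignores the special handling that other role operators and codes trigger in full PRECIS; the function name `precis_entries` and the tuple layout (lead, qualifier, display) are assumptions made for the sketch.

```python
def precis_entries(terms):
    """Generate one entry per term from a PRECIS-style concept string.

    terms: the concept string in operator order, widest context first,
    e.g. ["India", "Libraries", "Computerisation"].
    Returns (lead, qualifier, display) tuples.
    """
    entries = []
    for i, lead in enumerate(terms):
        # Earlier terms move to the qualifier, nearest context first.
        qualifier = ". ".join(reversed(terms[:i]))
        # Later terms stay in string order in the display line.
        display = ". ".join(terms[i + 1:])
        entries.append((lead.upper(), qualifier, display))
    return entries


for lead, qualifier, display in precis_entries(["India", "Libraries", "Computerisation"]):
    print(lead, qualifier)
    if display:
        print("   ", display)
```

Each term takes its turn as the lead, while the rest of the string is kept around it so that the context is preserved in every entry.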

135

PRECIS: example

• Create a PRECIS entry for the following

‣ Title: Selection of personnel in paper industries in the Philippines

136

PRECIS: example
• Create a PRECIS entry for the following

‣ Title: Selection of personnel in paper industries in the Philippines

‣ Answer:

(0) Philippines
(1) paper industries
(p) personnel
(2) selection
137

Indexing Systems
• CIFT

- Contextual Indexing and Faceted Taxonomic Access System

- Developed for the Modern Language Association by J.D. Anderson

- Alphabetical entries are created from strings provided by indexers, who assign facets derived from literature, linguistics, and folklore

- Has a set of 21 facets for the description and classification of art literature, and these are provided in a worksheet

138
Factors that affect index
quality

• Subject knowledge

• Knowledge of user needs

• Experience

• Concentration

• Reading ability and comprehension

139
Factors for selecting title to
be indexed

• Usefulness

• Subject coverage or content

• Class and range of readership

• Availability in most libraries

• Being indexed in other indexing service


140

Subject indexing
• A process by which the subject matter/content of a document is represented in an index

• Steps

1. Determine the “aboutness” of the document

2. Subject analysis - also called conceptual analysis. Required to decide which of an item’s aspects should be represented in the bibliographic record

3. Translation - converting concepts derived from the document into a particular set of index terms

141
Subject indexing process
1. Recording bibliographic data

‣ Important details of the document are recorded (author, title, publication data, etc.)

2. Subject or conceptual analysis

• Decides what the document is about

• Parts that need to be analyzed:

• Title

• Abstract

• Text

• Reference Section

142
Subject indexing process
3. Subject Determination

• Determine the aboutness of the material

• Formulation of a concept list

4. Translation to standard or controlled vocabulary terms

‣ Conversion of concepts into a list of acceptable index terms

‣ Or, if a controlled vocabulary is being used, use the authority list to convert concept terms

5. Generate index entries

‣ Index entries may be generated manually or using a computer.

‣ Determine the arrangement


143

Evaluation of the
Index

144
Measure of effectiveness of an indexing system
• Recall
- the simple quantitative ratio of relevant documents retrieved to the total number of relevant documents available in the system
• Precision
- the ratio of relevant documents retrieved to the total number of documents retrieved

Recall example:
If the library has 1,000 relevant documents and the indexing system was able to retrieve 700, the recall ratio is 700 out of 1,000 (700/1000). Recall for this search is 70 percent.
Precision example:
If the indexing system retrieves 500 documents and 450 of these items are relevant, the precision ratio is 450 out of 500 (450/500). Precision in this search is 90 percent.
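
The two ratios can be checked with a few lines of Python, using the figures from the examples above. The function names `recall` and `precision` are chosen for this sketch only.

```python
def recall(relevant_retrieved, total_relevant):
    """Share of all relevant documents in the system that the search retrieved."""
    return relevant_retrieved / total_relevant


def precision(relevant_retrieved, total_retrieved):
    """Share of the retrieved documents that are actually relevant."""
    return relevant_retrieved / total_retrieved


# Recall example: 700 of the 1,000 relevant documents were retrieved.
print(f"Recall: {recall(700, 1000):.0%}")
# Precision example: 450 of the 500 retrieved documents are relevant.
print(f"Precision: {precision(450, 500):.0%}")
```

The two measures use the same numerator (relevant documents retrieved) but different denominators, which is why a search can score well on one and poorly on the other.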
145
Guidelines/Criteria for
Determining the Quality of Index

• Subject error
• Generic searching
• Terminology
• Internal guidelines
• Cross references
• Accuracy in referring
• Entry scattering
• Entry determination
• Spelling and punctuation
• Filing
• Layout
• Cost
• Standards

146

Index process

147
Indexing Books
1. Examine the text carefully
2. Read the text several times, page by page, to be able to analyze the contents and determine the indexable topics
3. Select the topics to be indexed
4. Name the topics that were chosen to be indexed
5. Alphabetize the entries
6. Edit entries
7. Determine the design of the index after compilation of the entries
8. Typing, proofreading, and final review

3. Take into consideration the significance of each topic to the central topic of the book
148
Techniques in Indexing
Periodical Articles

1. Indexable topics
• Names of persons honored by awards or prizes
• Persons eulogized in obituaries
• Sports events
• Editorials
• Economic news
• Letters to the editors
• Social trends
• Local politics

149
Techniques in Indexing
Periodical Articles

2. Articles with permanent value
• These should be indexed under all topics and issues dealt with
3. Editorials must be indexed under the topic. Add (Ed.) / (E) to differentiate them from other articles. Index titles of editorials under a collective heading “Editorials”
4. Letters to the editor must be indexed under the topic, not under the caption assigned by the editor.

4. Index the name of the person who criticized the article and the author’s response

150

The Thesaurus
151

Thesaurus

• An organised list of terms from a specialised vocabulary, arranged to facilitate the selection of index terms as well as search terms

• An authority list that helps to encode the documents’ subjects at the input stage (index terms) and also at the searching stage (search terms)

152
Thesauri vs Subject
heading list

• Similarities

‣ Both provide subject access to information by providing consistent terminology

‣ Both use preferred terms and refer from non-preferred terms

‣ Both provide a hierarchy that presents relationships between terms

153
Thesauri vs Subject
heading list
• Differences

‣ Subject heading lists are generally hierarchical with secondary alphabetical indexes, while thesauri are generally alphabetical with hierarchical structure built in by the use of cross references.

‣ Relationships between terms in a thesaurus are more specific

‣ Thesaurus descriptors are often dependent on another term and are intended to be combined with other terms, whereas in a subject heading list the term can stand alone
154
Relationship of terms
• SN Scope Note
• UF Used For
• BT Broader Term
• RT Related Term
• SA See Also

BT - shows a hierarchical relationship upward in the classification tree
RT - refers to a descriptor that can be used in addition to the basic term but is not in a hierarchical relationship
UF - deals primarily with synonyms or variant forms of the preferred descriptor
USE - refers to the preferred descriptor from a non-usable term
SN - a short description of how to use a descriptor
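
These relationships can be modelled as a small data structure. The sketch below is only an illustration: the thesaurus fragment, its terms and scope note, and the function names `build_use_references` and `to_descriptor` are hypothetical. It shows how USE references are the reverse of UF references, so that a searcher who enters a non-preferred term is led to the descriptor.

```python
# Hypothetical thesaurus fragment: each descriptor carries its own
# scope note (SN) and its UF, BT and RT references.
THESAURUS = {
    "Cats": {
        "SN": "Use for domestic cats only",
        "UF": ["Felines"],
        "BT": ["Pets"],
        "RT": ["Dogs"],
    },
    "Dogs": {"UF": ["Canines"], "BT": ["Pets"], "RT": ["Cats"]},
    "Pets": {},
}


def build_use_references(thesaurus):
    """Derive the USE references (non-preferred -> preferred) from the UF lists."""
    return {
        variant: preferred
        for preferred, record in thesaurus.items()
        for variant in record.get("UF", [])
    }


def to_descriptor(term, thesaurus):
    """Translate an entry term into the preferred descriptor, or None if unknown."""
    if term in thesaurus:
        return term
    return build_use_references(thesaurus).get(term)


print(to_descriptor("Felines", THESAURUS))
print(THESAURUS["Cats"]["BT"])
```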

155
Basic features of a thesaurus
• Includes a list of all terms in use in the database
• Carefully distinguishes terms actually used in a given database from those that are not
• Provides scope notes for problems likely to be encountered by end users
• Uses self-explanatory names for terms or relationships
• Includes a vast entry vocabulary, geared to end user requirements (this is important)

Based on Bates 1989.
156

Thesaurus construction
1. Identify subject field

2. Identify the nature of the literature to be indexed

3. Identify the users

4. Identify the file structure

5. Consult published indexes, glossaries, dictionaries, and other tools in the subject areas for the raw vocabulary

6. Cluster the terms

7. Establish term relationships

157

Indexing standards
• International Organization for Standardization. Information and Documentation: Guidelines for the Content, Organization, and Presentation of Indexes. (ISO 999-1996). Geneva: ISO, 1996.
• Guidelines for the establishment and development of monolingual thesauri. (ISO 2788-1986). Geneva: ISO, 1986.
• Guidelines for the establishment and development of multilingual thesauri. (ISO 5964-1985). Geneva: ISO, 1985.
• Rules for the abbreviation of title words and titles of publications. (ISO 4-1997). Geneva: ISO, 1997.
• British Standards Institution. Recommendations for Examining Documents, Determining Their Subjects, and Selecting Indexing Terms. (BS 6529:1984). London: BSI, 1984.
• Guide to establishment and development of monolingual thesauri. (BS 5723:1987). London: BSI, 1987.
• Guide to establishment and development of multilingual thesauri. (BS 6723:1985). London: BSI, 1985.
• American National Standards Institute. Guidelines for Abstracts. (ANSI/NISO Z39.14-1994 (R2002)). New York: ANSI, 1994.
• Guidelines for the Construction, Format, and Management of Monolingual Thesauri. (ANSI/NISO Z39.19-2003). New York: ANSI, 2003.

158

Indexing Activities
159

Abstracting

160

History
• 3rd century B.C. - abstracts of business records

• 1200 A.D. - rise of the scholarly journals

• 14th century - Pope Pierce II had abstracts of his favorite readings

• 16th-17th century – unpublished Elizabethan scientific abstracts

• 18th century – rise of general abstracting journals

• 19th century – rise of specialized journals

• 20th-21st century – characterized by information explosion and computers; automatic abstracting; information overload and the Internet

161

• Abstract

- An abbreviated, accurate representation of the significant contents of a document (ISO 214)

- A brief and objective representation of a document or an oral presentation (NISO Z39.14).

- A condensed representative surrogate of a knowledge record. A narrative description of a document which may include pertinent data and critical content (Cleveland and Cleveland)

- A brief but accurate representation of the contents of a document (Lancaster)


162

• Abstracting – process of analyzing and providing a brief, accurate and clear representation of the significant contents of a document.

163

Document Surrogates
• Annotation – description of the contents of a document, usually to clarify the title.

• Extract – one or more portions from a document lifted verbatim to represent the whole.

• Summary – brief restatement of the salient findings and conclusions intended to complete the orientation of the reader; may be found at the beginning of the article or at the end.

• Terse literature – highly abbreviated statement that encapsulates the major points of a document.

164

Use of abstract
• Promote current awareness

• Save reading time

• Facilitate selection and literature searches

• Help overcome the language barrier

• Improve indexing efficiency

• Aid in the preparation of reviews


165
Principles of Abstracting
• Accuracy
• Brevity
• Clarity

Accuracy
- As far as practicable, abstracts should avoid errors in representing the actual document. The information delivered by an abstract must be confined to what is contained in the actual document and to what is really important in it.
- Consistency and correctness of the information contained in the abstract; includes only information included in the original document

Brevity
- An abstract should be much shorter than the original document from which it is derived. This saves the users' time in searching for and retrieving their desired information, and lowers the cost of producing abstracts as well. Brevity should not be achieved at the expense of novelty, and it also means avoiding redundancy.
- Short, well-written, complete sentences are required for easy access to the information; gets straight to the point; contains precise language and does not include superfluous adjectives

Clarity
- This quality ensures that abstracts are free from all sorts of ambiguities. As much as possible, an abstract is written in a language and style clearly understood by the user
- To avoid confusion and provide readability, only the most common abbreviations and standard symbols should be used; does not contain jargon or colloquialisms and always explains any acronyms

166

Purpose of abstracting

• The main purpose of a modular abstract is to eliminate duplication
and waste of intellectual effort involved in the independent
abstracting of the same documents by several abstracting services.
167

Purpose of abstracting
• To decrease the time and effort it takes to search the overwhelming
output from research and scholarship around the world

• To help researchers decide which document is appropriate for their
research

• To satisfy users' needs for both current and retrospective
information

• To overcome the language barrier

• To play an important role in the structure of a computer-based
system

168

Functions of abstracts
(Facilitate document selection or determination of document relevance
to user interest)
• Promote current awareness

• Save reading time

• Make document selection easy

• Help in literature search

• Aid in overcoming language barrier

169

Functions of abstracts

• Help in preparation of indexes, reviews, and bibliographies

• Improve indexing efficiency

• Quickly identify document content

• Help the user decide whether to read the entire article


170

Types of abstract
• INDICATIVE ABSTRACT - simply describes or indicates what the
document is about.

• INFORMATIVE ABSTRACT - provides readers with the quantitative and
qualitative information in the document.

• SLANTED ABSTRACT - the information or description reported in a
document is oriented to the specific discipline to which the
abstracting service is devoted. May be a discipline-oriented abstract
or a mission-oriented abstract.

• CRITICAL ABSTRACT - an evaluative abstract. Contains views and
comments on the quality of the author's work and comparison/contrast
with other works.

171

Indicative abstract
• Also called a descriptive abstract; presents the content of the
original paper without data or comment.
• It simply describes what type of record is being abstracted and what
it is about. In many cases, it is somewhat shorter and is written in
general terms and does not give the reader/user a progressive account
of the paper’s development.
• It is the type of abstract that is described as an alerting device.

172
Sample of indicative abstract
"Bonanza Creek LTER [Long Term Ecological Research] 1997 Annual Progress
Report". http://www.lter.alaska.edu/pubs/1997pr.html

We continue to document all major climatic variables in the uplands and
floodplains at Bonanza Creek. In addition, we have documented the
successional changes in microclimate in 9 successional upland and
floodplain stands at Bonanza Creek (BNZ) and in four elevational locations
at Caribou-Poker Creek (CPCRW). A sun photometer is operated
cooperatively with NASA to estimate high-latitude atmospheric extinction
coefficients for remote-sensing images. Electronic data are collected
monthly and loaded into a database which produces monthly summaries.
The data are checked for errors, documented, and placed on-line on the
BNZ Web page. Climate data for the entire state have been summarized
for the period of station records and krieged to produce maps of climate
zones for Alaska based on growing-season and annual temperature and
precipitation.

Source: http://writing.colostate.edu/guides/documents/abstract/pop5a.cfm
173

Informative abstract
• Presents the specific quantitative and qualitative data contained in
the material
• Most useful for documents reporting on experimental research
• It gives enough details about the whole paper that readers often do
not need to retrieve the paper for further information
• Specific details such as formulas, statistical results, and parts of
tables are often included

174

Informative abstract
4 essential points

• Objective and scope of the work (purpose)
• Methodology
• Results
• Conclusion

The purpose section of an informative abstract should state either the
reason for or the primary objectives of the experiment or
investigation. The purpose section of an informative abstract might
also contain the hypothesis of the experiment.

The methodology section of an informative abstract should describe the
techniques used in conducting the experiment. This section should give
only as much detail as is necessary to understand the experiment; the
abstract should not focus entirely on research methods unless that is
the primary focus of the original document.

The results section of an informative abstract should relate the
observations and/or data collected during the experiment. This section
should be concise and informative, and only the most important results
need be included.

The conclusion section of an informative abstract should state the
evaluation or analysis of the experiment results. It should also
briefly state the implications of these results. This conclusion
section might also state whether the driving hypothesis of the
experiment was correct.
175
Sample of informative abstract
Palmquist, M., & Young, R. (1992). The Notion of Giftedness and Student Expectations About
Writing. Written Communication, 9(1), 137-168.

Research reported by Daly, Miller, and their colleagues suggests that writing apprehension is
related to a number of factors we do not yet fully understand. This study suggests that
included among those factors should be the belief that writing ability is a gift. Giftedness, as
it is referred to in the study, is roughly equivalent to the Romantic notion of original genius.
Results from a survey of 247 postsecondary students enrolled in introductory writing courses
at two institutions indicate that higher levels of belief in giftedness are correlated with
higher levels of writing apprehension, lower self-assessments of writing ability, lower levels
of confidence in achieving proficiency in certain writing activities and genres, and lower
self-assessments of prior experience with writing instructors. Significant differences in levels
of belief in giftedness were also found among students who differed in their perceptions of
the most important purpose for writing, with students who identified "to express your own
feelings about something" as the most important purpose for writing having the highest
mean level of belief in giftedness. Although the validity of the notion that writing ability is a
special gift is not directly addressed, the results suggest that belief in giftedness may have
deleterious effects on student writers.

https://writing.colostate.edu/guides/popup.cfm?pageid=1257&guideid=59

176

Critical abstract

• Makes a value judgement or editorial comment on the paper
• Condensed critical review
• Evaluative
• Abstractor expresses views on the quality of the work of the author
• Publications with critical abstracts

● Mathematical Reviews
● Applied Mechanics Reviews

Considered a third category of abstract, one that makes value judgments
or editorial comments on a paper. Generally used for published papers
with broad overviews, for reviews, and for monographs, but can also be
used for single papers.

A critical abstract provides, in addition to describing main findings
and information, a judgment or comment about the study’s validity,
reliability, or completeness. The researcher evaluates the paper and
often compares it with other works on the same subject. Critical
abstracts are generally 400-500 words in length due to the additional
interpretive commentary. These types of abstracts are used
infrequently.

177
Sample of critical abstract

Rosensweig, R. E.; Beecher, N. Theory for the ablation of
fiberglass-reinforced phenolic resin. Amer Inst Aero Astron
J 1:1802-1809 1963

The theory of ablation of carbon-contaminated glass, extended from the
char-layer theory, gives 38% underprediction of results of the
experiment. A thorough error analysis was not included. Spalding and
Scala have treated similar problems.
178

Parts of abstract
• REFERENCE - complete bibliographic citation of the original document

• BODY - describes the content of the original document briefly and
succinctly

• SIGNATURE - indicates the abstractor’s name and affiliation

• KEYWORDS - used in indexing by information retrieval systems
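
The four parts listed above can be pictured as one record. The following is only a minimal sketch: the class name `AbstractRecord`, its field names, the layout produced by `render`, and the signature and keywords in the usage example are all assumptions made for illustration and do not come from any abstracting standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AbstractRecord:
    """One abstract entry; names here are illustrative, not standardized."""
    reference: str   # complete bibliographic citation of the original document
    body: str        # brief, succinct description of the content
    signature: str   # abstractor's name and affiliation
    keywords: List[str] = field(default_factory=list)  # terms used for indexing

    def render(self) -> str:
        """Lay the parts out in the order listed: reference, body, signature, keywords."""
        lines = [self.reference, "", self.body, "", "Abstractor: " + self.signature]
        if self.keywords:
            lines += ["", "Keywords: " + ", ".join(self.keywords)]
        return "\n".join(lines)


# Hypothetical usage: the signature and keywords below are invented placeholders.
record = AbstractRecord(
    reference="Lancaster, F.W. Indexing and Abstracting in Theory and Practice. "
              "3rd ed. London: Facet Publishing, 2003. 451 p.",
    body="Covers principles and practice of indexing and abstracting.",
    signature="A. Abstractor, Example Library",
    keywords=["indexing", "abstracting"],
)
print(record.render())
```

Keeping the keywords as a separate list, rather than embedding them in the body, mirrors their separate role in retrieval systems.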

179

Abstracting process

1. Reading/Understanding – introductory paragraphs and text are
scanned for key information
2. Selection – abstractor marks the important phrases and passages and
jots down marginal notes
3. Interpretation – abstractor uses reasoning and inference; starts
organizing the phrases and passages previously marked as well as the
marginal notes jotted down
4. Synthesis/Analytical description – desired type of abstract is
carefully considered in writing the final draft
180

Citation format

• Monographs
- Surname/s of author/s, forename/s
- Title of publication (underlined/italics/all caps)
- Edition number
- Place of publication
- Publisher
- Year of publication
- Total number of pages

181

Sample

Lancaster, F.W. Indexing and Abstracting in Theory and Practice.
3rd ed. London: Facet Publishing, 2003. 451 p.
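
The element order for monographs can be sketched as a small formatting function. This is an illustration only: the function name `cite_monograph`, the helper `_stop`, and the punctuation between elements are assumptions inferred from the sample entry, not rules taken from a named style manual.

```python
def _stop(text):
    """Return text ending in exactly one full stop (hypothetical helper)."""
    return text.rstrip(".") + "."


def cite_monograph(author, title, place, publisher, year, pages, edition=None):
    """Join the monograph elements in the listed order.

    Order: author, title, edition, place, publisher, year, total pages.
    Punctuation follows the sample entry and is an assumption.
    """
    parts = [_stop(author), _stop(title)]
    if edition:
        parts.append(_stop(edition))
    parts.append(f"{place}: {publisher}, {year}.")
    parts.append(f"{pages} p.")
    return " ".join(parts)


print(cite_monograph(
    author="Lancaster, F.W.",
    title="Indexing and Abstracting in Theory and Practice",
    edition="3rd ed.",
    place="London",
    publisher="Facet Publishing",
    year=2003,
    pages=451,
))
```

The edition is optional in the sketch because a first edition usually carries no edition statement.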

182

Citation format

• Journal articles
- Surname/s of author/s, forename/s
- Title of article
- Title of journal (underlined/italics/all caps)
- Volume number
- Issue number in parenthesis
- Inclusive pages
- Date of publication of journal
183

Sample

Parkinson, Claire L. Paradigm transitions in mathematics.
Philos Math 2(2): 127-150 2005
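
A journal-article citation can be assembled the same way. In this sketch the function name `cite_article` and the helper `_stop` are hypothetical, and the punctuation is an assumption copied from the sample entry, which also carries the article title between the author and the journal title.

```python
def _stop(text):
    """Return text ending in exactly one full stop (hypothetical helper)."""
    return text.rstrip(".") + "."


def cite_article(author, article_title, journal, volume, issue, pages, year):
    """Join the journal-article elements in the order shown in the sample.

    Order: author, article title, journal title, volume, issue in
    parentheses, inclusive pages, date of publication.
    """
    return (f"{_stop(author)} {_stop(article_title)} "
            f"{journal} {volume}({issue}): {pages} {year}")


print(cite_article(
    author="Parkinson, Claire L.",
    article_title="Paradigm transitions in mathematics",
    journal="Philos Math",
    volume=2,
    issue=2,
    pages="127-150",
    year=2005,
))
```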

184
Qualities of a good abstract
• Generally consists of one paragraph

• Short, simple, complete sentences are required for easy access to
the information

• Should not repeat what is in the title and should avoid naming the
type of document

• Technical words and phrases should be based on the subject field
under consideration

• Use the most common abbreviations and standard symbols

185
Qualities of a good abstract

• Discuss the following in order: objective, methods, results, and
conclusions

• Use verbs in the active voice

• Use third-person pronouns

• Provide logical connections between materials

• Should not contain background information or detailed discussion of
method
186
Qualities of a good abstract
• No restriction should be placed on the absolute length of the abstract

• Length of abstract
- Articles, monographs: 250 words
- Technical reports: 250 words
- Theses and dissertations: 300 words

• Approximate proportions of parts
- Nature and scope: 3%
- Objective: 7%
- Research method: 15%
- Findings: 70%
- Conclusion: 5%
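
The proportions above translate into a rough word budget for each part. The sketch below is only an illustration of that arithmetic; the names `PROPORTIONS` and `word_budget` are made up for the example, and the resulting figures are guides rather than limits.

```python
# Approximate share of each part, in percent, as listed above.
PROPORTIONS = {
    "Nature and scope": 3,
    "Objective": 7,
    "Research method": 15,
    "Findings": 70,
    "Conclusion": 5,
}


def word_budget(total_words):
    """Split a target abstract length across the parts by their share."""
    return {part: total_words * pct / 100 for part, pct in PROPORTIONS.items()}


# Example: a 250-word abstract for an article or monograph.
for part, words in word_budget(250).items():
    print(f"{part}: about {words:g} words")
```

For a 250-word abstract this allots 175 words to the findings, which shows how heavily the proportions favor results over background.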

187

Abstracting process
1. Accurately and fully record the reference

2. Content analysis

3. Write the annotation

4. Append abstractor’s name (to give credit and responsibility)

5. Arrange the abstracts

188
Style of writing an abstract

• Must be intelligible to the reader

• Avoid using the following for clarity purposes:

‣ Footnotes

‣ List of references

‣ References to the text of the original documents

https://groups.niso.org/apps/group_public/download.php/14601/Z39-14-1997_r2015.pdf
189

Style of writing an abstract

• Retain the balance and emphasis of the original document, except in
a slanted abstract

• Be concise and fulfill requirements

• Use transitional words and phrases


190
Style of writing an abstract
• Length
‣ Differs according to the type of document being abstracted
‣ If length is not specified, the following is usually adequate


191
Style of writing an abstract

• Paragraphing and structured abstract

‣ Write the abstract in a single paragraph

• Complete sentences

‣ Use complete sentences

192
Style of writing an abstract
• First Sentence
‣ Avoid naming the type of document in the first
sentence
• Use of active verbs
‣ Use verbs in the active voice


193
Style of writing an abstract
• Terminology

‣ Avoid using the following

- Unfamiliar words

- acronyms

- abbreviations

- symbols


194

Style of writing an abstract
• Nontextual materials
‣ The following should be added only if no alternatives exist and if
they are needed for brevity and clarity

- Short tables, equations, structural formulas, and diagrams

• Treatment of added details

‣ Further details (only if chosen to be added) should be placed either
at the end of the abstract or as part of the bibliographic reference

195

Standards
• ISO 214-1976 Abstracts for publication and documentation
• Prescriptive rather than definitive
• Presents guidelines for

◦ Preparing and presenting abstracts
◦ Subject analysis
◦ Style to be used
◦ Length of the abstract

• Provides definitions of related terms

• Emphasizes preparation of abstracts by authors of primary documents,
although these are also applicable to other persons preparing
abstracts, i.e. subject experts and professional abstractors

196

Standards

• EPCE. - Draft recommendation for abstracts and abstracting. 1976

• IAEA-INIS. - Instructions for submitting abstracts. 1976

• ACS. - Directions for abstractors. 1971

• DFS. - An introduction to indexing and abstracting for technical
information systems. 1971

197

References
1. Chowdhury, G. G. (2007). Organizing Information: From the Shelf to the Web. London: Facet.

2. Cleveland, D., and Cleveland, A. (2001). Introduction to Indexing and Abstracting. Englewood, CO: Libraries Unlimited. ISBN 1563086417.

3. Lancaster, F. W. (1998). Indexing and Abstracting in Theory and Practice (2nd ed.). Library Association Pub.

4. Pascua, S. (2018). Handouts in Indexing and Abstracting.

5. Padernal, C. (2018). Review notes in Indexing and Abstracting.

6. NISO Standard on Abstracting. (2015).

7. NISO Standard on Abstracting. (2011).

8. A brief history of indexing. (n.d.). Retrieved April 19, 2019, from https://www.anzsi.org/resources/about-indexers-and-indexing/a-brief-history-of-indexing/

9. ABC-CLIO > ODLIS > odlis_I. (n.d.). Retrieved April 19, 2019, from https://www.abc-clio.com/ODLIS/odlis_i.aspx

10. Frequently Asked Questions | American Society for Indexing. (n.d.). Retrieved April 19, 2019, from https://www.asindexing.org/about-indexing/frequently-asked-questions/

11. History of Book Indexing | Clive Pyne Book Indexer. (n.d.). Retrieved April 19, 2019, from http://www.cpynebookindexing.com/resources/history-of-book-indexing/
