
Abstract: Controlled Vocabulary in the Digital Age

This document summarizes the article "Controlled Vocabulary in the Digital Age" which discusses the importance and development of controlled vocabularies for libraries. It describes how libraries are changing from traditional to hybrid collections and need standardized vocabularies to manage digital information. The factors driving this include the growth of the internet, integrated library systems, and varied information formats. Controlled vocabularies help improve information retrieval and have been used since the early 1900s, with ongoing development of taxonomy types like thesauri.

Kekal Abadi 29(1) 2011

Controlled Vocabulary in the Digital Age

Muhamad Faizal Abd Aziz


University of Malaya Library
mfaizal@um.edu.my

Abstract

Libraries are changing from managing traditional information to managing hybrid information. With the rapid development of IT,
especially the Internet, libraries need to establish standards for managing information, such as a controlled
vocabulary. This article describes the factors in the digital age that underlie the establishment of controlled
vocabularies.

Abstrak

Libraries are currently undergoing a change in how they manage information, from traditional to
hybrid. With the rapid development of information technology (IT), especially the internet, libraries need to
develop a standard for controlled vocabulary. This article
explains the basic factors for developing a controlled vocabulary in the digital era.

Introduction

Libraries are changing from traditional to hybrid and, currently, to electronic or virtual libraries. The challenges librarians face in maintaining collections and services are tremendous. As we witness the development of the physical library, we should not forget the development of its content or collection. Today, libraries hold not just books but also different kinds of information and formats, whether printed, audio, digital or electronic resources available on the web, which have increased substantially in recent years. The resurgence of interest in controlled vocabularies in the recent decade is related to the development of the contents and formats of library materials. Basically, it is driven by three main factors: Development of Internet Technology, Development of Integrated Library Systems, and Variations of Information Content and Format. All these factors are seen as reasons why researchers and librarians are paying more attention to vocabulary control activities. This article discusses definitions from various sources and the history and development of controlled vocabularies.

Definition of Controlled Vocabularies

Before looking at the definition of the term ‘controlled vocabularies’ (CVs), we need to know why controlled vocabularies are important and what the effects of not having them are. Vocabulary control is used to improve the effectiveness of information storage and retrieval systems, web navigation systems, and other environments that seek to both identify and locate desired content via some sort of description using language. The primary purpose of vocabulary control is to achieve consistency in the description of content objects and to facilitate retrieval. Basically, the need for vocabulary control arises from two basic features of natural language: (i) two or more words or terms can be used to represent a single concept and (ii) two or more words that have the same spelling can represent different concepts.

A controlled vocabulary can be simply defined as a list or collection of terms or words available for use. In library and information science, a controlled vocabulary is a carefully selected list of words and phrases used to tag units of information (documents or works) so that they may be more easily retrieved by a search. To get a better understanding of controlled vocabulary, let us look at the definitions below.

Wikipedia defines controlled vocabularies as providing a way to organize knowledge for subsequent retrieval; they are used in subject indexing schemes, subject headings, thesauri and taxonomies.

According to Larson (1998), controlled vocabulary is the attempt to provide a standardized and consistent set of terms (such as subject headings, names, classifications, etc.) with the intent of aiding the searcher in finding information.
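
The two features of natural language described in this section can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the terms, the parenthetical qualifiers, the document identifiers and the names `resolve`, `Index` and `build_demo_index` are all hypothetical and do not come from any real subject heading list or library system.

```python
from collections import defaultdict

# Hypothetical entry terms, each mapped to one preferred term.
# This handles feature (i): several words for a single concept.
ENTRY_TO_PREFERRED = {
    "car": "Automobiles",
    "cars": "Automobiles",
    "automobile": "Automobiles",
    "automobiles": "Automobiles",
    "motorcar": "Automobiles",
}

# Hypothetical homograph split into qualified preferred terms.
# This handles feature (ii): one spelling for several concepts.
HOMOGRAPHS = {
    "mercury": ["Mercury (planet)", "Mercury (element)"],
}


def resolve(term):
    """Return the candidate preferred terms for a searcher's word."""
    key = term.strip().lower()
    if key in HOMOGRAPHS:
        return list(HOMOGRAPHS[key])
    if key in ENTRY_TO_PREFERRED:
        return [ENTRY_TO_PREFERRED[key]]
    return []  # the word is not part of the vocabulary


class Index:
    """Documents are tagged with preferred terms only."""

    def __init__(self):
        self._postings = defaultdict(set)

    def tag(self, doc_id, preferred_term):
        self._postings[preferred_term].add(doc_id)

    def search(self, term):
        """Map the query to preferred terms, then look up tagged documents."""
        return {
            preferred: sorted(self._postings.get(preferred, set()))
            for preferred in resolve(term)
        }


def build_demo_index():
    """Build a tiny index with made-up document identifiers."""
    index = Index()
    index.tag("doc1", "Automobiles")
    index.tag("doc2", "Automobiles")
    index.tag("doc3", "Mercury (planet)")
    index.tag("doc4", "Mercury (element)")
    return index
```

In this sketch, a search for "motorcar" or "cars" reaches the same two documents because both words resolve to the single preferred term, while a search for "mercury" returns the planet and the element as separate groups so that the searcher can choose between them.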

History and Development of Controlled Vocabularies

Controlled vocabularies have been in place since the early 1900s; the first printed subject headings were published in 1909. The Library of Congress Subject Headings date back further, to 1898, when the Library's catalogue was converted from an author catalogue plus a classed catalogue to a dictionary catalogue. The first Library of Congress subject headings drew on the American Library Association's List of Subject Headings for Use in Dictionary Catalogs. Printing of the subject headings used in the dictionary catalogues of the Library of Congress (later titled Library of Congress Subject Headings) began in summer 1909. The Dewey Decimal Classification (DDC) originated in 1873 and was first published in 1876. The Sears List of Subject Headings first appeared in 1923. In the 1950s, government agencies began to develop controlled vocabularies for the burgeoning journal literature in specialized fields; for example, Medical Subject Headings (MeSH) was developed by the United States National Library of Medicine. The development of controlled vocabularies did not stop at that point; it continues in more modern ways. Subject headings are now available online, for example Classification Web for the Library of Congress and MeSH online.

Types of Controlled Vocabularies

There are many types of controlled vocabularies. Listed below are some common ones:

i) List or “pick list”
A list or pick list is a limited set of terms arranged in a simple alphabetical list or in some other logically evident way. Lists are used to describe aspects of content objects or entities that have a limited number of possibilities. Examples include a geography list (country, state, city) and a language list (English, French, German).

ii) Synonym ring
A synonym ring is a set of terms considered equivalent for the purposes of retrieval. Synonym rings usually occur as flat lists. Using a synonym ring ensures that a concept can be described by multiple synonyms or quasi-synonyms and retrieved if any one of the terms is searched.

iii) Taxonomy
A taxonomy is a controlled vocabulary consisting of preferred terms, all of which are connected in a hierarchy or polyhierarchy.

iv) Thesaurus
A thesaurus is a structured controlled vocabulary arranged in a known order so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators. Relationship indicators are usually employed reciprocally.

Factors Contributing to the Resurgence of Interest

There has been increased interest in technological developments that affect libraries and their collections. The first factor is the development of Internet technology, which has drastically changed the way we use and access library collections.

a) Development of Internet Technology
It was reported that librarians were among the earliest professionals to use the Internet. Internet technology developed rapidly from its early stages, starting with four networked computers and progressing through telnet and dial-up to today's wireless connections. These developments have played an important role in changing libraries and librarianship. The Internet enables access to various information resources in many formats. Traditionally, a library was accessible only to the people or community living nearby, and its collection was available only within the library building. With Internet technology the library has become borderless and its contents are virtually accessible from anywhere. To accommodate these changes, librarians and researchers have to find ways to make information retrieval possible through the Internet. The search criteria or access points need to be refined to produce accurate search results. Most websites, search engines and web portals use natural language or free text as their indexing language, which yields wider and broader search results, lengthening the hit list but lowering precision. Natural language or free text is used to accommodate the searching abilities of the layman. Arguments have arisen among professionals over the advantages and disadvantages of these two options as accurate and reliable retrieval tools. Thomas (2000) commented: “with the Web estimated to be increased by 10 million pages weekly, the task of indexing the internet resources is clearly argentums, and not something that can be done overnight by the cataloguer. Instead of relying on the catalogue to identify and retrieved web pages, users have to turn to web portals which use metadata”.

Research shows that controlled vocabulary has advantages over natural language and free text. Gerhan (1991) found that catalogue users retrieved more records in fewer attempts when making use of the Library of Congress Subject Headings. Arellano
(1991) discovered that a great deal of material was missing. Referring to the importance of a controlled vocabulary, Tillet (2000) pointed out, “Authority control enables ‘precision and recall’ which are lacking from today’s web searches.” The above findings show the importance of controlled vocabulary for subject retrieval in a network environment.

b) Development of Integrated Library Systems
Libraries began to automate and network their catalogues in the late 1960s. Frederick G. Kilgour at the Ohio College Library Center (now OCLC, Inc.) led the networking of Ohio libraries during the ’60s and ’70s. The automated catalogue became available to the world first through telnet or TN3270 via IBM, and became web-based only in 1997 with the introduction of HyWebCat. Conventionally, searching was done through the card catalogue made available in the library. Today, with the enrichment of Internet technology, the integrated library system and its Online Public Access Catalogue (OPAC) have replaced the card catalogue, providing more precise search options. With a web-based catalogue or OPAC, users can retrieve information not only about the holdings of an individual library but also about the holdings of other libraries. Card catalogues have given way to online catalogues that incorporate new search options, particularly subject searches. In card catalogues, the options for retrieving information about a library's holdings are author, title and subject. In comparison, online catalogues enable searches by title words or by words in any other field of the surrogate record. In this way, the possibilities for subject access in online and web-based catalogues are not limited to subject headings and a controlled language; they extend to keywords, mainly those from titles, which are the basic constituent of free text. To accommodate this trend, existing controlled vocabularies need to be adapted for new roles in the electronic environment, with improvements in areas such as:

1. Improved currency, hospitality for new topics, and capability for accommodating new terminology
2. Flexibility and expandability – including possibilities for decomposing faceted notation for retrieval purposes
3. Intelligibility, intuitiveness, and transparency – it should be easy for users to use, responsive to individual learning styles, able to adjust to the interests of users, and allow for custom views
4. Universality – the scheme should be applicable to different types of collections and communities and should be able to be integrated with other subject languages, and
5. Authoritativeness – there should be a method of reaching consensus on terminology, structure and revision that includes user communities.

Some controlled vocabularies have already adjusted to the electronic environment, such as AGROVOC, the agricultural thesaurus; WebDewey, which is the Dewey Decimal Classification adapted to the electronic environment; and the California Environmental Resources (CERES) thesaurus.

c) Variations of Information Content and Format
Nowadays library holdings are not limited to books but also include different formats of information such as visual images, audio recordings and electronic resources. Many organisations and individuals use the internet to generate and deliver electronic information. The amount of electronic resources available on the web has increased substantially in recent years, and there is an urgent need to include them in the library collection and, consequently, to include their surrogates in the library catalogue. New terminology, particularly in the Internet and information technology fields, has proliferated in the recent decade. The new content entering library collections includes websites and web portals. Websites are a distinctive format of material, and new controlled vocabularies have to be developed for them. Web pages have specific characteristics such as hyperlinks, anchors and metadata. Web portals use free text and natural language, which are not really reliable when searching. The World Wide Web has given rise to new types of controlled vocabulary, namely ontologies and directory-style subject browsing, which is very popular in commercial search engines (directories and web pages).

Conclusion

Although free text and natural language are an easy and cheap option for indexing activities, there is still a need to use controlled vocabularies for the storage and retrieval of precise information that matches user needs. Search engines, directories and other home-grown schemes on the web, even those with well-developed terminological policies such as Yahoo and Google, still suffer from a lack of understanding of the principles of classification design and development. Controlled vocabulary will therefore continue to play an important role in the organization of knowledge, and librarians will have to be more adequately prepared to
face the challenges that technology and the new types of information resources will pose in future.

References

American National Standards Institute. (2007). Guideline for the construction, format, and management of monolingual controlled vocabularies. Retrieved 1 January 2009, from http://www.slis.kent.edu

Bates, M.J. (1988). How to use controlled vocabularies more effectively in online searching. Online, 12(6), 45-56. Retrieved from http://proquest.umi.com/pqdweb

Chan, L.M. (2001). Subject vocabulary for web resources. Retrieved 1 January 2009, from http://klement.nkp.cz/Csalin/caslin01/sbornik/subjectvoc.html

Controlled Vocabulary. Retrieved 1 January 2009, from http://en.wikipedia.org

Golub, K. (2006). Automated subject classification of textual web pages, based on a controlled vocabulary: Challenges and recommendations. New Review of Hypermedia and Multimedia, 12(1), 11-27. Retrieved from http://www.informaworld.com

Golub, K. (2006). Using controlled vocabularies in automated subject classification of textual web pages, in the context of browsing. TCDL Bulletin, 2(2), 1-10. Retrieved from http://www.ieee.tcdl.org

Hornby, A. S. (1953). Vocabulary control: History and principles. ELT Journal, VIII(I), 15-21. Retrieved from http://eltj.oxfordjournals.org

Lancaster, F. W. (1986). Vocabulary control for information retrieval (2nd ed.). Virginia: Information Resources Press.

Lima, C., et al. (n.d.). A historical perspective on the evolution of controlled vocabularies in Europe. Retrieved 1 January 2009, from http://www.irbdirekt.de/daten/iconda/CIB7425.pdf

Marshall, J. (2006). Control vocabularies: Implementation and evaluation. Key Words, 14(2), 53-59. Retrieved from http://www.informaworld.com

Martinez-Arellano, F.F. (2001). Teaching of subject access and retrieval at Mexican LIS schools. Paper presented at the 67th IFLA Council and General Conference, Boston. Retrieved from http://www.ifla.org/IV/ifla67/papers/026-142e.pdf

Should We Control Vocabulary?. Retrieved 1 January 2009, from http://www.nelinet.net/edserv/conf/cataloging/2007/ohnmitchell.pdf

Stone, A. (2000). The LCSH century: A brief history of the Library of Congress Subject Headings, and introduction to the centennial essays. Cataloging & Classification Quarterly, 29(1&2), 1-15. Retrieved from http://catalogingandclassificationquarterly.com/ccq29nr1-2ed.htm

Tenopir, C. (1987). Searching by controlled vocabulary or free text?. Library Journal, 112(19), 58-59. Retrieved from http://web.ebscohost.com

Windsor, R. (1995). Designing a controlled vocabulary for use with Digital Asset Libraries. Retrieved 1 January 2009, from http://www.daydream.co.uk/controlled_vocabulary.asp