
“LIES” — 2011/6/21 — 15:43 — page i — #1


Linguistic Issues in Encoding Sanskrit

Peter M. Scharf, Brown University
Malcolm D. Hyman, MPIWG

June 21, 2011


Scharf, Peter M. and Malcolm D. Hyman. Linguistic Issues in Encoding
Sanskrit. Providence: The Sanskrit Library, 2011.


Foreword
by George Cardona

Questions surrounding the encoding of speech have been considered since
scholars began to consider the history of different writing systems and of
writing itself. In modern times, attention has been paid to such issues as
standardizing systems for portraying in Roman script the scripts used for
recording other languages, and this has given rise to discussions about
distinctions such as that between transliteration and transcription. In re-
cent times, moreover, the advent and general use of digital technology
has allowed us not only to replicate with relative ease details of various
scripts and to produce machine searchable texts but also to reproduce
images of manuscripts that can be viewed and manipulated, a true boon
to philologists in that they are thus enabled to consult and study mate-
rials with all the details found in original manuscripts, such as different
hands that can be discerned and clues to modifications made due to fea-
tures of different scripts. At the source of such endeavors lie the facts
of language: phonological and phonetic matters that scripts portray with
various degrees of fidelity.
India can justifiably lay claim to being the home of what is doubt-
less the most thorough and sophisticated consideration of speech pro-
duction, phonetics, and phonology in ancient times. The preservation of
Vedic texts and their proper recitation according to the norms of vari-
ous groups of reciters led to the early analysis of continuously recited
texts (saṁhitāpāṭha) into constituents — called pada — characterized
by phonological alternations that appear at word boundaries, including
boundaries before particular morphemes within syntactic words. A text


Contents

Foreword by George Cardona
Preface
Illustrations
Abbreviations

1 Introduction
1.1 Technologies for representing spoken language
1.2 The Sanskrit language
1.3 The Devanāgarī script
1.4 Roman transliteration
1.5 The All-India Alphabet

2 Existing encoding systems for Sanskrit
2.1 A brief history of Indian printing
2.2 Legacy systems: before standards
2.3 UPACCII
2.4 ISCII
2.5 Unicode: Indic scripts
2.6 CS (Classical Sanskrit) and CSX (Classical Sanskrit Extended)
2.7 TITUS Indological 8-bit Encoding
2.8 Unicode: Indic transliteration
2.9 7-bit meta-transliterations
2.10 Velthuis transliteration and ITRANS
2.11 wx
2.12 Kyoto-Harvard
2.13 Varṇamālā

3 Critique of encoding systems seen so far
3.1 Ambiguity and redundancy
3.2 Ambiguity in the encoding of accentuation

4 The basis for encoding: a reanalysis
4.1 Axis I: Spoken communication is prior to written
4.2 Axis II: General remarks on the units of spoken and written language
4.2.1 Segments
4.2.2 Features
4.3 Axis III: What is relevant for encoding?
4.4 Encoding Sanskrit language vs. Devanāgarī script

5 Sanskrit phonology
5.1 Description of Sanskrit sounds
5.2 Phonetic and phonological differences
5.2.1 Phonetic differences
5.2.2 Sounds of problematic characterization
5.2.3 Differences in phonological classification of segments
5.2.4 Differences in the system of feature classification
5.2.5 Indian treatises on phonological features
5.2.6 Modern feature analysis

6 Sound-based encoding
6.1 Criteria for selecting distinctive elements to encode
6.1.1 Phoneme
6.1.2 Generative grammar
6.1.3 Historical linguistics
6.1.4 Paralinguistic semantics
6.1.5 Contrastive segments
6.1.6 Phoneme in the broader sense
6.1.7 Contrastive phonologies
6.2 Higher-order protocols
6.2.1 The phonetic encoding schemes

7 Script-based encoding
7.1 Featural analysis
7.2 Analysis of Devanāgarī script
7.3 Component analyses of Devanāgarī script

8 Conclusions
8.1 Dynamic transcoding
8.2 Text-to-speech and speech-recognition
8.3 Higher-level encoding

Appendices

A Tables
A.1 Phonetic features
A.2 Sounds categorized by Āpiśali
A.3 Sounds categorized by Śaunaka
A.4 Sounds categorized after Halle et al.
A.5 Sanskrit phonetics
A.6 Sanskrit phonetics according to Āpiśali
A.7 Sanskrit phonetics according to Śaunaka
A.8 Sanskrit phonemics
A.9 Sanskrit sounds derived from PIE by Burrow
A.10 PIE phonemics according to Burrow
A.11 PIE phonemics according to Szemerényi
A.12 Feature tree after Halle
A.13 Graphic features of Devanāgarī according to Ivanov and Toporov

B Sanskrit Library Phonetic Basic
B.1 Basic Segments
B.2 Punctuation
B.3 Modifiers
B.3.1 Stricture
B.3.2 Length
B.3.3 Accent
B.3.4 Nasalization
B.4 Modifier combinations and usage notes
B.4.1 Stricture
B.4.2 Length
B.4.3 Surface accent
B.4.4 Syllabified visarga and anusvāra accent
B.4.5 Nasals

C Sanskrit Library Phonetic Segmental

D Sanskrit Library Phonetic Featural

E Malcolm D. Hyman
E.1 A Memoir by Phoebe Pettingell
E.2 Curriculum Vitae

Bibliography

Index


Illustrations

1.1 Some of Gutenberg’s ligatures and abbreviations


Chapter 1

Introduction

Human beings express knowledge in various modes: through images in
visual art; through movement in dance, theatrical performance, and gestures;
and through speech in spoken language. Each of these means of
expression includes means to encode knowledge, and each is used to
express knowledge originally encoded in one of the others. Poetry de-
scribes depicted scenes, while epics narrate the events depicted there.
Manuscript images depict scenes from the epics the texts they decorate
narrate, while Kathakali enacts the epics in performance. Certain media
dominate as the primary methods for the transmission of detailed infor-
mation at different times and places. Oral tradition dominated the tradi-
tion of Sanskrit in India in the first and second millennia B.C.E. Writing
overtook orality in the first millennium C.E. and dominated until replaced
gradually by printing beginning in the 15th century in Europe and in the
19th century in India. Since the invention of digital electronic transmis-
sion in the 19th century, the digital medium has slowly expanded its do-
main and now is replacing printing as the dominant means of knowledge
transmission. In order to rescue the enormous body of literature extant in
print, writing, and living memory from being marginalized and becom-
ing extinct, it is vital to reflect on the nature of transitions in knowledge
transmission in order to understand the nature of the present transition
from the printed to digital now taking place. Consciousness of the nature
of the transition taking place will allow deliberate steps to maximize the
preservation of inherited learning. Such consciousness will additionally
open avenues of research not previously practicable without features of
the digital medium.
Today people use computers to manipulate linguistic and textual data
in sophisticated ways; yet current encoding systems tend to reflect vi-
sual and orthographic design factors to the exclusion of more relevant
information-processing principles. Thus these systems reproduce de-
ficiencies inherent in the traditional orthographies themselves. In this
book we examine some fundamental issues in the coding of natural lan-
guage texts. We consider above all the relation the information selected
for encoding bears to natural language structure. We focus on Sanskrit,
which is characterized by an extensive oral tradition, a highly phonetic
orthography, and a copious literature. We survey various Sanskrit en-
coding schemes in past and present use and investigate their suitability
for particular applications. We conclude by advancing some concrete
proposals.

1.1 Technologies for representing spoken language

Problems that arise in current encoding schemes stem from a long history
of adaptation in technologies for the visual representation of language.
The history of these technologies reveals a recurrent tendency to imitate
the appearance of earlier technologies and the possibility of information
loss at each transition (cf. Waller 1988, 262; Hockey 2000, 25).1 Recent
developments in text processing lead us to reconsider the fundamental
purpose of text encoding.
Writing emerged gradually as a technology for representing spoken
human language.2 Social and economic factors led at certain times and in
certain places to an increase in the frequency of writing and the number
1 “No revolution in communications media succeeds without a transitional period during
which it simply imitates the old system. [. . . ] For example, early printed books imitated
manuscripts, and early cinema used fixed cameras in imitation of the fixed viewpoint of the
theatre-goer” (Waller, 1986, 74).
2 The earliest “proto-writing”, attested in the ancient Near East, is associated with
economic and administrative functions; it is related only loosely to spoken language
(Damerow, 1999). For further remarks on proto-writing, see: Boltz 2006; Hyman 2006.


atures, alternate letterforms, accented letters, and abbreviations (Steinberg
1961, 20, 30; Walden Font 1997; Füssel 2005, 17–18); see Figure
1.1. These had arisen in response to the demands of manuscript copy-
ing. Gutenberg’s characters were modeled upon a style of gothic script
current in the Germany of his day (Gill 1936, 32–33; Sampson 1985,
112; Kapr 1993, 20–22; Haralambous 2004, 367–368); see Figure
1.2. In its general layout, the printed Bible also resembled a fifteenth-
century northern European handwritten codex.4 Adaptation of printing
with movable type to radically different writing systems was neither fast
nor without difficulty.5 When the Venetian Gregorio de Gregorii published
an Arabic-language Book of Hours (Kitāb ṣalāt as-sawāʿī) in
1514, his attempt to produce the hundreds of types needed to imitate Ara-
bic calligraphy and reproduce the contextual variants of Arabic charac-
ters resulted in an un-aesthetic and partly unreadable publication (Lunde
1981, 21; Roper 2002). Arabic printing only achieved a mature form
with the types cut by Robert Granjon in the 1580s.6
The Industrial Revolution of the nineteenth century led to increased
mechanization in the production of printed materials and the transforma-
tion of basic techniques. The Mergenthaler Linotype (1886) and Lanston
Monotype (1889) allowed the keyboarding of text to replace the process
of manual composition, in which types were picked one by one from a
wooden typecase, as in Figure 1.3 (Steinberg 1961, 286; Schlesinger
1989; Kahan 2000).7 The layout of the keyboards on these machines,
4 The British Library has made digital images of its two complete Gutenberg Bibles
available: <http://www.bl.uk/treasures/gutenberg/homepage.html>. See also the Ransom
Center’s Digital Gutenberg Project:
<http://www.hrc.utexas.edu/exhibitions/permanent/gutenberg/>.
5 On the earliest printing in Greek and Hebrew, see Füssel (2005, 101–104, 107–109).
Aldus Manutius, who published the first volume of an edition of Aristotle in Greek in
1495, closely imitated calligraphic style in his type, and made use of numerous ligatures
and abbreviations. Ingram (1966), who provides an extensive guide to ligatures and ab-
breviations in early Greek typography, remarks that when he first encountered Renaissance
Greek printing, “I saw little resemblance between the Greek I had learned in school and
this peculiar, cramped typeface which I could not read and which often contained only an
occasional letter I could recognize” (Ingram, 1966, 371).
6 On the early history of Arabic typography in Europe, see Roper (2002).
7 Automation began to be introduced into type composition and casting considerably
earlier in the nineteenth century. Notable early systems were devised by William Church
(1822) and by James Young and Adrian Delambre (1840–1841) (Schlesinger 1989; Kahan
2000, 1–2).


Figure 1.3: Newspaper composing room with workers setting text manually from typecases, 1892


however, resembled at first the older typecases; with time, they became
simplified and more ergonomic (AbiFarès, 2001). Another late nine-
teenth century technology, the typewriter, was first commercially manu-
factured in the United States in the 1870s.8 The typewriter greatly ex-
panded the mechanical production of texts and allowed mechanical tech-
nology to be used for the creation of even ephemeral documents. Type-
writers reproduced many aspects of printing technology, but with several
accommodations: a greatly reduced inventory of characters, monospac-
ing, and the elimination of many possibilities for aesthetic refinement.
Teletype machines, which originated around 1907, allowed for the
remote transmission and printing of text; they led eventually to stan-
dards for information encoding, most notably ASCII (American Standard
Code for Information Interchange) in the 1960s (Bemer, 1963; Smith,
1964; Mackenzie, 1980; Gaylord, 1995).9 Current digital computer key-
boards evolved from teletype keyboards, and the first documents created
using computers resembled typewritten documents. Digital typesetting
emerged in the 1970s and made possible the creation of high-quality doc-
uments that incorporated aspects of traditional typography (Syropoulos,
Tsolomitis & Sofroniou, 2003). The desktop publishing revolution of the
1980s and 90s brought these capabilities to an international public that
continues to expand today.
8 Manufacture by Remington of the typewriter designed by Christopher Latham Sholes
and Carlos Glidden began in 1873 (Beeching, 1990; Bukatman, 1993; Kahan, 2000).
9 We may look even earlier, to the five-bit code for telegraphy patented in 1874 by Baudot
(Gillam, 2002, 43). A later rearrangement of the code was standardized in 1931 as
CCITT #2 by the Comité Consultatif International Téléphonique et Télégraphique (now
renamed ITU-T) and extensively used by teletype machines (Mackenzie, 1980, 6, 62–64).
As a matter of historical curiosity, we may note that the ultimate antecedent of the Bau-
dot code was Francis Bacon’s so-called “bi-literal” cipher, first published in 1623 (Strasser
1988, 88–9; Kahn 1996, 882–3).
ASCII became an American (ASA) standard on June 17, 1963. Although ASCII is gen-
erally thought of as a seven-bit code, it was actually designed as an eight-bit code with the
eighth bit unassigned (Bemer, 1963, 35). When ASA (American Standards Association)
became ANSI (American National Standards Institute), ASCII was officially designated
ANSI X3.4-1968 (Mackenzie, 1980, 8). On the relation of ASCII to ISO 646 see Gaylord
(1995).
An interesting predecessor of character encoding is the Linotype, which redistributed
its matrices in accordance with a seven-digit binary code assigned to each type, “although
[Mergenthaler] probably did not realize the mathematical significance” (Kahan, 2000, 206).


With each shift in technology, we observe the survival of elements
from earlier technologies. To varying degrees, writing represents spoken
language (Gibson, 1972, 13); printing represents writing; the typewrit-
ten text represents the printed text; and the first texts created with digital
computers represent their typewritten forbears. The representation of
speech in writing involves a fundamental change of medium from aural
to visual, while the representation of writing in print, printed text in typed
text, and typed text in digitally produced printed text all occur within vi-
sual media. Yet even the latter involve deliberate information recoding.
Decisions are made in the selection of a limited repertoire of certain fixed
shapes to represent in print the multiplicity of variously formed charac-
ters written with the free hand. Similar decisions are made in the further
reduction of the relatively large number of print types to the relatively
small number of types used in a typewriter, and in the design of patterns
to represent characters in a dot-matrix. The issue of character coding
emerges as a problem with the technological shift from traditional man-
ual instruments such as pen, stylus, and brush to mechanized technolo-
gies: movable type, the typewriter, and the digital computer. Whereas
the earlier manual technologies allowed complete flexibility in the final
shape of characters, printing fixed the repertoire of possible shapes into
sets of types (τύπος: that which is struck or impressed; but also a type
as opposed to a token — cf. Plato Republic 396e). With the possibility
of data transmission, it was necessary to ensure that characters on one
machine were mapped accurately to characters on another.
At present, the digital computer offers exciting possibilities and chal-
lenges. There is great flexibility in how a text may be displayed or
printed — designers can even draw upon calligraphic principles that were
not possible within the confines of traditional printing technologies. At
the same time, display is only one of numerous functions that comput-
ers can perform. Computers can exchange textual data over space and
time; they can perform linguistic processing, such as spell-checking, ma-
chine translation, content analysis and indexing, and morphological and
syntactic analysis.10 Display for a human reader should no longer be
10 Computers led first to advances in the culture of calculation. Their application to
text and language processing followed at first only slowly, although we find already in
1949 the first electronic text project in the humanities, namely, Roberto Busa’s computer-
generated concordance Index Thomisticus (Hockey, 2000, 5). Today the Index Thomisticus
lives on as the Index Thomisticus Treebank, a morphologically and syntactically annotated

i i

i i
i i

“LIES” — 2011/6/21 — 15:43 — page 8 — #28

i i

8 CHAPTER 1. INTRODUCTION

considered as the primary determinant of an encoding scheme. Rather,
language should be encoded in such a way as to facilitate automatic processing,
to minimize extrinsic ambiguity and redundancy, and to ensure
longevity. Traditional orthographies — which have led time and again to
scribal corruption, readers’ misunderstandings, and entire industries of
textual criticism — are clearly not optimal. The need to encode Sanskrit,
which has for its entire history been associated with an extremely sophis-
ticated tradition of phonetic and linguistic analysis, provides us with an
exceptional opportunity to rethink some fundamental issues of language
encoding. Traditional orthographies for Sanskrit exhibit a number of in-
felicities in their design that should not be carried over into computer
encodings.

1.2 The Sanskrit language

Sanskrit is the primary culture-bearing language of India, with a con-
tinuous production of literature in all fields of human endeavor over the
course of four millennia. Middle Indo-Aryan languages (Prākrits, Pālī,
Apabhraṁśa, etc.) and New Indo-Aryan and Dravidian languages (regional languages
such as Tamil, Malayalam, Marathi, Hindustani, etc.) served as the me-
dia of literary composition as well since about the third century B.C.E.
Yet the extent and diversity of literature produced in Sanskrit, the long
temporal span of its use, and the breadth of the use of the language
throughout the Indian subcontinent and Southeast Asia are unparalleled.
Indeed, extant literature in Sanskrit constitutes the largest body of liter-
ature in the world prior to the invention of the printing press. The cul-
tural heritage of Sanskrit is extant in some thirty million manuscripts and
serves as an object of study in academic institutions. The language per-
sists in the recitation of hymns in daily worship and ceremonies, as the
medium of instruction in centers of traditional learning, as the medium of
communication in selected academic and literary journals and academic
fora, and as the primary language of a revivalist community near Ban-
galore. Preceded by a strong oral tradition of knowledge transmission,
corpus that will be invaluable in the construction of new NLP tools for post-classical Latin
(see <http://gircse.marginalia.it/~passarotti>). Lamentably, the increasing availability, and
decreasing cost, of computer equipment has led (perhaps paradoxically) to an atavism that
fetishizes display.


1.3 The Devanāgarī script


of consonant characters plus a vowel diacritic, optionally accompanied
by a sign for the nasal anusvāra (ं) or release of breath, visarga (ः). Although
modern languages written in Devanāgarī make less use of complicated
ligatures, sequences of up to five consonants are permissible and
occur in Sanskrit, and in Sanskrit loanwords in modern Indic languages:
• Sanskrit: दङ्क्ष्ण्वोः daṅkṣṇvoḥ GEN/LOC DU M/F of दङ्क्ष्णु daṅkṣṇu ‘mordacious’
• Hindi < Sanskrit: तात्स्थ्य tātsthya ‘metonymy’.
A symbol for a velar fricative [x] (jihvāmūlīya) or bilabial fricative [ɸ]
(upadhmānīya) (usually written ^) may occur instead of the visarga.
Thus, letting C stand for any consonant graph, V for any vowel graph,
and X for the anusvāra or visarga (or jihvāmūlīya, upadhmānīya) graph,
we may describe an orthographic syllable by means of the regular expression
$C^{0-5}VX^{?}$.13 Because all consonant graphs imply an inherent
vowel, a sequence of multiple consonants (consonant cluster, saṁyoga)
must be rendered with a single ligature, in which the shape of constituent
graphs can vary considerably. The shape of the ligature is a function
of the shapes of the constituent consonant graphs. Generally, all conso-
nants are rendered in partial form except the last (the prevocalic one).
Consonant graphs that have a vertical bar to the right are usually stacked
horizontally; round-bottomed consonant graphs, by contrast, are stacked
vertically. Sequences involving /r/ are especially complex: when /r/ occurs
as the initial element of a consonant cluster, it is written as a diacritic
above the line (र्क ⟨rka⟩ = र् + क); elsewhere it takes the form of a diagonal
bar slanted down to the left, attached near the bottom of the graph
that represents the (phonetically) preceding consonant (क्र ⟨kra⟩ = क् + र).
13 Notation: $e^{0-5}$ denotes a concatenation of from zero to five occurrences of $e$;
$e^{?}$ is equivalent to $e^{0-1}$.
Psycholinguistic research suggests “that orthographic representations are organized into
syllable-like units independently from phonological influences” (Ward & Romani, 2000,
654). Cf. Caramazza & Miceli (1990); Badecker (1996, 60 n. 5, 67). For further dis-
cussion with reference to Indic scripts see Sproat (2006); Kompalli (2007). The regular
expression given above formalizes one of the two criteria of orthographic legality: “how
many consonant letters you may have in a row before you must have a vowel” (Ward &
Romani, 2000, 654). Knowledge of orthographic legality also involves knowledge of or-
thotactic constraints on sequences of consonant characters (i. e., is a particular sequence of
characters legal or illegal?).


Some ligatures (e. g., क्ष ⟨kṣa⟩ = क् + ष) have idiosyncratic forms that are
opaque in terms of their constituent analysis, and may thus be considered
“graphic idioms” (Ivanov & Toporov, 1968, 35).14 Traditional Sanskrit
orthography requires glyphs for representing more than a thousand con-
sonant clusters, and it is not uncommon for there to exist four or more
distinct styles for representing a single cluster (Wikner, 2002). Agen-
broad (n.d.) illustrates difficulties in unifying consonantal characters in
single ligatures. Shaw (1980, 28) reports that traditionally Devanāgarī
fonts required 500–800 types for conjunct consonants.
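
The orthographic-syllable pattern given earlier in this section (zero to five consonant graphs, one vowel graph, and an optional anusvāra or visarga graph) can be tried out mechanically. The Python sketch below is only an illustration under stated assumptions: it works on pre-segmented Roman transliteration rather than on Devanāgarī glyphs, it treats every segment that is not a vowel or a final sign as a consonant, and the names `VOWELS`, `FINALS`, `classify`, and `orthographic_syllables` are hypothetical helpers, not part of any encoding scheme discussed in this book.

```python
import re

# The pattern from the text: up to five consonants, one vowel, optional X.
SYLLABLE = re.compile(r"C{0,5}VX?")

# Assumed segment inventories for this sketch (Roman transliteration).
VOWELS = {"a", "ā", "i", "ī", "u", "ū", "ṛ", "ṝ", "ḷ", "e", "ai", "o", "au"}
FINALS = {"ṁ", "ḥ"}  # anusvāra and visarga


def classify(segment):
    """Map one transliterated segment to its class letter: C, V, or X."""
    if segment in VOWELS:
        return "V"
    if segment in FINALS:
        return "X"
    return "C"  # assumption: anything else is a consonant


def orthographic_syllables(segments):
    """Group segments into orthographic syllables, or raise ValueError."""
    classes = "".join(classify(s) for s in segments)
    syllables = []
    pos = 0
    for match in SYLLABLE.finditer(classes):
        if match.start() != pos:  # some segments were skipped: not covered
            raise ValueError("sequence not covered by the pattern")
        syllables.append("".join(segments[match.start():match.end()]))
        pos = match.end()
    if pos != len(classes):  # leftover segments after the last syllable
        raise ValueError("sequence not covered by the pattern")
    return syllables


print(orthographic_syllables(["d", "a", "ṅ", "k", "ṣ", "ṇ", "v", "o", "ḥ"]))
print(orthographic_syllables(["t", "ā", "t", "s", "th", "y", "a"]))
```

Applied to the two five-consonant examples above, the sketch splits daṅkṣṇvoḥ into da + ṅkṣṇvoḥ and tātsthya into tā + tsthya. A sequence with six consonants before a vowel, or one ending in a bare consonant, is rejected, because the pattern as stated does not cover it.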
An examination of the visual characteristics of Devanāgarı̄ script
helps to explain its graphotactic properties. Hamp (1959, 2) uses the
term ‘graphotactic’ for the combination of graphic units by analogy with
the term ‘phonotactic’. The two most obvious visual features of Devanāgarī
are the headstroke (śirorekhā) that runs horizontally across the top
of a sequence of Devanāgarı̄ consonant graphs,15 and the vertical bar that
appears at the right of many characters. The portion of the character that
is densest in information (in information-theoretic terms) is below the
14 Voigt (2005, 34) argues that ⟨क्ष⟩ originally was not a ligature, but rather was derived
directly from Aramaic ⟨ṣ⟩ and was used to represent [ts] (possibly with the final component
glottalized: [ts’]).
15 This feature arose from the technology of calligraphy (Ghosh, 1983, 16). The headstroke
developed from an earlier head mark, which evolved in turn from the triangle of
ink formed by the first placement of the pen at the start of drawing a character (Salomon
1998, 31–8l; Shaw 1980, 28). In typographic terms, the headstroke in Devanāgarī is the
equivalent of the baseline in scripts such as Latin and Greek (cf. Katsoulidis 1996).
Ivanov & Toporov (1968, 35) offer a doubtful functional explanation of the śirorekhā.
They write:
The continuity of the phonetic stream is reflected in the continuity of the
graphic chain: separate syllabic symbols in a word and separate words
themselves are connected by an uninterrupted horizontal line. This feature
of the Indian writing can be explained not only by its phonetic character
but also by the specific character of the word in Sanskrit where a significant
role is played by long compound words which are sometimes functionally
analogous to entire syntagms.
Such an explanation cannot be accepted because there is no correlation between the pho-
netic unity and the graphic unity of strings united by a headbar or separated by a gap in
the headbar. There is no greater phonetic unity in tasmātkaroti than in anyo ’gacchat even
though the latter breaks the headbar between words, and the former forms a conjunct conso-
nant running the headbar across two words. Moreover, manuscripts write entire sentences
uninterrupted regardless of word boundaries.


(b) A consonant that follows /H/ is drawn within the open circle
that comprises the lower half of the ह, utilizing the roof and
right of this circle as its upper horizontal or right vertical bar.
(e. g. ह्ल = ह् + ल)

Vowels are mostly written in Devanāgarī with diacritics, which may
appear above, below, to the left, or to the right of the onset of the orthographic
syllable. For example, diacritics are written above (के कै), below
(कु कू कृ कॄ कॢ), to the left (कि), or to the right (का की को कौ) of the
consonant character क ⟨ka⟩. Utterance-initially,
however, independent vowel characters are used. This practice
seems to reflect the influence of Semitic scripts (Scharfe 2002; Voigt
2005, 44). In Semitic writing, words do not begin with a vowel; this
is a consequence of Semitic word structure, in which only consonants
are allowed in word-initial position (Miller, 1994, 56).18 Two consonant
symbols, aleph (representing a glottal stop) and ʿayin (representing a
pharyngeal or epiglottal voiced continuant) (McCarthy, 1994), that are
frequent word-initially in Semitic are likely not to have been recognized
as representing consonant sounds by speakers of languages that lacked
the phonemes represented (cf. Driver 1976, 154–155, 178–179; Miller
1994, 46).19 Thus the Brāhmī characters that developed into Devanāgarī
अ/आ derive from the Aramaic aleph (for which the Aramaic name
was ālaph), and the characters that developed into ए/ऐ derive from the
Aramaic ʿayin (for which the Aramaic name was ʿēn). The characters
ओ औ are secondary developments from अ. Characters for
independent r̥ and au are not attested until the second half of the first
millennium C.E. (Scharfe, 2002, 393). In Kharoṣṭhī initial vowels are
formed from attaching the dependent vowel signs to a character derived
from aleph. Although Indian grammarians do not include the glottal stop
in their phonologies, we may conceive of the independent initial vowel
signs (अ आ इ ई उ ए ऐ ओ औ) as representing glottal stop
+ vowel,20 an idea apparently anticipated already by Lepsius (see Whit-
18 For the Arabic grammarians’ treatment of this fact, see Hadj-Salah (1971, 74); Al-

Nassir (1993, 22).

19 A number of Middle and Modern Aramaic dialects show , ayin having weakened into

the glottal stop [P] (Kaufman 1984, 93 n. 40; Hoberman 1985, 224).
20 In Kharosthı̄ initial consonants are formed from attaching the dependent vowel signs
..
to a character derived from aleph (Scharfe, 2002, 393).

i i

i i
i i

“LIES” — 2011/6/21 — 15:43 — page 15 — #35

i i

1.3. THE DEVANĀGARĪ SCRIPT 15

ney 1861, 328).21 The independent vowel signs appear word-internally

in the rare (Salomon, 1998, 15 n. 26) Sanskrit lexical items that contain
a sequence of vowels in hiatus, e .g. :pra;o+.ga praüga ‘front part of the shafts
of a chariot’, and in compounds, e. g. manaA;apa manaāpa ‘gaining the heart,
attractive, beautiful’.
Nasalization and pitch accents are written in Devanāgarī with addi-
tional diacritics. Nasalization is written by a half-moon plus dot (candra-
bindu) over the vertical bar of the nasalized sound (e. g. ताँश्च tā̃śca). The
accentual systems of Vedic schools vary. The most widely used, the R̥g-
vedic accentual system, generally places a horizontal stroke beneath the
CV portion of an orthographic syllable that includes a low-pitched vowel
(anudātta) (e. g. क॒), and a vertical stroke above the CV portion of an or-
thographic syllable that includes a circumflexed vowel (svarita) (e. g. क॑).
Short and long aggravated svaritas (kampa) use the numerals 1 and 3 in
addition (न्य१॒॑, ष्यो॒३॒॑). The high pitch (udātta) is left unmarked. Other
accentual systems employ additional diacritics, including various signs
above, below, to the left, to the right, through the middle of, and around
the CV portion of an orthographic syllable that includes a circumflexed
vowel; within a given system, various signs differentiate particular types
of circumflex accent. Diacritics added to the visarga symbol indicate
high pitch, low pitch, or circumflex.22

21 Owing to sandhi, initial independent vowel signs will be written only (1) in hiatus,
i. e. the environment V##V; or (2) in pausa (initially in a major phonological phrase). Al-
though the glottal stop is not a phoneme of English, it commonly occurs in inter-word
hiatus, e. g. heavy oak; steady awning — N. B. that the glottal stop is not ordinarily real-
ized as full glottal closure (Hillenbrand & Houde, 1996); cf. Hadj-Salah (1971, 73 n. 63).
Similar phonetics is likely to obtain in Sanskrit. Note that inter-word hiatus is often consid-
ered exceptional — careful authors of ancient Greek prose, for example, avoided it entirely
(Benseler, 1841). Many languages typically eliminate within-word hiatus (Clements, 1990,
301) or disallow it entirely (Romani & Calabrese, 1998, 102).
22 Cardona 1997, li–lxiv; Witzel 1974.


1.4 Roman transliteration

As Sanskrit studies became important in the West, European scholars de-
vised methods to transliterate Sanskrit text in Roman script. The early
history of efforts to standardize such methods is described in the pref-
ace to the dictionary of Monier-Williams (1872). The eminent Sanskritist
William D. Whitney made some comments in 1880 in the Proceedings of
the American Oriental Society (Whitney, 1880). Whitney accords West-
ern scholars great license, writing, “the language is written in India, to
no small extent, in whatever alphabet the writers are accustomed to em-
ploy for other purposes; and there is no reason why we may not allow
ourselves to do the same” (Whitney, 1880, li). He considers questions of
how to mark vocalic quantity in Romanized Sanskrit, examines the ques-
tion of how the diphthongs should be presented, prefers ṛ (or Lepsius’
r̥) to ṛi (likewise ḷ or l̥ to lṛi, characterized as “that monstrous absur-
dity”), and devotes considerable discussion to the matter of anusvāra. He
concludes, “To sum up briefly: the items to be most strongly urged, as
involving important principles, are the use of ṛ and ṣ for the lingual vowel
and lingual sibilant respectively; of next consequence, for the sake of uni-
formity, is the adoption of the signs c, j, y, ç for the palatal sounds; the
designation of long vowels, of the diphthongs, of the nasals, are minor
matters, which will doubtless settle themselves by degrees in the right
manner” (Whitney, 1880, liii).
Of particular importance as regards standardization of the schemes
used by European scholars was the Geneva Oriental Congress of 1894
(Wujastyk, 1996). Contemporary schemes for Romanizing Sanskrit are
quite similar to those employed in the nineteenth century and are charac-
terized by the following conventions:
1. Sanskrit sounds that correspond to normal values for Roman letters
are represented by those letters (e. g. b = [b]).
2. The letter h, which by itself indicates a phoneme /H/, is used also
to indicate the aspirate series of stops in digraphs such as bh.
3. The retroflex consonants are indicated with an underdot (e. g. ṭ).
4. A macron indicates a long vowel (e. g. ā).
5. The palatal nasal is written ñ; the velar, ṅ.


x). The Alphabet was also associated with progress in communication

technologies: “the adoption of a Romanic system [. . . ] would enable
Indians to bring into use for their own languages such modern devices as
the teleprinter and tape machine, with consequent great advantage to the
Indian Press” (Jones, 1942, 17).26
Despite the ambitions of Firth, the Alphabet was scarcely used. Sev-
eral textbooks made use of it, including A. H. Harley’s Colloquial Hin-
dustani (Harley, 1955) and T. Grahame Bailey’s Teach Yourself Urdu
(edited by Firth and Harley, and originally entitled Teach Yourself Hin-
dustani) (Bailey, Firth & Harley, 1956). The Alphabet comprised a core
set of characters, with extensions added for sounds present only in spe-
cific Indian languages. Firth worked out orthographies based on the Al-
phabet for Hindustani (Hindi and Urdu), Marathi, Gujarati, Tamil, Tel-
ugu, and Sinhalese (the last devised by Jones and Perera) (Jones, 1942,
13), as well as Burmese and Persian (Firth, 1936). Occasionally Firth’s
orthography appeared in the publications of linguists associated with the
School of Oriental and African Studies (SOAS) at the University of Lon-
don, for instance Allen (1951).
Although the All-India Alphabet seems not to have been used for
Sanskrit, Firth included symbols for spelling Sanskrit words as they ap-
pear in Hindi. Moreover, W. Sidney Allen adapted the Alphabet for San-
skrit (Allen, 1953). The Alphabet was designed as a scientific orthog-
raphy, “an alphabet that embodies all the latest findings of phonetics,
linguistics and psychology, and which satisfies the demands of the ty-
pographer, the typewriter, and the calligraphist” (Jones, 1942, 10). The
Alphabet tends to represent phonological rather than phonetic distinc-
tions (Firth, 1936, 539). Surface morphophonological alterations and
phonetic differences are not supposed to be represented in the orthogra-
phy (Jones, 1942, 5–6). On the whole Firth aims at representing single
sounds with single characters, but he departs for various reasons, em-
ploying at times digraphs and even trigraphs (e. g. phw27 for a bilabial
aspirated stop with velar co-articulation in Burmese) (Firth, 1936, 543).
The design of the Alphabet is motivated by ease of reading (legibility
26 Such arguments were once made also for China and Japan (Ramsey 1989, 143–154;

FIGURE 2.1: Engraved plate illustrating the Devanāgarī script from
Athanasius Kircher, China Illustrata, 1667.


FIGURE 2.2: Hitopadeśa Introduction 2ab excerpted from Charles
Wilkins, A Grammar of the Sanskrĭta Language, 1808 (set with
Devanāgarī type of the author’s design).

1748) included two hundred translations of the Lord’s Prayer in various

languages and writing systems, Indian ones among them (Firth, 1936,
519). The first movable types for Devanāgarı̄ were successfully cast in
the 1740s in Rome for the press of the Congregatio de Propaganda Fide
(Glaister 1979, 134; Shaw 1980, 29).2
The first important book printed in an Indic script is commonly held
to be the Bengali grammar of Nathaniel Brassey Halhed (1751–1830),
published in 1778, with type cast by Charles Wilkins (b. 1749–1750;
d. 1836) (Smith 1885, 211, 242; Priolkar 1958, 51–53; Diehl 1968;
cf. Firth 1946, 119–120), who later designed the first truly serviceable
Devanāgarī type (see FIGURE 2.2) (Diehl, 1968, 335–336). Printed
Devanāgarı̄ in India appears as early as 1789, with The New Asiatick
Miscellany published by the Chronicle Press of Calcutta (Shaw, 1980,
29).
In 1804 the English shoemaker and Baptist missionary William Carey
published a Sanskrit reader at Serampore, thus making, in the words
of H. T. Colebrooke, the “first attempt to employ the press in multiply-
ing copies of Sanscrĭt books with the Dévanagarí character” (Windisch,
1917, 28). A Devanāgarı̄ font subsequently produced (in 1806) under
the supervision of Carey contained nearly a thousand character combina-
tions (Smith 1885, 243; Priolkar 1958, 59, 63, 65). Carey’s Devanāgarī
was used not only for setting Sanskrit, but also for vernacular languages
such as Marathi, Hindi, Nepali, and Gujarati (Shaw, 1980, 30).

2 On the early history of Devanāgarī typography in Europe, see Windisch (1917, 70,
78–79); Glaister (1979, 134–136); Shaw (1980).

Hot-metal typesetting came to India in the 1920s when the Mergen-
thaler Linotype Company started shipping Indic fonts for its linecasters
(Ross, 2002). The Monotype Corporation cut a 12 point Devanāgarı̄ font
for hot-metal typesetting as early as 1923 (Shaw, 1980, 28). Hot-metal
technology, however, necessitated “severely restricted character sets, the
lack of kerning, and the inability to position the subscribed or super-
scribed vowel signs” (Ross, 2002).3 The Indologist W. Norman Brown
(1892–1975), founder of the first South Asia area studies program in
the United States (at the University of Pennsylvania), served as consul-
tant to the Mergenthaler Linotype Company in the 1930s and subsequent
decades. Brown considered script reform measures that would ease the
transition to modern technologies such as hot-metal typesetting.4 The
Devanāgarı̄ script reform committee of Uttar Pradesh made several rec-
ommendations (1940), including:

1. to abandon the practice of vertical stacking of characters in con-
juncts; instead characters with a vertical bar should form conjuncts
using their combining form (without the vertical bar), and con-
juncts involving other consonants should be indicated by means of
the virāma;
2. to eliminate the exceptional directionality of certain characters: ⟨i⟩
is to be written with a new symbol that follows the consonant, ⟨r⟩
in clusters is to be replaced by a new symbol that does not disrupt
the linear order;
3. to indicate anusvāra by a small circle at the right (Brown, 1953, 4).

The aim of these reforms was to reduce the number of pieces of type
needed to set Devanāgarı̄. (Traditionally, Devanāgarı̄ type required four
3 “The Linotype mechanism put constraints on type face design because the machine

could not emulate all the features of manuscript; in particular, where adjacent elements
overlap vertically” (Kahan, 2000, 190). See also Ghosh (1983, 10).
4 Politicians of course had their say in the matter. Jawaharlal Nehru for some time con-
sidered the benefits that might follow from adopting the Roman alphabet. Gandhi sought
to replace the independent vowel signs of Devanāgarī with the sign अ, together with the
dependent vowel signs.


a standard or used in other projects. The code is inadequate for Sanskrit,
since it provides no way to represent r̥̄, l̥, etc.

2.4 ISCII
The Indian Script Code for Information Interchange (ISCII) is an Indian
national standard; the first version was published by the Indian Depart-
ment of Electronics (DOE) in 1983 (Bhatt, n.d.). More recent versions
have been published in 1986, 1988, 1991, and 1998. ISCII is designed
to support Devanāgarı̄ as well as nine other Brāhmı̄-derived scripts: Gu-
jarati, Panjabi, Assamese, Bengali, Oriya, Telugu, Tamil, Malayalam and
Kannada. These scripts are the primary means of writing for the twenty-
two nationally recognized languages of India, with the exception of those
that are primarily written in Perso-Arabic script, viz. Urdu, Kashmiri,
Sindhi (Singh, 1997).
ISCII employs a single set of codepoints for ten distinct scripts. Thus
the syllable hkai is encoded identically whether it is written in Devanā-
garı̄, Gujarati, or Malayalam. The general structural principles of ISCII
are based on those of the Brāhmı̄-derived scripts. In general:
• Consonants imply /a/, unless overridden by either an explicit vowel
or the HALANT character (= virāma, i.e. the ∅ vowel).
• Separate codepoints exist for independent and dependent vowel
signs.

• Characters are encoded in logical (phonetic) rather than visual or-
der.
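
The structural principles listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses Unicode Devanāgarī code points rather than ISCII byte values, it covers a toy subset of characters, and the names `to_sounds`, `CONSONANTS`, `DEPENDENT_VOWELS`, and `INDEPENDENT_VOWELS` are hypothetical helpers, not part of any standard.

```python
# Toy subset only; the code points are Unicode Devanagari, not ISCII bytes.
KA, TA, RA = "\u0915", "\u0924", "\u0930"
VIRAMA = "\u094D"  # Unicode counterpart of the HALANT character

CONSONANTS = {KA: "k", TA: "t", RA: "r"}
DEPENDENT_VOWELS = {"\u093E": "ā", "\u093F": "i", "\u0947": "e"}
INDEPENDENT_VOWELS = {"\u0905": "a", "\u0906": "ā", "\u0907": "i", "\u090F": "e"}


def to_sounds(text):
    """Expand a logically ordered character stream into a list of sounds."""
    sounds = []
    for index, char in enumerate(text):
        following = text[index + 1] if index + 1 < len(text) else ""
        if char in CONSONANTS:
            sounds.append(CONSONANTS[char])
            # The inherent /a/ holds unless a vowel sign or the virama follows.
            if following != VIRAMA and following not in DEPENDENT_VOWELS:
                sounds.append("a")
        elif char in DEPENDENT_VOWELS:
            sounds.append(DEPENDENT_VOWELS[char])
        elif char in INDEPENDENT_VOWELS:
            sounds.append(INDEPENDENT_VOWELS[char])
        elif char != VIRAMA:
            raise ValueError(f"character outside the toy subset: {char!r}")
    return sounds
```

In the stream for ⟨ki⟩ the vowel sign is stored after the consonant even though it is drawn to its left, which is what encoding in logical order amounts to.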
ISCII is an abstract encoding that does not specify the particular glyphs
used to represent the underlying character stream. Proper rendering of
ISCII-encoded text requires knowledge of the script behaviors for a par-
ticular writing system. ISCII-1991 (IS 13194:91) defines three important
control characters (Bureau of Indian Standards, 1992):
1. INV: an abstract “invisible” consonant allows for the rendering of
diacritic signs which would normally have to be positioned with
respect to a particular consonant graph.


2. EXT: introduces extensions, including the Vedic extensions (31
symbols) specified in Annex G: special signs for jihvāmūlīya, upa-
dhmānīya, and visarga; special signs for anusvāra; diacritics for
accents (varieties of udātta, anudātta, svarita, and kampa); and an
abbreviation sign and filler mark. These symbols do not exhaust
the repertoire employed by the various Vedic schools.
3. ALT: prefixes a character or script attribute code that allows for
character styles such as boldface or italic and for Indic script se-
lection such as Bengali or Gujarati.

2.5 Unicode: Indic scripts

The Unicode Standard is an evolving character encoding designed to pro-
vide support for a great many of the modern and ancient languages of
the world (Unicode Consortium, 2006). Many code blocks in Unicode
are based on existing national or international standards; the Devanāgarı̄
block of Unicode is based on ISCII-1988. Unicode differs from ISCII in
that it provides separate blocks, isomorphic with one another to the great-
est degree possible for each script, for eight other Indic scripts covered by
ISCII. By design, Unicode encodes plain text and leaves non-distinctive
character styles such as boldface or italic to a higher-level protocol. By
employing separate blocks for distinct Indic scripts and by encoding only
plain text, Unicode needs no equivalent for the ISCII ALT character. Ver-
sion 5.0 of Unicode did not support characters needed for the adequate
representation of Vedic texts. It did not include the Vedic character ex-
tensions in ISCII Annex G. The authors of the present volume drafted
a joint proposal in collaboration with Michael Everson, the Irish repre-
sentative to ISO 10646 (Universal Character Set), R. K. Joshi and Alka
Irani of the Centre for Development of Advanced Computing (C-DAC)
in Mumbai, Swaran Lata of the Department of Information Technol-
ogy in the Ministry of Communications & Information Technology of
the Government of India, New Delhi, and other scholars. The Unicode
Technical Committee and International Standards Organization accepted
sixty-eight new characters for Vedic and historical Indic which became
part of Unicode Standard 5.2, and amendment 6 of ISO/IEC 10646:2003
in the Fall of 2009. The new characters are included in two code pages:


(क्क), (2) a single glyph with two components stacked horizontally
(क्‍क), or (3) ⟨ka⟩ + virāma + ⟨ka⟩ (क्‌क).
A number of criticisms of Unicode, with reference to Indic scripts,
can be found (Hellingman, 1998; White, 2002). We focus here on those
features that may be perceived as anomalous from the point of view of
the Sanskritist:
1. As Yannis Haralambous writes, Unicode is (for historical reasons)
“quite awkward: it is partly logical and partly graphical” (Har-
alambous & Plaice, 2002). Separate versions of vowels (e. g. /ā/)
exist for the independent (आ) and dependent (ा) forms. But the
distribution of these vowel forms is entirely complementary.
2. In order to code the isolated consonant /k/, it is necessary to use the
sequence U+0915 (क) U+094D (्) (DEVANAGARI LETTER KA
+ DEVANAGARI SIGN VIRAMA). Here a character is needed to
encode the zero-vowel, whereas in U+0915 (क) (DEVANAGARI
LETTER KA) no distinct character encodes the vowel /a/.
(a) Shaping engines are supposed to provide a suitable ligature
for क ⟨ka⟩ + virāma + क ⟨ka⟩ (= क्क); in order to prevent liga-
ture formation, a special character ZWNJ (U+200C: ZERO
WIDTH NON-JOINER) is needed: U+0915 (क) + U+094D
(्) + U+200C (ZWNJ) + U+0915 (क) → क्‌क. Similarly,
to form the horizontally stacked conjunct, the special char-
acter ZWJ (U+200D: ZERO WIDTH JOINER) is needed:
U+0915 (क) + U+094D (्) + U+200D (ZWJ) + U+0915 (क)
→ क्‍क. These two format characters correspond to nothing
either visual or linguistic.
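
The three code sequences discussed in point 2 can be inspected with Python's standard `unicodedata` module. The sketch below is illustrative only: the dictionary labels and the helper names `describe` and `strip_format_characters` are assumptions introduced here, and how each sequence is actually drawn depends on the font and shaping engine.

```python
import unicodedata

KA, VIRAMA = "\u0915", "\u094D"
ZWNJ, ZWJ = "\u200C", "\u200D"

# Labels are informal descriptions chosen for this sketch.
SEQUENCES = {
    "default conjunct": KA + VIRAMA + KA,
    "explicit virama": KA + VIRAMA + ZWNJ + KA,
    "half-form conjunct": KA + VIRAMA + ZWJ + KA,
}


def describe(text):
    """List each code point with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def strip_format_characters(text):
    """Drop characters of general category Cf (format), such as ZWJ and ZWNJ."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


for label, sequence in SEQUENCES.items():
    print(label, describe(sequence))
```

Once the format characters are stripped, all three sequences reduce to the same string, which is one way of seeing that ZWJ and ZWNJ carry display information only.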

2.6 CS (Classical Sanskrit) and CSX (Classical Sanskrit Extended)
In 1990 a group of scholars at the 8th World Sanskrit Conference in Vi-
enna agreed on an 8-bit encoding for transliterated Sanskrit called CS
(Wujastyk, 1990). A superset of this standard, CSX (Classical Sanskrit
Extended), was also devised, which allowed for characters used in the


be formed with a two-character sequence, using the combining diacrit-
ics. For example: r̥ = U+0072 (LATIN SMALL LETTER R) + U+0325
(COMBINING RING BELOW); ā́ = U+0101 (LATIN SMALL LETTER A
WITH MACRON) + U+0301 (COMBINING ACUTE ACCENT). For San-
skrit, three stacked diacritics will sometimes be needed. Diacritic stack-
ing for rendering takes place at the OS/font level or the application lev-
el.18 Up to three diacritics may need to be stacked above a Roman char-
acter (length + nasalization + accent), in addition to one below (e. g. ring
below indicating syllabicity of a liquid).
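
The composition of such characters can be checked with Python's standard `unicodedata` module. The following is a minimal sketch; the helper names `compose` and `code_points` are assumptions made for illustration. It normalizes to NFC, so a precomposed character is used where Unicode provides one and combining marks remain where it does not.

```python
import unicodedata

RING_BELOW, MACRON, ACUTE, DOT_BELOW = "\u0325", "\u0304", "\u0301", "\u0323"


def compose(base, *marks):
    """Join a base letter with combining marks and normalize to NFC."""
    return unicodedata.normalize("NFC", base + "".join(marks))


def code_points(text):
    """Return the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(compose("r", RING_BELOW)))      # vocalic r: stays two characters
print(code_points(compose("a", MACRON, ACUTE)))   # long accented a: two characters
print(code_points(compose("t", DOT_BELOW)))       # retroflex t: one precomposed character
```

The first two results remain two-character sequences because no precomposed character exists for them, whereas the underdotted t collapses to a single code point.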

2.9 7-bit meta-transliterations

7-bit meta-transliterations are designed to be pure ASCII transliterations
that may be mapped unambiguously onto an encoding that assigns a
unique codepoint to each character in an underlying Romanization (La-
gally, 1999).19 Reversibility is guaranteed by ensuring that the meta-
transliteration satisfies the Fano condition: no code word is a prefix of
any other code word (Fano, 1966, 67). If the meta-transliteration is based
on a conventional Romanization, it should be human-readable to some
degree.
To represent diacritics, meta-characters are chosen; thus ⟨.⟩ (ASCII
PERIOD) may represent an underdot. Such a meta-transliteration for Ro-
manized Sanskrit would use .n to encode ṇ, the retroflex nasal spelled in
Devanāgarī with the character ण्. If it is desired to encode the period, this
may be indicated uniquely as PERIOD + SPACE. A meta-transliteration
inherits defects in the corresponding Romanization. Thus, if we Ro-
manize the voiceless aspirate dental (in Devanāgarī, थ्) as th, the meta-
transliteration th satisfies the Fano condition for the Romanization, but
not for Devanāgarī, as will be exemplified in the next section.
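
The Fano condition is easy to test mechanically. The sketch below assumes a hypothetical helper `is_prefix_free` and uses small toy inventories of code words rather than any complete transliteration scheme.

```python
def is_prefix_free(code_words):
    """Return True if no code word is a proper prefix of another code word."""
    words = set(code_words)
    return not any(
        shorter != longer and longer.startswith(shorter)
        for shorter in words
        for longer in words
    )


# Toy inventories, for illustration only.
print(is_prefix_free({"t", "h", "th"}))      # False: "t" is a prefix of "th"
print(is_prefix_free({"w", "W", "x", "X"}))  # True: one character per sound
print(is_prefix_free({".n", ". ", "n"}))     # True: PERIOD + SPACE keeps the period distinct
```

A code that passes this check can be decoded left to right without lookahead, which is what makes the meta-transliteration reversible.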
18 At the 12th World Sanskrit Conference in Helsinki, 13–18 July, 2003, a proposal was

circulated, under the name “The Vāmana Project”, to add to Unicode all characters needed
for implementing ISO 15919 in precomposed format. It is, however, the policy of the Uni-
code consortium to add no new precomposed characters, where characters can be composed
from presently-encoded characters.
19 Such input schemes are used, for instance, in Lagally’s excellent ArabTeX package
(Lagally, 2004).


The meta-transliterations have the advantages of being round-trippable
(e. g. to CSX) and easily manipulable in virtually any software
environment, since they are pure ASCII and can be read by humans with
only a minimum of effort. A tabular overview of a modified form of the
Velthuis scheme, the Kyoto-Harvard scheme, the “wx” (or Hyderabad-
Tirupati) scheme, as well as SLP1, is given by Huet (2009, 196).

2.10 Velthuis transliteration and ITRANS

The Velthuis transliteration is named for the Dutch scholar Frans Velthuis
(Wujastyk, 1996).20 It does not satisfy the Fano condition for represent-
ing Sanskrit phonemic strings, since (for example) the voiceless aspirate
dental may be coded th, which is potentially ambiguous with respect
to a sequence representing a voiceless dental /t/ followed by a voiced
glottal fricative /H/. Since Sanskrit phonotactics forbids such a sequence,
Velthuis applications can assume that the sequence th uniquely repre-
sents the voiceless aspirate dental. Problems will still arise elsewhere,
as in the case where digraphs for diphthongs are spelled identically with
sequences of distinct vowels. For instance, additional means will be re-
quired to disambiguate between the diphthong au and sequence of simple
vowels a + u.
Velthuis also offers alternative ways of transliterating certain speech
sounds, e. g. O for the diphthong au, T for the voiceless aspirate dental
th, and .T for the voiceless aspirate retroflex ṭh. If only these
alternatives are used, the meta-transliteration satisfies the Fano condition.
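
The ambiguity described above can be made concrete by counting the ways a string can be split into code words. This is a sketch under stated assumptions: `count_parses` is a hypothetical helper, the code-word sets are toy fragments rather than the full Velthuis inventory, and code words are assumed to be non-empty.

```python
def count_parses(text, code_words):
    """Count the segmentations of text into code words (dynamic programming)."""
    ways = [1] + [0] * len(text)  # ways[i]: segmentations of text[:i]
    for end in range(1, len(text) + 1):
        for word in code_words:
            start = end - len(word)
            if start >= 0 and text[start:end] == word:
                ways[end] += ways[start]
    return ways[len(text)]


# With the digraph "au" alongside "a" and "u", the string has two readings.
print(count_parses("tau", {"t", "a", "u", "au"}))  # 2
# With the single-character alternative "O" for the diphthong, only one reading remains.
print(count_parses("tO", {"t", "a", "u", "O"}))    # 1
```

A count greater than one signals that additional information, such as phonotactic knowledge or an explicit separator, is needed to recover the intended sound sequence.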
Charles Wikner’s package “Sanskrit for LaTeX 2ε” (Wikner, 2002)
employs a modified version of the Velthuis scheme. The ITRANS (In-
dian languages TRANSliteration) scheme, used by a popular software
package (developed by Avinash Chopde) for transliteration and recod-
ing, also significantly resembles the Velthuis scheme (Pandey, 1998).21
An ITRANS package is available for TeX, which allows for typesetting
Devanāgarı̄, Tamil, Bengali, Telugu, Gujarati, Kannada, Panjabi, and Ro-
manized Sanskrit using the ITRANS software and transliteration conven-
tions (Syropoulos et al., 2003, 351–355).
20 Cf. Bakker, Barkhuis & Velthuis (1990).
21 <http://www.aczoom.com/itrans/>


2.11 wx
The authors of the textbook Natural Language Processing: A Paninian
Perspective present a scheme for “[i]nternal representation in the com-
puter” that shares many design principles with our SLP1 (Bharati, Chai-
tanya & Sangal, 1996, 193). In the wx scheme (so dubbed after the
characters used to encode the dental stops t and d), a single charac-
ter represents a single speech sound. Equivalences are more or less
straight-forward. Lower-case ASCII letters represent short vowels or
close diphthongs, while upper-case letters represent long vowels and
open diphthongs. The symbol q represents r̥, and L, l̥ (Huet, 2009,
196); while no provision is made at all for r̥̄ or l̥̄. The graphic oppo-
sition lowercase–uppercase consistently represents the phonological op-
position unaspirated–aspirated. Some characters have a peculiar repre-
sentation: e. g. the velar nasal ṅ (f) and the palatal nasal ñ (F). The den-
tal oral stops t, th, d, dh are represented as w, W, x, X, whereas the
retroflex ṭ, ṭh, ḍ, ḍh are represented as t, T, d, D. This convention
is no doubt motivated by the fact that speakers of Modern Indic and Dra-
vidian languages regularly perceive English alveolar stops as retroflex.22
The retroflex sibilant ṣ is represented as R. This scheme, despite its con-
siderable virtues, seems not to be widely used, although Indian students
in NLP study it, and it plays a role in the Anusaaraka suite of NLP soft-
ware,23 including the Sanskrit morphological analyzer of Amba Kulkarni
and V. Sheeba. The scheme is, however, fundamentally limited, since it
does not allow for the full set of vocalic liquids described by the Sanskrit
grammarians, the unaspirated and aspirated retroflex lateral flaps ḷ and
ḷh, any system of accents, or other sounds peculiar to Vedic traditions.
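
Because wx assigns one character to each sound, decoding requires no lookahead. The sketch below covers only the correspondences named in this section; the table name `WX_TO_ROMAN` and the function `wx_to_roman` are assumptions, and passing unlisted characters through unchanged is a simplification made for illustration, not a statement about the full scheme.

```python
# Only the correspondences mentioned in the text are listed here.
WX_TO_ROMAN = {
    "w": "t", "W": "th", "x": "d", "X": "dh",                        # dental stops
    "t": "\u1E6D", "T": "\u1E6Dh", "d": "\u1E0D", "D": "\u1E0Dh",    # retroflex stops
    "f": "\u1E45", "F": "\u00F1",                                    # velar and palatal nasals
    "R": "\u1E63",                                                   # retroflex sibilant
    "q": "r\u0325", "L": "l\u0325",                                  # vocalic liquids
}


def wx_to_roman(text):
    """Map each wx character to a Romanized segment, one segment per character."""
    # Characters not in the table are returned unchanged (a simplification).
    return [WX_TO_ROMAN.get(char, char) for char in text]


print(wx_to_roman("wWxX"))
print(wx_to_roman("tTdD"))
```

Each input character yields exactly one output segment, so the number of segments always equals the length of the wx string.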

2.12 Kyoto-Harvard
The Kyoto-Harvard transliteration is not a meta-transliteration as defined
above. It instead chooses one or two symbols for each Sanskrit speech
sound, with the addition of some special-use symbols (Wujastyk, 1996).
22 Thus in Hindi, for example, both instances of alveolar [t] in tractor become [ʈ]: ट्रैक्टर.
The retroflex series of stops in Hindi contrasts (as in Sanskrit) with pure dentals: [t̪], [t̪ʰ],
[d̪], [d̪ʰ] (not alveolars). Cf. Harley (1955, xix).
23 <http://ltrc.iiit.net/~anusaaraka/>


Where the conventional Romanization for a Devanāgarī character can
be represented in ASCII, Kyoto-Harvard uses that representation. Oth-
erwise: r̥ → R, l̥ → L; long vowels are represented by their upper-case
equivalents, except r̥̄ → q, l̥̄ → E; ṅ → G; ñ → J; retroflex consonants are
uppercased (and followed by h if they are aspirated); ś → z; ṁ → M; ḥ
→ H. Special symbols exist also for anunāsika (&), jihvāmūlīya and upa-
dhmānīya (x and f), the udātta and svarita accents (; and :), external
sandhi (^), and compound junction (.). A variant form of the Kyoto-
Harvard scheme is sometimes used, in which long vowels are indicated
by doubling the symbol for the short vowel.
A significant number of Sanskrit texts have been entered in this for-
mat. Unfortunately, it is not ideal, since it allows ambiguity such as that
between the diphthong au and the sequence of simple vowels a + u.
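
The rules just listed, and the ambiguity they leave, can be sketched as follows. The table follows the correspondences given in this section; the names `KH` and `kh_encode` are assumptions, the input is taken to be a list of already segmented Romanized sounds, and segments not in the table are passed through unchanged as a simplification.

```python
import unicodedata


def _nfc(text):
    return unicodedata.normalize("NFC", text)


# Correspondences as described in this section; everything else passes through.
KH = {_nfc(key): value for key, value in {
    "r\u0325": "R", "l\u0325": "L",                    # vocalic liquids
    "r\u0325\u0304": "q", "l\u0325\u0304": "E",        # long vocalic liquids
    "\u0101": "A", "\u012B": "I", "\u016B": "U",       # long vowels
    "\u1E45": "G", "\u00F1": "J",                      # velar and palatal nasals
    "\u1E6D": "T", "\u1E6Dh": "Th", "\u1E0D": "D", "\u1E0Dh": "Dh",
    "\u1E47": "N", "\u1E63": "S",                      # other retroflex consonants
    "\u015B": "z", "\u1E41": "M", "\u1E25": "H",
}.items()}


def kh_encode(segments):
    """Encode a list of Romanized sound segments as a Kyoto-Harvard string."""
    return "".join(KH.get(_nfc(segment), segment) for segment in segments)


print(kh_encode(["k", "r\u0325", "\u1E63", "\u1E47", "a"]))
# The diphthong and the vowel sequence receive the same encoding:
print(kh_encode(["au"]) == kh_encode(["a", "u"]))
```

Since the two distinct segmentations produce the same string, the encoding alone cannot tell a decoder which one was intended.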

2.13 Varṇamālā
Joshi, Dharmadhikari & Bedekar (2007) have proposed a scheme for
Sanskrit text encoding which they term varṇamālā ‘garland of speech
sounds’. Whereas ISCII and Unicode take as their starting point for
the encoding of Indian-language texts the orthographic syllable (akṣara),
Joshi et al. propose a phonemic approach in which the fundamental unit
is the individual speech sound (varṇa). The proposed varṇamālā in-
cludes the fourteen vowels of Sanskrit; six additional vowels (short e,
candra e, long candra e, short o, candra o, long candra o); anusvāra,
nasalization (candrabindu), and visarga; and thirty-four consonants (in-
cluding the retroflex lateral flap ḷ).
The varṇamālā scheme has been implemented in the context of the
IndiX project developed by C-DAC Mumbai. IndiX is a set of libraries
and applications based on the GNU/Linux operating system that provide
support for Indic scripts.24

24 <http://www.cdacmumbai.in/projects/indix/>.

The varṇamālā scheme is indeed based on phonetic principles, many
of which are in accord with principles that we develop below. The status
of this encoding, however, remains unclear. Joshi et al. (2007) do not
assign codepoints or provide an ordering of the sounds in the repertoire.
Earlier work by Joshi (2006) presents the varṇamālā as a “Vedic San-
skrit Coding Scheme”. Codes are envisioned as being assigned in the
(currently unused) Unicode block beginning U+0800. In Joshi’s draft,
the basic Sanskrit sounds, together with numerals, some special sym-
bols (such as the danda), and a few control characters are allocated to
U+0800–U+087F. In U+0880–U+08FF are signs used in various Ve-
dic manuscript traditions, including diacritics that indicate accents. Here
the consonants and vowels of Sanskrit are treated phonetically (although
not all the sounds Joshi includes have phonemic status in Sanskrit), but
the remainder of the coded items are not phonetic but rather visual (or
script-based)! Marks for accents could be interpreted phonetically, al-
though they are presented merely as uninterpreted symbols; but the sva-
stika (U+08E6) represents nothing phonetic, and numerals (U+0800–
U+0809) are properly non-glottographic (Hyman, 2006). This scheme
is unsuitable for encoding in Unicode, since it is phonetically organized
and duplicates material already encoded. At the same time, it cannot
properly be called a sound-based encoding, since it includes a substan-
tial number of characters that do not represent sounds.
Joshi et al. (2007) present a number of specious arguments in support of
the varṇamālā. They assert, “Through the Varnamala ap-
proach the IPA equivalence for Sanskrit text (as well as other Indian lan-
guage text) can be established as one to one correspondence”. Yet many
sounds will have to be represented with digraphs in IPA. They assert,
“Through the Varnamala-Phonemic approach lexical order and sorting
operations in the areas of dictionary etc. can be done in the logical and
more efficient way”. But collation is fundamentally independent of en-
coding (Wissink, 2001). Collating order varies for different languages
written in the same script. And sometimes multiple collating orders
are used even within a single language. Thus in the case of Sanskrit,
anusvāra and visarga collate between the vowels and the consonants in
dictionaries such as Monier Williams’, while in Bloomfield’s Vedic Con-
cordance, anusvāra collates after visarga, jihvāmūlı̄ya, and upadhmānı̄-
ya. In addition the authors assert, “Under the phonemic scheme the key-
board in put [sic] procedure will be simplified by reducing keys for vowel
matras”. Yet input methods are independent of underlying encodings; an
input method in which independent vowels and vowel mātras are entered
in the same way could equally be used with the existing Unicode Deva-
nāgarī encoding. As we shall see, there are more reliable justifications
for a sound-based encoding than these.
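
The independence of collation from encoding can be illustrated with a short sketch. Everything here is a toy assumption: the symbols M and H stand in for anusvāra and visarga, the two orderings are invented miniature alphabets that differ only in the relative position of those two symbols, and `collation_key` and `collate` are hypothetical helper names.

```python
def collation_key(order):
    """Build a sort key from an explicit ordering of symbols."""
    rank = {symbol: position for position, symbol in enumerate(order)}
    return lambda word: [rank[symbol] for symbol in word]


def collate(words, order):
    """Sort words according to the given symbol ordering."""
    return sorted(words, key=collation_key(order))


# Two toy orderings over the same encoded symbols.
ANUSVARA_FIRST = "aiuMHkt"
VISARGA_FIRST = "aiuHMkt"
WORDS = ["ak", "aH", "aM"]

print(collate(WORDS, ANUSVARA_FIRST))  # ['aM', 'aH', 'ak']
print(collate(WORDS, VISARGA_FIRST))   # ['aH', 'aM', 'ak']
```

The stored strings are identical in both cases; only the ordering table passed to the sort differs, so either convention can be applied to text in any encoding.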


Chapter 3

Critique of encoding systems seen so far

Most of the encoding systems surveyed above are based primarily either
upon Devanāgarī script or upon the standard Romanization of Sanskrit.
The difficulties with these systems are due in part to problems in the
modes of graphic representation of Sanskrit sounds adopted in Devanāgarī
and the standard Romanization themselves. Current encoding persists
in being script-based; it allows display conventions to govern uses of
encoding that transcend appearance. While free-hand drawing and type-
face, upon which contemporary encoding systems are based, historically
served only display purposes, contemporary character encoding serves
linguistic and archiving purposes that transcend mere display. Hence,
while it is understandable that initially character encoding was motivated
by display issues in imitation of typeface or manuscript hand, recent ex-
igencies require an explicit system for encoding complete linguistic in-
formation. It is therefore timely to consider the principles governing the
design of character-encoding systems.
The difficulties with the Devanāgarī standards and the Roman standards
surveyed above become evident by observing the discrepancies between
the encoding of Sanskrit embodied in the Devanāgarī script and in
standard Romanization. Consider especially the following three points:


1. In the Devanāgarī standards, there are separate characters for vowels
when they appear post-consonantally versus when they appear
phrase-initially or post-vocalically. In the Roman standards, a single
character is used in all contexts.
2. In the Devanāgarī standards, post-consonantal /a/ is implicitly indicated
by the graph of the preceding consonant, while its absence is
explicitly represented by a sign indicating the cessation of speech
(virāma). In the Roman standards, the distribution of ⟨a⟩ corresponds
exactly to the distribution of the vowel /a/.
3. In the Roman standards, certain single sounds are represented by
digraphs: the aspirate stops (kh, gh, ch, jh, ṭh, ḍh, th, dh, ph, bh)
and the open diphthongs (ai, au). In the Devanāgarī standards,
single characters represent each of these segments.
The common feature of these discrepancies is a departure from the principle
of representing a single Sanskrit sound by a single character. Both
the Devanāgarī and the Roman standards concur in departing from this
principle in one additional case:
4. In both the Devanāgarī and Roman standards the aspirate retroflex
lateral flap / h / is represented by a digraph: ळ्ह, ḷh.
5. An additional discrepancy exists between the encoding of accent
in Devanāgarī script and the encoding in standard Romanization.
The Romanization encodes lexical or post-prosodic high pitch and
independent circumflex, or deep accent. Devanāgarī encodes manifest
pitch or surface accent. The failure of scholars to recognize
the difference has led to confused explanations of Devanāgarī accentual
systems and the obfuscation of genuinely different recitational
traditions and dialects.
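Points (1) and (2) can be made concrete with a small sketch that renders a sequence of sound segments in Unicode Devanāgarī. It assumes a deliberately reduced inventory; the names `to_devanagari`, `INDEPENDENT`, `SIGN`, and `CONSONANT` are hypothetical and chosen only for illustration. It shows how one vowel receives two different characters depending on context, how /a/ after a consonant receives none, and how the absence of a vowel receives a character of its own.

```python
# Reduced inventory (an assumption for illustration, not a full mapping).
INDEPENDENT = {"a": "अ", "ā": "आ", "i": "इ", "ī": "ई",
               "u": "उ", "ū": "ऊ", "e": "ए", "o": "ओ"}
# Dependent vowel signs; post-consonantal /a/ has no sign at all.
SIGN = {"a": "", "ā": "\u093E", "i": "\u093F", "ī": "\u0940",
        "u": "\u0941", "ū": "\u0942", "e": "\u0947", "o": "\u094B"}
CONSONANT = {"k": "क", "g": "ग", "t": "त", "d": "द", "n": "न",
             "m": "म", "r": "र", "v": "व", "s": "स"}
VIRAMA = "\u094D"  # marks the ABSENCE of a vowel after a consonant


def to_devanagari(segments):
    """Render a list of sound segments as a Devanāgarī string."""
    out = []
    after_consonant = False
    for seg in segments:
        if seg in CONSONANT:
            if after_consonant:
                out.append(VIRAMA)  # previous consonant had no vowel
            out.append(CONSONANT[seg])
            after_consonant = True
        elif seg in INDEPENDENT:
            # Same sound, different character depending on context.
            out.append(SIGN[seg] if after_consonant else INDEPENDENT[seg])
            after_consonant = False
        else:
            raise ValueError(f"segment not in the toy inventory: {seg!r}")
    if after_consonant:
        out.append(VIRAMA)  # word-final consonant
    return "".join(out)


agni = to_devanagari(["a", "g", "n", "i"])
man = to_devanagari(["m", "a", "n"])
print(agni, len(agni))
print(man, len(man))
```

In this sketch the four segments a, g, n, i yield five code points, while the three segments m, a, n yield three code points of which none stands for the vowel and one stands for its absence.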

3.1 Ambiguity and redundancy

The deficiencies that current encoding systems inherit from the Devanāgarī
and Roman orthographies raise questions regarding general principles.
In particular, we will consider the principles of avoiding ambiguity
and redundancy. To avoid ambiguity and redundancy requires that an
encoding system be characterized by a one-to-one correspondence between
characters and items to be encoded,1 and that all encoded items be
of the same kind (e. g., phonemes or written characters). In items (1), (3),
and (4), above, a single sound is represented by more than one character,
and in (2), a sound is inversely represented: that is, the presence of the
sound is represented by the absence of a character, and the absence of the
sound by the presence of a character. The departure from the principle of
a one-to-one correspondence between what is to be represented and the
representation signals confusion concerning the principles of encoding.
Although the adoption of digraphs to transcribe aspirate stops and
the aspirate retroflex lateral flap / h / in the Roman transcription of Sanskrit
departs from a one-to-one correspondence between what is to be
represented and the representation, the character ⟨h⟩ was chosen because
it represents aspiration, which is the common feature of all the aspirate
stops and also of the voiced fricative /ɦ/. Similarly, although ai and au
are digraphs representing single diphthongs, the individual components
of the digraphs were chosen as representations of the subsegments of
those diphthongs. Insofar as the individual characters in these digraphs
represent individual features and subsegments in the sounds they repre-
sent, the Roman transcription of Sanskrit does observe a one-to-one cor-
respondence. Yet it still garners the fault of inconsistency in the princi-
ples of representation: some characters represent sound segments, while
others represent features; and others, subsegments.
It is not absolutely necessary that an encoding scheme adhere to the
principle of one-to-one correspondence and a consistent basis for its en-
coding. Yet, if it does not, it runs the risk of ambiguity, which is a fault
in itself. Freedom from ambiguity is the minimal requirement for the
adequacy of an encoding scheme.
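The risk of ambiguity can be checked mechanically. The following sketch, under the assumption of a small hypothetical segment inventory (`SEGMENTS`) containing both single letters and digraphs, enumerates every way a Romanized string can be divided into segments. A scheme is free from ambiguity in this sense only if every string has exactly one division. The test strings are illustrative only.

```python
# Hypothetical inventory: single letters plus a few digraphs.
SEGMENTS = {"a", "i", "u", "k", "g", "p", "r", "h",
            "kh", "gh", "ai", "au"}


def segmentations(text):
    """Return every division of text into items of SEGMENTS."""
    if not text:
        return [[]]
    results = []
    for size in (1, 2):
        head = text[:size]
        if len(head) == size and head in SEGMENTS:
            for rest in segmentations(text[size:]):
                results.append([head] + rest)
    return results


for word in ("pura", "prauga"):
    parses = segmentations(word)
    print(word, len(parses), parses)
```

A string containing the letter sequence a + u admits two readings here (two vowels, or one diphthong), whereas a string without any such sequence admits one.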
The standard Roman encoding is encumbered with the fault of ambi-
guity in either case, whether it adheres to a consistent basis of encoding
sound segments while it departs from the principle of one-to-one repre-
sentation, or else conforms to the principle of one-to-one representation
while it adopts an inconsistent basis of encoding. If it consistently
represents sound segments, it uses the characters ⟨h⟩, ⟨a⟩, ⟨i⟩, and ⟨u⟩ in
ambiguous ways. Each serves the dual functions of (1) representing a
segment by itself as well as (2) constituting a member of one or more

1 Compare Whitney (1861, 301): “each single sign was originally meant to have a single
sound, and each single sound a separate and invariable sign”.


into a small number of graphemes organized around a system of distinctive
features (Olivier 1974; Krampen 1986), and psychological research
confirms that in reading the grapheme has a salient status independent of
phonetics. This status can be confirmed by disorders such as pure global
alexia, in which patients cannot read written text, although they may be
capable of writing with considerable fluency; at the same time, the patient
may have no difficulty copying or naming non-grapheme shapes (Cohen
& Dehaene 2004, 471–473; Caramazza 2000, 204).3 Moreover, research
suggests that the consonant/vowel distinction in orthography “reflects a
psychological reality” that is not entirely parasitic on the same distinc-
tion at the phonological level (Cubelli, 1991, 260). Another confirm-
ing phenomenon is grapheme-color synesthesia, in which an involuntary
color percept accompanies the visual presentation of a grapheme; this is
the most commonly presented form of synesthesia (Esterman, Verstynen,
Ivry & Robertson 2006; Simner, Ward, Lanz, Hansari, Noonan, Glover &
Oakley 2005; Ward, Simner & Auyeung 2005; Smilek, Dixon & Merikle
2005; Rich & Mattingley 2005; Wollen & Ruggiero 1983). That the con-
comitant color is genuinely perceived is demonstrated by a number of
experiments (Ramachandran, Hubbard & Butcher, 2004, 869–870).4
3 A complementary variety of agraphia occurs, in which a subject is incapable of
following the phonetic–graphic route in writing (and thus is entirely incapable of writing
nonsense words) but has well-preserved ability to write known words via the whole-word
route (Shallice, 1981).
4 There are two kinds of grapheme-color synesthetes: for projectors, the color percept
is bound to the visually-presented grapheme, whereas for associators the color percept
is “seen” before the “mind’s eye” (Smilek et al., 2005). The most compelling current
explanation of grapheme-color synesthesia is based on proximity of the V4 or V8 areas
implicated in color vision to the so-called “visual number grapheme” area. These areas are
all located within the fusiform gyrus. Additional connections between these areas could
explain the synesthetic percepts (Ramachandran & Hubbard 2001; Ramachandran et al.
2004). Subsequent research has identified the “visual number grapheme” area as belonging
to the visual word form area (VWFA), with the approximate location (−43, −54, −12)
in Talairach space (Cohen & Dehaene, 2004). Although it is unlikely that the VWFA is
entirely devoted to reading, it is hypothesized that the VWFA contains detectors tuned to
recognize graphemes, as opposed to pseudo-graphemes. It is further hypothesized that
“neurons in the fusiform region are tuned to progressively larger and more invariant units
of words, from visual features in extrastriate cortex to broader units such as graphemes,
syllables, morphemes, or even entire words as one moves anteriorily [sic: anteriorly] in the
fusiform gyrus” (Cohen & Dehaene, 2004, 471).

To the extent that current encoding systems are based primarily on the
underlying script, their capacity to represent knowledge can be no better
than the orthography associated with that script. The complexity of
the mapping between the orthographic and the phonetic levels is known
as orthographic depth and can be precisely quantified (Frost 1992; van
den Bosch, Content, Daelemans & de Gelder 1994; Treiman 2006, 595).
To take two contrasting cases, Finnish orthography is very shallow (or
transparent), while English is quite deep (Lyytinen, Aro, Holopainen,
Leiwo, Lyytinen & Tolvanen, 2006, 40). Since orthographies are never
entirely shallow or transparent (Weir, 1967), character encoding by its
very nature represents knowledge that has already passed through sev-
eral stages at which information loss is possible. The goal of encoding
should be to minimize the loss of information. Since degradation can
occur at each stage of expression and transition, one ought to capture the
informational content at the earliest stage possible. Given that script is
inherently a secondary phenomenon vis-à-vis spoken language, encoding
should be based directly on spoken language.
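As a rough illustration of how the complexity of a grapheme-to-sound mapping might be given a number, the sketch below computes the conditional entropy of the sound given the grapheme over a list of aligned pairs. This is only one simple proxy, chosen here as an assumption; it is not claimed to be the measure used in the studies cited above, and the two data sets are invented toy examples.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(pairs):
    """H(sound | grapheme) in bits over aligned (grapheme, sound) pairs."""
    by_grapheme = defaultdict(Counter)
    for grapheme, sound in pairs:
        by_grapheme[grapheme][sound] += 1
    total = len(pairs)
    entropy = 0.0
    for counts in by_grapheme.values():
        n = sum(counts.values())
        for c in counts.values():
            prob = c / n
            # Weight each grapheme's entropy by how often it occurs.
            entropy -= (n / total) * prob * math.log2(prob)
    return entropy


# Invented toy data: every grapheme always maps to one sound.
shallow = [("a", "a"), ("k", "k"), ("a", "a"), ("t", "t")]
# Invented toy data: every grapheme maps to two sounds equally often.
deep = [("c", "k"), ("c", "s"), ("a", "æ"), ("a", "ə")]

print(conditional_entropy(shallow))
print(conditional_entropy(deep))
```

On this proxy a perfectly transparent mapping scores zero, and the score rises as graphemes become less predictive of the sounds they stand for.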
As noted above (§1.3), Devanāgarī script itself was not specifically
designed to represent Sanskrit phonology but rather was adapted to this
use subsequently. Devanāgarī derives from Brāhmī script, which was in
turn influenced by Kharoṣṭhī, which was itself adapted from Aramaic.
Brāhmī was placed in service in India originally to represent the phonology
of Prākrit, rather than Sanskrit; the former lacks a number of the
latter’s phonemes, including vocalic r̥, r̥̄, and l̥, and the open diphthongs
ai and au (Oberlies, 2003, 168). Moreover, some phonological features
of Sanskrit for which Devanāgarī incorporates an encoding mechanism,
such as the glottal stop, are not explicitly recognized in the phonologies
of Indian linguists. Since Devanāgarī was never systematically designed
to represent the phonological systems of Indian linguists in the first place,
it would be surprising indeed if it should serve as a more appropriate basis
for encoding Sanskrit than Sanskrit phonology. In fact, very few of the
world’s writing systems were designed for the languages that they repre-
sent in extant texts and manuscripts. Borrowing is the norm in the history
of writing, and adaptations almost always fail to capture the structure of
the spoken language adequately.
Therefore, where one has access to the phonology of the language,
where the orthography is fairly shallow, and where the orthography de-
parts from an ideal coding of spoken language structure, the basis for text
encoding should be phonetic rather than graphic. Sanskrit meets these


Speech may be analyzed into acoustic parameters (frequency, phase,
and amplitude of waveforms)11 as well as articulatory parameters (manipulation
of the vocal tract, including larynx, tongue, lips, etc.). Feature
analysis seeks to characterize perceptible features by associating them
with regular patterns of concurrent acoustic and articulatory parameters
(Laver, 1994, 101–110).
Nor is writing one-dimensional. Just as speech is analyzable into
phonetic features, so writing may be analyzed into graphic features. Ana-
logous to articulatory and acoustic features in phonetics are stroke analy-
sis and block adjacency graph (BAG) analysis in optical character recog-
nition (OCR) (Sonka, Hlavac & Boyle, 1999; Kompalli, 2007). Just as
articulatory features are correlated with the production of speech sounds,
stroke sequence is correlated with the production of written characters,
and just as acoustic features are correlated with auditory parameters of
speech sounds, BAG analysis is correlated with the shape of the complete
character.12 Marked alterations in phonetic and graphic features occur at
the boundaries between phonetic and graphic segments.
The analysis of graphic features is more obviously applicable to some
writing systems than to others; it is of particular interest where graphic
features are correlated with phonetic features. Perhaps the most obvi-
ous application is to Korean han’gŭl, “in which graphic shapes are de-
every phoneme sequence in normal speech” (Goodglass, 1993, 62). Cf. Oudeyer 2006, 24.
Some researchers have moved in the direction of developing a nonsegmental phonology
(Griffen, 1976).
11 Sine waves (or sinusoids) may be uniquely characterized in terms of these parameters.

encode directly the sounds of the spoken language, rather than the char-
acters that symbolize them. In making such decisions, one must consider
whether the cultural heritage is received primarily in written or in oral
form and, if written, how closely the written form represents the phonol-
ogy of the language.
For English, the Roman script, rather than the oral language, is the
predominant vehicle of the received cultural heritage. Scholarship is
primarily written. Although regional pronunciation varies, spelling is
highly standardized and needs to be taught even to native speakers up
through secondary school. The Roman script, designed to model the
Latin sound system, was never systematically remodeled to accord with
English phonology.19 Moreover, the phonology of English has changed
significantly since the adoption of the Roman alphabet, widening further
the gap between script and sound. Character encoding evolved first to
capture the system of contemporary written English (Birnbaum, 1989),
and ASCII (as well as supersets such as ISO 8859-1²⁰ and Unicode) provides
a reasonable basis for archiving and processing English language
text. A phonetic encoding of English would not be desirable for many ap-
plications, since it would necessarily impose arbitrary dialectal features
on written texts. Furthermore, writers and readers of English are used to
an orthography that often privileges morphological representation over
phonological representation (consider for instance the different vowels
in potent [ˈpowtn̩t] and impotent [ˈɪmpətn̩t]) (Weir 1967; Klima 1972;
French 1976, 124; Sampson 1985, 204–205; Tolchinsky 2003, 92, 193–194;
Snowling 2005; Lyytinen et al. 2006, 49). English spelling also
possesses a lexical-semantic aspect, as shown by such homophonous but
heterographic and heterosemantic sets as {knew, new, gnu} (Weir 1967,
19 The earliest inscriptions in Old English are in the Runic futhorc alphabet, which derives
from the Germanic futhark and is first attested (in the Caistor-by-Norwich runes) for
the fourth or early fifth century (Page, 1999, 21). The adoption of the Roman alphabet
was a response to the spread of Christianity. Originally, several added letters represented
phonemes specific to English: ⟨æ, ð, þ, ƿ⟩ (the latter two directly borrowed from futhorc: þ
= þorn ‘thorn’; ƿ = ƿynn ‘joy’) (Page, 1999, 186, 212–213). With the rise of printing in the
15th century, the added characters fell into disuse, since they did not exist in the fonts of
continental printers (McArthur, 1992, 31–32). Once again we see the limitations imposed
by a shift in technology.
20 See Gaylord 1995.


Chapter 5

Sanskrit phonology

Sanskrit phonology has been a topic of investigation since phoneticians
analyzed interword sound alterations in Vedic hymns at the beginning
of the first millennium B.C.E.1 During the 6th through 4th centuries
B.C.E., around the time that Pāṇini composed his grammar of Sanskrit,
phoneticians systematically analyzed the phonetic features of sounds and
categorized sounds according to these features in treatises termed Prātiśākhya
that were proper to particular Vedic schools (Staal, 1972, xxiv).2
Subsequent treatises called Śikṣā continued the tradition of phonological
analysis. The phonetic and phonological analyses in these texts differ
from each other and from that assumed for the operation of Pāṇinian
grammatical rules. Modern historical and comparative linguists analyze
the sound structure of various Sanskrit dialects at various historical
periods; in so doing they rely on the data of Indian predecessors and
adopt or adapt many of their analytic principles. Relevant also are the
independently-motivated featural analyses proposed by modern phonologists.
While it is neither practical nor desirable for us to present all the
details of such analyses, we aim to survey here those aspects of phonemic
and featural analyses of Sanskrit that are relevant to its encoding.
We summarize the system of phonological features of Sanskrit in TABLE 1
and show the classification of phonetic segments of Sanskrit in accordance
with these features in TABLE 5. The system presented in these
tables is based upon our own analyses of the Indian phonetic treatises
and on recent contributions to Sanskrit phonetics. Significant differences
and variations both in the system of features and in the classification of
segments will be discussed in due course.

1 By phonology we mean the study of the sound system of a language, including the
relationship of sounds to one another and the patterned alternation of sounds. Phonetics
denotes a broader science that may also describe paralinguistic, extralinguistic, and
non-systematic aspects of a spoken language.
2 Dating is a matter of some controversy. Scharfe (1977, 129–30) dates the
Vājasaneyiprātiśākhya to 250 B.C.E.

5.1 Description of Sanskrit sounds

Phonetic segments are categorized in TABLE 5 in rows by their place of
articulation within the mouth and in columns by their manner features:
stricture, voicing, aspiration, nasalization, and duration.3 Indian phoneti-
cians categorize the duration of segments by recourse to the measure of
the short vowel. A short vowel measures one mora;4 long vowels, two
morae; prolonged vowels (not shown), three morae; consonants, half a
mora.5 In terms of pitch, Indian phoneticians categorize vowels as high-
pitched, low-pitched, circumflexed, or monotone. A circumflexed vowel
is described as dropping from high to low, and a series of syllables is
monotone if devoid of relative distinction in pitch.
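The moraic measures just listed lend themselves to a short worked example. The sketch below, in which the names `MORAE` and `duration_in_morae` are hypothetical, sums the durations of a hand-classified sequence of segments using the values stated above: one mora for a short vowel, two for a long vowel, three for a prolonged vowel, and half a mora for a consonant. The classification of each segment is supplied by hand and is an assumption of the example.

```python
from fractions import Fraction

# Durations in morae, as given in the text.
MORAE = {
    "short": Fraction(1),
    "long": Fraction(2),
    "prolonged": Fraction(3),
    "consonant": Fraction(1, 2),
}


def duration_in_morae(segments):
    """Sum the moraic durations of (segment, kind) pairs."""
    return sum((MORAE[kind] for _, kind in segments), Fraction(0))


# r + ā + m + a, classified by hand for this example.
word = [("r", "consonant"), ("ā", "long"), ("m", "consonant"), ("a", "short")]
print(duration_in_morae(word))  # 1/2 + 2 + 1/2 + 1 = 4
```

Exact fractions are used so that half-mora consonants add up without rounding error.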
The vowels represented in Devanāgarī by ए and ओ, although typically
categorized as diphthongs, are phonetically monophthongal mid
vowels and hence Romanized e and o. The true diphthongs (written ऐ
and औ) have two places of articulation, one each from the subsegments
of which they are composed: ai, composed of subsegments a and
i, is glottal-palatal; whereas au, composed of subsegments a and u, is
glottal-labial. In the table they are placed in the row that corresponds to
their second property. The vowels and semivowels other than r (i. e. y,
l, and v) include nasalized variants (not shown) as well as the clear
(unnasalized) varieties. व, conventionally Romanized as v, was originally a
labiovelar approximant [w]; in some dialects it is described by ancient
phoneticians as a labiodental [ʋ] (Pāṇinīyaśikṣā 18).

3 Allen (1953, 20) differs in leaving out l̥̄ as well as the more open and most open
manners of articulation, and in not categorizing anusvāra and h as semivowels.
4 The mora is a unit of relative duration that holds constant over differing rates of speech.
5 For comparison, in English spoken in a connected style and at an ordinary rate, the
median absolute duration of a stressed vowel is 130 msec; that of a consonant or unstressed
vowel is about 70 msec (Klatt, 1976).
Indian phoneticians describe a number of other phonetic segments
not shown in TABLE 5. Nasals called yama occur as a transition between
an oral stop and a subsequent nasal stop. Four yamas characterized
by the voicing and aspiration of the preceding stop are Romanized
k̃ kh̃ g̃ gh̃ and are designated variously in Indian phonetic treatises as
कुँ खुँ गुँ घुँ 6 or कं खं गं घं.7 Another nasal segment called nāsikya (h̃) occurs
as a transition between h /ɦ/ and a subsequent nasal stop ṇ, n, or
m.8 Unreleased stops occur before stops, and reduced semivowels corresponding
to y, l, and v occur word-finally; both are termed abhinidhāna
(Varma 1929, 137–147; Allen 1953, 71–73). Firmer approximants y
and v occur word-initially, and lighter approximants y and v occur
word-finally in several dialects (Varma 1929, 126–132; Allen 1953, 68–69;
A. 8.3.18). Short simple vowels ĕ and ŏ occur in Vedic recitation and
in phonetic treatises.9 The Keśavīśikṣā and Pratijñāsūtra notice slightly
lengthened short vowels in the Vājasaneyisaṁhitā. The former states
that short vowels are slightly long (kiṁcit dīrgham) except when followed
by a syllable containing a long ā preceded by a consonant, or a
vowel preceded by a consonant and followed by a visarga. The latter
states that slight length (īṣaddīrghatā) occurs in a word-initial syllable
containing the vowel a preceded by a consonant (Varma, 1929, 179).10
Vowel segments (a i u r̥ l̥ e) called svarabhakti break up certain consonant
clusters (Schmidt, 1875, 1–8). In particular, a svarabhakti appears in
clusters consisting of r plus a fricative, and in broken clusters consisting
of a voiced abhinidhāna plus a stop or fricative (Schmidt 1875, 1–8;
Varma 1929, 133–136; Allen 1953, 73–75; R̥kprātiśākhya 6.46–53,
14.58). Caturādhyāyikā 1.4.10–11 distinguishes two lengths of svarabhakti.
Vedic phonetic treatises also describe (1) longer and shorter
lengths of anusvāra, which regularly occur after short and long vowels
respectively (Vājasaneyiprātiśākhya 4.148–149; R̥kprātiśākhya 13.32–33);
(2) realizations of anusvāra as velar nasalized stops before r and
fricatives (gũ and, before unvoiced fricatives, ṅk) (Cardona, 2003, 110);
and (3) extra high or extra low pitches and special varieties of circumflex
accent determined by sandhi and phonotactics (R̥kprātiśākhya 3.4;
Vājasaneyiprātiśākhya 4.136, 138; A. 1.2.40). Patañjali asserts that there
are prolonged vowels measuring four morae (MBhK. III 421.13–14).
Certain Śikṣā texts distinguish in addition to short and long anusvāra (1) a
two-mora (dvimātra) anusvāra before consonant + r̥ (Yājñavalkyaśikṣā
139; Pārāśarīśikṣā 31) or (2) a heavy (guru) anusvāra before a consonant
cluster (Laghumādhyandinīyaśikṣā 14–15; Keśavīśikṣā 5). Some Śikṣās
describe nasalized vowels prolonged by up to six morae (raṅga)
(Mallaśarmakr̥taśikṣā 43–46). Vocalic and consonantal subsegments comprise
the vowels r̥ and l̥ (Allen, 1953, 61–62). Subsegments of diphthongs
are of similar quality to independent vowels. Unaspirated and aspirated
retroflex lateral flaps / / and / h /, written ळ, ḷ and ळ्ह, ḷh, occur intervocalically
in R̥gvedic (as well as in the Nirukta) in place of ḍ and ḍh (Allen,
1953, 73).11

6 VPr. 8.31 (Rastogi, 1967, 89).
7 The Caturādhyāyikābhāṣya on CA. 1.1.26 (Deshpande, 1997b, 139).
8 See Allen (1953, 75–78), Mishra (1972, 87–88), van Nooten (1973, 412), Cardona
(1977), Cardona (1980, 253 n. 14).
9 chandogānāṁ sātyamugrirāṇāyanīyā ardham ekāram ardham okāraṁ cādhīyate, etc.
MBhK., I 22.21–24. See Cardona 1987, 28–30; Cardona 1983.
10 The statement of the Pāriśikṣāṭīkā Yājuṣabhūṣaṇa that one should pronounce a short
vowel like a long one in an aggravated svarita seems to lengthen a short vowel to a long one
rather than account for a length between that of a short vowel and a long one. Likewise, it is
not clear that shortened long vowels termed kṣipra ‘quick’ are any different in length from
short vowels. The only evidence Varma (1929, 178) cites for them describes their length as
that of a short vowel, and he himself notes that their length “may be confused with that of
a short vowel”.
11 After consultation with the colleagues mentioned in parentheses below, it remains unclear
whether the Vedic ळ, ḷ and ळ्ह, ḷh were flaps, taps, or approximants. In Modern Indic
(Gujarati, Marathi, Oriya, and the four Dravidian languages), ळ, ḷ is a retroflex lateral approximant,
not a flap (Aklujkar, Cardona, Deshpande, Bhaskararao), and it is reasonable to
assume that retroflex lateral approximants developed from the intervocalic voiced retroflex
stops ड, ḍ and ढ, ḍh (Cardona). In Tamil the retroflex lateral approximant ள ḷ is not exclusively
intervocalic but occurs in clusters, including geminates (Steever) and contrasts with
a central retroflex approximant with lateral contact between the sides of the mid-tongue
and the palate ழ ḻ, as well as with a non-lateral post-alveolar ற ṟ (which may be in the
process of merging with alveolar ர r) (Keane, 2004, 113) (with thanks also to Chevillard).
Likewise, the Vedic retroflex laterals are distinguished from the modern Hindi retroflex
flaps ड़ and ढ़. The development of weaker allophones in intervocalic position in Vedic is
paralleled in Middle Indo-Aryan: nn > ṇ, and ll > ḷ (Hock).


5.2 Phonetic and phonological differences


TABLE 5.1: The systems of accentuation of the R̥kprātiśākhya versus
Vājasaneyiprātiśākhya

tone        R̥kprātiśākhya                     Vājasaneyiprātiśākhya
extra high  beginning of svarita              -
high        udātta, pracaya, end of svarita   udātta, beginning of svarita
low         anudātta, end of svarita          anudātta, pracaya, end of svarita
            before udātta or svarita
extra low   -                                 anudātta and end of svarita
                                              before udātta or svarita

to the syllabified visarga.20 Differences in the designation of the place of
articulation of h and visarga are therefore probably due to phonological
considerations.

5.2.3 Differences in phonological classification of segments
It is not necessarily the case that different classifications reflect differ-
ences in phonetics. Phonologists make different decisions concerning
how to classify complex phonetic data as they balance fidelity to pho-
netic detail against elegance in the phonological system.21 Hence it is
probably due to the consideration of secondary articulations that some
treatises place the vowels r̥ and l̥ at the base of the tongue.22 A similar
consideration accounts for the disagreement over whether the place of
articulation of h and ḥ is that of a neighboring vowel, the glottis, or the
chest. Those who consider the place of articulation as that of the neigh-
boring vowel regard spread buccal place features as more primary than
extrabuccal place features; those who consider the place of articulation
as the glottis regard glottal stricture as primary; and those who consider
the place of articulation as the chest regard the regulation of pulmonic
airflow as primary. Similarly, although nasalization might be regarded as
a resonance feature, a number of treatises make the nose a second place
of articulation for nasal vowels, semivowels, and stops (Allen 1953, 39;
Bare 1976, 75).
In several other cases there is reason to believe that ostensibly phonetic
descriptions are colored by phonological considerations. Some Indian
treatises classify e and o as monophthongs with single places of
articulation (as we do) (R̥kprātiśākhya 13.40; Shastri 1937, 98); others
classify them as diphthongs with dual places of articulation (Deshpande,
1997a, 76). They are phonetically realized as monophthongs; but historically
and underlyingly, in terms of phonology, they are diphthongs (Allen
1953, 62–4; Cardona 1983, 13–32). Similar is the case of v, which some
classify as labiodental; others as purely labial (as we have) (Deshpande
1997a, 76; Allen 1953, 57). Pāṇinian prosodic rules operate as though
v were a labial semivowel, even though commentators recognize that it
is realized as a labiodental fricative. For like reasons, while Pāṇinians
recognize the phonetic occurrence of diphthongs measuring three or four
morae, they classify them all as prolonged (i. e. trimoraic) in order to
preserve a strict tripartite division of vocalic length.23
While R̥Pr. 6.29 describes yamas as non-nasal stops that have developed
a nasal offset before a nasal,24 the TPr. 21.12, APr. 1.99, and CA.
1.4.8 describe them as epenthetic nasals inserted between a non-nasal
stop and a following nasal. Uvaṭa, in his comment on R̥Pr. 6.29 (Shastri,
1931, 206), VPr. 8.31 (Rastogi, 1967, 89), the Tribhāṣyaratna, in
its initial enumeration of sounds (Whitney, 1862, 10) and its comment
on TPr. 21.12 (Whitney, 1862, 389), and the Caturādhyāyikābhāṣya
on APr. 1.1.14–15 (Deshpande, 1997b, 117–119) and 26 (Deshpande,
1997b, 139) all count four yamas. Yet Whitney (1862, 393–395) and
Deshpande (1997b, 251–254) are of the opinion that the CA. held there
to be twenty yamas. Whitney and Deshpande’s insistence that there were
twenty must be accepted as a phonetic evaluation on the grounds that
the yama inherits properties of the preceding sounds, of which there are
twenty, in addition to the nasality of the following sound. Conversely,
the ancient texts enumerated four yamas on the grounds of phonological
abstraction based upon the features of voicing and aspiration of the
preceding sound. The R̥Pr. and VPr. 1.103 syllabify yamas with the
preceding vowel while the TPr. 21.8 syllabifies them with the following.
Varma (1929, 79–80) attributes different reflexes in different dialects to
dialectal differences in the syllabification of yamas described by the two
Prātiśākhyas.25

20 Likewise the pitch feature spreads to syllabified anusvāra in the White Yajurvedic gũ
pronunciation, as demonstrated by a horizontal line beneath the sign for gũ after
extra-low-pitched vowels.
21 Cardona (1983) considers the interplay of phonetics and phonology in Indian treatises.
22 R̥kprātiśākhya 1.41; Allen 1953, 55; Mishra 1972, 80; Varma 1929, 7.
23 Nāgeśa writes that the term trimātra is indicatory (upalakṣaṇa) of anything longer
than two morae (ūkāla eveti. tatra trimātragrahaṇam ekadvimātrabhinnopalakṣaṇam iti
bhāvaḥ. MBh. Uddyota on Patañjali’s comment iṣyate eva caturmātraḥ plutaḥ under
A. 8.2.106. MBhK. III 421.14, Rohatak ed. V.427, Guru Prasad Shastri, vol. VIII, p. 149).
24 Whitney (1862, 393–394) interprets the passage as doubling and therefore as epenthesis
in the manner of the other Prātiśākhyas.
25 See the additional note of Shastri (1937, 192) on R̥Pr. 6.29.


5.2.4 Differences in the system of feature classification

Apart from differences concerning the classification of specific segments,
ancient authorities differ over the system of feature classification.26
Phonetic treatises vary in the number of places of articulation enumerated,
generally distinguishing the place of articulation of velar stops and
jihvāmūlīya from that of a, h, and ḥ. They place the jihvāmūlīya at the base of
the tongue (jihvāmūla) and the velar stops either there or at the base of
the jaw (hanumūla); a, h, and ḥ they place in the throat (kaṇṭha) (Allen
1953, 51–2; Deshpande 1997a, 76; Bare 1976, 74; Mishra 1972, 77, 80).
In contrast, Pāṇinian grammarians operate with five places of articulation
rather than six; they combine the glottal and velar places under the term
guttural (kaṇṭhya) (Allen 1953, 52; Mishra 1972, 77, 119).27 They avoid
having to posit different places of articulation for distinguishing between
a and h (on the one hand) and the velar stops (on the other) by employing
efficient techniques of reference to the segments instead. Pāṇinian
grammarians consider the nose (nasality) as a means, rather than a place,
of articulation. Thereby they avoid complications that would result from
considering all nasals (their distinct oral places of articulation
notwithstanding) as homorganic.28

5.2.5 Indian treatises on phonological features

Significantly, certain Indian phoneticians give particular prominence to
features. A few explicitly state that features are entities distinct from
both articulatory processes and phonetic segments and serve as the ele-
ments of which the latter are composed. Such analyses directly inspired
feature analysis in modern linguistics. Beyond classifying sounds
according to their common features, the Āpiśaliśikṣā operates with the
features associated with those sound classes (Cardona, 1965, 248). After
classifying sounds according to their place of articulation in section 1,
the second section explicitly associates these sound classes, designated
26 These differences have been studied by Bare (1976) and summarized by Deshpande
(1997a).
27 Bhaṭṭojidīkṣita preserves for etymological reasons the base of the tongue as a separate
place of pronunciation only for the jihvāmūlīya: Siddhāntakaumudī 10 (Cardona, 1965,
227).
28 Deshpande (1997a, 84). Kāśikā on A. 1.1.8.


tors without internal organization.34 Chomsky & Halle (1968), in their
influential sketch of a system of universal phonetic features, presented
a hierarchical system, which they characterized, however, as primarily
expository in purpose (300). At the same time, they noted the desirabil-
ity of research into the organization of features. More recently, excit-
ing progress has been made in modeling the relations between features.
Halle (1983) demonstrated that articulatory mechanisms, acoustic data,
and phonological rules all provide constraints on the organization of fea-
tures. Clements (1985) suggested that features follow a hierarchical or-
ganization governed by limits regarding both their sequential ordering
and their simultaneous grouping. On this view, features are regarded not
as properties of sound segments but as independent units in their own
right. Associated with each point in the speech signal is a feature geom-
etry that is orthogonal to the temporal dimension of the signal. Perhaps
the most significant aspect of Clements’ account is a “constrained the-
ory of assimilation processes, according to which all assimilation rules
involve the spreading of a single node: the root node, a class node, or a
feature node” (Clements, 1985, 247). In feature spreading, multiple seg-
ments, which were previously linked to separate features, are relinked
to a single feature.35 Since feature groupings recur across the world’s
languages, the aim of phonologists is to discover an adequate universal
feature organization (Clements & Hume, 1995).
Halle (1995) and Halle, Vaux & Wolfe (2000) arrange features un-
der their articulators instead of grouping them according to constriction,
which was the organizing principle of feature geometry in Clements’
model. Halle also considers that acoustic aspects of features play a sec-
ondary role. He believes “that there is a direct connection only between
features in memory and the articulatory actions to which they give rise”
(2002, 7). He therefore groups features under the only moveable parts
of the vocal tract, namely: lips, tongue blade, tongue body, tongue root,
soft palate, and larynx, and provides each with a unary designated artic-
of tenseness; (4) might be interpreted in terms of flatness; and (5) might be interpreted in
terms of stridency.
34 Ivanov & Toporov (1968, 40), however, present a feature tree, with (10) as the root

node and with higher nodes branching on the basis of features with decreasing indices in
their (inversely) ranked list (see n. 33 above).
35 See e. g. Halle (1995); Calabrese (1998, 9).


to encode
In a comprehensive linguistic encoding scheme, whether based on speech
segments or on phonological features, it is not necessary to encode all
the elements that may be observed; one need only encode distinctive el-
ements. For an encoding scheme based on segments, we select a set
of Sanskrit sounds that are minimally distinctive in the sense described
above (§4.3). For a scheme based on features, we select a set of mini-
mally distinctive features to describe the set of distinctive segments. The
set of minimally distinctive features we select is shown in TABLE 1. It
is not possible to eliminate (as did Pāṇini) the distinction between the
guttural and velar places of articulation, if we wish the feature system
uniquely to distinguish the visarga from the velar fricative jihvāmūlīya.
Pāṇini did not need this feature distinction, since he was able to refer
to segments directly (not just through the feature system). Further
reductions to the feature systems of the Indian phoneticians are not
possible. We preserve the stricture distinctions of Āpiśali between open,
more open, and most open in order to distinguish vowels that Pāṇini
distinguishes by explicitly classifying certain vowels as guṇa and vr̥ddhi.
We abandon, however, the purely phonetically motivated stricture feature
close (saṁvr̥tta) of a number of phonetic and grammatical treatises
including Āpiśali’s; the category is associated only with the vowel a,
which is already uniquely characterized by place and length features.
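The requirement that a feature system distinguish every segment uniquely can be checked mechanically. The following is a minimal sketch under stated assumptions: the feature names and the three-segment inventory are hypothetical simplifications and are not the feature set of TABLE 1. It illustrates why merging the guttural and velar places leaves visarga and jihvāmūlīya with identical feature bundles.

```python
from collections import defaultdict

# Hypothetical, simplified feature bundles (NOT the authors' TABLE 1).
SEGMENTS = {
    "visarga":     {"place": "guttural", "manner": "fricative", "voiced": False},
    "jihvamuliya": {"place": "velar",    "manner": "fricative", "voiced": False},
    "k":           {"place": "velar",    "manner": "stop",      "voiced": False},
}


def collisions(segments, merge=None):
    """Return groups of segments whose feature bundles are identical.

    `merge` optionally maps a feature value onto a merged value, e.g.
    {"velar": "guttural"} to model a system with one fewer place.
    """
    merge = merge or {}
    groups = defaultdict(list)
    for name, feats in segments.items():
        # Canonical, hashable form of the bundle after merging values.
        key = tuple(sorted((f, merge.get(v, v)) for f, v in feats.items()))
        groups[key].append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

With the six-place style distinction, `collisions(SEGMENTS)` finds no clashes; with `merge={"velar": "guttural"}` the two fricatives fall together, which is the situation the text describes for a purely feature-based scheme.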
We select our set of minimally distinctive Sanskrit sounds to encode
from those discussed in section 5.1. In order to clarify our criteria for
determining which sounds are distinctive, we discuss next the concept of
a phoneme, its limiting parameters, its relation to Pāṇini’s concept of a
sound class, and the relevance of some of the limiting parameters to gen-
erative grammar and to historical and comparative linguistics. At each
stage in this discussion we specify the set of sounds that our developing
concept of a distinctive segment would include. Finally, having arrived
at a satisfactory concept of a distinctive segment, we specify the set of
sounds we wish to encode and justify the inclusion of various segments
with reference to the limiting parameters already discussed.

6.1.1 Phoneme
Kemp (1994) summarizes the major elements and history of the con-
cept of a phoneme. Early definitions of the phoneme limited features
that could distinguish phonemes to those qualifying timbre, but since
the 1950s the concept has been extended to include duration, stress, and
pitch.
Phonemes are the minimally contrastive segments of sound in a lan-
guage, on the basis of the contrast between which lexical and gram-
matical distinctions can be made. Sounds that are lexically or gram-
matically contrastive in parallel distribution are independent phonemes.
Conversely, where phonetically similar sounds differ only post-lexically,
they are not independent phonemes; rather they are either allophones or
free phonetic variants. Phonetically similar sounds that occur in com-
plementary distribution are allophones; phonetically similar sounds that
are non-contrastive in parallel distribution are free phonetic variants. A
middle category concerns sounds that are barely contrastive (Goldsmith,
1995a, 10–12). Two sounds, both of which are common, may be con-
trastive in just a small set of environments; one of two contrastive sounds
may occur only in limited contexts; or there may be some other asym-
metry between contrastive sounds. The contrast here possesses a low
functional yield.
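The distributional criteria just described can be sketched as a small decision procedure. This is an illustration only: the environment labels are hypothetical strings, and `functional_yield` is a crude ratio introduced here for the example, not a measure proposed in the text.

```python
def classify(envs_a, envs_b, contrast_envs):
    """Classify two phonetically similar sounds by their distribution.

    envs_a, envs_b : sets of environments in which each sound occurs
    contrast_envs  : environments in which exchanging the sounds changes
                     a lexical or grammatical distinction
    """
    shared = envs_a & envs_b
    if not shared:
        # No common environment: complementary distribution.
        return "allophones"
    if contrast_envs & shared:
        # Parallel distribution with contrast.
        return "independent phonemes"
    # Parallel distribution without contrast.
    return "free phonetic variants"


def functional_yield(envs_a, envs_b, contrast_envs):
    """Share of all environments in which the two sounds contrast."""
    all_envs = envs_a | envs_b
    if not all_envs:
        return 0.0
    return len(contrast_envs & envs_a & envs_b) / len(all_envs)
```

A pair that contrasts in only one of many environments would be classified as independent phonemes but would show a low ratio, which corresponds to the barely contrastive middle category.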


The concept of a phoneme is yoked with two parameters that limit its
utility as the sole basis for encoding. The first is that the sounds belong
to the same language in the strictest sense, namely, “the speech of one
individual pronouncing in a definite and consistent style” (Jones, 1962,
9). Differences in style, rate, or dialect are not included in the same
phonemic system. The second limiting parameter of the concept of a
phoneme is that for sounds to be considered contrastive they are required
to differentiate semantic content in a narrow sense.
A number of the phonetic segments described in section 5.1 are not
phonemes. These include inseparable phonetic segments described as
subsegments. The status of subsegments within the vowels r̥ and l̥ and
within e, o, ai, and au cannot be considered independently of those
vowels. Although Old Indo-Aryan e, o, ai, and au are historically derived
from Proto-Indo-Iranian sequences of separate vowels *aï, *aü, *āï, and
*āü, they cannot be eliminated as independent phonemes in a synchronic
description of Sanskrit. The rest of the subsegments described in section
5.1 are overlapping phases, that is, they are simultaneously the offset
phase of the first of two segments and the onset phase of the second.
As such, they form parts of allophones. These include the nasals yama
and nāsikya in the phonological description of the R̥kprātiśākhya, where
they are the overlapping phases of a stop or h and the following nasal
stop. While Indian phoneticians make a great contribution to the science
of phonetics by providing descriptions of these sounds, the subsegments
are not phonemic. They occur in very limited environments as parts of
sounds that occur in complementary distribution with other allophones
of their respective phonemes.
Several other marginal phonetic segments are not phonemes in the
strict and narrow sense. They occur only in complementary distribution
with other sounds in parallel contexts and hence are allophones. The
short vowels ĕ and ŏ occur word-initially in hiatus after e and o in com-
plementary distribution with a in certain Vedic dialects. They also occur
in Sāmaveda as free phonetic variants in a specific recitational repetition
called nyuṅkha. Slightly lengthened short vowels in Vājasaneyisaṁhitā
occur in complementary distribution with short vowels.1 The retroflex
ḷ and ḷh occur intervocalically in complementary distribution with ḍ
1 Long vowels shortened in specific contexts and termed kṣipra likewise would not be
phonemes, even if they did differ in length from short vowels.


and ḍh in R̥gvedic dialect. In several dialects, in complementary
distribution with normal y and v, firmer palatal and labial approximants occur
word-initially, and lighter palatal and labial approximants occur word-
finally. The epenthetic vocalic segments svarabhakti are automatically
inserted in predictable environments and thus are not phonemic. For the
same reason, the nasals yama and nāsikya, which in the phonological de-
scription of most ancient Indian phonetic treatises are epenthetic nasals
automatically inserted in predictable environments, are not phonemic.
Certain members of two subgroups of phonetic segments, sibilants
and nasals, occur only non-contrastively either in complementary
distribution in specific dialects or as free phonetic variants. In the sibilant
subgroup, jihvāmūlīya and upadhmānīya are allophones of s and r
word-finally before unvoiced velar and labial stops. Visarga generally occurs
in pausa (dahati agniḥ) in complementary distribution with r and voiceless
fricatives ẖ, ś, ṣ, s and ḫ (agnir dahati, agniẖ karoti, agniś carati,
agnis tiṣṭhati, agniḫ pūjyate), and as a dialectal or free phonetic variant
of jihvāmūlīya (ẖ) and upadhmānīya (ḫ) before unvoiced velar and
labial stops (agniḥ karoti, agniḥ pūjyate),2 and of sibilants before
sibilants (agniś śr̥ṇoti : agniḥ śr̥ṇoti). It also occurs as a phonetic variant
before palatal and labial stops in certain dialects (yajuḥ karoti : yajuṣ
karoti). A parallel situation is found with certain sounds in the nasal
subgroup. Nasalized semivowels are allophones of word-final3 m before
their corresponding clear semivowels (cakame purūravasam : saỹ-yama).
Anusvāra generally occurs in complementary distribution with
m before a fricative (saṁ-śaya) and as a dialectal or free phonetic variant
of nasal stops before oral stops (śaṅ-kara : śaṁ-kara), and of nasalized
semivowels before semivowels (saỹ-yama : saṁ-yama). Different
lengths of anusvāra are allophones additionally determined by the length
of the preceding vowel and by following consonant clusters or consonant
+ r̥. Among the nasal stops the palatal nasal is not a phoneme. It is an
allophone of m before a palatal stop (sañ-caya) and is a phonetic variant
2 Labial and velar sounds, such as [ɸ] and [x], are acoustically similar and share the
feature gravity. Historically, the voiceless velar fricative symbolized by ⟨gh⟩ in English
words like cough is in Present Day English a voiceless labio-dental fricative [f] (Ladefoged,
1971, 44).
3 We use word-final as a translation of padānta, that is, occurring at the end of a pada
(independent word, preverb, or compound element).


of anusvāra in the same context (saṁ-caya). It is likewise an allophone


In describing his method of analysis, Vacek (1976, 407) notes, “every


svarabhakti as argued by Deshpande. There is no independent evidence

that long svarabhakti vowels do not inherit the high or low pitch of the
preceding vowel, nor that short svarabhakti vowels are not recited with
accumulated (pracaya) pitch after a svarita vowel. There is therefore no
independent evidence that the term sphoṭana applies only to the short
svarabhakti as Deshpande argues. It is doubtful that it does and doubtful
that the text itself asserts a different behavior regarding accent inheri-
tance. Therefore there is insufficient evidence to establish any contrast
between short and long svarabhakti. Should evidence be found to estab-
lish such a contrast, of course, it would serve as grounds to recognize
short and long svarabhakti as distinct phonemes.

6.1.6 Phoneme in the broader sense

It is clear that the limiting parameters placed on the concept of a pho-
neme, in the strict and narrow sense, diminish its utility as the sole basis
for a single character-encoding scheme for Sanskrit texts. If an encoding
scheme is to convey the same information that the language conveys, it
should provide the means to distinguish all minimally contrastive seg-
ments, insofar as any contrastive information is conveyed by the differ-
ence between those segments. And it must include differences in style,
dialect, and genre, insofar as these are significant contrasts within the
scope of the collection encoded. The corpus of Sanskrit texts includes
various dialects of Vedic and classical as well as more varied speech
communities such as Buddhist Hybrid Sanskrit (Edgerton, 1970). It in-
cludes borrowings from early dialects, Prākrits, substrate languages (cf.
Witzel 1999, Hock 1975), and foreign languages. In many cases, the only
evidence for such loan words is in the Sanskrit itself. And extant doc-
uments indicate paralinguistic semantic content through such devices as
prolonged vowels, at least in the Vedic texts. The extended parameters in
the concept of a phoneme discussed in sections 6.1.3–6.1.5 are adequate
to convey the desired contrasts. Hence the phoneme in the broad sense is
suitable to serve as the basis for a single character-encoding scheme for
all Sanskrit dialects, borrowings, and linguistic uses.


6.1.7 Contrastive phonologies

Incompatible phonological schemes have been proposed for the descrip-
tion of Sanskrit (see above §5.2). The form in which Sanskrit and Ve-
dic texts have been received in oral recitation as well as in manuscripts
and the various scripts and encodings used to transmit Sanskrit texts all
adopt — at least implicitly — some phonological scheme. The vari-
ous encodings used to transmit Sanskrit texts are not entirely compatible
with one another. Information contained in one phonological scheme
cannot necessarily be captured in another. Although we have attempted
to devise an encoding that captures all the distinctions made by all the
phonological schemes used to describe and transmit Sanskrit, most ex-
isting texts do not represent all these distinctions. Most Sanskrit texts
do not represent epenthetic nasals (yamas and nāsikya), unreleased stops
and semivowels (abhinidhāna), epenthetic vowels (svarabhakti), accent,
distinctions in the weight of semivowels, and distinctions in types of
anusvāra. Where accent is represented, it is often not represented in such
a way that one can determine how it is to be mapped onto the range
of tones needed to describe the various traditions of Vedic accentuation
completely. When information is not provided about epenthetic nasals
(yamas and nāsikya), unreleased stops and semivowels (abhinidhāna),
epenthetic vowels (svarabhakti), accent, semivowel weight, and length
of anusvāra, these features should simply be ignored. Since sufficient
information is not always available to encode a text with the full reper-
toire of phonological distinctions required for a completely contrastive
description, an encoding scheme must provide defaults to allow the in-
formation that is provided to be represented, even if that information is
less than complete.
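One way to honor this requirement is to treat every optional distinction as possibly unrecorded. The sketch below is an assumption-laden illustration, not the authors' scheme: the `Vowel` type, the `encode` function, and the accent marks in `ACCENT_MARK` are all hypothetical. The point it shows is that an unspecified accent yields the plain default character while remaining distinguishable, in the data model, from an explicitly recorded value.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical accent marks appended to a vowel character.
ACCENT_MARK = {"high": "/", "low": "\\", "svarita": "^"}


@dataclass(frozen=True)
class Vowel:
    base: str
    # None means the source text gives no accent information;
    # it does not assert that the vowel is unaccented.
    accent: Optional[str] = None


def encode(vowel: Vowel) -> str:
    """Encode a vowel, falling back to the unmarked default character."""
    if vowel.accent is None:
        return vowel.base
    return vowel.base + ACCENT_MARK[vowel.accent]


def accent_recorded(vowel: Vowel) -> bool:
    """True only when the source actually specifies the accent."""
    return vowel.accent is not None
```

Under these assumptions `encode(Vowel("a"))` gives the plain character and `encode(Vowel("a", "high"))` gives the marked one; the same pattern would extend to semivowel weight or anusvāra length.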
In the case of epenthetic sounds (yamas, nāsikya, and svarabhakti),
the default is simple: leave them out. In the case of the unusual weight of
semivowels, unreleased varieties of stops and semivowels, and accented
vowels, in the absence of special information, the normal, clear, unmod-
ified sound will be the default. If a semivowel is not specified as heavy,
light, or unreleased, the default semivowel, without specification of spe-
cial weight will be used. If accent is not specified, the monotone vowel
will be used. In such cases the default does not necessarily indicate the
lack of the special feature; it merely indicates the absence of information
concerning the feature. Only when the text does specify a particular con-


it. For example, the unaspirated and aspirated retroflex lateral flaps /ɭ̆/
and /ɭ̆ʰ/ do not occur in Classical Sanskrit; the phonemes /ɖ/ and /ɖʰ/
do. In certain R̥gvedic dialects, unaspirated and aspirated retroflex
lateral flaps occur in complementary distribution with [ɖ], [ɖʰ] and hence
are allophones of [ɖ], [ɖʰ]. In separate encoding schemes for Classical
Sanskrit and for the R̥gvedic dialects, encodings are necessary only for
the two phonemes /ɖ/, /ɖʰ/.
In a database that includes passages both in the R̥gvedic dialect and in
the Classical Sanskrit dialect, one could tag the passages in one or both of
the dialects and apply phonetic rules to produce the contextually
appropriate allophones proper to each dialect. For instance, Yāska’s Nirukta,
which is predominantly in the Classical Sanskrit dialect, cites passages
in the R̥gvedic dialect. Nirukta 3.11 cites R̥V. 2.23.9, which contains
the word taḷito with an intervocalic retroflex lateral flap, Romanized ḷ.
Nirukta 3.11 then cites the linguist Śākapūṇi explaining that taḷit refers
to lightning in the passage vidyut taḷid bhavatīti śākapūṇiḥ. Rather than
including both the retroflex lateral flap [ɭ̆] and [ɖ] in the character
encoding, one might tag the text in R̥gvedic dialect and allow special rules to
realize /ɖ/ as the retroflex lateral flap [ɭ̆] in text so tagged. Such a tagging
in the latter passage could be achieved as follows:
<embed dialect="rv">vidyut taqid bhavati</embed>
iti SAkapURiH

Within the tagged dialect portion, intervocalic /ɖ/ will always be
realized as the retroflex lateral flap [ɭ̆]; outside such tags, it will always be
realized as [ɖ].
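The realization rule just stated can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes the sample is in an SLP1-style transliteration in which q stands for ḍ and Q for ḍh, and it assumes L and | as output characters for the two flaps; the function names and the regular expression are likewise hypothetical.

```python
import re

# Assumed SLP1-style conventions (for illustration only):
# q = retroflex d, Q = aspirated retroflex d, L and | = the two flaps.
VOWELS = set("aAiIuUfFxXeEoO")
FLAP = {"q": "L", "Q": "|"}
EMBED = re.compile(r'<embed dialect="([^"]+)">(.*?)</embed>', re.DOTALL)


def realize_rv(text: str) -> str:
    """Realize intervocalic q/Q as flaps; leave everything else alone."""
    out = []
    for i, ch in enumerate(text):
        intervocalic = (
            0 < i < len(text) - 1
            and text[i - 1] in VOWELS
            and text[i + 1] in VOWELS
        )
        out.append(FLAP[ch] if ch in FLAP and intervocalic else ch)
    return "".join(out)


def realize(tagged: str) -> str:
    """Apply the dialect rule inside <embed dialect="rv"> spans only."""
    def repl(match):
        dialect, body = match.group(1), match.group(2)
        return realize_rv(body) if dialect == "rv" else body
    return EMBED.sub(repl, tagged)
```

Applied to the tagged sample, the rule turns `taqid` into `taLid` inside the embed and leaves untagged text untouched, so the result depends entirely on where the tag boundaries are drawn.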
It is not likely that such a system would be practical at the present
stage, however, since this encoding would have to be fairly fine-grained,
and since we possess insufficient information about dialectal differences
and loanwords. We are uncertain, for instance, whether Śākapūṇi writes
consistently in R̥gvedic dialect or just cites the single word taḷit in
R̥gvedic dialect. If the latter, the above demonstration includes too much of
the passage within the <embed> tag. Moreover the Nirukta itself uses
the retroflex ḷ, ḷh even when not directly citing. Immediately after
referring to Śākapūṇi, the Nirukta continues, sā hy avatāḷayati, using ḷ outside
R̥gvedic dialect. The text as received makes no mention/use distinction.
With the lack of reliable information about the author’s dialect, one is


forced to accept that [ɖ], [ɖʰ] and unaspirated/aspirated retroflex lateral
flaps [ɭ̆], [ɭ̆ʰ] occur in contrastive distribution. Sequences VḷV occur
in the Nirukta external to R̥gvedic citations and alongside VḍV, setting
[ɖ] and the retroflex lateral flap [ɭ̆] in contrast; for example, avatāḷayati
‘strikes down’ (3.11) : lambacūḍaka ‘one having long locks (of hair)’
(1.14). Therefore, the unaspirated and aspirated retroflex lateral flaps
[ɭ̆], [ɭ̆ʰ] occur in contrastive distribution with [ɖ], [ɖʰ] not only within
the collection comprising all the dialects of Sanskrit, but even in the
classical Sanskrit dialect that excludes the R̥gvedic dialect. Hence all four
must be encoded separately at the character level.
Similarly, an encoding of surface accent at the character level must
embrace the range of pitches utilized across Vedic schools and dialects
because it is not always practical to use higher-order text-encoding de-
vices to bracket off excerpts from the texts of various Vedic schools and
dialects. Many texts, especially ritual texts, cite passages from more than
one Vedic saṁhitā. One could bracket passages of the Śākalasaṁhitā
of the R̥gveda and passages of the Vājasaneyisaṁhitā of the Yajurveda
separately in XML tags and employ separate encoding schemes adequate
to capture the surface pitch contrasts within each.7 Yet many ritual
texts include passages, which, though accented in accordance with either
the system described in the R̥kprātiśākhya or that described in the
Vājasaneyiprātiśākhya, are untraced to known collections. To be sure,
higher-level text bracketing may be preferable in instances in which the
significance of accentual marks is only known by identifying the text and
knowing the accentual system described by a particular phonetic treatise.
There are, however, instances in which the accentual system is known,
yet the text is unidentified. Such texts lack clear criteria for higher-order
tagging. While a character-level surface pitch encoding does require
choice of pitch level, it does not commit one to textual identifications
for which there is no evidence.
7 Neither the Mādhyandina nor the Kāṇva recension of the Vājasaneyisaṁhitā employs
the system described in the Vājasaneyiprātiśākhya according to which one would expect
the vertical line above the high-pitched syllable rather than above the circumflexed syllable,
if graphic marks correspond with pitch contours as suggested by Witzel 1974. But several
saṁhitās (cf. §3.2) do employ a vertical line above the high-pitched syllable (Mīmāṁsaka,
1964, 12, 21, 25, 42).
Separately each system described in section 5.2.1 requires the
distinction of only three pitches. The system of the R̥kprātiśākhya
distinguishes between extra-high, high, and low, while the system of Pāṇini
and the Vājasaneyiprātiśākhya distinguishes between high, low, and
extra-low. Although each of the systems of surface accentuation
distinguishes only three pitches, a system of surface accentuation that will
accommodate contrasts across these systems must distinguish four pitches.
A system that captures distinction in pitch across Vedic dialects must
therefore distinguish between extra-high, high, low, and extra-low.
Consequently, it is necessary to devise a character-encoding scheme adequate
to capture phonemic distinctions in the broad sense across all Sanskrit
dialects.
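The step from two three-pitch systems to a combined four-pitch inventory is a simple set union, which can be sketched as follows. The constant and function names are hypothetical; the pitch levels themselves are those named in the text.

```python
# Pitch levels ordered from lowest to highest.
SCALE = ["extra-low", "low", "high", "extra-high"]

# Levels distinguished by each system, as stated in the text.
RKPRATISAKHYA = {"extra-high", "high", "low"}
PANINI_VAJASANEYI = {"high", "low", "extra-low"}


def combined_levels(*systems):
    """Return, in scale order, every level that at least one system uses."""
    needed = set().union(*systems)
    return [level for level in SCALE if level in needed]
```

Each system alone yields three levels, while `combined_levels(RKPRATISAKHYA, PANINI_VAJASANEYI)` yields all four, because the two systems share only high and low.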
Higher-level bracketing does not seem suitable to capture distinctions
in various Sanskrit dialects and loan words since there may be insuffi-
cient evidence to identify the various dialects and source languages for
loan words. There exist, however, phonetic distinctions that are more
suitably captured by using higher-level bracketing than by incorporating
them in a character encoding based on the phoneme in the broad sense.
Higher-level bracketing is appropriate where the phonetic distinction is
made only with explicit reference to units at a higher level than the
phoneme. For instance, higher-level bracketing is appropriate where the
phonetic distinction is made only with reference to lexical items. For
example, Mallaśarmakr̥taśikṣā 45–46 describes nasalized vowels prolonged
to five and six morae. While there is no reason to doubt the phonetic
accuracy of the description, there is no need to include the distinction of
five- and six-mora lengths in the featural scheme of Sanskrit nor to
include nasalized vowels having a length of five or six morae in the broadly
phonemic character inventory. Such lengths need not be included
because their only occurrence is in the final vowels of particular lexical
items. The length of five morae occurs only in the word mahā, and the
length of six morae occurs only in the word ati (Mallaśarmakr̥taśikṣā
46). Because the occurrence is lexically specific, the phenomenon is best
described lexically, as it was described by the Śikṣā itself. The Śikṣā
calls the occurrence of these extra-long nasalized vowels mahāraṅga and
atiraṅga, i.e. the raṅga of mahā and the raṅga of ati. The defining char-
acter of the distinguishing feature seems to be the lexical item rather than
the length of the sound. Therefore, we consider it a lexical feature rather
than a phonetic one.
It is difficult to capture suprasegmental features such as accent in


a segmental encoding; hence, one might choose to utilize higher-order

units — in particular syllabic units — to encode accent. One could thus
tag syllables and assign accentual features to them. Indian phonetic trea-
tises themselves recognized the syllabic nature of accent. As mentioned
in §4.2.2, ancient Indian linguistic treatises recognized that vocalic ac-
cent spread to adjacent consonants. Other Indian treatises limited accent
to vowels and ignored consonants in accentual rules (Vyā. Pa. 33). Yet
difficulties would arise in attempting to encode the accent of syllabified
visarga and anusvāra. In these cases the techniques of marking accent in
Vedic texts are more easily correlated with individual characters. Certain
Vedic traditions mark accent of the non-vocalic elements anusvāra and
visarga. Such marking, and the oral recitation of the texts, demonstrate
that anusvāra and visarga are syllabified. Visarga is syllabified by echo-
ing the preceding vowel or final subsegment of an open diphthong, and
anusvāra is syllabified in White Yajurvedic recitation as gũ. Yet no vowel
belongs in the underlying text and no vowel is written. Since the syllab-
ification is clearly indicated by the marking of accent on the anusvāra
and visarga characters and the syllabification can be most reliably in-
ferred from the accent marking, it seems preferable, given the lack of
other explicit information about the syllabification of these elements, to
include accent in a character encoding.8 In the purely segmental encod-
ing SLP2, we include characters for high-pitched visarga, low-pitched
visarga, and svarita visarga, and for low-pitched anusvāra. We do not,
however, include characters for high-pitched and svarita anusvāra. Although
the vertical stroke above, used in Devanāgarī script to indicate a
svarita in the Vājasaneyisaṁhitā, is found above the sign for an anusvāra,
it is found there instead of above the sign for the preceding syllable onset
and core, unlike the signs for visarga accent, which appear in addition to
the vertical stroke above the preceding syllable onset and core, and un-
like the horizontal stroke that indicates low pitch, which appears below
the sign for anusvāra in addition to below the sign for the syllable onset
and core. The vertical stroke above the sign for anusvāra is therefore a
graphic transposition that still indicates the accent of the entire syllable,
8 Phonetic treatises disagree as to whether anusvāra is syllabified.
Varṇaratnapradīpikāśikṣā 50–51 states that anusvāra is among the group of sounds
called yogavāha that are devoid of their own accent. Several Śikṣās (e.g. Keśavīśikṣā 5,
Tatkr̥tā padyātmikā Śikṣā 15, Svarabhaktilakṣaṇapariśiṣṭaśikṣā 19), in contrast, state that
anusvāra is replaced by nasalized a when a spirant or r follows.


Chapter 7

Script-based encoding

Although discussion so far has focused on sound-based encoding, we

note that there are many applications for script-based encoding. Much
of the human cultural heritage has been transmitted primarily in written
form. Some forms of writing are independent of spoken language (Hy-
man, 2006), and written and spoken language manifest parallel struc-
tures that are partly independent (Weir 1967; Vachek 1973). The typo-
graphic form — meaning the visual aspects of written and printed lan-
guage (Waller, 1988, 5) — conveys information that may need to be en-
coded in machine-readable documents. Researchers concerned with his-
torical manuscripts, for instance, attend to characteristics such as scribal
hands, ink color, abbreviations, letterforms, ductus,1 margins, and spac-
ing (Kropač, 1991).
A focus on the primacy of spoken language in particular contexts
need not, and should not, lead to a denigration of writing. In the 1920s
the Soviet psychologist L. S. Vygotsky recognized that writing is both the
product of human cognition and an environmental factor that contributes
to cognitive development. More adventurously, he argued that the in-
vention of writing led historically to new complexity in human cognition
(Vygotskii 2005, 417; Cole, Levitin & Luria 2006, 44–45).2 The fact
1 Cf. Skelton 2008, 161.
2 On the sociogenesis of such complex cultural products as writing see also Tomasello
(1999, 41–48) and Damerow (1996, 316–321).

7.1 Featural analysis

Similar sets of features were proposed by Geyer (1970) and Laughery
(1971), both of whom made use of computer simulation models; the latter
author proposed features to distinguish not only capital letters but
also Arabic numerals. The validity of such feature sets can be tested
by empirical data for letter confusion errors derived from psychological
experiments (Geyer & DeWald, 1973).
These schemes take the form of feature lists, that is, essentially un-
ordered sets of features. It is possible also to organize features into trees,
which have an inherent geometry, reflecting systemic relations between
features (cf. p. 77, above). Tversky (1977, 346) presents a feature tree
for the lower-case letters (save ⟨w⟩), using the binary features {curved,
arched, vertical, angular, circular, tailed, long, dotted, twisted, forked}.
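
The difference between a feature list and a feature tree can be illustrated with a short Python sketch. The feature names are those listed above, but the feature assignments for individual letters are hypothetical values chosen for illustration; they are not Tversky's, and the functions `vector`, `distance`, and `build_tree` are names invented here.

```python
# Binary feature names as listed in the text.
FEATURES = ("curved", "arched", "vertical", "angular", "circular",
            "tailed", "long", "dotted", "twisted", "forked")

# Hypothetical feature assignments (illustrative only).
LETTERS = {
    "i": {"vertical", "dotted"},
    "j": {"vertical", "dotted", "tailed", "long"},
    "l": {"vertical", "long"},
    "n": {"arched", "vertical"},
    "o": {"curved", "circular"},
}


def vector(letter):
    """Feature list: an unordered set rendered as a fixed-order 0/1 vector."""
    return tuple(int(f in LETTERS[letter]) for f in FEATURES)


def distance(a, b):
    """Number of features on which two letters differ."""
    return sum(x != y for x, y in zip(vector(a), vector(b)))


def build_tree(letters, features):
    """Feature tree: partition letters by successive features, in order."""
    if len(letters) <= 1 or not features:
        return sorted(letters)
    head, rest = features[0], features[1:]
    yes = [c for c in letters if head in LETTERS[c]]
    no = [c for c in letters if head not in LETTERS[c]]
    if not yes or not no:          # feature does not discriminate here
        return build_tree(letters, rest)
    return {head: {"+": build_tree(yes, rest), "-": build_tree(no, rest)}}


tree = build_tree(list(LETTERS), FEATURES)
```

In the list representation every feature has equal standing; in the tree, the order in which features are consulted encodes a hierarchy among them.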
In developing her Prosodic Font system, Rosenberger (1998, 41–42)
identified five similarity groups for Latin characters: “[t]hose that are
constructed as combinations of vertical strokes and circles, those formed
of circles left open for some interval (e. g. like a horseshoe) and a vertical
line, those constructed of slanted lines, the class of letters that combines
elements from the other three, and the letter ‘s’”.7 Her original system
used only four stroke primitives: line, circle, open circle, and s. Because
of implementation difficulties, a second system added three stroke prim-
itives (dot, curved tail, cross-bar) and recognized two basic principles of
stroke positioning: consecutiveness vs. simultaneity and dependence vs.
independence (Rosenberger, 1998, 43–47).
Analysis of characters in terms of graphic features is important in
work on OCR and handwriting recognition (Bansal & Sinha, 2000). In
the case of cursive handwriting, general features (both single-valued and
multi-valued) may be tested for each column within the word rectan-
gle, e. g.: projection profile, partial projection profile, upper/lower word
profile, background to ink transitions, grayscale invariance, Gaussian
smoothing, and Gaussian derivatives (Rath & Manmatha, 2003). “Word
spotting” is an information retrieval technique that uses one or more
images of a written or printed word as a prototype (or prototypes) to
find other tokens of the same word in a set of document images. This
approach treats the word as a holistic entity, rather than as a string of
graphemes. Gradient, Structural, and Concavity (GSC) features are ex-
tracted from the entire word at multiple scales/resolutions. The gradient
7 Cf. Estes (1978, 175).


features indicate changes in stroke orientation; the structural features indicate
the presence of corners as well as diagonal, horizontal, and vertical
lines; and the concavity features indicate “bowls” and open cavities (Sri-
hari, Srinivasan, Huang & Shetty, 2006).8
In an OCR system for Devanāgarī, characters (together with frequent
ligatures) are pre-classified into major categories depending on the posi-
tion (or absence) of a vertical bar (termed danda by the authors) (Govin-
daraju et al., 2004). Gradients (i. e. the magnitude and direction of in-
tensity changes around the pixels of a digitized image of a character) are
thresholded and quantized for a 3 × 3 grid. The resulting feature vector
of length 72 forms the input to a neural network with an input layer of
72 perceptrons. The network classifies characters from a blind test set at
around 95% accuracy (Govindaraju et al., 2004).9
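
The arithmetic behind a 72-element vector can be made concrete: a 3 × 3 grid with 8 direction bins per cell yields 9 × 8 = 72 values. The following Python sketch works under that assumption only. The bin count, the central-difference gradient, the threshold handling, and the function name `gradient_features` are assumptions made for illustration, not details reported by Govindaraju et al.

```python
import math


def gradient_features(image, grid=3, bins=8, threshold=0.0):
    """Count quantized gradient directions per grid cell.

    image: rectangular list of rows of numeric pixel intensities.
    Returns a flat list of grid * grid * bins counts (72 by default).
    """
    height, width = len(image), len(image[0])
    features = [0] * (grid * grid * bins)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the intensity gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            if math.hypot(gx, gy) <= threshold:
                continue  # thresholding: ignore weak gradients
            # Quantize the direction into one of `bins` equal sectors.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            sector = int(angle / (2 * math.pi) * bins) % bins
            cell = (y * grid // height) * grid + (x * grid // width)
            features[cell * bins + sector] += 1
    return features


# A 9 x 9 test image containing a single vertical bar in column 4.
bar = [[1.0 if x == 4 else 0.0 for x in range(9)] for _ in range(9)]
vec = gradient_features(bar)
```

A vector of this kind would then be passed to a classifier; the neural network itself is not sketched here.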
Chinese characters (hanzi/kanji) are traditionally classified (in dictionaries
and reference works) on the basis of a number (most commonly
189 or 214) of basic elements termed “radicals”. In the Rosenberg Graphical
System the characters are more conveniently classified according to
22 basic graphical elements, which can be subsumed under five cate-
gories of stroke direction: (1) horizontal, (2) vertical, (3) sloping down-
ward to the left, (4) sloping downward to the right, (5) reverse curved
down (Barlow, 1995). In OCR of Chinese characters, characters are mod-
eled as a set of linear primitives (Suen, Mori, Kim & Leung, 2003). It
is possible analytically to decompose Devanāgarī characters into primitives,
but these primitives are non-linear, and no computational technique
has been implemented to decompose characters in such a fashion (Kom-
palli, 2007). Chinese calligraphy recognizes seven or eight basic strokes.
A computational implementation demands further distinctions. Thus the
Hàn Zì software implemented by Douglas Hofstadter and David Leake
in the 1980s required about 40 distinct basic strokes (Hofstadter, 1985,
294).
Donald Knuth’s METAFONT system, begun in 1978 in collabora-
tion with Charles Bigelow and Kris Holmes, implements a high-level
8 Such a computational approach may not be wholly foreign to ways in which humans
recognize words. Psychological evidence suggests that parallel to other word-identification
processes is a holistic process that is sensitive to salient peripheral features of a word’s
shape (Beech & Mayall, 2007).
9 For an elaboration of this model, including reports of word accuracy, see Kompalli
(2007).


programming language that can be used to construct font glyphs mathematically.
METAFONT allows for the creation of a family of fonts, by
specifying a fairly large number (about 60) of parameters that determine
the particular realization of glyphs. The best known fonts created with
METAFONT are Knuth’s own Computer Modern fonts, frequently used
with TEX. In Indic typography METAFONT was first used to create a
Devanāgarī font (NCSD) by Ghosh (1983). Subsequently, Frans Velthuis
used METAFONT to create a font Devanag (Pandey, 1998) and Charles
Wikner employed the software in creating his Sanskrit Devanāgarī font
(Wikner, 2002).
Douglas Hofstadter in his ingenious 1982 reply to Knuth argues that
semantic categories (such as the character ⟨A⟩) are productive sets (Hofstadter,
1985, 263). That is, no finite parameterization is capable of specifying
all the ways in which a particular character may be graphically
realized. Hofstadter understands characters as belonging to a structural
system (such as the system of Latin letters) that employs a set of contrasts
(thus ⟨p⟩ and ⟨b⟩ differ in the relative position of their “post” and “bowl”)
(Hofstadter, 1985, 280). Although Hofstadter rejects analysis of letter-
forms into geometric parts, he allows instead for conceptual roles (such
as “crossbar”, “bowl”, “post”, “tail”) that may be variously realized by
particular glyphs. Glyphs are accepted as characters to the degree that
they are successful in fulfilling a set of roles. Hofstadter’s approach re-
sembles in certain respects prototype theories (Reed, 1978, 153–158).
One potential use of featural analysis is to investigate the history of
writing systems. Coding a set of palaeographic characters by means of
feature vectors might serve as a preliminary to studies employing the
methods of phylogenetic systematics (cladistics) (Skelton, 2008). From
the feature vectors one might extract characters for producing a data matrix
of the sort that is used in phylogenetic analysis. Phylogenetic
analysis uses algorithms or optimality criteria to compute an evolutionary
tree that describes the relations between taxa. Such categories as scribal
hands, documents, or find sites might be chosen as appropriate taxa.
Featural analysis also can model character confusion, as in palaeo-
graphic situations when a scribe mistakes one character for a visually
similar one. Feature systems can be used to predict the likelihood of
particular confusions. By combining a set of graphic features with an
edit function such as the stepped distance function (SDF), it is possible to


compute the orthographic similarity between two strings (Singh, 2006).
Because the orthographic syllable is such a salient unit in Indic scripts,
analysis of this type has many potential applications in manuscript stud-
ies and textual criticism.
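
The idea of combining graphic features with an edit function can be sketched in Python as follows. This is not the stepped distance function of Singh (2006), whose definition is not reproduced here; it is an ordinary weighted edit distance in which the cost of substituting one character for another shrinks as the characters share more graphic features. The feature names and assignments are hypothetical, and the comparison runs per code point rather than per orthographic syllable.

```python
# Hypothetical graphic features for a few Devanagari characters.
GRAPHIC_FEATURES = {
    "प": {"right_bar", "open_bowl"},
    "ष": {"right_bar", "open_bowl", "diagonal"},
    "फ": {"right_bar", "open_bowl", "curl"},
    "म": {"right_bar", "loop", "square_bowl"},
    "क": {"mid_bar", "loop", "curl"},
}


def substitution_cost(a, b):
    """Cost in [0, 1]: share of features not common to both characters."""
    if a == b:
        return 0.0
    fa, fb = GRAPHIC_FEATURES.get(a), GRAPHIC_FEATURES.get(b)
    if fa is None or fb is None:
        return 1.0  # no feature data: treat as wholly dissimilar
    return len(fa ^ fb) / len(fa | fb)


def weighted_distance(s, t):
    """Edit distance with unit insertions/deletions and featural substitutions."""
    previous = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        current = [float(i)]
        for j, b in enumerate(t, 1):
            current.append(min(
                previous[j] + 1.0,                          # delete a
                current[j - 1] + 1.0,                       # insert b
                previous[j - 1] + substitution_cost(a, b),  # substitute
            ))
        previous = current
    return previous[-1]
```

Under these assumed features a confusion of प with ष costs less than a confusion of प with क, which is the kind of ranking a model of scribal error would need.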

7.2 Analysis of Devanāgarī script

We have surveyed a number of attempts to analyze writing at the sub-
graphemic level. As we move from typographers to psychologists, new
media designers, lexicographers, OCR implementors, and cognitive sci-
entists, we see, with shifting goals, shifting levels of analysis. There is no
real consensus on what meaningful distinctions to draw below the level
of the grapheme. This situation is in contrast to that obtaining in phonol-
ogy, where — although there is disagreement about particular features
and about issues such as whether articulatory or acoustic features are
more relevant; or whether n-ary, and not just binary, features should be
adopted — there is a consensus that speech sounds can be understood in
terms of sets of distinctive features (Jakobson et al., 1963; Chomsky &
Halle, 1968; Ladefoged, 1971; Halle, 1983; Clements, 1985; Clements
& Hume, 1995).
Although in most writing systems there is normally no correlation
between graphic and phonetic features, we do occasionally find such a
correlation. ⟨p⟩ and ⟨b⟩ differ in only one visual feature, while /p/ and /b/
differ only in the feature [± voice]. Similarly, ⟨b⟩ and ⟨d⟩ differ only in
one visual feature, while /b/ and /d/ differ only in place of articulation.
Such distinctions appear to emerge synchronically in a process of “re-
signification”. The parallelisms do not hold for letterforms such as {⟨B⟩,
⟨D⟩, ⟨P⟩} (from which the lower-case forms developed) or a fortiori {⟨B⟩,
⟨D⟩, ⟨P⟩} or {⟨Β⟩, ⟨Δ⟩, ⟨Π⟩}.
Historically, we know or suspect that certain characters were derived
from others. Thus in Brāhmī, characters for aspirated stops are derived
from characters for unaspirated stops (Dani, 1963). Sometimes the character
for the aspirated stop is formed by completing part of the shape of
the character for the homorganic unaspirated stop, as in ⟨cha⟩ < ⟨ca⟩
and ⟨ṭha⟩ < ⟨ṭa⟩. In other cases an extra “curlicue” is added, as in
⟨ḍha⟩ < ⟨ḍa⟩ and ⟨pha⟩ < ⟨pa⟩. The derivational relationship may
still be evident in Devanāgarī, where ⟨प⟩ and ⟨फ⟩ represent /p/ and /pʰ/


7.3 Component analyses of Devanāgarī script

FIGURE 7.1: Devanāgarī atoms, as drawn by R. K. Joshi, 1984.


Chapter 8

Conclusions

Although computers manipulate linguistic and textual data in sophisticated
ways, current encoding systems reflect orthographic design factors
to the exclusion of more relevant information-processing principles.
Even the most recent standardized encoding systems reproduce deficien-
cies inherent in the traditional orthographies themselves. These tradi-
tional orthographies have undergone a long history of adaptation in tech-
nologies for the visual representation of language. Beginning with styli,
brushes, etc., and continuing with the invention of movable type, ma-
chine typesetting, the typewriter, remote transmission by means of tele-
type machines, the invention of standardized computer encodings from
ASCII to Unicode, right up to the desktop publishing revolution, each
stage in technological development represents language visually. Yet
display is only one of numerous functions that computers now perform.
Computers exchange textual data over space and time and perform lin-
guistic processing, such as spell-checking, machine translation, content
analysis and indexing, and morphological and syntactic analysis. There-
fore display for a human reader should no longer be considered the pri-
mary determinant of an encoding scheme. Rather, language should be
encoded in such a way as to facilitate automatic processing, to minimize
extrinsic ambiguity and redundancy, and to ensure longevity. To avoid
ambiguity and redundancy requires that an encoding system be characterized
by a one-to-one correspondence between characters and items to
be encoded, and that all encoded items be of the same kind.

Text-processing technology arose in the English-speaking world and
assumed as a norm the use of the Roman alphabet with few or no di-
acritics. Adaptation to some non-European scripts required consider-
able effort and compromise. The adaptation of Roman script itself re-
quired the use of a number of diacritics to represent the phonology of
non-European languages accurately. The greatest challenge remains the
application of encoding principles to the representation of non-European
languages. Sanskrit, the primary culture-bearing language of India, with
its enormous body of literature, strong oral tradition, and highly devel-
oped linguistics presents a particularly appropriate case for study.
The encoding schemes used for Sanskrit are based primarily either
upon Devanāgarī script or upon the standard Romanization of Sanskrit.
The difficulties with these schemes are due in part to problems in the
modes of graphic representation of Sanskrit sounds adopted in the scripts
themselves. Both depart from one-to-one correspondence between characters
and items to be encoded and from consistency in the type of encoded
item. Devanāgarī employs redundancy in the representation of
phrase-initial and post-vocalic vowels, and an inversion in the graphic
representation of phonetic elements in its representation of /a/. Roman-
ization employs digraphs for the representation of aspirate stops and open
diphthongs. Both employ digraphs for the representation of the aspirated
retroflex lateral flap /ḷh/. The use of a sign that represents an aspirate
segment to represent, in addition, the feature of aspiration, and
the use in Romanization of a, i, and u to represent phonetic segments
as well as subsegments of diphthongs, create inconsistency in the type
of item represented and therefore introduce ambiguity. Or, if it avoids
ambiguity by using the diaeresis over the second of two vowels, Roman-
ization still suffers from redundancy in the representation of the vowels i
and u. Encoding standards for Sanskrit that are based on Devanāgarī or
Romanization inherit the deficiencies inherent in the underlying scripts.
They suffer from ambiguity and redundancy by departing from a one-to-
one correspondence and by inconsistency in the basis for encoding.
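
The ambiguity introduced by digraphs can be checked mechanically. The Python sketch below takes a small subset of phonetic segments, written with the single-character codes of SLP1 (Appendix B), maps them to Romanization, and searches for Roman strings that can be produced by more than one sequence of segments. The subset, the table name `ROMAN`, and the function names are chosen for illustration.

```python
from collections import defaultdict
from itertools import product

# A small subset: SLP1 code -> Romanization. E, O, and K are single
# segments (the diphthongs ai and au, and the aspirate kh).
ROMAN = {"a": "a", "i": "i", "u": "u", "E": "ai", "O": "au",
         "k": "k", "K": "kh", "h": "h"}


def romanize(segments):
    """Concatenate the Roman spelling of each phonetic segment."""
    return "".join(ROMAN[s] for s in segments)


def ambiguous_outputs(max_len=2):
    """Roman strings produced by more than one segment sequence."""
    sources = defaultdict(set)
    for n in range(1, max_len + 1):
        for seq in product(ROMAN, repeat=n):
            sources[romanize(seq)].add("".join(seq))
    return {out: srcs for out, srcs in sources.items() if len(srcs) > 1}


collisions = ambiguous_outputs()
```

Each collision pairs a single segment with a two-segment sequence: the Roman string "ai", for instance, may stand for the diphthong or for a followed by i in hiatus. A one-to-one encoding produces no such collisions.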
Clear principles of encoding require determining the location of the
encoding in the space defined by three axes: graphic–phonetic, syn-
thetic–analytic, and contrastive–non-contrastive. One must determine
whether to encode written characters or speech sounds, segments or features,
and what criteria to use to contrast items. Since information degradation
arises at each stage in representation of knowledge, it is felicitous
to encode the primary medium of knowledge transmission. Given that
script is inherently a secondary phenomenon vis-à-vis spoken language,
encoding should be based directly on spoken language. Devanāgarī script
itself was not specifically designed to represent Sanskrit phonology, but
rather was adapted to this use subsequently; hence it is not surprising
that it proves to be a less appropriate basis for encoding Sanskrit than
Sanskrit phonology itself.
Few of the world’s writing systems were designed for the languages
that they represent in extant texts. Most were adapted, and adaptations
almost always fail to capture the structure of the spoken language ade-
quately. Therefore, in general, where one has access to the phonology of
the language, where the orthography is fairly shallow, and where the stan-
dard orthography departs from an ideal coding of spoken language struc-
ture, the basis for text encoding should be phonetic rather than graphic.
Sanskrit meets these conditions, and so it is better to encode Sanskrit
speech sounds directly than to encode the secondary representations of
those sounds in Devanāgarī, Roman, or any other script. Directly coding
Sanskrit speech sounds will solve the problems of ambiguity and redun-
dancy that we have noted in our survey of current encoding schemes.
Spoken language has a temporal dimension, and scripts that repre-
sent spoken language have a linear dimension that corresponds to the
temporal dimension of spoken language. The minimal independent unit
in the chain of speech is the phonetic segment or phone. The minimal
independent unit in script is the graphic segment or graph. A segmental
linguistic encoding is based upon minimal phonetic or graphic segments.
Yet both phonetic and graphic units may be decomposed into systems
of features orthogonal to this dimension of segmentation and not nec-
essarily coterminous with the minimal units of segmentation. Phonetic
units may be decomposed into a set of acoustic or articulatory features
that are realized simultaneously. Similarly, writing may be analyzed into
graphic features. Although the boundaries between phonetic and graphic
segments are sites of marked alterations in phonetic and graphic fea-
tures, each feature may independently be associated with a string of one
or more phonetic or graphic segments. Encodings may be entirely segmental,
at one pole of the synthetic–analytic axis, or entirely featural at
the other. For Sanskrit, we have devised an entirely segmental phonetic
encoding (SLP2) (see Appendix C), an encoding based entirely on articulatory
features (SLP3) (see Appendix D), and a phonetic encoding that
utilizes both segmental and featural units, while remaining clear about
which is which (SLP1) (see Appendix B; the features in SLP1 are indicated
by modifiers described in section B.3).
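
The difference between the segmental and the featural pole can be illustrated with a short Python sketch. It does not reproduce the code points of SLP2 or SLP3; it only shows a handful of stops, keyed by their SLP1 letters, decomposed into a few of the features listed in TABLE 1. The class name `Phone` and the helper `differing_features` are invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Phone:
    """A phonetic segment viewed as a bundle of simultaneous features."""
    place: str
    stricture: str
    voiced: bool
    aspirated: bool
    nasal: bool


# Segmental view: one symbol per phone. Featural view: the bundle behind it.
PHONES = {
    "k": Phone("velar", "contacted", False, False, False),
    "K": Phone("velar", "contacted", False, True, False),   # kh
    "g": Phone("velar", "contacted", True, False, False),
    "G": Phone("velar", "contacted", True, True, False),    # gh
    "N": Phone("velar", "contacted", True, False, True),    # velar nasal
    "p": Phone("labial", "contacted", False, False, False),
    "b": Phone("labial", "contacted", True, False, False),
    "m": Phone("labial", "contacted", True, False, True),
}


def differing_features(a, b):
    """Names of the features on which two segments differ."""
    return [f.name for f in fields(Phone)
            if getattr(PHONES[a], f.name) != getattr(PHONES[b], f.name)]
```

A segmental encoding stores only the keys of this table; a featural encoding stores the bundles, so that a relation such as "kh differs from k only in aspiration" is available without a lookup.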
All modes of information storage and transmission presuppose a se-
lection of relevant information. The selection of the set of distinctions
to be encoded depends upon the nature of the textual corpus and the in-
formation of interest to its users. Encoding requires classifying items,
identifying items within each class by ignoring irrelevant distinguishing
information, and designating each class by unique identifiers. A linguis-
tic transcription of speech ignores non-linguistic information such as ab-
solute tempo and pitch; a linguistic copy of a manuscript ignores absolute
line thickness and character height. An encoding assigns codepoints to
units that have significant contrasts. Yet a segmental phonetic encoding
of a corpus of Sanskrit texts for a general scholarly community cannot
limit itself to the narrow concept of a phoneme as the distinctive segment
to be encoded, even with its recent extension to include distinctions in
duration, stress, and pitch. Typically, phonemes are the minimally con-
trastive segments of sound in a language, on the basis of the contrast
between which lexical and grammatical distinctions can be made. But
a comprehensive phonological system of the language should be able
to convey whatever information speech conveys. Contrastive and com-
plementary distribution is always with respect to a specific context. If
one stretches two parameters in the typical definition of a phoneme, the
modified concept may serve as a suitable basis for a phonetic encoding:
(1) The language must collapse within its bounds diachronic differentia-
tion, regional dialects, and stylistic strata. (2) The range of the semantic
content that contrastive sounds are required to differentiate must include
paralinguistic semantics.
It is necessary to broaden the concept of a phoneme to comprise lin-
guistic variation, borrowing, and paralinguistic semantics. A phoneme in
such a comprehensive phonological system remains the minimally con-
trastive phonetic segment in a language on the basis of which one word
could be distinguished from another. It differs, however, from the strict
definition by relaxing its limiting parameters. A language then refers to a
specified range of dialects, including borrowings. And for sounds in parallel
distribution to be contrastive, they must serve to differentiate a specified
range of semantic content, including paralinguistic content. We have em-
ployed the broader conception of a phoneme to classify Sanskrit sounds
as distinctive in our phonetic encodings. We utilize the SLP1 encoding
for the storage of a corpus of Sanskrit texts in our digital Sanskrit library
and for linguistic processing. We transcode to a variety of Indic scripts
and Romanization in Unicode for display purposes and employ various
meta-transliterations, Indic Unicode, as well as clickable input keyboards
for data input.

8.1 Dynamic transcoding

By storing text in a single underlying format that maximizes fidelity to
the phonetic representation of the spoken language, we allow for extreme
flexibility in display and input options. Text stored in a single underlying
representation may easily be displayed in Devanāgarī, Roman transliteration,
phonetic transcription (e. g., that of the IPA), or one of the
regional scripts of India. Likewise, text entered and viewed in Roman
transliteration or one of the Indic scripts may be transcoded and pro-
cessed in the underlying phonetic format. Rules for translating the under-
lying format to one of the surface representations (typically encoded as
Unicode) can be implemented with finite state transducers (Huet, 2005).
We have developed a number of model transcoders using lex (Kernighan
& Pike, 1984) and similar scanner generators (which generate determin-
istic finite automata). Philosophically, such an approach is satisfying,
since it conceives of written Sanskrit as a rule-based transformation from
an underlying level that corresponds in some sense to speech. Practically,
it is very useful to be able to display the same stretch of Sanskrit text in
multiple ways; this possibility allows one to reach multiple audiences,
including beginning students (who cannot yet read an Indic script), and
Indian scholars, whether pandits or amateurs, who are used to using an
Indic script other than Devanāgarī.
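
A minimal Python sketch of such a rule-based transformation follows. It is not one of the lex-generated transcoders mentioned above. It assumes IAST as the target Romanization, covers only the basic SLP1 segments (no accents or other modifiers), and uses a lookup table in one direction and a longest-match scanner in the other. The table and function names are invented for the example.

```python
import re

# SLP1 code -> IAST Romanization (one SLP1 character per phonetic segment).
SLP1_TO_ROMAN = {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au", "M": "ṃ", "H": "ḥ",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}

# Inverse table: Roman digraphs such as "kh" map back to single codes.
ROMAN_TO_SLP1 = {roman: slp1 for slp1, roman in SLP1_TO_ROMAN.items()}

# Longest alternatives first, so that "kh" is tried before "k".
_ROMAN_TOKEN = re.compile("|".join(
    sorted(map(re.escape, ROMAN_TO_SLP1), key=len, reverse=True)))


def slp1_to_roman(text):
    """Underlying phonetic format -> surface Romanization."""
    return "".join(SLP1_TO_ROMAN.get(ch, ch) for ch in text)


def roman_to_slp1(text):
    """Surface Romanization -> underlying format (greedy longest match)."""
    return _ROMAN_TOKEN.sub(lambda m: ROMAN_TO_SLP1[m.group(0)], text)
```

The forward direction is a pure symbol-for-symbol mapping. The reverse direction has to guess: the scanner always reads "ai" as the diphthong, so a sequence of a followed by i in hiatus does not survive a round trip through Romanization, whereas it is stored without ambiguity in the underlying format.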
The Sanskrit Library has deployed a full set of transcoding routines
written in Java that allow Sanskrit text encoded in SLP1 to be displayed
in most major Indic scripts (Bengali, Devanagari, Gujarati, Gurmukhi,
Kannada, Malayalam, Oriya, or Telugu), standard Romanization, or any
of several popular encodings (Kyoto-Harvard, wx, ITRANS, etc.), depending
upon user preference. Data entry, and the display of entered
text, are likewise available in numerous formats based upon user preference.
Clickable input keyboards provide data entry for those unfamiliar
with any of the available encodings. A transcoding page also allows
users to enter short passages or upload files for transcoding. Although
pre-existing encodings generally capture less information than ours, the
Sanskrit Library has developed automatic and machine-assisted facilities
for conversion of prior and legacy data into the Sanskrit Library Phonetic
encodings.
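
The Sanskrit Library's Java routines are not reproduced here. As an indication of what transcoding from a phonetic encoding to an Indic script involves, the following Python sketch converts basic SLP1 text to Devanāgarī in Unicode. It is a simplified illustration: accents, the jihvāmūlīya and upadhmānīya, and other modifiers are omitted, and the table and function names are invented for the example. Combining signs are written as escapes so that they display unambiguously.

```python
# Independent vowel letters (used when no consonant precedes).
VOWELS = {"a": "अ", "A": "आ", "i": "इ", "I": "ई", "u": "उ", "U": "ऊ",
          "f": "ऋ", "F": "ॠ", "x": "ऌ", "X": "ॡ",
          "e": "ए", "E": "ऐ", "o": "ओ", "O": "औ"}

# Dependent vowel signs (used after a consonant); "a" is inherent.
SIGNS = {"a": "", "A": "\u093e", "i": "\u093f", "I": "\u0940",
         "u": "\u0941", "U": "\u0942", "f": "\u0943", "F": "\u0944",
         "x": "\u0962", "X": "\u0963",
         "e": "\u0947", "E": "\u0948", "o": "\u094b", "O": "\u094c"}

CONSONANTS = {"k": "क", "K": "ख", "g": "ग", "G": "घ", "N": "ङ",
              "c": "च", "C": "छ", "j": "ज", "J": "झ", "Y": "ञ",
              "w": "ट", "W": "ठ", "q": "ड", "Q": "ढ", "R": "ण",
              "t": "त", "T": "थ", "d": "द", "D": "ध", "n": "न",
              "p": "प", "P": "फ", "b": "ब", "B": "भ", "m": "म",
              "y": "य", "r": "र", "l": "ल", "v": "व",
              "S": "श", "z": "ष", "s": "स", "h": "ह", "L": "ळ"}

OTHERS = {"M": "\u0902", "H": "\u0903", "'": "ऽ"}  # anusvara, visarga, avagraha
VIRAMA = "\u094d"


def slp1_to_devanagari(text):
    """Convert basic SLP1 text to Devanagari (simplified sketch)."""
    out = []
    after_consonant = False  # is a consonant still awaiting its vowel?
    for ch in text:
        if ch in CONSONANTS:
            if after_consonant:
                out.append(VIRAMA)  # cancel the previous inherent vowel
            out.append(CONSONANTS[ch])
            after_consonant = True
        elif ch in VOWELS:
            out.append(SIGNS[ch] if after_consonant else VOWELS[ch])
            after_consonant = False
        else:
            if after_consonant:
                out.append(VIRAMA)
            out.append(OTHERS.get(ch, ch))
            after_consonant = False
    if after_consonant:
        out.append(VIRAMA)
    return "".join(out)
```

The state variable makes visible the inversion noted in the text: the script writes nothing for /a/ after a consonant and adds a sign (the virāma) where no vowel follows, so the transcoder must supply context that the phonetic encoding does not need.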
It would also be easy to develop additional input modes that can
be used with the encoding schemes. These input modes could be cus-
tomized for the needs of different users: e.g., Western scholars used to
dealing with Sanskrit in Romanization, Indians accustomed to differing
regional keyboard layouts, and scholars accustomed to legacy schemes.1
Suitable input methods can also be developed for devices with alterna-
tive input hardware, such as pen computers, PDAs, and mobile phones
(Shanbhag, Rao & Joshi 2002; Gupta 2006).2 In cases where input meth-
ods are being developed for users who are not already accustomed to
existing methods, attention should be paid to ergonomic factors such as
finger travel, error rate, typing speed, cognitive load, and learning curve.

8.2 Text-to-speech and speech-recognition

The discussion of transcoding between data-input, linguistic processing,
and display formats in the context of phonetics raises questions concern-
ing text-to-speech software and phonetic input methods. Text-to-speech
software and phonetic input methods are designed on the basis of the
sound structure of language, rather than on the traditional visual presen-
tation of language. The phonetic encodings described here, particularly
the featural encoding (SLP3), may serve as a starting point for develope-
1 QWERTY keyboards are not well-adapted to Indic script typing, especially for Indian
users who are not familiar with English and English keyboard layouts. New hardware
addresses these challenges (Joshi et al., 2004).
2 As of March 2010, India had about 545 million mobile phone users. Source:
<https://www.cia.gov/library/publications/the-world-factbook/geos/in.html>.


Appendix A

Tables


A.1 Phonetic features

TABLE 1 shows the structure of phonetic features that serve to character-
ize and contrast the phonetic segments of Sanskrit. The authors selected
the phonetic features shown after examining the sets of features described
in ancient Indian phonetic treatises including those of Āpiśali, Śaunaka,
and others. These features include both place of articulation and stricture
features as well as length and pitch, which have often been excluded from
the discussion of features. Place of articulation features do not include
nasal, although both Āpiśali and Śaunaka include this feature. On the
other hand, stricture features include some of the finer distinctions described
by Āpiśali. Recent universal linguistic featural systems devised
by Halle and by Clements utilize articulatory and stricture features, respectively,
as their primary elements.


TABLE 1: Phonetic features

I. place of articulation
   A. guttural
   B. velar
   C. palatal
   D. retroflex
   E. dental
   F. labial
II. manner of articulation (stricture)
   A. contacted
   B. slightly contacted
   C. slightly open
   D. open
      1. simply open (saṁprasāraṇa)
      2. more open (guṇa)
      3. most open (vr̥ddhi)
III. voicing [±]
IV. aspiration [±]
V. nasalization [±]
VI. length
   A. half
   B. short
   C. slightly long
   D. long
   E. protracted 3
   F. protracted 4+
VII. underlying pitch
   A. none
   B. high
   C. low
   D. circumflex
VIII. surface tone
   A. extra low
   B. low
   C. high
   D. extra high


A.2 Sounds categorized by Āpiśali

TABLE 2 shows the structure of phonetic features described by the an-
cient Indian phonetician Āpiśali. Most conspicuously, Āpiśali explicitly
describes the active articulators of sounds (II), anticipating the approach
adopted by the contemporary phonologist Morris Halle. Āpiśali char-
acterizes nasals by including a nasal place of articulation ([I]G) and in-
cludes a full set of stricture distinctions including five degrees of open-
ness ([III]A4). The extrabuccal features that are associated with the glot-
tis ([III]B1) imply particular features of the larynx ([III]B2), which in
turn imply voice features ([III]B3). Implications are represented by right
arrows (→). To the right of each feature in parentheses are shown the
phonetic segments to which the feature belongs. Āpiśali attributes the
feature dorsolingual only to the jihvāmūlīya ([I]B), while Śaunaka associates
it with several sounds (TABLE 3 [I]B).

Notes:

1. ṅ ñ ṇ n m have a secondary place of articulation in the nose.
2. e ai gutturo-palatal.
3. o au gutturo-labial.
4. v dento-labial.
5. ĕ ŏ in Sātyamugri and Rāṇāyanīya Sāmaveda (ĀŚ. 6.9).
6. l̥̄ in imitation of proper names (ĀŚ. 6.6).


TABLE 4: Sounds categorized using phonetic features of Halle et al.

I. articulators
   A. Larynx (Glottis)
      1. [glottal] (h ḥ)
      2. [constricted glottis] (not used)
      3. [spread glottis] (aspirates: kh gh ch jh ṭh ḍh th dh ph bh;
         spirants: ś ṣ s h ḥ ẖ h̬ ṁ)
      4. [stiff vocal folds] (high-pitched and circumflexed vowels;
         unvoiced consonants: k kh c ch ṭ ṭh t th p ph ś ṣ s ḥ ẖ h̬)
      5. [slack vocal folds] (low-pitched and circumflexed vowels;
         voiced consonants: g gh j jh ḍ ḍh d dh b bh ṅ ñ ṇ n m h ṁ
         ḷ ḷh l l̃ r y ỹ v ṽ)
   B. Tongue Root (not used)
      1. [radical]
      2. [retracted tongue root]
      3. [advanced tongue root]
   C. Soft Palate
      1. [rhinal] (anusvāra, yamas, nāsikya: ṁ k̃ k̃h g̃ g̃h h̃)
      2. [nasal] (anusvāra, yamas, nāsikya: ṁ k̃ k̃h g̃ g̃h h̃;
         nasal stop, vowels, semivowels: ṅ ñ ṇ n m ã ĩ ũ r̥̃ l̥̃ ẽ aĩ õ aũ ỹ ṽ l̃)
   D. Tongue Body
      1. [dorsal] (k kh g gh ṅ ẖ; vowels: a i u e o ai au)
      2. [back] (a u o)
      3. [high] (i u)
      4. [low] (a)
   E. Tongue Blade
      1. [coronal] (c ch j jh ñ ṭ ṭh ḍ ḷ ḍh ḷh ṇ t th d dh n r l l̃ y ỹ ś ṣ s)
      2. [+ anterior] (t th d dh n l l̃ s)
      3. [− anterior]
         a. [+ distributed] (c ch j jh ñ y ỹ ś)
         b. [− distributed] (ṭ ṭh ḍ ḷ ḍh ḷh ṇ r ṣ)
   F. Lips
      1. [labial] (u o au p ph b bh m v ṽ h̬)
      2. [rounded] (u o au v ṽ)
II. articulator-free features
   A. [+ consonantal] (cavity)
      1. [+ sonorant] (no pressure)
         a. [+ lateral] (lateral resonants: ḷ ḷh l l̃)
         b. [− lateral] (nasal stops: ṅ ñ ṇ n m; approximant: r)
      2. [− sonorant] (pressure)
         a. [+ continuant] (spirants: ẖ ś ṣ s h̬)
         b. [− continuant] (non-nasal stops)
      3. [suction] (not used)
      4. [strident] (not used)
   B. [− consonantal] (no cavity) (glides: y ỹ v ṽ; vowels; h ḥ ṁ)


A.5 Sanskrit phonetics

TABLE 5 shows Sanskrit phonetic segments categorized according to the
features in TABLE 1. Place of articulation features appear in the leftmost
column. Stricture appears in the third row of headings with subcategories
of vowel stricture in the fourth row. The subcategories of vowel
stricture serve to distinguish vowel grades termed samprasāraṇa, guṇa,
and vr̥ddhi in Pāṇinian grammar. The fifth row of headings shows voicing,
while the sixth row shows aspiration and nasalization of consonants,
as well as length of vowels. Pitch is not shown. Less common segments
are discussed in the notes.2−5,8 Unusual is the placement of h with
semivowels,6 and the placement of anusvāra with the velars.7

Notes:

1. The diphthongs ai and au have, and the monophthongs e and o are
considered to have, two places of pronunciation: (i) the glottis, (ii)
the palate or lips.
2. Vowels include prolonged lengths called pluta; three pitches udātta,
anudātta, svarita; and nasalized variants.
3. Semivowels y, l, v include nasal variants ỹ, l̃, ṽ.
4. Short vowels ĕ and ŏ occur in Vedic recitation and in phonetic treatises.
5. Slightly lengthened short vowels occur in certain traditions of the
recitation of the Vājasaneyisaṁhitā.
6. With partial stricture and voicing, h shares features with buccal semivowels.
7. Anusvāra is a nasal glide with the velum as its primary articulator.
8. Unaspirated and aspirated retroflex lateral flaps written ळ, ḷ and ळ्ह, ḷh
occur intervocalically in R̥gvedic dialect (and in the Nirukta), instead
of ḍ and ḍh.

TABLE 5: Sanskrit phonetics

            CONSONANTS                                                      VOWELS¹,²
            stops                                  semivowels³  spirants
            contacted                              slightly     slightly    open
                                                   cont.        open        simply open       more open       most open
            UNVOICED        VOICED                 VOICED       UNVOICED    VOICED            VOICED          VOICED
            unasp.  asp.    unasp.  asp.    nasal                           short⁴,⁵  long    short   long    long

GUTTURAL                                           ह् h⁶        ः ḥ                           अ a             आ ā
VELAR       क् k    ख् kh   ग् g    घ् gh   ङ् ṅ   ं ṁ⁷         ᳵ ẖ
PALATAL     च् c    छ् ch   ज् j    झ् jh   ञ् ñ   य् y         श् ś        इ i       ई ī             ए e     ऐ ai
RETROFLEX⁸  ट् ṭ    ठ् ṭh   ड् ḍ    ढ् ḍh   ण् ṇ   र् r         ष् ṣ        ऋ r̥       ॠ r̥̄
DENTAL      त् t    थ् th   द् d    ध् dh   न् n   ल् l         स् s        ऌ l̥       ॡ l̥̄
LABIAL      प् p    फ् ph   ब् b    भ् bh   म् m   व् v         ᳶ ḫ         उ u       ऊ ū             ओ o     औ au

TABLE 7 shows Sanskrit phonetic segments categorized according to the
phonetic features described by the ancient Indian linguist Śaunaka and
shown in TABLE 3. Place of articulation features appear in the leftmost
column. Stricture appears in the third row of headings. The fourth row of
headings shows voicing, and the fifth row shows aspiration and nasalization
of consonants, as well as length of vowels. Although TABLE 3 does not note
it, Śaunaka distinguishes fused complex vowels from diphthongs, as shown
in the sixth row of headings. Glottal aperture, vocal fold disposition,
material, and pitch described in TABLE 3 are not shown. Less common
segments are discussed in notes 1–5. Noteworthy is the placement of
anusvāra (ṁ) with spirants.

Notes:

1. Vowels include prolonged lengths called pluta; three pitches udātta,
anudātta, svarita; and nasalized variants.
2. Semivowels y, l, v include nasal variants ỹ, l̃, ṽ.
3. Four additional nasals k̃, k̃h, g̃, and g̃h, called yama, occur instead
of non-nasal stops before nasals. A nasal fricative h̃ occurs after h
before ṇ, n, m.
4. Unaspirated and aspirated retroflex lateral flaps, written ळ् ḷ and ळ्ह् ḷh,
occur intervocalically instead of ḍ and ḍh, according to Vedamitra
(1.51).
5. Anusvāra is lengthened by ¼ mora to ¾ mora after short vowels and
is shortened by ¼ mora to ¼ mora after long vowels.

TABLE 7: Sanskrit phonetics according to Śaunaka

            CONSONANTS                                                                  VOWELS¹
            stops                                  semivowels²  spirants                simple            complex
            incontinuously contacted               slightly     continuously open       continuously      continuously open
                                                   cont.                                open
            UNVOICED        VOICED                 VOICED       UNVD.       VD.         VOICED            VOICED
            unasp.  asp.    unasp.  asp.    nasal³                                      short     long    long
                                                                                                          fused   diphthong

GUTTURAL                                                        ः ḥ         ह् h        अ a       आ ā
VELAR       क् k    ख् kh   ग् g    घ् gh   ङ् ṅ                ᳵ ẖ
PALATAL     च् c    छ् ch   ज् j    झ् jh   ञ् ñ   य् y         श् ś                    इ i       ई ī     ए e     ऐ ai
RETROFLEX⁴  ट् ṭ    ठ् ṭh   ड् ḍ    ढ् ḍh   ण् ṇ   र् r         ष् ṣ                    ऋ r̥       ॠ r̥̄
DENTAL      त् t    थ् th   द् d    ध् dh   न् n   ल् l         स् s                    ऌ l̥       ॡ l̥̄
LABIAL      प् p    फ् ph   ब् b    भ् bh   म् m   व् v         ᳶ ḫ                     उ u       ऊ ū     ओ o     औ au
NASAL                                                                       ं ṁ⁵

A.8 Sanskrit phonemics

TABLE 8 shows Sanskrit phonemes according to traditional strict definitions
of the concept of a phoneme. The table redisplays the Sanskrit phonetic
segments shown in TABLE 5, setting phonemes in black and sounds that
occur only as allophones in gray. The latter and the marginal phonemes
anusvāra and visarga are discussed in notes 1–4.

Notes:

1. Visarga, an allophone of s in pausa, becomes a phonetic variant of
jihvāmūlīya and upadhmānīya before unvoiced velar and labial stops and
of sibilants before the same sibilant. It contrasts with s before k, p;
e.g. paspaśa : antaḥpura, paraspara : saraḥpadma, antaḥkaraṇa :
uraska.
2. Anusvāra, generally an allophone of morpheme-final m before a semivowel
or spirant, and word-final before a non-labial stop, is a phonetic
variant of m before a labial stop. It contrasts with m in samrāṭ, samyak,
amlāna, āmreḍita.
3. Jihvāmūlīya and upadhmānīya are allophones of s word-finally before
unvoiced velar and labial stops, respectively.
4. The palatal nasal is an allophone of n before a palatal stop and is an
allophone of m and phonetic variant of anusvāra in the same context.

TABLE 8: Sanskrit phonemics

            CONSONANTS                                                      VOWELS
            stops                                  semivowels   spirants
            contacted                              slightly     slightly    open
                                                   cont.        open        simply open       more open       most open
            UNVOICED        VOICED                 VOICED       UNVOICED    VOICED            VOICED          VOICED
            unasp.  asp.    unasp.  asp.    nasal                           short     long    short   long    long

GUTTURAL                                           ह् h         ः ḥ¹                          अ a             आ ā
VELAR       क् k    ख् kh   ग् g    घ् gh   ङ् ṅ   ं ṁ²         ᳵ ẖ³
PALATAL     च् c    छ् ch   ज् j    झ् jh   ञ् ñ⁴  य् y         श् ś        इ i       ई ī             ए e     ऐ ai
RETROFLEX   ट् ṭ    ठ् ṭh   ड् ḍ    ढ् ḍh   ण् ṇ   र् r         ष् ṣ        ऋ r̥       ॠ r̥̄
DENTAL      त् t    थ् th   द् d    ध् dh   न् n   ल् l         स् s        ऌ l̥       ॡ l̥̄
LABIAL      प् p    फ् ph   ब् b    भ् bh   म् m   व् v         ᳶ ḫ³        उ u       ऊ ū             ओ o     औ au

A.9 Sanskrit sounds derived from PIE by Burrow

TABLE 9 redisplays the headings and arrangement of sounds given in
TABLE 5 and shows the Proto-Indo-European reconstruction of each
Sanskrit sound in the place the Sanskrit sound occupies in TABLE 5.
The derivations follow those given in Burrow (1955); for Burrow’s re-
construction of PIE phonology see TABLE 10.

Notes:

1. Some voiced aspirates may perhaps be derived from a voiced unaspirated
stop + H (ibid., 72).
2. The symbol Xʰ stands for the voiced aspirated stops gʷʰ, ǵʰ, dʰ, bʰ.
3. Labiovelars become palatal before H₁e, eH₁, i, iH; otherwise they
become velar (Burrow, 1955, 74–76).
4. Dental stops become retroflex after ṣ or together with preceding l
(ibid., 96–99).
5. s → ṣ after i u r/r̥ k except before r/r̥ (ibid., 80).
6. /b/ is rare or non-existent in PIE. Sanskrit b may arise from voicing
of p; a special instance is voicing caused by a laryngeal, thus Skt.
pibati ‘drinks’ < *pi-pH₃-eti (ibid., 72–73).
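
The rule in note 5 is context-sensitive, and a small sketch may make its conditions easier to check. The Python below is an illustration only, not part of Burrow's account: the function name `apply_ruki`, the representation of a word as a list of segment strings, and the restriction to the immediately adjacent segments are all assumptions of this sketch.

```python
# Hypothetical sketch of note 5: s becomes ṣ after i, u, r/r̥, k,
# except when r or r̥ immediately follows.
TRIGGERS = {"i", "u", "r", "r̥", "k"}   # segments after which s is retroflexed
BLOCKERS = {"r", "r̥"}                  # a following r/r̥ blocks the change


def apply_ruki(segments):
    """Return a copy of `segments` with s replaced by ṣ where note 5 applies."""
    result = []
    for index, segment in enumerate(segments):
        previous = segments[index - 1] if index > 0 else None
        following = segments[index + 1] if index + 1 < len(segments) else None
        if segment == "s" and previous in TRIGGERS and following not in BLOCKERS:
            result.append("ṣ")
        else:
            result.append(segment)
    return result


print("".join(apply_ruki(["v", "i", "s", "a"])))            # s after i
print("".join(apply_ruki(["t", "i", "s", "r", "a", "s"])))  # s before r is kept
```

Contexts are read from the input list rather than from the partially rewritten output, so one application of the rule never feeds another.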

TABLE 9: Derivation of Sanskrit sounds from Proto-Indo-European phonemes according to Burrow

            CONSONANTS                                                        VOWELS
            stops                                     semivowels  spirants
            contacted                                 slightly    slightly    open
                                                      cont.       open        simply open     more open       most open
            UNVOICED         VOICED                   VOICED      UNVOICED    VOICED          VOICED          VOICED
            unasp.  asp.     unasp.    asp.¹    nasal                         short   long    short   long    long

GUTTURAL                                              Xʰ²         s                           He              eH
VELAR³      kʷ      kʷH      gʷ        gʷʰ      n     n,m         s
PALATAL³    kʷ,ḱ    kʷH,ḱH   gʷ,ǵ      gʷʰ,ǵʰ   n     y           ḱ           y       yH              Hey     eHy
RETROFLEX⁴  t       tH       d         dʰ       n     r,l         ḱ,s⁵        r       rH
DENTAL      t       tH       d         dʰ       n     r,l         ḱ,s         l       lH
LABIAL      p       pH       b,p,pH⁶   bʰ       m     w           s           w       wH              Hew     eHw

A.10 PIE phonemics according to Burrow

TABLE 10 shows the Proto-Indo-European phonological system as reconstructed
by Burrow (1955). Burrow’s exposition is less than pellucid,
and his introductory lists of PIE sounds with reflexes in various daughter
languages are misleading, since he later vigorously argues against the
rationale for a considerable number of these sounds. In part, he is reacting
to Edgerton (1946). Burk (1976, 15–18) takes Burrow’s tables at face
value, and attributes to Burrow a Brugmannesque reconstruction of the
consonant system together with 26 (!) vowels and diphthongs.
Notes:
1. Burrow also reconstructs a so-called “laryngeal” H, of unspecified
phonetic value (Burrow, 1955, 85–89). He further describes a
three-laryngeal theory with H₁, H₂, and H₃ but notes that “the laryngeal
theory has not yet acquired a completely satisfactory form” (ibid.,
108). He denies that “H in any of its varieties could function as a
vowel” (ibid., 107).
2. Burrow dismisses as “without serious foundation” (ibid., 82) the re-
construction of fricatives þ and ð. On p. 67 he notes a velar nasal and
z but does not discuss these further.
3. Voiceless aspirated stops are not shown, since Burrow reconstructs
these uniformly from voiceless stop + H (ibid., 71–73).
4. Burrow observes that, since the development of the laryngeal theory,
the only “purely . . . vocalic element” is /e/ (ibid., 108). /a/ and /o/
are to be explained either through qualitative alteration or by the ac-
tion of H. /i/ and /u/ as well as the syllabic nasals and liquids are
allophones of the respective consonant phonemes. Long vowels re-
sult uniformly from vowel + H. For a notably lapidary criticism of
similar reconstructions, see Velten (1956).
5. The nasal stops and sonorants have syllabic (vocalic) allophones /n̥
m̥ r̥ l̥ i u/ (ibid., 108).
6. Velar stops are shown in gray, since Burrow regards it as “exceed-
ingly doubtful whether three distinct series [i. e. palatal, velar, labiove-
lar] existed in Indo-European” (ibid., 76).

TABLE 10: Proto-Indo-European phonemics according to Burrow

            CONSONANTS¹,²,³                                              VOWELS⁴
            stops                           liquids⁵  glides⁵  spirants
            UNVD.   VOICED                  VOICED             UNVD.     VOICED
            unasp.  unasp.  asp.    nasal⁵

LABIOVELAR  kʷ      gʷ      gʷʰ
VELAR⁶      k       g       gʰ                        w
PALATAL     ḱ       ǵ       ǵʰ              r         y                  e
DENTAL      t       d       dʰ      n       l                  s
LABIAL      p       b       bʰ      m

A.11 PIE phonemics according to Szemerényi

TABLE 11 shows the Proto-Indo-European phonological system as re-
constructed by Szemerényi (1967). This is Szemerényi’s proposed “new
look” for Indo-European, that is “the linguistic stage which can be recon-
structed from the data of the IE languages as their immediate antecedent”
(Szemerényi, 1967, 96 n. 90). The primary differences between this re-
construction and Burrow’s are as follows: (1) a system of four, rather
than three, types of stops is posited in each series of stops; (2) a sepa-
rate series of “palatal” stops is introduced; (3) only a single laryngeal is
given, and it is identified as /h/, a glottal spirant; (4) there are five basic
vowel phonemes, which occur both short (/a e o i u/) and long (/ā ē ō ī
ū/), and also a schwa.
It is worth noting that this analysis resembles, along broad lines, that
of Brugmann (1906–1916).

TABLE 11: Proto-Indo-European phonemics according to Szemerényi

            CONSONANTS                                                          VOWELS¹
            stops                                   liquids  glides  spirants   low   mid    high
            UNVOICED        VOICED                  VOICED           UNVD.      VOICED
            unasp.  asp.    unasp.  asp.    nasal

GLOTTAL                                                              h
LABIOVELAR  kʷ      kʷʰ     gʷ      gʷʰ
VELAR       k       kʰ      g       gʰ                                          a
PALATAL     ḱ       ḱʰ      ǵ       ǵʰ              r        y                        ə e    i
DENTAL      t       tʰ      d       dʰ      n       l                s
LABIAL      p       pʰ      b       bʰ      m                w                        o      u

A.12 Feature tree after Halle

TABLE 12 shows the feature geometry proposed by Halle (1995). In fa-
vor of the interpretation of phonological features as organized in a tree,
rather than constituting an unordered list, are the facts that (1) only a
substantially restricted combination of features is ever used in phono-
logical rules, and (2) sets of features used in phonological rules share a
designated articulator.

2 + + − − − − + + + + − − + + + + + + − + − + + + − − − − + + + − + + + + + + + + − + + + + + − − −

3 − − + + + + − − − − − − − − − − − − + − − − + − + + + + − − + + − − − − − − − + − − − − − − + − −

4 − − − − − − − − − − − − − − − + + − − − − − − − − − − − + − − − − − − − − + + − + − − − − + − − −

5 − − + + − − − − − − + + − − − + − − − − − − + − − − − − + − − + − − − − − − − − + − − + − + − − −
6 − − − − − − + + − − − − − − − − − − − + − − + − − − − − − − − − − − − − − + + − − − − − − + − − −

7 − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − + − − − + + − − − + − − − − + − − − −

8 − − − − − − − − − − + − − − + + − − − − + − − − − − − − − − − − − − − − + − − − − − + − − − − − −

9 − + − − − − + + − − − − + + − − − − − − − − − − − − − − + − − − − − − − − − − − − − − − − − − − −

10 − − − − − + − − − − − − − − + − − − − − − − − − − − − − − − − − − − − + − − − − − − − − − − − − −

11 − − − − − − − − − − − − − − − − − − − − − − − − − − − − − + − − − + − − + − − − − − − − + − − − −

12 − − − + − − + + + + − − − − − − − − − + − + − + − − − − − − − − − − − − − − − − − − − − − − − − −

13 − − − − − − − − − − − − − − − − − − − − − − − − + + − + − − − − − − − − − − − − − − − − − − + − −

14 − − − − − − − − − − − − − − − − − − − − − − − − − − − + − − + − − − − − − − − − − − − + − − − − −

15 − − + + − − − − − − − − − − − − − − + − − − + − − − + − − − − − − − − − − − − − − − − − − − − − −

16 − − − − − − − − + + − − − − − − − + − − + − − − − − − − − − − − + − − − − − − − − − − − − − − − −

17 + + + − − − + + − − − − + + − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − −

− − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − −
18 + + +

19 − − − − + + − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − + − − −

20 − − − − − − − − − − + + + − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − − −

21 − − − − − − − − − − − − − − − − − − + − − − − − − − − − − − − − − − − − − − − − − − − − − − − + +

Appendix B

Sanskrit Library Phonetic Basic

The Sanskrit Library Phonetic Basic encoding scheme (SLP1) attempts
to meet high standards of unambiguous encoding while restricting encoding
to 76 codepoints in the ASCII character set. SLP1 utilizes 58 codepoints
to encode segments: 53 to represent phonetic segments and five
to represent punctuation ⟨’ . ? - ⟩. In addition, SLP1 utilizes 18
codepoints to encode phonetic features: three to indicate stricture, six to
indicate length, eight to indicate tone, and one to indicate nasalization.
Although certain features are indicated by a sequence of codepoints, no
codepoints double as both segments and features. While useful, SLP1
is not an ideal encoding. To its credit, it consistently encodes phonetic
rather than graphic elements (with the exception of the punctuation
signs). Yet it does not maintain a consistent basis of encoding because
it mixes the encoding of phonetic segments and phonetic features. Nor
does it satisfy the Fano condition because it utilizes a few codepoints
as prefixes in code sequences. For example, the forward slash ⟨/⟩, back
slash ⟨\⟩, and caret ⟨^⟩ indicate udātta, anudātta, and independent
svarita accents by themselves but also serve as the prefixes in several
sequences that indicate particular tones and tonal sequences realized in
various Vedic traditions; and the digit ⟨1⟩, which by itself indicates
short length, is used as a prefix in a sequence that serves to indicate
length of 1½ morae. Nevertheless, single codepoints capture most
phonetic segments commonly used in classical Sanskrit. The only commonly
occurring phonetic segment that requires a sequence is nasalized
l, i.e. ⟨l~⟩. Moreover, SLP1 does clearly define single codepoints or
code sequences to capture a comprehensive set of phonetic distinctions
in classical and Vedic Sanskrit.
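
The failure of the Fano (prefix) condition described above can be checked mechanically. The following Python sketch is offered only as an illustration: the helper name `prefix_conflicts` is hypothetical, the list of codes is taken from the SLP1 column of Appendix C rather than from a complete inventory, and reading `1#` as the 1½-mora sequence is an assumption.

```python
def prefix_conflicts(codes):
    """Return (shorter, longer) pairs in which one code is a proper prefix of another."""
    unique = sorted(set(codes))
    return [(a, b) for a in unique for b in unique if a != b and b.startswith(a)]


# Codepoint budget stated in the text: segments plus features.
segment_codepoints = 53 + 5          # phonetic segments + punctuation
feature_codepoints = 3 + 6 + 8 + 1   # stricture + length + tone + nasalization
assert segment_codepoints + feature_codepoints == 76

# Accent and length codes as they appear in the SLP1 column of Appendix C.
# Treating "1#" as the 1½-mora sequence is an assumption of this sketch.
codes = ["/", "/8", "\\", "\\7", "\\6", "^", "^98", "^97", "^87", "^86", "1", "1#"]
conflicts = prefix_conflicts(codes)

for shorter, longer in conflicts:
    print(f"{shorter!r} is a prefix of {longer!r}")
```

A code set that satisfied the Fano condition would yield an empty list; here every single-codepoint accent and the digit 1 are reported as prefixes of longer sequences.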

B.1 Basic Segments

अ a     आ ā     इ i     ई ī     उ u     ऊ ū
a       A       i       I       u       U
ऋ r̥     ॠ r̥̄     ऌ l̥     ॡ l̥̄
f       F       x       X
ए e     ऐ ai    ओ o     औ au
e       E       o       O
क् k    ख् kh   ग् g    घ् gh   ङ् ṅ
k       K       g       G       N
च् c    छ् ch   ज् j    झ् jh   ञ् ñ
c       C       j       J       Y
ट् ṭ    ठ् ṭh   ड् ḍ    ढ् ḍh   ण् ṇ
w       W       q       Q       R
ळ् ḷ    ळ्ह् ḷh
L       |
त् t    थ् th   द् d    ध् dh   न् n
t       T       d       D       n
प् p    फ् ph   ब् b    भ् bh   म् m
p       P       b       B       m
य् y    र् r    ल् l    व् v
y       r       l       v
श् ś    ष् ṣ    स् s    ह् h
S       z       s       h
ः ḥ     ᳵ ẖ     ᳶ ḫ     ं ṁ
H       Z       V       M
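
The correspondences in the table above can be written down as a lookup table. The sketch below is a minimal illustration under stated assumptions: the names `SLP1_TO_ROMAN` and `slp1_to_roman` are hypothetical, only the 53 basic segments are covered, and modifier codepoints (length, accent, nasalization, stricture) are simply passed through unchanged.

```python
# Basic SLP1 segment codepoints and their Roman equivalents, as listed in B.1.
SLP1_TO_ROMAN = {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "r̥", "F": "r̥̄", "x": "l̥", "X": "l̥̄",
    "e": "e", "E": "ai", "o": "o", "O": "au",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "L": "ḷ", "|": "ḷh",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
    "H": "ḥ", "Z": "ẖ", "V": "ḫ", "M": "ṁ",
}


def slp1_to_roman(text):
    """Transliterate basic SLP1 segments; unknown characters are left as they are."""
    return "".join(SLP1_TO_ROMAN.get(character, character) for character in text)


print(slp1_to_roman("Darma"))
print(slp1_to_roman("saMskfta"))
```

Because each basic segment occupies exactly one codepoint, the conversion needs no lookahead; this is the practical benefit of the one-codepoint-per-segment design for the basic inventory.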

3 prolonged length of three morae [used for pluta vowels]
4 prolonged length of four or more morae [used in raṅga]

B.3.3 Accent

/ high pitch
\ low pitch
^ circumflex
6 extra low tone
7 low tone
8 high tone
9 extra high tone
+ sharpness

B.3.4 Nasalization

~ nasalization

B.4 Modifier combinations and usage notes

B.4.1 Stricture

y_ heavy y
v_ heavy v
y= light y
v= light v
k! unreleased (abhinidhāna) k
g! unreleased (abhinidhāna) g
. . . similarly for other unreleased stops
y! unreleased (abhinidhāna) y
v! unreleased (abhinidhāna) v
l! unreleased (abhinidhāna) l
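
Since no codepoint doubles as both a segment and a feature, an SLP1 string can be divided into segments with their trailing modifiers by a single left-to-right pass. The Python sketch below illustrates this under stated assumptions: the function name `split_segments` is hypothetical, and the set of length modifiers (in particular `#` and `*`) is inferred from the sequences that appear in these appendices rather than taken from a complete list.

```python
# Modifier codepoints, grouped as in B.3. The length set is partly inferred
# from sequences such as "i1#" and "i*" and is an assumption of this sketch.
STRICTURE = set("_=!")
LENGTH = set("1234#*")
ACCENT = set("/\\^6789+")
NASALIZATION = {"~"}
MODIFIERS = STRICTURE | LENGTH | ACCENT | NASALIZATION


def split_segments(text):
    """Group each segment codepoint with the modifier codepoints that follow it."""
    groups = []
    for character in text:
        if character in MODIFIERS and groups:
            groups[-1] += character   # attach the modifier to the preceding segment
        else:
            groups.append(character)  # start a new segment
    return groups


print(split_segments("agni/m"))
print(split_segments("y_a"))
```

The pass never has to backtrack: whether a character starts a new group or extends the current one depends only on the character itself.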

Anusvāra

M# short anusvāra (which follows a long vowel according to the R̥k and
Vājasaneyi Prātiśākhyas: R̥Pr. 13.22, 13.29, 13.32–33; VPr. 4.148–149;
the short anusvāra measures half a mora while the preceding vowel
measures 1.5 morae)
M1# long anusvāra (which follows a short vowel according to the R̥k and
Vājasaneyi Prātiśākhyas; the long anusvāra measures 1.5 morae while the
preceding vowel measures 0.5 morae)
M1 heavy anusvāra (which is usually called guru, and by some also
hrasva, and which occurs before a conjunct consonant according to the
Śikṣās)
M2 two-mora anusvāra (which is called dvimātra and occurs before a
consonant followed by r̥ according to the Śikṣās)
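
The two Prātiśākhya cases above pair a vowel and an anusvāra whose durations add up to the same total. The following Python lines are only a worked check of that arithmetic; the names `DURATIONS` and `combined_length` are hypothetical, and the durations are those stated in the two entries above, expressed as fractions of a mora.

```python
from fractions import Fraction

# (preceding vowel, anusvāra) durations in morae, as stated for M# and M1#.
DURATIONS = {
    "M#": (Fraction(3, 2), Fraction(1, 2)),    # short anusvāra after a long vowel
    "M1#": (Fraction(1, 2), Fraction(3, 2)),   # long anusvāra after a short vowel
}


def combined_length(code):
    """Return the total duration of the vowel plus the anusvāra, in morae."""
    vowel, anusvara = DURATIONS[code]
    return vowel + anusvara


for code in DURATIONS:
    print(code, combined_length(code))
```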

Raṅga

2~ two-mora raṅga (vowel two mātras in length, nasalized for the last
half mātra, with kampa in the middle, according to Pāṇinīyaśikṣā 26–30)
4~ raṅga (nasalized vowel four mātras in length followed by a break
according to Mallaśarmakr̥taśikṣā; texts show a double danda to mark the break

Appendix C

Sanskrit Library Phonetic Segmental

The Sanskrit Library Phonetic Segmental encoding scheme (SLP2) adheres
to the most rigorous standards of unambiguous encoding described
in Chapter 4. It utilizes a consistent basis for encoding, namely broadly
defined phonemes, and it creates a one-to-one correspondence between
codepoints and items encoded. In terms of the three axes of encoding,
SLP2 encodes phonetics rather than graphics, segments rather than features,
and contrastive rather than complementary units. It encodes Sanskrit
phonetic segments by assigning one codepoint to each phoneme
broadly defined, that is, to each segment that is minimally contrastive in
the sense concluded in sections 6.1.5 and 6.1.6.
In column 1 the unique codepoints of SLP2 are shown in hexadecimal
notation. In column 2 the equivalent encoding in SLP1 is given. In
columns 3 and 4 Devanāgarī and Roman representations are given. In
column 5 an IPA transcription of the encoded sound is given.
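
The vowel rows that survive in the table suggest a regular layout: each vowel occupies a block of 24 consecutive codepoints, ordered by accent, with the nasalized variant immediately after each plain one. The Python sketch below encodes that observation. It is an inference from the visible rows, not a rule stated by the authors; the names `ACCENT_SUFFIXES`, `BLOCK_BASE`, and `slp2_codepoint` are hypothetical, only blocks whose first row is visible are listed, and the four-mora blocks (for example `i4~` at 0E0), which contain nasalized entries only, do not follow this pattern.

```python
# Accent suffixes in the order in which they appear within each vowel block.
ACCENT_SUFFIXES = ["", "/", "\\", "^", "/8", "\\7", "\\6",
                   "^98", "^97", "^87", "^87+", "^86"]

# First codepoint of some vowel blocks, read from the SLP2 column of the table.
BLOCK_BASE = {
    "a": 0x000, "a3": 0x048, "i1#": 0x098, "I": 0x0B0,
    "u": 0x100, "e": 0x298, "o1": 0x380, "O3": 0x418,
}


def slp2_codepoint(vowel, accent="", nasal=False):
    """Return the inferred SLP2 codepoint for an SLP1 vowel, accent, and nasalization."""
    offset = 2 * ACCENT_SUFFIXES.index(accent) + (1 if nasal else 0)
    return BLOCK_BASE[vowel] + offset


# SLP1 "a^87+~" corresponds to SLP2 015 in the table.
print(f"{slp2_codepoint('a', '^87+', nasal=True):03X}")
```

If the pattern holds throughout, an SLP1 vowel sequence can be converted to its SLP2 codepoint by arithmetic alone, which reflects the one-to-one correspondence between codepoints and items that SLP2 is designed to provide.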

Devanāgarī
Often several options are given for the marking of Vedic accentuation in
Devanāgarī, including those used in the following traditions:

SLP2 SLP1 DEVANĀGARĪ ROMAN IPA

000 a A a 5
001 a~ A< ã 5̃
002 a/ A 1−4 / A 5−7 / A 8 á 5́
003 a/~ A< 1−4 / A< 5−7 / A< 8 ã´ 5̃´
004 a\ A! 1−5 / AÉ 6,7 / A 8 a/a 5̀
¯
005 a\~ A<! 1−5 / A<É 6,7 / A< 8 ã/ã 5̃`
¯
006 a^ A 1,3 / AÉï 2 / A 8 à 5̂
007 a^~ A< 1,3 / A<Éï 2 / A< 8 ã` 5̃ˆ
008 a/8 A 1−4 / A 5−7 5£
Ă

009 a/8~ A< 1−4 / A< 5−7 5̃ £

00A a\7 A! 1−4 / A 5−6 / AÍ 7 5Ă£

00B a\7~ A<! 1−4 / A< 5−6 / A<Í 7 5̃Ă£
00C a\6 A! 5 / AÉ 6,7 5Ă£
00D a\6~ A<! 5 / A<É 6,7 5̃Ă£
00E a^98 A 1−4 5£
Ą

00F a^98~ A< 1−4 Ą

5̃ £
010 a^97 A1 ! 1,4 5Ć£
011 a^97~ A< 1 ! 1,4 5̃Ć£
012 a^87 A 5 / AÍ 6,7 5Ą£
013 a^87~ A< 5 / A<Í 6,7 5̃Ą£
014 a^87+ A 5,6 / AÉï 7 5Ą£
015 a^87+~ A< 5,6 / A<Éï 7 5̃Ą£
016 a^86 3A! 5 / A 6 5Ć£

045 A^87+~ A;a< 5,6 / A;aÉï< 7 Ã:Ą£
046 A^86 3A;a! 5 / A;a 6 A:Ć£
047 A^86~ 3A;a<! 5 / A;a< 6 Ã:Ć£
048 a3 A3 a3 A::
049 a3~ A< 3 ã3 Ã::
04A a3/ A3 / A 3 5−7 / A 3 8
1−4
á3 Á::
04B a3/~ A< 3 1−4 / A< 3 5−7 / A< 3 8 ´
ã3 ´
Ã::
04C a3\ A! 3 1−7 / A 3 8 a3/a3 À::
¯
04D a3\~ A<! 3 1−7 / A< 3 8 ã3/ã3 `
Ã::
¯
04E a3^ A 3 1,3 / AÉï3 2 / A 3 8 à3 Â::
04F a3^~ A< 3 1,3 / A<Éï 3 2 / A< 3 8 `
ã3 ˆ
Ã::
050 a3/8 A3 1−4 / A 3 5−7 Ă
A:: £
051 a3/8~ A< 3 1−4 / A< 3 5−7 Ă
Ã:: £
052 a3\7 A! 3 1−4 / A3 5−7 A::Ă£
053 a3\7~ A<! 3 1−4 / A< 3 5−7 Ã::Ă£
054 a3\6 A! 3 5 / AÉ3 6,7 A::Ă£
055 a3\6~ A<! 3 5 / A<É 3 6,7 Ã::Ă£
056 a3^98 A 3 1−4 Ą
A:: £
057 a3^98~ A< 3 1−4 Ą
Ã:: £
058 a3^97 A! 3 ! 1,4 A::Ć£
059 a3^97~ A<! 3 ! 1,4 Ã::Ć£
05A a3^87 A 3 5 / AÍ3 6,7 A::Ą£
05B a3^87~ A< 3 5 / A<Í 3 6,7 Ã::Ą£

ı̃ £
090 i^97 I1 ! 1,4 iĆ£
091 i^97~ I< 1! 1,4 ı̃ Ć£
092 i^87 I 5 / IÍ 6,7 iĄ£
093 i^87~ I< 5 / I<Í 6,7 ı̃ Ą£
094 i^87+ I 5,6 / IÉï 7 iĄ£
095 i^87+~ I< 5,6 / I<Éï 7 ı̃ Ą£
096 i^86 3+I! 5 / I 6 iĆ£
097 i^86~ 3+I<! 5 / I< 6 ı̃ Ć£
098 i1# I i i;
099 i1#~ I< ı̃ ı̃;
09A i1#/ I 1−4 / I 5−7 / I 8 ı́ ı́;
09B i1#/~ I< 1−4 / I< 5−7 / I< 8 ´ı̃ ´ı̃;

09C i1#\ I! 1−5 / IÉ 6,7 / I 8 i/i

¯
ı̀;

09D i1#\~ I<! 1−5 / I<É 6,7 / I< 8 ı̃/ı̃
¯
`ı̃;

09E i1#^ I 1,3 / IÉï 2 / I 8 ı̀ ı̂;

09F i1#^~ I< 1,3 / I<Éï 2 / I< 8 `ı̃ ˆı̃;

0A0 i1#/8 I 1−4 / I 5−7 i; £

0A1 i1#/8~ I< 1−4 / I< 5−7 Ă

ı̃; £
0A2 i1#\7 I! 1−4 / I 5−6 / IÍ 7 i;Ă£
0A3 i1#\7~ I<! 1−4 / I< 5−6 / I<Í 7 ı̃;Ă£
0A4 i1#\6 I! 5 / IÉ 6,7 i;Ă£
0A5 i1#\6~ I<! 5 / I<É 6,7 ı̃;Ă£
0A6 i1#^98 I 1−4 Ą
i; £
0A7 i1#^98~ I< 1−4 Ą
ı̃; £
0A8 i1#^97 I1 ! 1,4 i;Ć£
0A9 i1#^97~ I< 1! 1,4 ı̃;Ć£
0AA i1#^87 I 5 / IÍ 6,7 i;Ą£
0AB i1#^87~ I< 5 / I<Í 6,7 ı̃;Ą£
0AC i1#^87+ I 5,6 / IÉï 7 i;Ą£
0AD i1#^87+~ I< 5,6 / I<Éï 7 ı̃;Ą£
0AE i1#^86 3+I! 5 / I 6 i;Ć£
0AF i1#^86~ 3+I<! 5 / I< 6 ı̃;Ć£
0B0 I IR ı̄ i:
0B1 I~ I_ ˜ı̄ ı̃:
0B2 I/ IR 1−4 / IR 5−7 / IR 8 ´ı̄ ı́:
0B3 I/~ I_ 1−4 / I_ 5−7 / I_ 8 ´˜ı̄ ´ı̃:

0CB i3/~ I< 3 1−4 / I< 3 5−7 / I< 3 8 ´ı̃3 ´ı̃::

0CC i3\ I! 3 1−7 / I 3 8 i3/i3

¯
ı̀::
0CD i3\~ I<! 3 1−7 / I< 3 8 ı̃3/ı̃3
¯
`ı̃::

0CE i3^ I 3 1,3 / IÉï3 2 / I 3 8 ı̀3 ı̂::

0CF i3^~ I< 3 1,3 / I<Éï 3 2 / I< 3 8 `ı̃3 ˆı̃::

0D0 i3/8 I3 1−4 / I 3 5−7 i:: £

0D1 i3/8~ I< 3 1−4 / I< 3 5−7 Ă

ı̃:: £
0D2 i3\7 I! 3 1−4 / I3 5−7 i::Ă£
0D3 i3\7~ I<! 3 1−4 / I< 3 5−7 ı̃::Ă£
0D4 i3\6 I! 3 5 / IÉ3 6,7 i::Ă£
0D5 i3\6~ I<! 3 5 / I<É 3 6,7 ı̃::Ă£
0D6 i3^98 I 3 1−4 Ą
i:: £
0D7 i3^98~ I< 3 1−4 Ą
ı̃:: £
0D8 i3^97 I! 3 ! 1,4 i::Ć£
0D9 i3^97~ I<! 3! 1,4 ı̃::Ć£
0DA i3^87 I 3 5 / IÍ3 6,7 i::Ą£
0DB i3^87~ I< 3 5 / I<Í 3 6,7 ı̃::Ą£
0DC i3^87+ I 3 5,6 / IÉï3 7 i::Ą£
0DD i3^87+~ I< 3 5,6 / I<Éï 3 7 ı̃::Ą£
0DE i3^86 3+I! 3 5 / I 3 6 i::Ć£
0DF i3^86~ 3+I<! 3 5 / I< 3 6 ı̃::Ć£
0E0 i4~ I< 4 ı̃4 ı̃:::
0E1 i4/~ I< 4 1−4 / I< 4 5−7 / I< 4 8 ´ı̃4 ´ı̃:::

0E2 i4\~ I<! 4 1−7 / I< 4 8 ı̃4/ı̃4
¯
`ı̃:::

0E3 i4^~ I< 4 1,3 / I<Éï 4 2 / I< 4 8 `ı̃4 ˆı̃:::

0E4 i4/8~ I< 4 1−4 / I< 4 5−7 Ă

ı̃::: £
0E5 i4\7~ I<! 4 1−4 / I< 4 5−7 ı̃:::Ă£
0E6 i4\6~ I<! 4 5 / I<É 4 6,7 ı̃:::Ă£
0E7 i4^98~ I< 4 1−4 Ą
ı̃::: £
0E8 i4^97~ ı̃:::Ć£
0E9 i4^87~ I< 4 5 / I<Í 4 6,7 ı̃:::Ą£
0EA i4^87+~ I< 4 5,6 / I<Éï 4 7 ı̃:::Ą£
0EB i4^86~ 3+I<! 4 5 / I< 4 6 ı̃:::Ć£
0EC i* i ı̆
100 u o u u
101 u~ o< ũ ũ
102 u/ o 1−4 / o 5−7 / o 8 ú ú
103 u/~ o< 1−4 / o< 5−7 / o< 8 ũ´ ũ´
104 u\ o! 1−5 / oÉ 6,7 / o 8 u/u ù
¯
105 u\~ o< ! 1−5 / o<É 6,7 / o< 8 ũ/ũ ũ`
¯
106 u^ o 1,3 / oÉï 2 / o 8 ù û
107 u^~ o< 1,3 / o<Éï 2 / o< 8 ũ` ũˆ
108 u/8 o 1−4 / o 5−7 u£
Ă

109 u/8~ o< 1−4 / o< 5−7 ũ £

10A u\7 o! 1−4 / o 5−6 / oÍ 7 uĂ£

10B u\7~ o< ! 1−4 / o< 5−6 / o<Í 7 ũĂ£

287 e1^~ O;<1 1,3 / O;Éï< 1 2 / O;<1 8 ĕ`˜ ẽˆ
288 e1/8 O;1 1−4 / O;1 5−7 e£
Ă

289 e1/8~ O;<1 1−4 / O;<1 5−7 ẽ £

28A e1\7 O;!1 1−4 / O;1 5−7 eĂ£

28B e1\7~ O;<!1 1−4 / O;<1 5−7 ẽĂ£
28C e1\6 O;!1 5 / O;É1 6,7 eĂ£
28D e1\6~ O;<!1 5 / O;<É1 6,7 ẽĂ£
28E e1^98 O;1 1−4 e£
Ą

28F e1^98~ O;<1 1−4 Ą

ẽ £
290 e1^97 O;1 ! 1,4 eĆ£
291 e1^97~ O;<1 ! 1,4 ẽĆ£
292 e1^87 O;1 5 / O;Í1 6,7 eĄ£
293 e1^87~ O; < 1 5 / O;Í < 1 6,7 ẽĄ£
294 e1^87+ O;1 5,6 / O;Éï1 7 eĄ£
295 e1^87+~ O;<1 5,6 / O;Éï< 1 7 ẽĄ£
296 e1^86 3:O!;1 5 / O;1 6 eĆ£
297 e1^86~ 3:O<!;1 5 / O;<1 6 ẽĆ£
298 e O; e e:
299 e~ O;< ẽ ẽ:
29A e/ O; 1−4 / O; 5−7 / O; 8 é é:
29B e/~ O;< 1−4 / O;< 5−7 / O;< 8 ẽ´ ´
ẽ:
29C e\ O;! 1−5 / O;É 6,7 / O; 8 e/e è:
¯
29D e\~ O;<! 1−5 / O;<É 6,7 / O;< 8 ẽ/ẽ `
ẽ:
¯

323 E3\7~ Oe<!;3 1−4 / Oe<;3 5−7 Aı̃

< :: £
Ă

324 E3\6 Oe!;3 5 / OeÉ;3 6,7 Ai Ă

< :: £

325 E3\6~ Oe<!;3 5 / Oe<;É 3 6,7 Aı̃ Ă
< :: £
326 E3^98 Oe;3 1−4 Ai
Ą
< :: £
327 E3^98~ Oe<; 3 1−4 Aı̃
Ą
< :: £
328 E3^97 Oe!;3 ! 1,4 Ai Ć
< :: £
329 E3^97~ Oe<!;3 ! 1,4 Aı̃ Ć
< :: £
32A E3^87 Oe ;3 5 / OeÍ ;3 6,7 Ai Ą
< :: £
32B E3^87~ Oe <;3 5 / OeÍ <;3 6,7 Aı̃ Ą
< :: £
32C E3^87+ Oe;3 5,6 / OeÉï;3 7 Ai Ą
< :: £
32D E3^87+~ Oe<;3 5,6 / OeÉï<;3 7 Aı̃ Ą
< :: £
32E E3^86 3:O!e;3 5 / Oe;3 6 Ai Ć
< :: £
32F E3^86~ 3:O<!e;3 5 / Oe<;3 6 Aı̃ Ć
< :: £
330 E4~ Oe<;4 aı̃4 Aı̃
< :::
331 E4/~ Oe<;4 1−4 / Oe<;4 5−7 / Oe<; 4 8 a´ı̃4 A<´ı̃:::
332 E4\~ Oe<!;4 1−7 / Oe<;4 8 aı̃4/aı̃4
¯
A<`ı̃:::
333 E4^~ Oe<; 4 1,3 / OeÉï<;4 2 / Oe<; 4 8 a`ı̃4 A<ˆı̃:::
334 E4/8~ Oe<;4 1−4 / Oe<; 4 5−7 Aı̃
< ::: £
Ă

335 E4\7~ Oe<!;4 1−4 / Oe<;4 5−7 Aı̃

< ::: £
Ă

336 E4\6~ Oe<!;4 5 / Oe<;É 4 6,7 Aı̃

< ::: £
Ă

337 E4^98~ Oe<;4 1−4 Aı̃

< ::: £
Ą

Aı̃ Ć
338 E4^97~ < ::: £
339 E4^87~ Oe <;4 5 / OeÍ <;4 6,7 Aı̃ Ą
< ::: £
33A E4^87+~ Oe<;4 5,6 / OeÉï<;4 7 Aı̃ Ą
< ::: £
33B E4^86~ 3:O<!e;4 5 / Oe<;4 6 Aı̃ Ć
< ::: £

380 o1 A;ea1 ŏ o
381 o1~ A;e<a1 ŏ˜ õ
382 o1/ A;ea1 1−4 / A;ea1 5−7 / A;ea1 8 ŏ´ ó
383 o1/~ A;e<a1 1−4 / A;e<a1 5−7 / A;e<a1 8 ŏ´˜ õ´
384 o1\ A;e!a1 1−7 / A;ea1 8 ŏ/ŏ ò
¯
385 o1\~ A;e<!a1 1−7 / A;e<a1 8 ˜ ŏ˜
ŏ/ õ`
¯
386 o1^ A;ea1 1,3 / A;eÉïa1 2 / A;ea1 8 ŏ` ô
387 o1^~ A;e<a1 1,3 / A;eÉï<a1 2 / A;e<a1 8 ŏ`˜ õˆ
388 o1/8 A;ea1 1−4 / A;ea1 5−7 o£
Ă

389 o1/8~ A;e<a1 1−4 / A;e<a1 5−7 õ £

38A o1\7 A;e!a1 1−4 / A;ea1 5−7 oĂ£

40F O^98~ A;E<a 1−4 Ą

5<Ũ: £
410 O^97 A;E!a3 ! 1,4 5U
<:£
Ć

411 O^97~ A;E<!a3 ! 1,4 5<Ũ:Ć£

412 O^87 A;E a 5 / A;EÍ a 6,7 5U
<:£
Ą

413 O^87~ A;E <a 5 / A;EÍ <a 6,7 5<Ũ:Ą£

414 O^87+ A;Ea 5,6 / A;EÉïa 7 5U
<:£
Ą

415 O^87+~ A;E<a 5,6 / A;EÉï<a 7 5<Ũ:Ą£

416 O^86 3A;Ea! 5 / A;Ea 6 5U
<:£
Ć
417 O^86~ 3A;Ea<! 5 / A;E<a 6 5<Ũ:Ć£
418 O3 A;Ea3 au3 Au
< ::
419 O3~ A;E<a3 aũ3 Aũ
< ::
41A O3/ A;Ea3 1−4 / A;Ea3 5−7 / A;Ea3 8 aú3 Aú
< ::
41B O3/~ A;E<a3 1−4 / A;E<a3 5−7 / A;E<a3 8 ´
aũ3 A<ũ´ ::
41C O3\ A;E!a3 1−7 / A;Ea3 8 au3/au3 Aù
< ::
¯
41D O3\~ A;E<!a3 1−7 / A;E<a3 8 aũ3/aũ3 A<ũ` ::
¯
41E O3^ A;Ea3 1,3 / A;EÉïa3 2 / A;Ea3 8 aù3 Aû
< ::

41F O3^~ A;E<a3 1,3 / A;EÉï<a3 2 / A;E<a3 8 `
aũ3 A<ũˆ ::
420 O3/8 A;Ea3 1−4 / A;Ea3 5−7 Au
< :: £
Ă

421 O3/8~ A;E<a3 1−4 / A;E<a3 5−7 Aũ

< :: £
Ă

422 O3\7 A;E!a3 1−4 / A;Ea3 5−7 Au

< :: £
Ă

423 O3\7~ A;E<!a3 1−4 / A;E<a3 5−7 Aũ

< :: £
Ă

424 O3\6 A;E!a3 5 / A;EÉa3 6,7 Au

< :: £
Ă
425 O3\6~ A;E<!a3 5 / A;E<Éa3 6,7 Aũ Ă
< :: £
426 O3^98 A;Ea3 1−4 Au
Ą
< :: £
427 O3^98~ A;E<a3 1−4 Aũ
Ą
< :: £
428 O3^97 A;E!a3 ! 1,4 Au Ć
< :: £
429 O3^97~ A;E<!a3 ! 1,4 Aũ Ć
< :: £
42A O3^87 A;E a3 5 / A;EÍ a3 6,7 Au Ą
< :: £
42B O3^87~ A;E <a3 5 / A;EÍ <a3 6,7 Aũ Ą
< :: £
42C O3^87+ A;Ea3 5,6 / A;EÉïa3 7 Au Ą
< :: £
42D O3^87+~ A;E<a3 5,6 / A;EÉï<a3 7 Aũ Ą
< :: £
42E O3^86 3A;Ea!3 5 / A;Ea3 6 Au Ć
< :: £
42F O3^86~ 3A;Ea<!3 5 / A;E<a3 6 Aũ Ć
< :: £
430 O4~ A;E<a4 aũ4 Aũ
< :::
431 O4/~ A;E<a4 1−4 / A;E<a4 5−7 / A;E<a4 8 ´
aũ4 A<ũ´ :::
432 O4\~ A;E<!a4 1−7 / A;E<a4 8 aũ4/aũ4 A<ũ` :::
¯
433 O4^~ A;E<a4 1,3 / A;EÉï<a4 2 / A;E<a4 8 `
aũ4 A<ũˆ :::
434 O4/8~ A;E<a4 1−4 / A;E<a4 5−7 Aũ
< ::: £
Ă

435 O4\7~ A;E<!a4 1−4 / A;E<a4 5−7 Aũ

< ::: £
Ă

436 O4\6~ A;E<!a4 5 / A;E<Éa4 6,7 Aũ Ă
< ::: £
437 O4^98~ A;E<a4 1−4 Aũ
Ą
< ::: £
Aũ Ć
438 O4^97~ < ::: £
439 O4^87~ A;E <a4 5 / A;EÍ <a4 6,7 Aũ Ą
< ::: £
43A O4^87+~ A;E<a4 5,6 / A;EÉï<a4 7 Aũ Ą
< ::: £
43B O4^86~ 3A;Ea<!4 5 / A;E<a4 6 Aũ Ć
< ::: £
480 k k, k k
481 k! k, k k^
482 k~ k, < k̃ kn
483 K K,a kh kh
484 K! K,a kh kh ^
485 K~ K,a< kh̃ khn
486 g g,a g g
487 g! g,a g g^
488 g~ g,a< g̃ gn
489 G ;G,a gh gh
48A G! ;G,a gh gh ^
48B G~ ;G,a< gh̃ ghn
48C N .z, ṅ N
48D N! .z, ṅ N^
48E c . c,a c c
48F c! . c,a c c^
490 c~ . c,a< c̃ cn
491 C C, ch ch

4C2 b~ b,a< b̃ bn
4C3 B B,a bh bh
4C4 B! B,a bh bh ^
4C5 B~ B,a< bh̃ bhn
4C6 m m,a m m
4C7 m! m,a m m^
4C8 y y,a y j
4C9 y_ y,,a y éJ
<
4CA y= y,a y

4CB y! y,a y j^
4CC y~ y< ,a ỹ j̃
4CD r .=, r õ
4CE l ,a l ”l
4CF l! ,a l ”l^
4D0 l~ < ,a l̃ ”l̃
4D1 v v,a v w
4D2 v_ ë+;aë ,Á v B
4D3 v= v,a v

4D4 v! v,a v w^
4D5 v~ v<.,a ṽ w̃
4D6 S Z,a ś ç
4D7 z :S,a s. ù
4D8 s .s,a s ”s
4D9 h h, h H

4DA h~ h,< h̃ Hn
4DB H H h. h
4DC H/8
4DD H\7
4DE H\6
4DF H^98
4E0 H^87
4E1 Z ẖ x
4E2 V ḫ ɸ
4E3 M ṁ
4E4 M#
4E5 M#/8
4E6 M#\7
4E7 M#\6
4E8 M1
4E9 M1/8
4EA M1\7
4EB M1\6
4EC M1#
4ED M1#/8
4EE M1#\7
4EF M1#\6
4F0 M2
4F1 M2/8
4F2 M2\7
4F3 M2\6
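
As a rough illustration of how the SLP2 and SLP1 columns of this table relate, the sketch below stores a handful of the rows (4E1 to 4E3 and 4F0 to 4F3) in a lookup table and converts in both directions. Reading the SLP2 column as hexadecimal codepoints is an assumption based on the digits A to F that appear in it, and the names `SLP2_TO_SLP1`, `SLP1_TO_SLP2`, `to_slp1`, and `to_slp2` are hypothetical, not part of any Sanskrit Library software.

```python
# Rows copied from the SLP2 and SLP1 columns of the table above.
# Assumption: the SLP2 column is a hexadecimal codepoint.
SLP2_TO_SLP1 = {
    0x4E1: "Z",
    0x4E2: "V",
    0x4E3: "M",
    0x4F0: "M2",
    0x4F1: "M2/8",
    0x4F2: r"M2\7",
    0x4F3: r"M2\6",
}

# Reverse lookup; valid only while the SLP1 strings are unique.
SLP1_TO_SLP2 = {slp1: code for code, slp1 in SLP2_TO_SLP1.items()}


def to_slp1(codepoints):
    """Join the SLP1 equivalents of a sequence of SLP2 codepoints."""
    return "".join(SLP2_TO_SLP1[cp] for cp in codepoints)


def to_slp2(slp1_units):
    """Map already-segmented SLP1 units back to SLP2 codepoints."""
    return [SLP1_TO_SLP2[unit] for unit in slp1_units]
```

Because one SLP2 codepoint can correspond to a multi-character SLP1 string, the reverse direction here takes units that have already been segmented; segmenting raw SLP1 text is outside the scope of this sketch.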


Appendix D

Sanskrit Library Phonetic Featural

The Sanskrit Library Phonetic Featural encoding scheme (SLP3) creates
a correspondence between codepoints numbered 1–242, selected SLP1
segments, and their features. Each SLP1 segment is associated with
nineteen features, each of which is assigned a value of plus, minus, or
neutral. The last applies if the feature is inapplicable to the segment
in question. In addition, true diphthongs are assigned pairs of featural
values, one for each of the two constituent sounds. The SLP3 encoding
is based upon phonetic features as described by Halle and shown in
Table 4. In terms of the three axes of encoding described in Chapter 4,
SLP3 encodes phonetics rather than graphics, and contrastive rather
than complementary units. Although it encodes segments, these are
explicitly associated with sets of features, each of which could be
assigned a codepoint. Each segment could then be associated with sets
of featural codepoints in a consistent and unambiguous featural
encoding scheme. We have chosen instead to represent SLP3 in terms of
phonetic segments associated with sets of phonetic features.
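
The scheme described above can be sketched as a small data structure: a segment carries an SLP3 codepoint, its SLP1 equivalent, and either one vector of nineteen ternary feature values or, for a true diphthong, a pair of such vectors. This is only a minimal sketch under stated assumptions. The class and function names (`Value`, `Segment`, `parse_vector`) are hypothetical, and the codepoints and feature values in the two example rows are invented for illustration rather than taken from the SLP3 table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

NUM_FEATURES = 19  # the appendix states nineteen features per segment


class Value(Enum):
    PLUS = "+"
    MINUS = "-"
    NEUTRAL = "0"  # feature inapplicable to the segment


FeatureVector = Tuple[Value, ...]


@dataclass(frozen=True)
class Segment:
    codepoint: int                       # SLP3 codepoint, 1-242
    slp1: str                            # equivalent SLP1 encoding
    features: Tuple[FeatureVector, ...]  # one vector, or two for a diphthong

    def __post_init__(self):
        if not 1 <= self.codepoint <= 242:
            raise ValueError("SLP3 codepoints are numbered 1-242")
        if len(self.features) not in (1, 2):
            raise ValueError("expected one vector, or two for a diphthong")
        for vec in self.features:
            if len(vec) != NUM_FEATURES:
                raise ValueError(f"expected {NUM_FEATURES} feature values")

    @property
    def is_diphthong(self) -> bool:
        return len(self.features) == 2


def parse_vector(text: str) -> FeatureVector:
    """Turn a string such as '+-0...' into a tuple of Value members."""
    return tuple(Value(ch) for ch in text)


# Hypothetical rows: codepoints and feature values are invented.
simple = Segment(1, "a", (parse_vector("+-0" * 6 + "+"),))
diph = Segment(2, "E", (parse_vector("+" * 19), parse_vector("-" * 19)))
```

Storing the features as positional tuples mirrors the tabular layout of one column per feature; a mapping keyed by feature abbreviation would be an equally reasonable choice once the abbreviations are fixed.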
In column 1 the unique codepoints of SLP3 are shown in decimal
notation. In column 2 the equivalent encoding in SLP1 is given. In
columns 3 through 21 the values for each of the nineteen features in
Halle’s system are given. Row four of the table header indicates
terminal features. Rows one through three of the header show higher
nodes in Halle’s feature tree as shown in Table 12. ‘GUTTRL’ stands
for GUTTURAL, ‘SPal’ and ‘spal’ for soft palate, and ‘tblade’ for
tongue blade. The abbreviations shown in columns 3–21 in row four of
the table header are given in the following table:

APPENDIX E: MALCOLM D. HYMAN

“From Research Challenges of the Humanities to the Epistemic Web
(Web 3.0)” (with J. Renn), NSF/JISC Digital Libraries Infrastructures,
Phoenix, April 17–19, 2007
“What is the Next Step? A Humanities Perspective” (with J. Renn),
Cyber-research Infrastructures and Data Management for Science
and Communities — an ESF/BOREAS Workshop, Paris, February
19–20, 2007
“A Computational Approach to Sanskrit Morphology and Phonology,”
World Sanskrit Conference, Edinburgh, July 10–14, 2006
“Software para realizar exposiciones virtuales” (with J. Damerow), Work-
shop: Ciencia y Cultura entre dos mundos, La Orotava, Tenerife,
May 31, 2006
“Towards a New Platform for Linguistic Analysis and Scholarly Anno-
tation,” Digital Philology: Problems and Perspectives, Universität
Hamburg, January 20, 2006
Co-chair of roundtable discussion “Comparative Literacies of the An-
cient World,” American Historical Society (participants: S. Hous-
ton, M. Hyman, D. Lurie, R. Salomon), January 5, 2006
“Semantic Networks in Ancient and Early Modern Mechanics Texts:
Development and Transformation,” SFB 644 Jahrestagung: Über-
setzung und Transformation, Humboldt-Universität, December 3,
2005
“Aristotle’s Theory of the Syllable,” ICHoLS, Champaign-Urbana, Sep-
tember 2, 2005
“Encoding Sanskrit Phonetics vs. Encoding Devanāgarı̄ Script,” De-
vanāgarı̄ OCR Workshop, Brown University, Providence, Rhode
Island, January 22–23, 2005
“The Challenges of the Humanities to the World Wide Web: Perspectives
from the Archimedes Project” (with M. Schiefsky), ALLC/ACH,
Göteborg, Sweden, June 11–16, 2004
“Interfaces for Parser and Dictionary Access,” invited speaker, LDC In-
stitute, University of Pennsylvania, January 26, 2004
Co-chair of panel “Linguistic Issues in the Text Encoding of Sanskrit,”
ALLC/ACH, Athens, Georgia, May 30, 2003


“Greek and Roman Grammarians on Motion Verbs and Place Adverbials,”
NAAHoLS, Atlanta, Georgia, January 4, 2003
“The Archimedes Project: Current Research” (with M. Schiefsky), NSDL
Workshop, Dibner Institute, MIT, March 9, 2002

CONFERENCE ORGANIZATION
“Multilingualism, Linguae Francae, and the Global History of Religious
and Scientific Concepts” (with J. Braarvig), The Norwegian Insti-
tute at Athens, Greece, April 3–5, 2009
“Viva Voce: Echoes of Performance in the Ancient Text”
(with V. Panoussi, J. Rowley, P. Thibodeau, M. Sundahl), Brown
University, February 7–8, 1997

TEACHING
Teaching Fellow, Department of Classics, Brown University (1995–1997)

• Essentials of the Latin Language (two semesters)

• Introduction to Latin (intensive)
Teaching Assistant, Department of Classics, Brown University (1994)
• Reason and the Human Good in Ancient Ethical Thought (Instruc-
tor: Martha C. Nussbaum)

PROFESSIONAL AFFILIATIONS
North American Association for the History of the Language Sciences
Linguistic Society of America
Henry Sweet Society for the History of Linguistic Ideas
Association for Literary and Linguistic Computing
Association for Computing in the Humanities

PROFESSIONAL ACTIVITIES
Leader, Cross-Sectional Group III: The Spread of Knowledge through
Cultures, TOPOI: The Formation and Transformation of Space in
Ancient Civilizations (German Excellence Cluster 264) (2009)

Project Manager, XML Workflow and Presentation, project funded by
the Max Planck Digital Library (2008–2009)
• Managed a team of three individuals to develop a standardized
workflow for transcription of historical books into structured XML,
a Relax NG schema for these texts, and software for online
presentation and content-based access to historical sources

Program Committee member, Second International Sanskrit Computational
Linguistics Symposium (Brown University, May 15–17, 2008); Third
International Sanskrit Computational Linguistics Symposium
(Hyderabad, January 12–14, 2009)

Expert consultant to ISO/IEC JTC1/SC2/WG2 “Universal Multiple-Octet
Coded Character Set” (2007–2008)
• Proposed standards for encoding Vedic characters in ISO 10646/Unicode
• Co-author of working group documents N3235, N3235R, N3290

Exhibitor at the Wissenschaftssommer in Essen, Germany (theme: “Die
Geisteswissenschaften: ABC der Menschheit”) (2007)
• Developed exhibit on current research in linguistic computing and
the decipherment of ancient Near Eastern writing

Member of Sonderforschungsbereich 644 “Transformationen der Antike,”
Berlin, Germany (2005–2008)
• Investigator in Teilprojekt A6, “Gewicht, Bewegung und Kraft:
Begriffliche Strukturveränderungen antiken Wissens als Folge seiner
Tradierung”


Chief technical architect for the interactive component of the German
government-sponsored exhibition “Albert Einstein: Ingenieur des
Universums: 100 Jahre Relativität, Atome und Quanten” (2004–2005)
• The interactive component, “an exhibition without walls,” is an
original concept, with major financial support from the Heinz
Nixdorf Foundation, Siemens, and BASF
• Development: distributed software system (Python/Zope) allows
for content creation by scientists and template design by design
professionals. About fifty interactive stations in the
Kronprinzenpalais run the environment for the duration of the
exhibition. The exhibition has, in addition, a permanent home on the
Web, which includes all digital content produced during the course of
the exhibition
• The exhibition won a bronze medal in the “Exhibition Campaign”
category of the International Museum Communication Award (2007)
Member, Board of Directors, The Sanskrit Library, Providence, Rhode
Island (2004–2009)

Technical Consultant, CDLI (Cuneiform Digital Library Initiative),
Berlin/Los Angeles (2002–2009)

Research Fellow, Archimedes Project, Harvard University (2001–2004)

• Collaborator with an international team of scholars to implement
a digital research library of texts in the history of mechanics
• Chief developer of Arboreal, an XML-based scholarly working en-
vironment for texts in Greek, Latin, Arabic, Chinese, Akkadian,
Sumerian, and modern European languages (Java, 45,000+ lines)
Technical and linguistic consultant for Sanskrit Library Project, Brown
University (2000–2009)
• Implemented system for morphological analysis of Sanskrit
• Developed system for typesetting a book manuscript in Sanskrit, using
TeX (automatic hyphenation for Devanāgarī text; automatic index
generation and formatting)


• Authored electronic index browser, with capabilities for lexical
and grammatical analysis of word-forms (Java, 7000+ lines)
Prepared SGML-encoded text of Dyer-Seymour commentary on Plato’s
Apology and Crito for Perseus Project (1998)
Referee for John Benjamins, Transactions of the American Philological
Association, New England Classical Journal, Historiographia Linguis-
tica, Harvard Studies in Classical Philology, Association for Literary
and Linguistic Computing, Association for Computing in the Humanities,
Boston Studies in the Philosophy of Science (1994–2009)

COMPUTER SKILLS
Programming Languages: Java, Perl, Python
Other: XML, XSL, RDF, Relax NG, TEI, HTML, CGI, JavaScript, LaTeX,
PostgreSQL, Zope, R, xfst
Linux system administration

LANGUAGES READ
Latin, Ancient Greek, Sanskrit, Italian, French, Spanish, German
some university study also of Akkadian

OTHER SKILLS
Copy-editing and indexing experience

BIBLIOGRAPHY

Driver, G. R. (1976), Semitic Writing: From Pictograph to Alphabet, 3d
edn, Oxford University Press, London. Edited by S. A. Hopkins.
Edgerton, F. (1946), Sanskrit Historical Phonology: A Simplified Outline
for Beginners in Sanskrit, Vol. 5 of Supplement to the Journal of the
American Oriental Society, American Oriental Society, Baltimore
MD.
—–. (1970), Buddhist Hybrid Sanskrit Grammar and Dictionary, 2 vols.,
Motilal Banarsidass, Delhi. Facsimile of 1953 New Haven edition.
Edgerton, W. F. (1941), ‘Ideograms in English writing’, Language
17(2), 148–150.
Eisenstein, E. L. (1980), The Printing Press as an Agent of Change:
Communications and Cultural Transformations in Early-Modern
Europe, Cambridge University Press, Cambridge.

Ellis, A. W. (1979), ‘Slips of the pen’, Visible Language 13(3), 265–282.

Emeneau, M. B. (1946), ‘The nasal phonemes of Sanskrit’, Language
22(2), 86–93.
Erduman, D., ed. (2004), Geschriebene Welten: Arabische Kalligraphie
und Literatur im Wandel der Zeit, Dumont, Köln.

Esterman, M., Verstynen, T., Ivry, R. B. & Robertson, L. C. (2006),
‘Coming unbound: Disrupting automatic integration of synesthetic
color and graphemes by transcranial magnetic stimulation
of the right parietal lobe’, Journal of Cognitive Neuroscience
18(9), 1570–1576.

Estes, W. K. (1978), Perceptual processing in letter recognition and
reading, in Carterette & Friedman (1978), pp. 163–220.
Fano, R. M. (1966), Transmission of Information: A Statistical Theory
of Communications, 2d edn, MIT Press, Cambridge MA.

Firth, J. R. (1936), ‘Alphabets and phonology in India and Burma’,
Bulletin of the School of Oriental Studies 8(2/3), 517–546.


—–. (1946), ‘The English school of phonetics’, Transactions of the
Philological Society pp. 92–132.
French, M. A. (1976), Observations on the Chinese script and the classi-
fication of writing-systems, in Haas (1976), pp. 101–129.

Frost, R. (1992), Orthography and phonology: The psychological reality
of orthographic depth, in P. Downing, S. D. Lima & M. Noonan,
eds, ‘The Linguistics of Literacy’, Vol. 21 of Typological Studies in
Language, John Benjamins, Amsterdam, pp. 255–274.
Fry, A. H. (1941), ‘A phonemic interpretation of visarga’, Language
17(3), 194–200.
Füssel, S. (2005), Gutenberg and the Impact of Printing, Ashgate,
Burlington, VT. First published in 1999 as Gutenberg und seine
Wirkung, Insel, Frankfurt am Main.

Gaylord, H. E. (1995), ‘Character representation’, Computers and the
Humanities 29, 51–73.
Geyer, L. H. (1970), A Two-Channel Theory of Short Term Visual Stor-
age, PhD thesis, SUNY at Buffalo.
Geyer, L. H. & DeWald, C. G. (1973), ‘Feature lists and confusion ma-
trices’, Perception & Psychophysics 14(3), 471–482.
Ghosh, P. K. (1983), An approach to type design and text composition
in Indian scripts, Technical Report STAN-CS-83-965, Department
of Computer Science, Stanford University. <http://infolab.stanford.
edu/TR/CS-TR-83-965.html>.

Gibson, E. J. (1969), Principles of Perceptual Learning and
Development, Appleton-Century-Crofts, New York.
—–. (1972), Reading for some purpose, in Kavanagh & Mattingly
(1972), pp. 3–19.

Gill, E. (1936), An Essay on Typography, 2d edn, Sheed and Ward.
Facsimile edition with new introduction by Christopher Skelton, David
R. Godine, Boston, 2007.


Gillam, R. (2002), Unicode Demystified: A Practical Programmer’s
Guide to the Encoding Standard, Addison-Wesley, Boston.
Glaister, G. A. (1979), Glaister’s Glossary of the Book: Terms Used in
Papermaking, Printing, Bookbinding, and Publishing with Notes on
Illuminated Manuscripts and Private Presses, 2d edn, University of
California Press, Berkeley.
Goldin-Meadow, S. (2003), Hearing Gesture: How Our Hands Help Us
Think, Belknap Press, Cambridge MA.
Goldsmith, J. A. (1995a), Phonological theory, in The Handbook of
Phonological Theory (Goldsmith, 1995b), pp. 1–23.
—–, ed. (1995b), The Handbook of Phonological Theory, Blackwell,
Cambridge MA.
Goodglass, H. (1993), Understanding Aphasia, Academic Press, San
Diego.
Govindaraju, V., Setlur, S., Khedekar, S., Kompalli, S., Farooq, F. &
Vemulapati, R. (2004), Enabling digital access to multi-lingual In-
dian documents, in ‘Proceedings of the First International Work-
shop on Document Image Analysis for Libraries’.

Griffen, T. D. (1976), ‘Toward a nonsegmental phonology’, Lingua
40, 1–20.
Gupta, R. (2006), ‘Technology for Indic scripts: A user perspective’,
Language in India 6, 1–17. <http://www.languageinindia.com/
july2006/indictechnology.pdf>.

Haas, W., ed. (1976), Writing without Letters, Vol. 4 of Mont Follick
Series, Manchester University Press, Manchester.
Hadj-Salah, A. (1971), ‘La notion de syllabe et la théorie cinético-
impulsionnelle des phonéticiens arabes’, Al-Lisāniyyāt 1, 63–83.

Halle, M. (1983), ‘On distinctive features and their articulatory imple-
mentation’, Natural Language Theory 1, 91–105.

—–. (1988), ‘Remarques sur la révolution scientifique en phonologie,
1926/1930’, Actes de la recherche en sciences sociales 74, 89–96.
—–. (1995), ‘Feature geometry and feature spreading’, Linguistic In-
quiry 26(1), 1–46.

—–. (2002), From Memory to Speech and Back: Papers on Phonetics
and Phonology 1954–2002, Vol. 3 of Phonology and Phonetics, De
Gruyter, Berlin.
Halle, M., Vaux, B. & Wolfe, A. (2000), ‘On feature spreading and
the representation of place of articulation’, Linguistic Inquiry
31(3), 387–444.
Hamann, S. R. (2003), The Phonetics and Phonology of Retroflexes, PhD
thesis, Universiteit Utrecht.
Hamp, E. P. (1959), ‘Graphemics and paragraphemics’, Studies in Lin-
guistics 14(1–2), 1–5.
Haralambous, Y. (2002), ‘Unicode et typographie: un amour impossi-
ble’, Document Numérique 6(3/4), 107–139.
—–. (2004), Fontes & codages, O’Reilly, Paris.

Haralambous, Y. & Plaice, J. (2002), ‘Low-level Devanāgarī support
for Omega—Adapting devnag’, TUGboat 23(1), 50–56. <http:
//www.tug.org/TUGboat/Articles/tb23-1/haralambous.pdf>.
Harley, A. H. (1955), Colloquial Hindustani, Routledge & Kegan Paul,
London. With an introduction by J. R. Firth.

Harris, W. V. (1989), Ancient Literacy, Harvard University Press, Cam-
bridge MA.
Hellingman, J. (1998), ‘Indian scripts and Unicode’, <http://ldc.upenn.
edu/myl/IndianScriptsUnicode.html>.

Henderson, L. (1985), ‘On the use of the term “grapheme”’, Language
and Cognitive Processes 1(2), 135–148.

Hillenbrand, J. M. & Houde, R. A. (1996), ‘The role of F0 and amplitude
in the perception of intervocalic glottal stops’, Journal of Speech &
Hearing Research 39(6), 1182–1191.
Hixon, T. J. (1987), Respiratory function in speech, in Respiratory Func-
tion in Speech and Song (Hixon & Collaborators, 1987), chapter 1,
pp. 1–54.
Hixon, T. J. & collaborators (1987), Respiratory Function in Speech and
Song, College-Hill Press, Boston.
Hoberman, R. D. (1985), ‘The phonology of pharyngeals and pharyn-
gealization in Pre-Modern Aramaic’, Journal of the American Ori-
ental Society 105(2), 221–231.
Hock, H. H. (1975), ‘Substratum influence on (Rig-Vedic) Sanskrit?’,
Studies in the Linguistic Sciences 5(2), 76–125.

—–. (1979), ‘Retroflexion rules in Sanskrit’, South Asian Languages
Analysis 1, 47–62.
—–. (1993), ‘Subversion or convergence? The issue of pre-Vedic
retroflexion reexamined’, Studies in the Linguistic Sciences
23(2), 73–115.

—–. (n.d.), ‘Devanagari made easy’, Unpublished instructional materi-
als.
Hockey, S. (2000), Electronic Texts in the Humanities: Principles and
Practice, Oxford University Press, New York.

Hoenig, A. (1990), ‘A constructed Duerer alphabet’, TUGboat
11(3), 435–438. <http://www.tug.org/TUGboat/Articles/tb11-
3/tb29hoenig.pdf>.
Hofstadter, D. R. (1985), Metamagical Themas: Questing for the
Essence of Mind and Pattern, Basic Books, New York.

Hubel, D. H. & Wiesel, T. N. (1968), ‘Receptive fields and func-
tional architecture of monkey striate cortex’, Journal of Physiology
195, 215–243.

Huet, G. (2005), ‘A functional toolkit for morphological and phonolog-
ical processing, application to a Sanskrit tagger’, Journal of Func-
tional Programming 15(4), 573–614.
—–. (2009), Formal structure of Sanskrit text: Requirements for a me-
chanical Sanskrit processor, in Huet, Kulkarni & Scharf (2009),
pp. 162–199.
Huet, G., Kulkarni, A. & Scharf, P. M., eds (2009), Sanskrit Com-
putational Linguistics: First and Second International Symposia:
Rocquencourt, France, October 2007; Providence, RI, USA, May
2008, Vol. 5402 of Lecture Notes in Artificial Intelligence, Springer,
Berlin.
Hyman, M. D. (2006), ‘Of glyphs and glottography’, Language & Com-
munication 26, 231–249.
Ingram, W. H. (1966), ‘The ligatures of early printed Greek’, Greek,
Roman and Byzantine Studies 7(4), 371–389.
Ishida, R. (2002), ‘An introduction to Indic scripts’, Paper delivered at
22nd Int. Unicode Conference, San José, CA, Sept. 2002. <http:
//www.w3.org/2002/Talks/09-ri-indic/indic-paper.pdf>.

Ivanov, V. V. & Toporov, V. N. (1968), Sanskrit, Nauka, Moscow. Origi-
nally published in Russian, 1960.
Jaffré, J.-P. & Fayol, M. (2006), Orthography and literacy in French, in
Joshi & Aaron (2006), pp. 81–103.
Jakobson, R. ([1929] 1971), Remarques sur l’évolution phonologique du
russe comparée à celle des autres langues slaves, in ‘Selected Writ-
ings, Vol. 1: Phonological Studies’, 2d edn, Mouton, The Hague,
pp. 7–116.
Jakobson, R., Fant, C. G. M. & Halle, M. (1963), Preliminaries to Speech
Analysis: The Distinctive Features and their Correlates, MIT Press,
Cambridge MA. With additions and corrigenda to the 1952 edition.
Jenkins, J. H. (1999), ‘The Unicode character-glyph model: Case
studies’, Paper delivered at 15th Int. Unicode Conference, San
José, CA, Aug./Sept. 1999. <http://developer.apple.com/fonts/
WhitePapers/IUC15CG.pdf>.
Jones, D. (1942), The Problem of a National Script for India, Pioneer
Press, Lucknow, U. P.

—–. (1962), The Phoneme: Its Nature and Use, 2d edn, W. Heffer &
Sons, Cambridge.
Joseph, J. E. (2000), Limiting the Arbitrary: Linguistic Naturalism and
its Opposites in Plato’s Cratylus and Modern Theories of Lan-
guage, Vol. 96 of Studies in the History of the Language Sciences,
John Benjamins, Amsterdam.
Joshi, A., Ganu, A., Chand, A., Parmar, V. & Mathur, G. (2004),
‘Keylekh: A keyboard for text entry in Indic scripts’, CHI 2004,
April 24–29, Vienna.

Joshi, R. K. (2006), ‘The phonemic model from India for bi-modal appli-
cations’, Paper delivered at the Second Workshop on International-
izing SSML, Heraklion, Crete, May 2006. <http://www.w3.org/
2006/02/SSML/agenda.html>.
Joshi, R. K., Dharmadhikari, T. N. & Bedekar, V. V. (2007), ‘The
phonemic approach for Sanskrit text’, <http://sanskrit.inria.fr/
Symposium/Phonemics_CDAC.pdf>.
Joshi, R. M. & Aaron, P. G., eds (2006), Handbook of Orthography and
Literacy, Erlbaum, Mahwah NJ.
Kahan, B. (2000), Ottmar Mergenthaler: The Man and his Machine; A
Biographical Appreciation of the Inventor on his Centennial, Oak
Knoll Press, New Castle DE.
Kahn, D. (1996), The Codebreakers: The Story of Secret Writing, 2d edn,
Scribner, New York.

Kapr, A. (1993), Fraktur: Form und Geschichte der gebrochenen
Schriften, Hermann Schmidt, Mainz.

Katsoulidis, T. (1996), The physiognomy of the Greek typographical let-
ter, in M. S. Macrakis, ed., ‘Greek Letters: From Tablets to Pixels’,
Oak Knoll Press, New Castle DE, pp. 153–161.
Kaufman, S. A. (1984), ‘On vowel reduction in Aramaic’, Journal of the
American Oriental Society 104(1), 87–95.
Kavanagh, J. F. & Mattingly, I. G., eds (1972), Language by Ear and by
Eye: The Relationships between Speech and Reading, MIT Press,
Cambridge MA.
Keane, E. (2004), ‘Tamil’, Journal of the International Phonetic Associ-
ation 34(1), 111–116.
Kelly, J. (1981), The 1847 alphabet: an episode of phonotypy, in Asher
& Henderson (1981), pp. 248–264.
Kemp, J. A. (1994), Phoneme, in R. E. Asher, ed., ‘The Encyclopedia of
Language and Linguistics’, Vol. 6, Pergamon, New York, pp. 3029–
3036.
Kenyon, F. G. (1951), Books and Readers in Ancient Greece and Rome,
2d edn, Clarendon, Oxford.
Kernighan, B. W. & Pike, R. (1984), The UNIX Programming Environ-
ment, Prentice-Hall, Englewood Cliffs NJ.
Kielhorn, L. F., ed. (1962, 1965, 1972), The Vyākaraṇa-mahābhāṣya of
Patañjali, 3d edn, revised and furnished with additional readings,
references and select critical notes by K. V. Abhyankar,
Bhandarkar Oriental Research Institute, Pune. 3 vols.
Kim, C. W. (1997), The structure of phonological units in han’gŭl, in Y.-
K. Kim-Renaud, ed., ‘The Korean Alphabet: Its History and Struc-
ture’, University of Hawai‘i Press, Honolulu, pp. 145–160.
Kita, S., ed. (2003), Pointing: Where Language, Culture, and Cognition
Meet, Erlbaum, Mahwah NJ.
Klatt, D. H. (1976), ‘Linguistic uses of segmental duration in English:
Acoustic and perceptual evidence’, Journal of the Acoustical Soci-
ety of America 59(5), 1208–1221.

Madria & G. Pernul, eds, ‘Electronic Commerce and Web Tech-

Sonka, M., Hlavac, V. & Boyle, R. (1999), Image Processing, Analysis
and Machine Vision, 2d edn, PWS, Pacific Grove CA.
Sproat, R. (2006), ‘Brahmi-derived scripts, script layout, and segmental
analysis’, Written Language & Literacy 9(1), 45–65.
Srihari, S. N., Srinivasan, H., Huang, C. & Shetty, S. (2006), ‘Spotting
words in Latin, Devanagari and Arabic scripts’, Vivek: Indian Jour-
nal of Artificial Intelligence 16(3), 2–9.
Staal, J. F. (1972), A Reader on the Sanskrit Grammarians, Motilal Ba-
narsidass, Delhi.

Steinberg, S. H. (1961), Five Hundred Years of Printing, 2d edn, Penguin,
Harmondsworth.
Stemberger, J. P. (1982), ‘The nature of segments in the lexicon: Evi-
dence from speech errors’, Lingua 56, 235–259.
Strasser, G. F. (1988), Lingua Universalis: Kryptologie und Theorie der
Universalsprachen im 16. und 17. Jahrhundert, Vol. 38 of Wolfen-
bütteler Forschungen, Harrassowitz, Wiesbaden.

Suen, C. Y., Mori, S., Kim, S. H. & Leung, C. H. (2003), Analysis and
recognition of Asian scripts—the state of the art, in ‘Proceedings of
the 7th International Conference on Document Analysis and Recog-
nition’, pp. 866–878.
Sweet, H. (1892), A Manual of Current Shorthand, Orthographic and
Phonetic, Clarendon, Oxford.
Syropoulos, A., Tsolomitis, A. & Sofroniou, N. (2003), Digital Typog-
raphy Using LaTeX, Springer, New York.
Szemerényi, O. (1967), ‘The new look of Indo-European: Reconstruc-
tion and typology’, Phonetica 17(2), 65–99.
Takakusu, J. (1896), Record of the Buddhist Religion as Practised in
India and the Malay Archipelago, Clarendon, Oxford.
Tolchinsky, L. (2003), The Cradle of Culture and What Children Know
About Writing and Numbers Before Being Taught, Erlbaum, Mah-
wah NJ.
Tomasello, M. (1999), The Cultural Origins of Human Cognition, Har-
vard University Press, Cambridge MA.
Treiman, R. (2006), Knowledge about letters as a foundation for reading
and spelling, in Joshi & Aaron (2006), pp. 581–599.
Trigger, B. G. (1998), ‘Writing systems: A case study in cultural evolu-
tion’, Norwegian Archaeological Review 31(1), 39–62.
Trigo Ferre, R. L. (1988), The Phonological Derivation and Behavior of
Nasal Glides, PhD thesis, MIT, Cambridge MA. MIT Dissertations
in Linguistics TRIG01.
Tversky, A. (1977), ‘Features of similarity’, Psychological Review
84(4), 327–352.
Unicode Consortium (2006), The Unicode Standard, Version 5.0,
Addison-Wesley, Boston.
Vacek, J. (1976), ‘The Sanskrit sibilants’, Wissenschaftliche Zeitschrift
der Humboldt-Universität zu Berlin, Gesellschafts- und sprachwis-
senschaftliche Reihe 25(3), 407–412.

Vachek, J. (1973), Written Language: General Problems and Problems
of English, number 14 in ‘Janua Linguarum Series Critica’, Mou-
ton, The Hague.
Vaid, J. (2002), ‘Exploring word recognition in a semi-alphabetic script:
The case of Devanagari’, Brain and Language 81, 679–690.
van den Bosch, A., Content, A., Daelemans, W. & de Gelder, B. (1994),
‘Measuring the complexity of writing systems’, Journal of Quanti-
tative Linguistics 1(3), 178–188.
van Nooten, B. A. (1973), The structure of a Sanskrit phonetic treatise,
in I. Konks, P. Numerkund & L. Mall, eds, ‘Oriental Studies’, Töid
Orientalistika Alalt; Trudy po Vostokovedeniju II 2, Tartu Univer-
sity, Tartu, pp. 408–436.
Varma, S. (1929), Critical Studies in the Phonetic Observations of In-
dian Grammarians, Royal Asiatic Society, London. Reprint: Delhi:
Munshiram Manoharlal, 1961.
Vedavrata, ed. (1962–1963), Patañjali’s Vyākaraṇamahābhāṣya with
Kaiyaṭa’s Pradīpa and Nāgojībhaṭṭa’s Uddyota, Haryāṇā Sāhitya
Saṁsthāna, Gurukula Jhajjar (Rohatak).
Velten, H. V. (1956), Hedgehogs versus foxes in comparative linguistics,
in M. Halle, H. G. Lunt, H. McLean & C. H. van Schooneveld,
eds, ‘For Roman Jakobson: Essays on the Occasion of His Sixtieth
Birthday, 11 October 1956’, Mouton, The Hague, pp. 585–587.
Vincent, D. (2000), The Rise of Mass Literacy: Reading and Writing in
Modern Europe, Polity, Cambridge.
Voigt, R. (2005), ‘Die Entwicklung der aramäischen zur Kharoṣṭhī-
und Brāhmī-Schrift’, Zeitschrift der Deutschen Morgenländischen
Gesellschaft 155, 25–50.
Vygotskii, L. S. (2005), Pedagogicheskaia psikhologiia, AST-Astrel-
Liuks, Moscow.
Walden Font (1997), ‘The Gutenberg press: Five centuries of German
Fraktur’, <http://www.waldenfont.com/downloads/gbpmanual.
pdf>.

Waller, R. (1986), ‘What electronic books will have to be better than’,
Information Design Journal 5(1), 72–75.
—–. (1988), The Typographic Contribution to Language: Towards a
Model of Typographic Genres and Their Underlying Structures,
PhD thesis, University of Reading.
Ward, J. & Romani, C. (2000), ‘Consonant-vowel encoding and ortho-
syllables in a case of acquired dysgraphia’, Cognitive Neuropsy-
chology 17(7), 641–663.
Ward, J., Simner, J. & Auyeung, V. (2005), ‘A comparison of lexical-
gustatory and grapheme-colour synaesthesia’, Cognitive Neuropsy-
chology 22(1), 28–41.
Watson, P. J. & Hixon, T. J. (1987), Respiratory kinematics in classical
(opera) singers, in Respiratory Function in Speech and Song (Hixon
& Collaborators, 1987), chapter 10, pp. 337–374.

Weir, R. H. (1967), Some thoughts on spelling, in W. M. Austin, ed., ‘Pa-
pers in Linguistics in Honor of Léon Dostert’, Mouton, The Hague,
pp. 169–177.
Wennerstrom, A. (2001), The Music of Everyday Speech: Prosody and
Discourse Analysis, Oxford University Press, New York.
White, A. (2002), ‘The Unicode Standard for Scripts of India (TUSSI):
A request to make the TUSSI specification compatible with the
ISCII standard, and beyond’, <http://www.exnet.btinternet.co.uk/
uniprop/encoding.htm>.

Whitney, W. D. (1861), ‘On Lepsius’s standard alphabet’, Journal of the
American Oriental Society 7, 299–332.
—–. (1862), ‘The Atharva-veda-prâtiçâkhya, or Çâunakîyâ catur-
âdhyâyikâ: Text, translation, and notes’, Journal of the American
Oriental Society 7, 333–615.

—–. (1868), ‘The Tâittirîya-Prâtiçâkhya, with its commentary, the Tri-
bhâshyaratna: Text, translation, and notes’, Journal of the Ameri-
can Oriental Society 9, 1–469.

    recitation, 49, 63
    saṁhitā, 95
    schools, 30, 61, 67, 84, 95, 97
Vedic Sanskrit Coding Scheme, see varṇamālā
Velthuis, Frans, 36, 107
visible speech, see Bell, Alexander Melville
visual art, 1, 48, 88
visual word form area, 50
vyākaraṇa, 119
Vygotsky, L. S., 101
Ward, Ida Caroline, 66
Westermann, Diedrich, 66
Whitney, William Dwight, 16, 46, 66, 72
Wikner, Charles, 36, 107
Wilkins, Charles, 23, 109
Williams, Monier, 39
word spotting, 105
World Wide Web, xii, 27, 33
writing, xiii, 2–4, 7, 9, 12, 27, 29, 31, 41, 50, 51, 54, 55, 59, 101–103, 108, 115
    cursive, 105
    ease of, 20
    implements, 53
    proto-writing, 2
writing system, 4, 9, 13, 18, 23, 27, 29, 49, 51, 54, 55, 107, 108, 115
    alphabetic, 10, 16
    artificial, 55
    borrowing of, 51
    East Asian, 103
    ideographic, 49
    logographic, 49
    non-glottographic, 39
    phonographic, 49
    Roman, 19
    syllabic, 10
XML, 93, 95, 120
Yāska, 94
Yijing, 9
Young, James, 4
Ziegenbalg, Bartholomew, 18
Zwicky, Arnold M., 76
