0% found this document useful (0 votes)
33 views17 pages

Download

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
33 views17 pages

Download

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/8337766

Online Recognition of Chinese Characters: The State-of-the-Art

Article in IEEE Transactions on Pattern Analysis and Machine Intelligence · March 2004
DOI: 10.1109/TPAMI.2004.1262182 · Source: PubMed

CITATIONS READS
346 14,613

3 authors:

Cheng-Lin Liu Stefan Jaeger


Institute of Automation of Chinese Academy of Sciences National Institutes of Health
419 PUBLICATIONS 18,911 CITATIONS 109 PUBLICATIONS 6,323 CITATIONS

SEE PROFILE SEE PROFILE

Masaki Nakagawa
Tokyo University of Agriculture and Technology
314 PUBLICATIONS 3,252 CITATIONS

SEE PROFILE

Some of the authors of this publication are also working on these related projects:

Performance evaluation of deep learning models in abnormality detection from chest radiographs View project

(Bio)medical Imaging View project

All content following this page was uploaded by Stefan Jaeger on 17 May 2014.

The user has requested enhancement of the downloaded file.


198 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

Online Recognition of Chinese Characters:


The State-of-the-Art
Cheng-Lin Liu, Member, IEEE, Stefan Jaeger, and Masaki Nakagawa, Member, IEEE Computer Society

Abstract—Online handwriting recognition is gaining renewed interest owing to the increase of pen computing applications and new
pen input devices. The recognition of Chinese characters is different from western handwriting recognition and poses a special
challenge. To provide an overview of the technical status and inspire future research, this paper reviews the advances in online
Chinese character recognition (OLCCR), with emphasis on the research works from the 1990s. Compared to the research in the
1980s, the research efforts in the 1990s aimed to further relax the constraints of handwriting, namely, the adherence to standard stroke
orders and stroke numbers and the restriction of recognition to isolated characters only. The target of recognition has shifted from
regular script to fluent script in order to better meet the requirements of practical applications. The research works are reviewed in
terms of pattern representation, character classification, learning/adaptation, and contextual processing. We compare important
results and discuss possible directions of future research.

Index Terms— Online Chinese character recognition, state-of-the-art, pattern representation, character classification, model learning,
contextual processing, performance evaluation.

1 INTRODUCTION

I Nonline character recognition, the trajectories of pen tip


movements are recorded and analyzed to identify the
linguistic information expressed. Owing to the availability
Oriental languages). A common feature of these applica-
tions is that they require high recognition accuracy.
The research of online character recognition started in
of both temporal stroke information and spatial shape the 1960s and has been receiving intensive interest from the
information, online character recognition is able to yield 1980s. The comprehensive survey of Tappert et al. reviewed
higher accuracy than offline recognition. Online recognition the status of research and applications before 1990 [120] and
also provides good interaction and adaptation capability early works of online Japanese character recognition have
because the writer can respond to the recognition result to been reviewed in [85], [132]. A recent comprehensive
correct the error or change the writing style. survey of handwriting recognition, by Plamondon and
In recent years, new types of pen input devices and Srihari, mainly concerns western handwriting [102]. Our
interfaces have been developed to improve the precision of paper contributes a survey to online Chinese character
trajectory capturing and the comfort of writing. Devices are recognition (OLCCR) since this recognition problem is very
available for writing on ordinary paper and wireless different from western handwriting recognition and it poses
transmission of handwriting, for example. Powerful soft- a special challenge.
ware is available now for analyzing and retrieving hand-
written documents. This development stimulates new 1.1 Related Problems
applications of handwriting recognition and has resulted Pen computing applications closely related to handwriting
in a renewed interest in research [114]. The applications of recognition are mathematical formula recognition [5], [148]
online recognition include text entry for form filling and and diagram recognition [3], where both the character
message composition, personal digital assistants (PDA), classes (mostly Latin characters) and the layout are
computer-aided education [90], handwritten document recognized. Another application is signature verification
retrieval [77], [107], etc. For handheld devices, pen input that checks whether a handwritten signature is generated
is competitive to speech input because it is insensitive to by a specific writer or not. It does not necessarily identify
environmental noise, which is an important advantage for the symbolic classes of a signature’s constituent characters
many applications. For desktop applications, online recog- though. Signature verification has been reviewed in [61],
nition is well-suited to text entry for large alphabets (like [102] and recent works are reported in [44], [57]. A form of
handwritten document retrieval, the so-called ink matching,
does not identify the character classes either [78]. Hand-
. C.-L. Liu is with the Central Research Laboratory, Hitachi, Ltd., 1-280 written sketch recognition is based mostly on noncharacter
Higashi-koigakubo, Kokubunji-shi, Tokyo 185-8601, Japan.
E-mail: liucl@crl.hitachi.co.jp. data and typically ignores linguistic information [82], [109].
. S. Jaeger and M. Nakagawa are with the Department of Computer Science, We do not cover these problems further because they are
Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, not relevant to the methodology of OLCCR.
Koganei-shi, Tokyo 184-8588, Japan.
E-mail: stefan@hands.ei.tuat.ac.jp, nakagawa@cc.tuat.ac.jp. 1.2 Characteristics of Chinese Characters
Manuscript received 27 Sept. 2002; revised 19 May 2003; accepted 11 Aug. Chinese characters are used in daily communications by
2003.
Recommended for acceptance by K. Yamamoto.
over one quarter of world’s population, mainly in Asia.
For information on obtaining reprints of this article, please send e-mail to: There are mainly three character sets: traditional Chinese
tpami@computer.org, and reference IEEECS Log Number 117468. characters, simplified Chinese characters, and Japanese
0162-8828/04/$20.00 ß 2004 IEEE Published by the IEEE Computer Society
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 199

Fig. 1. Examples of traditional Chinese, simplified Chinese, and


Japanese Kanji.

Kanji. Japanese Kanji characters have mostly identical shape


to the corresponding traditional Chinese or simplified
Chinese. For some Kanji characters, nevertheless, the shape
is slightly different from both the traditional and simplified
Chinese. Fig. 1 shows some examples of the three character
sets. We can see that, among the 14 characters, four have Fig. 2. Chinese writing styles: regular, fluent, and cursive.
different shapes (the last character only varies slightly,
while the fourth one is treated identically in three sets). The methods of OLCCR can be roughly divided into two
In the mainland of China, two character sets, containing categories: structural methods and statistical methods.
3,755 characters and 6,763 characters, respectively, were Structural methods are based on stroke analysis. The character
announced as the National Standard GB2312-80 (the first set models of structural methods can be further divided into
is a subset of the second one) [116]. In Taiwan, 5,401 traditional stroke-order dependent models and stroke-order free ones.
characters are included in a standard set. In both traditional Statistical methods mainly utilize the holistic shape informa-
and simplified Chinese, about 5,000 characters are frequently tion, so it is easier to achieve stroke-order independence. From
used. In Japan, 2,965 Kanji characters are included in the an application’s point of view, a recognition method can be
JIS level-1 standard and 3,390 Kanji characters are in the writer-dependent or writer-independent. Writer-indepen-
level-2 standard (the two sets are disjoint). dent recognition is more challenging due to the diversity of
A Chinese character is an ideograph and is composed of writing styles. On the other hand, writer-dependent recogni-
mostly straight lines or “poly-line” strokes. Many characters tion allows stable recognition of cursive script due to the
contain relatively independent substructures, called radi- relative stability of personal writing styles.
cals, and some common radicals are shared by different 1.4 Contents of the Paper
characters. This property can be utilized in recognition to
This survey will emphasize the research efforts from the
largely reduce the size of reference model database and
1990s. On comparing the performance of state-of-the-art
speed up recognition.
methods, discussing their insufficiencies, we will suggest
Chinese handwritten scripts are classified into three
future research directions. The rest of this paper is organized
typical styles: regular script, fluent script, and cursive script.
as follows: Section 2 gives the overview of a typical
The intermediate styles are called fluent-regular script and
OLCCR system and Section 3 briefly reviews preprocessing.
fluent-cursive script, respectively. Some examples of the Sections 4, 5, and 6 address the main tasks of character
three typical styles are shown in Fig. 2. We can see that, in recognition, namely, pattern representation, classification
regular script, strokes are mostly straight-line segments. The (including coarse classification and fine classification), and
fluent script has many curved strokes and, frequently, reference model learning/adaptation, respectively. Though
successive strokes are connected. In cursive script, some in an OLCCR system, the schemes of classification and
character shapes totally differ from the standard shape, so it is learning largely depend on that of representation, a scheme in
difficult to recognize them, even for humans. a task can still connect with multiple schemes in another task.
Hence, we address the three tasks in separate sections and try
1.3 The State-of-the-Art
to thread the connected schemes in different tasks. Section 7
Since the 1990s, the research efforts of OLCCR have been addresses the contextual processing of character segmenta-
aiming at the relaxation of constraints imposed on writers to tion and recognition. Section 8 compares the performance of
ensure successful recognition, namely, the isolation of representative methods and Section 9 discusses the future
characters and the compliance with standard shapes. For research directions.
Chinese characters, the main problem in online recognition
is to overcome the stroke-order and stroke-number varia-
bility. The target of OLCCR in the 1990s has shifted from 2 OVERVIEW OF A TYPICAL OLCCR SYSTEM
regular script to fluent script, which features greater A practical OLCCR system is depicted diagrammatically in
variability of stroke-order and stroke-number and occurs Fig. 3. The input to the system is a sequence of handwritten
frequently in practical writing. character patterns. First, the handwriting sequence is
In the literature of character recognition, the regular style is segmented into character patterns according to the temporal
also referred to as block style or hand-printed style, while the and shape information. Often, the boundary between
fluent style is often called “cursive” style. The current systems characters cannot be determined unambiguously before
can recognize regular script with high accuracy, whereas the character recognition, so candidate character patterns are
recognition of fluent style still remains unsolved and requires generated and recognized and the correct patterns are
more intensive research efforts. The fluent script or fluent- selected in contextual processing at the end of the process
regular script is the target of most recognition systems chain. The recognition of segmented (candidate) patterns
because people naturally write this way. involves the following steps: preprocessing, description, and
200 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

Fig. 4. Examples of linear normalization, moment normalization, and


nonlinear normalization.

Data reduction can be accomplished by two approaches:


equidistance sampling and line approximation (feature
point detection). With equidistance sampling, the trajectory
points are resampled such that the distance between
adjacent points is approximately equal. The data amount
of equidistance point representation is still appreciable. A
higher data reduction rate can be achieved by detecting the
corner points of trajectories. The corner points and the ends
of a stroke trajectory are often called feature points.
Corner detection from digitized curves has been widely
Fig. 3. Diagram of a practical OLCCR system. addressed in the shape recognition literature. The basic idea
is to estimate the curvature at each point on the curve and
classification. Classification is often decomposed into coarse retain the points of high curvature [105]. An alternative is
classification and fine classification. Pattern description is polygonal approximation, which recursively finds the
vertex of maximum point-to-chord distance [104]. Corner
also referred to as feature extraction, which represents the
detection and polygonal approximation are complementary
input pattern either statistically by feature vectors or
and can be combined to achieve better performance [138].
structurally by various levels of primitives. The model
Line approximation of strokes has been used in many
database (also called reference database or recognition online recognition systems (e.g., [38], [55], [63], [145], [146]).
dictionary) contains the reference models or classification Normalization of character trajectories to a standard size
parameters for coarse classification and fine classification. is adopted in almost every character recognition system.
To speed up the recognition of the large category set, a Conventionally, the coordinates of stroke points are shifted
fast coarse classification procedure is commonly used to and scaled such that all points are enclosed in a standard
first select a small subset of candidate classes to which the box (this is called linear normalization). Alternatively, by
input pattern is expected to belong to. Then, the input moment normalization [4], the centroid of input pattern is
pattern is classified into one of these candidate classes in the shifted to the center of standard box and the second-order
fine classification stage. This two-stage recognition strategy moments are scaled to a standard value.
has been widely adopted by now, though tree classification To alleviate the shape deformation of handwritten Chinese
and multistage classification can further speed up the characters, nonlinear normalization was proposed in the
recognition. In contextual processing, linguistic knowledge 1980s and was proven efficient to improve the accuracy of
and geometric features are used to verify the segmentation offline character recognition. It was later successfully applied
and classification results. The performance of character to online character recognition as well [28], [45], [88], [99],
recognition relies largely on the quality of the model [100]. Nonlinear normalization reassigns the coordinates of
database. This database is built from heuristic knowledge, stroke points according to the line density distribution with
manually selected character prototypes, or from multiple the aim of equalizing the stroke spacing [62], [127], [141]. For
sample patterns. For writer-dependent recognition, the comparing the effects of normalization, Fig. 4 shows two
character patterns and the results of linear, moment, and
models or parameters can adapt to the writer’s style to
nonlinear normalization. It was shown that moment normal-
improve the recognition performance.
ization can yield comparable recognition accuracy to non-
linear normalization [70]. For online patterns, the line density
3 PREPROCESSING can be computed directly from the online trajectory [45], [88],
instead of the 2D image.
The preprocessing of the trajectory of input pattern directly
facilitates pattern description and affects the quality of
description. The preprocessing tasks of online character 4 PATTERN REPRESENTATION
patterns include noise elimination, data reduction, and shape The representation schemes of input pattern and model
normalization. The noise in character trajectories is due to database are of particular importance since the classification
erratic hand motions and the inaccuracy of digitization. The method depends largely on them. We divide the schemes into
noise reduction techniques used in most systems are basically three groups: statistical, structural, and hybrid statistical-
those explained in [120]: smoothing, filtering, wild point structural. In statistical representation, the input pattern is
correction, stroke connection, etc. As the quality of input described by a feature vector, while the model database (also
devices steadily advances, trajectory noise becomes less called parameter database in this case) contains the classifica-
influential and simple smoothing operations will suffice. tion parameters. The structural representation scheme has
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 201

In the following, we review the detailed schemes of


structural representation, statistical-structural representa-
tion, and statistical representation, respectively.

4.1 Structural Representation


We review the structural representation schemes in the
order of the levels as shown in Fig. 5.

4.1.1 Point and Line Segment Representation


A representation with resampled points can cope well with
curved strokes, though it results in large size of model
database. If both the input pattern and the character
prototype in the model database are represented as point
sequences, they can be matched based on stroke correspon-
dence in which the between-stroke distance is computed by
Fig. 5. Hierarchy of structural representation schemes.
aligning the points [132], [133], [134].
The feature point or line segment representation is
long been dominating the OLCCR technology, whereas the widely adopted nowadays. It is especially suited to
statistical scheme and the hybrid scheme are receiving regular-style characters, which are composed of mostly
increasing attention in recent years. The statistical-structural straight-line segments. Input patterns represented as
scheme is only used for describing the reference models. It feature point sequences can be matched with character
takes the same structure as the traditional structural models represented as feature point sequences [13], [55],
representation, yet the structure elements (primitives) and/ [86], higher-level primitives (such as stroke codes) [20], or
or relationships are measured probabilistically. Hidden hierarchical structures [60]. Analogously, input patterns
Markov Models (HMMs) can be regarded as instances of represented as line segments can be matched with character
the statistical-structural representation. models represented as line segments [18], [35], [124] or
The structural representation schemes can be further higher-level structures [16], [139].
partitioned into five levels: sampling points, feature points or
line segments, stroke codes or HMMs, relational, and 4.1.2 Stroke Code Representation
hierarchical. Fig. 5 shows the hierarchy of five levels. For Stroke code representation schemes have been adopted
describing the input pattern or the model database, the from the early stages of OLCCR research (e.g., [145], [146]).
primitives of higher-level structure are composed of the The strokes in regular script can be categorized into classes
lower-level primitives, e.g., a stroke is recorded by the according to the constituent line segment sequence and
constituent line segments or sampling points, while the each class is assigned a code or index number [63], [65].
relational structure takes strokes as primitives. We treat Each stroke code has a corresponding reference model/
HMMs on the same level as stroke code representation prototype or a set of rules. A character model is then
because HMMs are frequently used to model strokes or represented as a sequence of stroke codes or a relational
substrokes, while a sequence of HMMs represent a radical or structure with stroke codes as primitives [14], [65]. When
a character. using stroke code-based models in recognition, the stroke
The feature point representation is equivalent to the line codes of input pattern are determined in matching with
segment representation because every pair of succeeding character/stroke models. In [74], [106], the input pattern is
feature points gives a line segment. In stroke code representa- initially represented as line segments and strokes are
tion, the types of strokes in input pattern or reference models detected using finite state automaton.
are specified and the reference model is represented as a The stroke code representation of [47] is unique in that a
sequence of strokes. The relational structure represents a connected stroke (a piece of trajectory containing one or
character model or a radical model wherein the relationship multiple normal strokes) is represented as a single feature
between strokes is specified. In the hierarchical representa- vector and stroke prototypes are designed by clustering
tion of model database (also called structured representa- sample patterns. However, because it does not rely on
tion), a number of radical models are shared to construct the stroke decomposition, this scheme does not generalize to
character models of all categories. On the other hand, the novel stroke shapes not contained in the learning samples.
hierarchical structure of input pattern is a relational structure
with radicals as primitives. The structured representation of 4.1.3 Relational Representation
model database is very storage efficient, while the sampling For stroke-order-free recognition, the primitives (strokes or
points representation is data intensive. line segments) of a character and the relationship between
An OLCCR system can use different representation them are often represented by means of a relational structure,
schemes for the input pattern and the model database, such as an Attributed Relational Graph (ARG). In an ARG, the
respectively. In the case of structural or statistical-structural nodes denote primitives and the arcs denote the relationships
representation, the model database is usually described at between nodes. If the attributes of nodes and/or arcs are
higher level than the input pattern. In recognition, for represented with fuzzy sets, the graph is called fuzzy ARG
example, the input pattern is represented in point sequence (FARG) [6]. Fig. 6 shows an example of ARG.
or line segments, then strokes and relational structure are Two problems arise in ARG matching of characters. First,
extracted by matching the point sequence or line segments the primitives and relation codes of the reference models
with the higher-level primitives of reference models. must be carefully designed to tolerate the shape variation of
202 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

Fig. 6. An example of ARG representation (left: character, right: ARG).


The relation codes “X,” “T,” “L,” and “P” stand for intersection, end-to-line
adjacency, end-to-end adjacency, and positional relation, respectively.

input patterns. Using fuzzy attributes or multiattribute


relation codes can alleviate this problem [7], [10], [71].
Fig. 7. Network representation of characters with shared radical models.
Second, the description of input patterns in ARGs is not Every path corresponds to a character as shown in the rightmost column.
straightforward because it is often difficult to reliably extract
the primitives and their relationships prior to ARG matching.
probabilistically. HMMs have been used in speech
To solve this problem, the input pattern is represented in
recognition since the 1970s [103] and have been applied
lower-level primitives (usually line segments), which are
to western character recognition since the 1980s. Only in
grouped into higher-level primitives by matching them with
recent years have HMMs been applied to Chinese
reference ARGs [149]. Because a stroke may contain multiple
characters. Generally, left-right HMMs are used to model
line segments, using line segments instead of strokes as the
the sequences of points or line segments for substrokes
primitives of ARGs can facilitate the extraction of primitives
[91], [92], [121], strokes, radicals [48], or whole characters
from input pattern [72], [73].
[117], [143]. Since the character-based HMM is stroke-order
4.1.4 Structured Representation dependent, multiple models are often generated for the
characters with stroke-order variations or large shape
Due to the hierarchical nature of Chinese characters, a
variations [117], [143]. Using stroke-based or substroke-
character pattern can be described in a tree structure, with
based HMMs, the character models can be constructed
the character itself at the top level and the radicals and strokes
hierarchically and the stroke-order variations can be
as low-level primitives. In a structured representation of
represented in a variation network [93].
reference models, the radical models or stroke models are
A constrained ergodic HMM, named path controlled
shared by different characters such that a character model is
HMM (PCHMM), was proposed to overcome the stroke-
constructed dynamically using the constituent radicals and
order variation of online characters [150]. The successor
strokes. This strategy can largely save the storage space of
attribute matrix (SAM) of [58] is similar to ergodic HMMs
model database, considering the fact that hundreds of distinct
since it estimates the transition probabilities between strokes.
radicals are shared by thousands of characters [87].
The character models with shared radical models can be 4.3 Statistical Representation
organized in a lookup table [8], [9], [38], [146], a tree In the statistical recognition approach, we are mainly
structure [11], [75], [76], or a network [47]. Fig. 7 shows an concerned with the representation of input patterns
example of network representation of model database. Via (basically in feature vectors). The model database contains
stroke and radical extraction from the input pattern, the the classification parameters, which can be estimated by
character model fitting the input pattern can be retrieved by standard statistical techniques [22], [25]. We will review the
traversing the tree or network. The structured representa- statistical classification techniques in Section 5.
tion approach can vary in the representation scheme of The feature vector representation of character patterns
radicals: line segments [75], [76], connected stroke codes enables stroke-order and stroke-number free recognition by,
[47], and hierarchies of substroke HMMs [91]. for example, mapping the pattern trajectory into a 2D image
4.2 Statistical-Structural Models and extracting so-called offline features [28]. In this context,
various feature extraction techniques in offline character
In a statistical-structural representation scheme, a character
recognition [32], [128] can be applied to online recognition
model is described in a string, tree, or graph structure, with as well.
the primitives and/or relationships measured probabilisti- The so-called direction feature [144], which is widely used
cally to better model the shape variations of input patterns. in offline character recognition [50], [51], is being used in
In principle, any structural model can be described online recognition [28], [45], [88]. In offline recognition, the
probabilistically by replacing the attributes of primitives skeleton or contour pixels of the character image are classified
and/or relationships with probability density functions into either four (horizontal, vertical, diagonal, and antidia-
(PDFs). The mean and variance of stroke and relationship gonal) or eight directions and stored in respective directional
attributes in [76] are connected to PDF representations. planes. Each directional plane is compressed by grid
Gaussian PDFs have been used for describing the distribu- partitioning or blurring (spatial filtering and sampling) [67].
tions of feature points [20] and stroke attributes [150]. This In [41], we named the resulting features “histogram features,”
direction has not been explored adequately yet. which is motivated by the fact that they describe the number
The Hidden Markov Model (HMM) is a directed graph of occurrences for each direction. For online recognition,
with nodes and between-node transitions measured directional features can be extracted directly from the online
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 203

trajectory [45]. A “direction-change” feature characterizing matching, stroke correspondence, relational matching, and
the temporal information was proposed to enhance the knowledge-based matching. DP matching works on ordered
recognition performance of direction feature [99], [100]. sequences and, hence, is stroke-order dependent. Stroke
correspondence is different from relational matching in that it
does not consider the interstroke relationship. Connected to
5 CHARACTER CLASSIFICATION these approaches, some general strategies are: hierarchical
In this section, we first review the coarse classification matching and deformation methods.
techniques. For fine classification, we categorize the Hierarchical matching can improve the speed of structur-
techniques into three groups: structural matching, prob- al recognition. When stroke codes or radical models are
abilistic matching, and statistical classification. shared by different characters, the classification can be
performed by a decision tree [11] or a network [47]. When the
5.1 Coarse Classification strokes or radicals of input pattern have been identified, the
Coarse classification can be accomplished by class set character recognition is reduced to traversing a path in the
partitioning or dynamic candidate selection. In class set tree/network. However, the accuracy of classification is
partitioning, the character classes are divided into disjoint or limited by the identification of strokes or radicals in the input
overlapping groups. The input pattern is first assigned to a pattern, which is not a trivial task. Therefore, instead of
group or multiple groups and then, in fine classification, the deterministic traversals, measuring the likelihood of paths
input pattern is compared in detail with the classes in the and search with backtrack is helpful to improve the
group(s). In dynamic candidate selection, a matching score recognition performance.
(similarity) is computed between the input pattern and each Deformation techniques are useful to improve the
class and a subset of classes with high scores is selected for matching similarity by deforming the character prototype
detailed classification. The average number of candidates can or the input pattern. Based on the stroke correspondence,
be significantly reduced without loss of precision via selecting the deformation vector field (DVF) between the input
a variable number of candidatesby confidence evaluation [68]. pattern and the prototype can be computed and the
For coarse classification based on class set partitioning, the prototype is iteratively deformed by local affine transforma-
groups of classes are determined in the classifier design stage tion (LAT) to fit the input pattern [131]. A noniterative
using clustering or prior knowledge. Class grouping can be stroke-based affine transformation (SAT) decomposes the
based on overall character structure [66], basic stroke DVF of each stroke incorporating the relationship between
substructure [12], stroke sequence [14], and statistical or successive strokes [134]. In another work, a so-called
neural classification [79]. Partitioning into overlapping parabola transformation was proposed to deform the
groups can reduce the risk of excluding the true class of input character prototype based on attributed string matching of
pattern. feature point sequences [13].
Dynamic candidate selection avoids the training process
of class set partitioning. The character classes are ordered 5.2.1 DP Matching
according to a matching score based on simple structural DP matching finds the ordered correspondence between the
features or statistical features. For instance, the number of symbols (primitives) of two strings with the aim of
strokes or line segments of the input pattern can be used to minimizing the edit (Levinstein) distance. The DP matching
filter out the unlikely classes [7], [65], [132], [133]. For of point sequences is also referred to as dynamic time
efficient filtering, the bounds of stroke number depend on warping (DTW). Attributed string matching refers to the
the character class and writing quality [29]. As to other matching of sequences of attributed primitives. In online
features, the matching score is computed by string match- character recognition, feature points or line segments are
ing [124], peripheral feature matching [19], voting of often taken as the primitives of sequence representation
structural features [126], or feature vector matching [28], [55], [88], [124].
[81], [88], [94], [99], [139]. The matching score of coarse The search space of DP matching is represented in a
classification can also be combined with that of fine rectangular grid with two diagonal corners denoting empty
classification to improve the final accuracy [119], [139]. matching (start) and complete matching (goal), respectively.
In coarse classification by feature vector matching, the In a path from start to goal, the transition between neighbor-
distance measure, such as city block distance and Euclidean ing grid points corresponds to symbol deletion, insertion, or
distance, can be computed very efficiently and the efficiency substitution. A generalization of attributed string matching
can be further improved by dimensionality reduction and can merge multiple primitives in one string to match with one
combining class-specific features [81]. In structural matching, primitive in another string [123]. By imposing constraints
candidate classes can also be selected via radical detection onto potential grid transitions, the search speed can be largely
[16], [60], [76]. A detected radical excludes all the classes not improved with little loss of accuracy (e.g., [86], [88]).
containing the radical from fine classification. The radical DP matching is a mature technique, but the performance
detection approach is tightly connected to structural match- of recognition depends strongly on the selection of
ing and will be addressed later. primitives and the definition of the between-primitive
distance measure. For dealing with stroke-order variations,
5.2 Structural Matching a character class needs multiple prototypes.
In fine classification by structural matching, the input pattern
is matched with the structural model of each (candidate) class 5.2.2 Stroke Correspondence
and the class with the minimum matching distance is taken as Based on the stroke correspondence between the input
the recognition result. We divide the structural matching pattern and a character prototype, the character matching
methods into four categories: DP (dynamic programming) distance is computed as the sum of between-stroke
204 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

distances. The correspondence can be found by reordering of strokes [8], [9], for example. A knowledge-based system
the input strokes or prototype strokes using domain specific was also proposed for discriminating similar characters
rules [31], [74], [145]. The rules include, e.g., the precedence which were output as candidate classes by a base
orders of different types of strokes. The compilation of recognition system [16], [23].
rules, however, is laborious. We regard the deviation-expansion (D-E) model of [16],
A simple noniterative technique, such as the interstroke [65] as a kind of prior knowledge because it prespecifies the
distance matrix (ISDM) [96], [132] and the like [60], works ranges of stroke-order variations and stroke-number varia-
well but does not necessarily find one-to-one correspon- tions. The variations of a character class are expanded in a
dence. Wakahara et al. proposed a heuristic one-to-one D-E tree and the optimal matching with an input pattern is
stroke correspondence algorithm and an advanced selective found with DP search [65] or A* search [16].
linkage method to cut connected input strokes and allow For constructing knowledge-based systems, the acquisi-
matching with multiple prototype strokes [133]. This tion and organization of knowledge bases is not trivial and is
approach deals with both stroke-order variation and sometimes laborious. Nevertheless, using simple heuristics
stroke-number variation. to assist search-based matching is beneficial. Some heur-
Stroke correspondence was also solved by DP search in a istics, such as the stroke-order statistics, can be obtained
hypercube and stroke connection was resolved by pairing from character samples [93], [113]. On the other hand,
an input stroke with multiple prototype strokes in multi- building special modules or heuristic rules for discriminat-
layer cube search [112]. Interstroke relationships and the ing similar characters is helpful to improve the overall
prior knowledge of stroke order variations can be incorpo- recognition accuracy.
rated to improve the matching performance [113].
5.3 Probabilistic Matching
5.2.3 Relational Matching Using probabilistic attributes in representing structural
Relational matching is the search for a correspondence models and computing matching distance helps improve
between two sets of elements under the constraint of the tolerance of shape deformations. An example is to use
relationship. It can be formulated as a consistent labeling stroke-type probability table in calculating the between-
problem and be solved using heuristic search [95], [135] or stroke distance [106], while, by modeling the prototype
relaxation labeling [30], [36]. Relaxation labeling is compu- strokes as Gaussian density functions, the matching score of
tationally efficient, while heuristic search is flexible in terms an input pattern and a character model becomes the joint
of incorporating various knowledge sources and constraints. probability of constituent strokes [20], [150], which is
A well-known heuristic search algorithm, the A* search, computed by grouping the feature points or line segments
has been used for ARG (attributed relational graph) of the input pattern into strokes by attributed string
matching in [122], [136] and for OLCCR in [71], [72], [150]. matching or heuristic search.
The graph matching was also accomplished by finding the In HMM-based recognition, the task of recognition is to
maximal cliques in the association graph [7], [10]. Relaxa- decode the observation sequence (points or line segments)
tion labeling was used for graph matching of OLCCR in into the most probable state sequence. This is usually
[149]. Utilizing the local invariance of stroke-order within a accomplished by using a dynamic programming procedure
radical, a method combines DP and relaxation for radical called Viterbi decoding [103], as practiced in [117], [143]
detection and residual matching, respectively [139]. Rela- when representing character models holistically as HMMs.
tional matching was also formulated as an assignment By representing radicals and interradical ligatures in
problem (AP) with relational constraints [35] or a two-layer HMMs and sharing them for all character classes in a network
AP with a layer computing between-stroke distance [73]. [48], the input sequence (8-direction codes) can be decoded
Though the AP can be efficiently solved using the into the concatenation of radical and ligature HMMs using
Hungarian method [101], the incorporation of relational the level building algorithm [84]. In recognition based on
constraints is not so flexible as in heuristic search. substroke HMMs, the recognition is also performed by search
As DP matching and stroke correspondence do, rela- through an interconnected network [91], [92], [121]. In
tional matching can be applied to the matching of either a matching with a character model represented by path-
holistic pattern or a radical. The advantage of relational controlled HMM (PCHMM), the line segment sequence of
matching over DP matching lies in the stroke-order the input pattern is reordered to correspond to the optimal
independence, while the advantage over stroke correspon- state sequence using A* search [150].
dence is that the constraint of relationship improves the
accuracy of matching. Relational matching is more compu- 5.4 Statistical Classification
tationally expensive than both DP matching and stroke Various statistical techniques [43] are applicable for
correspondence, however. classification when describing the input pattern as a feature
vector. Unlike that in coarse classification, simple classifiers
5.2.4 Knowledge-Based Matching are used to achieve high-speed; fine classification usually
The prior knowledge of character structure and writing employs sophisticated classifiers to achieve high accuracy.
appears as heuristic rules or constraints. The constraints Kawamura et al. [45] achieved a fairly high recognition
can be used to efficiently reduce the search space of accuracy in OLCCR using a multiple similarity measure [37],
structural matching [71], [72], [113], [139], while rule-based which is similar to the subspace method [97] in that each class
methods have been used for stroke reordering [31], [74], is represented as a linear subspace. The multiple similarity
radical detection [75], [76], and character matching [8], [9], method, the subspace method, and the modified quadratic
etc. The rules represent the knowledge of basic strokes discriminant function (MQDF) [50] are quadratic classifiers.
allowed for a character and the invariant geometric features The MQDF is a smoothed version of QDF, obtained under the
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 205

assumption of multivariate Gaussian density for each class partitioning the primitives of learning patterns is necessary
[25]. The quadratic classifiers yield high accuracies, but are for HMM learning. This can generally be accomplished by
expensive regarding storage and computation. On the other level building Viterbi decoding [84]. Currently, some
hand, a simple metric, like the Euclidean distance between research works use manually partitioned primitives in
the input feature vector and a prototype vector, gives fast HMM learning [48], [91].
recognition. Designing multiple prototypes by clustering for
each class can improve the accuracy, whereas the prototype 6.3 Multiprototype Learning
learning by learning vector quantization (LVQ) [56], [69], Clustering techniques have been used to design multiple
[125] leads to significant improvement. stroke prototypes and character prototypes from learning
Closely related to statistical techniques are neural net- patterns. If the strokes or characters are represented as
works. For large category set classification, divide-and- feature vectors, the clustering can be accomplished by well-
conquer strategies, i.e., partitioning the category set into known statistical clustering algorithms like the k-means
groups of classes, followed by an appropriate organization of algorithm. Statistical clustering has been used to design
multiple neural networks, may yield high performance. There stroke prototypes in [142]. The clustering of structural
have not been many works of OLCCR using neural networks. patterns, however, must be based on the structural
A successful example was reported by Matic et al. [79]. matching between patterns.
For stroke-order dependent recognition methods, multi-
ple prototypes per class are necessary to absorb stroke-
6 MODEL LEARNING AND ADAPTATION
order variations. In a simple scheme, each class initially has
The quality of the model database influences the recognition a single prototype and, when the matching distance
performance. For statistical or neural classification, the between a learning pattern and the prototype exceeds a
database contains parameters (prototype vectors, subspace threshold, a new prototype is added [55], [143]. Iterative
vectors, connecting weights, etc.), which can be estimated clustering analogous to k-means has been used to learn
from samples using well-known estimation and learning multiple structural prototypes for alphabetic characters [59],
techniques [22], [25], [43]. For structural matching, the [130] and Chinese characters [1]. In [1], each class initially
database contains the structural character/radical models. has one prototype and the number of prototypes is adjusted
The learning of structural models from samples is not trivial in the clustering process. Another approach designs
because the learning patterns of a class have a different prototypes by selection from samples according to the
number of primitives and the primitives do not correspond. proximity between the samples of a class [106].
It is noteworthy that many previous works avoided the Learning vector quantization (LVQ) [56], primarily for
problem of model learning. Instead, they built the structural feature vectors, was generalized to adjust structural proto-
models manually using prior knowledge (e.g., [7], [16], [65], types and applied to alphabetic recognition [59], [130] and
[76], [73], [139]) or used carefully written character patterns Chinese character recognition [1]. For LVQ of structural
as prototypes (e.g., [132], [133]). Usually, model learning patterns, an elastic matching algorithm is embedded to
proceeds by iteratively adjusting the parameters of the correspond the primitives of the prototype and the learning
structural models on matching the learning patterns with
pattern. Based on the deformation vectors between the
the models. The adjusted models can give higher recogni-
corresponding primitives of prototype and learning pattern,
tion accuracy than the initial, manually built models.
the prototype is drawn either toward the learning pattern or
6.1 Mean Prototype Learning apart from the learning pattern with the aim of reducing the
Usually, a “mean” prototype of the learning patterns for each number of misclassifications. Since LVQ adjusts prototypes
class gives good recognition performance. Based on an initial discriminatively, it can give higher classification accuracy
structural model and the correspondence between this model than clustering.
and every learning pattern, the means and variances (or 6.4 Structured Learning
PDFs) of the structural attributes can be computed and the
In structured model database, the common radical models
mean prototype can be updated. The mean prototype
(described in mean structural attributes) is updated itera- shared by different classes can be generated and adjusted
tively until the mean attributes converge. This iterative according to the learning patterns that contain this radical.
procedure is a generalized version of the EM (expectation- However, the automated learning of radical models has not
maximization) algorithm [21]. It has been used to learn the been adequately addressed.
mean prototypes of point sequences [133], feature points [18], In an interactive learning approach, radical prototypes
FARG attributes [149], and the parameters of PCHMM [150]. are generated incrementally upon request when a part of
the input pattern or the pattern as a whole mismatch the
6.2 HMM Learning prototypes [87]. An approach uses LVQ to adjust the radical
In HMM learning, the parameters (initial probabilities, prototypes discriminatively [53]. It describes radical proto-
transition probabilities, and emission probabilities) are gen- types as line segments spanned in a square box. When
erally estimated on learning patterns using an EM procedure constructing a character model, the constituent radical
called Baum-Welch algorithm [103]. If the states can be prototypes are rescaled to the actual sizes and aspect ratios.
partitioned artificially or have explicit physical meanings, the On matching the character model with an input pattern, the
probabilities can be calculated by counting the frequencies of deformation vectors of a constituent radical are scaled back
events, such as for the discrete HMMs of [143]. to square box and used to adjust the radical prototype. It
In the situation that an HMM represents a primitive was shown that the adjusted radical prototypes outperform
(stroke, substroke, etc.) instead of a holistic pattern, the mean radical prototypes [53].
206 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

Fig. 8. A segmentation-recognition candidate network. The thick line denotes the most plausible segmentation path.

6.5 Writer Adaptation handwritten text recognition, the integration of segmentation


For writer-dependent recognition, the model database is and recognition in contextual processing has particular
initially designed on the samples of multiple writers (this is importance because, often, the characters cannot be segmen-
to overcome the insufficient amount of writer-specific ted unambiguously prior to recognition [24], [83].
samples) and then adapted to the patterns of a specific The linguistic processing of character recognition results
user. Writer adaptation improves the recognition accuracy after segmentation is usually referred to as postprocessing.
for the user, particularly when the user’s writing style is Based on the candidate classes given by the character
novel or the writing environment changes [52]. recognizer, additional candidates can be added according
For alphanumeric recognition, an adaptation method to the statistics of confusion between characters in order to
includes three adaptation modes: adding new prototypes, reduce the risk of excluding the true class [140]. The selection
deactivating confusing prototypes, and reshaping existing of final classes from candidate classes is based on the
prototypes [59], [130]. The classification is performed using linguistic knowledge represented in word dictionaries,
the k-nearest neighbor (k-NN) rule on DTW (dynamic time character-based n-grams [64], [88], [115], or word-based
warping). The input pattern is added into the prototype set n-grams [39], [118], [137], [140]. The word-based n-gram is
when one of the k-NN prototypes belongs to wrong class. If generally based on the syntactic/semantic classes (e.g., parts
two prototypes are close to the input pattern but belong to of speech) of words. Its use in linguistic processing involves
different classes, they are deactivated. Prototype reshaping the segmentation of text into words, usually by morphologi-
works in a similar way to LVQ. It takes place when the cal analysis using a lexicon [118], [137]. The adaptation of
input pattern is slightly different from the nearest proto- writer-specific linguistic dictionaries is beneficial for writer-
type. A similar method was used in online Japanese dependent recognition [40]. Linguistic postprocessing can
character recognition with special strategies to control the significantly improve the accuracy of handwritten text
adding, deletion, and reshaping of prototypes [2]. recognition, e.g., halved character-level error rates were
A simple interactive adaptation method shows promise in reported in OLCCR [39], [140].
online Japanese character recognition in [147]. When the The ambiguities in segmentation are generally solved by
recognition of an input pattern gives multiple candidate
generating candidate character patterns and verifying the
classes and the best candidate class is not correct, the input
candidate patterns using geometric features, recognition
pattern is added as a prototype or an existing prototype is
results, and linguistic knowledge. The candidate character
replaced with the average of the input pattern and the
patterns can be represented in a candidate network whose
prototype.
edges denote combinations of segments that constitute the
The adaptation of HMMs has been widely addressed in
the speech recognition literature. In the adaptation of candidate patterns [83]. Each candidate pattern is assigned
substroke HMMs for OLCCR, the mean vectors of emission several candidate classes, together with their scores. Each
probability densities are adjusted based on the segmenta- path in the network represents a segmentation of the hand-
tion of HMM states [91]. written input and its recognition result. The most plausible
path can be found by DP search. Linguistic knowledge is
either used to verify the paths [34], [83] or numerically
7 CONTEXTUAL PROCESSING incorporated into the path scores [26], [98], [110], [111].
Most efforts in OLCCR research have dealt with feature Fig. 8 shows an example of candidate segmentation-
extraction and classification/matching of character patterns. recognition network. On assigning candidate classes to each
Typically, the classification/matching module provides not candidate pattern, the edge of the candidate pattern should be
only a unique class index, but also the scores (probabilities, split into multiple edges corresponding to multiple candidate
similarities, or distances) to a number of candidate classes. classes. The path with the minimum cost or maximum
The linguistic context can provide valuable information for likelihood defines the result, i.e., the optimal segmentation
selecting the optimal class from this set of candidates. In with the pattern classes. The likelihood of a path integrates
addition, the geometric features of character patterns (size, both the probability that a candidate pattern is indeed a
location, aspect ratio, etc.) are useful to segment a hand- character and the compatibility of the recognized character
writing sequence into single characters. For page-based string with the language model [26], [110], [111].
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 207

TABLE 1
Specifications of Kuchibue and Nakayosi Databases

used in recognition experiments. The Nakayosi database


covers a larger category set than the Kuchibue database.

8.2 Comparison of Results


The performance of a recognition system is measured in
terms of recognition accuracy, memory requirement, and
recognition speed. In general, statistical methods offer high
speed under large memory requirements, while structural
methods (including statistical-structural models) have low-
Fig. 9. Online patterns of TUAT Kuchibue database. er speed but smaller memory. The large memory of
statistical methods is due to the high dimensionality of
8 PERFORMANCE EVALUATION feature representation and, particularly, the class-specific
subspaces of quadratic classifiers. Structural methods are
Comparing the performance of different recognition sys- computationally expensive because the matching of char-
tems is not easy because the systems are usually optimized acter structures is a combinatorial problem.
for different applications and are thus adapted to different Table 2 lists some recognition results (experimented with
character sets and writing styles. However, in the experi-
at least 1,000 categories, without linguistic processing)
mental stage, the settings of recognizers are controllable and
reported since 1990. The recognition methods are roughly
the performance can be compared by testing on a common
classified into statistical (statis) or structural (struct) methods.
sample set. We introduce online sample databases available
HMM-based methods and some special structural methods
for system design and performance evaluation and then
are singled out. The method of Zheng et al. [150] unifies
compare recognition results reported in the literature.
statistical stroke models and PCHMMs. Velek et al. combined
8.1 Databases four (statistical) offline recognizers and three (structural)
Several recent research works promote the sharing of online recognizers [129]. Table 2 also gives the test writing
samples, such as the UNIPEN project [27] and the TUAT style, the numbers of learning samples and test samples of
(Tokyo University of Agriculture and Technology) databases each recognizer, if available.
[89]. The online sample databases of TUAT are gaining The recognition rates in Table 2 reflect the combined
popularity in Japan for system design and evaluation [42], performance of pattern representation, learning, and classi-
[80], [89]. Two TUAT databases are available: The Nakayosi fication. The recognition accuracy depends on the number
database is usually used for classifier learning, while the of learning samples, especially for statistical methods. Some
Kuchibue database is used for evaluation. Many researchers structural methods use a carefully written character proto-
have reported results on the Kuchibue database [1], [53], [88], type or a manually built model for each class, yet achieve
[99], [119], [143]. high accuracy by knowledge-based or flexible matching.
In the TUAT databases, most patterns were collected by Both statistical and structural methods can benefit from
writing natural sentences taken from a Japanese newspaper. learning with large sample sets. Some successful examples
The writing style was not constrained so that most of the are Kawamura et al. [45], Wakahara et al. [133], [134],
characters were written fluently. Some people habitually
Rowley et al. [106], and Velek et al. [129].
write in regular style, though. The samples of both databases
The comparison between statistical methods and struc-
were collected using two types of digitizing tablets (WACOM
tural methods is a debate in the character recognition
and MUTOH). For either database, the writing box on display
community. The selection of classifier should take into
was set to 60  60 dots, corresponding to physical size of 1:7 
account the trade-off between the recognition accuracy and
1:7cm2 (WACOM) or 1:43  1:43cm2 (MUTOH). The samples
the model database size. Statistical classification can yield
of Kuchibue were recorded in display coordinates while
those of Nakayosi were recorded in tablet coordinates, with very high accuracy if a sophisticated classifier is trained
the resolution 13 times higher (WACOM) or 9.45 times higher with large number of samples. For instance, among the
(MUTOH) than the display coordinates. Fig. 9 shows some classifiers of Velek et al. [129], a statistical classifier gives
typical sample patterns of the Kuchibue database. higher accuracy than the structural ones, but its model
The specifications of both TUAT databases are given in database consumes over 30MB memory. On the other hand,
Table 1. The Kuchibue database contains the patterns of a structural classifier can give fairly high accuracy with very
120 writers: 11,962 patterns per writer covering 3,356 cate- small storage, e.g., the model database of Akiyama et al. [2]
gories. Excluding the JIS level-2 Kanji characters, there are consumes only 166KB memory. Structural methods have
11,951 patterns for 3,345 categories (including 2,965 Kanji potential to give higher accuracies relying on probabilistic
characters and 380 non-Kanji symbols), which are frequently structure modeling and learning with samples.
208 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

TABLE 2
Recognition Results Reported in the Literature

Our purpose in collecting recognition rates from the significantly improve the recognition performance. The
literature is both to investigate the status of performance significant improvement relies on the integration of multiple
and to compare the recognition methods. We did not collect approaches and the joint effects of all processing steps. In
the recognition rates of commercial products in the market the following, we will discuss the research directions in
considering that the advertised rates are estimated in respect to pattern representation, classification, learning/
different environments and their recognition methods are adaptation, and contextual processing, respectively.
rarely publicized. We believe it is currently possible to In the area of representation, both the feature vector
achieve a very high recognition rate, above 98 percent on representation and the traditional structural representation
regular scripts. However, on fluent or fluent-regular scripts, have apparent insufficiencies which can be overcome by the
it is difficult to achieve a recognition rate above 90 percent hybrid statistical-structural representation. In OLCCR, only
(see the results on Kuchibue database). Though linguistic a few works have modeled the PDFs of stroke attributes
processing can largely reduce the error rate, the remaining [20], [150]. Recently, modeling both strokes and interstroke
recognition errors still bring high inconvenience to the user.
relationships probabilistically has been tried in online
Assuming linguistic processing resolves half of recognition
numeral recognition [15] and offline Chinese character
errors and our target of final correct rate is 99 percent (even
recognition [49]. A hybrid model can also describe
this rate is not enough), then the character recognizer
characters structurally with statistical radical/stroke mod-
should provide an accuracy above 98 percent. For free
handwriting recognition, it is very hard to reach this target. els (HMMs, PDFs, or discriminant functions) as primitives.
Global feature vector representation schemes can also be
improved by informative feature extraction, automatic
9 FUTURE DIRECTIONS feature transformation and selection. Currently, the so
The gap between the technical status and the required simple direction feature (histogram feature) performs fairly
performance indicates that the problem of OLCCR is not well, but we are sure that it can be surpassed. The feature
solved yet and it leaves us research opportunities. To reach transformation and selection, pertaining to statistical
the goal of totally free handwriting recognition, we should pattern recognition [43], is effective to improve the
seriously reconsider the methods and find ways to classification performance.
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 209

Improvements in classification are possible with both REFERENCES


statistical classification and structural matching approaches [1] K. Akiyama and K. Ishigaki, “A Method of Generating High
and, particularly, the integration of multiple classifiers. In Quality Online Character Recognition Dictionary Based on
desktop applications, sophisticated statistical classifiers can Training Samples,” Technical Report of IEICE, PRMU99-235
yield high recognition accuracies and the accuracy can be (2000-02), (in Japanese).
[2] K. Akiyama et al., “An Adaptation Method Based on Template
further improved by discriminative learning (like the LVQ). Cache for Online Character Recognition,” Technical Report of
The recognition accuracy of structural matching can be IEICE, PRMU2000-210 (2001-03), (in Japanese).
improved by modeling the structural models probabilisti- [3] D. Blostein, “General Diagram-Recognition Methodologies,”
cally. The computation cost of structural matching can be Graphics Recognition Methods and Applications, R. Kasturi and
K. Tombre eds. pp. 107-122, Springer, 1995.
lowered by exploring a variety of prior knowledge, such as [4] R.G. Casey, “Moment Normalization of Handprinted Character,”
the partial invariance of stroke-order and stroke connection. IBM J. Research Development, vol. 14, pp. 548-557, 1970.
Multiple classifiers can be combined in cascade or in parallel [5] K.-F. Chan and D.-Y. Yeung, “Error Detection, Error Correction
to improve the speed and/or accuracy. The cascaded and Performance Evaluation in On-Line Mathematical Expres-
sion Recognition,” Pattern Recognition, vol. 34, no. 8, pp. 1671-
combination can not only speed up the recognition, but also 1684, 2001.
improve the accuracy by employing sophisticated classifiers [6] K.P. Chan and Y.-S. Cheung, “Fuzzy-Attribute Graph with
(such as neural networks, support vector machines, and rule- Application to Chinese Character Recognition,” IEEE Trans.
based methods) for discriminating similar characters. The Systems Man, and Cybernetics, vol. 22, no. 1, pp. 153-160, 1992,
[7] J.W. Chen and S.Y. Lee, “On-Line Handwritten Chinese Character
parallel combination of multiple complementary classifiers
Recognition via a Fuzzy Attribute Representation,” Image and
can improve the classification accuracy [33], [54] and its Vision Computing, vol. 12, no. 10, pp. 669-681, 1994.
promise in OLCCR has been justified in [119], [129]. [8] J.-W. Chen and S.-Y. Lee, “A Hierarchical Representation for the
The recognition performance depends on the quality of the Reference Database of On-Line Chinese Character Recognition,”
model database and can benefit from learning with large Advances in Syntactic and Structural Pattern Recognition, P. Perner,
P. Wang, and A. Rosenfeld, eds., pp. 351-400, Springer, 1996.
sample sets. The potential high performance of structural [9] J.-W. Chen and S.-Y. Lee, “On-Line Handwriting Recognition of
methods has been hindered by the lack of efficient learning Chinese Characters via Rule-Based Approach,” Proc. 13th Int’l
methods. The adjustment of structural parameters starting Conf. Pattern Recognition, vol. 3, pp. 220-224, 1996.
from manually built models and the structural prototype [10] J.W. Chen and S.Y. Lee, “On-Line Chinese Character Recognition
via a Representation of Spatial Relationships between Strokes,”
selection from samples have been attempted, but the Int’l J. Pattern Recognition and Artificial Intelligence, vol. 11, no. 3,
automatic construction of structural models has not been pp. 329-357, 1997.
reported in the literature of OLCCR. The discriminative [11] K.J. Chen, K.K. Li, and Y.L. Chang, “A System for On-Line
learning of structural parameters is able to yield higher Recognition of Chinese Characters,” Int’l J. Pattern Recognition and
Artificial Intelligence, vol. 2, pp. 139-148, 1988.
recognition accuracy than the EM-like learning, but has not [12] R.-H. Chen, C.-W. Lee, and Z. Chen, “Preclassification of Hand-
been widely adopted. The task of learning also lies in the written Chinese Characters Based on Basic Stroke Substructures,”
acquisition of knowledge base for rule-based classification or Proc. Fourth Int’l Workshop Frontiers in Handwriting Recognition,
for assisting structural matching, which was often accom- pp. 176-184, 1994.
[13] W.-T. Chen and T.-R. Chou, “A Hierarchical Deformation Model
plished manually. For the adaptation of writer-dependent for On-Line Cursive Script Recognition,” Pattern Recognition,
models, the previous works have used ad hoc methods to vol. 27, no. 2, pp. 205-219, 1994.
add, delete, or adjust structural prototypes. This problem [14] Z. Chen, C.-W. Lee, and R.-H. Cheng, “Handwritten Chinese
needs to be considered more closely and principled methods Character Analysis and Preclassification Using Stroke Structural
Sequence,” Proc. 13th Int’l Conf. Pattern Recognition, vol. 3, pp. 89-93,
are hoped to better deal with it. 1996.
With regard to the contextual processing of OLCCR, [15] S.J. Cho and J.H. Kim, “Bayesian Network Modeling of Strokes and
layout analysis and text line extraction techniques are to be Their Relationships for On-Line Handwriting Recognition,” Proc.
developed to meet with the needs of page-based input. The Sixth Int’l Conf. Document Analysis and Recognition, pp. 86-90, 2001.
[16] K.-S. Chou et al, “Knowledge Model Based Approach in
performance of integrated segmentation-recognition can be Recognition of On-Line Chinese Characters,” IEEE J. Selected Areas
improved by evaluating the paths in the candidate network Comm., vol. 12, no. 9, pp. 1566-1575, 1994.
probabilistically, rather than heuristically. This requires the [17] K.-S. Chou, K.-C. Fan, and C.-K. Lin, “A Knowledge Based
probabilistic representation of linguistic knowledge, the Approach to the Recognition of On-Line Confusing Chinese
Characters,” Proc. Fourth Int’l Workshop Frontiers in Handwriting
transformation of geometric features and character recogni- Recognition, pp. 185-194, 1994.
tion scores into probabilities. Another contextual feature, [18] K-S. Chou, K.-C. Fan, and T.-I. Fan, “Radical-Based Neighboring
called style consistency, as has been utilized in numeral Segment Matching Method for On-Line Chinese Character
recognition [46], [108], is helpful to reduce errors in writer- Recognition,” Proc. 13th Int’l Conf. Pattern Recognition, vol. 3,
pp. 84-88, 1996.
independent recognition. In a group of character patterns
[19] K.-S. Chou, K.-C. Fan, and T.-I. Fan, “Peripheral and Global
given by the same writer, the correlation between the Features for Use in Coarse Classification of Chinese Characters,”
character shapes can be explored to clear the confusions Pattern Recognition, vol. 30, no. 3, pp. 483-489, 1997.
such as similar shapes given by different writers belonging [20] T.-R. Chou and W.T. Chen, “A Stochastic Representation of
Cursive Chinese Characters for On-Line Recognition,” Pattern
to different classes. The style consistency can be utilized in
Recognition, vol. 30, no. 6, pp. 903-920, 1997.
the classifier design or postprocessing stage. [21] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Like-
lihood from Incomplete Data via the EM Algorithm,” J. Royal
Statistical Soc. Series B., vol. 39, no. 1, pp. 1-38, 1977.
ACKNOWLEDGMENTS [22] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, second
ed. Wiley Interscience, 2001.
The authors are grateful to the anonymous reviewers for [23] K.-C. Fan, C.-K. Lin, and K.-S. Chou, “Confusion Set Recognition
their invaluable comments and suggestions, which led to of On-Line Character Recognition by Artificial Intelligence
improvements to the presentation quality of this paper. Techniques,” Pattern Recognition, vol. 28, no. 3, pp. 303-313, 1995.
210 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

[24] T. Fujisaki, T.E. Chefalas, J. Kim, C.C. Tappert, and C.G. Wolf, [48] H.J. Kim, K.H. Kim, S.K. Kim, and F.T.-P. Lee, “On-Line Recognition
“On-Line Run-On Character Recognizer: Design and Perfor- of Handwritten Chinese Characters Based on Hidden Markov
mance,” Character and Handwriting Recognition: Expanding Fron- Models,” Pattern Recognition, vol. 30, no. 9, pp. 1489-1499, 1997.
tiers, P.S.P. Wang ed., pp. 123-136, World Scientific, 1992. [49] I.-J. Kim and J.H. Kim, “Statistical Character Structural Modeling
[25] K. Fukunaga, Introduction to Statistical Pattern Recognition, second and Its Application to Handwritten Chinese Character Recogni-
ed. Academic Press, 1990. tion,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25,
[26] T. Fukushima and M. Nakagawa, “On-Line Writing-Box-Free no. 11, pp. 1422-1436, 2003.
Recognition of Handwritten Japanese Text Considering Character [50] F. Kimura, K. Takashina, S. Tsuruoka, and Y. Miyake, “Modified
Size Variations,” Proc. 15th Int’l Conf. Pattern Recognition, vol. 2, Quadratic Discriminant Functions and the Application to Chinese
pp. 359-363, 2000. Character Recognition,” IEEE Trans. Pattern Analysis and Machine
[27] I. Guyon, L. Schomaker, R. Plamondon, M. Liberman, and S. Janet, Intelligence, vol. 9, no. 1, pp. 149-153, Jan. 1987.
“UNIPEN Project of On-Line Data Exchange and Recognizer [51] F. Kimura, T. Wakabayashi, S. Tsuruoka, and Y. Miyake,
Benchmarks,” Proc. 12th Int’l Conf. Pattern Recognition, vol. 2, “Improvement of Handwritten Japanese Character Recognition
pp. 29-33, 1994. Using Weighted Direction Code Histogram,” Pattern Recognition,
[28] M. Hamanaka, K. Yamada, and J. Tsukumo, “On-Line Japanese vol. 30, no. 8, pp. 1329-1337, 1997.
Character Recognition Experiments by an Off-Line Method Based [52] Y. Kimura, K. Odaka, A. Suzuki, and M. Sano, “Analysis and
on Normalization-Cooperated Feature Extraction,” Proc. Third Int’l Evaluation of Dictionary Learning on Handy Type Pen-Input
Conf. Document Analysis and Recognition, pp. 204-207, 1993. Interface for Personal Use,” Trans. IEICE Japan, vol. J84-D-II, no. 3,
[29] M. Hamanaka and K. Yamada, “On-Line Character Recognition pp. 509-518, 2001.
Adaptively Controlled by Handwriting Quality,” Proc. Seventh [53] A. Kitadai and M. Nakagawa, “A Learning Algorithm for Structural
Int’l Workshop Frontiers in Handwriting Recognition, pp. 23-32, 2000. Character Pattern Representation Used in On-Line Recognition of
[30] R.M. Haralick, “The Consistent Labeling Problem: Part I,” IEEE Handwritten Japanese Characters,” Proc. Eighth Int’l Workshop
Trans. Pattern Analysis and Machine Intelligence, vol. 1, no. 2, Frontiers in Handwriting Recognition, pp. 163-168, 2002.
pp. 173-184, 1979. [54] J. Kittler, “On Combining Classifiers,” IEEE Trans. Pattern Analysis
[31] Y. Hidai, K. Ooi, and Y. Nakamura, “Stroke Re-Order Algorithm and Machine Intelligence, vol. 20, no. 3, pp. 226-239, Mar. 1998.
for On-Line Hand-Printed Character Recognition,” Proc. Eighth [55] M. Kobayashi et al., “RAV (Reparameterized Angular Variations)
Int’l Conf. Pattern Recognition, vol. 2, pp. 934-936, 1986. Algorithms for Online Handwriting Recognition,” Int’l J. Docu-
[32] T.H. Hilderbrand and W. Liu, “Optical Recognition of Chinese ment Analysis and Recognition, vol. 3, no. 3, pp. 181-191, 2001.
Characters: Advances since 1980,” Pattern Recognition, vol. 26, [56] T. Kohonen, “The Self-Organizing Map,” Proc. IEEE, vol. 78, no. 9,
no. 2, pp. 205-225, 1993. pp. 1464-1480, 1990.
[33] T.K. Ho, J.J. Hull, and S.N. Srihari, “Decision Combination in [57] Y. Komiya, T. Ohishi, T. Matsumoto, “A Pen Input On-Line
Multiple Classifier Systems,” IEEE Trans. Pattern Analysis and Signature Verifier Integrating Position, Pressure and Indication
Machine Intelligence, vol. 16, no. 1, pp. 66-75, Jan. 1994. Trajectories,” IEICE Trans. Information and Systems, vol. E84-D,
[34] C. Hong, G. Loudon, Y. Wu, and R. Zitserman, “Segmentation and no. 7, pp. 833-838, 2001.
Recognition of Continuous Handwriting Chinese Text,” Proc. Int’l [58] K. Kuroda, K. Harada, and M. Hagiwara, “Large Scale On-Line
Conf. Computer Processing of Oriental Languages, pp. 630-633, 1997. Handwritten Chinese Character Recognition Using Successor
[35] A.J. Hsieh, K.-C. Fan, and T.-I. Fan, “Bipartite Weighted Matching Method Based on Stochastic Regular Grammar,” Pattern Recogni-
for On-Line Handwritten Chinese Character Recognition,” Pattern tion, vol. 32, no. 8, pp. 307-1315, 1999.
Recognition, vol. 28, no. 2, pp. 143-151, 1995. [59] J. Laaksonen, V. Vuori, and E. Oja, “Adaptation of Prototype Sets
[36] R.T. Hummel and S.W. Zucker, “On the Foundations of Relaxa- in On-Line Recognition of Isolated Handwritten Latin Charac-
tion Labeling Processes,” IEEE Trans. Pattern Analysis and Machine ters,” Advances in Handwriting Recognition, S.-W. Lee ed., pp. 489-
Intelligence, vol. 5, no. 3, pp. 267-287, 1983. 497, World Scientific, 1999.
[37] T. Iijima, H. Genchi, and K. Mori, “A Theory of Character [60] S.-R. Lay et al., “On-Line Chinese Character Recognition with
Recognition by Pattern Matching Method,” Proc. First Int’l Joint Effective Candidate Radical and Candidate Character Selection,”
Conf. Pattern Recognition, pp. 50-56, 1973. Pattern Recognition, vol. 29, no. 10, pp. 1647-1659, 1996.
[38] K. Ishigaki and T. Morishita, “A Top-Down Online Handwritten [61] F. Leclerc and R. Plamondon, “Automatic Signature Verification:
Character Recognition Method via the Denotation of Variation,” the State of The Art—1989-1993,” Int’l J. Pattern Recognition and
Proc. Int’l Conf. Computer Processing of Chinese and Oriental Artificial Intelligence, vol. 8, no. 3, pp. 643-660, 1994.
Languages, pp. 141-145, 1988. [62] S.-W. Lee and J.-S. Park, “Nonlinear Shape Normalization
[39] N. Itoh, “Japanese Language Model Based on Bigrams and Its Methods for the Recognition of Large-Set Handwritten Charac-
Application to On-Line Character Recognition,” Pattern Recogni- ters,” Pattern Recognition, vol. 27, no. 7, pp. 895-902, 1994.
tion, vol. 28, no. 2, pp. 135-143, 1995. [63] X. Li and N.S. Hall, “Corner Detection and Shape Classification of
[40] N. Iwayama and K. Ishigaki, “Adaptive Context Processing in On- On-Line Handprinted Kanji Strokes,” Pattern Recognition, vol. 26,
Line Handwritten Character Recognition,” Proc. Seventh Int’l no. 9, pp. 1315-1334, 1993.
Workshop Frontiers in Handwriting Recognition, pp. 469-474, 2000. [64] M.-Y. Lin and W.-H. Tsai, “A New Approach to On-Line Chinese
[41] S. Jaeger, C.-L. Liu, and M. Nakagawa, “The State of the Art in Character Recognition by Sentence Contextual Information Using
Japanese On-Line Handwriting Recognition Compared to Tech- the Relaxation Technique,” Proc. Int’l Conf. Computer Processing of
niques in Western Handwriting Recognition,” Int’l J. Document Chinese and Oriental Languages, pp. 131-134, 1988.
Analysis and Recognition, vol. 6, no. 2, pp. 75-88, 2003. [65] C.-K. Lin, K.-C. Fan, and F.T. Lee, “On-Line Recognition by
[42] S. Jaeger and M. Nakagawa, “Two On-Line Japanese Character Deviation-Expansion Model and Dynamic Programming Match-
Databases in UNIPEN Format,” Proc. Sixth Int’l Conf. Document ing,” Pattern Recognition, vol. 26, no. 2, pp. 259-268, 1993.
Analysis and Recognition, pp. 566-570, 2001. [66] T.-Z. Lin and K.-C. Fan, “Coarse Classification of On-Line Chinese
[43] A.K. Jain, R.P.W. Duin, and J. Mao, “Statistical Pattern Recogni- Characters via Structure Feature-Based Method,” Pattern Recogni-
tion: A Review,” IEEE Trans. Pattern Analysis and Machine tion, vol. 17, no. 10, pp. 1365-1377, 1994.
Intelligence, vol. 22, no. 1, pp. 4-37, Jan. 2000. [67] C.-L. Liu, Y.-J. Liu, and R.-W. Dai, “Preprocessing and Statistical/
[44] A.K. Jain, F.D. Griess, and S.D. Connell, “On-Line Signature Structural Feature Extraction for Handwritten Numeral Recogni-
Verification,” Pattern Recognition, vol. 35, no. 12, pp. 2963-2972, 2002. tion,” Progress of Handwriting Recognition, A.C. Downton and
[45] A. Kawamura et al., “On-Line Recognition of Freely Handwritten S. Impedovo eds., pp. 161-168, World Scientific, 1997.
Japanese Characters Using Directional Feature Densities,” Proc. [68] C.-L. Liu and M. Nakagawa, “Precise Candidate Selection for
11th Int’l Conf. Pattern Recognition, vol. 2, pp. 183-186, 1992. Large Character Set Recognition by Confidence Evaluation,” IEEE
[46] T. Kawatani, “Character Recognition Performance Improvement Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 6,
Using Personal Handwriting Characteristics,” Proc. Third Int’l pp. 636-642, June 2000.
Conf. Document Analysis and Recognition, pp. 98-103, 1995. [69] C.-L. Liu and M. Nakagawa, “Evaluation of Prototype Learning
[47] H.J. Kim, J.W. Jung, and S.K. Kim, “On-Line Chinese Character Algorithms for Nearest Neighbor Classifier in Application to
Recognition Using ART-Based Stroke Classification,” Pattern Handwritten Character Recognition,” Pattern Recognition, vol. 34,
Recognition Letters, vol. 17, pp. 1311-1322, 1996. no. 3, pp. 601-615, 2001.
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 211

[70] C.-L. Liu, H. Sako, and H. Fujisawa, “Handwritten Chinese [91] M. Nakai, N. Akira, H. Shimodaira, and S. Sagayama, “Substroke
Character Recognition: Alternatives to Nonlinear Normalization,” Approach to HMM-Based On-Line Kanji Handwriting Recogni-
Proc. Seventh Int’l Conf. Document Analysis and Recognition, pp. 524- tion,” Prof. Sixth Int’l Conf. Document Analysis and Recognition,
528, 2003. pp. 491-495, 2001.
[71] J. Liu, W.-K. Cham, and M.M.Y. Chang, “Online Chinese [92] M. Nakai, T. Sudo, H. Shimodaira, and S. Sagayama, “Pen
Character Recognition Using Attributed Relational Graph Match- Pressure Features for Writer-Independent On-Line Handwriting
ing,” IEE Proc. Vision Image Signal Processing, vol. 143, no. 2, Recognition Based on Substroke HMM,” Proc. 16th Int’l Conf.
pp. 125-131, 1996. Pattern Recognition, vol. 3, pp. 220-223, 2002.
[72] J. Liu, W.-K. Cham, and M.M.Y. Chang, “Stroke Order and Stroke [93] M. Nakai, H. Shimodaira, and S. Sagayama, “Generation of
Number Free On-Line Chinese Character Recognition Using Hierarchical Dictionary for Stroke-Order Free Kanji Handwriting
Attributed Relational Graph Matching,” Proc. 13th Int’l Conf. Recognition Based on Substroke HMM,” Proc. Seventh Int’l Conf.
Pattern Recognition, vol. 3, pp. 259-263, 1996. Document Analysis and Recognition, pp. 514-518, 2003.
[73] J.Z. Liu, K. Ma, W.K. Cham, and M.M.Y. Chang, “Two-Layer [94] H. Nambu, T. Kawamata, F. Maruyama, and F. Yoda, “On-Line
Assignment Method for Online Chinese Character Recognition,” Handwriting Chinese Character Recognition: Comparison and
IEE Proc. Vision Image Signal Processing, vol. 147, no. 1, pp. 47-54, Improvement to Japanese Kanji Recognition,” Proc. 14th Int’l Conf.
2000. Pattern Recognition, vol. 2, pp. 1145-1149, 1998.
[74] Y.J. Liu and J.W. Tai, “Structural Approach to On-Line Chinese [95] N.J. Nillson, Principles of Artificial Intelligence. Springer-Verlag,
Character Recognition,” Proc. Ninth Int’l Conf. Pattern Recognition, 1980.
pp. 808-810, 1988. [96] K. Odaka, T. Wakahara, and I. Masuda, “Stroke Order Free On-
[75] Y.-J. Liu and J.W. Tai, “An On-Line Chinese Character Recogni- Line Character Recognition,” Trans. IECE Japan, vol. J65-D, no. 6,
tion System for Handwritten in Chinese Calligraphy,” From Pixels pp. 679-686, 1982, (in Japanese).
to Features III: Int’l Workshop Frontiers in Handwriting Recognition, [97] E. Oja, Subspace Method of Pattern Recognition. Research Studies
pp. 69-80, 1991. Press, 1983.
[76] Y.-J. Liu, L.-Q. Zhang, and J.W. Tai, “A New Approach to On-Line [98] M. Okamoto, H. Yamamoto, K. Sawada, and K. Yamamoto, “On-
Handwriting Chinese Character Recognition,” Proc. Second Int’l Line Handwriting Character String Separation Method Using
Conf. Document Analysis and Recognition, pp. 192-195, 1993. Network Expression,” Proc. 13th Int’l Conf. Pattern Recognition,
[77] D. Lopresti and G. Wilfong, “Cross-Domain Searching Using vol. 4, pp. 422-425, 1996.
Handwritten Queries,” Proc. Seventh Int’l Workshop Frontiers in [99] M. Okamoto and K. Yamamoto, “On-Line Handwriting Character
Handwriting Recognition, pp. 3-12, 2000. Recognition Using Direction-Change Features that Consider
[78] M.Y. Ma, P.S.P. Wang, D.P. Lopresti, and J. Crisman, “Semantic Imaginary Strokes,” Pattern Recognition, vol. 32, no. 7, pp. 1115-
Matching of Free-Format Chinese Handwriting,” Proc. Int’l Conf. 1128, 1999.
Computer Processing of Oriental Languages, pp. 107-112, 1997. [100] M. Okamoto and K. Yamamoto, “On-Line Handwritten Character
[79] N. Matic, J. Platt, and T. Wang, “QuickStroke: An Incremental On- Recognition Method Using Directional Features and Clockwise/
Line Chinese Handwriting Recognition System,” Proc. 16th Int’l Counter-Clockwise Direction-Change Features,” Proc. Fifth Int’l
Conf. Pattern Recognition, vol. 3, pp. 435-437, 2002. Conf. Document Analysis and Recognition, pp. 491-494, 1999.
[80] K. Matsumoto, T. Fukushima, and M. Nakagawa, “Collection and [101] C.H. Papadimitrious and K. Steiglitz, Combinatorial Optimization—
Analysis of On-Line Handwritten Japanese Character Patterns,” Algorithms and Complexity. Prentice Hall, 1982.
Proc. Sixth Int’l Conf. Document Analysis and Recognition, pp. 496- [102] R. Plamondon and S.N. Srihari, “On-Line and Off-Line Hand-
500, 2001. writing Recognition: A Comprehensive Survey,” IEEE Trans.
[81] K. Matsumoto and M. Nakagawa, “Improvement of a Coarse Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 63-82,
Classification Method for On-Line Recognition of Handwritten Jan. 2000.
Japanese Characters,” Technical Report of IEICE, PRMU2001-273 [103] L.R. Rabiner, “A Tutorial on Hidden Markov Models and Selected
(2002-03), (in Japanese). Applications in Speech Recognition,” Proc. IEEE, vol. 77, no. 2,
[82] S. Müller, E. Eickeler, and G. Rigoll, “Multimedia Database pp. 257-286, 1989.
Retrieval Using Hand-Drawn Sketches,” Proc. Fifth Int’l Conf. [104] U. Ramer, “An Iterative Procedure for the Polygonal Approxima-
Document Analysis and Recognition, pp. 289-292, 1999. tion of Plane Closed Curves,” Computer Graphics and Image
[83] H. Murase, “Online Recognition of Free-Format Japanese Hand- Processing, vol. 1, pp. 244-256, 1972.
writings,” Proc. Ninth Int’l Conf. Pattern Recognition, pp. 1143-1147, [105] A. Rosenfeld and E. Johnston, “Angle Detection on Digital
1988. Curves,” IEEE Trans. Computers, vol. 22, pp. 875-878, 1973.
[84] C.S. Myers and L.R. Rabiner, “A Level Building Dynamic Time [106] H.A. Rowley and M. Goyal, and J. Bennet, “The Effect of Large
Warping Algorithm for Connected Word Recognition,” IEEE Training Set Size on Online Japanese Kanji and English Cursive
Trans. Acoustic, Speech, and Signal Processing, vol. 29, no. 2, Recognizers,” Proc. Eighth Int’l Workshop Frontiers in Handwriting
pp. 284-297, 1981. Recognition, pp. 36-40, 2002.
[85] M. Nakagawa, “Non-Keyboard Input of Japanese Text—On-Line [107] G. Russel, M.P. Perrone, Y. Chee, and A. Ziq, “Handwritten
Recognition of Handwritten Characters as the Most Hopeful Document Retrieval,” Proc. Eighth Int’l Workshop Frontiers in
Approach,” J. Information Processing, vol. 13, no. 1, pp. 15-34, 1990. Handwriting Recognition, pp. 233-238, 2002.
[86] M. Nakagawa and K. Akiyama, “A Linear-Time Elastic Matching [108] P. Sarkar and G. Nagy, “Style Consistency in Isogenous Patterns,”
for Stroke Number Free Recognition of On-Line Handwritten Proc. 15th Int’l Conf. Pattern Recognition, vol. 2, pp. 859-862, 2000.
Characters,” Prof. Fourth Int’l Workshop Frontiers in Handwriting [109] L. Schomaker, L. Vuurpijl, and E. de Leau, “New Use for the Pen:
Recognition, pp. 48-56, 1994. Outline-Based Image Queries,” Proc. Fifth Int’l Conf. Document
[87] M. Nakagawa and L.V. Tu, “Structural Learning of Character Analysis and Recognition, pp. 293-296, 1999.
Patterns for On-Line Recognition of Handwritten Japanese [110] S. Senda, M. Hamanaka, and K. Yamada, “Box-Free Online
Characters,” Advances in Structural and Syntactic Pattern Recogni- Character Recognition Integrating Confidence Values of Segmen-
tion, P. Perner, P. Wang, and A. Rosenfeld, eds., pp. 180-188, tation, Recognition and Language Processing,” Technical Report
Springer-Verlag, 1996. of IEICE, PRMU98-138 (1998-12) (in Japanese).
[88] M. Nakagawa et al., “Robust and Highly Customizable Recogni- [111] S. Senda and K. Yamada, “A Maximum-Likelihood Approach to
tion of On-Line Handwritten Japanese Characters,” Proc. 13th Int’l Segmentation-Based Recognition of Unconstrained Handwriting
Conf. Pattern Recognition, vol. 3, pp. 269-273, 1996. Text,” Proc. Sixth Int’l Conf. Document Analysis and Recognition,
[89] M. Nakagawa et al., “On-Line Handwritten Character Pattern pp. 184-188, 2001.
Database Sampled in a Sequence of Sentences without Any [112] J.-P. Shin and H. Sakoe, “Stroke Correspondence Search Method
Writing Instructions,” Proc. Fourth Int’l Conf. Document Analysis for Stroke-Order and Stroke-Number Free On-Line Character
and Recognition, pp. 376-381, 1997. Recognition—Multilayer Cube Search,” Trans. IEICE Japan,
[90] M. Nakagawa, K. Akiyama, T. Oguni, and N. Kato, “Handwriting- vol. J82-D-II, no. 2, pp. 230-239, 1999.
Based User Interfaces Employing On-Line Handwriting Recogni- [113] J.-P. Shin, “Optimal Stroke-Correspondence Search Method for
tion,” Advances in Handwriting Recognition, S.-W. Lee, ed., pp. 578- On-Line Character Recognition,” Pattern Recognition Letters, vol. 23,
587, World Scientific, 1999. pp. 601-608, 2002.
212 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 26, NO. 2, FEBRUARY 2004

[114] J. Subrahmonia and T. Zimmerman, “Pen Computing: Challenges [137] P.-K. Wong and C. Chan, “Postprocessing Statistical Language
and Applications,” Proc. 15th Int’l Conf. Pattern Recognition, vol. 2, Models for a Handwritten Chinese Character Recognizer,” IEEE
pp. 60-66, 2000. Trans. Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 29,
[115] C.Y. Suen, “N-Gram Statistics for Natural Language Under- no. 2, pp. 286-290, 1999.
standing and Text Processing,” IEEE Trans. Pattern Analysis and [138] W.-Y. Wu and M.-J. Wang, “Detecting the Dominant Points by the
Machine Intelligence, vol. 1, no. 2, pp. 164-172, 1979. Curvature-Based Polygonal Approximation,” CVGIP: Graphical
[116] J.W. Tai, “Some Research Achievements on Chinese Character Models and Image Processing, vol. 55, no. 2, pp. 79-88, 1993.
Recognition in China,” Character and Handwriting Recognition: [139] X.-H. Xiao and R.-W. Dai, “On-Line Handwritten Chinese
Expanding Frontiers, P.S.P. Wang, ed., pp. 199-206, World Scientific, Character Recognition Directed by Components with Dynamic
1992. Templates,” Proc. Int’l Conf. Computer Processing of Oriental
[117] K. Takahashi, H. Yasuda, and T. Matsumoto, “A Fast HMM Languages ’97, pp. 89-94, 1997.
Algorithm for On-Line Handwritten Character Recognition,” Proc. [140] R. Xu, D. Yeung, W. Shu, and J. Liu, “A Hybrid Post-Processing
Fourth Int’l Conf. Document Analysis and Recognition, pp. 369-375, System for Handwritten Chinese Character Recognition,” Int’l J.
1997. Pattern Recognition and Artificial Intelligence, vol. 16, no. 6 pp. 657-
[118] K. Takeuchi and Y. Matsumoto, “Japanese OCR Correction Using 679, 2002.
Stochastic Morphological Analyzer and Probabilistic N-Gram [141] H. Yamada, K. Yamamoto, and T. Saito, “A Nonlinear Normal-
Word Model,” Int’l J. Computer Processing of Oriental Languages, ization Method for Handprinted Kanji Character Recognition—
vol. 13, no. 1, pp. 69-82, 2000. Line Density Equalization,” Pattern Recognition, vol. 23, no. 9,
[119] H. Tanaka et al., “Hybrid Pen-Input Character Recognition System pp. 1023-1029, 1990.
Based on Integration of Online-Offline Recognition,” Proc. Fifth [142] K. Yamasaki, “Automatic Prototype Stroke Generation Based on
Int’l Conf. Document Analysis and Recognition, pp. 209-212, 1999. Stroke Clustering for On-Line Handwritten Japanese Character
Recognition,” Proc. Fifth Int’l Conf. Document Analysis and Recogni-
[120] C.C. Tappert, C.Y. Suen, and T. Wakahara, “The State of the Art in
tion, pp. 673-676, 1999.
On-Line Handwriting Recognition,” IEEE Trans. Pattern Analysis
[143] H. Yasuda, K. Takahashi, T. Matsumoto, “On-Line Handwriting
and Machine Intelligence, vol. 12, no. 8, pp. 787-808, Aug. 1990.
Recognition by Discrete HMM with Fast Learning,” Advances in
[121] J. Tokuno et al., “Context-Dependent Substroke Model for HMM- Handwriting Recognition, S.-W. Lee ed., pp. 19-28, World Scientific,
Based On-Line Handwriting Recognition,” Proc. Eighth Int’l 1999.
Workshop Frontiers in Handwriting Recognition, pp. 78-83, 2002. [144] M. Yasuda and H. Fujisawa, “An Improvement of Correlation
[122] W.-H. Tsai and K.-S. Fu, “Subgraph Error-Correcting Isomorph- Method for Character Recognition,” Trans. IEICE Japan, vol. J62-D,
isms for Syntactic Pattern Recognition,” IEEE Trans. Systems, Man, no. 3, pp. 217-224, 1979.
and Cybernetics, vol. 13, no. 1, pp. 48-62, 1983. [145] P.J. Ye, H. Hugli, and F. Pellandini, “Techniques for On-Line
[123] W.-H. Tsai and S.S. Yu, “Attributed String Matching with Merging Chinese Character Recognition with Reduced Writing Con-
for Shape Recognition,” IEEE Trans. Pattern Analysis and Machine straints,” Proc. Seventh Int’l Conf. Pattern Recognition, pp. 1043-
Intelligence, vol. 7, no. 4, pp. 453-462, 1985. 105, 1984.
[124] Y.-T. Tsay and W.-H. Tsai, “Attributed String Matching by Split- [146] E.F. Yhap and E.C. Greanias, “An On-Line Chinese Character
and-Merge for On-Line Chinese Character Recognition,” IEEE Recognition System,” IBM J. Research Development, vol. 25, no. 3,
Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 2, pp. 187-195, 1981.
pp. 180-185, Feb. 1993. [147] T. Yokota, S. Kuzunuki, K. Gunji, and N. Hamada, “User
[125] M.-K. Tsay, K.-H. Shyu, and P.-C. Chang, “Feature Transforma- Adaptation in Handwriting Recognition by An Automatic
tion with Generalized LVQ for Handwritten Chinese Character Learning Algorithm,” Proc. Ninth Int’l Conf. Human-Computer
Recognition,” IEICE Trans. Information and Systems, vol. E82-D, Interaction, pp. 455-459, 2001.
no. 3, pp. 687-92, 1999. [148] R. Zanibbi, D. Blostein, and J.R. Cordy, “Recognizing Mathema-
[126] C.-C. Tseng, B.-S. Jeng, and K.-S. Chou, “Candidate Selection in On- tical Expressions Using Tree Transform,” IEEE Trans. Pattern
Line Chinese Character Recognition System Using Voting Scheme,” Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1455-1467,
J. Information Science and Eng., vol. 15, no. 3, pp. 451-462, 1999. Nov. 2002.
[127] J. Tsukumo and H. Tanaka, “Classification of Handprinted Chinese [149] J. Zheng, X. Ding, and Y. Wu, “Recognizing On-Line Handwritten
Characters Using Non-Linear Normalization and Correlation Chinese Character via FARG Matching,” Proc. Fourth Int’l Conf.
Methods,” Proc. Ninth Int’l Conf. Pattern Recognition, pp. 168-171, Document Analysis and Recognition, pp. 621-624, 1997.
1988. [150] J. Zheng, X. Ding, Y. Wu, and Z. Lu, “Spatio-Temporal Unified
[128] M. Umeda, “Advances in Recognition Methods for Handwritten Model for On-Line Handwritten Chinese Character Recognition,”
Kanji Characters,” IEICE Trans. Information and Systems, vol. 79-D, Proc. Fifth Int’l Conf. Document Analysis and Recognition, pp. 649-
no. 5, pp. 401-410, 1996. 652, 1999.
[129] O. Velek, S. Jaeger, and M. Nakagawa, “A New Warping
Technique for Normalizing Likelihood of Multiple Classifiers Cheng-Lin Liu received the BS degree in
and Its Effectiveness in Combined On-Line/Off-Line Japanese electronic engineering from Wuhan University,
Character Recognition,” Proc. Eighth Int’l Workshop Frontiers in the ME degree in electronic engineering from
Handwriting Recognition, pp. 177-182, 2002. Beijing Polytechnic University, the PhD degree
[130] V. Vuori, J. Laaksonen, E. Oja, and J. Kangas, “Experiments with in pattern recognition and artificial intelligence
Adaptation Strategies for a Prototype-Based Recognition System from the Institute of Automation, Chinese Acad-
for Isolated Handwritten Characters,” Int’l J. Document Analysis emy of Sciences, in 1989, 1992, and 1995,
and Recognition, vol. 3, no. 3, pp. 150-159, 2001. respectively. He was a postdoctoral fellow at the
[131] T. Wakahara, “On-Line Cursive Script Recognition Using Local Korea Advanced Institute of Science and Tech-
Affine Transformation,” Proc. Ninth Int’l Conf. Pattern Recognition, nology (KAIST) and, later, at Tokyo University of
pp. 1133-1137, 1988. Agriculture and Technology from March 1996 to March 1999. He then
became a research staff member at the Central Research Laboratory,
[132] T. Wakahara, H. Murase, and K. Odaka, “On-Line Handwriting
Hitachi, Ltd. and now is a senior researcher. His research interests
Recognition,” Proc. IEEE, vol. 80, no. 7, pp. 1181-1194, 1992.
include pattern recognition, artificial intelligence, image processing,
[133] T. Wakahara et al., “On-Line Cursive Kanji Character Recognition
neural networks, machine learning, and, especially, applications to
as Stroke Correspondence Problem,” Proc. Third Int’l Conf.
character recognition and document processing. He is a member of the
Document Analysis and Recognition, pp. 1059-1064, 1995.
IEEE and IEEE Computer Society.
[134] T. Wakahara and K. Okada, “On-Line Cursive Kanji Character
Recognition Using Stroke-Based Affine Transformation,” IEEE
Trans Pattern Analysis and Machine Intelligence, vol. 19, no. 12,
pp. 1381-1385, Dec. 1997.
[135] P.H. Winston, Artificial Intelligence, third ed. Addison Wesley, 1992.
[136] A.K.C. Wong, M. You, and S.C. Chan, “An Algorithm for Graph
Optimal Monomorphism,” IEEE Trans. Systems, Man, and Cyber-
netics, vol. 20, no. 3, pp. 628-636, 1990.
LIU ET AL.: ONLINE RECOGNITION OF CHINESE CHARACTERS: THE STATE-OF-THE-ART 213

Stefan Jaeger graduated from the University of Masaki Nakagawa received BSc and MSc
Kaiserslautern, Germany, in computer science degrees from the University of Tokyo in 1977
in 1994 and received the PhD degree in and 1979, respectively. During the academic
computer science from Albert-Ludwigs Univer- year 1977-1978, he followed the computer
sity, Freiburg, Germany, in 1998. From 1994 to science course at Essex University in England,
1998, he worked as a PhD student at the and received the MSc degree with distinction in
Daimler-Benz Research Center, Ulm, Germany, computer studies in July 1979. He received the
where he was engaged in offline cursive hand- PhD degree in information science from the
writing recognition for postal mail sorting. His University of Tokyo in December 1988. From
PhD thesis addressed the problem of recovering April 1979, he has been working at Tokyo
dynamic information from static handwritten word images and was University of Agriculture and Technology. Currently, he is a professor
awarded the Dissertation Prize from the German Research Centers for of media interaction in the Department of Computer, Information, and
Artificial Intelligence in 1999. In 1998, he joined the Interactive Systems Communication Sciences. For the past 10 years, he has been
Laboratories jointly located at Carnegie Mellon University, Pittsburgh, collaborating with industry to advance pen-based human interactions
Pennsylvania, and at the University of Karlsruhe, Germany, where he composed of online handwriting recognition, pen-based interfaces, and
was a research staff member responsible for online handwriting applications, especially educational applications on an interactive
recognition and pen-computing. Since November 2000, he has been electronic whiteboard. He has been serving for several committees of
working as a senior researcher in the Department of Computer, the Japanese government on Industry and Universities partnership and
Information, and Communication Sciences at Tokyo University of those on IT-oriented and IT-supported learning. He is a member of the
Agriculture and Technology, Japan. His research interests include IEEE Computer Society.
pattern recognition, artificial intelligence, document analysis, hand-
writing recognition, and machine learning.

. For more information on this or any other computing topic,


please visit our Digital Library at http://computer.org/publications/dlib.

View publication stats

You might also like