On Data Reliability Assessment in Accounting Information Systems
doi 10.1287/isre.1050.0063
© 2005 INFORMS
Ramayya Krishnan
The Heinz School, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213-3890, rk2x@cmu.edu
James Peters
The R. H. Smith School of Business, University of Maryland, College Park, Maryland 20742-7215,
jmpeters@umd.edu
Rema Padman, David Kaplan
The Heinz School, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213-3890
{rpadman@cmu.edu, djk@andrew.cmu.edu}
The need to ensure reliability of data in information systems has long been recognized. However, recent
accounting scandals and the subsequent requirements enacted in the Sarbanes-Oxley Act have made data
reliability assessment of critical importance to organizations, particularly for accounting data. Using the account-
ing functions of management information systems as a context, this paper develops an interdisciplinary
approach to data reliability assessment. Our work builds on the literature in accounting and auditing, where
reliability assessment has been a topic of study for a number of years. While formal probabilistic approaches
have been developed in this literature, they are rarely used in practice. The research reported in this paper
attempts to strike a balance between the informal, heuristic-based approaches used by auditors and formal,
probabilistic reliability assessment methods. We develop a formal, process-oriented ontology of an accounting
information system that defines its components and semantic constraints. We use the ontology to specify data
reliability assessment requirements and develop mathematical-model-based decision support methods to imple-
ment these requirements. We provide preliminary empirical evidence that the use of our approach improves
the efficiency and effectiveness of reliability assessments. Finally, given the recent trend toward specifying
information systems using executable business process models (e.g., business process execution language), we
discuss opportunities for integrating our process-oriented data reliability assessment approach (developed in
the accounting context) in other IS application contexts.
Key words: workflow and process management; accounting information systems; mathematical modeling;
internal control
History: Salvatore March, Senior Editor; Kalle Lyytinen, Associate Editor. This paper was received on June 4,
2003, and was with the authors 14.75 months for 3 revisions.
1. Introduction
The reliability of data produced by organizational
information systems used to plan, diagnose, and con-
trol business operations has long been considered
important. Despite extensive study of this problem in
the context of accounting information systems, few
rigorous yet practical tools have emerged in the liter-
ature (Felix and Niles 1988, Waller 1993). The present
guidance consists mainly of frameworks that do not
provide rigorous, systematic ways to assess data reli-
ability. These frameworks provide checklists of issues
that affect data reliability but do not provide formal
definitions of the key concepts in the frameworks
nor decision rules or algorithms to ensure that the
reliability assessment is both efficient and effective¹
(cf. Committee of Sponsoring Organizations of
the Treadway Commission (COSO) 1994, Informa-
tion Systems Audit and Control Foundation (COBIT)
2000). The lack of formal concept definitions and decision
rules makes it difficult to develop practical data
reliability assessment systems.
¹ We provide specific definitions for "efficient" and "effective" in §2.2.2.
Krishnan et al.: On Data Reliability Assessment in Accounting Information Systems
Information Systems Research 16(3), pp. 307-326, © 2005 INFORMS
However, the relevance of data reliability assessment,
particularly in the context of accounting
information systems, has increased considerably
with the recent passage by the U.S. Congress of
the Sarbanes-Oxley (SOX) Act (H.R. 3763, Sarbanes-Oxley,
Title IV, §404 2002; Securities and Exchange
Commission 2003). The act requires a firm's CEO and
CFO to certify the reliability of the data reported
in the financial statements as well as the reliability
and documentation of the information system that
produced those data. These mandates may well be
overdue because reports of significant reliability problems
in current accounting information systems have
begun to appear in practitioner outlets. For example,
CFO magazine states, "Experts estimate that anywhere
from 10 percent to 30 percent of the data flowing
through corporate systems is bad..." (Goff 2003,
pp. 97-98). In addition, recent surveys indicate that
improving the reliability of a firm's AIS to meet SOX
requirements will require significant effort. For example,
Business Week (2003) reports that a recent survey
found SOX will "...prompt 85% of America's largest
100 companies to overhaul many components of their
financial-reporting systems."
In this paper, we develop a formal approach to data
reliability assessment motivated by a field study of a
major international accounting firm. The field study
focused on data reliability assessment of account-
ing functions of management information systems.
Accounting functions are all information-capturing
and processing activities that lead to the mainte-
nance of general ledger account balances in a manage-
ment information system (MIS), regardless of whether
those functions are embedded in an integrated MIS
or in a freestanding accounting information system
(AIS).² General ledger account balances are the core
financial data used by organizations to make managerial
decisions and to report the financial status of an
organization to external stakeholders (Hollander et al.
2000). The field study identified important decision
support requirements and a data reliability assessment
task (key control selection, discussed in detail
in §2) considered to be important to auditors, the
professionals tasked with conducting data reliability
assessments of organizational information systems.
The approach we have developed consists of
(a) an ontological metamodel that permits the representation
of key concepts needed to assess data reliability,
and (b) a set of algorithms that can process
instances of the ontological model to support decision
making by auditors. We provide preliminary evidence
that the use of this approach improves the
effectiveness of auditors engaged in data reliability
assessment.
² For simplicity, we refer to these accounting functions as an AIS
for the balance of this paper.
The rest of the paper is organized as follows. Section 2
defines the key-control-selection problem and
describes how it fits into the process of evaluating AIS
data reliability using an illustrative example of a por-
tion of an AIS. Section 2 also reviews relevant prior
work on reliability assessment, including work in the
accounting and auditing literature, the software lit-
erature, the data quality literature, and the literature
on CASE tools and emerging work on declarative,
yet executable, business process models. Section 3
presents our process-oriented ontological metamodel
of basic concepts and relationships that describe an
AIS as a directed, attributed, acyclic graph. Using
this ontology, §4 formulates key control selection as
a set-covering problem and presents two procedures
required to compute the parameters of the model
from instances of the ontological model. Section 5
presents preliminary results comparing our AIS data
reliability assessment approach with that of experienced
auditors. Finally, §6 discusses how the general
approach we develop for AISs is more broadly appli-
cable to other IS applications.
2. The Key-Control-Selection Problem
2.1. Description of the Field Study
We conducted a field study in a large international
accounting firm to understand their data reliability
assessment practices. The field study consisted of
extensive interviews with seven audit managers in
three different offices of the audit firm. The firm's
director of audit research selected the managers to
interview based on their extensive knowledge and
experience with AIS reliability assessment. All inter-
views with the audit managers were recorded and
transcribed for further analysis. The purpose of the
interviews, and the analysis of their content, was to
identify the process auditors used to evaluate data
reliability in an AIS. The analysis identified an important
task, key control selection, within that process
that the auditors identified as a step that required
computer-based decision support. We reviewed the
results of our analysis of the transcripts with the
firm's director of audit research to confirm our findings.
In addition, we conducted a review of the
professional literature to confirm the importance of
the key-control-selection task, the recommended procedures
for key control selection, and the definition
of the concepts required to select key controls.
A description of the key-control-selection task and
how it relates to AIS data reliability assessment is pre-
sented in the next section.
2.2. Introduction to AIS Reliability Assessment
and Key Control Selection³
2.2.1. AIS Reliability Assessment. An AIS is a
transaction-oriented information system, and errors
arise in the context of these transactions. Auditors
who assess AIS reliability rely on four important con-
cepts: (a) assertions, (b) error classes, (c) information
transformation processes (ITPs), and (d) control procedures
(AICPA 1999). Each of these is discussed
in turn.
Assertions and error classes are closely related. An
assertion is a statement about the absence of a partic-
ular class of error in the general ledger account data.
Auditors begin by determining the assertions they
would like the data in the general ledger accounts
to support. Five classes of errors, namely complete-
ness, existence, valuation, rights and obligations, and
presentation and disclosure (AICPA 1990, 1999), are
considered. Completeness errors occur when a valid
transaction that should be in the system is missing
(i.e., it is either not recorded or has been deleted
incorrectly). Existence errors occur when an invalid
transaction is added to the system. Valuation errors
occur when the data in the system does not accurately
reflect the economic results of the transaction that
created the data. These three error classes are mutu-
ally exclusive, and exhaustive in terms of errors that
affect the accuracy of the data in the AIS. Auditors
also consider rights and obligations, and presentation
and disclosure errors, but these errors are relevant to
the production of external financial statements from
the AIS database and not the accuracy of the AIS
database, per se. Therefore, we do not consider them
in our research. Professional standards do not specify
a tolerable error from these sources, but leave this
determination to the auditor (AICPA 1990). The set
of error classes determined by an auditor to be relevant
to a reliability assessment study is referred to
as target error classes.
³ This section, as well as §§3, 4, and 5, contains a variety of technical
terms. We provide a glossary of these terms in an appendix to this
paper.
The three error classes we consider capture two
key elements of data reliability that have been regu-
larly used in IS data quality research: completeness
and accuracy (Ballou and Pazer 1985; Redman 2001,
ch. 14; Wang et al. 1995). Based on our fieldwork and
review of the literature, we find that "accuracy" is a
combination of "existence" and "valuation." That is,
the concept of "accuracy" includes both the exclusion
of invalid transactions from the AIS and the accurate
valuation of valid transactions. Because different
activities may lead to completeness, existence, or valuation
errors, the auditors' practice of decomposing
accuracy into existence and valuation errors, as well
as considering completeness, contributes to a more
complete assessment of AIS reliability.
Information transformation points (ITP) are points
in the AIS where these different classes of errors can
be introduced. ITPs are arranged in connected structures,
such that information flows from one ITP to
another. This structure begins at the boundary of the
organization when the AIS first captures data about
a transaction. This data then flows through a series
of ITPs until it reaches the general ledger account,
which, for our purposes, is the terminal point of the
AIS. These intervening ITPs alter the data in a vari-
ety of ways and, therefore, are capable of introduc-
ing errors.
Controls are procedures designed to prevent or
detect one or more of these errors. The internal control
structure of an AIS specifies the set of control procedures
included in the AIS; their capacity for error
prevention/detection; and the information flows from
which each control can prevent/detect errors. The
reliability of an AIS is evaluated with respect to the
absence of error classes in the general ledger accounts.
Given an AIS and its internal control structure, the
auditor assesses the risk that one or more of the target
error classes will be present in one or more general
ledger accounts (GLAs). In other words, the auditor
has to determine if the error elimination⁴ capabilities
of the controls are adequate to prevent errors
from reaching the GLAs.
Key controls are the subset of controls in the AIS that
provide the auditor with the desired assurance about
the absence of these error classes in the GLA.
This framing of the data reliability assessment prob-
lem focuses on the evaluation of an AIS and its exist-
ing internal control structure with respect to a target
set of assertions/error classes. Our field study confirmed
that this was the framing of the data reliability
assessment problem employed by the auditors. The
set of controls in the system is taken as given and
treated as exogenous to the data reliability assessment
problem. We discuss in §6 opportunities for further
research into the related problem of how to design
the set of controls in an AIS to meet specified data
reliability objectives.
2.2.2. Key Control Selection. The auditor needs
to develop a plan to test a set of controls in the
AIS to ensure that the general ledger accounts are
free of the types of error in the target assertions. An
AIS normally contains redundant or overlapping con-
trols, and so the auditor need not test all the con-
trols to determine if the AIS is reliable (i.e., his/her
target assertions are being met). The subset of
controls selected for testing is referred to as
key controls. Testing controls is costly. For example,
auditors use techniques such as reprocessing a sam-
ple set of transactions to verify that the population
of transactions has been executed accurately. There-
fore, the auditor would prefer to select the smallest set
of key controls to test that will provide the required
assurance that they are functioning as designed. An
effective set of key controls ensures that the selected
controls have the capacity to eliminate target error
classes (TECs) in the GLAs. An efficient set of key controls
is an effective set that contains the fewest (or the
cheapest to test) controls. Auditors find the
development of effective and efficient key control sets
to be difficult. In §5, we present preliminary evidence
to demonstrate how even experienced (e.g., an average
of 57 months of audit experience) auditors do
poorly when asked to construct effective and efficient
testing plans. They choose either too many controls or
too few controls, leading to problems with data reliability
assessments. Thus, the fundamental importance
of selecting key controls for data reliability assessment
led to our focus on key control selection in this paper.
⁴ While auditors distinguish between preventive and detective/
corrective controls, our approach does not, and (for simplicity) we
will use "eliminate" in the balance of the paper to refer to a control's
ability to keep error classes from reaching the general ledger.
2.3. Illustrative Example of an AIS
Figure 1 documents the main portion of an organization's
purchasing processes and depicts information
transformation processes (ITP), information
flows, control procedures, general ledger accounts
(GLA), and target error classes (TEC). It is based on
a real case developed by the firm that participated
in our field study. Developing such documentation is
the first step in the data reliability assessment process
(Ballou and Pazer 1985). There are a variety of nota-
tions available for documenting the components of an
AIS. We have used notation loosely based on business
process modeling notation (BPMN) (see BPMI.org)
that permits the representation of the key concepts
underlying our approach in a direct manner. We leave
the question of developing robust extensions of the
BPMN notation suited to the needs of data reliabil-
ity assessment to future work. The formal semantics
associated with this notation is presented in §3.
The notation describes the AIS in terms of eco-
nomic events, information transformation processes,
controls, accounts, and error classes. We discuss the
example starting at the left of Figure 1. At the left
are economic events that create the information flow
that triggers the different ITPs shown in the figure
(i.e., when merchandise is requested by a unit
within the firm, when merchandise is received from
a vendor, or when an invoice is received from a vendor).
The three documents (purchase order, receiving
report, and invoice) are merged to produce a payment
voucher that is used to support a cash payment
to the vendor, resulting in entries to the cash,
accounts payable, inventory, and expense general
ledger accounts. The rounded rectangles of Figure 1
are ITPs. Examples are processes labeled "Purchase
order" and "Check register." These are points at
which errors can be introduced. For example, completeness,
existence, and valuation errors can be introduced
by the "Purchase order" ITP. The rectangles are
controls and are associated with ITPs, as indicated by
the assurance arrows. For example, the "Account for
PO numbers" control eliminates completeness errors
on flow f4.
Figure 1. Diagram of Purchasing Activities Example
[Figure not reproduced. Key: information flows are labeled f1 through f15; each control is annotated with the error classes it covers, each ITP with the error classes at risk, and each GLA with its TECs; assurance arrows show a control's span. Error codes: C = completeness, E = existence, V = valuation, N = none.]
Most controls function by comparing information
flowing out of an ITP with information flowing into
that ITP. Preventative controls function by screening
information flowing into an ITP and, thus, prevent
the information flowing out of the ITP from containing
errors. Irrespective of the type of the control, the
objective of the control is to eliminate errors from the
information flowing from an ITP.
The set of errors the control is able to eliminate
is referred to as its coverage capability. For example,
the "Compare batch totals" control in Figure 1
can prevent completeness, existence, and valuation
errors. The information flows on which the control
can eliminate errors are referred to as the span of a
control. For example, "Compare batch totals" compares
information from the "Disbursement" ITP with
information from the "Check register" and "Voucher
register" ITPs to assure that completeness, valuation,
and existence errors are absent in all the flows out of
the "Disbursement," "Check register," and "Voucher
register" transformation points.
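Coverage capability and span together determine which errors a control can eliminate on which flows. The following is a minimal sketch of that idea, assuming a control can be described by a set of error classes and a set of flow labels. The `Control` class and the `uncovered` helper are illustrative names, not part of the paper's notation. The coverage of "Compare batch totals" and the f4 example come from the text; the flow labels in its span are hypothetical stand-ins for the flows out of the three ITPs named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str
    coverage: frozenset  # error classes the control can eliminate: "C", "E", "V"
    span: frozenset      # information flows on which it can eliminate them

    def eliminates(self):
        """All (flow, error class) pairs this control can eliminate."""
        return {(flow, err) for flow in self.span for err in self.coverage}


# From the text: "Account for PO numbers" eliminates completeness errors on f4.
account_for_po = Control("Account for PO numbers", frozenset({"C"}), frozenset({"f4"}))

# Coverage is from the text; the three flow labels are hypothetical placeholders
# for the flows out of Disbursement, Check register, and Voucher register.
compare_batch = Control(
    "Compare batch totals",
    frozenset({"C", "E", "V"}),
    frozenset({"f_disbursement", "f_check_register", "f_voucher_register"}),
)


def uncovered(required, controls):
    """Return the required (flow, error class) pairs that no listed control eliminates."""
    covered = set()
    for control in controls:
        covered |= control.eliminates()
    return set(required) - covered
```

For instance, `uncovered({("f4", "C"), ("f4", "V")}, [account_for_po, compare_batch])` leaves `("f4", "V")` unaddressed, since neither control in this sketch covers valuation errors on f4.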
As noted above, reliability assessment begins with
the auditor using his judgment to establish a set of
target error classes (TECs) or assertions in the general
ledger account (GLA). The circular nodes in Figure 1
represent the GLAs, and the TECs for each GLA are
shown in italics. The set of GLAs and their associated
TECs describe the goal of the reliability assessment
process. In the example, the auditor has set complete-
ness, existence, and valuation as target error classes
for the cash account, while only testing completeness
and valuation for the accounts payable account. Audi-
tors use their judgment to select target error classes.
For example, in Figure 1 the auditor has elected not
to consider existence violations for accounts payable
and expenses due to the low probability that the sys-
tem would record expenses or liabilities if they did
not exist.
Given the documentation of the AIS and its internal
control structure as depicted in Figure 1, the auditor
has to construct effective and efficient control-testing
plans. This raises four important research questions:
Question 1: What should be the formal repre-
sentation of the AIS for purposes of data reliabil-
ity assessment? We develop a formal approach to
support AIS data reliability assessment that permits
algorithmic processing but is intuitive enough that
auditors could apply it in practice. The foundation
of the approach is a semantically precise notation
required to model an AIS. While many diagramming
notations have been developed by firms for internal
use by trained auditors, they do not have formal
semantics and therefore do not lend themselves
to computer-based decision support. We develop an
ontological metamodel of an AIS suited to the needs
of data reliability assessment.
Given the characteristics of the problem described
above, there are supporting research questions that
need to be answered to build a formal approach:
Question 2: Given the internal control structure,
on which information flows can a given control prevent
or detect errors? That is, what is a control's span?
The span of the control depends on the structure of
information flows in the AIS and the set of ITPs from
which the control has access to information. Computing
the span of the control given the documentation
of the AIS is difficult for auditors. We present our
approach to this problem in §4.3.
Question 3: Given the TECs associated with
each GLA, what TECs should be eliminated from
each flow that contributes, either directly or indirectly,
to the GLA? This is important because controls are
designed to provide assurances for flows coming from
each ITP, and different ITPs may generate different
sets of error classes. If the correct set of errors is eliminated
at each flow, this will ensure that TECs will
not be present in the GLAs. We detail our approach
to this problem in §4.2.
Question 4: Given the TECs for each ITP and
the span of the controls, how should an effective and
efficient set of key controls be selected? Even experienced
auditors do poorly at constructing effective
and efficient key control sets as we discovered during
our fieldwork and our preliminary evaluation study
reported in §5. We discuss our approach to this problem
in §4.1.
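Question 3 above can be read as a backward traversal of the information flows, starting from each GLA. The sketch below illustrates that reading under stated assumptions: the AIS is given as a directed acyclic graph whose edges are information flows, and every flow upstream of a GLA simply inherits that GLA's TECs. The node names and flow labels are hypothetical and do not reproduce Figure 1; only the TECs for the cash and accounts payable accounts follow the example in §2.3. The procedure of §4.2 is not shown in this section and may differ, for example by accounting for which error classes each ITP can introduce.

```python
from collections import defaultdict

# Hypothetical flows: label -> (source node, destination node).
FLOWS = {
    "f1": ("Purchase order", "Voucher package"),
    "f2": ("Receiving report", "Voucher package"),
    "f3": ("Voucher package", "Disbursement"),
    "f4": ("Disbursement", "Cash"),
    "f5": ("Voucher package", "Accounts payable"),
}

# TECs per GLA, following the example in the text.
GLA_TECS = {"Cash": {"C", "E", "V"}, "Accounts payable": {"C", "V"}}


def flow_requirements(flows, gla_tecs):
    """Map each flow to the union of TECs of every GLA it feeds, directly or indirectly."""
    incoming = defaultdict(list)  # node -> list of (flow label, source node)
    for label, (src, dst) in flows.items():
        incoming[dst].append((label, src))

    required = defaultdict(set)
    for gla, tecs in gla_tecs.items():
        stack, seen = [gla], {gla}
        while stack:  # depth-first walk against the direction of flow
            node = stack.pop()
            for label, src in incoming[node]:
                required[label] |= tecs
                if src not in seen:
                    seen.add(src)
                    stack.append(src)
    return dict(required)
```

In this toy graph, flows f1 through f4 all feed the cash account and therefore inherit completeness, existence, and valuation, whereas f5 feeds only accounts payable and inherits completeness and valuation.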
2.4. Prior Work and Related Literature
2.4.1. Auditing Literature. The most directly
related literature is the work on auditing in the
accounting literature. While the auditing literature
has not studied the key-control-selection problem,
it contains work on probabilistic and deterministic
models for AIS reliability assessment and decision
support. Probabilistic modeling research has focused
on finding ways to combine the necessary probability
estimates to provide an overall, quantitative assess-
ment of AIS reliability (e.g., Ahituv et al. 1985; Ballou
and Pazer 1985; Bodnar 1975; Cooley and Cooley
1982; Hamlen 1980; Haskins and Nanni 1987; Knechel
1983, 1985; Lea et al. 1992). Probabilistic research in
AIS evaluation has had little, if any, influence on practice
because the models tend to make too many simplifying
assumptions to achieve tractability (Felix and
Niles 1988).
Research on deterministic models for AIS reliability
assessment has focused more on decision support by
modeling the evaluator's cognitive process and building
expert systems (e.g., Kelly 1985, Meservy et al.
1986). However, because the approach is not based on
formal methods, it cannot guarantee the completeness
or accuracy of the resulting evaluation. One determin-
istic approach that evolved into an applied system
was the TICOM project (Bailey et al. 1985).
PricewaterhouseCoopers built on the modeling approach
developed in the TICOM project to implement a decision
support system, called COMET, to help auditors
evaluate AIS reliability (Nado et al. 1996).
However, COMET does not help auditors link weak-
nesses in AIS reliability directly to TECs and, there-
fore, provides little support to assist AIS designers in
reengineering the internal control structure of the AIS
to eliminate future errors.
Our work strikes a balance between the informal
heuristic methods used by practitioners and extant
formal decision support approaches. We extend the
state of the art by developing a formal graph-based
notation suited to the needs of reliability assessment.
The notation has syntactic similarities to the informal
notation used by auditors (cf. Grant Thornton 1996),
has formal semantics, and can be processed algorith-
mically (using a mathematical modeling framework)
to provide decision support in the creation of both
effective and efficient testing plans.
2.4.2. IS Data Quality Literature. Complement-
ing the work in the accounting literature is the work
on data quality (Kaplan et al. 1998, Pierce 2004, Pipino
et al. 2002, Wang et al. 1995). This literature draws
on the analogy between physical manufacturing pro-
cesses and information manufacturing and argues
that data quality should be concerned about accuracy,
timeliness, consistency, and accessibility (Wang et al.
1995). In the AIS context, we decomposed accuracy
into existence and valuation to match how practi-
tioners view data reliability. The data quality research
offers a starting point towards defining error classes
for other IS applications that are analogous to the
error classes used to assess data reliability in AIS.
Distinctions are made in this literature between
quality problems that arise because of flaws in IS
design versus quality problems that arise due to operational
flaws. Recent work by Pierce (2004) relates
this perspective on data as an information product
to ideas from auditing by defining a structure called
a control matrix. However, in contrast to our work,
Pierce (2004) neither models the information flows
in the information system and the location of con-
trols, nor provides a method for reasoning with the
control matrices to arrive at overall measures of data
quality. In addition, many papers in this literature
assume that the systems represent worlds with high
rates of change (e.g., stock market data) and offer pre-
scriptions (Orr 1998) with an emphasis on timeliness
and accessibility. While the emphasis is different in
our application context, the core ideas underlying our
work rely on an abstraction (consisting of controls
and error classes) of an information system that has
applicability in contexts other than AIS (Pierce 2004).
2.4.3. Business Process Modeling and CASE
Tools. Because reliability assessment works with a
model of information flows and information transformation
processes, process-modeling methodologies
and related systems analysis and design tools
are relevant to our work. In general, the literature
on these topics focuses on supporting analysis and
design. As noted in §2.2, data reliability assessment
of AIS presently focuses on the evaluation of the
internal control structure of an AIS rather than on
its design. However, modeling representations devel-
oped in process modeling and CASE literatures are
relevant to our work because documenting the AIS
and its internal control structure is essentially an exercise
in representation. We briefly discuss the unified
modeling language (UML) and business process mod-
eling notation (BPMN) and the reasons why we have
not used these approaches in our work.
UML (see www.uml.org) is an industry standard
for software design. It provides a number of diagrams
(e.g., state transition diagram, activity diagram) for
visualizing and documenting the artifacts that make
up a software system. Tools are available to help
designers create these diagrams. While the advantage
of using UML as a starting point in our work is the
ability to have an impact on reliability assessment and
reliability design in domains other than AIS, there are
significant differences in both the conceptualization
and the semantics underlying our approach and that
of UML. Specifically, UML does not support a process-centric
view of the world that is central to our
conceptualization. Further, UML diagramming notations
lack a formal semantics (McUmber and Cheng
2001). Even the recently released UML 2.0 specifications
from the Object Management Group (OMG)
offer only a natural language semantics for UML dia-
gramming notation, and developing a formal seman-
tics is still an active area of research.
In contrast to UML, the recent initiatives in the
area of business process modeling and management
share the process-centric view underlying our work.
Examples include the work on creating declarative,
yet executable, models of business processes such
as business process execution language (BPEL) and
the business process modeling language/notation
(BPML/BPMN) (see BPMI.org 2003). The advantage
of using a broadly adopted and popular notation is
the opportunity to integrate our reliability assessment
approach in tools created to support business pro-
cess modeling and execution. However, extensions to
BPMN notation and semantics are required to adapt
it for use in data reliability assessment.
Because BPMN does not provide specic object
types to represent fundamental data reliability con-
cepts such as ITPs, controls, and their attributes, we
need specialized notational extensions to BPMN that:
• Distinguish between ITPs and controls. Both
could be represented as BPMN atomic tasks; however,
BPMN cannot capture semantic differences between
these objects (e.g., the fact that ITPs generate errors
and controls eliminate them).
• Distinguish between message flows from ITPs to
controls and from controls to ITPs. As with ITPs and
controls, the different types of information flows in an
AIS have different semantics. However, information
flows are represented as message flows in BPMN, and
message flows do not have the expressive power to
distinguish between the different types of information
flows in an AIS (e.g., information flows from controls
to ITPs provide assurance, while flows between ITPs
may contain errors). In addition, the BPMN syntax
only allows messages to link to tasks and not to other
information flows, as our assurance flows do.
• Specify error classes for ITPs and controls. The
only structure in BPMN that allows tagging of ITPs
and controls with error classes is text annotation,
which is not executable and lacks formal semantics.
While these notational extensions are relatively easy to make, there
is a fundamental difference between the semantics underlying BPMN and
the semantics required for data reliability assessment. BPMN semantics
are based on a variant of the pi calculus (Milner 1992) and are
designed to represent and reason about distributed processes. This is
in contrast to the semantics required to reason about the structure of
information flows and the error prevention capabilities of internal
controls. Because of the basic nature of these changes, we defer
further discussion of notational issues to future research and discuss
this topic briefly in §6.
3. An Ontological Metamodel of an AIS
The key-control-selection problem assumes a certain conceptualization
of the AIS (e.g., in terms of accounts, flows, controls, error classes,
and their relationships). Both auditors and systems designers use
graph-based structural models of AIS (such as in Figure 1) that are an
informal realization of this conceptualization to support both design
and testing. As discussed in §2.3, question Q1 is concerned with the
formalization of the graphical model required to support the
computer-based specification of AIS by auditors as part of a data
reliability study. Importantly, such formalization should be open to
analysis by inferential procedures. For example, these procedures are
required to derive the span of a control, a parameter of the model used
to determine effective and efficient key control sets. Formalized
models of the structure of a system are referred to as ontological
metamodels (Gruber 1991, Noy and Hafner 2000, Wand and Weber 1990) in
the literature.
An AIS can be conceptualized at two levels of granularity at least: an
action level and a process level. Action-level ontologies (e.g.,
Hamscher 1992) and the TICOM model (Bailey et al. 1985) are fine
grained and describe an AIS in terms of actions, like waiting for a
document, and objects on which they operate, such as repositories of
files where documents are kept. Other ontologies of AIS have focused on
the artifacts developed within the system, but not on the processes
that produce them (Wand and Weber 1990). Action-level ontologies are
well suited to simulate processing and document flow through an AIS.
However, they are too fine grained for the purposes of data reliability
assessment. In contrast, a process-level ontology is coarse grained and
specifies an AIS in terms of ITPs and information flows. ITPs are at a
higher level of abstraction than actions and can be thought of as
collections of actions that are used to capture or transform
information. This level of abstraction is well suited to the needs of
data reliability assessment because the objective is not to simulate
actions, but to identify key controls.
We therefore develop a process-level ontological metamodel using data
gathered from field interviews with practicing auditors and a detailed
review of the existing literature (e.g., Arens and Loebbecke 1997,
Grant Thornton 1996). The model uses set-theoretic notation and
provides a language to state the key components of an AIS and their
interrelationships in a precise manner. We provide a brief overview of
the ontology using set-theoretic notation developed in the model
management literature (Bhargava and Kimbrough 1993).
3.1. Definition of AIS Components
The metamodel of an AIS consists of the following major components and
their interrelationships.
Economic Event (EE): Event that generates the initial need for the AIS
to capture information.
General Ledger Accounts (GLA): Maintain total of financial activity
(i.e., the output of the AIS from which the financial statements are
produced).
Information Transformation Process (ITP): Processes that transform
information from one form to another. ITPs can introduce errors. These
processes also are where control procedures designed to eliminate
errors introduced by processes are located.
Control Procedures (CONTROL): Procedures designed to eliminate specific
error classes. A control is associated with one or more ITPs in that it
has access to information present in the flows emanating from them.
Controls have the capacity to eliminate error classes that are
introduced by ITPs.
Information Flows (FLOW): Abstractions of information that flow through
the AIS, i.e., the output of ITPs.
Target Error Classes (TEC): Error classes the evaluator wants to
determine are not present in an account. Error classes (ECs) are
associated with accounts and ITPs as well as with controls. In the case
of accounts, their association implies the need to check an account for
the presence of the given error class. In the case of ITPs, the
association is used to identify the classes of errors the ITP can
generate. In the case of a control, the association is used to assert
the capability of the control to eliminate the EC.
3.2. Formal Specification of Relationships Between AIS Components
An AIS is conceptualized as a directed, acyclic, attributed graph (see
Figure 2). The nodes of the graph are GLAs, EEs, ITPs, and controls.
The arcs are the FLOWs. TECs are attributes of GLAs. These objects are
modeled as sets, and each set is required to be nonempty. In our
description of the ontological metamodel, we use special fonts as
follows: sets, functions, relations, and VARIABLES. Additionally, we
refer to the power set of a term using square brackets (e.g., [EC] is
the power set of EC). Each node and arc in the notation is required to
be an object in this conceptualization. Each AIS component also has
both an ID and a description attribute. The ID is a unique identifier.
The description allows an auditor to make any special notes about each
component. Each node and arc has an identifier and a description (see
Figure 3).

Figure 2. The Six Components of an AIS
[Diagram relating AIS_COMPONENT, NODE, CONTROL, GLA, ITP, EE, FLOW, and
TEC through "subset of" and "attribute of" links.]

Figure 3. AIS Components
\[
\begin{array}{l}
\text{GLA} \subseteq \text{AIS\_COMPONENT} \\
\text{ITP} \subseteq \text{AIS\_COMPONENT} \\
\text{EE} \subseteq \text{AIS\_COMPONENT} \\
\text{TEC} \subseteq \text{AIS\_COMPONENT} \\
\text{CONTROL} \subseteq \text{AIS\_COMPONENT} \\
\text{FLOW} \subseteq \text{AIS\_COMPONENT} \\
\text{ID}: \text{AIS\_COMPONENT} \rightarrow \text{Names} \\
\text{description}: \text{AIS\_COMPONENT} \rightarrow \text{Text} \\
\forall I \in \text{AIS\_COMPONENT},\ I \neq \emptyset \quad \text{(An AIS component cannot be an empty set)} \\
\forall I, J \in \text{AIS\_COMPONENT},\ I \cap J = \emptyset \quad \text{(Components are mutually exclusive)}
\end{array}
\]
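The component sets and the two constraints stated in Figure 3 (each set is nonempty, and the sets are mutually exclusive) can be illustrated with a minimal Python sketch. This is not the implementation described in this paper; the class `Component`, the function `check_component_sets`, and the toy instance are hypothetical names introduced only for illustration, and mutual exclusivity is checked here by comparing IDs, which is an assumption about how membership would be recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One AIS component: a unique ID plus a free-text description."""
    id: str
    description: str = ""


def check_component_sets(sets):
    """Check the two Figure 3 constraints on a dict {set name: set of Components}.

    Returns a list of violation messages; the list is empty when every set
    is nonempty and no ID appears in more than one set.
    """
    problems = []
    names = sorted(sets)
    # Constraint 1: an AIS component set cannot be empty.
    for name in names:
        if not sets[name]:
            problems.append(f"{name} is empty")
    # Constraint 2: component sets are mutually exclusive (compared by ID).
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = {c.id for c in sets[a]} & {c.id for c in sets[b]}
            if shared:
                problems.append(f"{a} and {b} share IDs {sorted(shared)}")
    return problems


# Hypothetical toy instance; the names are illustrative, not read off Figure 1.
ais = {
    "EE": {Component("ee1", "Vendor invoice received")},
    "ITP": {Component("itp1", "Record voucher")},
    "GLA": {Component("gla1", "Cash")},
    "CONTROL": {Component("ctl1", "Compare batch totals")},
    "FLOW": {Component("f1"), Component("f2")},
    "TEC": {Component("C", "Completeness"), Component("E", "Existence")},
}
print(check_component_sets(ais))
```

A check of this kind would run whenever an auditor adds or edits a component, so that a malformed instance is rejected before any inference is attempted on it.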
Next, we specify the FLOWs in the AIS (see Figure 4). Each FLOW has an
origin and a destination, where information flows from the origin to
the destination. Origin and destination are modeled as functions. Their
domain and range for each class of flow are declared in Figure 4.
Because the output of an ITP is a flow and because ITPs contain
processes that can create errors, we can only draw conclusions about
the degree to which FLOWs whose origins are ITPs are free of errors. We
refer to these FLOWs as Checkable_Flows. We distinguish between two
classes of checkable flows, Middle_Flows and End_Flows. End_Flows are
FLOWs whose destination is a GLA, the output of the AIS. Middle_Flows
are FLOWs between ITPs. In contrast, Beginning_Flows are those whose
origin is an economic event and represent information-capturing
activity at the boundary of the organization where the auditor does not
have the ability to assess data reliability. That is, the auditor
cannot verify the accuracy of the information contained in the economic
event, only that the AIS has captured that information accurately. In
Figure 1, f1 is a Beginning_Flow, f5 is a Middle_Flow, and f11 is
an End_Flow.

Figure 4. Flows and Their Properties
\[
\begin{array}{l}
\text{Beginning\_Flow} \subseteq \text{Flow} \\
\text{Checkable\_Flow} \subseteq \text{Flow} \\
\text{Middle\_Flow} \subseteq \text{Checkable\_Flow} \\
\text{End\_Flow} \subseteq \text{Checkable\_Flow} \\
\forall I, J \in \text{Flow},\ I \cap J = \emptyset \\
\text{origin}: \text{Beginning\_Flow} \rightarrow \text{EE} \\
\text{destination}: \text{Beginning\_Flow} \rightarrow \text{ITP} \\
\text{origin}: \text{Middle\_Flow} \rightarrow \text{ITP} \\
\text{destination}: \text{Middle\_Flow} \rightarrow \text{ITP} \\
\text{origin}: \text{End\_Flow} \rightarrow \text{ITP} \\
\text{destination}: \text{End\_Flow} \rightarrow \text{GLA} \\
\forall I \in \text{ITP},\ (\exists K \in \text{ITP} \text{ and } I \neq K \text{ and } \exists J \in \text{Middle\_Flow}) \text{ or } (\exists J \in \text{Beginning\_Flow} \text{ and } \exists K \in \text{EE}), \\
\qquad \text{origin}(J) = K \wedge \text{destination}(J) = I \\
\forall I \in \text{GLA},\ \exists K \in \text{ITP} \text{ and } \exists J \in \text{End\_Flow},\ \text{origin}(J) = K \wedge \text{destination}(J) = I \\
\forall I \in \text{Beginning\_Flow}\ \exists J \in \text{Middle\_Flow},\ \text{destination}(I) = \text{origin}(J) \\
\forall I \in \text{Middle\_Flow}\ \exists J \in \text{End\_Flow},\ \text{destination}(I) = \text{origin}(J)
\end{array}
\]
[Hierarchy diagram: FLOW divides into Beginning_Flow and Checkable_Flow;
Checkable_Flow divides into Middle_Flow and End_Flow.]

Figure 5. Directionality
\[
\begin{array}{l}
\text{node-components} = \text{GLA} \cup \text{EE} \cup \text{ITP} \\
\text{upstream}: \text{node-components} \times \text{node-components} \\
\forall X, Y \in \text{node-components},\ \text{upstream}(X, Y) \text{ is interpreted as ``$X$ is upstream of $Y$''} \\
\forall I \in \text{Flow},\ \text{upstream}(\text{origin}(I), \text{destination}(I)) \\
\forall X, Y, Z \in \text{node-components},\ \text{upstream}(X, Y) \wedge \text{upstream}(Y, Z) \Rightarrow \text{upstream}(X, Z)
\end{array}
\]
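The typing of flows by the kinds of node at their origin and destination can be sketched in a few lines of Python. This is only an illustration of the declarations in Figure 4, not the implementation described in this paper; the function names `classify_flow` and `is_checkable` and the string labels for node kinds are assumptions made for the example.

```python
def classify_flow(origin_kind, destination_kind):
    """Return the Figure 4 flow class for an (origin, destination) pair.

    Node kinds are given as the strings "EE", "ITP", and "GLA".
    Any other combination has no flow class and raises ValueError.
    """
    table = {
        ("EE", "ITP"): "Beginning_Flow",
        ("ITP", "ITP"): "Middle_Flow",
        ("ITP", "GLA"): "End_Flow",
    }
    try:
        return table[(origin_kind, destination_kind)]
    except KeyError:
        raise ValueError(
            f"no flow class allows {origin_kind} -> {destination_kind}"
        ) from None


def is_checkable(flow_class):
    """Checkable flows are those whose origin is an ITP."""
    return flow_class in {"Middle_Flow", "End_Flow"}


# Hypothetical three-flow chain: EE -> ITP -> ITP -> GLA.
for origin, destination in [("EE", "ITP"), ("ITP", "ITP"), ("ITP", "GLA")]:
    flow_class = classify_flow(origin, destination)
    print(origin, "->", destination, flow_class, is_checkable(flow_class))
```

Rejecting an untyped pair such as an EE feeding a GLA directly mirrors the domain and range declarations: the auditor learns of a modeling error at entry time rather than during key control selection.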
Because the AIS is a connected set of FLOWs, the destination of a
Beginning_Flow is the origin of a Middle_Flow. In turn, the destination
of a Middle_Flow is the origin of an End_Flow. This requirement leads
to the specification of an AIS as a connected graph whose nodes are
EEs, ITPs, and GLAs and whose arcs are FLOWs (refer to Figure 4).
Because flows are directed from EEs to GLAs, we introduce a transitive
relation upstream that is defined on the set of ITPs, EEs, and GLAs
(i.e., the nodes in the graph structure). Upstream captures the
directionality of flows. Node A is upstream of Node B if information
flows from A to B, possibly through other nodes (see Figure 5).
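The upstream relation is the transitive closure of the direct origin-to-destination pairs, which a short Python sketch can make concrete. This is an illustration under stated assumptions rather than the paper's implementation: flows are represented as plain (origin, destination) tuples, the names `upstream_closure` and `has_cycle` are hypothetical, and the closure is computed by naive repeated joining, which is adequate only for small graphs.

```python
def upstream_closure(flows):
    """Compute the upstream relation from direct flows.

    flows: iterable of (origin, destination) node pairs.
    Returns the set of pairs (x, y) such that x is upstream of y,
    i.e., the transitive closure of the direct flow pairs.
    """
    closure = set(flows)
    changed = True
    while changed:
        changed = False
        snapshot = list(closure)
        for x, y in snapshot:
            for y2, z in snapshot:
                # upstream(x, y) and upstream(y, z) imply upstream(x, z)
                if y == y2 and (x, z) not in closure:
                    closure.add((x, z))
                    changed = True
    return closure


def has_cycle(closure):
    """A node that is upstream of itself signals a cycle."""
    return any(x == y for x, y in closure)


# Hypothetical chain: one economic event, two ITPs, one account.
flows = [("ee1", "itp1"), ("itp1", "itp2"), ("itp2", "gla1")]
closure = upstream_closure(flows)
print(("ee1", "gla1") in closure, has_cycle(closure))
```

Materializing the closure once makes later queries, such as whether one ITP is upstream of another, a simple set lookup.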
By construction, cycles are not possible in an AIS because the origin
of a flow is required to be upstream of its destination. We now turn to
error classes. Recall that the auditor associates a set of error
classes, referred to as target error classes, with a GLA. We model this
association as a set-valued function, TEC (target error class); see
Figure 6. In Figure 1, the TEC of the cash account is {C, E},
indicating that completeness and existence are the target error classes
for the cash account. We model the TEC mapping as a function to
facilitate inference.
Figure 6. Target Error Classes
\[
\begin{array}{l}
\text{TEC}: \text{GLA} \rightarrow [\text{EC}] \\
\forall I \in \text{GLA},\ \exists E \in [\text{EC}] \text{ such that } \text{TEC}(I) = E
\end{array}
\]

ITPs create errors in information flows, and controls are designed to
eliminate those errors. The essence of data reliability assessment is
to ensure that the controls are operating as designed. The error
classes that a control is designed to eliminate are modeled as a
relation called covers. In Figure 1, the control "Compare batch totals"
can eliminate completeness, existence, and valuation errors. A control
has access to information from the flows that emanate from at least one
ITP in an AIS. We model this as a relation located-at that relates a
control to the ITPs from which it can obtain information (see
Figure 7). This relation is not explicitly represented in Figure 1. The
control "Compare batch totals" is located at the "Check register" and
"Voucher register" ITPs. In addition, a control may have access to
other ITPs that are upstream of the ITPs at which it is located. For
example, the control "Compare batch totals" has access to information
from the "Disbursement" ITP as well, as indicated by its span (defined
below). Given the set of ITPs from which the control has access to
information, an evaluator needs to determine the set of flows, referred
to as the span of a control, on which the control can eliminate errors.
This requires an analysis of the structure of information flows in the
AIS and is a difficult and error-prone step for human auditors (cf. Q2
in §2.3). However, span is an important parameter of the
key-control-selection model, and §4.3 presents an algorithm that uses
the ontological model discussed in this section to compute the span of
each control in the AIS.
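The semantic constraint that Figure 7 places on spans-to (a control can only span to an ITP that is upstream of where it is located) can be checked mechanically. The Python sketch below is an illustration only and is not the span algorithm of §4.3: it applies the constraint literally as stated, to every pair of a spanned ITP and a located-at ITP, and the function name `span_violations`, the tuple encoding of the relations, and the node names are all hypothetical.

```python
def span_violations(spans_to, located_at, upstream):
    """List violations of the Figure 7 spans-to constraint.

    spans_to, located_at: sets of (control, itp) pairs.
    upstream: set of (x, y) pairs meaning x is upstream of y.
    Returns (control, t1, t2) triples where the control spans to t1 and is
    located at t2, but t1 is not upstream of t2.
    """
    bad = []
    for control, t1 in sorted(spans_to):
        for other, t2 in sorted(located_at):
            if control == other and (t1, t2) not in upstream:
                bad.append((control, t1, t2))
    return bad


# Hypothetical instance: one control located at two ITPs, spanning to a
# third ITP that is assumed to be upstream of both.
upstream = {("t_up", "t_loc1"), ("t_up", "t_loc2"), ("t_loc1", "t_loc2")}
located_at = {("ctl", "t_loc1"), ("ctl", "t_loc2")}
spans_to = {("ctl", "t_up")}
print(span_violations(spans_to, located_at, upstream))
```

An empty result means the declared span respects the directionality of flows; any triple returned points to the exact control and ITP pair an auditor would need to revisit.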
Figure 7. Controls and Their Properties
\[
\begin{array}{l}
\text{covers}: \text{Control} \times \text{EC} \\
\forall c \in C\ \exists E \in \text{EC},\ \text{covers}(c, E) \quad \text{(A control covers at least one error class)} \\
\text{located-at}: \text{Control} \times \text{ITP} \\
\forall c \in C,\ \exists T \in \text{ITP},\ \text{located-at}(c, T) \quad \text{(A control is located at an ITP)} \\
\text{spans-to}: \text{Control} \times \text{ITP} \\
\forall c \in C,\ T_1, T_2 \in \text{ITP},\ \text{spans-to}(c, T_1) \wedge \text{located-at}(c, T_2) \Rightarrow \text{upstream}(T_1, T_2) \\
\qquad \text{(A control can only span to an upstream ITP)}
\end{array}
\]

The set of objects and the relations presented define the structure
and the semantic constraints on the AIS. The expressiveness of these
constructs was validated using the set of real-world cases made
available by the firm that participated in the field study (see §5).
This specification can be implemented using a logic programming
language such as Prolog, or using a database with the constraints
stated in an imperative language such as C (the option we chose in our
implementation). Irrespective of the option chosen, the ontological
metamodel provides both a language and formal semantics for
representing and reasoning about the information flows and structure of
an AIS. We are not aware of other ontological models that have been
developed for the purpose of AIS reliability assessment and, therefore,
there may be other ways to structure an ontological metamodel to
support AIS reliability assessment. Because this is the first
metamodel for this task presented in the literature, we focus on
demonstrating that it will achieve the goals we have set for it. We now
present the mathematical model and associated algorithms used for
reliability assessment and discuss the model's relationship to the
ontological model.
4. Mathematical Model Development
The Key-Control-Selection Problem: Given an instance of the metamodel
of an AIS (as defined in the previous section) and a set of target
error classes, determine the smallest set of controls that will need to
be tested to assess the absence or presence of the target error classes
in the accounts of the AIS.
As noted in the previous section, a control is designed to eliminate a
specific set of error classes in the set of flows in its span.
Referring to the graphical model of the AIS (Figure 1), if the accounts
in the AIS are to be free of the target error classes, this requires
that each flow that directly or indirectly contributes to the
information in the accounts should also be free of these errors.
Auditors can establish if this is the case by testing a set of controls
whose coverage capability on the flows in their span is a superset of
the target error classes on these flows. We formulate this as a
set-covering problem (Cormen et al. 1992) under the following
assumptions. These assumptions were derived from our field interviews
with experienced auditors and from their firm's technical guidance
(Grant Thornton 1996) and were not adopted simply to make our model
more tractable. Auditing research also finds that auditors use simple
deterministic models to assess AIS reliability (Felix and Niles 1988,
Waller 1993), rather than probabilistic models.
Assumption 1. Controls are designed to operate deter-
ministically. For each error class in its coverage set, a con-
trol is designed to eliminate the error with probability 1.
Assumption 2. The capability of a control to elimi-
nate an error is independent of the capability of any other
control.
Assumption 3. The cost of testing a control is fixed
and is the same for every control.
Assumption 4. All ITPs generate all possible error
classes.
Although the assumptions on which the model is
based may seem to be simplistic and restrictive, the
most restrictive assumptions offset each other because
they have the opposite impact on reliability. Assump-
tion 4 (i.e., that all ITPs generate all error classes) off-
sets Assumption 1 (i.e., that controls eliminate errors
with probability 1). Further, the preliminary evaluation
we conducted (described in §5) supports the
validity of our approach. The mathematical formu-
lation of the key-control-selection problem is shown
below in Figure 8.
The objective function of the model minimizes the number of key
controls selected for testing and follows directly from Assumption 3.
However, it is quite straightforward to take differences in control
testing costs into account by modifying the objective function into a
weighted, additive cost function. The constraint set ensures that at
least one control that can cover each target error class on each flow
will be chosen. The constraint set follows from Assumptions 1, 2,
and 4.
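The objective and constraint set just described can be illustrated with a small exact solver in Python. This is a sketch under stated assumptions, not the solution method used in this paper: it enumerates every subset of controls, so it is practical only for a handful of controls; it assumes that a control covers error class e on flow f exactly when e is in the control's coverage set and f is in its span; and the function name `select_key_controls`, the optional `cost` argument (standing in for the weighted, additive cost function), and the toy data are hypothetical.

```python
from itertools import combinations


def select_key_controls(controls, covers, span, targets, cost=None):
    """Exhaustively solve a small key-control-selection instance.

    controls: list of control names.
    covers:   dict control -> set of error classes it can eliminate.
    span:     dict control -> set of flows in its span.
    targets:  dict flow -> set of target error classes E(f) on that flow.
    cost:     optional dict control -> testing cost (default 1 per control).
    Returns (set of selected controls, total cost), or (None, None) when
    no subset of controls covers every (flow, error class) requirement.
    """
    cost = cost or {c: 1 for c in controls}
    required = {(f, e) for f, errs in targets.items() for e in errs}
    # Assumed reading of the coverage matrix: control c covers (f, e)
    # when f is in its span and e is in its coverage set.
    d = {c: {(f, e) for f in span[c] for e in covers[c]} for c in controls}
    best, best_cost = None, None
    for size in range(len(controls) + 1):
        for subset in combinations(controls, size):
            covered = set().union(*(d[c] for c in subset))
            if required <= covered:
                total = sum(cost[c] for c in subset)
                if best_cost is None or total < best_cost:
                    best, best_cost = set(subset), total
    return best, best_cost


# Hypothetical toy instance with three controls and three flows.
controls = ["c1", "c2", "c3"]
covers = {"c1": {"C", "E"}, "c2": {"C"}, "c3": {"E", "V"}}
span = {"c1": {"f1", "f2"}, "c2": {"f3"}, "c3": {"f2", "f3"}}
targets = {"f1": {"C"}, "f2": {"C", "E"}, "f3": {"E"}}
print(select_key_controls(controls, covers, span, targets))
```

In this toy instance no single control covers all four requirements, and the pair c1 and c3 is the only two-control set that does, so it is selected. Passing unequal values in `cost` changes only the objective, leaving the covering constraints untouched.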
The inputs required to use the model for key control selection are the
set of information flows ($F$), the error classes on these flows that
need to be detected or prevented ($E(f)$), the set of available
controls ($C$), and their span ($D_{ce,f}$). These inputs are derived
(as noted in Figure 8) from an instance of the ontological model
developed by the auditor as a first step in the data reliability
assessment process. In other words, no additional effort is invested by
the auditor to use
Figure 8. The Set-Covering Model Formulation
\[
\begin{aligned}
\min \quad & \sum_{c \in C} K_c && \text{(Select the Minimum Number of Controls)} \\
\text{s.t.} \quad & \sum_{c \in C} D_{ce,f} \, K_c \ge 1 \quad \forall f \in F \text{ and } \forall e \in E(f) && \text{(Cover the set of TECs } E(f) \text{ for each flow } f\text{)} \\
& K_c = 0 \text{ or } 1 \quad \forall c \in C
\end{aligned}
\]
Where:
$E(f)$ is the set of error classes that need to be tested in a flow $f$ (this is algorithmically derived from an instance of the ontological model).
$F$ is a target set of flows (this is specified exogenously by the auditor).
$C$ is the set of controls (this is declared in the instance of the ontological model).
$K_c$ is a 0/1 variable with the following interpretation: $K_c = 1$ when the $c$th control is selected as a key control; $K_c = 0$ when the $c$th control is not selected as a key control.
$D_{ce,f}$ is a 0/1 matrix of constants (this is derived algorithmically from an instance of the ontological metamodel) such that $D_{ce,f} = 1$ where the $c$th control covers the $e$th TEC for the $f$th flow, and $D_{ce,f} = 0$ where the $c$th control does not cover the $e$th TEC for the $f$th flow.