Syntax Testing in Software Systems
UNIT IV
SYNTAX TESTING
1. WHY, WHAT, AND HOW:
1.1. Garbage
“Garbage-in equals garbage-out” is one of the worst cop-outs ever invented by the computer
industry.
We know when to use that one! When our program screws up and people are hurt.
An investigation is launched and it’s discovered that an operator made a mistake, the wrong tape
was mounted, or the source data were inconsistent, or something like that.
That’s the time to put on the guru’s mantle, shake your head, disclaim guilt, and mutter, “What do
you expect? Garbage-in equals garbage-out.”
Can we say that to the families of the airliner crash victims? Will you offer that excuse for the
failure of the intensive care unit’s monitoring system?
How about a nuclear reactor meltdown, a supertanker run aground, or a war? GIGO is no
explanation for anything except our failure to install good data-validation checks, or worse, our
failure to test the system’s tolerance for bad data.
Garbage shouldn’t get in—not in the first place or in the last place.
Every system must contend with a bewildering array of internal and external garbage, and if you
don’t think the world is hostile, how do you plan to cope with alpha particles?
1.2. Casual and Malicious Users
Systems that interface with the public must be especially robust and consequently must have
prolific input-validation checks.
It’s not that the users of automatic teller machines, say, are willfully hostile, but that there are so
many of them—so many of them and so few of us.
It’s the million-monkey phenomenon: a million monkeys sit at a million typewriters for a million
years and eventually one of them will type Hamlet.
The more users, the less they know, the likelier that eventually, on pure chance, someone will hit
every spot at which the system’s vulnerable to bad inputs.
There are malicious users in every population—infuriating people who delight in doing strange
things to our systems.
Years ago they’d pound the sides of vending machines for free sodas. Their sons and daughters
invented the “blue box” for getting free telephone calls.
Now they’re tired of probing the nuances of their video games and they’re out to attack
computers. They’re out to get you.
Some of them are programmers. They’re persistent and systematic. A few hours of attack by one
of them is worse than years of ordinary use and bugs found by chance. And there are so many of
them; so many of them and so few of us.
Then there’s crime. It’s estimated that computer criminals (using mostly hokey inputs) are raking
in hundreds of millions of dollars annually.
A criminal can do it with a laptop computer from a telephone booth in Arkansas.
Every piece of bad data accepted by a system—every crash-causing input sequence—is a chink
in the system’s armor that smart criminals can use to penetrate, corrupt, and eventually suborn
the system for their own purposes.
And don’t think the system’s too complicated for them.
They have your listings, and your documentation, and the data dictionary, and whatever else they
need.
There aren’t many of them, but they’re smart, motivated, and possibly organized.
1.3. Operators
Roger and I were talking about operators and the nasty things they can do, and the scenarios
were getting farfetched.
Who’d think of mounting a tape with a write-ring installed, writing a few blocks, stopping, opening
the transport’s door, dismounting the tape reel without unloading the buffers, removing the ring,
remounting the tape without telling the system, and then attempting to write a new block? The
malice we ascribed to the operators was embarrassing.
I said to Roger, the designer most concerned with the impact of operator shenanigans, “What the
operators have done to these systems in the past is bad enough—just imagine how they’d act if
they knew how we talked about them.”
To which he snapped, “If they knew how we talked about them, they’d probably act the way we
expect them to!”
I’m not against operators and I don’t intend to put them down. They’re our final defense against
our latent bugs.
Too often they manage, by intuition, common sense, and brilliance, to snatch a mere catastrophe
from the jaws of annihilation.
Operators make mistakes—and when they do, it can be serious. It’s right that they probe the
system’s defenses, catalog its weaknesses and prepare themselves for the eventualities we
didn’t think of.
1.4. The Internal World
Big systems have to contend not only with a hostile external environment but also with a hostile
internal environment.
Malice doesn’t play a role here, but oversight, miscommunication, and chance can be just as
deadly.
Any big system is subdivided into loosely coupled subsystems and consequently, there are many
internal interfaces.
Each interface presents another opportunity for data corruption and may require explicit internal-
data validation.
Furthermore, hardware can fail in bizarre ways that will cause it to pump streams of bad data into
memory, across channels, and so on.
Another piece of software may fail and do the same. And then there’re always alpha particles.
1.5. What to Do
Input validation is the first line of defense against a hostile world.
Good designers design their system so that it just doesn’t accept garbage—good testers subject
systems to the most creative garbage possible.
Input-tolerance testing is usually done as part of system testing, such as in a formal feature test
or in a final acceptance test, so it’s usually done by independent testers.
This kind of testing and test design is more fun than any other kind I know of: it’s great therapy
and they pay you for it.
My family and pets loved it when I was doing these tests; after I was through kicking and
stomping the programmers around, there wasn’t a mean bone left in my body.
But to be really diabolical takes organization, structure, discipline, and method.
Taking random potshots and waiting for inspirations with which to victimize the programmer won’t
do the job.
Syntax testing is a primary tool of dirty testing, and method beats sadism every time.
1.6. Applications and Hidden Languages
Opportunities for applying syntax testing abound in most systems because most systems have
hidden languages.
A hidden language is a programming language that hasn’t been recognized as such.
Remember the Third Law (Chapter 2, Section 3.4.1.): Code Migrates to Data.
One of the ways this happens is by taking control information and storing it as data (in lists or
tables, say) and then interpreting that data as statements in an unrecognized and undeclared,
internal, high-level language.
Syntax testing is used to validate and break the explicit or (usually) implicit parser of that
language.
The troubles with these hidden languages are: there’s no formal definition of syntax; the syntax
itself is often buggy; and parsing is inextricably intertwined with processing.
The key to exploiting syntax testing is to learn how to recognize hidden languages. Here are
some examples:
1. User and operator commands are obvious examples of languages. Don’t think that it doesn’t
pay to use syntax-testing methods because you only have a few commands. I’ve found it useful
for only one or two dozen commands. For mainframe systems there are system operator
commands and also a big set of user OS commands. For PC or any other application that can be
used interactively, there are application-specific command sets.
2. The counterparts to operator and user command languages for batch processing are job
control languages: either at the operating system level (e.g., JCL) or application-specific.
3. In Chapter 4, Section 5.2, I wrote about transaction-control languages. Reread that section for
this application.
4. A system-wide interprocess-communication convention has been established. Isn’t that a
minilanguage? A precompilation preprocessor has been implemented to verify that the
convention is followed: use syntax-testing thinking to test it.
5. An offline database generator package is used to create the database. It has a lot of fields to
look at and many rules to follow—and more syntax to check.
6. Any internal format used for interprocess communications that does not consist of simple,
fixed fields should be treated to a syntax test—for example, project or application calling
sequence conventions, or a macro library.
7. Almost any communication protocol has, in part, a command language and/or formats that
deserve syntax testing. Even something as simple as using a telephone can be tested by syntax
testing methods.
8. A complicated application may have several hidden languages: an external language for user
commands and an internal language, not apparent to the user, out of which the applications are
built. The internal languages could be subtle and difficult to recognize. For example, a language
could consist of a pattern of calls to worker subroutines. A deep call tree with a big common
subroutine library can be viewed as a syntax graph (see below). A tip-off for syntax testing is
that there are a few high-level routines with subroutine or function names in the call and a big
common subroutine library. When you see that kind of a call tree, think of syntax testing.
I wouldn’t use syntax-testing methods against a modern compiler.
The way modern compiler construction is automated almost guarantees that syntax testing won’t
be effective.
By extension, if the hidden language is out in the open and implemented as a real language, then
syntax testing will probably fail—not because the tests won’t be valid but because they won’t
reveal enough bugs to warrant the effort.
Syntax testing is a shotgun method that depends on creating many test cases.
Although any one case is unlikely to reveal a bug, the fact that many cases are used and that
they are very easy to design makes the method effective.
By the time syntax testing is typically used (in system testing), it’s almost impossible for the
kinds of bugs that syntax testing exposes to remain in a modern compiler.
Also, for most programming languages there’s no need to design these tests because you can
buy a test oracle for the compiler at a far lower cost than you can design the tests.
1.7. The Graph We Cover
Look at the back of almost any Pascal reference manual, and you’ll see several pages of graphs
such as the one shown in Figure 9.1.
It is a graph because it has nodes joined by links. The nodes shown here are either circles or
boxes.
The links are arrows as usual. The circles enclose actual characters.
The boxes refer to other parts of the syntax graph, which you can think of as subroutine calls.
The meanings attached to this graph are slightly different from those we’ve used before.
What do you do when you see a graph? COVER IT! Do you know how to do syntax testing? Of
course you do! Look at Figure 9.1.
You can test the normal cases of this syntax by using a covering set of paths through the graph.
For each path, generate a fields string that corresponds to that path.
There are two explicit loops in the graph, one at the top that loops around identifier and another
at the bottom following the “OF.”
There is an implicit loop that you must also test: note that fields calls itself recursively. I would
start with a set of covering paths that didn’t hit the recursion loop and, only after that was
satisfactory, hit the recursive cases.
In syntax testing, we must test the syntax graph with (at least) a covering set of test cases, but
we usually go much further and also test with a set of cases that cannot be on the graph—the
dirty cases.
We generate these by methodically screwing up one circle, box, or link at a time.
Look at the comma in the top loop: we can remove it or put something else in its place for two
dirty test cases.
We can also test a link that doesn’t exist, such as following the comma with a type as indicated
by the dotted arrow.
You also know the strategy for loops.
The obvious cases are: not looping, looping once, looping twice, one less than the maximum, the
maximum, and one more than the maximum.
The not-looping case is often productive, especially if it is a syntactically valid but semantically
obscure case.
The cases near the maximum are especially productive.
You know that there must be a maximum value to any loop and if you can’t find what that value
is, then you’re likely to draw blood by attempting big values.
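The loop strategy is mechanical enough to generate. Here is a minimal Python sketch of it,
assuming a hypothetical limit MAX_REPS for the identifier loop of Figure 9.1; the real limit has
to be dug out of the implementation or found by experiment.

MAX_REPS = 42  # hypothetical limit; find the real one or probe for it

def loop_probe_counts(maximum):
    # not looping, once, twice, one less than max, max, one more than max
    return [0, 1, 2, maximum - 1, maximum, maximum + 1]

# identifier lists of each probe length for the top loop of Figure 9.1
for n in loop_probe_counts(MAX_REPS):
    test_input = ", ".join(f"ID{i}" for i in range(1, n + 1))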
1.8. Overview
Syntax testing consists of the following steps:
1. Identify the target language or format (hidden or explicit).
2. Define the syntax (format) of the language, formally, in a convenient notation such as
Backus-Naur form (BNF).
3. Test and debug the syntax to assure that it is complete and consistent and that it satisfies the
intended semantics.
4. Normal condition testing consists of a covering set of input strings including critical loop
values. The difficult part about normal case testing is predicting the outcome and verifying that
the processing was correct. That’s ordinary functional testing—i.e., semantics. Covering the
syntax graph assures that all options have been tested. This is a minimum mandatory
requirement with the analogous strengths and weaknesses of branch testing for control
flowgraphs. It isn’t “complete” syntax testing by any measure.
5. Syntax testing methods pay off best for dirty testing. Test design is a top-down process that
consists of methodically identifying which component is to be cruddied-up and how.
6. Much of syntax test design can and should be automated by relatively simple means.
7. Test execution automation is essential for syntax testing because this method produces so
many tests.
2. A GRAMMAR FOR FORMATS:
2.1. Objectives
Every input has a syntax.
That syntax may be formally specified or undocumented and “just understood,” but it does exist.
Data validation consists (in part) of checking the input for correct syntax. It’s best when the
syntax is defined in a formal language—best for the designer and the tester.
Whether the designer creates the data-validation software from a formal specification or not is not
important to the tester, but the tester needs a formal specification to create useful garbage.
That specification is conveniently expressed in Backus-Naur form, which is very similar to regular
expressions.
Regular expressions were introduced in Chapter 8 as an algebraic representation of all the paths
in a graph.
It’s usually more convenient to deal with the algebraic version of graphs than with the pictorial
version.
Get comfortable with going back and forth between algebraic forms and pictorial forms for graphs
and with talk about “covering a graph” even if there’s no pictorial graph around.
This isn’t new to you because you worked with paths through an algebraic representation of a
graph long before you heard about flowgraphs: what did you mean by “paths through code”?
2.2. BNF Notation (BACK59)
2.2.1. The Elements
Every input can be considered as if it were a string of characters.
The software accepts valid strings and rejects invalid ones.
If the software fails on a string, we’ve really got it. If it accepts an invalid string, then it’s guilty of
GIGO.
There’s nothing we can do about syntactically valid strings whose values are valid but wrong—
that kind of garbage we have to accept.
The syntax definition must be formal, starting with the most elementary parts, the characters
themselves.
Here’s a sample definition:
alpha_characters ::= A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/Q/
R/S/T/U/V/W/X/Y/Z
numerals ::= 1/2/3/4/5/6/7/8/9
zero ::= 0
signs ::= !/#/$/%/&/*/(/)/-/+/=/;/:/“/’/,/./?
space ::= sp
The left-hand side of the definitions is the name given to the collection of objects on the right-
hand side.
The string “::=” is interpreted as a single symbol that means “is defined as.” The slash “/” means
“or.”
We could have used the plus sign for that purpose as with regular expressions but that wouldn’t
be in keeping with the established conventions for BNF. We are using BNF to define a miniature
language.
The “::=” is part of the language in which we talk about the minilanguage, called the
metalanguage.
Spaces are always confusing because we can’t display them on paper. We use sp to mean a
space.
The actual spaces on this page have no meaning.
Similarly, an italicized (or underlined) symbol is used for any other single character that can’t
conveniently be printed, such as null (nl), end-of-text (eot), clear-screen, carriage-return (cr), line-
feed (lf), tab, shift-up (su), shift-down (sd), index, backspace (bs), and so on.
The underlined space, as in alpha_characters, is used as usual in programming languages to
connect words that comprise a single object.
2.2.2. BNF Operators
The operators are the same as those used in path expressions and regular expressions: “or,”
concatenate (which doesn’t need a special symbol), “*”, and “+”.
Exponents, such as A^n, have the same meaning as before—n repetitions of the strings denoted
by the letter A.
Syntax is defined in BNF as a set of definitions.
Each definition may in turn refer to other definitions or to itself in such a way that eventually it
gets down to the characters that form the input string.
Here’s an example:
word ::= alpha_character alpha_character / numeral sp numeral
I’ve defined an input string called word as a pair of alpha_characters or a pair of numerals
separated by a space.
Here are examples of words and nonwords, by this definition:
words : AB, DE, XY, 3 sp 4, 6 sp 7, 9 sp 9, 1 sp 2
nonwords : AAA, A sp A1, A), 11, 111, WORD, NOT sp WORD, +
There are 722 possible words in this format and an infinite number of nonwords. If the strings are
restricted to four characters, there are more than a million nonwords.
The designer wants to detect and accept words and reject nonwords; the tester wants to
generate nonwords and force the program to deal with them.
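As a minimal sketch (not the text’s own code), the word definition above can be expressed as a
recognizer in a few lines of Python, assuming sp is a literal space character; the tester’s
business is the converse—generating strings this pattern must reject.

import re

# word ::= alpha_character alpha_character / numeral sp numeral
WORD = re.compile(r"^(?:[A-Z]{2}|[1-9] [1-9])$")

assert WORD.match("AB")        # word
assert WORD.match("3 4")       # word: numeral sp numeral
assert not WORD.match("AAA")   # nonword: too long
assert not WORD.match("11")    # nonword: numerals need a space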
2.2.3. Repetitions
As before, object^1-3 means one to three objects, object* means zero or more repetitions of object
without limit, and object+ means one or more repetitions of object.
Neither the star (*) nor the plus (+) can legitimately appear in any syntax because both symbols
mean a possibly infinite number of repetitions.
That can’t be done in finite memory or in the real world. The software must have some means to
limit repetitions.
It can be done by an explicit test associated with every + or * operator, in which case you should
replace the operator with a number.
Another way to limit the repetitions is by placing a global limit on the length of any string.
The limit then applies to all commands and it may be difficult to predict what the actual limit is for
any specific command.
You test this kind of limit by maximum-length strings. Yet another way to implement limits is to
limit a common resource such as stack or array size.
Again, the limits for a specific command may be unpredictable because it is a global limit rather
than a format-specific limit.
The way to test this situation is with many repetitions of short strings.
For example, using as many as possible minimum-length identifiers and not including expression
in the first loop of Figure 9.1: “ID1, ID2, ID3, ID4, ..., ID999: type” will do the job.
One of the signs of weak software is the ease with which you can destroy it by overloading its
repetition-limiting mechanisms.
If the mechanism doesn’t exist, you can probably scribble all over the stack or code—crash-
crash, tinkle-tinkle, goody-goody.
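A minimal Python sketch of the many-short-repetitions attack described above, using the Figure
9.1 identifier list; the count 999 is arbitrary and should be pushed until the global limit (if
any) gives way.

def identifier_flood(count=999):
    # many minimum-length identifiers, no expression in the first loop
    ids = ", ".join(f"ID{i}" for i in range(1, count + 1))
    return ids + ": type"

# identifier_flood(999) -> "ID1, ID2, ..., ID999: type"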
2.2.4. Examples
This is an example and not a real definition of a telephone number:
special_digit ::= 1/2/5
zero ::= 0
other_digit ::= 3/4/6/7/8/9
ordinary_digit ::= special_digit / zero / other_digit
exchange_part ::= other_digit^2 ordinary_digit
number_part ::= ordinary_digit^4
phone_number ::= exchange_part number_part
According to this definition, the following are phone_numbers:
3469900, 9904567, 3300000
and these are not:
5551212, 5510000, 123, 8, ABCDEFG, 572-5580, 886-0144.
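A hedged sketch of the corresponding recognizer in Python: two other_digits, one ordinary_digit,
then four ordinary_digits. A real data-validation routine would of course be the thing under
test, not this stand-in.

import re

# exchange_part: other_digit^2 ordinary_digit; number_part: ordinary_digit^4
PHONE = re.compile(r"^[346789]{2}[0-9][0-9]{4}$")

assert PHONE.match("3469900")       # valid
assert not PHONE.match("5551212")   # starts with special_digits
assert not PHONE.match("123")       # too short
assert not PHONE.match("572-5580")  # '-' is not a digit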
Another example:
operator_command ::= mnemonic field_unit^1-8 +
An operator_command consists of a mnemonic followed by one to eight field_units and a
plus sign.
field_unit ::= field delimiter
mnemonic ::= first_part second_part
delimiter ::= sp / , / . / $ / × / sp^1-42
field ::= numeral / alpha / mixed / control
first_part ::= a_vowel a_consonant
second_part ::= b_consonant alpha
a_vowel ::= A/E/I/O/U
a_consonant ::= B/D/F/G/H/J/K/L/M/N/P/Q/R/S/T/V/X/Z
b_consonant ::= B/G/X/Y/Z/W/M/R/C
alpha ::= a_vowel / a_consonant / b_consonant
numeral ::= 1/2/3/4/5/6/7/8/9/0
control ::= $/×/%/sp/@
mixed ::= control alpha control /
control numeral control /
control control control
Here are some valid operator_commands:
ABXW A. B. C. 7. +
UTMA W sp sp sp sp +
While the following are not operator_commands:
ABC sp +
A sp BCDEFGHIJKLMNOPQR sp^47 +
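The grammar above is also easy to drive as a string generator. Here is a minimal Python sketch,
assuming (as the examples suggest) that a space separates the mnemonic from the first
field_unit; it produces valid operator_commands by random derivation.

import random

A_VOWEL = "AEIOU"
A_CONSONANT = "BDFGHJKLMNPQRSTVXZ"
B_CONSONANT = "BGXYZWMRC"
NUMERAL = "1234567890"
CONTROL = ["$", "×", "%", " ", "@"]
DELIMITER = [" ", ",", ".", "$"]

def alpha():
    # alpha ::= a_vowel / a_consonant / b_consonant
    return random.choice(A_VOWEL + A_CONSONANT + B_CONSONANT)

def field():
    # field ::= numeral / alpha / mixed (one mixed form shown) / control
    return random.choice([
        random.choice(NUMERAL),
        alpha(),
        random.choice(CONTROL) + alpha() + random.choice(CONTROL),
        random.choice(CONTROL),
    ])

def operator_command():
    mnemonic = (random.choice(A_VOWEL) + random.choice(A_CONSONANT) +
                random.choice(B_CONSONANT) + alpha())
    units = "".join(field() + random.choice(DELIMITER)
                    for _ in range(random.randint(1, 8)))
    return mnemonic + " " + units + "+"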
The telephone number example and the operator command example are different.
The telephone number started with recognizable symbols and constructed the more complicated
components from them—a bottom-up definition.
The command example started at the top and worked down to the real characters—a top-down
definition.
These two ways of defining things are equivalent—it’s only a matter of the order in which the
definition lines are printed.
The top-down order is generally more useful and it’s the usual form for language design.
Looking at the definition from the top down leads you to some tests and looking from the bottom
up can lead to different tests.
As a final notational convenience, it’s sometimes useful to enclose an expression in parentheses
to reduce the number of steps in the definition.
For example, the definition step for field_unit could have been simplified as follows:
operator_command ::= mnemonic (field delimiter)^1-8 +
This is fine if the syntax doesn’t use parentheses that can confuse you in the definitions;
otherwise use some other bracket symbols such as < and >.
BNF notation can also be expanded to define optional fields, conditional fields, and the like.
In most realistic formats of any complexity, you won’t be able to get everything expressed in this
notation—nor is it essential that you do so; additional narrative descriptions may be needed.
3. TEST CASE GENERATION
3.1. Generators, Recognizers, and Approach
A data-validation routine is designed to recognize strings that have been explicitly or implicitly
defined in accordance with an input syntax.
It either accepts the string, because it is recognized as valid, or rejects it and takes appropriate
action. The routine is said to be a string recognizer.
Conversely, the tester attempts to generate strings and is said to be a string generator.
There are three possible kinds of incorrect actions:
1. The recognizer does not recognize a good string.
2. It accepts a bad string.
3. It may accept or reject a good string or a bad string, but in so doing, it fails.
Even small specifications lead to many good strings and far more bad strings.
There is neither time nor need to test them all. String errors can be categorized as follows:
1. High-Level Syntax Errors—The strings have violations at the topmost level in a top-down
BNF syntax specification.
2. Intermediate-Level Syntax Errors—Syntax errors at any level other than the top or bottom.
3. Field-Syntax Errors—Syntax errors associated with an individual field, where a field is
defined as a string of characters that has no subsidiary syntax specification other than the
identification of characters that compose it. A field is the lowest level at which it is productive
to think in terms of syntax testing.
4. Delimiter Errors—Violation of the rules governing the placement and the type of
characters that must appear as separators between fields.
5. Syntax-Value Errors—When the syntax of one field depends on values of other fields,
there is a possibility of an interaction between a field-value error and a syntax error—for
example, when the contents of a control field dictate the syntax of subsequent fields. This is
a messy business that permits no reasonable approach. It needs syntax testing combined
with domain testing, but it’s better to redesign the syntax.
6. State-Dependency Errors—The permissible syntax and/or field values are conditional on
the state of the system or the routine. A command used for start-up, say, may not be allowed
when the system is running. If state behavior is extensive, consider state testing (Chapter
11).
Errors in the values of the fields or the relation between field values are domain errors and should
be tested accordingly.
3.2. Test Case Design
3.2.1. Strategy
The strategy is to create one error at a time, while keeping all other components of the input
string correct; that is, in the absence of the single error, the string would have been accepted.
Once a complete set of tests has been specified for single errors, do the same for double errors
and then triple errors.
However, if there are of the order of N single-error cases, there will be of the order of N^2 double-
error and N^3 triple-error cases.
Once past the single errors, it takes a lot of judicious pruning to keep the number of tests
reasonable. This is almost impossible to do without looking at the implementation details.
3.2.2. Top, Intermediate, and Field-Level Syntax Errors
Say that the topmost syntax level is defined as:
item ::= atype / btype / ctype / dtype etype
Here are some obvious test cases:
1. Do It Wrong—Use an element that is correct at some other lower syntax level, but not at
this level.
2. Use a Wrong Combination. The last element is a combination of two other elements in a
specified order. Mess up the order and combine the wrong things:
dtype atype / btype etype / etype dtype / etype etype / dtype dtype
3. Don’t Do Enough—For example,
dtype / etype
4. Don’t Do Nothing. No input, just the end-of-command signal or carriage return. Amazing
how many bugs you catch this way.
5. Do Too Much—For example:
atype btype ctype dtype etype / atype atype atype /
dtype etype atype / dtype etype etype / dtype etype^128
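A minimal sketch of how the one-error-at-a-time strategy of Section 3.2.1 mechanizes most of
these cases for the item syntax above; the correct forms are data, and each mutation keeps
everything else valid.

# the four correct top-level forms of item
CORRECT = [["atype"], ["btype"], ["ctype"], ["dtype", "etype"]]

def single_error_cases():
    cases = [[]]                         # 4. don't do nothing
    for form in CORRECT:
        cases.append(form * 2)           # 5. do too much
        cases.append(form[::-1])         # 2. wrong combination/order
        if len(form) > 1:
            cases.append(form[:-1])      # 3. don't do enough
    return cases

# note: reversing a one-element form regenerates a correct string—
# the procedure can reach the same (even good) case by two paths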
Focus on one level at a time and keep the level above and below as correct as you can. It may
help to draw a definition graph; we’ll use the telephone number example (see Figure 9.2).
Check the levels above and below as you generate cases. Not everything generated by this
procedure produces bad cases, and the procedure may lead to the same test case by two
different paths.
The corruption of one element could lead to a correct but different string.
Such tests are useful because logic errors in the string recognizer might miscategorize the string.
Similarly, if a test case (either good or bad) can be generated via two different paths, it is an
especially good case because there is a potential for confusion in the routine.
I like test cases that are difficult to design and difficult to recognize as either good or bad because
if I’m confused, it’s likely that the designer will also be confused.
It’s not that designers are dumb and testers smart, but designers have much more to do than
testers.
To design and execute syntax-validation tests takes 5% to 10% of the effort needed to design,
code, test, validate, and integrate a syntax-validation routine.
The designer has 10 to 20 times as much work to do as does the tester.
Given equal competence, if the tester gets confused with comparatively little to do, it’s likely that
the overloaded designer will be more confused by the same case.
Now look at Figure 9.3. I generated this syntax graph from Figure 9.2 by inserting the subsidiary
definitions and simplifying by using regular expression rules.
The result is much simpler. It can obviously be covered by one test for the normal path and there
are far fewer dirty tests:
1. Start with a special_digit.
2. Start with a zero.
3. Only one other_digit before a zero.
4. Only one other_digit before a special_digit.
5. Not enough digits.
6. Too many digits.
7. Selected nondigits.
You could get lavish and try starting with two zeros, two special_digits, zero followed by
special_digit, and special_digit followed by zero, thereby hitting all the double-error cases of
interest; but there’s not much more you can do to mess up this simplified graph. Should you do
it?
No! The implementation will tend to follow the definition and so will the bugs.
Therefore, there’s a richer possibility for variations of bad strings to which the simplified version is
not vulnerable.
Don’t expect to find opportunities for such simplifications in the syntax of mature programming
languages—the language designers usually do as much simplification as makes sense before
the syntax is released.
For formats, operator commands, and hidden languages, there are many such opportunities.
The lesson to be learned from this is that you should always simplify the syntax graph if you can.
The implementation will be simpler, it will take fewer tests to cover the normal cases, and there
will be fewer meaningful dirty tests.
3.2.3. Delimiter Errors
Delimiters are characters or strings placed between two fields to denote where one ends and the
other begins.
Delimiter problems are an excellent source of test cases. Therefore, it pays to identify the
delimiters and the rules governing their syntax.
1. Missing Delimiter—This kind of error causes adjacent fields to merge. This may result in a
different, but valid, field or may be covered by another kind of syntax error.
2. Wrong Delimiter—It’s nice when several different delimiters are used and there are rules
that specify which can be used where. Mix them up and use them in the wrong places.
3. Not a Delimiter—There are some characters or strings that are not delimiters but could be
put into that position. Note the possibility of changing adjacent field types as a result.
4. Too Many Delimiters—The number of delimiters appearing at a field boundary may be
variable. This is typical for spaces, which can serve as delimiters. If the number of delimiters
is specified as 1 to N, it pays to try 0, 1, 2, N - 1, N, N + 1, and also an absurd number of
delimiters, such as 127, 128, 255, 256, 1024, and so on.
5. Paired Delimiters—These delimiters represent another juicy source of test cases.
Parentheses are the archetypal paired delimiters. There could be several kinds of paired
delimiters within a syntax. If paired delimiters can nest, as in “( ()() )”, there are a whole set of
new evils to perpetrate. For example, “BEGIN...BEGIN...END”, “BEGIN...END...END”.
Nested paired delimiters provide opportunities for matching ambiguities. For example,
“((()(()))” has a matching ambiguity and it’s not clear where the missing parenthesis belongs.
6. Tolerant Delimiters—The delimiter may be optional or several alternate formats may be
acceptable. In communications systems, for example, the start of message is defined as
ZCZC, but many systems allow any one of the four characters to be corrupted. Therefore,
#CZC, Z#ZC, ZC#C, and ZCZ# (where “#” denotes any character) are all acceptable; there
are many nice confusing ways to put in a bad character here:
a. A blank.
b. Z or C in the wrong place—CCZC, ZZZC, ZCCC, ZCZZ (catches them every time!).
c. Something off-the-wall—especially important control characters in some other context.
Tolerance is most often provided for delimiters but can also be provided for individual fields and
higher levels of syntax.
It’s a sword of many edges—more than two for sure—all of them dangerous.
Syntax tolerance is provided for user and operator convenience and in the interest of making the
system humane and robust.
But it also makes the format-validation design job and testing format-validation designs more
complicated.
Format tolerance is sophisticated and takes sophisticated designs to implement and,
consequently, many more and more complicated tests to validate.
If you can’t do the whole job from design to thorough validation, there’s no point in providing the
tolerance.
Most users and operators prefer a solid system with rigid syntax rules to a system with tolerant
rules that don’t always work.
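A minimal sketch of a delimiter-error generator for one field boundary, covering cases 1
through 4 above; the delimiter set, the non-delimiter characters, and the limit N are all
assumptions to be replaced by the real format’s rules.

DELIMITERS = [" ", ",", ".", "$"]   # assumed delimiter set
NON_DELIMITERS = ["@", "Q", "7"]    # legal characters, wrong place
N = 42                              # assumed repetition limit

def delimiter_errors(good, i):
    # 'good' is a valid string whose single delimiter sits at index i
    cases = [good[:i] + good[i+1:]]                  # 1. missing
    cases += [good[:i] + d + good[i+1:]              # 2. wrong delimiter
              for d in DELIMITERS if d != good[i]]
    cases += [good[:i] + c + good[i+1:]              # 3. not a delimiter
              for c in NON_DELIMITERS]
    cases += [good[:i] + good[i] * k + good[i+1:]    # 4. too many
              for k in (2, N - 1, N, N + 1, 255)]
    return cases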
3.2.4. Field-Value Errors
Field-value errors are clearly a domain-testing issue, and domain testing is where it’s at.
Whether you choose to implement field-value errors in the context of syntax testing or the other
way around (i.e., syntax testing under domain testing) or whether you choose to implement the
two methods as separate test suites depends on which aspect dominates.
Syntax-testing methods will usually wear out more quickly than will domain testing.
For that reason, it pays to separate domain and syntax tests into different suites.
You may not be able to separate the two test types because of (unfortunately) context-dependent
syntax—either field values whose domain depends on syntax or syntax that depends on field
values (ugh!).
Here’s a reminder of what to look for: boundary values and near-boundary values, excluded
values, binary values for integers, and values vulnerable to semantic type changes and
representation changes.
3.2.5. Context-Dependent Syntax Errors
Components of the syntax may be interrelated and may be related by field values of other fields.
The first field could be a code that specifies the syntax of subsequent fields. As an example:
command ::= pilot_field syntax_option
pilot_field ::= 1/2/3/4/5/6/7/8/9
syntax_option ::= option1 / option2 / option3 / . . .
The specification further states that option1 must be preceded by “1” as the value of the
pilot_field. Actually, it would have been better had the specification been written as:
command ::= 1 option1 / 2 option2 / 3 option3 / . . .
but that’s not always easy to do. The test cases to use are clearly invalid combinations of
syntactically valid field values and syntactically valid options.
If you can rewrite the specification, as in the above example, to avoid such field-value
dependencies, then it’s better to do so; but if so doing means vast increases in formality, which
could be handled more clearly with a side-bar notation that specifies the relation, then it’s better
to stick to the original form of the specification.
The objective is to make things as clear as possible to the designer and to yourself, and
excessive formality can destroy clarity just as easily as modest formality can enhance it.
3.2.6. State-Dependency Errors
The string or field value that may be acceptable at one instant may not be acceptable at the next
because validity depends on the transaction’s or the system’s state.
As an example, say that the operator’s command-input protocol requires confirmation of all
commands.
After every command the system expects either an acknowledgment or a cancellation, but not
another command.
A valid command at that point should be rejected, even though it is completely correct in all other
respects.
As another example, the system may permit, as part of a start-up procedure, commands that are
forbidden after the system has been started, or it may forbid otherwise normal commands during
a start-up procedure.
A classical example occurs in communications systems.
The start of message sequence ZCZC is allowed tolerance (see the tolerant-delimiter discussion
in Section 3.2.3) when it occurs at the beginning of a message.
However, because the system has to handle Polish language words such as: “zczalny”,
“jeszcze”, “deszcz”, and “zczotka” (BEIZ79), the rules state that any subsequent start-of-message
sequence that occurs prior to a correct end-of-message sequence (NNNN) must be intolerant;
the format is changed at that point and only an exact “ZCZC” will be accepted.
I divide state-dependency errors into “simple” and “complicated.”
The simple ones are those that can be described by at most two states.
All the rest are complicated and are best handled by the methods of Chapter 11.
The simple ones take two format or two field-value specifications—and require at worst double
the work.
3.3. Sources of Syntax
3.3.1. General
Where do you get the syntax? Here’s another paradox for you.
If the syntax is served up in a nice, neat package, then syntax-testing methods probably won’t be
effective, and if syntax testing is effective, you’ll have to dig out and formally define the syntax
yourself. Where do you get the syntax?
Ideally, it comes to you previously defined, formally defined, in BNF or an equivalent, equally
convenient notation.
That’s the case for common programming languages, command languages written by and for
programmers, and languages and formats defined by a formal standard.
3.3.2. Designer-Tester Cooperation and Design Specifications
If there is no BNF specification, I try to get the designers to create one—at least the first version
of one.
Realistically, though, if a BNF specification does not exist, the designers will have to create a
document that can be easily converted into one—otherwise, what are they designing to?
If you get the designer to create the first version of the BNF specification, you may find that it is
neither consistent nor complete.
Test design begins with requests for clarification of that preliminary specification. Many serious
bugs can be avoided this way.
Do it yourself if you can’t get the designers to create the first version of the BNF specification.
It doesn’t really matter whether it’s complete or correct, as long as it’s down on paper and formal.
Present your specification version to the designers and say that tests will be defined accordingly.
There may be objections, but the result should be a reasonably correct version in short order.
Using a BNF specification is the easiest way to design format-validation test cases. It’s also the
easiest way for designers to organize their work, but sadly they don’t always realize that.
You can’t begin to design tests unless you agree on what is right or wrong.
If you try to design tests without a formal specification, you’ll find that you’re throwing cases out,
both good and bad, as the designers change the rules in response to the cases you show them.
If you can’t get agreement on syntax early in the project, put off syntax test design and
concentrate on some other area.
Alternatively, and more productively, participate in the design under the guise of getting a
specification tied down.
You’ll prevent lots of bugs that way.
It can boomerang, though. I pushed for a BNF specification of operator commands on one
project.
The commands had been adapted from a previous system in the same family whose formats
were clearly specified but not in BNF.
This designer fell in love with BNF and created a monster that was more complicated than a
combination of Ada and COBOL—and mostly wrong.
To make matters worse, his first version of the operator’s manual was written in top-down BNF,
so operators had to plow through several levels of abstract syntax to determine which keys to hit.
Good human engineering will dictate simple, clean, easy-to-understand syntax for user and
operator interfaces.
Similarly, internal formats for interprocess communications should also be simple.
There’s usually a topmost syntax level, several field options, and a few subfields.
Recursive definitions are rare (or should be).
We do find useful recursive definitions in operating system command languages or data query
languages for things such as sorting, data object specifications, and searching; but in general,
recursive definitions are rare and more likely to be a syntax-specification error than a real
requirement.
Be suspicious if the syntax is so complicated that it looks like a new programming language.
That’s not a reasonable thing to expect users or operators to employ.
3.3.3. Manuals as Sources
Manuals, such as instruction manuals, reference manuals, and operator manuals, are the obvious
place to start for command languages if there isn’t a formal syntax document and you can’t get
the designers to do the job for you.
The syntax in manuals may be fairly close to a formal syntax definition.
Manuals are good sources because more often than not, we’re dealing with a maintenance
situation, rehosting, or a rewrite of an old application.
But manuals can be mushy because the manual writer tries to convey complicated syntactic
issues in a language that is “easy for those dumb users and operators.”*
*Another war story about my favorite bad software vendor, Coddler Incorporated—a pseudonym
invented to protect me from lawsuits.
They had a word processing programming language that was putatively designed for use by word
processing operators.
The language’s syntax description in the manual was sketchy and wrong.
It’s the only programming language that ever defeated me because after weeks of trying I still
couldn’t write more than 20 statements without a syntax error.
When I asked them for complete documentation of the language’s syntax, I was told that, as a
mere user, I wasn’t entitled to such “proprietary” information. We dragged the syntax out of that
evil box experimentally and discovered so many context and state dependencies that it was clear
to us that they had broken new ground in language misdesign.
What was even worse, when they confronted you with a syntax error, it was presented by
reference to the tokenized version of the source after compilation rather than to the source—you
guessed it, the token table was also “proprietary.”
We dragged that out experimentally also and discovered even more state dependencies.
Syntax testing that garbage dump was like using hydrogen bombs against a house of cards.
The ironic thing about this experience was that we were trying to use this “language” and its
“processor” to automate the design of syntax tests for the system we were testing.
3.3.4. Help Screens and Systems
Putting user information such as command syntax into HELP systems and on-line tutorials is
becoming more commonplace, especially for PC software, because it’s cheaper to install a few
hundred K of HELP material on a floppy than it is to print a few hundred pages of instruction
manual.
You may find the undocumented syntax on these screens.
If you have both manuals and help systems, compare them and find out which one is correct.
3.3.5. Data Dictionary and Other Design Documents
For internal hidden languages, your most likely source of syntax is the data dictionary and other
design documents.
Also look at internal interface standards, style manuals, and design practice memos.
Common subroutine and macro library documents are also good sources.
Obviously you can’t expect designers to hand you the syntax of a language whose existence they
don’t even recognize.
3.3.6. Prototypes
If there’s a prototype, then it’s likely to embody much of the user interface and command
language syntax you need. This source will become more useful in the future as prototyping
gains popularity. But remember that a prototype doesn’t really have to work, so what you get
could be incomplete or wrong.
3.3.7. Programmer Interviews
The second most expensive way to get user and operator command syntax is to drag the
information out of the implementing programmer’s head by interviews.
I would do it only after I had exhausted all other sources.
If you’re forced to do this as your only source, then syntax testing may be pointless because a
low-level programmer is making user interface or system architecture decisions and the project’s
probably aground on the reef—it just hasn’t sunk yet.
Syntax testing is then just a cruel way to demonstrate that fact—pitiful.
3.3.8. Experimental
The most expensive possible way to get the syntax is by experimenting with the running program.
Think back to the times you’ve had to use a new system without an instruction manual and of
how difficult it was to work out even a few simple commands—now think of how much work that
can be for an entire set of commands; but for dirty old code, it’s sometimes the only way.
You got it.
Take what you know of the syntax and express it in BNF.
Apply syntax testing to that trial syntax and see what gets accepted, what gets rejected, and what
causes crashes and data loss.
You’ll have a few surprises that will cause you to change your syntax graph.
Change it and start over again until either the money runs out or the program’s been scrapped.
Looking at the code may help, but it often doesn’t because, as often as not, parsing, format
validation, domain validation, and processing are hopelessly intertwined and splattered across
many components and many levels.
It’s usually pretty tender software.
If you have to live with it, think in terms of putting a proper format validation front end on such
junk (see Section 6 below) and avoid the testing altogether.
3.4. Ambiguities and Contradictions
Unless it’s the syntax of a programming language or a communication format or it’s derived from
a previous system or from some other source that’s been in use for a long time, it’s unlikely that
the syntax of the formats you’re testing will be correct the first time you test it.
There will be valid cases that are rejected and other cases, valid or otherwise, for which the
action is unpredictable.
I mean fundamental errors in the syntax itself and not in the routines that analyze the format.
If you have to create the format syntax in order to design tests, you are in danger of creating the
format you want rather than the one that’s being implemented.
That’s not necessarily bad if what you want is a simpler, more reliable, and easier format to use,
implement, and test.
Ambiguities are easy to spot—there’s a dangling branch on the definition tree.
That is, something appears on the right-hand side of a definition, but there’s no definition with
that term on the left-hand side.
An obvious contradiction occurs when there are two or more definitions for the same term.
As soon as you permit recursive definitions, state dependencies, and context-dependent syntax,
the game’s over for easy ambiguity and contradiction spotting—in fact, the problem’s known to be
unsolvable.
Detecting ambiguities and contradictions in the general case is a language-validation problem
and beyond the scope of this book by a wide margin.
I’m assuming that we’re dealing only with the simpler and more obvious ambiguities and
contradictions that become apparent when the format is set down formally (e.g., written in BNF)
for the first time.
The point about syntactic ambiguities and contradictions (as I’ve said several times before) is that
although a specification can have them, a program, because it is deterministic, is always
unambiguous and consistent.
Therefore, without looking at the code, even before the code’s been designed, you know that
there must be bugs in the implementation.
Take advantage of every ambiguity and contradiction you detect in the format to push the
format’s design into something that has fewer exception conditions, fewer state dependencies,
fewer field correlations, and fewer variations. Keep in close contact with the format’s designer,
who is often also the designer of the format-analysis routine.
Maintain a constant pressure of weird cases, interactions, and combinations.
Whenever you see an opportunity to simplify the format, communicate that observation to the
format’s designer: he’ll have less to design and code, you’ll have less to test, and the user will
thank you both for a better system.
It’s true that flaws in the syntax may require a more elaborate syntax and a more complicated
implementation.
However, my experience has been that the number of instances in which the syntax can be
simplified outnumber by about 10 to 1 the instances in which it’s necessary to complicate it.
3.5. Where Did the Good Guys Go?
Syntax test design is like a lot of other things that are hard to stop once you’ve started.
A little practice with this technique and you find that the most innocuous format leads to hundreds
of tests; but there are dangers to this kind of test design.
1. It’s Easy to Forget the Normal Cases—I’ve done it often. You get so entangled in creative
garbage that you forget that the system must also be subjected to good inputs. I’ve made it a
practice to check every test area explicitly for the normal case. Covering the syntax definition
graph does it.
2. Don’t Go Overboard with Combinations—It takes iron nerves to do this. You’ve done all
the single-error cases, and in your mind you know exactly how to create the double—and
higher-error cases. And there are so many of them that you can create an impressive mound
of test cases in short order. “How can the test miss anything if I’ve tried 1000 input format
errors?” you think. Remind yourself that any one strategy is inherently limited to discovering
certain types of bugs. Remind yourself that those N^2 double-error cases and N^3 triple-error
cases may be no more effective than trying every value from 1 to 1023 in testing a loop.
Don’t let the test become top-heavy with syntax tests at the expense of everything else just
because syntax tests are so easy to design.
3. Don’t Ignore Structure—Just because you can design thousands of test cases without
looking at the code that handles those cases doesn’t mean you should do it that way.
Knowing the program’s design may help you eliminate cases wholesale without sacrificing
the integrity and thoroughness of the test. As an example, say that operator-command
keywords are validated by a general-purpose preprocessor routine. The rest of the input
character string is passed to the appropriate handler for that operator command only after the
keyword has been validated. There would be no point to designing test cases that deal with
the interaction of keyword errors, the delimiter between the keyword and the first field, and
format errors in the first field. You don’t have to know a whole lot about the implementation.
Often, just knowing what parts of the format are handled by which routines is enough to avoid
designing a lot of impressive but useless error combinations that won’t prove a thing. The
bug that could creep across that kind of interface would be so exotic that you would have to
design it. If it takes several hours of work to postulate and “design” a bug that a test case is
supposed to catch, you can safely consider that test case as too improbable to worry about—
certainly in the context of syntax testing.
4. There’s More than One Kind of Test—Did you forget that you designed path tests and
domain tests—that there are state tests to design (Chapter 11), data-flow tests (Chapter 5),
or logic-based tests (Chapter 10)? Each model of the system’s behavior leads to tests
designed from a different point of view, but many of these tests overlap. Although redundant
tests are harmless, they cost money and little is learned from them.
5. Don’t Make More of the Syntax Than There Is—You can increase or decrease the scope
of the syntax by falsely making it more or less tolerant than it really is. This may lead to the
false classification of some good input strings as bad and vice versa—not a terrible problem,
because if there is confusion in your mind, there may be confusion in the designer’s mind. At
worst, you’ll have to reclassify the outcome of some cases from “accept” to “reject,” or vice
versa.
6. Don’t Forget the Pesticide Paradox—Syntax tests wear out fast. Programmers who don’t
change their design style after being mauled by syntax testing can probably be thrashed by
any testing technique, now and forever. However they do it, by elegant methods or by brute
force, good programmers will eventually become immune to syntax testing.
It’s about the only test design automation area in which you can count on a payback the first time
out.
4.2.2. Primitive Methods
You can do design automation with a word processor.
If you don’t have that, will you settle for a copying machine and a bottle of white-out? Design a
covering set of correct input strings.
If you want to, because you have to produce paper documentation for every test case, bracket
your test strings with control sequences such as “$$$XXX” so that you’ll be able to extract them
later on.
Let’s say you’re doing operator commands.
Pick any command and reproduce the test sheet as often as you need to cover all the bad cases
for that command.
Then, using the word processor’s search-and-replace feature, replace the correct substring with
the chosen bad substring.
If you use the syntax definition graph as a guide, you’ll see how to generate all the single-error
cases by judicious uses of search-and-replace commands.
Once you have the single-error cases done, go on to the double errors if you don’t already have
more cases than you can handle.
With double errors you have to examine each case to be sure that it is still an error case rather
than a correct case—similarly for triple and higher errors.
If you’re starting with a capture/replay system, then you can do the editing either in the system’s
own editor or with a word processor.
It’s really more difficult to describe than to do.
Think about how you might automate syntax test design with just a copying machine and a hardy
typist: then graduate to a word processor.
If you understand these primitive methods, then you’ll understand how to automate much of
syntax test design.
4.2.3. Scripting Languages
A scripting language and processor such as CASL has the features needed to automate the
replacement of good substrings by bad ones on the fly.
You can use random number generators to select which single incorrect character will be used
in any spot.
Similarly for replacing correct keywords by incorrect ones and for deciding whether or not to
delete mandatory fields.
You can play all kinds of games with this, but remember that you’ll not be able to predict which
produced strings are right or wrong.
This is a good approach to use if your main purpose is to stress the software rather than to
validate the format validation software.
If you want to do it right, whatever language you do it in, you have to get more sophisticated.
4.2.4. Random String Generators
Why not just use a random number generator to generate completely random strings?
Two reasons: random strings get recognized as invalid too soon, and even a weak front end will
catch most bad strings.
The probability of hitting vulnerable points is too low, just as it was for random inputs in domain
testing—there are too many bad strings in the world.
A random string generator is very easy to build. You only have to be careful about where you put
string terminators such as carriage returns.
Throw the dice for the string length and then pack random characters (except string terminators)
in front of the terminator until you’ve reached the required length.
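Here is a minimal sketch of such a generator in Python, assuming carriage return is the only
string terminator.

import random

# printable ASCII, excluding the terminator
CHARS = [chr(c) for c in range(32, 127)]

def random_string(max_len=255):
    length = random.randint(0, max_len)   # throw the dice for the length
    return "".join(random.choice(CHARS) for _ in range(length)) + "\r"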
Easy but useless. Even with full automation and running at night, this technique caught almost
nothing of value.
4.2.5. Getting Sophisticated
Getting sophisticated means building an anti-parser.
It’s about as complicated as a simple compiler.
The language it compiles is BNF, and instead of producing output code it produces structured
garbage. I’ll assume that you know the rudiments of how a compiler works—if not, this section is
beyond you.
As with a compiler, you begin with the lexical scan and build a symbol table. The symbols are
single characters, keywords, and left-hand sides of definitions.
Keep the three lists separate and replace the source symbols with numerical tokens.
Note that each definition string points to one or more other definition strings, characters, or
keywords—i.e., to other tokens in the symbol table.
There are two ways to screw up—bad tokens and bad pointers.
Start with a covering set of correct test cases.
This can be done by hand or by trial and error with random number generators or by using flow-
analyzer techniques such as are used in path test generators.
Given a good string, you now scan the definition tree by using a tree search procedure.
At every node in the subtree corresponding to your good test case you can decide whether you’re
going to use an incorrect token or an incorrect branch to a subsidiary definition.
Use random numbers to replace individual characters, keywords, or pointers to other definitions.
Double errors work the same way except that they use the single-error strings as a seed.
Similarly, triple-error cases are built on using the double-error cases as a seed.
They grow fast.
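A heavily simplified sketch of the anti-parser idea, assuming a toy grammar held as a Python table (the command set and the error-selection probability are invented; a real tool would compile the full BNF and force exactly the planned number of errors):

import random

# Toy BNF-ish grammar: nonterminals map to lists of alternative productions;
# anything not in the table is a terminal.
GRAMMAR = {
    "<command>": [["<verb>", " ", "<unit>"]],
    "<verb>":    [["OPEN"], ["CLOSE"]],
    "<unit>":    [["VALVE"], ["PUMP"]],
}
KEYWORDS = ["OPEN", "CLOSE", "VALVE", "PUMP", "STOP"]

def generate(symbol, rng, error_budget):
    # Walk the definition tree of one good string; at keyword nodes,
    # maybe spend the error budget by substituting a wrong token.
    if symbol not in GRAMMAR:                       # terminal
        if symbol in KEYWORDS and error_budget and rng.random() < 0.5:
            error_budget.pop()                      # one planned error used up
            return rng.choice([k for k in KEYWORDS if k != symbol])
        return symbol
    production = rng.choice(GRAMMAR[symbol])
    return "".join(generate(part, rng, error_budget) for part in production)

rng = random.Random(7)
good = generate("<command>", rng, [])                # covering, correct case
bad = generate("<command>", rng, ["planned-error"])  # seed for double errors
print(good)
print(bad)   # the budget may go unspent here; a real tool would force it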
Another way to look at automated syntax test generation is to view the normal cases as path test
generation over BNF as the source language.
The errored cases are equivalent to creating mutations (BUDD81, WHIT87) of the source
“code.” There is no sensitization problem because all paths are achievable.
If you’ve read this far, then you know that you can’t guarantee bad strings, even for single-error
cases because that’s a known unsolvable problem.
Double errors increase the probability of correct strings because of error cancellations.
The only (imperfect) way to sort the good from the bad is to use the BNF specification as data to
a parser generator and then use the generated parser to sort for you—it can’t be perfect but it
should do for simple operator commands.
What’s the point of generating test strings and using an automatically created parser to sort the
good from the bad? If you’ve got such a parser, use it instead of the code you’re testing?
If we were dealing with entirely new code and a new command language, it would be better to
generate the parser and avoid the testing.
Using the generated parser as above is useful if it’s an older system under maintenance and your
objective is to build a big syntax-testing suite where one didn’t exist before.
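A minimal sketch of that sorting step, assuming the same toy grammar serves as data to a hand-rolled recognizer (this stands in for a generated parser; it is not one):

GRAMMAR = {
    "<command>": [["<verb>", " ", "<unit>"]],
    "<verb>":    [["OPEN"], ["CLOSE"]],
    "<unit>":    [["VALVE"], ["PUMP"]],
}

def matches(symbol, text, pos, grammar):
    # Return the set of positions reachable after matching symbol at pos.
    if symbol not in grammar:                      # terminal: literal match
        end = pos + len(symbol)
        return {end} if text[pos:end] == symbol else set()
    reachable = set()
    for production in grammar[symbol]:
        positions = {pos}
        for part in production:
            positions = {q for p in positions
                           for q in matches(part, text, p, grammar)}
        reachable |= positions
    return reachable

def is_valid(text):
    return len(text) in matches("<command>", text, 0, GRAMMAR)

print(is_valid("OPEN VALVE"))   # True  -> goes in the "good" pile
print(is_valid("OPEN VALVES"))  # False -> goes in the "bad" pile

As the text says, such an oracle can't be perfect, but it should do for simple operator commands.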
4.3. Productivity, Training, and Effectiveness
I used syntax test design as basic training for persons new to a test group.
With very little effort they can churn out hundreds of good tests.
It’s a great confidence builder for people who have never done formal test design before and who
may be intimidated by the prospect of subjecting a senior designer’s masterpiece to a barrage of
calculated heartburn.
With nothing more sophisticated for automation than a word processor and a copying machine, a
testing trainee can usually produce twenty to thirty fully documented test cases per hour after a
few days of training.
Syntax testing is also an excellent way of convincing a novice tester that testing is infinite and
that the tester’s problem is not generating tests but knowing which ones to cull.
When my trainees told me that they had run out of ideas, it was time to teach them syntax
testing.
I would always ask them to produce all single-, double-, and triple-error cases for a few well-
chosen operator commands. Think about it.
4.4. Ad-Lib Tests
Whenever you run a formal system test there’s always someone in the crowd who wants to try
ad-lib tests.
And almost always, the kind of test they want to ad-lib is an input-syntax error test. I used to
object to ad-libbing, because it didn't prove anything—I thought.
It doesn’t prove anything substantive about the system, assuming you’ve done a good job of
testing—which is why I used to object to it.
It may save time to object to ad-lib tests, but it’s not politic.
Allowing the ad-lib tests demonstrates that you have confidence in the system and your test.
Because a systemwide functional demonstration should have had a dry run in advance, the
actual test execution is largely ceremonial (or should be), and the ad-libbers are part of the
ceremony, just as hecklers are part of the ball game—it adds color to the scene.
You should never object if the system’s final recipient has cooked up a set of tests of his own.
If they’re carefully constructed, and well documented, and all the rest, you should welcome yet
another independent assault on the system’s integrity.
Ad-lib tests aren’t like that.
The customer has a hotshot operator who’s earned a reputation for crashing any system in under
2 minutes, and she’s itching to get her mitts on yours.
There’s no prepared set of tests, so you know it’s going to be ad-libbed. Agree to the ad-libbing,
but only after all other tests have been done. Here’s what happens:
1. Most of the ad-lib tests will be input strings with format violations, and the system will
reject them—as it should.
2. Most of the rest are good strings that look bad. The system accepts the strings and does
as it was told to do, but the ad-lib tester doesn’t recognize it. It will take a lot of explanation to
satisfy the customer that it was a cockpit error.
3. A few seemingly good strings will be correctly rejected because of a correlation problem
between two field values or a state dependency. These situations will also take a lot of
explanation.
4. At least once, the ad-lib tester will shout “Aha!” and claim that the system was wrong. It
will take days to dig out the documentation that shows that the way the system behaves for
that case is precisely the way the customer insisted that it behave—over the designer’s
objections.
5. Another time the ad-lib tester will shout “Aha!” but, because the inputs weren’t
documented and because nonprinting characters were used, it won’t be possible to
reproduce the effect. The ad-lib tester will be forever convinced that the system has a flaw.
6. There may be one problem, typically related to an interpretation of a specification
ambiguity, whose resolution will probably be trivial.
This may be harsh to the ad-lib testers of the world, but such testing proves little or nothing if the
system is good, if it’s been properly tested from unit on up, and if there has been good quality
control.
If ad-lib tests do prove something, then the system’s so shaky and buggy that it deserves the
worst that can be thrown at it.
5.TESTABILITY TIPS
5.1. The Tip
Here’s the whole testability tip:
1. Bring the hidden languages out of the closet.
2. Define the syntax, formally, in BNF.
3. Simplify the syntax definition graph.
4. Build a parser.
I’ll quell the objections to the last step by pointing out that building a minicompiler is a typical
senior-year computer science project these days.
I find that interesting because although most computer science majors build a compiler once in
their lifetime, they’ll never have a chance to build a real compiler once they’re out in the world—
we just don’t need that many programming language compilers.
So drag out the old notebooks to remember how it was done; and if you learned your
programming before compiler building was an undergraduate exercise, get one of the kids to do it
for you.
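As a reminder of what step 2 looks like, here is a small, purely hypothetical command syntax in BNF (the verbs and fields are invented; a real command set would be bigger but no different in kind):

<command>  ::= <verb> <sp> <device> | <verb> <sp> <device> <sp> <count>
<verb>     ::= OPEN | CLOSE | STATUS
<device>   ::= VALVE<digit> | PUMP<digit>
<count>    ::= <digit> | <digit><count>
<digit>    ::= 0|1|2|3|4|5|6|7|8|9
<sp>       ::= a single space

Once the syntax is written down this way, the parser of step 4 and the anti-parser used for syntax testing can both be driven from the same definition.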
5.2. Compiler Overview
This overview is superficial and intended only to illustrate testability issues. Compilation consists
of three main steps: lexical analysis, parsing, and code production.
In our context, we deal most often not with a compiler as such, but with an interpreter; but if
we’re testing hidden languages, then indeed we may be interested in a compiler.
The main difference between an interpreter and compiler is that an interpreter works one
statement at a time and does the equivalent of code production on the fly.
1. Lexical Analysis—The lexical analysis phase accomplishes the following:
a. The analyzer knows enough about parsing to identify individual fields, where we define a
field as a linguistic element that uses no lower-level definitions. That is, a field is defined
solely in terms of primitive elements such as characters.
b. It identifies interfield separators or delimiters.
c. It classifies each field (e.g., integer, string, operator, keyword, variable name, program
label). Some fields, such as numbers or strings, may be translated at this point.
d. New variable names and program labels are put into a symbol table and replaced by a
pointer to that table. If a variable name or program label is already in the table, its
appearance in the code is replaced by the pointer. The pointer is an example of a token.
e. Keywords (e.g., STOP, IF, ELSE) are also replaced by tokens as are single-character
operators. Numbers and strings are also put in a table and replaced by pointers.
f. Delimiters are eliminated where possible, such as interfield delimiters. If the language
permits multiple statements per line and there are statement delimiters, statements will be
separated so that subsequent processing will be done one statement at a time. Similarly,
multiline statements are combined and thereafter treated as a single string.
g. The output of the lexical analysis phase is the partially translated version of the source in
which all linguistic components have been replaced by tokens. The act of replacing these
components by tokens is called tokenizing.
2. Parsing—Parsing is done on tokenized strings. There are many different strategies used
for parsing, and they depend on the kind of statement, the language, and the compiler’s
objectives. There is also a vast literature on the subject. For general information you can start
with LEER84 or MAG184. From our point of view, the validation aspect of parsing consists of
showing that the string to be parsed corresponds to a path in the syntax graph. The output of
the parser is a tree with the statement identifier at the top, primitive elements (e.g.,
characters and keywords) at the bottom, and with intermediate nodes corresponding to
definitions that were traversed along the path through the syntax graph.
3. Code Production—Code production consists of scanning the above tree (bottom-up, say)
in such a way as to assure that all objects needed are available when they are needed and
then replacing the tokens with sequences of instructions that accomplish what the tokens
signify. From our point of view and our typical use of syntax testing, the equivalent to code
production is a call to a worker subroutine or transfer of control to the appropriate program
point.
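To make the three-phase structure concrete, here is a minimal Python sketch for a hypothetical one-line command language (the keywords, the syntax graph, and the worker routines are all invented for illustration):

# Phase 1 -- lexical analysis: split fields, classify, drop delimiters.
KEYWORDS = {"OPEN", "CLOSE"}

def lex(line):
    tokens = []
    for field in line.split():                     # delimiters eliminated here
        if field in KEYWORDS:
            tokens.append(("KEYWORD", field))
        elif field.isdigit():
            tokens.append(("NUMBER", int(field)))  # translated at lex time
        else:
            tokens.append(("NAME", field))
    return tokens

# Phase 2 -- parsing: the token string must correspond to a path in the
# syntax graph; here the graph is just KEYWORD NAME [NUMBER].
def parse(tokens):
    kinds = [kind for kind, _ in tokens]
    if kinds not in (["KEYWORD", "NAME"], ["KEYWORD", "NAME", "NUMBER"]):
        raise SyntaxError("rejected: " + repr(tokens))
    return tokens

# Phase 3 -- the equivalent of code production: a call to a worker routine.
def produce(tree):
    workers = {"OPEN": lambda *a: "opening " + a[0],
               "CLOSE": lambda *a: "closing " + a[0]}
    verb = tree[0][1]
    args = [str(value) for _, value in tree[1:]]
    return workers[verb](*args)

print(produce(parse(lex("OPEN valve27 3"))))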
5.3. Typical Software
Unlike the above operation with its clear separation of lexical analysis, parsing, and production,
the typical software for (operator command, say) syntax validation and command processing
follows what I like to call:
(lex_a_little + parse_a_little + process_a_little)*
Because the three aspects of command interpretation are hopelessly intermixed, a single bug
can involve all three aspects. In other words, the ground for bugs is much more fertile.
5.4. Separation of Phases
Separation of the three phases means that it is virtually impossible for a bug to involve, say, the
interaction between processing and parsing.
We can handle the lexical testing by low-level syntax testing based on one field at a time and be
confident that we can ignore lexical-level field interactions, except where field delimiters are
involved.
Similarly, because most delimiters are eliminated during lexical analysis, we don’t have to bother
with combinations of syntax errors and long delimiter strings.
Lexical-parsing separation means that test strings with combined lexical and syntax errors will not
be productive.
Parsing-processing separation means that we can separate domain testing from syntax testing.
Domain analysis is the first stage of processing and follows parsing, and it is therefore
independent of syntax.
The bottom line of phase separation is the wholesale elimination of possible double-error and
higher-order vulnerabilities and therefore the need to even consider such cases.
In addition to more robust software that’s easier to test, there’s a payoff in maintenance.
The lexical definitions, the syntax, and the equivalent of code that points to working subroutines
that do the actual processing can all be stored in tables rather than as code.
Separation means separate maintenance. If a processing routine is wrong, there’s no need to
change the lexical analyzer or parser.
If a new command is to be added, chances are that only the parser and keyword table will be
affected.
Similarly for enhancing existing commands.
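A sketch of what table-driven separation buys at maintenance time, under the same invented command language as before (adding a command touches only the table, never the analyzer or parser):

# Command tables stored as data, not code: adding a command is a new row.
COMMANDS = {
    # verb:   (argument pattern,  worker routine)
    "OPEN":  (["NAME"],           lambda name: f"open {name}"),
    "CLOSE": (["NAME"],           lambda name: f"close {name}"),
}

def run(line):
    verb, *args = line.split()
    pattern, worker = COMMANDS[verb]         # KeyError means unknown verb
    if len(args) != len(pattern):
        raise SyntaxError("wrong number of fields")
    return worker(*args)

# An enhancement is one table entry, with no change to run():
COMMANDS["STATUS"] = (["NAME"], lambda name: f"status of {name}")
print(run("STATUS pump4"))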
What’s the price? Possibly more memory, possibly more processing time, but probably neither.
The ad hoc lex_a_little, parse_a_little code is a jumbled mess that often contains a lot of Code
redundancy and wasted reprocessing.
My own experience has been that in every instance where we replaced an old-style format
analyzer and processor with an explicit lex analyzer and parser, even though we had planned on
more time and more space, to our surprise the new software was tighter and faster.
5.5. Prerequisites
The language must be decent enough so that it is possible to do lexical analysis before parsing
and parsing before processing.
That means that it is possible to pull out the tokens in a single left-to-right pass over the string
and to do it independently of any other string or statement in the language.
The kind of thing that can’t be handled for example, are formats such as:
“If the first character of the string is an alpha, then every fourth character following it is a token
delimiter and the last token is a symbol; but if the first character is numeric, only spaces and
parentheses are delimiters and any contiguous string of alphas unbroken by a delimiter is a
symbol.”—hopeless.
The above is an example of a context-dependent language. Languages with more virtuous
properties are called context-free.
We’re not dealing with general purpose programming languages here but with simpler
minilanguages such as human-interface languages and internal hidden languages.
The only excuse for context dependencies is that they were inherited.
If it’s an internal language then it can, and should, be changed to remove such dependencies.
If it’s a human-interface language, then the context dependencies must be ripped out because
humans can’t really deal with such command structures.
LOGIC-BASED TESTING
(i) Decision Tables:
A decision table consists of condition stubs, condition entries, action stubs, and action entries.
If a condition entry for a rule is YES, the condition must be met; if NO, it must not be met; an I
entry means the condition is immaterial to that rule. If a rule is satisfied and the action entry is
YES, the action will take place; if NO, the action will not take place.
                            RULE 1   RULE 2   RULE 3   RULE 4
Condition   CONDITION 1      YES      YES      NO       NO
stub        CONDITION 2      YES      I        NO       I       (condition entries)
            CONDITION 3      NO       YES      NO       I
            CONDITION 4      NO       YES      NO       YES
Action      ACTION 1         YES      YES      NO       NO
stub        ACTION 2         NO       NO       YES      NO      (action entries)
            ACTION 3         NO       NO       NO       YES
From the above table, Action 1 will take place if conditions 1 and 2 are met and if conditions
3 and 4 are not met (rule 1) or if conditions 1,3 and 4 are met (rule 2).
Condition is another word for predicate, so replace "condition" with "predicate".
If a condition is met, the predicate is true; if not met, it is false.
Now we can say that Action 1 will be taken if predicates 1 and 2 are true and predicates 3
and 4 are false (rule 1), or if predicates 1, 3, and 4 are true (rule 2).
Action 2 will be taken if all the predicates are false (rule 3).
Action 3 will take place if predicate 1 is false and predicate 4 is true (rule 4).
Here we need a default rule that specifies the default action to be taken when all other rules
fail. The default rules for the above table are shown below.

                   RULE 5   RULE 6   RULE 7   RULE 8
CONDITION 1         I        NO       YES      YES
CONDITION 2         I        YES      I        NO
CONDITION 3         YES      I        NO       NO
CONDITION 4         NO       NO       YES      I
DEFAULT ACTION      YES      YES      YES      YES
If the set of rules covers all the combinations of TRUE / FALSE (YES/ NO) for the predicates,
a default specification is not needed.
(ii) Decision-Table Processors
Decision tables can be automatically translated into code, so the decision table acts as a
higher-order language. The decision-table translator checks the source decision table for
consistency and completeness and fills in any default rules.
The object code first tries rule 1. If the rule is satisfied, the corresponding action takes place.
Otherwise rule 2 is tried. This process is repeated until a rule is satisfied or all rules are
exhausted; if no rule is satisfied, the default action takes place.
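A minimal sketch of the object code such a translator might produce for the table above, with immaterial entries encoded as None (the encoding is an assumption, not any particular processor's output):

# Decision table as data: each rule maps predicate truth values
# (True, False, or None for "immaterial") to an action.
RULES = [
    # (cond1, cond2, cond3, cond4) -> action
    ((True,  True,  False, False), "action-1"),   # rule 1
    ((True,  None,  True,  True),  "action-1"),   # rule 2
    ((False, False, False, False), "action-2"),   # rule 3
    ((False, None,  None,  True),  "action-3"),   # rule 4
]
DEFAULT = "default-action"

def decide(truth_values):
    # Try rule 1, then rule 2, ...; the first satisfied rule wins,
    # otherwise the default action is taken.
    for entries, action in RULES:
        if all(e is None or e == v for e, v in zip(entries, truth_values)):
            return action
    return DEFAULT

print(decide((True, True, False, False)))   # action-1 (rule 1)
print(decide((False, True, True, False)))   # default-action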
The advantages of using decision tables are clarity, a direct relation to the specification, and
maintainability. The main drawback is object-code inefficiency.
(iii) Decision Tables as a Basis for Test Case Design:
If a specification is given as a decision table, it follows that decision tables should be used for
test case design.
Similarly, if a program's logic is implemented as a decision table, decision tables should also
be used for test case design.
In that case the consistency and completeness of the decision table can be checked by the
decision-table processor.
It is not always possible or desirable to implement a program as a decision table, because the
decision-table "language" imposes restrictions.
The restrictions are as follows.
1. The specification must be given as, or be convertible into, a decision table.
2. The order in which the predicates are evaluated does not affect the resulting action.
3. The order in which the rules are evaluated does not affect the resulting action.
4. Once a rule is satisfied and an action is selected, no other rule need be examined.
5. If several actions can result from satisfying a rule, the order in which the actions are
executed does not matter.
It is clear from the above restrictions that the action selected depends only on the combination
of predicate truth values. Let us consider an automatic teller machine.
The first condition is that the card should be valid.
The second condition is the correct password should be entered.
The third condition is that the sufficient money should be present in the account.
Depending on the conditions, respective actions are executed.
(iv) Expansion of Immaterial Cases:
In a decision table, immaterial entries are denoted by I.
If there are n predicates in the decision table, then all 2^n combinations of truth values should
be considered.
The expansion is done by converting each I entry into two entries, one with YES and the other
with NO. Each I entry in a rule therefore doubles the number of cases.
                  Rule 2                     Rule 4
             RULE 2.1  RULE 2.2   RULE 4.1  RULE 4.2  RULE 4.3  RULE 4.4
CONDITION 1    YES       YES        NO        NO        NO        NO
CONDITION 2    YES       NO         YES       YES       NO        NO
CONDITION 3    YES       YES        NO        YES       YES       NO
CONDITION 4    YES       YES        YES       YES       YES       YES

In the previous table, rule 2 contains one I entry and therefore expands into two equivalent
subrules. Rule 4 contains two I entries and therefore expands into four equivalent subrules.
The expansions of rules 2 and 4 are shown in the above table.
The following table is an example of an inconsistent specification in which the expansion of
two rules gives a contradiction: rules 1 and 2 are contradictory because the expanded cases
1.2 and 2.3 are identical but call for different actions. Therefore action 1 or action 2 is taken
depending on which rule is evaluated first. (A mechanized version of this check is sketched
after the table.)
              RULE 1   RULE 2        RULE 1.1  RULE 1.2  RULE 2.3  RULE 2.4
CONDITION 1     YES      YES           YES       YES       YES       YES
CONDITION 2     I        NO            YES       NO        NO        NO
CONDITION 3     YES      I             YES       YES       YES       NO
CONDITION 4     NO       NO            NO        NO        NO        NO
ACTION 1        YES      NO            YES       YES       NO        NO
ACTION 2        NO       YES           NO        NO        YES       YES
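A sketch of the mechanized check (the rule encodings mirror the inconsistent table above, with None for an immaterial entry):

from itertools import product

def expand(rule):
    # Replace each immaterial (None) entry by both YES and NO.
    options = [(e,) if e is not None else (True, False) for e in rule]
    return set(product(*options))

rule1 = (True, None,  True, False)   # rule 1 of the inconsistent example
rule2 = (True, False, None, False)   # rule 2
overlap = expand(rule1) & expand(rule2)
print(overlap)   # {(True, False, True, False)} -> cases 1.2 and 2.3 collide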
[Figure: decision tree for the decision table below. Predicate A is tested first; its YES branch
tests B and D, its NO branch tests C and D. The leaves select Action 1 (rules R1, R2),
Action 2 (rules R3, R4, R5), and Action 3 (rule R6).]
The decision table corresponding to the above figure is:

              RULE 1  RULE 2  RULE 3  RULE 4  RULE 5  RULE 6
CONDITION A     YES     YES     YES     NO      NO      NO
CONDITION B     YES     NO      YES     I       I       I
CONDITION C     I       I       I       YES     NO      NO
CONDITION D     YES     I       NO      I       YES     NO
ACTION 1        YES     YES     NO      NO      NO      NO
ACTION 2        NO      NO      YES     YES     YES     NO
ACTION 3        NO      NO      NO      NO      NO      YES

Sixteen cases are represented in this table and no case appears twice; therefore the
flowgraph appears to be complete and consistent. As a check, count the number of YES's and
NO's in each row (weighting each rule by its number of expanded cases); they should be equal.
Consider the following flowgraph.
[Figure: flowgraph realizing the specification below. Entry node 1, exit node 2; predicate A is
tested at node 3, B at nodes 4 and 7, C at nodes 5, 9, and 10; process boxes A1, A2, and A3
lie between nodes 6-7, 8-10, and 12-2.]
1. If condition A is met, do process A1. If condition B is met, do process A2
2. If condition C is met, do process A3
3. If none of the condition is met, do process A1, A2, and A3.
4. When more than one process is done, process A1 must be done first, then A2 and then A3.
The flowgraph can be converted into a decision table by tracing each entry/exit path and
recording the truth values of the predicates along it, as follows.
(i) Path expressions:
In the above figure the straight-through path, which goes via nodes 3, 6, 7, 8, 10, 11, 12, and 2,
has a truth value of ABC.
The path via nodes 3, 6, 7, 9, and 2 has a value of AB'C' (a prime denotes negation: B' means
NOT B).
If two or more paths merge at a node, that is expressed by the use of a plus sign (+), which
means OR.
Using the above we can write the node equations:
N6 = A + A'B'C'
N8 = (N6)B + A'B
N11 = (N8)C + (N6)B'C
N12 = N11 + A'B'C
N2 = N12 + (N8)C' + (N6)B'C'
(ii) The rules of Boolean Algebra:
Boolean algebra has three operators.
x means AND. Also called multiplication. A statement such as AB means A and B are both true.
+ means OR. Also called addition. A statement such as A + B means either A is true or B is
true or both.
' means NOT. Also called negation or complementation. A' is true only when statement A is
false.
The laws of Boolean algebra are shown below.
1.  A + A = A                    11. (A')' = A
2.  A + 1 = 1                    12. 0' = 1
3.  A + 0 = A                    13. 1' = 0
4.  A + B = B + A                14. De Morgan's law: (A + B)' = A'B'
5.  A + A' = 1                   15. (AB)' = A' + B'
6.  A x A = A                    16. Distributive law: A(B + C) = AB + AC
7.  A x 1 = A                    17. (AB)C = A(BC)
8.  A x 0 = 0                    18. (A + B) + C = A + (B + C)
9.  AB = BA                      19. A + A'B = A + B
10. A x A' = 0                   20. A + AB = A
Individual letters in a Boolean algebra expression are called literals.
The product of several literals is called a product term (e.g., ABC, DE).
(iii) Examples:
The path expressions are simplified by applying the rules:
N6 = A + A'B'C'
   = A + B'C'        [let D = B'C'; then A + A'D = A + D = A + B'C']
N8 = (N6)B + A'B = (A + B'C')B + A'B
   = AB + B'C'B + A'B = AB + (0)C' + A'B
   = AB + A'B = (A + A')B = 1 x B
   = B
N11 = (N8)C + (N6)B'C = BC + (A + B'C')B'C
    = BC + AB'C + 0 = C(B + AB')
    = C(B + B'A) = C(B + A)
    = CB + CA = AC + BC
N12 = N11 + A'B'C
    = AC + BC + A'B'C = BC + A'B'C + AC
    = C(B + A'B') + AC = C(A' + B) + AC
    = CA' + AC + BC = C(A' + A) + BC
    = C(1) + BC = C + BC
    = C(1 + B) = C(1)
    = C
N2 = N12 + (N8)C' + (N6)B'C'
   = C + BC' + (A + B'C')B'C'
   = C + BC' + AB'C' + B'C'B'C' = C + BC' + AB'C' + B'C'
   = C + BC' + B'C'(1 + A) = C + BC' + B'C'
   = C + C'(B + B')
   = C + C'(1)
   = C + C'
   = 1
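These hand simplifications are easy to confirm by brute force; the sketch below enumerates all eight truth-value combinations for the node equations above:

from itertools import product

def n(x):                      # NOT
    return not x

for A, B, C in product([False, True], repeat=3):
    N6  = A or (n(A) and n(B) and n(C))
    N8  = (N6 and B) or (n(A) and B)
    N11 = (N8 and C) or (N6 and n(B) and C)
    N12 = N11 or (n(A) and n(B) and C)
    N2  = N12 or (N8 and n(C)) or (N6 and n(B) and n(C))
    assert N6  == (A or (n(B) and n(C)))             # N6  = A + B'C'
    assert N8  == B                                  # N8  = B
    assert N11 == ((A and C) or (B and C))           # N11 = AC + BC
    assert N12 == C                                  # N12 = C
    assert N2  == True                               # N2  = 1
print("all eight combinations agree with the simplified forms")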
(iv) Paths and Domains:
Consider a loop-free entry/exit path and assume all predicates are simple.
Each predicate on the path is denoted by a capital letter, primed or unprimed.
The result is a term that consists of the product of several literals.
If a literal appears twice in a product term, one appearance can be removed and the
corresponding decision is redundant: CC reduces to C, and BB to B.
If a literal appears both primed and unprimed in a product term, the term is equal to zero and
the path is unachievable.
A product term on an entry/exit path specifies a domain.
For compound predicates, a separate path can be provided for each product term.
For example, we can implement ABC + DEF + GH as one path using a compound predicate,
or as three separate paths (ABC, DEF, GH) that specify three separate domains.
Say we have a specification such that there is one and only one product term for each
domain, and call these domains D1, D2, D3, ..., Dm.
Consider any pair Di, Dj. For every i not equal to j, the product DiDj must equal zero; if it is not
zero, the domains overlap, which is a contradictory domain specification.
The sum of all the Di must equal 1, or else there is an ambiguity (some case belongs to no
domain).
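A sketch of the DiDj = 0 and sum-to-1 checks by enumeration, for three hypothetical domains over simple predicates:

from itertools import product

# Each domain is a predicate over the path's truth values (hypothetical).
D = [lambda a, b, c: a and b,
     lambda a, b, c: a and not b,
     lambda a, b, c: not a]

for values in product([False, True], repeat=3):
    hits = [i for i, d in enumerate(D) if d(*values)]
    assert len(hits) == 1, f"{values}: overlap or gap -> {hits}"
print("domains are disjoint (DiDj = 0) and cover everything (sum = 1)")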
(v) Test case design:
Let us consider a hierarchy of test cases for a routine that has compound predicates.
The routine has a single entry and a single exit and has no dead-end code.
Because the predicates may be compound, the Boolean algebra expression of a domain will,
after simplification, be a sum of products.
We can build a hierarchy of test strategies by considering how thoroughly we test each domain:
1. Simplest: satisfy any one prime implicant in the expression. If ABC + AB + DEF reduces
to AB + DEF, then AB and DEF are called prime implicants.
2. Prime-implicant cover: pick input values so that there is at least one path for each prime
implicant at the node.
3. All terms: test every expanded term for that node. For example, node 6 in the previous
figure has five terms:
N6 = A + A'B'C'
   = AB(C + C') + AB'(C + C') + A'B'C'
   = ABC + ABC' + AB'C + AB'C' + A'B'C'
Similarly, node 8 has four terms and node 12 has four terms. There is at least one path for
each term.
4. Path dependence: because, in general, the truth value of a predicate is obtained by
interpreting the predicate, its value may depend on the path taken to reach it.
(3) Boolean equations:
Loops complicate things because we may have to solve a Boolean equation to determine
what predicate-value combinations lead to where. Consider the following flowgraph:
[Figure: flowgraph containing a loop. Nodes 1 through 10, predicates A, B, and C, and links
labeled F1, F2, F3, F4 whose Boolean expressions are used below.]
Here the link names F1, F2, F3, and F4 represent the Boolean expressions corresponding to
those links.
N4 = B + F1 = B + (N7)BC
   = B(1 + (N7)C)
   = B
N6 = (N4)A + B' = AB + B'
   = A + B'
N7 = (N4)A' + F3 = A'B + (N7)A'BC
   = A'B(1 + (N7)C)
   = A'B
N2 = N6 + F4 = A + B' + (N7)A'B'C
   = A + B'(1 + (N7)A'C)
   = A + B'
Example:
(1) Demonstrate by means of truth tables the validity of the following theorems of Boolean
Algebra.
(i) Associative laws
(ii) De Morgan's theorems for three variables
(iii) Distributive law of + over x
(Ans) (i) Associative laws
(a) Associative law of addition: (A + B) + C = A + (B + C)
Let TRUE = T and FALSE = F; then (A + B) + C and A + (B + C) are given by:

A  B  C   A+B   (A+B)+C   B+C   A+(B+C)
T  T  T    T       T       T       T
T  T  F    T       T       T       T
T  F  T    T       T       T       T
T  F  F    T       T       F       T
F  T  T    T       T       T       T
F  T  F    T       T       T       T
F  F  T    F       T       T       T
F  F  F    F       F       F       F

The two result columns are identical, so (A + B) + C = A + (B + C).
(b) Associative law of multiplication: (A x B) x C = A x (B x C)

A  B  C   AxB   (AxB)xC   BxC   Ax(BxC)
T  T  T    T       T       T       T
T  T  F    T       F       F       F
T  F  T    F       F       F       F
T  F  F    F       F       F       F
F  T  T    F       F       T       F
F  T  F    F       F       F       F
F  F  T    F       F       F       F
F  F  F    F       F       F       F

The two result columns are identical, so (A x B) x C = A x (B x C).
(ii) De Morgan's theorems for three variables
(a) ((A + B) + C)' = A' x (B' x C')

A  B  C   A+B   (A+B)+C   ((A+B)+C)'
T  T  T    T       T          F
T  T  F    T       T          F
T  F  T    T       T          F
T  F  F    T       T          F
F  T  T    T       T          F
F  T  F    T       T          F
F  F  T    F       T          F
F  F  F    F       F          T

A  B  C   A'  B'  C'   A'xB'   (A'xB')xC'
T  T  T   F   F   F      F         F
T  T  F   F   F   T      F         F
T  F  T   F   T   F      F         F
T  F  F   F   T   T      F         F
F  T  T   T   F   F      F         F
F  T  F   T   F   T      F         F
F  F  T   T   T   F      T         F
F  F  F   T   T   T      T         T

The last columns agree, so ((A + B) + C)' = A' x (B' x C').
(b) ((A x B) x C)' = A' + (B' + C')

A  B  C   AxB   (AxB)xC   ((AxB)xC)'
T  T  T    T       T          F
T  T  F    T       F          T
T  F  T    F       F          T
T  F  F    F       F          T
F  T  T    F       F          T
F  T  F    F       F          T
F  F  T    F       F          T
F  F  F    F       F          T

A  B  C   A'  B'  C'   A'+B'   (A'+B')+C'
T  T  T   F   F   F      F         F
T  T  F   F   F   T      F         T
T  F  T   F   T   F      T         T
T  F  F   F   T   T      T         T
F  T  T   T   F   F      T         T
F  T  F   T   F   T      T         T
F  F  T   T   T   F      T         T
F  F  F   T   T   T      T         T

The last columns agree, so ((A x B) x C)' = A' + (B' + C').
(iii) Distributive law of + over x: A + (B x C) = (A + B) x (A + C)

A  B  C   BxC   A+(BxC)   A+B   A+C   (A+B)x(A+C)
T  T  T    T       T       T     T        T
T  T  F    F       T       T     T        T
T  F  T    F       T       T     T        T
T  F  F    F       T       T     T        T
F  T  T    T       T       T     T        T
F  T  F    F       F       T     F        F
F  F  T    F       F       F     T        F
F  F  F    F       F       F     F        F

The result columns are identical, so A + (B x C) = (A + B) x (A + C).
(2) Demonstrate by means of truth tables the validity of the following theorems of Boolean
Algebra.
(i) Commutative laws
(ii) Absorption law
(iii) Idempotency laws
(i) Commutative laws
(a) Commutative law of addition: A + B = B + A
Let TRUE = T and FALSE = F; then A + B and B + A are given by:

A  B   A+B   B+A
T  T    T     T
T  F    T     T
F  T    T     T
F  F    F     F

The table shows that A + B = B + A.
(b) Commutative law of multiplication: A x B = B x A

A  B   AxB   BxA
T  T    T     T
T  F    F     F
F  T    F     F
F  F    F     F

The table shows that A x B = B x A.
(ii) Absorption law: A + A'B = A + B

A  B   A'   A'B   A+A'B   A+B
T  T   F     F      T      T
T  F   F     F      T      T
F  T   T     T      T      T
F  F   T     F      F      F

The last two columns agree, so A + A'B = A + B.
(iii) Idempotency laws: A + A = A and A x A = A

A   A+A   AxA
T    T     T
F    F     F

Both result columns reproduce A, so A + A = A and A x A = A.
KV Charts:
[Figures: Karnaugh-Veitch (KV) charts for one, two, and three variables: single-variable maps
for A and A'; two-variable maps plotting minterms (AB, A'B, AB', A'B') and sums (A + B,
A' + B, A + B', B' + A'); three-variable maps plotting products and sums such as A'BC, BC,
BC + A'B, A'C, B, and B' + C.]
(iv) Four Variables:
The same principles hold for four or more variables.
[Figures: four-variable KV charts plotting functions such as AC + A'C', B'D', BD + B'D', and
sums of four-variable products.]
Examples:
(i) Using a Karnaugh map, minimize
F = A B C D + A B C D + A B C D + A B C D + A B D + B C D + A B C D
Ans: Expand each term to standard sum-of-products form by supplying the missing variables
(multiplying by X + X' = 1 for each absent variable X), enter the resulting minterms in a
four-variable KV chart, and read the minimal cover off the chart.
[Karnaugh map with the minterms of F entered as 1s.]
(ii) Minimize the function using Karnaugh map method
F(A,B,C,D)= ∑(1,2,3,8,9,10,11,14) + ∑d (7,15)
Ans: Enter a 1 for each specified minterm and a d for each don't-care term:

         AB
CD        00      01      11      10
00       0: .    4: .   12: .    8: 1
01       1: 1    5: .   13: .    9: 1
11       3: 1    7: d   15: d   11: 1
10       2: 1    6: .   14: 1   10: 1

(Each cell shows its minterm number followed by its entry; "." means 0.)
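Plotting such a chart is mechanical; here is a minimal sketch that rebuilds the map above from the minterm list:

MINTERMS   = {1, 2, 3, 8, 9, 10, 11, 14}
DONT_CARES = {7, 15}
GRAY  = [0, 1, 3, 2]                     # two-bit values in 00, 01, 11, 10 order
LABEL = {0: "00", 1: "01", 3: "11", 2: "10"}

print("CD\\AB   00   01   11   10")
for cd in GRAY:
    row = []
    for ab in GRAY:
        m = ab * 4 + cd                  # minterm number = 8A + 4B + 2C + 1D
        row.append("d" if m in DONT_CARES else "1" if m in MINTERMS else ".")
    print("  " + LABEL[cd] + "    " + "    ".join(row))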
6. Enter the Boolean expressions in a KV chart and check for consistency. If the
specifications are consistent, there will be no overlaps.
7. Enter the default cases and check for consistency.
8. If all boxes are covered, the specification is complete.
9. If the specification is incomplete or inconsistent, translate the corresponding boxes back
into sentences and get a clarification, explanation, or revision.
10. If the default cases were not specified explicitly, translate the default cases back and get
a confirmation.
(ii) Finding and Translating the Logic:
Specifications are translated into sentences of IF-THEN form: the IF part holds the predicates
and the THEN part the action.
Predicates are combined by applying Boolean connectives such as AND, OR, and NOT, and
the actions are denoted by A1, A2, A3, and so on.
The different phrases that can be used for the logical words are:
The different phrases which can be used for the words are
IF: if, if and when, only if, only when, based on, because, but etc.
THEN: then, assign, shall, should, will, would, do etc.
AND: all, and, as well as, both, but, in conjunction with, coincidental with etc.
OR: or, either-or, and, and if..then, and/or, in addition to, otherwise etc.
NOT: but, but Not, excluding, less, neither, never, besides etc.
EXCLUSIVE OR: but, by contrast, conversely, nor etc.
IMMATERIAL: irrelevant, independent of, irregardless, irrespective, whether or not etc.
Other than these, some other dangerous phrases also exist, such as "respectively" and
"similarly".
Now we have a specification of the form
IF A AND B AND C, THEN A1
IF C AND D AND F, THEN A3
IF A AND B AND D, THEN A2
(iii) Ambiguities and Contradictions:
The problem of ambiguity occurs when more than one action is called for by a box of the KV
chart; the specification is incomplete when some box calls for no action at all, not even the
default.
Let us consider an ambiguous specification (a prime again denotes negation):
A1 = BC'D' + AB'C'D
   = (A + A')BC'D' + AB'C'D
   = ABC'D' + A'BC'D' + AB'C'D
A2 = AC'D' + AC'D + ABC' + AB'C'
   = A(B + B')C'D' + A(B + B')C'D + ABC'(D + D') + AB'C'(D + D')
   = ABC'D' + AB'C'D' + ABC'D + AB'C'D + ABC'D + ABC'D' + AB'C'D + AB'C'D'
   = ABC'D' + AB'C'D' + ABC'D + AB'C'D
A3 = BD + BCD'
   = (A + A')B(C + C')D + (A + A')BCD'
   = ABCD + ABC'D + A'BCD + A'BC'D + ABCD' + A'BCD'
ELSE = B'C + A'B'C'D'
   = (A + A')B'C(D + D') + A'B'C'D'
   = AB'CD + AB'CD' + A'B'CD + A'B'CD' + A'B'C'D'
Here 1, 2, and 3 represent the actions and the fourth expression (ELSE) specifies the default
case. Now enter these specifications in a KV chart:

         AB
CD        00     01     11     10
00         4      1     1,2     2
01         -      3     2,3    1,2
11         4      3      3      4
10         4      3      3      4

Several boxes call for more than one action. In ABC'D' both action 1 and action 2 shall be
taken; the same ambiguity appears in AB'C'D (actions 1 and 2) and in ABC'D (actions 2 and
3). These boxes give inconsistent or contradictory specifications, because which action
occurs depends on which rule happens to be evaluated first. The box A'B'C'D is empty: no
action, not even the default, has been specified for it.
For an unspecified default action, do the following:
Insert explicit entries in the KV chart.
Apply negation.
Provide an equivalent expression as a default statement.
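The overlap and empty-box checks can be mechanized; the sketch below encodes the four expressions above as predicates and reports every cell that calls for more or fewer than one action:

from itertools import product

# Action predicates for the example above (a' means NOT a).
ACTIONS = {
    1: lambda a, b, c, d: (b and not c and not d) or (a and not b and not c and d),
    2: lambda a, b, c, d: a and not c,          # the four A2 minterms reduce to AC'
    3: lambda a, b, c, d: (b and d) or (b and c and not d),
    4: lambda a, b, c, d: (not b and c) or (not a and not b and not c and not d),
}

for values in product([False, True], repeat=4):
    hits = [k for k, f in ACTIONS.items() if f(*values)]
    if len(hits) != 1:
        print(values, "->", hits or "no action (empty box)")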
(iv) Don’t care and Impossible terms:
Don’t care terms (Ø) are the terms or conditions using which logic is simplified through KV
chart.
The value of Ø can be either 0 or 1.
Consider the following three impossible things.
1. Creation of a universal program verifier
2. Knowing both the exact position and the exact momentum of a fundamental particle.
3. Knowing what happened before that started the universe.
Basically impossible conditions are used to simplify the logic.
The two types of impossible conditions are
1. The condition cannot be created or improbable
2. The condition results from forcing a complex continuous one into a binary logical one.
Logic Simplification:
The steps involved in simplifying the logic are as follows.
1. Identify all impossible and illogical cases.
2. Use these cases to advantage.
3. For this purpose the KV chart is used.
4. Use the symbol Ø, which is to be interpreted as 0 or 1, whichever is convenient.
[KV chart with 1 and Ø entries, showing don't-care boxes absorbed into larger loops to
simplify the resulting expression.]
[Control flowgraph for equation (1): successive tests on A, B, C, and D, with ELSE branches,
leading to ACTION.]
Control flowgraph for equation (2):
[Control flowgraph with tests on A and C only, with ELSE branches, leading to ACTION.]