0% found this document useful (0 votes)
21 views17 pages

2016 Power S

The CMIMC Power Round 2016 focuses on Deterministic Finite Automata (DFA) and their applications in computational theory. It provides definitions, examples, and problems related to DFAs, including how to construct them for specific languages and the concept of regularity in languages. The document also discusses the limitations of DFAs and includes solutions to various problems posed throughout the round.

Uploaded by

djcolucia
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views17 pages

2016 Power S

The CMIMC Power Round 2016 focuses on Deterministic Finite Automata (DFA) and their applications in computational theory. It provides definitions, examples, and problems related to DFAs, including how to construct them for specific languages and the concept of regularity in languages. The document also discusses the limitations of DFAs and includes solutions to various problems posed throughout the round.

Uploaded by

djcolucia
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 17

CMIMC Power Round 2016

CMIMC Staff

April 2, 2016

Abstract
The following is the Power Round for this year. Thanks to everyone, especially Course 15-251, for
helping out with advice and problems and what not yayayay! (Ok idk what else to put here, darn)

1
1 Introduction
When working with computation on an abstract level, it is helpful to develop mathematical models for
machines. There are several mathematical models which work, but they all have similar characteristics:
• They must be simple to understand.
• They must encompass many different computational tasks (though not necessarily all).
• They must be hardware independent. (Theoretical computer scientists don’t really care about
technical limitations; they instead care about which types of problems can be solved by a computer.)
One such model of computation is known as a Deterministic Finite Automaton (DFA). DFA’s are
simplistic models of computation that still have deep theories and implications for computational theories
in the real world. In this power round, we will explore the underlying theory behind these mysterious
models, solving interesting problems and proving fundamental results in automata theory along the way.

2 Introduction to Languages
Before we begin, we introduce the notation we shall be using to designate inputs to DFAs. This section
is mainly just definitions - manipulating languages will not come until later sections.
Definition 2.1. An alphabet is any finite, non-empty set of symbols. We often denote an alphabet by
Σ. Examples of alphabets are Σ1 = {0, 1} and Σ2 = {a, b, c}.
Definition 2.2. A string is any finite sequence of elements from an alphabet. For any string x, |x|
denotes the number of characters in the string. The empty string ε represents the empty sequence.
Definition 2.3. For an alphabet Σ, the notation Σ∗ denotes the set of all strings over Σ. For example,
{0, 1}∗ = {ε, 0, 1, 00, 01, 10, 11, 100, . . . }.
Definition 2.4. A language over an alphabet Σ is any subset of Σ∗ .
Now that we have some basic notation, it’s time to move on to the meat of this power round.

3 DFAs
Unlike most other sections of this round, we skip the flavortext and go straight to a definition.
Definition 3.1. A deterministic finite automaton (DFA) is a mechanism that takes in a string as an
input and either accepts or rejects it. This is analogous to, say, a boolean function which takes in a string
and outputs either true (i.e. accepts the string) or false (i.e. rejects it).
When drawn, DFAs look like a bunch of arrows and circles. Here is an example of a DFA:

q2 0,1
0
0,1
start q0 q1

1
q3 0,1

Page 2
Each circle represents a state. A state with two circles is an accepting state, which means the DFA
will accept if the string ends up there. Formally, a DFA over an alphabet Σ is defined as a finite set of
states (with one start state and some of which can be accepting states), each with an arrow pointing to
another state for each element of Σ. For a DFA, the language that it admits is the language of all strings
over Σ that accept when inputted into the DFA.
To read a DFA, start with the string in the state indicated by the “start” arrow. Look at the first
letter in the string, then follow the arrow corresponding to that letter to the next state. From that state,
follow the arrow corresponding to the second letter of the string, and so on until the string has no more
letters. At that point, the DFA accepts if you are in an accepting state and rejects otherwise.
For example, let’s input the string 1011 into the above DFA. It starts at q0 . The first letter is 1, so
move to q1 . The second letter is 0 so move to q2 . The third letter is 1 so move to q2 . The fourth letter is
1, so move to q2 . There are no more letters left, and we are in an accepting state, so this DFA will accept
the string 1011.

1. Consider the following DFA:

0 1

1 0 0,1
start q0 q1 q2 q3 0,1

For each of the following strings, indicate whether the DFA accepts or rejects:

string accepts rejects


0000
1111
0010
0110
0100

Solution. The first, second, and fifth strings would be reject; the third and fourth would be
accepted. (In general, this DFA accepts all strings which are of the form w = 0m 1n 0, where m and
n are integers with m ≥ 0 and n ≥ 1.)

2. Consider the following DFAs. In simplest terms, what languages would they accept?

(a) See below.

Page 3
0

q1
0 1

start q0 q3 0,1

1 0
q2

(b) See below.

q1 0 q3
0 0
1
start q0 0 1 q5 0,1
0 1
1 1
q2 q4

Solution.

(a) The set of strings w ∈ {0, 1}∗ which have at least one 0 and at least one 1. Alternately, the set
of strings in {0, 1}∗ which are not of the form 0m or 1j for nonnegative integers m and n.
(b) The set of strings w ∈ {0, 1}∞ such that w contains either one of the substrings 000 or 111.

Note. If teams do not specify the alphabet {0, 1}∗ for either problem, take off one point from the
total combined score of parts (a) and (b).

3. For each of the problems below, construct a DFA which accepts the given language. You do not need
to prove that your submission works, but it is recommended that you at least provide a summary of
how your DFA works. (This is so the graders do not take off points for DFA’s which look different
from the ones in the official solutions but still work.)

(a) L1 = {w ∈ {0, 1}∗ | w = 001}


(b) L2 = {w ∈ {a, b}∗ | w has even length}
(c) L3 = {w ∈ {0, 1}∗ \ ε | w when expressed as an integer in binary is divisible by 5}

Solution. The following DFA’s are samples.

Page 4
(a) In general, any such DFA should keep track of how close the string is to completing the 001
sequence.

q0 0 q1 0 q2 1 q3
start

1 1 0 0,1
q4

0,1

(b) All the DFA needs to do is keep track of the parity of the number of steps taken. This should
take at most two states. (Note that ε has length 0 and thus should also be accepted.)

0,1
start q0 q1
0,1

(c) Keeping track of the value of the integer modulo 5 is important. The fact that we read the
number from left to right suggests that we can actually do this in only five states; if we were
reading from right to left, we would need 20.

start q
0 1

q0 1 q1
0
1 0

q2 1 0
0 1

1 q4 q3
0

4. Let D be a DFA with exactly three states. Suppose this DFA accepts the string 000 but not the
empty string ε. What is the smallest positive integer k > 3 such that D must also accept 0k ? (Here,
exponentiation is shorthand for repetition; for example, 04 represents the string 0000.)

Solution. The answer is k = 5.


First it suffices to show that there exists a DFA for which k = 5. The following is one such DFA.

Page 5
q1
0,1

start q0 0,1 0,1

q2

Now it suffices to show that 5 is maximal. Let D be a DFA with three states which accepts 000
but not ε. Note that by definition, the start state of D is not an accept state. Let s1 , s2 , and A be
the states which the DFA halts after inputting strings 0, 00, and 000 respectively. We now split the
analysis of the possible values for the si into several very short cases. (These are very very rough
sketches.)

• CASE 1: s1 = q0 . Then the DFA must be located at the state q0 for all time steps. This
implies that q0 must be an accept state, which contradicts the fact that ε is rejected.
• CASE 2: s2 = q0 . Then s1 = A. Since s3 is an accept state, we know that 0 is accepted, and
therefore that 02k−1 is accepted by D for all positive integers k ≥ 1. In particular, we know
that D accepts 05 . Contradiction!
• CASE 3: s3 = q0 . Trivial contradiction.
• CASE 4: s1 = A 6= q0 and s2 6= q0 . In this case, 0000 lands on state s2 , and then 00000
lands on state A, which is a accept state.
• CASE 5: s1 6= q0 and s2 = A 6= q0 . Then the DFA stays on the state A for all time steps
after 00. In particular, D must accept 05 .

These are all possible cases, and so indeed k = 5 works.

4 Languages and Regularity


It is useful to understand both the capabilities and limitations of any model of computation such as a
DFA. This roughly translates to determining which problems either do not have solutions or have solutions
which extend beyond the scope of the aforementioned model of computation. This next section, as well
as the ones which succeed it, help develop the theory behind what these DFA machines can actually do.

Definition 4.1. We say that a language Σ∗ is regular if there exists a DFA D which accepts all words
w ∈ Σ∗ and rejects all words w0 6∈ Σ∗ . If this is the case, we say that D decides the language or is a
decider for the language. (Either phrase is fine.)

As an example, the language

L = {x ∈ {0, 1}∗ | the second digit from the left is a 0}

is regular, because the DFA pictured in the previous section is a decider for L.

Page 6
1. Is any language consisting of a finite number of words (sometimes called a “finite language”) regular?

Solution. Yes. Let L be a finite language, and define

S = {n ∈ N | ∃w ∈ L with |w| = n}.

Since L is finite, S is finite. This implies by the Well-Ordering Principle that S has this maximum.
Let k denote this maximum. Construct a DFA that contains |Σ|k+1 − 1 states, one for each possible
word of length at most k in |Σ|∗ . Such a DFA can then mark a state as accepting iff the word
associated with that state is in L. This DFA decides L.

Determining whether a language is regular is somewhat straightforward: construct a DFA which


decides it! This technique is the basis for almost all problems which ask you to show that some language
is regular. Proving a language is irregular, however, is a bit more difficult; one has to come up with a
precise argument that works against all possible DFAs. The key to doing this relies in the ‘F’ in DFA.
The following problem highlights the general technique.

2. Consider the language L = {0n 1n : n ∈ N}.

(a) Suppose D is a DFA with n states which decides L. Furthermore, let S be a set of consisting
of k words w ∈ {0, 1}∗ . What is the smallest value of k such that for all such sets S, there
exist two words in S which land in the same state at the end of the simulation? Provide brief
justification.
(b) Using this, show that L is irregular, i.e. show that D does not exist. (Hint: how can you use
the main idea of the previous problem to arrive at a contradiction?)

Solution.

(a) The answer is k = n + 1 by the Pigeonhole Principle.


(b) As above, suppose D is a DFA with n states which decides L. Consider the set of n + 1 strings

S = {0k | k ∈ N, 0 ≤ k ≤ n} = {, 0, 02 , . . . , 0n }.

There are n + 1 strings in this set, so by the previous problem two of them, say 0i and 0j , must
land in th same state. Now recall that since D is deterministic, for any string s ∈ {0, 1}∗ , 0i s
and 0j s must also land on the same state. With this in mind, set s = 1i . Then 0i 1i is accepted,
but 0j 1i is rejected - contradiction! Thus, D cannot exist, meaning that L is irregular.

Before we move on to the main meat of the section regarding regularity (the problems), an interlude.
Up until now, we have been informal about what exactly a DFA is in a mathematical sense. We have
refrained from doing this in order to prevent definitions from interfering with intution. However, some of
the problems from this point forward benefit greatly from formal notation, so we will introduce it here.

Definition 4.2. Formally, a deterministic finite automaton M is a 5-tuple

M = (Q, Σ, δ, q0 , F ),

where

Page 7
• Q is the (finite) set of states of M ;

• Σ is the alphabet of the strings inputted to and processed by M ;

• δ : Q × Σ 7→ Q is a transition function which details exactly how the states and alphabet function
with each other (in other words which arrows point to which states);

• q0 ∈ Q is the start state of M ;

• F ⊆ Q is the set of accepting states of M .

While formal notation is not necessary for completing the rest of this Power Round, it may help make
some of the solutions easier to write.
n
3. Show that the language {12 | n ∈ N} is irregular.

Solution. The idea is similar. Suppose there exists a DFA D which decides this language, and
suppose this DFA has k states. Consider the k + 1 strings
k
1, 12 , 14 , ... 12 .
i j
By the Pigenhole Principle, there exist two strings, say 12 and 12 , which end on the same string.

4. The above two problems demonstrate the usefulness of elementary combinatorics to prove results
regarding regularity. Using these basic ideas, we can construct very powerful results in automata
theory. One of the most fundamental is stated below.

Theorem 4.1 (Pumping Lemma). Let L be a regular language. Then there exists an integer P ≥ 1
such that any w ∈ L with |w| ≥ P can be written as w = xyz with y not equal to the empty string
such that

• |xy| ≤ P , and
• for all integers i ≥ 0, the string xy i z is also in L.

We call P the pumping length of L.

In this problem we give a proof of the lemma and then see how it can be used to solve some regularity
problems.

(a) Suppose M is a DFA which decides L. Let p be the number of states of M . Consider an input

s = s1 s2 . . . sn with n ≥ p.

Let qk be the state the automaton is in after reading the character sk . Show that there exist
integers 0 ≤ i < j ≤ p such that qi = qj .
(b) Let i and j be as above. Define

x = s1 s2 . . . si , y = si+1 . . . sj , and z = sj+1 . . . sn .

Show that these x, y, and z fit the description given in the Pumping Lemma, where P = p.

Page 8
(c) Using the Pumping Lemma, show that the language

L = {s : |s| is a prime number}

is not regular.

Solution.

(a) Trivial.
(b) Note that the first condition is trivial, since |xy| = j ≤ p. We now demonstrate that the second
condition holds for this choice of x, y, and z. Denote the function δ0 : Σ 7→ Q which takes
any word w ∈ Σ and outputs the state q ∈ Q which w lands on after being fed through D.
Note that by the manner in which we chose our i and j, δ0 (x) = δ0 (xy). Now we prove that
δ0 (x) = δ0 (xy k ) for all positive integers k. The base case k = 1 is already taken care of. Now
for the induction step, suppose the proposition holds true for some k, and write

δ0 (xy k+1 ) = δ0 (xy k y) = δ0 (xy) = δ0 (x).

This completes the proof. (Here, we use the fact that if δ0 (a) = δ0 (b) for some strings a
and b, then δ0 (ac) = δ0 (bc) for all c.) From here, the result is easy, as we now know that
δ0 (xyz) = δ0 (xy i z) for all i, and since xyz is in L, xy i z must also be in L.
(c) Suppose for the sake of contradiction that L is regular, so it has a pumping length P . Denote
by n a prime number greater than P , and consider a string s of length |s| = n. Now take
s = 0 . . . 0, and write s = xyz, where x, y, and z are analogous to the x, y, z in the statement
of the Pumping Lemma. If the length of y is l, then xy i z is of length n + (i − 1)l. By the
Pumping Lemma, we know that all such words are in our language, so in particular there must
exist n and l such that n + (i − 1)l. However, setting i = n + 1 gives a contradiction. Hence L
cannot be regular.1

5. One natural question to ask is how regularity behaves under certain string and set operations. These
might include the union and intersection of two languages as well as their concatenation. It turns out
that regular languages are closed under many different kinds of operations. The following problems
will ask to prove closure of regularity under different types of operations. They are roughly ordered
by difficulty, with a few notable exceptions.

(a) COMPLEMENT: Let L be a regular language over the alphabet Σ, and set

Lc = {x ∈ Σ∗ | x 6∈ L}.

Prove that Lc is also regular.


(b) INTERSECTION: Suppose A and B are two regular languages over a common alphabet Σ.
Prove that
A ∩ B = {x ∈ Σ∗ | x ∈ A and x ∈ B}
is also regular.
1
Proof taken from https://math.berkeley.edu/ moorxu/oldsite/notes/154/154main.pdf

Page 9
(c) UNION: Suppose A and B are two regular languages over a common alphabet Σ. Prove that
A ∪ B = {x ∈ Σ∗ | x ∈ A or x ∈ B}
is also regular.
(d) SET DIFFERENCE: Suppose A and B are two regular languages over a common alphabet
Σ. Prove that
A \ B = {x ∈ Σ∗ | x ∈ A and x 6∈ B}
is also regular. (Hint: use some of the results from above!)

Solution.

(a) Let D = (Q, Σ, δ, q0 , F ) be the DFA which decides L. Consider the alternate DFA D0 =
(Q, Σ, δ, q0 , Q \ F ). In other words, D0 is the DFA which results when all the accept states of
D are turned into reject states and vice versa.
Now we show this decides Lc . First consider a word w ∈ Lc . Note that when w is run on D,
it does not accept. In other words, its ending state is some state in Q \ F . But this is now an
accept state in D0 , so in fact w is accepted by D0 . Similarly, if a word v ∈ L, then v is not
accepted by D0 . The conclusion follows.
(b) Since A and B are regular languages, there are DFAs M = (Q, Σ, δ, q0 , F ) and M 0 = (Q0 , Σ, δ 0 , q00 , F 0 )
that decide A and B respectively. To show that A ∩ B is regular, we construct a DFA
M 00 = (Q00 , Σ, δ 00 , q000 , F 00 ) that decides A ∩ B. Set
• Q00 = Q × Q0 = {(q, q 0 ) : q ∈ Q, q 0 ∈ Q0 },
• δ 00 is such that for (q, q 0 ) ∈ Q00 and a ∈ Σ,
δ 00 ((q, q 0 ), a) = (δ(q, a), δ 0 (q 0 , a)),
• q000 = (q0 , q00 ),
• F 00 = {(q, q 0 ) : q ∈ F and q 0 ∈ F 0 }.
This completes the definition of M 00 . It remains to show that M 00 indeed decides the language
L1 ∪ L2 .
First, we show that L1 ∩ L2 is in the language decided by M 00 . Call this language L(M 00 ).
Suppose w ∈ L1 ∩ L2 , which means w either belongs to L1 or it belongs to L2 . WLOG assume
w ∈ L1 . Let n be the length of w. Then we know that w induces a sequence of states r0 ,
r1 , . . ., rn ∈ Q such that r0 = q0 , δ(ri−1 , wi ) = ri for each i ∈ {1, 2, . . . , n}, and rn ∈ F .
This w will also induce a sequence of states of M 00 , r00 , r10 , . . ., rn0 ∈ Q0 such that r00 = q00 ,
δ 0 (ri−1
0 , w ) = r 0 for each i ∈ {1, 2, . . . , n}, and r 0 ∈ F 0 . Due to how we have defined M 00 , when
i i n
we run w on M 00 , it will induce the sequence of states (r0 , r00 ), (r1 , r10 , . . ., (rn , rn0 ) ∈ Q00 such
that (r0 , r00 ) = (q0 , q00 ), δ 00 ((ri−1 , ri−1
0 ), w ) = (δ(r , a), δ 0 (r 0 , a)) for each i ∈ {1, 2, . . . , n}. The
i i i
final state (rn , rn ) is such that rn ∈ F and rn0 ∈ F 0 , which by the definition of F 00 ensures that
0

(rn , rn0 ) ∈ F 00 . Therefore M 00 accepts w, i.e. w ∈ L(M 00 ).


For the other direction, run the reverse argument. Consider an element in L(M 00 ). Since its
sequence of states ends in an accepting state, its components must also both be accepted. This
means that the sequence of states in the first component end in something in F , while the
sequence of states in the second component end in something in F 0 . Hence w ∈ A ∩ B as well,
completing the proof.

Page 10
(c) Run a similar argument as before, except set F 00 = {(q, q 0 ) : q ∈ F or q 0 ∈ F 0 } instead.
(d) Remark that
A \ B = {x ∈ Σ∗ | x ∈ A and x 6∈ B} = A ∩ B c .
From the previous parts of this problem, we know that B c is regular, so A ∩ B c must be regular
as well. This completes the proof.

6. For any language L ∈ {0, 1}∗ , define a language L0 which consists of the set of words of the form

a1 s1,2 a2 s2,3 a3 . . . ak−1 sk−1,k ak

where a1 a2 a3 . . . ak is a word in L. Here, si,j is 1 when ai 6= aj and zero otherwise. For example,
the word 1101 in L corresponds to the word 1011011 in L0 .

Suppose L is regular. Must L0 also be regular? (Note: if necessary, you may quote any results from
the previous two problems without proof.)

Solution. I claim the answer is yes.


For ease of notation, let s(x, y) denote the parity of x and y, i.e. s(x, y) = 1 iff x 6= y. Now denote
by f the function taking strings to strings such that

f (x1 x2 . . . xn ) = x1 s(x1 , x2 )x2 s(x2 , x3 ) . . . s(xn−1 , xn )xn .

Consider the two languages

L1 = {a1 x1 a2 x2 . . . an−1 xn−1 an | a1 a2 . . . an ∈ L, x1 x2 . . . xn−1 ∈ {0, 1}∗ }

and
L2 = {f (w) | w ∈ {0, 1}∗ }.
We shall now show that L1 and L2 are both regular languages. This implies that their intersection
is also regular. But their intersection is exactly L0 . Hence if we can show L1 and L2 are regular,
then we are done.
First, we show that L1 is regular. Since L is regular, there exists a DFA D = (Q, Σ, δ, q0 , F ) which
decides it. Now construct another DFA D0 = (Q0 , Σ, δ 0 , q0 , F ) with the following properties:

• Q ⊆ Q0 with |Q0 | = 3|Q|. This means that there exists a bijection g : {0, 1} × Q 7→ Q0 \ Q.
• For all w = (e, q) ∈ {0, 1} × Q, we have δ 0 (w) = g(w) and δ 0 (e, δ 0 (w)) = δ(w).

This decides L1 . To prove this, first note that any string w ∈ {0, 1}∗ with |w| even cannot be
accepted by D0 . This is because any such string must map to a state in Q0 \ Q, which we know
contains zero accept states. Now remark that δ 0 (e, δ 0 (w)) = δ(w), we can use an induction-esque
argument to prove that the state which a1 a2 . . . an lands on and the state which a1 x1 a2 x2 . . . an−1 xn
lands on are identical. Hence a string of odd length in {0, 1}∗ is accepted by D0 iff the string formed
by the odd-indexed characters of the string is accepted by D. (This is obviously not a rigorous
argument, but it should suffice for grading purposes.)
Now we show that L2 is regular. Consider the following DFA:

Page 11
q00
0
0 1
q0 q01
1
0
1
0
q qrej 0,1
1
0
1
q1 0 q11
1 0
1
q10

I claim this decides L2 . To prove this, we induct on the number of time steps. Consider any
accepting string w. Then w must end with either a 0 or a 1.

• If w ends in a 0, then w00 and w11 should accept, while w01 and w10 should reject.
• If w ends in a 1, then w10 and w01 should accept, while w00 and w11 should reject.
• In either case, w0 and w1 should reject.

It is not hard to see that all three of these conditions are satisfied. Hence we can build on the length
of the input and prove that L2 is indeed the language decided by D.
We have shown L1 and L2 to both be regular; thus L0 must indeed be regular.

5 DFA Minimization
One of the biggest overall themes in the field of computer science is optimization. If wew have a program
that can solve a problem but does so in exponential time (e.g. the number of steps needed to solve the
problem on input of length n is roughly 2n ), then the algorithm isn’t really that practical! Computer
scientists aim to design and implement processes which are efficient in some sense. As a result, it might
be beneficial to see how we can explore this type of minimization in DFAs. The most natural way to do
so involves looking at the number of states for a DFA which decides some language L.

1. Below is a picture of a DFA which accepts some language L. Although this DFA is a decider for
L, it is somewhat inefficient. Propose two modifications to this DFA so that it still decides L but
requires fewer states to do so.

Page 12
0,1 0,1 0,1
start q0 q1 q2 q3

0,1 0,1

q5 q4
0,1

Solution. There are many different possible answers for this. I personally thought of the follow-
ing:

(a) Condense the cycle of length 4 into a cycle of length 2 (which can be done since the accept
and reject states alternate).
(b) Remove one of the nodes from the beginning, and switch the accept and reject states of the
cycle of length 2.

2. Consider the language


L = {0n 1m | n − m ≡ 0 (mod 3)}.
Note that this language is a more general version of one that we’ve explored before. However, in
fact this language is regular! What is the smallest number of states needed in a DFA that decides
L? (Your answer must both establish a minimum number of states and show that this minimum is
achievable.)

Solution. I claim the answer is 7 states.


Suppose there exists a DFA D which decides L in fewer than 7 states. Consider the set of words

S = {0, 00, 000, 1, 11, 111, 10}.

By the Pigeonhole Principle, there must exist two strings which end p on the same after after ron
on D. We now divide into cases.

• CASE 1: One of the strings is 10. Then note that the second string must be one of the
other six. It is not hard to see that for each of the other strings there exists a suffix which
when added to the end of the string produces an accepting string. However, 10x must reject
for any string x. So we get a contradiction.
• CASE 2: Both strings contain all zeroes. Call these strings 0i and 0j respectively. Then
note that appending 06−i to both strings yields that 06 and 06+j−i must either both accept or
both reject. This is false, because the former accepts while the latter rejects.
• CASE 3: Both strings contain all ones. Use an argument as above.
• CASE 4: One string contains all ones while the other contains all zeroes. Suppose
the strings are 0i and 1j for some i and j between 1 and 3. Now append the string 06−i to
both strings. This gives that 06 and 1j 06−i must either both accept or both reject. But this is
false, since 06 accepts but any string of the form 1m 0n must automatically be rejected.

Page 13
We have exhausted all cases, and so we are done.
It suffices to show that 7 is attainable. Here is such a DFA.

q0 0 q1 0 q2
start

1 1 1

q3 1 q4 0 q5

0 1
0 0

qrej

3. (a) Define
Ln = {x | x ∈ {0, 1}∗ and the nth -symbol from the left is a 1 }.
Show that there is some constant number c (independent of n) such that there is a DFA that
accepts Ln with at most n + c states.
(b) Define
Rn = {x | x ∈ {0, 1}∗ and the nth -symbol from the right is a 1 }.
Show that any DFA that accepts Rn has at least 2n states.

Solution.

(a) The following is one such DFA.

qrej 0,1
0

0,1 0,1 0,1 0,1


start q0 q1 q2 ··· qn−1

1
qn 0,1

This DFA works because it takes no care to the first n − 1 characters of the input string, but
then checks the nth character of the string to see if it is a 1. In terms of n, this DFA has n + 2
states, so c = 2 works.

Page 14
(b) Let D be a DFA which decides Rn . Suppose for the sake of contradiction that D has fewer
than 2n states. Consider the set of strings w ∈ {0, 1}∗ with |w| ≤ n. There are 2n such possible
strings. Thus, by the Pigenhole Principle two of these strings must end up in the same state
after run by the DFA.
Since these two strings are distinct, they must differ in some bit. Let this bit be the ith bit to
the right. Denote by A the string which has a 1 in this bit and B the string which has a 0 at
this bit. Now append the string 0n−i to both strings. Then A0n−i has a 1 in the nth bit, while
B0n−i has a 0 in this bit. But it must be true that either both of hese strings are accepted by
D or both are rejected by D - contradiction! Hence our original assumption was false and D
must have at least 2n states as desired.

It turns out that DFAs are simple enough in nature that we can algorithmically construct a minimal
DFA for any given language L! (The only caveat is that we need to find some DFA which decides L first
before applying this algorithm, but, eh, that’s a worthy trade-off. To explore this idea, we need to develop
the notion of equivalence classes.
Definition 5.1. Let Σ be an alphabet, and let A be any language over Σ∗ . For any word v ∈ Σ∗ , let

Sv (A) = {w ∈ Σ∗ | vw ∈ L}.

We then say that two strings x, y ∈ Σ∗ are equivalent iff Sx (A) = Sy (A). We denote this mathematically
through the relation x ≡A y. Note that ≡A is an equivalence relation, and so we may refer to the
equivalence classes2 of A.
There exists a similar notion of equivalence with regards to DFAs.
Definition 5.2. Given a DFA M = (Q, Σ, δ, q0 , F ), we say that two strings x, y ∈ Σ∗ are indistinguishable
iff δ ∗(q0 , x) = δ ∗ (q0 , y). In other words, the state reached by M on input x is the same as the state reached
by M on input y. We denote the notion of equivalence as x ≡M y. Just as above, note that ≡M is an
equivalence relation and so we may refer to the equivalence classes of M .

1. Let L = {0n 1n | n ∈ N}. What is S111100 (L)?

Solution. S111100 (L) is the lonely singleton {00}. This is not hard to show true.

2. Suppose M is a DFA with N states. At most how many equivalence classes can M have? What is
an example of a DFA M for which equality does not hold?

Solution. At most N states. Proof uses a Pigeonhole argument. An example of when equality
does not hold is when we have redundant states, e.g. the DFA in the example above.

3. Show that the notions of equivalency and indistinguishability are identical. In other words, let M
be a DFA and let A = L(M ) (i.e. A is the language which is decided by M ). Then for any x, y ∈ Σ∗ ,
if x ≡M y, then x ≡A y.

Solution. Denote by δ ∗ (q0 , x) the state reached by M on input x. This function will be used to
make the solution easier to write.
2
For any set X equipped with an equivalence relation ∼ and for any a in X, we say that the equivalence class of a is
the subset of all elements x ∈ X for which x ∼ a. It follows that these equivalence classes partition X into groups; we then
define the number of equivalence classes of ∼ as the number of groups in this partition.

Page 15
Let z ∈ σ ∗ . Clearly, δ ∗ (q0 , xz) = δ ∗ (q0 , yz). Then

xz ∈ A ⇐⇒ δ ∗ (q0 , xz) ∈ F ⇐⇒ δ ∗ (q0 , yz) ∈ F ⇐⇒ yz ∈ A.

Since this holds true for arbitrary z, we must have x ≡A y.

We now have enough background knowledge to state an important result.


Theorem 5.1 (Myhill-Nerode). A language A is regular iff the equivalence relation ≡A has a finite
number of equivalence classes. Furthermore, there exists some DFA M with L(M ) = A such that every
state of M uniquely corresponds to an equivalence class of ≡A .

1. Prove this theorem.

Solution. The only if direction is easy. Note that since A is regular, there exists a DFA D which
decides it. By the previous problem, we know that ≡A has at most as many equivalence classes as
≡D . But ≡D has a finite number of equivalence classes - in fact, as most the number of states of D.
The if direction is a bit tougher. The main idea here is to use the equivalence classes to construct
the DFA explicitly.
Suppose a language L has a finite number of equivalence classes, labeled T1 , T2 , . . ., Tn . (The use
of T is to distinguish these suffix sets from the suffix set of a word in L.) We state a lemma.
LEMMA: If two words v and w have the same suffix sets Sv and Sw , then for any a ∈ Σ we will
have Sva = Swa .

Proof. Note that by the definition of suffix sets, if a word x ∈ Sva , then ax ∈ Sv and vice versa.
Since Sv = Sw , we have ax ∈ Sw ⇐⇒ x ∈ Swa . Since this holds for arbitrary a, we have Sva = Swa
as desired.

We now construct the DFA D. Let q1 , . . . , qn be states which correspond to the equivalence classes
T1 through Tn . The start state q of D is the state which corresponds to the equivalence class
containing ε. Since each state is the suffix set for some string, we will denote each state by some
canonical string. For the transition function, let δ(Sv , a) = Sva , where v is the string we choose to
represent the state Sv . By the above claim, the transition function is the same regardless of the
choice of v.
Now it suffices to show that this DFA works. Let w = w1 . . . wn be a word accepted by the DFA.
By the definition of the transition function, it must be the case that δ(Sε , w1 ) = Sw1 , δ(Sw1 , w2 ) =
Sw1 w2 , and so on; an inductive argument shows that

δ(Sw1 ...wn−1 , wn ) = Sw .

Furthermore, since this is accepted, we now that Sw must contain ε, which implies w ∈ L. Reversing
the argument above gives that every word in L is accepted by this DFA.
Hence we have shown that D is the DFA which decides the language A, and thus A must be regular.3

Using this theorem, we now can construct an algorithm for minimizing any DFA! The algorithm is as
follows. (Note: in this algorithm, we use the official definition of a DFA given in the previous section.)
3
Thanks to the CMU course 15-251 F15 for providing the basis for the solution to this problem.

Page 16
• Draw a table for all pairs of states (Qi , Qj ) (not just those which are connected directly). Initially,
these pairs will be unmarked.

• Mark all state pairs (Qi , Qj ) with the property that Qi ∈ F and Qj 6∈ F and mark them.

• For some input A, if there exist states Qi and Qj with the property that (δ(Qi , A), δ(Qj , A)) is
marked, then mark the pair (Qi , Qj ). Repeat this until there are no more states that can be
marked.

• Combine all the unmarked pairs of states into a single state in the reduced DFA. Return this new
DFA.

1. Perform this algorithm on the following DFA. What is the result?

q1 1 q3 1 q5 0,1
0

0 0 1
1

start q0 q2 q4 0
1 0

Solution. This should be the resulting output.

0 0

q0 1 q1 1 q2
start 0,1

2. Show using the Myhill-Nerode Theorem that this procedure produces a DFA with the minimal
number of states.

Solution. Note that each marked pair represents a pair of states which cannot possibly be part
of the same equivalence class - just note that ε cannot appear in both. By inductively crossing
out more and more pairs of states, we can thus reduce the number of states we have to check for
equivalence. (This is not a formal argument; rather, this is a general idea of what is going on.)

Page 17

You might also like