0% found this document useful (0 votes)
21 views45 pages

Formal Languages & Automata Theory

The document discusses converting NFAs to equivalent DFAs and describes operations on regular languages such as concatenation, union, and Kleene star. It provides examples of constructing regular expressions by combining simpler languages.

Uploaded by

Cyrus Li
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views45 pages

Formal Languages & Automata Theory

The document discusses converting NFAs to equivalent DFAs and describes operations on regular languages such as concatenation, union, and Kleene star. It provides examples of constructing regular expressions by combining simpler languages.

Uploaded by

Cyrus Li
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 45

NFA to DFA conversion and regular expressions

CSCI 3130 Formal Languages and Automata Theory

Siu On CHAN
Fall 2018
Chinese University of Hong Kong

1/22
DFAs and NFAs are equally powerful

NFA can do everything a DFA can do


How about the other way?

Every NFA is equivalent to some DFA for the same language

2/22
NFA → DFA algorithm

Given an NFA, figure out

1. the initial active states


2. how the set of active states changes upon reading an input
symbol

3/22
NFA → DFA example

ε,1
NFA: q0 q1 ε q2
0

Initial active states (before reading any input)?

4/22
NFA → DFA example

ε,1
NFA: q0 q1 ε q2
0

Initial active states (before reading any input)?

partial
DFA: {q0 , q1 , q2 }

How does the set of active states change?


NFA → DFA example

ε,1
NFA: q0 q1 ε q2
0

Initial active states (before reading any input)?

partial
DFA: {q0 , q1 , q2 }

How does the set of active states change?


NFA → DFA example

ε,1
NFA: q0 q1 ε q2
0

Initial active states (before reading any input)?

partial 1
DFA: {q0 , q1 , q2 } {q1 , q2 }

How does the set of active states change?


NFA → DFA example

ε,1
NFA: q0 q1 ε q2
0

Initial active states (before reading any input)?

0 0,1

partial 1
1
DFA: {q0 , q1 , q2 } {q1 , q2 } ∅
0

How does the set of active states change?

4/22
NFA → DFA summary

0 0,1

1
DFA: 1
{q0 , q1 , q2 } {q1 , q2 } ∅
0

Every DFA state corresponds to a subset of NFA states


A DFA state is accepting if it contains an accepting NFA state

5/22
Regular expressions
Regular expressions

Powerful string matching feature in advanced editors (e.g. Vim,


Emacs) and modern programming languages (e.g. PERL, Python)

PERL regex examples:


colou?r matches “color”/“colour”
[A-Za-z]*ing matches any word ending in “ing”

We will learn to parse complicated regex recursively


by building up from simpler ones
Also construct the language matched by the expression recursively

Will focus on regular expressions in formal language theory


(notations differ from PERL/Python/POSIX regex)

6/22
String concatenation

st = abbbab
s = abb ts = bababb
t = bab ss = abbabb
sst = abbabbbab
s = x1 . . . xn , t = y1 . . . ym

st = x1 . . . xn y1 . . . ym

7/22
Operations on languages

• Concantenation of languages L1 and L2

L1 L2 = {st : s ∈ L1 , t ∈ L2 }

• n-th power of language L

Ln = {s1 s2 . . . sn | s1 , s2 , . . . , sn ∈ L}

• Union of L1 and L2

L1 ∪ L2 = {s | s ∈ L1 or s ∈ L2 }

8/22
Example

L1 = {0, 01} L2 = {ε, 1, 11, 111, . . . }

L1 L2 = {0, 01, 011, 0111, . . . } ∪ {01, 011, 0111, 01111, . . . }


= {0, 01, 011, 0111, . . . }
0 followed by any number of 1s

L12 = {00, 001, 010, 0101} L22 = L2


L2n = L2 for any n > 1

L1 ∪ L2 = {0, 01, ε, 1, 11, 111, . . . }

9/22
Operations on languages

The star of L are contains strings made up of zero or more chunks


from L

L∗ = L0 ∪ L1 ∪ L2 ∪ . . .

Example: L1 = {0, 01} and L2 = {ε, 1, 11, 111, . . . }


What is L1∗ ? L2∗ ?

10/22
Example

L1 = {0, 01}

L10 = {ε}
L11 = {0, 01}
L12 = {00, 001, 010, 0101}
L13 = {000, 0001, 0010, 00101, 0100, 01001, 01010, 010101}

Which of the following are in L1∗ ?


00100001 00110001 10010001

11/22
Example

L1 = {0, 01}

L10 = {ε}
L11 = {0, 01}
L12 = {00, 001, 010, 0101}
L13 = {000, 0001, 0010, 00101, 0100, 01001, 01010, 010101}

Which of the following are in L1∗ ?


00100001 00110001 10010001
Yes No No

11/22
Example

L1 = {0, 01}

L10 = {ε}
L11 = {0, 01}
L12 = {00, 001, 010, 0101}
L13 = {000, 0001, 0010, 00101, 0100, 01001, 01010, 010101}

Which of the following are in L1∗ ?


00100001 00110001 10010001
Yes No No

L1∗ contains all strings such that any 1 is preceded by a 0

11/22
Example

L2 = {ε, 1, 11, 111, . . . }


any number of 1s

L20 = {ε}
L21 = L2
L22 = L2
L2n = L2 (n > 1)

12/22
Example

L2 = {ε, 1, 11, 111, . . . }


any number of 1s

L2∗ = L20 ∪ L21 ∪ L22 ∪ . . .


L20 = {ε}
= {ε} ∪ L2 ∪ L2 ∪ . . .
L21 = L2
= L2
L22 = L2
L2n = L2 (n > 1)
L2∗ = L2

12/22
Combining languages

We can construct languages by starting with simple ones, like {0}


and {1}, and combining them

{0}({0} ∪ {1})∗ ⇒ 0(0 + 1)∗


all strings that start with 0

13/22
Combining languages

We can construct languages by starting with simple ones, like {0}


and {1}, and combining them

{0}({0} ∪ {1})∗ ⇒ 0(0 + 1)∗


all strings that start with 0

({0}{1}∗ ) ∪ ({1}{0}∗ ) ⇒ 01∗ + 10∗


0 followed by any number of 1s, or
1 followed by any number of 0s

13/22
Combining languages

We can construct languages by starting with simple ones, like {0}


and {1}, and combining them

{0}({0} ∪ {1})∗ ⇒ 0(0 + 1)∗


all strings that start with 0

({0}{1}∗ ) ∪ ({1}{0}∗ ) ⇒ 01∗ + 10∗


0 followed by any number of 1s, or
1 followed by any number of 0s

0(0 + 1)∗ and 01∗ + 10∗ are regular expressions


Blueprints for combining simpler languages into complex ones

13/22
Syntax of regular expressions

A regular expression over Σ is an expression formed by the following


rules

• The symbols ∅ and ε are regular expressions


• Every symbol a in Σ is a regular expression
• If R asd S are regular expressions, so are R + S, RS and R∗

Examples:
∅ ε

0(0 + 1)∗ 1 (ε + 0)
01∗ + 10∗ (0 + 1)∗ 01(0 + 1)∗

A language is regular if it is represented by a regular expression

14/22
Understanding regular expressions

Σ = {0, 1}

01∗ = 0(1)∗ represents {0, 01, 011, 0111, . . . }


0 followed by any number of 1s

01∗ is not (01)∗

15/22
Understanding regular expressions

0 + 1 yields {0, 1} strings of length 1

(0 + 1)∗ yields {ε, 0, 1, 00, 01, 10, 11, . . . } any string

(0 + 1)∗ 010 any string that ends in 010

(0 + 1)∗ 01(0 + 1)∗ any string containing 01

16/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗

((0 + 1)(0 + 1))∗ ((0 + 1)(0 + 1)(0 + 1))∗

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗

((0 + 1)(0 + 1))∗ ((0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗

((0 + 1)(0 + 1))∗ ((0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗

((0 + 1)(0 + 1))∗ ((0 + 1)(0 + 1)(0 + 1))∗


strings of even length strings whose length is a
multiple of 3

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1))∗ + ((0 + 1)(0 + 1)(0 + 1))∗
strings whose length is even or a multiple of 3
= strings of length 0, 2, 3, 4, 6, 8, 9, 10, 12, . . .

((0 + 1)(0 + 1))∗ ((0 + 1)(0 + 1)(0 + 1))∗


strings of even length strings whose length is a
multiple of 3

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

17/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1)

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1)

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1)

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗

(0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1)


strings of length 2 or 3

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗
strings that can be broken into blocks, where each block has length 2
or 3

(0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1)


strings of length 2 or 3

(0 + 1)(0 + 1) (0 + 1)(0 + 1)(0 + 1)


strings of length 2 strings of length 3

18/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗
strings that can be broken into blocks, where each block has length 2
or 3

Which are in the language?


ε 1 01 011 00110 011010110

19/22
Understanding regular expressions

What language does the following represent?


((0 + 1)(0 + 1) + (0 + 1)(0 + 1)(0 + 1))∗
strings that can be broken into blocks, where each block has length 2
or 3

Which are in the language?


ε 1 01 011 00110 011010110
3 7 3 3 3 3

The regular expression represents all strings except 0 and 1

19/22
Understanding regular expressions

What language does the following represent?


ends in at most two 0s
z }| {
(1 + 01 + 001)∗ (ε + 0 + 00)
| {z }
at most two 0s between two consecutive 1s

20/22
Understanding regular expressions

What language does the following represent?


ends in at most two 0s
z }| {

(1 + 01 + 001) (ε + 0 + 00)
| {z }
at most two 0s between two consecutive 1s

20/22
Understanding regular expressions

What language does the following represent?


ends in at most two 0s
z }| {

(1 + 01 + 001) (ε + 0 + 00)
| {z }
at most two 0s between two consecutive 1s

Never three consecutive 0s

The regular expression represents strings not containing 000

Examples:

ε 00 0110010110 0010010

20/22
Writing regular expressions

Write a regular expression for all strings with two consecutive 0s

21/22
Writing regular expressions

Write a regular expression for all strings with two consecutive 0s

(anything)00(anything)

(0 + 1)∗ 00(0 + 1)∗

21/22

You might also like