Finite Automata

The document discusses finite automata conversions and lexing, highlighting the equivalence of regular expressions, NFAs, and DFAs in expressing regular languages. It details algorithms for converting between these forms, including Thompson's construction for RE to NFA, subset construction for NFA to DFA, and Hopcroft's algorithm for DFA minimization. Additionally, it covers lexer generation techniques and handling keywords and whitespace in the context of lexical analysis.

Uploaded by

Bijay Nag

CS 432

Fall 2018
Mike Lam, Professor

Finite Automata Conversions and Lexing
Finite Automata
●
Key result: all of the following have the same expressive
power (i.e., they all describe regular languages):
– Regular expressions (REs)
– Non-deterministic finite automata (NFAs)
– Deterministic finite automata (DFAs)
●
Proof by construction
– An algorithm exists to convert any RE to an NFA
– An algorithm exists to convert any NFA to a DFA
– An algorithm exists to convert any DFA to an RE
– For every regular language, there exists a minimal DFA
●
It has the fewest states of all DFAs equivalent to the RE
Finite Automata
●
Finite automata transitions:

– Regex → NFA: Thompson's construction
– NFA → DFA: Subset construction
– DFA → minimal DFA: Hopcroft's algorithm (minimize)
– NFA → minimal DFA (direct): Brzozowski's algorithm
– DFA → Regex: Kleene's construction
– DFA → Lexer: Lexer generators
(In the original diagram, dashed lines indicate transitions to a minimized DFA.)

Finite Automata Conversions
●
RE to NFA: Thompson's construction
– Core insight: inductively build up NFA using “templates”
– Core concept: use null transitions to build NFA quickly
●
NFA to DFA: Subset construction
– Core insight: DFA nodes represent subsets of NFA nodes
– Core concept: use null closure to calculate subsets
●
DFA minimization: Hopcroft’s algorithm
– Core insight: create partitions, then keep splitting
●
DFA to RE: Kleene's construction
– Core insight: repeatedly eliminate states by combining regexes
Thompson's Construction
●
Basic idea: create NFA inductively, bottom-up
– Base case:
●
Start with individual alphabet symbols (see below)
– Inductive case:
●
Combine by adding new states and null/epsilon transitions
●
Templates for the three basic operations
– Invariant:
●
The NFA always has exactly one start state and one accepting state

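[Figure: base-case NFA for a single alphabet symbol a]
Thompson's: Concatenation
[Figure: NFAs for A and B combined into an NFA for AB]
Thompson's: Union
[Figure: NFAs for A and B combined into an NFA for A|B]
Thompson's: Closure
[Figure: NFA for A extended with ε-transitions into an NFA for A*]
Thompson's Construction
[Figure: summary of the four templates: base case, concatenation, union, closure]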
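The inductive build-up can be sketched in a few lines of Python. This is a minimal illustration, not the templates from the slides: the names `NFA`, `symbol`, `concat`, `union`, `star`, and `accepts` are hypothetical, `None` is assumed as the ε label, and the templates used are the common textbook ones. Every fragment keeps the invariant of exactly one start state and one accepting state.

```python
from itertools import count

EPS = None          # assumed label for null/epsilon transitions
_ids = count()      # source of fresh state numbers

class NFA:
    """Fragment with exactly one start state and one accepting state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept = start, accept
        self.edges = edges  # list of (src, label, dst)

def symbol(c):
    # Base case: two states joined by one labeled transition.
    s, t = next(_ids), next(_ids)
    return NFA(s, t, [(s, c, t)])

def concat(a, b):
    # Link a's accepting state to b's start state with an epsilon transition.
    return NFA(a.start, b.accept, a.edges + b.edges + [(a.accept, EPS, b.start)])

def union(a, b):
    # New start branches into both fragments; both rejoin at a new accepting state.
    s, t = next(_ids), next(_ids)
    extra = [(s, EPS, a.start), (s, EPS, b.start),
             (a.accept, EPS, t), (b.accept, EPS, t)]
    return NFA(s, t, a.edges + b.edges + extra)

def star(a):
    # New start and accepting states, with epsilon edges to skip or repeat a.
    s, t = next(_ids), next(_ids)
    extra = [(s, EPS, a.start), (a.accept, EPS, t),
             (a.accept, EPS, a.start), (s, EPS, t)]
    return NFA(s, t, a.edges + extra)

def accepts(nfa, text):
    """Simulate the NFA by tracking the set of current states."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            u = stack.pop()
            for src, label, dst in nfa.edges:
                if src == u and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = closure({nfa.start})
    for ch in text:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(moved)
    return nfa.accept in current

# Build an NFA for aa*|b bottom-up from single symbols.
nfa = union(concat(symbol("a"), star(symbol("a"))), symbol("b"))
print(accepts(nfa, "aaa"), accepts(nfa, "b"), accepts(nfa, "ab"))
```

Each operator adds at most two states and four transitions, so the fragment grows linearly with the size of the regex.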
Subset construction
●
Basic idea: create DFA incrementally
– Each DFA state represents a subset of NFA states
– Use null closure operation to “collapse” null/epsilon transitions
– Null closure: all states reachable via epsilon transitions
●
Essentially: where can we go “for free?”
●
Formally: ε-closure(s) = { t ∈ S | t is reachable from s via zero or more ε-transitions in δ }
– Simulates running all possible paths through the NFA

Null closure of A = { A }
Null closure of B = { B, D }
Null closure of C = { C, D }
Null closure of D = { D }
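A null closure can be computed with a small worklist loop. The sketch below is illustrative only: the function name `epsilon_closure` and the `EPS_EDGES` table are assumptions, and the ε-edges (B to D and C to D) were chosen solely because they reproduce the four closures listed above. The original NFA figure is not available here.

```python
def epsilon_closure(states, eps_edges):
    """Return every state reachable from `states` via zero or more ε-transitions."""
    closure = set(states)
    worklist = list(states)
    while worklist:
        u = worklist.pop()
        for t in eps_edges.get(u, ()):
            if t not in closure:      # visit each state at most once
                closure.add(t)
                worklist.append(t)
    return closure

# Assumed ε-edges, picked only to match the closures listed on the slide.
EPS_EDGES = {"B": {"D"}, "C": {"D"}}

for state in "ABCD":
    print("Null closure of", state, "=", sorted(epsilon_closure({state}, EPS_EDGES)))
```

Because the loop keeps following ε-edges from newly added states, it also handles chains of several ε-transitions.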
Formal Algorithm
SubsetConstruction(S, Σ, s0, SA, δ):
    t0 := ε-closure(s0)
    S' := { t0 }
    S'A := ∅
    W := { t0 }
    while W ≠ ∅:
        choose u in W and remove it from W
        for each c in Σ:
            t := ε-closure(δ(u,c))
            δ'(u,c) = t
            if t is not in S' then
                add t to S' and W
                add t to S'A if any state in t is also in SA
    return (S', Σ, t0, S'A, δ')
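The pseudocode translates fairly directly into Python. The following is a sketch under stated assumptions rather than a reference implementation: the NFA encoding (a `delta` dict keyed by `(state, symbol)` and an `eps` dict of ε-edges) and the example NFA are hypothetical. Two choices differ from the pseudocode: empty subsets are skipped instead of becoming a dead state, and accepting subsets are collected at the end so that the start subset is also checked.

```python
from collections import deque

def epsilon_closure(states, eps):
    closure, stack = set(states), list(states)
    while stack:
        for t in eps.get(stack.pop(), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(alphabet, start, accepting, delta, eps):
    """delta: (state, symbol) -> set of states; eps: state -> set of states."""
    t0 = epsilon_closure({start}, eps)
    dfa_states, dfa_delta = {t0}, {}
    worklist = deque([t0])
    while worklist:
        u = worklist.popleft()
        for c in alphabet:
            moved = set()
            for s in u:                       # union of moves from every NFA state in u
                moved |= delta.get((s, c), set())
            if not moved:
                continue                      # choice for this sketch: no dead state
            t = epsilon_closure(moved, eps)
            dfa_delta[(u, c)] = t
            if t not in dfa_states:
                dfa_states.add(t)
                worklist.append(t)
    # A DFA state accepts if it contains any accepting NFA state.
    dfa_accepting = {t for t in dfa_states if t & accepting}
    return dfa_states, t0, dfa_accepting, dfa_delta

# Assumed NFA: A -a-> B, A -b-> C, B -ε-> D, C -ε-> D, with D accepting.
delta = {("A", "a"): {"B"}, ("A", "b"): {"C"}}
eps = {"B": {"D"}, "C": {"D"}}
states, t0, accepting, dfa_delta = subset_construction("ab", "A", {"D"}, delta, eps)
for (u, c), t in sorted(dfa_delta.items(), key=lambda kv: kv[0][1]):
    print(sorted(u), c, "->", sorted(t))
```

Using `frozenset` for DFA states makes each subset hashable, so it can serve directly as a dictionary key in the new transition table.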
Subset Example
[Figure: resulting DFA, in which {A} goes to {B,D} on a and to {C,D} on b]
Subset Example

{B,D,E} a
a a
{A,E}
b {E}
b
b
{C,D}
Algorithms
●
Subset construction is a fixed-point algorithm
– Textbook: “Iterated application of a monotone function”
– Basically: A loop that is mathematically guaranteed to
terminate at some point (an n-state NFA has finitely many
subsets, and each is added to the worklist at most once)
– When it terminates, some desirable property holds
●
In the case of subset construction: the NFA has been
converted to a DFA!
Hopcroft’s DFA Minimization
●
Split into two partitions (final & non-final)
●
Keep splitting a partition while there are states with differing behaviors
– Two states transition to differing partitions on the same symbol
– Or one state transitions on a symbol and another doesn’t
●
When done, each partition becomes a single state

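[Figure: in a DFA where {A} goes to {B,D} on a and to {C,D} on b, the states {B,D} and {C,D} have the same behavior, so they collapse into a single state {B,C,D} reached from {A} on a,b]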
Hopcroft’s DFA Minimization

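[Figure: in a DFA where {B,D} also has a transition on a, the states {B,D} and {C,D} differ in behavior on 'a', so the partition is split and {A}, {B,D}, and {C,D} remain separate states]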
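The split-until-stable idea can be sketched as follows. Note that this is plain iterative partition refinement, not Hopcroft's worklist-based algorithm with its better running time. The function name `minimize`, the state names, and the DFA encoding are assumptions made for the example. The two calls at the bottom correspond to the collapse and split cases described above.

```python
def minimize(states, alphabet, delta, accepting):
    """Return the final partition of `states` as a set of frozensets.

    delta maps (state, symbol) -> state; a missing entry means "no transition".
    """
    initial = (frozenset(accepting), frozenset(states - accepting))
    partition = [p for p in initial if p]          # final and non-final groups
    changed = True
    while changed:
        changed = False
        block_of = {s: i for i, block in enumerate(partition) for s in block}
        refined = []
        for block in partition:
            groups = {}
            for s in block:
                # Behavior signature: which block each symbol leads to
                # (None when the state has no transition on that symbol).
                sig = tuple(block_of.get(delta.get((s, c))) for c in alphabet)
                groups.setdefault(sig, set()).add(s)
            if len(groups) > 1:
                changed = True                     # differing behavior: split
            refined.extend(frozenset(g) for g in groups.values())
        partition = refined
    return set(partition)

states = {"A", "BD", "CD"}
accepting = {"BD", "CD"}

# Case 1: BD and CD behave identically, so they end up in one partition.
same = minimize(states, "ab", {("A", "a"): "BD", ("A", "b"): "CD"}, accepting)
print(sorted(sorted(p) for p in same))

# Case 2: BD has a transition on 'a' but CD does not, so they are split.
loop = {("A", "a"): "BD", ("A", "b"): "CD", ("BD", "a"): "BD"}
diff = minimize(states, "ab", loop, accepting)
print(sorted(sorted(p) for p in diff))
```

Each partition in the result becomes one state of the minimized DFA.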
Kleene's Construction
●
Replace edge labels with REs
– "a" → "a" and "a,b" → "a|b"
●
Eliminate states by combining REs
– See pattern below; apply pairwise around each state to be eliminated
– Repeat until only one or two states remain
●
Build final RE
– One state with "A" self-loop → "A*"
– Two states: see pattern below

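[Figure: Eliminating states: an incoming edge A, a self-loop B, and an outgoing edge C through the eliminated state, in parallel with a direct edge D, combine into AB*C|D]
[Figure: Combining final two states: a self-loop A on the start state, an edge B to the final state, a self-loop C on the final state, and an edge D back to the start state combine into A*B(C|DA*B)*]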
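The elimination pattern AB*C|D can be expressed as a short function over a table of regex-labeled edges. This is a sketch with assumed names (`alt`, `eliminate`) and an assumed three-state example; it parenthesizes liberally rather than producing the shortest possible regex.

```python
import re

def alt(x, y):
    """Combine two optional regexes with '|'."""
    if x is None:
        return y
    if y is None:
        return x
    return f"{x}|{y}"

def eliminate(edges, q):
    """Remove state q from edges ({(src, dst): regex}), rerouting paths through it."""
    loop = edges.get((q, q))
    star = f"({loop})*" if loop is not None else ""
    ins = [(s, r) for (s, d), r in edges.items() if d == q and s != q]
    outs = [(d, r) for (s, d), r in edges.items() if s == q and d != q]
    # Keep every edge that does not touch q.
    result = {k: r for k, r in edges.items() if q not in k}
    for s, r_in in ins:
        for d, r_out in outs:
            through = f"({r_in}){star}({r_out})"   # the A B* C part
            result[(s, d)] = alt(result.get((s, d)), through)  # "| D" if a direct edge exists
    return result

# Assumed example: S -a-> M, M -b-> M, M -c-> F, and a direct edge S -d-> F.
edges = {("S", "M"): "a", ("M", "M"): "b", ("M", "F"): "c", ("S", "F"): "d"}
reduced = eliminate(edges, "M")
regex = reduced[("S", "F")]
print(regex)
print(bool(re.fullmatch(regex, "abbc")), bool(re.fullmatch(regex, "ad")))
```

Applying `eliminate` once per intermediate state leaves only the start and final states, at which point the two-state pattern above gives the final RE.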
Brzozowski’s Algorithm
●
Direct NFA → minimal DFA conversion
●
Sub-procedures:
– Reverse(n): invert all transitions in NFA n, adding a new start
state connected to all old final states (the old start state becomes the final state)
– Subset(n): apply subset construction to NFA n
– Reach(n): remove any part of NFA n unreachable from start state
●
Apply all three in order, twice, to get the minimal DFA
– First time eliminates duplicate suffixes
– Second time eliminates duplicate prefixes
– MinDFA(n) = Reach(Subset(Reverse(Reach(Subset(Reverse(n))))))
– Potentially easier to code than Hopcroft’s algorithm
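The three sub-procedures and their composition can be sketched compactly. The encoding here is an assumption: an NFA is a tuple `(starts, finals, edges)` with no ε-transitions, and `reverse` swaps the start and final sets instead of adding a new start state with ε-edges, which avoids needing a null closure. The helper `num_states` and the example automaton are likewise hypothetical.

```python
def reverse(nfa):
    starts, finals, edges = nfa
    # Flip every edge and swap the roles of start and final states.
    return (set(finals), set(starts), {(dst, c, src) for src, c, dst in edges})

def subset(nfa):
    starts, finals, edges = nfa
    t0 = frozenset(starts)
    seen, work, new_edges = {t0}, [t0], set()
    while work:
        u = work.pop()
        for c in {sym for src, sym, _ in edges if src in u}:
            t = frozenset(dst for src, sym, dst in edges if src in u and sym == c)
            new_edges.add((u, c, t))
            if t not in seen:
                seen.add(t)
                work.append(t)
    return ({t0}, {t for t in seen if t & finals}, new_edges)

def reach(nfa):
    starts, finals, edges = nfa
    seen, work = set(starts), list(starts)
    while work:
        u = work.pop()
        for src, _, dst in edges:
            if src == u and dst not in seen:
                seen.add(dst)
                work.append(dst)
    # Drop final states and edges that cannot be reached from a start state.
    return (set(starts), finals & seen, {e for e in edges if e[0] in seen})

def min_dfa(nfa):
    once = reach(subset(reverse(nfa)))
    return reach(subset(reverse(once)))

def num_states(nfa):
    starts, finals, edges = nfa
    return len(set(starts) | set(finals)
               | {e[0] for e in edges} | {e[2] for e in edges})

# Assumed input: A -a-> BD, A -b-> CD, with BD and CD both final.
dfa = ({"A"}, {"BD", "CD"}, {("A", "a", "BD"), ("A", "b", "CD")})
print(num_states(dfa), "->", num_states(min_dfa(dfa)))
```

Since `subset` only ever creates states it can reach, `reach` is redundant in this particular sketch; it is kept so that the code mirrors the MinDFA formula.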
Brzozowski’s Algorithm
●
MinDFA(n) = Reach(Subset(Reverse(Reach(Subset(Reverse(n))))))

Example from
EAC (p.76)
NFA/DFA complexity
●
What are the time and space requirements to...
– Build an NFA?
– Run an NFA?
– Build a DFA?
– Run a DFA?

[Figure: automaton for the regex aa*|b, with states {A}, {B,D}, and {C,D} and edges labeled a, b, and ε]
NFA/DFA complexity
●
Thompson's construction
– At most two new states and four transitions per regex character
– Thus, a linear space increase with respect to the # of regex characters
– Constant # of operations per increase means linear time as well
●
NFA execution
– Proportional to both NFA size and input string size
– Must track multiple simultaneous “current” states
●
Subset construction
– Potential exponential state space explosion
– An n-state NFA could require up to 2^n DFA states
– However, this rarely happens in practice
●
DFA execution
– Proportional to input string size only (only track a single “current” state)
NFA/DFA complexity
●
NFAs build quicker (linear) but run slower
– Better if you will only run the FA a few times
– Or if you need features that are difficult to implement with DFAs
●
DFAs build slower but run faster (linear)
– Better if you will run the FA many times

             NFA      DFA
Build time   O(m)     O(2^m)
Run time     O(m×n)   O(n)

m = length of regular expression
n = length of input string
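The difference in run time shows up directly in the two simulation loops. The sketch below assumes a hand-written NFA and DFA for aa*|b; the state names, table encodings, and function names are illustrative. The DFA loop does one lookup per input character, while the NFA loop must update a whole set of current states per character.

```python
def run_dfa(delta, start, accepting, text):
    state = start
    for ch in text:                        # one table lookup per character: O(n)
        state = delta.get((state, ch))
        if state is None:                  # no transition: reject
            return False
    return state in accepting

def run_nfa(delta, eps, start, accepting, text):
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for t in eps.get(stack.pop(), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({start})
    for ch in text:                        # per character, touch up to every NFA state
        moved = set()
        for s in current:
            moved |= delta.get((s, ch), set())
        current = closure(moved)
    return bool(current & accepting)

# Assumed NFA for aa*|b: A -a-> B, B -a-> B, A -b-> C, B -ε-> D, C -ε-> D.
NFA_DELTA = {("A", "a"): {"B"}, ("B", "a"): {"B"}, ("A", "b"): {"C"}}
NFA_EPS = {"B": {"D"}, "C": {"D"}}

# Assumed equivalent DFA over subset-named states.
DFA_DELTA = {("A", "a"): "BD", ("BD", "a"): "BD", ("A", "b"): "CD"}

for text in ["a", "aaa", "b", "ab", ""]:
    print(repr(text),
          run_nfa(NFA_DELTA, NFA_EPS, "A", {"D"}, text),
          run_dfa(DFA_DELTA, "A", {"BD", "CD"}, text))
```

Both functions give the same answers; only the amount of work per input character differs.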
Lexers
●
Auto-generated
– Table-driven: generic scanner, auto-generated tables
– Direct-coded: hard-code transitions using jumps
– Common tools: lex/flex and similar
●
Hand-coded
– Better I/O performance (i.e., buffering)
– More efficient interfacing w/ other phases
– This is what we’ll do for P2
Handling Keywords
●
Issue: keywords are valid identifiers
●
Option 1: Embed into NFA/DFA
– Separate regex for keywords
– Easier/faster for generated scanners
●
Option 2: Use lookup table
– Scan as identifier then check for a keyword
– Easier for hand-coded scanners
– (Thus, this is probably easier for P2)
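Option 2 might look like the following in a hand-coded scanner. This is only a sketch: the keyword set, the token kind names `KEYWORD` and `ID`, and the function `scan_word` are hypothetical and not taken from the P2 specification.

```python
KEYWORDS = {"if", "else", "while", "return"}   # hypothetical keyword set

def scan_word(text, pos):
    """Scan an identifier-shaped word starting at text[pos].

    Assumes text[pos] is a letter or underscore.
    Returns (kind, lexeme, next_pos).
    """
    end = pos
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    lexeme = text[pos:end]
    # Scan as an identifier first, then check the lookup table.
    kind = "KEYWORD" if lexeme in KEYWORDS else "ID"
    return kind, lexeme, end

print(scan_word("while x", 0))
print(scan_word("whilex = 1", 0))
```

Because the whole word is scanned before the table lookup, an identifier that merely starts with a keyword (such as `whilex`) is still classified as an identifier.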
Handling Whitespace
●
Issue: whitespace is usually ignored
– Write a regex and remove it before each new token
●
Side effect: some results are counterintuitive
– Is this a valid token? “3abc”
– For now, it’s actually two!
– We’ll reject them later, in the parsing phase
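A small regex-driven tokenizer makes the "3abc" result concrete. The token classes `NUM` and `ID` and their patterns are assumptions for this sketch, not the course's actual token definitions. Whitespace is matched and discarded before each token, as described above.

```python
import re

# Hypothetical token classes for illustration.
TOKEN_SPEC = [("NUM", r"[0-9]+"), ("ID", r"[A-Za-z_][A-Za-z0-9_]*")]
WHITESPACE = re.compile(r"\s*")
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    tokens, pos = [], 0
    while True:
        pos = WHITESPACE.match(text, pos).end()   # remove whitespace before each token
        if pos == len(text):
            return tokens
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((m.lastgroup, m.group()))   # lastgroup names the class that matched
        pos = m.end()

print(tokenize("3abc"))
print(tokenize("  3   abc "))
```

The scanner sees a number followed immediately by an identifier, so "3abc" produces the same two tokens as "3 abc"; deciding that this sequence is invalid is left to the parser.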
