Finite Automata

The document discusses finite automata conversions and lexing, highlighting the equivalence of regular expressions, NFAs, and DFAs in expressing regular languages. It details algorithms for converting between these forms, including Thompson's construction for RE to NFA, subset construction for NFA to DFA, and Hopcroft's algorithm for DFA minimization. Additionally, it covers lexer generation techniques and handling keywords and whitespace in the context of lexical analysis.

Uploaded by

Bijay Nag

CS 432

Fall 2018
Mike Lam, Professor

Finite Automata Conversions and Lexing

Finite Automata
● Key result: all of the following have the same expressive power (i.e., they all describe regular languages):
  – Regular expressions (REs)
  – Non-deterministic finite automata (NFAs)
  – Deterministic finite automata (DFAs)
● Proof by construction
  – An algorithm exists to convert any RE to an NFA
  – An algorithm exists to convert any NFA to a DFA
  – An algorithm exists to convert any DFA to an RE
  – For every regular language, there exists a minimal DFA
    ● It has the fewest states of all DFAs that accept the language
Finite Automata
● Finite automata transitions:
  – Regex → NFA: Thompson's construction
  – NFA → DFA: subset construction
  – DFA → minimal DFA: Hopcroft's algorithm (minimize)
  – NFA → minimal DFA: Brzozowski's algorithm (direct to minimal DFA)
  – DFA → Regex: Kleene's construction
  – DFA → Lexer: lexer generators

(In the original diagram, dashed lines indicate transitions to a minimized DFA.)


Finite Automata Conversions
● RE to NFA: Thompson's construction
  – Core insight: inductively build up NFA using “templates”
  – Core concept: use null transitions to build NFA quickly
● NFA to DFA: Subset construction
  – Core insight: DFA nodes represent subsets of NFA nodes
  – Core concept: use null closure to calculate subsets
● DFA minimization: Hopcroft’s algorithm
  – Core insight: create partitions, then keep splitting
● DFA to RE: Kleene's construction
  – Core insight: repeatedly eliminate states by combining regexes
Thompson's Construction
● Basic idea: create NFA inductively, bottom-up
  – Base case:
    ● Start with individual alphabet symbols (see below)
  – Inductive case:
    ● Combine by adding new states and null/epsilon transitions
    ● Templates for the three basic operations
  – Invariant:
    ● The NFA always has exactly one start state and one accepting state

[Figures: Thompson's templates for the base case (a single transition on a), concatenation (AB), union (A|B), and closure (A*), joined with ε-transitions.]
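
The templates can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the names NFA, symbol, concat, union, closure, and EPS are hypothetical, and concatenation is assumed to join the two machines with a single ε-edge. Each function preserves the invariant of one start state and one accepting state.

```python
from dataclasses import dataclass, field
from itertools import count

EPS = None        # label used for null (epsilon) transitions in this sketch
_ids = count()    # source of fresh state numbers

@dataclass
class NFA:
    start: int
    accept: int
    edges: list = field(default_factory=list)  # (source, label, target) triples

def symbol(c):
    """Base case: two states joined by a single transition on c."""
    s, a = next(_ids), next(_ids)
    return NFA(s, a, [(s, c, a)])

def concat(x, y):
    """AB: link A's accepting state to B's start state with an epsilon edge."""
    return NFA(x.start, y.accept, x.edges + y.edges + [(x.accept, EPS, y.start)])

def union(x, y):
    """A|B: a new start branches into A and B; both rejoin at a new accept."""
    s, a = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, y.start),
             (x.accept, EPS, a), (y.accept, EPS, a)]
    return NFA(s, a, x.edges + y.edges + extra)

def closure(x):
    """A*: the new start/accept pair allows skipping A or repeating it."""
    s, a = next(_ids), next(_ids)
    extra = [(s, EPS, x.start), (s, EPS, a),
             (x.accept, EPS, x.start), (x.accept, EPS, a)]
    return NFA(s, a, x.edges + extra)

# a(b|c)* built bottom-up from the three symbols
nfa = concat(symbol("a"), closure(union(symbol("b"), symbol("c"))))
```

Every operator adds at most two states and four edges, which is why the resulting NFA grows linearly with the regex.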
Subset construction
● Basic idea: create DFA incrementally
  – Each DFA state represents a subset of NFA states
  – Use null closure operation to “collapse” null/epsilon transitions
  – Null closure: all states reachable via epsilon transitions
    ● Essentially: where can we go “for free?”
    ● Formally: ε-closure(s) = {s} ∪ { t ∈ S | t is reachable from s via one or more ε-transitions in δ }
  – Simulates running all possible paths through the NFA

Null closure of A = { A }
Null closure of B = { B, D }
Null closure of C = { C, D }
Null closure of D = { D }
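
A null closure can be computed with a simple graph search. The sketch below assumes ε-edges are stored in a dict named eps (a hypothetical representation), and the edges B → D and C → D are inferred from the closures listed above rather than taken from the original figure.

```python
def epsilon_closure(states, eps):
    """Return every state reachable from `states` using only epsilon moves.

    `eps` maps a state to the set of states one epsilon transition away.
    """
    closure = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:      # follow chains of epsilon edges transitively
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

# Epsilon edges assumed for illustration: B -> D and C -> D.
eps = {"B": {"D"}, "C": {"D"}}
print(sorted(epsilon_closure({"B"}, eps)))  # ['B', 'D']
```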
Formal Algorithm
SubsetConstruction(S, Σ, s0, SA, δ):
    t0 := ε-closure(s0)
    S' := { t0 }
    S'A := { t0 } if any state in t0 is also in SA, else ∅
    W := { t0 }
    while W ≠ ∅:
        choose u in W and remove it from W
        for each c in Σ:
            t := ε-closure(δ(u,c))
            δ'(u,c) := t
            if t is not in S' then
                add t to S' and W
                add t to S'A if any state in t is also in SA
    return (S', Σ, t0, S'A, δ')
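
One possible Python rendering of the worklist loop is sketched below. The data layout is an assumption: delta maps (state, symbol) to a set of NFA states and eps holds the ε-edges. Unlike the pseudocode, this sketch leaves a transition undefined instead of creating an empty-set "dead" state.

```python
from collections import deque

def eps_closure(states, eps):
    closure, stack = set(states), list(states)
    while stack:
        for t in eps.get(stack.pop(), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(alphabet, start, accepting, delta, eps):
    """Return (dfa_states, dfa_start, dfa_accepting, dfa_delta)."""
    t0 = eps_closure({start}, eps)
    dfa_states, dfa_delta = {t0}, {}
    work = deque([t0])
    while work:
        u = work.popleft()
        for c in alphabet:
            moved = set()
            for s in u:                      # delta applied to every NFA state in u
                moved |= delta.get((s, c), set())
            if not moved:
                continue                     # deviation: no dead state is added
            t = eps_closure(moved, eps)
            dfa_delta[(u, c)] = t
            if t not in dfa_states:
                dfa_states.add(t)
                work.append(t)
    dfa_accepting = {t for t in dfa_states if t & set(accepting)}
    return dfa_states, t0, dfa_accepting, dfa_delta

# Hypothetical NFA: A -a-> B, A -b-> C, B -eps-> D, C -eps-> D, D accepting.
delta = {("A", "a"): {"B"}, ("A", "b"): {"C"}}
eps = {"B": {"D"}, "C": {"D"}}
states, start, accepting, trans = subset_construction("ab", "A", {"D"}, delta, eps)
```

For this input the DFA has the three states {A}, {B,D}, and {C,D}.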
Subset Example
[Figure: an example DFA produced by subset construction, in which {A} moves to {B,D} on a and to {C,D} on b.]
[Figure: a second example whose DFA has the states {A,E}, {B,D,E}, {C,D}, and {E}, with transitions on a and b.]
Algorithms
● Subset construction is a fixed-point algorithm
  – Textbook: “Iterated application of a monotone function”
  – Basically: a loop that is mathematically guaranteed to terminate, because an n-state NFA has at most 2^n subsets and each one enters the worklist at most once
  – When it terminates, some desirable property holds
    ● In the case of subset construction: the NFA has been converted to a DFA!
Hopcroft’s DFA Minimization
● Split into two partitions (final & non-final)
● Keep splitting a partition while there are states with differing behaviors
  – Two states transition to differing partitions on the same symbol
  – Or one state transitions on a symbol and another doesn’t
● When done, each partition becomes a single state

[Figure: {A} goes to {B,D} on a and to {C,D} on b. {B,D} and {C,D} have the same behavior, so they collapse into one state {B,C,D}, reached from {A} on a,b.]
[Figure: a variant in which {B,D} and {C,D} behave differently on a, so the partition is split and the DFA keeps {A}, {B,D}, and {C,D} as separate states.]
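
The split-until-stable idea can be sketched as a plain refinement loop. Note that this is the simple quadratic version of partition refinement, not Hopcroft's worklist-optimized algorithm, and the function name minimize and the dict-based delta are assumptions of the sketch.

```python
def minimize(states, alphabet, delta, accepting):
    """Partition refinement: split groups until all members behave alike.

    delta maps (state, symbol) -> state; a missing key means "no transition".
    Returns the final groups; each group becomes one state of the minimal DFA.
    """
    accepting = set(accepting)
    partition = [g for g in (accepting, set(states) - accepting) if g]
    while True:
        def group_of(s):
            for i, g in enumerate(partition):
                if s in g:
                    return i
            return None  # reached when s is None, i.e. no transition exists

        refined = []
        for g in partition:
            buckets = {}
            for s in g:
                # Signature: the group each symbol leads to (None if undefined).
                sig = tuple(group_of(delta.get((s, c))) for c in alphabet)
                buckets.setdefault(sig, set()).add(s)
            refined.extend(buckets.values())
        if len(refined) == len(partition):   # nothing was split: fixed point
            return [frozenset(g) for g in refined]
        partition = refined

names = ["A", "BD", "CD"]
same = {("A", "a"): "BD", ("A", "b"): "CD"}   # BD and CD behave alike
differ = dict(same)
differ[("BD", "a")] = "BD"                    # BD now moves on a, CD does not
collapsed = minimize(names, "ab", same, {"BD", "CD"})
split = minimize(names, "ab", differ, {"BD", "CD"})
```

In the first case BD and CD end up in one group; in the second they stay apart, matching the two figures.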
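Kleene's Construction
● Replace edge labels with REs
  – "a" → "a" and "a,b" → "a|b"
● Eliminate states by combining REs
  – See pattern below; apply pairwise around each state to be eliminated
  – Repeat until only one or two states remain
● Build final RE
  – One state with "A" self-loop → "A*"
  – Two states: see pattern below

Eliminating states: an incoming edge A, a self-loop B on the eliminated state, an outgoing edge C, and a direct edge D combine into AB*C|D
Combining final two states: a self-loop A on the start state, an edge B to the final state, a self-loop C on the final state, and an edge D back to the start state combine into A*B(C|DA*B)*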
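
The elimination pattern AB*C|D can be applied mechanically. The sketch below assumes edges are stored in a dict keyed by (source, target) with regex strings as labels; the helper names grp and eliminate are hypothetical, and the parenthesization rule is deliberately crude.

```python
def grp(r):
    """Parenthesize anything longer than one character to protect precedence."""
    return r if len(r) == 1 else f"({r})"

def eliminate(edges, state):
    """Remove `state` from {(src, dst): regex}, rerouting paths through it."""
    edges = dict(edges)                       # work on a copy
    loop = edges.pop((state, state), None)    # B in the pattern, if present
    star = f"{grp(loop)}*" if loop else ""
    ins = {s: r for (s, d), r in edges.items() if d == state}    # A edges
    outs = {d: r for (s, d), r in edges.items() if s == state}   # C edges
    result = {k: r for k, r in edges.items() if state not in k}
    for s, a in ins.items():
        for d, c in outs.items():
            path = f"{grp(a)}{star}{grp(c)}"          # A B* C
            if (s, d) in result:
                path = f"{path}|{result[(s, d)]}"     # ... | D
            result[(s, d)] = path
    return result

edges = {("q0", "q1"): "a", ("q1", "q1"): "b", ("q1", "q2"): "c", ("q0", "q2"): "d"}
print(eliminate(edges, "q1"))  # {('q0', 'q2'): 'ab*c|d'}
```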
Brzozowski’s Algorithm
● Direct NFA → minimal DFA conversion
● Sub-procedures:
  – Reverse(n): invert all transitions in NFA n, adding a new start state connected to all old final states and making the old start state the final state
  – Subset(n): apply subset construction to NFA n
  – Reach(n): remove any part of NFA n unreachable from start state
● Apply them all in order twice to get minimal DFA
  – First time eliminates duplicate suffixes
  – Second time eliminates duplicate prefixes
  – MinDFA(n) = Reach(Subset(Reverse(Reach(Subset(Reverse(n))))))
  – Potentially easier to code than Hopcroft’s algorithm
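
A compact sketch of the double reversal follows. It makes several simplifying assumptions: the automaton has no ε-edges, an NFA is a tuple (starts, accepts, delta) that may have several start states (so Reverse needs no new start state), and Subset only ever builds reachable subsets, which makes Reach implicit. No dead state is generated.

```python
def reverse(nfa):
    """Flip every edge and swap the roles of start and accepting states."""
    starts, accepts, delta = nfa          # delta: {(state, symbol): set of states}
    rev = {}
    for (s, c), targets in delta.items():
        for t in targets:
            rev.setdefault((t, c), set()).add(s)
    return set(accepts), set(starts), rev

def subset(nfa, alphabet):
    """Subset construction; unreachable subsets are never created."""
    starts, accepts, delta = nfa
    t0 = frozenset(starts)
    seen, work, out = {t0}, [t0], {}
    while work:
        u = work.pop()
        for c in alphabet:
            t = frozenset(x for s in u for x in delta.get((s, c), ()))
            if not t:
                continue                  # leave the transition undefined
            out[(u, c)] = {t}
            if t not in seen:
                seen.add(t)
                work.append(t)
    return {t0}, {t for t in seen if t & accepts}, out

def min_dfa(nfa, alphabet):
    return subset(reverse(subset(reverse(nfa), alphabet)), alphabet)

def state_count(dfa):
    starts, _, delta = dfa
    return len(set(starts) | {s for s, _ in delta}
               | {t for ts in delta.values() for t in ts})

# DFA with two equivalent accepting states BD and CD (no self-loops).
dfa = ({"A"}, {"BD", "CD"}, {("A", "a"): {"BD"}, ("A", "b"): {"CD"}})
print(state_count(min_dfa(dfa, "ab")))  # 2
```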
[Figure: worked example of MinDFA, from EAC (p. 76)]
NFA/DFA complexity
● What are the time and space requirements to...
  – Build an NFA?
  – Run an NFA?
  – Build a DFA?
  – Run a DFA?

[Figure: automata for aa*|b, with DFA states {A}, {B,D}, and {C,D}.]
NFA/DFA complexity
● Thompson's construction
  – At most two new states and four transitions per regex character
  – Thus, a linear space increase with respect to the # of regex characters
  – Constant # of operations per increase means linear time as well
● NFA execution
  – Proportional to both NFA size and input string size
  – Must track multiple simultaneous “current” states
● Subset construction
  – Potential exponential state space explosion
  – An n-state NFA could require up to 2^n DFA states
  – However, this rarely happens in practice
● DFA execution
  – Proportional to input string size only (only track a single “current” state)
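
The difference in run-time cost shows up directly in code. In this sketch (function names and state labels are hypothetical, and both automata for aa*|b are hand-built for illustration), the DFA loop does one lookup per character while the NFA loop carries a whole set of current states.

```python
def run_dfa(delta, start, accepting, text):
    """One table lookup per input character: O(n)."""
    state = start
    for ch in text:
        state = delta.get((state, ch))
        if state is None:                 # no transition: reject
            return False
    return state in accepting

def run_nfa(delta, eps, start, accepting, text):
    """Tracks a set of current states, so each step can cost O(NFA size)."""
    def close(states):
        seen, stack = set(states), list(states)
        while stack:
            for t in eps.get(stack.pop(), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = close({start})
    for ch in text:
        current = close({t for s in current for t in delta.get((s, ch), ())})
    return bool(current & set(accepting))

# Hand-built automata for aa*|b (state names are arbitrary).
nfa_delta = {(1, "a"): {2}, (2, "a"): {2}, (3, "b"): {4}}
nfa_eps = {0: {1, 3}}
dfa_delta = {("A", "a"): "BD", ("BD", "a"): "BD", ("A", "b"): "CD"}

print(run_nfa(nfa_delta, nfa_eps, 0, {2, 4}, "aaa"))  # True
print(run_dfa(dfa_delta, "A", {"BD", "CD"}, "ab"))    # False
```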
NFA/DFA complexity
● NFAs build quicker (linear) but run slower
  – Better if you will only run the FA a few times
  – Or if you need features that are difficult to implement with DFAs
● DFAs build slower but run faster (linear)
  – Better if you will run the FA many times

             NFA       DFA
Build time   O(m)      O(2^m)
Run time     O(m×n)    O(n)

m = length of regular expression
n = length of input string
Lexers
● Auto-generated
  – Table-driven: generic scanner, auto-generated tables
  – Direct-coded: hard-code transitions using jumps
  – Common tools: lex/flex and similar
● Hand-coded
  – Better I/O performance (i.e., buffering)
  – More efficient interfacing w/ other phases
  – This is what we’ll do for P2
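
To make "table-driven" concrete, here is a small sketch of a generic scanner loop driven by a transition table. The table, the character classes, and the two token kinds are made up for illustration; a generator such as flex would emit much larger tables from the regexes.

```python
def char_class(ch):
    """Collapse characters into classes so the table stays small."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "letter"
    return "other"

# Hypothetical DFA: integers and identifiers only.
TABLE = {
    ("start", "digit"): "int",
    ("int", "digit"): "int",
    ("start", "letter"): "id",
    ("id", "letter"): "id",
    ("id", "digit"): "id",
}
ACCEPTING = {"int": "NUM", "id": "ID"}

def next_token(text, pos):
    """Longest-match scan from pos; returns (kind, lexeme, new_pos) or None."""
    state, last = "start", None
    i = pos
    while i < len(text):
        state = TABLE.get((state, char_class(text[i])))
        if state is None:                  # no transition: stop scanning
            break
        i += 1
        if state in ACCEPTING:             # remember the longest match so far
            last = (ACCEPTING[state], text[pos:i], i)
    return last

print(next_token("42+x", 0))  # ('NUM', '42', 2)
```

The scanner code never changes; only TABLE and ACCEPTING depend on the language being scanned.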
Handling Keywords
● Issue: keywords are valid identifiers
● Option 1: Embed into NFA/DFA
  – Separate regex for keywords
  – Easier/faster for generated scanners
● Option 2: Use lookup table
  – Scan as identifier then check for a keyword
  – Easier for hand-coded scanners
  – (Thus, this is probably easier for P2)
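
Option 2 might look like the following sketch. The keyword set, the identifier regex, and the name scan_word are assumptions chosen for illustration, not the P2 specification.

```python
import re

KEYWORDS = {"if", "else", "while", "return"}      # hypothetical keyword set
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")     # assumed identifier syntax

def scan_word(text, pos):
    """Scan an identifier at pos, then reclassify it if it is a keyword."""
    m = IDENT.match(text, pos)
    if m is None:
        return None
    word = m.group()
    kind = "KEYWORD" if word in KEYWORDS else "ID"
    return kind, word, m.end()

print(scan_word("while x", 0))  # ('KEYWORD', 'while', 5)
print(scan_word("whilex", 0))   # ('ID', 'whilex', 6)
```

Because the whole identifier is scanned first, a name such as whilex is not mistaken for the keyword while.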
Handling Whitespace
● Issue: whitespace is usually ignored
  – Write a regex and remove it before each new token
● Side effect: some results are counterintuitive
  – Is this a valid token? “3abc”
  – For now, it’s actually two tokens (“3” and “abc”)!
  – We’ll reject them later, in the parsing phase
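
The “3abc” behavior is easy to reproduce with a small regex-based tokenizer. This is a sketch under assumed token definitions (WS, NUM, ID); the real P2 token set may differ.

```python
import re

# One alternative per token class; whitespace is matched and then discarded.
TOKEN = re.compile(r"(?P<WS>\s+)|(?P<NUM>[0-9]+)|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)")

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":            # skip whitespace between tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(tokenize("x  3abc"))  # [('ID', 'x'), ('NUM', '3'), ('ID', 'abc')]
```

The lexer happily splits “3abc” into a number followed by an identifier; deciding that this sequence is illegal is left to the parser.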
