Compiled by : Subhayu
UNIT 1 – Finite State Machines and
Regular Languages
Chapter 1: Introduction to Formal Language & Basic
Terminology
Why should you care about this topic?
Imagine you're building a spell checker, a chatbot, or even an AI model that
understands human commands. All of these depend on one common thing —
language. But not just English or Hindi — a more abstract version of language. A
"formal" language that machines can understand, define, and operate on.
This unit lays the foundation of how we design, process, and analyze such languages.
If you understand this well, you're unlocking a powerful door into compiler design,
automata theory, natural language processing, and even AI.
Let’s start from scratch
1. Alphabet (Σ)
An alphabet is a finite set of symbols.
Think of it as the "building blocks" of any language.
● For English:
Alphabet = {A, B, C, ..., Z}
● For binary computers:
Alphabet = {0, 1}
● For DNA sequence processing:
Alphabet = {A, T, G, C}
We usually denote the alphabet as Σ (capital sigma).
Key idea: The alphabet is just a set — there is no order or meaning yet. It’s the raw
material.
2. String
A string is a finite sequence of symbols from an alphabet.
● If Σ = {a, b}, then "ab", "aaa", and "bab" are all valid strings.
● Even an empty string is allowed, denoted as ε (epsilon).
Think of it like this:
Alphabet = letters
String = words made from letters
3. Length of a String
The number of symbols in a string.
● Length of "aba" is 3
● Length of ε is 0
Notation: |w| = length of string w
E.g., if w = “abbab”, then |w| = 5
4. Formal Language
This is the set of all valid strings over a given alphabet, based on certain rules.
A language is just a set of strings. That's it.
● Example 1: L = {ε, a, aa, aaa, aaaa, ...}
→ This language contains all strings with only 'a's.
● Example 2: L = {w ∈ {0,1}* | w ends with 1}
→ All binary strings that end in 1, like "1", "01", "101", etc.
So, a formal language defines what strings are allowed and what are not.
Real-Life Analogy
Let’s say you want to create a password checker:
● Your alphabet is:
Σ = {a-z, A-Z, 0-9, @, #, !}
● A valid password (string) must:
→ Have at least 6 characters
→ Contain at least one special character
Your password rules = formal language
All strings that follow these rules = members of your language
You’re already using formal languages — you just didn’t know it!
5. Operations on Formal Languages (Preview)
Once we define languages, we can perform operations like:
● Union (L1 ∪ L2) – combine two languages
● Concatenation (L1L2) – join strings from two languages
● Kleene Star (L*) – repeat strings in a language any number of times
We’ll explore these in detail soon.
Answer the following :
1. What is an alphabet in the context of formal language?
2. Give an example of a string of length 3 over Σ = {x, y}.
3. What does ε represent?
4. Is the set {a, ab, ba, bbb} a language over Σ = {a, b}?
5. If Σ = {0, 1}, list 3 strings that belong to the language of binary numbers ending
in 0.
6. Can a formal language be infinite? Give an example.
7. Write a real-life system where rules over strings define acceptance.
Imagine this with me:
You're teaching a robot how to understand language. But it’s not a smart AI like
ChatGPT — it’s a very basic robot that can only follow rules.
How do we write those rules?
How do we describe the patterns we want it to understand?
That’s where formal languages come into play.
1.1 What is a Formal Language?
A Formal Language is a set of strings formed from a finite set of symbols, based on
specific rules.
Let’s break that down.
1.2 Key Terminology
Term Meaning
Alphabet (Σ) A finite set of symbols. Think of it as letters you’re allowed to use.
Example: Σ = {0, 1}
String A finite sequence of symbols from an alphabet. Ex: 0101, 111
Empty String A string with no characters. It still exists — like an empty folder.
(ε)
Language A set of strings made from symbols in Σ. Ex: L = {ε, 0, 1, 01, 10}
Length of Number of symbols in the string. Ex:
string
Concatenation Joining strings. If x = “01” and y = “1”, then xy = “011”
Kleene Star All possible strings (including ε) you can make from Σ. Ex: if Σ =
(Σ*) {a, b}, Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …}
1.3 Real-Life Analogy
Let’s say you're building a ticket scanner for a concert.
You define a rule: "A valid ticket number must start with a 'C', followed by three
digits."
Here:
● Alphabet Σ = {C, 0, 1, 2, ..., 9}
● A string like “C123” is valid
● “123C” is invalid
You’ve just defined a formal language — the set of all valid ticket numbers!
1.4 Operations on Formal Languages
Here’s how we can manipulate languages:
1. Union (L1 ∪ L2)
Combines two languages.
If L1 = {a, b}, L2 = {c}, then L1 ∪ L2 = {a, b, c}
2. Concatenation (L1L2)
All combinations by picking one from L1 and one from L2.
If L1 = {a}, L2 = {b, c}, then L1L2 = {ab, ac}
3. Kleene Star (L*)
All strings you can form using zero or more repetitions of elements of L.
If L = {a}, L* = {ε, a, aa, aaa, …}
4. Intersection (L1 ∩ L2)
Strings common in both.
If L1 = {a, ab}, L2 = {ab, bc}, L1 ∩ L2 = {ab}
5. Complement (L ̄ )
All strings not in the language, assuming a fixed Σ*.
If Σ = {0,1}, and L = {0,1}, then ̄ L = {ε, 00, 01, 10, 11, 000, …}
Let’s Pause and Reflect
You now know what a formal language is and how to combine and manipulate them.
This is the foundation for things like regular expressions, compilers,
pattern-matching tools, and even AI interpreters.
Answer the following :
1. Define an alphabet with your own example.
2. What is the length of the string “11010”?
3. Give 2 examples of strings over Σ = {x, y}.
4. Explain the difference between a string and a language.
5. What is ε and how is it different from a null value in programming?
6. Describe Kleene Star in your own words with an example.
7. If L1 = {a, b} and L2 = {b, c}, find L1 ∩ L2 and L1 ∪ L2.
Answers :
1. Define an alphabet with your own example.
An alphabet in formal language theory is a finite, non-empty set of symbols.
Example:
Let’s define an alphabet:
Σ = {0, 1, #}
This alphabet consists of the symbols 0, 1, and #, and could be used to represent
binary strings with separators.
2. What is the length of the string “11010”?
The length of a string is the number of symbols it contains.
For the string “11010”, the number of symbols is 5.
Answer: Length = 5
3. Give 2 examples of strings over Σ = {x, y}.
A string is just a sequence of symbols from the given alphabet.
Given Σ = {x, y}:
Two example strings are:
→ xyx
→ yy
(Note: You can repeat symbols, mix them in any order, or even have an empty string.)
4. Explain the difference between a string and a language.
● A string is a single sequence of symbols from an alphabet.
● A language is a set of such strings.
Example:
→ String: xyx
→ Language: L = {x, yx, xyx, xyy}
So, a language is like a collection of valid strings that follow specific rules or patterns.
5. What is ε and how is it different from a null value in programming?
● ε (epsilon) represents the empty string, i.e., a string with zero symbols. It’s
still a string — just one that contains nothing.
● A null value in programming often means “undefined”, “not assigned”, or “no
value at all”, depending on the language.
Key Difference:
→ ε is a valid and defined element in formal languages.
→ null is more about the absence of a value or reference in code.
6. Describe Kleene Star in your own words with an example.
The Kleene Star ( * ) is an operation that means “repeat zero or more times”.
If you apply Kleene Star to a symbol or language, it gives all possible combinations of
strings made by repeating that symbol/language — including the empty string (ε).
Example:
If Σ = {a}, then:
Σ* = {ε, a, aa, aaa, aaaa, …}
(you can have 0 or more a's)
If L = {ab}, then:
L* = {ε, ab, abab, ababab, …}
7. If L₁ = {a, b} and L₂ = {b, c}, find L₁ ∩ L₂ and L₁ ∪ L₂.
● Intersection ( ∩ ) = common elements in both sets
● Union ( ∪ ) = all unique elements from both sets
Given:
L₁ = {a, b}
L₂ = {b, c}
L₁ ∩ L₂ = {b}
L₁ ∪ L₂ = {a, b, c}
1.5 Why Formal Languages?
You might wonder — we already have programming languages, right?
Well, yes, but programming languages are built on formal languages. Every compiler
checks your code using a set of rules, defined in a formal grammar. Understanding
this helps you:
● Detect errors better
● Write more efficient code
● Design better logic and automata
1.6 Types of Languages (based on Grammar)
As we go ahead in TOC, we classify formal languages into Chomsky’s Hierarchy
(covered later), but for now remember:
● Regular Languages — most basic, can be recognized by finite automata
● Context-Free Languages — can be recognized by pushdown automata
● Context-Sensitive and Recursively Enumerable — more powerful, more
complex
Right now, we’ll stick to Regular Languages.
Real-Life Example 2:
An ATM password system that only allows:
● 4-digit numbers
● Starting with 1 or 2
This is a language, and can be recognized by a machine (finite automaton) we'll build
soon.
Answer the following :
1. What is the difference between Σ* and a language L?
2. Can a formal language be infinite? Give an example.
3. How many strings of length 2 can be formed over Σ = {0,1}?
4. Write a language of all strings that start and end with the same symbol over Σ
= {a, b}.
5. Define complement of a language with an example.
6. What does “ε ∈ L” mean?
7. List 2 real-world systems where formal languages are used implicitly.
Answers :
1. What is the difference between Σ* and a language L?
Σ* (read as “sigma star”) is the set of all possible strings (including the empty string
ε) that can be formed using symbols from the alphabet Σ.
A language L is just a subset of Σ*, i.e., it contains only some of those strings based
on specific rules or patterns.
Example:
Let Σ = {a, b}
Then Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …} (all possible combinations)
L = {ab, aab, aaab} ⊆ Σ* (a specific language)
2. Can a formal language be infinite? Give an example.
Yes, a formal language can definitely be infinite.
Example:
L = {aⁿ | n ≥ 1} = {a, aa, aaa, aaaa, …}
This language contains all strings with one or more ‘a’s, and it never ends.
3. How many strings of length 2 can be formed over Σ = {0,1}?
Each position in the string can be either 0 or 1.
For length 2:
→ First position: 2 choices (0 or 1)
→ Second position: 2 choices (0 or 1)
Total strings = 2 × 2 = 4
Strings: {00, 01, 10, 11}
4. Write a language of all strings that start and end with the same
symbol over Σ = {a, b}.
We want strings that start and end with the same character.
Language L = {a, b, aa, bb, aba, bab, aabaa, …}
Formally:
L = {w ∈ Σ* | first and last symbol of w are the same}
5. Define complement of a language with an example.
The complement of a language L over an alphabet Σ is the set of all strings in Σ* that
are not in L.
Example:
Let Σ = {a, b}
Let L = {ab, ba}
Then, L̅ (complement of L) = Σ* – L = all strings over {a, b} except "ab" and "ba".
6. What does “ε ∈ L” mean?
It means that the empty string ε is a member of the language L.
This tells us that L accepts a string of zero length (i.e., no symbols at all).
7. List 2 real-world systems where formal languages are used
implicitly.
1. Programming Languages:
Every programming language like C, Python, or Java is defined by a formal
grammar that determines what statements are valid.
2. Search Engines / Regex Filters:
When you use search queries or regular expressions, you're using a form of
formal language to match patterns in text.
Chapter 2: Finite State Machines —
Properties, Limitations, DFA, NFA,
and their Equivalence
What is a Finite State Machine (FSM)?
A Finite State Machine is a computational model used to design and analyze systems
that follow a series of defined steps (states) depending on input.
Think of an FSM as a "flowchart with memory."
It reads an input string (symbols one by one) and transitions between
states based on rules until the input ends. If it ends in an accepting state,
it accepts the input.
✦ Real-Life Analogy
Imagine a vending machine.
● States: Waiting for money, Accepting coins, Selecting item, Dispensing item
● Inputs: Coin inserted, Button pressed
● Transitions: If ₹10 is inserted → go to item selection → press A1 → dispense
snack
This whole process can be modeled using an FSM.
FSM – Basic Terms Recap
Before we talk DFA and NFA, let’s lock down the key components every FSM has:
Term Meaning
States (Q) Distinct configurations (nodes or circles) the machine can
be in.
Alphabet (Σ) Set of symbols the machine can read (like {0, 1}).
Transition Function Tells what state to move to on reading a symbol.
(δ)
Start State (q₀) Where the machine begins before any input.
Final States (F) States which, if reached after reading the input, mean the
input is accepted.
Deterministic Finite Automata (DFA)
➤ Definition:
A DFA is a FSM where for each state and input symbol, there is exactly one transition.
In short:
Given a current state and an input, you always know what the next state is.
➤ DFA Properties:
● Deterministic: No ambiguity.
● Every state has transitions defined for all symbols in Σ.
● No use of ε (empty string) transitions.
● Can be easily implemented in hardware/software.
➤ Example DFA:
Let’s design a DFA to accept strings ending in 01 over the alphabet {0,1}.
States: {q0, q1, q2}
Start state: q0
Final state: q2
Transitions:
q0 --0--> q1
q0 --1--> q0
q1 --1--> q2
q1 --0--> q1
q2 --0--> q1
q2 --1--> q0
Try testing the string 1101 — you’ll end up at q2 → accepted.
Quick Check:
Can a DFA be in two states at once?
No. By definition, it’s deterministic, meaning exactly one state at any time.
Limitations of DFA:
● Can't handle nested structures like balanced parentheses.
● Needs exponentially more states for certain patterns.
● Not suited for problems requiring memory of unbounded size.
Non-Deterministic Finite Automata (NFA)
➤ Definition:
An NFA is like a DFA, but:
From one state and one symbol, it can move to multiple possible next
states.
Also, it can:
● Have ε-transitions: move to a new state without consuming any symbol.
● Be in multiple states simultaneously.
➤ Why NFAs are Powerful in Design:
They make it easier to construct automata for complex languages, even though
eventually, we may convert them to DFA for implementation.
➤ NFA vs DFA:
Feature DFA NFA
Transition per input One Zero or more
ε-transitions Not allowed Allowed
Simultaneous states One Many (through
branching)
Implementation Easier Harder
ease
Language Same power (regular Same power (regular
Recognition langs) langs)
Equivalence of DFA and NFA
Even though NFAs seem more powerful (multiple paths, ε-transitions), they
recognize the same class of languages as DFAs — the Regular Languages.
Theorem: For every NFA, there exists an equivalent DFA that accepts the
same language.
➤ Conversion Process:
The standard way to convert an NFA to DFA is the Subset Construction (also called
Powerset Construction). It involves:
1. Creating new DFA states as sets of NFA states.
2. Transitions defined as unions of possible NFA paths.
3. Start state is the ε-closure of NFA start state.
4. Final DFA states are those containing at least one NFA final state.
It can result in up to 2ⁿ states if NFA has n states — that's the price for determinism.
Real-Life Analogy for NFA:
Imagine a person walking in a maze with doors.
● A DFA person can open only one door per turn.
● An NFA person can split into multiple versions of themselves, each trying a
different door.
Eventually, if any version of the person reaches the exit — the input is accepted.
Answer these to check your understanding so far:
1. Define the 5-tuple structure of a DFA.
2. What is the major difference between DFA and NFA?
3. Can DFA have ε-transitions? Why or why not?
4. What does it mean for a string to be “accepted” by an FSM?
5. Is it possible for an NFA to not have any transition for some input symbol from
a state?
6. What’s the practical challenge of converting an NFA to a DFA?
7. Which is easier to design: DFA or NFA? Why?
Answers :
1. Define the 5-tuple structure of a DFA.
A Deterministic Finite Automaton (DFA) is formally defined as a 5-tuple (Q, Σ, δ, q₀,
F), where:
● Q: Finite set of states
● Σ (Sigma): Finite input alphabet
● δ (delta): Transition function δ: Q × Σ → Q
● q₀: Start state, where q₀ ∈ Q
● F: Set of accepting/final states, F ⊆ Q
This structure tells us how the DFA behaves, starting from a state, reading inputs,
and transitioning deterministically.
2. What is the major difference between DFA and NFA?
The major difference lies in determinism:
● DFA: For every state and input symbol, only one transition exists.
● NFA: For a given state and input, multiple transitions may be possible, or even
none. It may also include ε (epsilon) transitions—transitions that happen
without consuming input.
This makes DFA easier to simulate, while NFA is more flexible to design.
3. Can DFA have ε-transitions? Why or why not?
No, a DFA cannot have ε-transitions.
In DFA, the machine must read exactly one symbol at each step. Since ε-transitions
do not consume input, allowing them would break the deterministic nature of the
automaton.
4. What does it mean for a string to be “accepted” by an FSM?
A string is said to be accepted by a Finite State Machine (FSM) if, starting from the
initial state and after consuming the entire input string, the machine ends in a final
(accepting) state.
In formal terms:
If δ*(q₀, w) ∈ F, where w is the input string, then w is accepted.
5. Is it possible for an NFA to not have any transition for some input
symbol from a state?
Yes, in an NFA, it is allowed that for some state q and input symbol a, the transition
function δ(q, a) is undefined or leads to no states.
This is one of the features that distinguishes NFAs from DFAs. In DFA, every state
must have a transition for every symbol in the alphabet.
6. What’s the practical challenge of converting an NFA to a DFA?
The biggest challenge is the potential exponential growth in states.
When converting NFA to DFA using the subset construction method, each state in
the DFA represents a set of NFA states. For an NFA with n states, the equivalent DFA
might have up to 2ⁿ states.
This can lead to state explosion, making it hard to implement for complex NFAs.
7. Which is easier to design: DFA or NFA? Why?
An NFA is easier to design because:
● You can allow multiple transitions for the same input.
● You can skip transitions or use ε-transitions.
● It allows more flexibility in building automata for pattern recognition.
On the other hand, DFAs are more structured but rigid, which makes them harder to
design for complex patterns, although easier to simulate once designed.
Answer the following :
1. Design a DFA to accept all strings over {0,1} that start and end with 1.
2. Construct an NFA that accepts the language: strings ending in either 10 or 01.
3. True/False: Every DFA is an NFA, but not every NFA is a DFA.
4. Why are ε-transitions helpful in NFAs?
5. Give a real-world example where FSM can be used other than vending
machine.
6. Explain, using your own words, why NFAs do not accept more languages than
DFAs.
7. Why might a compiler prefer using DFA over NFA during lexical analysis?
Answers :
1. Design a DFA to accept all strings over {0,1} that start and end with
1.
Alphabet (Σ): {0, 1}
Requirement: String must begin with 1 and end with 1.
We will need to track:
● Whether the first symbol is 1
● And whether the last symbol is 1
States:
● q0: Start state, waiting for the first symbol
● q1: Read 1 as first symbol
● q2: Intermediate state for strings that start with 1 and continue
● q3: Dead state (invalid if string starts with 0)
Transitions:
● From q0:
○ on 1 → go to q1
○ on 0 → go to q3 (dead state)
● From q1:
○ on 0 → go to q2
○ on 1 → stay in q1 (1 is both starting and possibly ending)
● From q2:
○ on 0 → stay in q2
○ on 1 → go back to q1 (as 1 becomes the latest character)
● From q3:
○ on 0 or 1 → stay in q3
Final State:
● q1 is the only accepting state because it guarantees:
○ The string started with 1
○ The last symbol was also 1
2. Construct an NFA that accepts the language: strings ending in either
10 or 01.
Alphabet (Σ): {0, 1}
We only care about the last two symbols.
States:
● q0: Start state
● q1: After reading 0
● q2: After reading 1
● q3: Accepting state for 01
● q4: Accepting state for 10
Transitions:
● From q0:
○ on 0 → go to q1
○ on 1 → go to q2
● From q1:
○ on 1 → go to q3
● From q2:
○ on 0 → go to q4
● From all states, self-loops can be added to allow arbitrary prefix before the 10
or 01 ending.
Accepting States:
● q3 and q4
3. True/False: Every DFA is an NFA, but not every NFA is a DFA.
Answer: True
Every DFA is technically a restricted NFA (with exactly one transition per input
symbol). However, NFAs can have multiple transitions, including ε-moves, which
DFAs do not allow.
4. Why are ε-transitions helpful in NFAs?
ε-transitions allow the automaton to change state without reading any input
symbol. This makes NFAs more flexible and easier to design, especially when
combining smaller automata (like in regex implementation).
5. Give a real-world example where FSM can be used other than a
vending machine.
Example: Elevator Control System
● States: On each floor (1st, 2nd, 3rd, etc.)
● Inputs: Up button, Down button, Floor request
● Transitions: Move up, move down, stay
The elevator system transitions between floors based on input buttons and calls,
much like a Finite State Machine.
6. Explain, using your own words, why NFAs do not accept more
languages than DFAs.
Even though NFAs look more powerful (multiple paths, ε-transitions), they do not
accept more languages. This is because every NFA can be converted into a DFA that
accepts the exact same language. The difference is only in design complexity, not in
expressive power.
7. Why might a compiler prefer using DFA over NFA during lexical
analysis?
Compilers prefer DFAs because:
● DFAs are faster: They do not need to guess or explore multiple paths.
● At every input symbol, a DFA has exactly one next state, making execution
predictable and efficient.
● NFAs may require backtracking or parallel state tracking, which slows down
real-time performance.
So, although NFAs are easier to design conceptually, DFAs are better for
implementation in tools like compilers.
Chapter 3 : Regular Languages and
Regular Expressions
What Are Regular Languages?
A Regular Language is a set of strings (over some alphabet) that can be recognized by
a finite automaton. These languages are the simplest class in the Chomsky hierarchy
and are the basis for pattern matching, lexical analysis in compilers, and text search
tools like grep.
Real-Life Analogy:
Imagine you are scanning a document and want to find if it contains only vowels. If
the rule is simple, like "every word must contain only vowels", a finite machine (a
checker) can read each character one by one and decide if the word is valid or not.
This is a regular language.
What Are Regular Expressions?
A Regular Expression (RegEx) is a symbolic representation used to describe a regular
language.
Syntax (Basic Building Blocks):
Given an alphabet Σ (e.g., {a, b}), we build expressions using:
1. Empty string (ε) – Represents a string with no characters.
2. Single character – Like 'a' or 'b' from Σ.
3. Concatenation – If r1 and r2 are regular expressions, so is r1r2.
4. Union (OR) – If r1 and r2 are regular expressions, so is r1 | r2.
5. Kleene Star – If r is a regular expression, then r* represents zero or more
repetitions.
Example:
Let’s describe the set of strings over {a, b} that contain zero or more a’s followed by a
single b:
Regular Expression: a*b
This matches: b, ab, aaab, etc.
Relationship: Regular Expressions ↔ Regular Languages
Every regular expression denotes a regular language, and every regular language can
be expressed using a regular expression.
This connection is formally proven using Kleene’s Theorem.
Kleene’s Theorem
This is a fundamental result that bridges the gap between automata and regular
expressions.
Kleene’s Theorem states:
A language is regular if and only if there exists a finite automaton that
accepts it, or there exists a regular expression that describes it.
It has two parts:
1. Part 1 (RE → FA): For every regular expression, there exists a finite automaton
that accepts the same language.
2. Part 2 (FA → RE): For every finite automaton, there exists a regular expression
describing its language.
This equivalence allows us to switch between representations depending on what we
are trying to solve.
Properties of Regular Languages
Regular languages have many important closure properties, which means they are
closed under the following operations:
Operation Meaning
Union (L1 ∪ L2) Strings in L1 or L2
Concatenation Strings formed by joining a string from L1
(L1L2) with L2
Kleene Star (L*) Zero or more repetitions of strings in L
Complement (~L) Strings not in L
Intersection (L1 ∩ Strings common to both L1 and L2
L2)
Difference (L1 – L2) Strings in L1 but not in L2
Reversal (L^R) Reverse all strings in the language
Homomorphism Replace each symbol using a fixed mapping
rule
These properties make regular languages very flexible to work with.
Pumping Lemma for Regular Languages
This is a powerful proof technique used to show that a language is NOT regular.
Let’s walk through it step-by-step.
Pumping Lemma Statement:
If L is a regular language, then there exists a number p (pumping length), such that
any string w in L with length at least p can be divided into three parts, w = xyz,
satisfying:
1. |y| ≥ 1 (y is not empty)
2. |xy| ≤ p (the first repetition occurs within the first p characters)
3. For all i ≥ 0, the string xyⁱz is also in L
Why is it useful?
The Pumping Lemma is used to prove that a given language is not regular by
contradiction.
Real-Life Example:
Let’s try to prove that the language L = { aⁿbⁿ | n ≥ 0 } is not regular.
Assume it is regular.
Then by the Pumping Lemma, there exists a p such that for any string w ∈ L with
|w| ≥ p, we can split it as w = xyz, satisfying the conditions.
Take w = a^p b^p (i.e., p a’s followed by p b’s)
According to the lemma:
● y must contain only a’s (since |xy| ≤ p)
● If we pump y, say repeat it twice: x y² z, we get more a’s than b’s
● Resulting string is not of the form aⁿbⁿ → contradiction
Hence, L is not regular
Answer the following questions briefly:
1. Define a regular language with your own example.
2. Write a regular expression for strings that start with a and end with b.
3. State Kleene’s Theorem and explain its significance.
4. Are regular languages closed under intersection? Justify your answer.
5. What is the use of the Pumping Lemma?
6. Construct a regular expression for binary strings with an even number of 0s.
7. Prove whether the language { aⁿbⁿcⁿ | n ≥ 1 } is regular or not.
Answers :
1. Define a regular language with your own example.
A regular language is a type of formal language that can be expressed using a finite
automaton or a regular expression. It contains a finite or countably infinite set of
strings, and has predictable, pattern-based structure.
Example:
L = { w ∈ {0,1}* | w contains an even number of 1s }
This can be represented by a DFA and a regular expression, hence it is regular.
2. Write a regular expression for strings that start with ‘a’ and end with ‘b’.
a(a|b)*b
Explanation:
● Starts with a
● Followed by any combination of a or b (including none)
● Ends with b
3. State Kleene’s Theorem and explain its significance.
Kleene’s Theorem states:
“A language is regular if and only if it can be accepted by a finite automaton.”
This theorem bridges two representations: automata (DFA/NFA) and regular
expressions.
Significance: It proves that regular expressions and finite automata are equally
powerful in expressing regular languages.
4. Are regular languages closed under intersection? Justify your answer.
Yes, regular languages are closed under intersection.
This means that if L₁ and L₂ are regular, then L₁ ∩ L₂ is also regular.
Justification: We can construct a new DFA using the cross-product construction of
the DFAs of L₁ and L₂ that simulates both machines in parallel.
5. What is the use of the Pumping Lemma?
The Pumping Lemma is a tool to prove that a language is not regular.
It states that all regular languages have a property where long enough strings can
be "pumped" (a part repeated) and still remain in the language.
If you can find a string in a language that violates this condition, the language is not
regular.
6. Construct a regular expression for binary strings with an even number of 0s.
Regular expression:
(1*01*01*)*
Explanation:
● Each (1*01*0) pair introduces two 0s, maintaining evenness.
● Any number of such pairs (including 0) will result in even number of 0s.
7. Prove whether the language { aⁿbⁿcⁿ | n ≥ 1 } is regular or not.
This language is not regular.
Reason: It requires counting and matching the number of as, bs, and cs — a task
that finite automata can't perform because they lack memory.
Using Pumping Lemma:
Assume it is regular. Take the string s = aⁿbⁿcⁿ where n is greater than the
pumping length p.
Any pumped version will break the strict order or balance among a, b, and c. Hence,
it violates the Pumping Lemma, and the language is not regular.
For more resources, visit https://subhayu18.github.io/Noteify/