0% found this document useful (0 votes)

42 views30 pages

Flat 23cst-334 Unit 1

Uploaded by

misha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

42 views30 pages

Flat 23cst-334 Unit 1

Uploaded by

misha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 30

Compiled by : Subhayu

UNIT 1 – Finite State Machines and

Regular Languages
Chapter 1: Introduction to Formal Language & Basic
Terminology

Why should you care about this topic?

Imagine you're building a spell checker, a chatbot, or even an AI model that
understands human commands. All of these depend on one common thing —
language. But not just English or Hindi — a more abstract version of language. A
"formal" language that machines can understand, define, and operate on.

This unit lays the foundation of how we design, process, and analyze such languages.
If you understand this well, you're unlocking a powerful door into compiler design,
automata theory, natural language processing, and even AI.
Let’s start from scratch
1. Alphabet (Σ)

An alphabet is a finite set of symbols.

Think of it as the "building blocks" of any language.

● For English:
Alphabet = {A, B, C, ..., Z}

● For binary computers:

Alphabet = {0, 1}

● For DNA sequence processing:

Alphabet = {A, T, G, C}

We usually denote the alphabet as Σ (capital sigma).

Key idea: The alphabet is just a set — there is no order or meaning yet. It’s the raw
material.

2. String

A string is a finite sequence of symbols from an alphabet.

● If Σ = {a, b}, then "ab", "aaa", and "bab" are all valid strings.

● Even an empty string is allowed, denoted as ε (epsilon).

Think of it like this:

Alphabet = letters
String = words made from letters

3. Length of a String

The number of symbols in a string.

● Length of "aba" is 3

● Length of ε is 0
Notation: |w| = length of string w
E.g., if w = “abbab”, then |w| = 5

4. Formal Language

This is the set of all valid strings over a given alphabet, based on certain rules.

A language is just a set of strings. That's it.

● Example 1: L = {ε, a, aa, aaa, aaaa, ...}

→ This language contains all strings with only 'a's.

● Example 2: L = {w ∈ {0,1}* | w ends with 1}

→ All binary strings that end in 1, like "1", "01", "101", etc.

So, a formal language defines what strings are allowed and what are not.

Real-Life Analogy
Let’s say you want to create a password checker:

● Your alphabet is:

Σ = {a-z, A-Z, 0-9, @, #, !}

● A valid password (string) must:

→ Have at least 6 characters
→ Contain at least one special character

Your password rules = formal language

All strings that follow these rules = members of your language

You’re already using formal languages — you just didn’t know it!

5. Operations on Formal Languages (Preview)

Once we define languages, we can perform operations like:

● Union (L1 ∪ L2) – combine two languages

● Concatenation (L1L2) – join strings from two languages

● Kleene Star (L*) – repeat strings in a language any number of times

We’ll explore these in detail soon.

Answer the following :

1. What is an alphabet in the context of formal language?

2. Give an example of a string of length 3 over Σ = {x, y}.

3. What does ε represent?

4. Is the set {a, ab, ba, bbb} a language over Σ = {a, b}?

5. If Σ = {0, 1}, list 3 strings that belong to the language of binary numbers ending
in 0.

6. Can a formal language be infinite? Give an example.

7. Write a real-life system where rules over strings define acceptance.

Imagine this with me:

You're teaching a robot how to understand language. But it’s not a smart AI like
ChatGPT — it’s a very basic robot that can only follow rules.

How do we write those rules?

How do we describe the patterns we want it to understand?
That’s where formal languages come into play.

1.1 What is a Formal Language?

A Formal Language is a set of strings formed from a finite set of symbols, based on
specific rules.

Let’s break that down.

1.2 Key Terminology

Term Meaning

Alphabet (Σ) A finite set of symbols. Think of it as letters you’re allowed to use.
Example: Σ = {0, 1}

String A finite sequence of symbols from an alphabet. Ex: 0101, 111

Empty String A string with no characters. It still exists — like an empty folder.
(ε)

Language A set of strings made from symbols in Σ. Ex: L = {ε, 0, 1, 01, 10}

Length of Number of symbols in the string. Ex:

string

Concatenation Joining strings. If x = “01” and y = “1”, then xy = “011”

Kleene Star All possible strings (including ε) you can make from Σ. Ex: if Σ =
(Σ*) {a, b}, Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …}

1.3 Real-Life Analogy

Let’s say you're building a ticket scanner for a concert.

You define a rule: "A valid ticket number must start with a 'C', followed by three
digits."

Here:

● Alphabet Σ = {C, 0, 1, 2, ..., 9}

● A string like “C123” is valid

● “123C” is invalid
You’ve just defined a formal language — the set of all valid ticket numbers!

1.4 Operations on Formal Languages

Here’s how we can manipulate languages:

1. Union (L1 ∪ L2)

Combines two languages.
If L1 = {a, b}, L2 = {c}, then L1 ∪ L2 = {a, b, c}

2. Concatenation (L1L2)

All combinations by picking one from L1 and one from L2.
If L1 = {a}, L2 = {b, c}, then L1L2 = {ab, ac}

3. Kleene Star (L*)

All strings you can form using zero or more repetitions of elements of L.
If L = {a}, L* = {ε, a, aa, aaa, …}

4. Intersection (L1 ∩ L2)

Strings common in both.
If L1 = {a, ab}, L2 = {ab, bc}, L1 ∩ L2 = {ab}

5. Complement (L ̄ )
All strings not in the language, assuming a fixed Σ*.
If Σ = {0,1}, and L = {0,1}, then ̄ L = {ε, 00, 01, 10, 11, 000, …}

Let’s Pause and Reflect

You now know what a formal language is and how to combine and manipulate them.
This is the foundation for things like regular expressions, compilers,
pattern-matching tools, and even AI interpreters.

Answer the following :

1. Define an alphabet with your own example.

2. What is the length of the string “11010”?

3. Give 2 examples of strings over Σ = {x, y}.

4. Explain the difference between a string and a language.

5. What is ε and how is it different from a null value in programming?

6. Describe Kleene Star in your own words with an example.

7. If L1 = {a, b} and L2 = {b, c}, find L1 ∩ L2 and L1 ∪ L2.

Answers :

1. Define an alphabet with your own example.

An alphabet in formal language theory is a finite, non-empty set of symbols.

Example:
Let’s define an alphabet:
Σ = {0, 1, #}
This alphabet consists of the symbols 0, 1, and #, and could be used to represent
binary strings with separators.

2. What is the length of the string “11010”?

The length of a string is the number of symbols it contains.

For the string “11010”, the number of symbols is 5.

Answer: Length = 5

3. Give 2 examples of strings over Σ = {x, y}.

A string is just a sequence of symbols from the given alphabet.

Given Σ = {x, y}:

Two example strings are:

→ xyx
→ yy

(Note: You can repeat symbols, mix them in any order, or even have an empty string.)

4. Explain the difference between a string and a language.

● A string is a single sequence of symbols from an alphabet.

● A language is a set of such strings.

Example:
→ String: xyx
→ Language: L = {x, yx, xyx, xyy}

So, a language is like a collection of valid strings that follow specific rules or patterns.

5. What is ε and how is it different from a null value in programming?

● ε (epsilon) represents the empty string, i.e., a string with zero symbols. It’s
still a string — just one that contains nothing.

● A null value in programming often means “undefined”, “not assigned”, or “no

value at all”, depending on the language.

Key Difference:
→ ε is a valid and defined element in formal languages.
→ null is more about the absence of a value or reference in code.

6. Describe Kleene Star in your own words with an example.

The Kleene Star ( * ) is an operation that means “repeat zero or more times”.

If you apply Kleene Star to a symbol or language, it gives all possible combinations of
strings made by repeating that symbol/language — including the empty string (ε).

Example:
If Σ = {a}, then:
Σ* = {ε, a, aa, aaa, aaaa, …}
(you can have 0 or more a's)

If L = {ab}, then:
L* = {ε, ab, abab, ababab, …}

7. If L₁ = {a, b} and L₂ = {b, c}, find L₁ ∩ L₂ and L₁ ∪ L₂.

● Intersection ( ∩ ) = common elements in both sets

● Union ( ∪ ) = all unique elements from both sets

Given:
L₁ = {a, b}
L₂ = {b, c}

L₁ ∩ L₂ = {b}
L₁ ∪ L₂ = {a, b, c}

1.5 Why Formal Languages?

You might wonder — we already have programming languages, right?

Well, yes, but programming languages are built on formal languages. Every compiler
checks your code using a set of rules, defined in a formal grammar. Understanding
this helps you:

● Detect errors better

● Write more efficient code

● Design better logic and automata

1.6 Types of Languages (based on Grammar)

As we go ahead in TOC, we classify formal languages into Chomsky’s Hierarchy
(covered later), but for now remember:

● Regular Languages — most basic, can be recognized by finite automata

● Context-Free Languages — can be recognized by pushdown automata

● Context-Sensitive and Recursively Enumerable — more powerful, more

complex

Right now, we’ll stick to Regular Languages.

Real-Life Example 2:

An ATM password system that only allows:

● 4-digit numbers
● Starting with 1 or 2

This is a language, and can be recognized by a machine (finite automaton) we'll build
soon.

Answer the following :

1. What is the difference between Σ* and a language L?

2. Can a formal language be infinite? Give an example.

3. How many strings of length 2 can be formed over Σ = {0,1}?

4. Write a language of all strings that start and end with the same symbol over Σ
= {a, b}.

5. Define complement of a language with an example.

6. What does “ε ∈ L” mean?

7. List 2 real-world systems where formal languages are used implicitly.

Answers :

1. What is the difference between Σ* and a language L?

Σ* (read as “sigma star”) is the set of all possible strings (including the empty string
ε) that can be formed using symbols from the alphabet Σ.
A language L is just a subset of Σ*, i.e., it contains only some of those strings based
on specific rules or patterns.

Example:
Let Σ = {a, b}
Then Σ* = {ε, a, b, aa, ab, ba, bb, aaa, …} (all possible combinations)
L = {ab, aab, aaab} ⊆ Σ* (a specific language)

2. Can a formal language be infinite? Give an example.

Yes, a formal language can definitely be infinite.

Example:
L = {aⁿ | n ≥ 1} = {a, aa, aaa, aaaa, …}
This language contains all strings with one or more ‘a’s, and it never ends.

3. How many strings of length 2 can be formed over Σ = {0,1}?

Each position in the string can be either 0 or 1.

For length 2:
→ First position: 2 choices (0 or 1)
→ Second position: 2 choices (0 or 1)

Total strings = 2 × 2 = 4

Strings: {00, 01, 10, 11}

4. Write a language of all strings that start and end with the same
symbol over Σ = {a, b}.

We want strings that start and end with the same character.

Language L = {a, b, aa, bb, aba, bab, aabaa, …}

Formally:
L = {w ∈ Σ* | first and last symbol of w are the same}

5. Define complement of a language with an example.

The complement of a language L over an alphabet Σ is the set of all strings in Σ* that
are not in L.

Example:
Let Σ = {a, b}
Let L = {ab, ba}
Then, L̅ (complement of L) = Σ* – L = all strings over {a, b} except "ab" and "ba".

6. What does “ε ∈ L” mean?

It means that the empty string ε is a member of the language L.

This tells us that L accepts a string of zero length (i.e., no symbols at all).
7. List 2 real-world systems where formal languages are used
implicitly.

1. Programming Languages:

Every programming language like C, Python, or Java is defined by a formal
grammar that determines what statements are valid.

2. Search Engines / Regex Filters:

When you use search queries or regular expressions, you're using a form of
formal language to match patterns in text.
Chapter 2: Finite State Machines —
Properties, Limitations, DFA, NFA,
and their Equivalence

What is a Finite State Machine (FSM)?

A Finite State Machine is a computational model used to design and analyze systems
that follow a series of defined steps (states) depending on input.

Think of an FSM as a "flowchart with memory."

It reads an input string (symbols one by one) and transitions between
states based on rules until the input ends. If it ends in an accepting state,
it accepts the input.

✦ Real-Life Analogy
Imagine a vending machine.

● States: Waiting for money, Accepting coins, Selecting item, Dispensing item

● Inputs: Coin inserted, Button pressed

● Transitions: If ₹10 is inserted → go to item selection → press A1 → dispense

snack

This whole process can be modeled using an FSM.

FSM – Basic Terms Recap

Before we talk DFA and NFA, let’s lock down the key components every FSM has:

Term Meaning
States (Q) Distinct configurations (nodes or circles) the machine can
be in.

Alphabet (Σ) Set of symbols the machine can read (like {0, 1}).

Transition Function Tells what state to move to on reading a symbol.

(δ)

Start State (q₀) Where the machine begins before any input.

Final States (F) States which, if reached after reading the input, mean the
input is accepted.

Deterministic Finite Automata (DFA)

➤ Definition:

A DFA is a FSM where for each state and input symbol, there is exactly one transition.

In short:

Given a current state and an input, you always know what the next state is.

➤ DFA Properties:

● Deterministic: No ambiguity.

● Every state has transitions defined for all symbols in Σ.

● No use of ε (empty string) transitions.

● Can be easily implemented in hardware/software.

➤ Example DFA:
Let’s design a DFA to accept strings ending in 01 over the alphabet {0,1}.

States: {q0, q1, q2}

Start state: q0

Final state: q2

Transitions:

q0 --0--> q1

q0 --1--> q0

q1 --1--> q2

q1 --0--> q1

q2 --0--> q1

q2 --1--> q0

Try testing the string 1101 — you’ll end up at q2 → accepted.

Quick Check:
Can a DFA be in two states at once?
No. By definition, it’s deterministic, meaning exactly one state at any time.

Limitations of DFA:
● Can't handle nested structures like balanced parentheses.

● Needs exponentially more states for certain patterns.

● Not suited for problems requiring memory of unbounded size.

Non-Deterministic Finite Automata (NFA)

➤ Definition:
An NFA is like a DFA, but:

From one state and one symbol, it can move to multiple possible next
states.

Also, it can:

● Have ε-transitions: move to a new state without consuming any symbol.

● Be in multiple states simultaneously.

➤ Why NFAs are Powerful in Design:

They make it easier to construct automata for complex languages, even though
eventually, we may convert them to DFA for implementation.

➤ NFA vs DFA:

Feature DFA NFA

Transition per input One Zero or more

ε-transitions Not allowed Allowed

Simultaneous states One Many (through

branching)

Implementation Easier Harder

ease

Language Same power (regular Same power (regular

Recognition langs) langs)
Equivalence of DFA and NFA
Even though NFAs seem more powerful (multiple paths, ε-transitions), they
recognize the same class of languages as DFAs — the Regular Languages.

Theorem: For every NFA, there exists an equivalent DFA that accepts the
same language.

➤ Conversion Process:

The standard way to convert an NFA to DFA is the Subset Construction (also called
Powerset Construction). It involves:

1. Creating new DFA states as sets of NFA states.

2. Transitions defined as unions of possible NFA paths.

3. Start state is the ε-closure of NFA start state.

4. Final DFA states are those containing at least one NFA final state.

It can result in up to 2ⁿ states if NFA has n states — that's the price for determinism.

Real-Life Analogy for NFA:

Imagine a person walking in a maze with doors.

● A DFA person can open only one door per turn.

● An NFA person can split into multiple versions of themselves, each trying a
different door.

Eventually, if any version of the person reaches the exit — the input is accepted.

Answer these to check your understanding so far:

1. Define the 5-tuple structure of a DFA.

2. What is the major difference between DFA and NFA?

3. Can DFA have ε-transitions? Why or why not?

4. What does it mean for a string to be “accepted” by an FSM?

5. Is it possible for an NFA to not have any transition for some input symbol from
a state?

6. What’s the practical challenge of converting an NFA to a DFA?

7. Which is easier to design: DFA or NFA? Why?

Answers :

1. Define the 5-tuple structure of a DFA.

A Deterministic Finite Automaton (DFA) is formally defined as a 5-tuple (Q, Σ, δ, q₀,

F), where:

● Q: Finite set of states

● Σ (Sigma): Finite input alphabet

● δ (delta): Transition function δ: Q × Σ → Q

● q₀: Start state, where q₀ ∈ Q

● F: Set of accepting/final states, F ⊆ Q

This structure tells us how the DFA behaves, starting from a state, reading inputs,
and transitioning deterministically.

2. What is the major difference between DFA and NFA?

The major difference lies in determinism:

● DFA: For every state and input symbol, only one transition exists.

● NFA: For a given state and input, multiple transitions may be possible, or even
none. It may also include ε (epsilon) transitions—transitions that happen
without consuming input.

This makes DFA easier to simulate, while NFA is more flexible to design.

3. Can DFA have ε-transitions? Why or why not?

No, a DFA cannot have ε-transitions.

In DFA, the machine must read exactly one symbol at each step. Since ε-transitions
do not consume input, allowing them would break the deterministic nature of the
automaton.
4. What does it mean for a string to be “accepted” by an FSM?

A string is said to be accepted by a Finite State Machine (FSM) if, starting from the
initial state and after consuming the entire input string, the machine ends in a final
(accepting) state.

In formal terms:
If δ*(q₀, w) ∈ F, where w is the input string, then w is accepted.

5. Is it possible for an NFA to not have any transition for some input
symbol from a state?

Yes, in an NFA, it is allowed that for some state q and input symbol a, the transition
function δ(q, a) is undefined or leads to no states.

This is one of the features that distinguishes NFAs from DFAs. In DFA, every state
must have a transition for every symbol in the alphabet.

6. What’s the practical challenge of converting an NFA to a DFA?

The biggest challenge is the potential exponential growth in states.

When converting NFA to DFA using the subset construction method, each state in
the DFA represents a set of NFA states. For an NFA with n states, the equivalent DFA
might have up to 2ⁿ states.

This can lead to state explosion, making it hard to implement for complex NFAs.

7. Which is easier to design: DFA or NFA? Why?

An NFA is easier to design because:

● You can allow multiple transitions for the same input.

● You can skip transitions or use ε-transitions.

● It allows more flexibility in building automata for pattern recognition.

On the other hand, DFAs are more structured but rigid, which makes them harder to
design for complex patterns, although easier to simulate once designed.
Answer the following :
1. Design a DFA to accept all strings over {0,1} that start and end with 1.

2. Construct an NFA that accepts the language: strings ending in either 10 or 01.

3. True/False: Every DFA is an NFA, but not every NFA is a DFA.

4. Why are ε-transitions helpful in NFAs?

5. Give a real-world example where FSM can be used other than vending
machine.

6. Explain, using your own words, why NFAs do not accept more languages than
DFAs.

7. Why might a compiler prefer using DFA over NFA during lexical analysis?

Answers :

1. Design a DFA to accept all strings over {0,1} that start and end with
1.

Alphabet (Σ): {0, 1}

Requirement: String must begin with 1 and end with 1.

We will need to track:

● Whether the first symbol is 1

● And whether the last symbol is 1

States:

● q0: Start state, waiting for the first symbol

● q1: Read 1 as first symbol

● q2: Intermediate state for strings that start with 1 and continue

● q3: Dead state (invalid if string starts with 0)

Transitions:
● From q0:

○ on 1 → go to q1

○ on 0 → go to q3 (dead state)

● From q1:

○ on 0 → go to q2

○ on 1 → stay in q1 (1 is both starting and possibly ending)

● From q2:

○ on 0 → stay in q2

○ on 1 → go back to q1 (as 1 becomes the latest character)

● From q3:

○ on 0 or 1 → stay in q3

Final State:

● q1 is the only accepting state because it guarantees:

○ The string started with 1

○ The last symbol was also 1

2. Construct an NFA that accepts the language: strings ending in either

10 or 01.

Alphabet (Σ): {0, 1}

We only care about the last two symbols.

States:

● q0: Start state

● q1: After reading 0

● q2: After reading 1

● q3: Accepting state for 01

● q4: Accepting state for 10

Transitions:

● From q0:

○ on 0 → go to q1

○ on 1 → go to q2

● From q1:

○ on 1 → go to q3

● From q2:

○ on 0 → go to q4

● From all states, self-loops can be added to allow arbitrary prefix before the 10
or 01 ending.

Accepting States:

● q3 and q4

3. True/False: Every DFA is an NFA, but not every NFA is a DFA.

Answer: True

Every DFA is technically a restricted NFA (with exactly one transition per input
symbol). However, NFAs can have multiple transitions, including ε-moves, which
DFAs do not allow.

4. Why are ε-transitions helpful in NFAs?

ε-transitions allow the automaton to change state without reading any input
symbol. This makes NFAs more flexible and easier to design, especially when
combining smaller automata (like in regex implementation).
5. Give a real-world example where FSM can be used other than a
vending machine.

Example: Elevator Control System

● States: On each floor (1st, 2nd, 3rd, etc.)

● Inputs: Up button, Down button, Floor request

● Transitions: Move up, move down, stay

The elevator system transitions between floors based on input buttons and calls,
much like a Finite State Machine.

6. Explain, using your own words, why NFAs do not accept more
languages than DFAs.

Even though NFAs look more powerful (multiple paths, ε-transitions), they do not
accept more languages. This is because every NFA can be converted into a DFA that
accepts the exact same language. The difference is only in design complexity, not in
expressive power.

7. Why might a compiler prefer using DFA over NFA during lexical
analysis?

Compilers prefer DFAs because:

● DFAs are faster: They do not need to guess or explore multiple paths.

● At every input symbol, a DFA has exactly one next state, making execution
predictable and efficient.

● NFAs may require backtracking or parallel state tracking, which slows down
real-time performance.

So, although NFAs are easier to design conceptually, DFAs are better for
implementation in tools like compilers.
Chapter 3 : Regular Languages and
Regular Expressions

What Are Regular Languages?

A Regular Language is a set of strings (over some alphabet) that can be recognized by
a finite automaton. These languages are the simplest class in the Chomsky hierarchy
and are the basis for pattern matching, lexical analysis in compilers, and text search
tools like grep.

Real-Life Analogy:

Imagine you are scanning a document and want to find if it contains only vowels. If
the rule is simple, like "every word must contain only vowels", a finite machine (a
checker) can read each character one by one and decide if the word is valid or not.
This is a regular language.

What Are Regular Expressions?

A Regular Expression (RegEx) is a symbolic representation used to describe a regular

language.

Syntax (Basic Building Blocks):

Given an alphabet Σ (e.g., {a, b}), we build expressions using:

1. Empty string (ε) – Represents a string with no characters.

2. Single character – Like 'a' or 'b' from Σ.

3. Concatenation – If r1 and r2 are regular expressions, so is r1r2.

4. Union (OR) – If r1 and r2 are regular expressions, so is r1 | r2.

5. Kleene Star – If r is a regular expression, then r* represents zero or more

repetitions.

Example:

Let’s describe the set of strings over {a, b} that contain zero or more a’s followed by a
single b:

Regular Expression: a*b

This matches: b, ab, aaab, etc.

Relationship: Regular Expressions ↔ Regular Languages

Every regular expression denotes a regular language, and every regular language can
be expressed using a regular expression.

This connection is formally proven using Kleene’s Theorem.

Kleene’s Theorem
This is a fundamental result that bridges the gap between automata and regular
expressions.

Kleene’s Theorem states:

A language is regular if and only if there exists a finite automaton that

accepts it, or there exists a regular expression that describes it.

It has two parts:

1. Part 1 (RE → FA): For every regular expression, there exists a finite automaton
that accepts the same language.

2. Part 2 (FA → RE): For every finite automaton, there exists a regular expression
describing its language.

This equivalence allows us to switch between representations depending on what we

are trying to solve.

Properties of Regular Languages

Regular languages have many important closure properties, which means they are
closed under the following operations:

Operation Meaning

Union (L1 ∪ L2) Strings in L1 or L2

Concatenation Strings formed by joining a string from L1
(L1L2) with L2

Kleene Star (L*) Zero or more repetitions of strings in L

Complement (~L) Strings not in L

Intersection (L1 ∩ Strings common to both L1 and L2

L2)

Difference (L1 – L2) Strings in L1 but not in L2

Reversal (L^R) Reverse all strings in the language

Homomorphism Replace each symbol using a fixed mapping

rule

These properties make regular languages very flexible to work with.

Pumping Lemma for Regular Languages

This is a powerful proof technique used to show that a language is NOT regular.

Let’s walk through it step-by-step.

Pumping Lemma Statement:

If L is a regular language, then there exists a number p (pumping length), such that
any string w in L with length at least p can be divided into three parts, w = xyz,
satisfying:

1. |y| ≥ 1 (y is not empty)

2. |xy| ≤ p (the first repetition occurs within the first p characters)

3. For all i ≥ 0, the string xyⁱz is also in L

Why is it useful?

The Pumping Lemma is used to prove that a given language is not regular by
contradiction.

Real-Life Example:

Let’s try to prove that the language L = { aⁿbⁿ | n ≥ 0 } is not regular.

Assume it is regular.
Then by the Pumping Lemma, there exists a p such that for any string w ∈ L with
|w| ≥ p, we can split it as w = xyz, satisfying the conditions.

Take w = a^p b^p (i.e., p a’s followed by p b’s)

According to the lemma:

● y must contain only a’s (since |xy| ≤ p)

● If we pump y, say repeat it twice: x y² z, we get more a’s than b’s

● Resulting string is not of the form aⁿbⁿ → contradiction

Hence, L is not regular

Answer the following questions briefly:

1. Define a regular language with your own example.

2. Write a regular expression for strings that start with a and end with b.

3. State Kleene’s Theorem and explain its significance.

4. Are regular languages closed under intersection? Justify your answer.

5. What is the use of the Pumping Lemma?

6. Construct a regular expression for binary strings with an even number of 0s.
7. Prove whether the language { aⁿbⁿcⁿ | n ≥ 1 } is regular or not.

Answers :
1. Define a regular language with your own example.
A regular language is a type of formal language that can be expressed using a finite
automaton or a regular expression. It contains a finite or countably infinite set of
strings, and has predictable, pattern-based structure.

Example:
L = { w ∈ {0,1}* | w contains an even number of 1s }
This can be represented by a DFA and a regular expression, hence it is regular.

2. Write a regular expression for strings that start with ‘a’ and end with ‘b’.
a(a|b)*b
Explanation:

● Starts with a

● Followed by any combination of a or b (including none)

● Ends with b

3. State Kleene’s Theorem and explain its significance.

Kleene’s Theorem states:
“A language is regular if and only if it can be accepted by a finite automaton.”
This theorem bridges two representations: automata (DFA/NFA) and regular
expressions.
Significance: It proves that regular expressions and finite automata are equally
powerful in expressing regular languages.

4. Are regular languages closed under intersection? Justify your answer.

Yes, regular languages are closed under intersection.
This means that if L₁ and L₂ are regular, then L₁ ∩ L₂ is also regular.
Justification: We can construct a new DFA using the cross-product construction of
the DFAs of L₁ and L₂ that simulates both machines in parallel.

5. What is the use of the Pumping Lemma?

The Pumping Lemma is a tool to prove that a language is not regular.
It states that all regular languages have a property where long enough strings can
be "pumped" (a part repeated) and still remain in the language.
If you can find a string in a language that violates this condition, the language is not
regular.

6. Construct a regular expression for binary strings with an even number of 0s.
Regular expression:
(1*01*01*)*
Explanation:

● Each (1010) pair introduces two 0s, maintaining evenness.

● Any number of such pairs (including 0) will result in even number of 0s.

7. Prove whether the language { aⁿbⁿcⁿ | n ≥ 1 } is regular or not.

This language is not regular.
Reason: It requires counting and matching the number of as, bs, and cs — a task
that finite automata can't perform because they lack memory.
Using Pumping Lemma:
Assume it is regular. Take the string s = aⁿbⁿcⁿ where n is greater than the
pumping length p.
Any pumped version will break the strict order or balance among a, b, and c. Hence,
it violates the Pumping Lemma, and the language is not regular.

For more resources, visit https://subhayu18.github.io/Noteify/

Fall Semester 2023-24 CSE1013 TH AP2023242000613 Reference Material I 02-Aug-2023 Module - I Part - I
No ratings yet
Fall Semester 2023-24 CSE1013 TH AP2023242000613 Reference Material I 02-Aug-2023 Module - I Part - I
38 pages
Language Basics for Beginners
No ratings yet
Language Basics for Beginners
29 pages
TOC Lec1
No ratings yet
TOC Lec1
39 pages
Lecture 1 - Chapter 1-Introduction
No ratings yet
Lecture 1 - Chapter 1-Introduction
152 pages
Day 3 - Regexps
No ratings yet
Day 3 - Regexps
52 pages
Lecture 2
No ratings yet
Lecture 2
31 pages
Lecture 1
No ratings yet
Lecture 1
14 pages
Automata Theory: Language Basics
No ratings yet
Automata Theory: Language Basics
50 pages
L1-Basics-1-1 (Regullar Expression)
No ratings yet
L1-Basics-1-1 (Regullar Expression)
12 pages
Theory of Computation: Lecture # 1-2-3
No ratings yet
Theory of Computation: Lecture # 1-2-3
36 pages
Chapter One - Introduction
No ratings yet
Chapter One - Introduction
30 pages
Lecture 1 - Chapter 1-Introduction
No ratings yet
Lecture 1 - Chapter 1-Introduction
152 pages
Automata Theory for CS Students
No ratings yet
Automata Theory for CS Students
62 pages
ToC Notes - Unit 1
No ratings yet
ToC Notes - Unit 1
65 pages
01-Introduction&Languages (1) 112431
No ratings yet
01-Introduction&Languages (1) 112431
21 pages
Theoretical Foundations of CS / Wide Variety of Problems - Frameworks/models To Solve Problems - Analyze Problems & Algorithms
No ratings yet
Theoretical Foundations of CS / Wide Variety of Problems - Frameworks/models To Solve Problems - Analyze Problems & Algorithms
50 pages
Toc Unit 1 Finite Automata
No ratings yet
Toc Unit 1 Finite Automata
132 pages
LN2 - FL1 - Basic Elements of FL - L2
No ratings yet
LN2 - FL1 - Basic Elements of FL - L2
3 pages
Alphabets, Strings, and Languages
No ratings yet
Alphabets, Strings, and Languages
47 pages
LANGUAGES
No ratings yet
LANGUAGES
11 pages
Flat Notes 2
No ratings yet
Flat Notes 2
110 pages
02a Regular Languages
No ratings yet
02a Regular Languages
30 pages
FLAT Complete Notes
No ratings yet
FLAT Complete Notes
110 pages
Flat Notes
No ratings yet
Flat Notes
110 pages
Theory of Automata - Lecture 01
No ratings yet
Theory of Automata - Lecture 01
57 pages
Theory of Automata and Formal Languages
No ratings yet
Theory of Automata and Formal Languages
39 pages
COS 335-Automata Theory and Formal Languages
No ratings yet
COS 335-Automata Theory and Formal Languages
9 pages
TOA Concepts
No ratings yet
TOA Concepts
26 pages
Languages, Grammar and Recognizers
No ratings yet
Languages, Grammar and Recognizers
17 pages
Theory of Automata
No ratings yet
Theory of Automata
12 pages
Lecture 1
No ratings yet
Lecture 1
47 pages
Lecture 1
No ratings yet
Lecture 1
35 pages
Toc Unit 1 Updated 1
No ratings yet
Toc Unit 1 Updated 1
132 pages
Lecture Note
No ratings yet
Lecture Note
32 pages
Automata Theory and Computability
No ratings yet
Automata Theory and Computability
189 pages
Automata Theory & Language Basics
No ratings yet
Automata Theory & Language Basics
116 pages
FLAT Removed
No ratings yet
FLAT Removed
85 pages
Automata Theory Essentials
No ratings yet
Automata Theory Essentials
23 pages
Lecture 02-Preliminaries
No ratings yet
Lecture 02-Preliminaries
35 pages
Toa Handout 2
No ratings yet
Toa Handout 2
41 pages
Lecture 02 Language Definig Method
No ratings yet
Lecture 02 Language Definig Method
17 pages
Lesson 01
No ratings yet
Lesson 01
31 pages
Week 2: (Properties of Formal Language)
No ratings yet
Week 2: (Properties of Formal Language)
45 pages
AUTOMATA Good Book
No ratings yet
AUTOMATA Good Book
60 pages
Automata and Formal Language Theory
No ratings yet
Automata and Formal Language Theory
18 pages
Chapter 1 Introduction
No ratings yet
Chapter 1 Introduction
32 pages
Lesson 01
No ratings yet
Lesson 01
39 pages
Automata
No ratings yet
Automata
17 pages
Welcome To !: Theory of Automata
No ratings yet
Welcome To !: Theory of Automata
39 pages
Lect 02
No ratings yet
Lect 02
34 pages
Welcome To ! Lecture # 1: Theory of Automata
No ratings yet
Welcome To ! Lecture # 1: Theory of Automata
38 pages
Lecture 2 PDF
No ratings yet
Lecture 2 PDF
33 pages
Intro. To Formal Languages
No ratings yet
Intro. To Formal Languages
32 pages
CH 1 - StringsAndLanguages
No ratings yet
CH 1 - StringsAndLanguages
26 pages
Theory of Computer Science - SCJ 3203: Paridah Samsuri Mohd Soperi Mohd Zahid
No ratings yet
Theory of Computer Science - SCJ 3203: Paridah Samsuri Mohd Soperi Mohd Zahid
49 pages
English: Neo Convent Sr. Sec. School, Paschim Vihar, New Delhi - 63. UNIT TEST SYLLABUS - 2024-2025 Class - Iii
No ratings yet
English: Neo Convent Sr. Sec. School, Paschim Vihar, New Delhi - 63. UNIT TEST SYLLABUS - 2024-2025 Class - Iii
8 pages
A Study of English Language Learning Strategies of Elementary School Pupils at SDN Bengkulu
100% (1)
A Study of English Language Learning Strategies of Elementary School Pupils at SDN Bengkulu
66 pages
Lesson Plan
No ratings yet
Lesson Plan
4 pages
Day 06 계시다 (Revision)
No ratings yet
Day 06 계시다 (Revision)
3 pages
Language Pedagogy & Assessment
No ratings yet
Language Pedagogy & Assessment
5 pages
Tinubu Preliminary
No ratings yet
Tinubu Preliminary
12 pages
Lec 0967
No ratings yet
Lec 0967
3 pages
4th/5th Grade Writing Rubrics
No ratings yet
4th/5th Grade Writing Rubrics
5 pages
Direkte en Indirekte Rede Notes and Questions
100% (1)
Direkte en Indirekte Rede Notes and Questions
11 pages
Victor Manuel Murga Quintanilla-Homework V
No ratings yet
Victor Manuel Murga Quintanilla-Homework V
8 pages
CS606 Midterm Solved MCQS by Junaid Malik
No ratings yet
CS606 Midterm Solved MCQS by Junaid Malik
30 pages
Criteria For Research Defense Jun 2023
No ratings yet
Criteria For Research Defense Jun 2023
3 pages
AEF3e L2 Quicktest 12
No ratings yet
AEF3e L2 Quicktest 12
3 pages
Present Tense 'To Be' Usage Guide
No ratings yet
Present Tense 'To Be' Usage Guide
4 pages
Major Families of Words in English Grammar. The Problem of Parts of Speech. The Survey of The Nominal Parts of Speech
No ratings yet
Major Families of Words in English Grammar. The Problem of Parts of Speech. The Survey of The Nominal Parts of Speech
35 pages
Past Prefect
No ratings yet
Past Prefect
6 pages
Daily Lesson Plan TS25 (Week 8 All)
No ratings yet
Daily Lesson Plan TS25 (Week 8 All)
3 pages
Grade 8 English Listening Test
No ratings yet
Grade 8 English Listening Test
4 pages
Mksaad OSAC OpenSourceArabicCorpora EECS10 Rev9
No ratings yet
Mksaad OSAC OpenSourceArabicCorpora EECS10 Rev9
7 pages
Alphanumeric Series
No ratings yet
Alphanumeric Series
27 pages
Common Core Diagnostic Test - 2019-20-1-2
No ratings yet
Common Core Diagnostic Test - 2019-20-1-2
2 pages
Phonics Poetry Pages
100% (4)
Phonics Poetry Pages
64 pages
Final Exam Linguistics
100% (1)
Final Exam Linguistics
5 pages
Action Research Grade 2
No ratings yet
Action Research Grade 2
25 pages
How To Read Texts A Student Guide To Critical Approaches and Skills 2nd Edition Neil Mccaw No Waiting Time
100% (9)
How To Read Texts A Student Guide To Critical Approaches and Skills 2nd Edition Neil Mccaw No Waiting Time
108 pages
2327 A5.1-A2 N12 - ĐỀ
No ratings yet
2327 A5.1-A2 N12 - ĐỀ
4 pages
2015 Old Javanese 1
100% (2)
2015 Old Javanese 1
88 pages
AT Final
No ratings yet
AT Final
4 pages
2021 - Global Englishes in The Second Phase of Teacher Education in Germany
No ratings yet
2021 - Global Englishes in The Second Phase of Teacher Education in Germany
18 pages
Root Words and Prefixes
No ratings yet
Root Words and Prefixes
29 pages

Flat 23cst-334 Unit 1

Uploaded by

Flat 23cst-334 Unit 1

Uploaded by

Compiled by : Subhayu

UNIT 1 – Finite State Machines and

Why should you care about this topic?

An alphabet is a finite set of symbols.

Think of it as the "building blocks" of any language.

●​ For binary computers:​

●​ For DNA sequence processing:​

We usually denote the alphabet as Σ (capital sigma).

A string is a finite sequence of symbols from an alphabet.

●​ Even an empty string is allowed, denoted as ε (epsilon).​

Think of it like this:

The number of symbols in a string.

A language is just a set of strings. That's it.

●​ Example 1: L = {ε, a, aa, aaa, aaaa, ...}​

●​ Example 2: L = {w ∈ {0,1}* | w ends with 1}​

●​ Your alphabet is:​

●​ A valid password (string) must:​

Your password rules = formal language​

5. Operations on Formal Languages (Preview)

Once we define languages, we can perform operations like:

●​ Union (L1 ∪ L2) – combine two languages​

●​ Concatenation (L1L2) – join strings from two languages​

We’ll explore these in detail soon.

Answer the following :

2.​ Give an example of a string of length 3 over Σ = {x, y}.​

3.​ What does ε represent?​

6.​ Can a formal language be infinite? Give an example.​

Imagine this with me:

How do we write those rules?​

1.1 What is a Formal Language?

Let’s break that down.

1.2 Key Terminology

String A finite sequence of symbols from an alphabet. Ex: 0101, 111

Length of Number of symbols in the string. Ex:

Concatenation Joining strings. If x = “01” and y = “1”, then xy = “011”

1.3 Real-Life Analogy

Let’s say you're building a ticket scanner for a concert.

●​ Alphabet Σ = {C, 0, 1, 2, ..., 9}​

●​ A string like “C123” is valid​

1.4 Operations on Formal Languages

Here’s how we can manipulate languages:

1.​ Union (L1 ∪ L2)​

2.​ Concatenation (L1L2)​

3.​ Kleene Star (L*)​

4.​ Intersection (L1 ∩ L2)​

Let’s Pause and Reflect

Answer the following :

2.​ What is the length of the string “11010”?​

3.​ Give 2 examples of strings over Σ = {x, y}.​

5.​ What is ε and how is it different from a null value in programming?​

6.​ Describe Kleene Star in your own words with an example.​

7.​ If L1 = {a, b} and L2 = {b, c}, find L1 ∩ L2 and L1 ∪ L2.​

1. Define an alphabet with your own example.

An alphabet in formal language theory is a finite, non-empty set of symbols.

2. What is the length of the string “11010”?

The length of a string is the number of symbols it contains.

For the string “11010”, the number of symbols is 5.

3. Give 2 examples of strings over Σ = {x, y}.

A string is just a sequence of symbols from the given alphabet.

Given Σ = {x, y}:

Two example strings are:​

4. Explain the difference between a string and a language.

●​ A language is a set of such strings.​

5. What is ε and how is it different from a null value in programming?

●​ A null value in programming often means “undefined”, “not assigned”, or “no

6. Describe Kleene Star in your own words with an example.

7. If L₁ = {a, b} and L₂ = {b, c}, find L₁ ∩ L₂ and L₁ ∪ L₂.

●​ Intersection ( ∩ ) = common elements in both sets​

●​ Union ( ∪ ) = all unique elements from both sets​

1.5 Why Formal Languages?

●​ Detect errors better​

●​ Write more efficient code​

●​ Design better logic and automata​

1.6 Types of Languages (based on Grammar)

●​ Regular Languages — most basic, can be recognized by finite automata​

●​ Context-Free Languages — can be recognized by pushdown automata​

● For binary computers:

● For DNA sequence processing:

● Even an empty string is allowed, denoted as ε (epsilon).

● Example 1: L = {ε, a, aa, aaa, aaaa, ...}

● Example 2: L = {w ∈ {0,1}* | w ends with 1}

● Your alphabet is:

● A valid password (string) must:

Your password rules = formal language

● Union (L1 ∪ L2) – combine two languages

● Concatenation (L1L2) – join strings from two languages

2. Give an example of a string of length 3 over Σ = {x, y}.

3. What does ε represent?

6. Can a formal language be infinite? Give an example.

How do we write those rules?

● Alphabet Σ = {C, 0, 1, 2, ..., 9}

● A string like “C123” is valid

1. Union (L1 ∪ L2)

2. Concatenation (L1L2)

3. Kleene Star (L*)

4. Intersection (L1 ∩ L2)

2. What is the length of the string “11010”?

3. Give 2 examples of strings over Σ = {x, y}.

5. What is ε and how is it different from a null value in programming?

6. Describe Kleene Star in your own words with an example.

7. If L1 = {a, b} and L2 = {b, c}, find L1 ∩ L2 and L1 ∪ L2.

Two example strings are:

● A language is a set of such strings.

● A null value in programming often means “undefined”, “not assigned”, or “no

● Intersection ( ∩ ) = common elements in both sets

● Union ( ∪ ) = all unique elements from both sets

● Detect errors better

● Write more efficient code

● Design better logic and automata

● Regular Languages — most basic, can be recognized by finite automata

● Context-Free Languages — can be recognized by pushdown automata

● Context-Sensitive and Recursively Enumerable — more powerful, more

2. Can a formal language be infinite? Give an example.

3. How many strings of length 2 can be formed over Σ = {0,1}?

5. Define complement of a language with an example.

6. What does “ε ∈ L” mean?

Each position in the string can be either 0 or 1.

It means that the empty string ε is a member of the language L.

1. Programming Languages:

2. Search Engines / Regex Filters:

Think of an FSM as a "flowchart with memory."

● Inputs: Coin inserted, Button pressed

● Transitions: If ₹10 is inserted → go to item selection → press A1 → dispense

● Every state has transitions defined for all symbols in Σ.

● No use of ε (empty string) transitions.

● Can be easily implemented in hardware/software.

● Needs exponentially more states for certain patterns.

● Not suited for problems requiring memory of unbounded size.

● Have ε-transitions: move to a new state without consuming any symbol.

● Be in multiple states simultaneously.

1. Creating new DFA states as sets of NFA states.

2. Transitions defined as unions of possible NFA paths.

3. Start state is the ε-closure of NFA start state.

● A DFA person can open only one door per turn.

1. Define the 5-tuple structure of a DFA.

2. What is the major difference between DFA and NFA?

3. Can DFA have ε-transitions? Why or why not?

4. What does it mean for a string to be “accepted” by an FSM?

6. What’s the practical challenge of converting an NFA to a DFA?

7. Which is easier to design: DFA or NFA? Why?

● Q: Finite set of states

● Σ (Sigma): Finite input alphabet

● δ (delta): Transition function δ: Q × Σ → Q

● q₀: Start state, where q₀ ∈ Q

● F: Set of accepting/final states, F ⊆ Q