Symbol
Symbols simply call it as a character.
It is an atomic unit, such as a digit, character, lowercase letter, etc. Sometimes
it is also a word. The formal language does not deal with the “meaning” of the
symbols.
For example,
a,b,c,……………z
0,1,2,…………..9
+,-,*,%,…………special characters.
Alphabet
The set of characters is called as the alphabet.
An alphabet is a finite, non-empty set of symbols. It is denoted by Σ or E.
For example,
Σ ={0,1} set of binary alphabets.
Σ ={a,b,c,……..,z} set of all lower case letters.
Σ ={A,B,C,………Z} set of all upper case letters.
Σ ={+,&,%,……….} set of all special characters.
String or Word
A string is a finite set sequence of symbols choose from some alphabets
For example,
00011001 is a string from the binary alphabet Σ={0,1} and aabbcabcd is a string from the alphabet
Σ={a,b,c,d}.
If, w = 0110 y = 0aa x = aabcaa z = 111. Then,
o Special string − s (also denoted by X)
o Concatenation − wz = 0110111
o Length − |w| = 4 |s| = 0 |x| = 6
o Reversal − yR = aa0
Some special sets of strings are as follows −
E* All strings of symbols from E
E+ E* - {s}
For example,
E = {0, 1}
E* = {s, 0, 1, 00, 01, 10, 11, 000, 001,...}
E+ = {0, 1, 00, 01, 10, 11, 000, 001,.}
Length of string
It is the number of symbols in the string or word. It is denoted by |w|.
For example,
w=01011001 from binary alphabet Σ={0,1}
|w| = 8
X= abbaddabba from binary alphabet Σ={a,b}
|X| = 10
Language
A language is a set of strings from some alphabet (finite or infinite). In other
words, any subset L of E* is a language in TOC.
Some special languages are as follows −
{} The empty set/language, containing no string.
{s} A language containing one string, the empty string.
Examples
E = {0, 1}
L = {x | x is in E* and x contains an even number of 0’s}
E = {0, 1, 2,., 9, .}
L = {x | x is in E* and x forms a finite length real number}
= {0, 1.5, 9.326,.}
E = {a, b, c,., z, A, B,., Z}
L = {x | x is in E* and x is a Pascal reserved word}
= {BEGIN, END, IF,...}
E = {Pascal reserved words} U { (, ), ., :, ;,...} U {Legal Pascal identifiers}
L = {x | x is in E* and x is a syntactically correct Pascal program}
E = {English words}
L = {x | x is in E* and x is a syntactically correct English sentence}
Kleene Star • Definition: The Kleene star, Σ * , is a unary operator on a set of symbols or strings, Σ,
that gives the infinite set of all possible strings of all possible lengths over Σ including λ. •
Representation: Σ* = Σ0 U Σ1 U Σ2 U……. where Σp is the set of all possible strings of length p. •
Example: If Σ = {a, b}, Σ*= {λ, a, b, aa, ab, ba, bb,………..}
Kleene Closure / Plus • Definition: The set Σ + is the infinite set of all possible strings of all possible
lengths over Σ excluding λ. • Representation: Σ+ = Σ1 U Σ2 U Σ3 U……. Σ + = Σ* − { λ } • Example: If Σ =
{ a, b } , Σ+ ={ a, b, aa, ab, ba, bb,………..}
Grammar in theory of computation is a finite set of formal rules that are
generating syntactically correct sentences.
The formal definition of grammar is that it is defined as four tuples −
G=(V,T,P,S)
G is a grammar, which consists of a set of production rules. It is used to generate the strings of a
language.
T is the final set of terminal symbols. It is denoted by lower case letters.
V is the final set of non-terminal symbols. It is denoted by capital letters.
P is a set of production rules, which is used for replacing non-terminal symbols (on the left side of
production) in a string with other terminals (on the right side of production).
S is the start symbol used to derive the string.
Terminal Symbols - Terminal symbols are the components of the
sentences that are generated using grammar and are denoted using
small case letters like a, b, c etc.
Non-Terminal Symbols - Non-Terminal Symbols take part in the generation
of the sentence but are not the component of the sentence. These types
of symbols are also called Auxiliary Symbols and Variables. They are
represented using a capital letter like A, B, C, etc.
Example 1
Consider a grammar
G = (V , T , P , S)
Where,
V = { S , A , B } ⇒ Non-Terminal symbols
T={a,b} ⇒ Terminal symbols
Production rules P = { S → ABa , A → BB , B → ab , AA → b }
S={S} ⇒ Start symbol
Example 2
Consider a grammar
G=(V,T,P,S)
Where,
V= {S, A, B} ⇒ non terminal symbols
T = { 0,1} ⇒ terminal symbols
Production rules P = { S→A1B
A→0A| ε
B→0B| 1B| ε }
S= {S} ⇒ start symbol.
Types of grammar
The different types of grammar −
Grammar Language Automata Production rules
Recursively
Type-0 Turing machine No restriction
enumerable
Linear-bounded non-
Type-1 Context-sensitive αAβ→αγβ
deterministic machine
Non-deterministic push down
Type-2 Context-free A→γ
automata
A→αB
Type-3 Regular Finite state automata
A→α
The diagram representing the types of grammar in the theory of computation
(TOC) is as follows −
Type - 0 Grammar
Type-0 grammars generate recursively enumerable languages. The
productions have no restrictions. They are any phase structure grammar
including all formal grammars.
They generate the languages that are recognized by a Turing machine.
The productions can be in the form of α → β where α is a string of
terminals and nonterminals with at least one non-terminal and α cannot
be null. β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
Type - 1 Grammar
Type-1 grammars generate context-sensitive languages. The productions
must be in the form
αAβ→αγβ
where A ∈ N (Non-terminal)
and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals)
The strings α and β may be empty, but γ must be non-empty.
The rule S → ε is allowed if S does not appear on the right side of any rule.
The languages generated by these grammars are recognized by a linear
bounded automaton.
Example
AB → AbBc
A → bcA
B→b
Type - 2 Grammar
Type-2 grammars generate context-free languages.
The productions must be in the form A → γ
where A ∈ N (Non terminal)
and γ ∈ (T ∪ N)* (String of terminals and non-terminals).
These languages generated by these grammars are be recognized by a
non-deterministic pushdown automaton.
Example
S→Xa
X→a
X → aX
X → abc
X→ε
Type - 3 Grammar
Type-3 grammars generate regular languages. Type-3 grammars must have
a single non-terminal on the left-hand side and a right-hand side
consisting of a single terminal or single terminal followed by a single non-
terminal.
The productions must be in the form X → a or X → aY
where X, Y ∈ N (Non terminal)
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.
Example
X→ε
X → a | aY
Y→b