Formal Language and Automata
Theory
    Prof. Sachin Jain, Assistant Professor
      Computer Science & Engineering
    CHAPTER-1
Introduction of FLAT
What is FLAT ?
•   It is the mathematical study of computing machine and its capabilities.
•   It is a study of formal language and automata theory.
Alphabet
•   The non empty finite set of symbols is called as an alphabet and it is denoted by
    ∑.
     •   Example
                ∑ = {a.b.c………z}
                ∑ = {0,1}
                ∑ = {1}
String
•   Any finite sequence of symbol from alphabet ∑ , is called as string and is
    represented by w.
     •   Example
               ∑ = {a,b}
                    W = a,ab,aa,bb
                    W= ab,aab,abc
               ∑ = {0,1}
                    W = 0,01,00,11,1
                    W = 102,2013
               ∑ = {1}
                    W = 1, 11, 111
                    W = 1,12, 121
Length of a String
 •       If w is any string over alphabet ∑ then the number of symbols involved in the
         sequence of string is called as length of the string and denoted by |w|.
     •    Example
                 ∑ = {a,b} W = a,ab,aa,bb ,|w|=1,2,2,2
                 ∑ = {0,1} W = 0,01,00,11,1 |w| = 1,2,2,2
     •    Empty String
     •    A string of length zero or string without any symbols is known as empty
          string and is denoted by €
                         w = € , |w| = 0
                         w. € = w = €.w
Substring
 •   Let u,w be the two strings over same alphabet ∑ then u is said to be substring of
     w if u can be obtained from w.
 •   Any consecutive sequence of symbols over given string.
      •   If u is a substring of w then |u| <= |w|
      •   Every string is substring to itself.
      •   Empty string “€” is substring for every string.
      •   Example
                W = TOC
                Zero length Substring= €        One length substring=T,O,C
                Two length substring= TO,OC Three length substring=TOC
Types of Substring
 •   The substrings are classified into two types
         1.   Trivial Substring or improper Substring
         2.   Non-trivial or proper substring
 •   Trivial Substring or improper Substring
             If w is any string over alphabet ∑ then the substring w itself and € is called as trivial
              substring
 •   Non-trivial Substring or proper Substring
             If w is any string over alphabet ∑ then any substring of w over w other than w itself and €
              is called non trivial substring.
Facts about Substring
 •   If w is any string with distinct symbols and |w|= n
        1. No of substrings = ∑n +1 = n(n+1)/2
        2. No. of trivial string = 2
        3. No. of non trivial substring = ∑n -1
        4. No. of non empty subdtring= ∑n
        5. No of substrings of distinct length = n
        6. No. of strings of length n generated over alphabet ∑ = | ∑ |n
Prefix and Suffix
 •   Prefix
               The sequence of starting or leading symbol is called as prefix.
 •   Suffix
               The sequence of ending or trailing symbol is called as suffix.
       •       Example
       •       w=TOC , |w|=3
       •       Prefix : € , T , TO , TOC
       •       Suffix : TOC , OC , C , €
Facts about Prefix and Suffix
 •   If w is any string of length ‘n’ then
         1. No. of prefix = No. of suffix = n+1
         2. Trivial substrings are common for both prefix and suffix
         3. Every prefix and suffix must be a substring but every substring need not be
            prefix or suffix.
Power of an alphabet
 •   If ∑ is any alphabet then ∑k is the set of all the strings of length k.
 •   Example : ∑ = {0,1}
               ∑2 = {00,01,10,11}
               ∑3 = {000,001,010,011,100,101,110,111}
               ∑k = {all k-length stings}
 •   +ve closure(∑+)
            ∑+ = {w | |w|>=1}
 •   Kleen closure(∑*)
          ∑* = {w | |w|>=0}
 •   ∑* = ∑+ ꓴ €
Language
 •   The collection of strings from the alphabet ∑ is called language
      •   Example ∑ = {0,1}
      •   L ={00,01,10,11}
      •   L = { (01)n | n>=1 }
      •   L = { 0n 1m |m>=1,n>=1}
 •   If ∑ is any alphabet then ∑* is called as universal language
Formal Language
 •   The collection of strings where we can put some restriction in the formation of
     string is called as formal language.
      •   Example ∑ = {0,1}
      •   L ={00,01,10,11}
      •   L = { (01)n | n>=1 }
      •   L = { 0n 1m |m>=1,n>=1}
Chomsky classification of Formal Language
 •   According to Chomsky the formal languages are classified as
        1.   Type 0 or Recursive Enumerable Languages
        2.   Type 1 or Context sensitive languages
        3.   Type 2 or Context free languages
        4.   Type 3 or Context Regular languages
Types of Language
 •   Empty Languages
           The Language that does not contain any string even empty string is called as empty language
            and is denoted by ф.
 •   Non Empty Languages
           The Language that contain at least one string is called as non empty language .
 •   Finite Languages
           The Language which contains finite number of strings and length of each string is finite is
            called as finite language .
           L = { 0n 1n |1<=n<=1}
 •   Infinite Languages
           The Language which contains infinite number of strings and length of each string is finite is
            called as infinite language .
           L = { 0n 1n |n>=1}
Automata
 •   The mathematical system that can represent the formal language is called as
     Automata i.e. The mathematical representation of formal language is called as
     an automata.
 •   Types of Automata
         1. Finite Automata(FA)
         2. Push Down Automata(PDA)
         3. Linear Bound Automata(LBA)
        4.   Turing Machine(LBA)
Expressive Power of an Automata
 •   The number of languages accepted by an automata is called as Expressive Power
     of an Automata.
        1.   E(FA)=1
        2.   E(PDA)=2
        3.   E(LBA)=3
        4.   E(TM)=4
Grammar
 •   The collection of rules which are used to generate the string is called grammar.
 •   Grammar G is a collection of 4 tuples {V,T,P,S}
                              G= {V,T,P,S} ,Where
           V= set of all Nonterminal symbol/variable
           P= setoff all productions
           T= set of all terminal symbols
           S= starting symbol
          Example
            1. A            XYZ (r1) V={A,X,Y,Z}
                 X            a (r2)    T={a,b,c}
                 Y            b (r3)     P={r1,r2,r3,r4}
                  Z           c (r4)     S=A
Chomsky classification of Grammar
 •   According to Chomsky the Grammar is classified as
        1.   Type 0 or Recursive Enumerable Grammar
        2.   Type 1 or Context sensitive Grammar
        3.   Type 2 or Context free Grammar
        4.   Type 3 or Context Regular Grammar
Other classification of Grammar
 •   Recursive grammar
         The grammar g is said to be recursive if atleast one production contain
            the same variable at both left hand side and right hand side of
            production.
         Example        S       asb|€
 •   Non Recursive grammar
         The grammar g is said to be recursive if noproduction contain the same
            variable at both left hand side and right hand side of production.
         Example        S       ab|€
Derivation
 •   The process of deriving a string is called as derivation.
 •   The geometrical representation of derivation is called Derivation tree or parse
     tree.
 •   Steps involved in derivation is called sentential form
 •   Example                         Derivation        Parse Tree
            1.   A       XYZ         A       XYZ                      A
                 X         a         A       aYZ
                 Y         b         A       abZ               X     Y      Z
                 Z         c         A      abc
                                                               a     b      c
www.paruluniversity.ac.in