0% found this document useful (0 votes)
132 views40 pages

Smart Mini Compilar

This document describes the lexical analysis phase of a compiler. It discusses how lexical analysis takes source code and splits it into tokens by eliminating whitespace and comments. It also describes how the lexical analyzer interacts with the syntax analyzer by reading character streams, checking for valid tokens, and sending token data to the syntax analyzer. The document provides an overview of the structure of a Lex program, which produces lexical analyzers and reads an input stream defining the source code to output a C program that executes the lexer.

Uploaded by

Hamid Raza
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
132 views40 pages

Smart Mini Compilar

This document describes the lexical analysis phase of a compiler. It discusses how lexical analysis takes source code and splits it into tokens by eliminating whitespace and comments. It also describes how the lexical analyzer interacts with the syntax analyzer by reading character streams, checking for valid tokens, and sending token data to the syntax analyzer. The document provides an overview of the structure of a Lex program, which produces lexical analyzers and reads an input stream defining the source code to output a C program that executes the lexer.

Uploaded by

Hamid Raza
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 40

THE SUPERIOR COLLEGE LAHORE

Faculty of Computer Science & IT

Compiler Construction

PROJECT REPORT

Smart mini Compiler

Student Name Student ID Class Contact Number Email Address

Gul Faraz Ahmed BCSM-F16-148 BSCS-8B 0347-6513429 farazahmed306@gmail.com


Abdul Raffay BCSM-F16-078 BSCS-8B 0304-5923213 bcsm-f16-078@superior.edu.pk
Anas Gillani BCSM-F16-422 BSCS-8B 0303-8826171 bcsm-f16-422@superior.edu.pk
Basil Tariq BCSM-F16-425 BSCS-8B 0302-6381772 bcsm-f16-425@superior.edu.pk

Mam Maryam
Assistant professor
Smart mini Compiler

Dedication

We dedicate this project to God Almighty our creator, our strong pillar, our source of
inspiration, wisdom, knowledge and understanding. He has been the source of our
strength throughout this program and on His wings only have we soared. We also
dedicate this work to my teacher; Mam Maryam who has encouraged us all the way and
whose encouragement has made sure that we give it all it takes to finish that which we
have started. To my friends who have been affected in every way possible by this quest.
Thank you. God bless you.

Faculty of CS&IT, The Superior College Lahore, Pakistan 2


Smart mini Compiler

Table of Contents
Dedication........................................................................................................................................ii
Table of Contents...........................................................................................................................iv
Chapter 1..........................................................................................................................................1
Introduction......................................................................................................................................2
1.1. Compilar Design Phase.....................................................................................................3
1.2. Analysis Phase……………………………………….……………………….….………3
1.3 Synthesis Phase………………………………………..…………………………………3
1.4 Graphical Representation………………………….….………………………………….4
Chapter 2..........................................................................................................................................5
LEXCIAL ANALYSIS...................................................................................................................5
2.1 Introduction.......................................................................................................................6
2.2 The Lexical Analysis.....................................................................................................6
2.3 THE LEX CODE.......................................................................................................... 7
Chapter 3........................................................................................................................................18
SYNTAX ANALYSIS..............................................................................................................18
3.1 Introduction.....................................................................................................................19
3.2 Structure of LEX Program………………….…………………………………………..19
3.3 Structure of Yacc Program……………………………………………………………...20
3.4 Paras Tree……………………………………………………………………………….32
Chapter 4........................................................................................................................................35
Conclusion ....................................................................................................................................36

Faculty of CS&IT, The Superior College Lahore, Pakistan 3


Smart mini Compiler

Chapter 1
Introduction

Faculty of CS&IT, The Superior College Lahore, Pakistan 4


Smart mini Compiler

Chapter 1: Introduction

A compiler is a software program that takes statements written in a different programming


language and converts them into a machine language or code that processors use with a
computer. The file used to write a C-language requires what is called the source statements. The
programmer then runs the corresponding language compiler, defining the file name that contains
the source statements.
When running, the compiler first parses all the language statements syntactically one after the
other, and then constructs the output code in one or more sequential steps, ensuring that
statements that apply to other statements are correctly referenced in the final code. The
compilation output is called object code, or an object module at times.

Faculty of CS&IT, The Superior College Lahore, Pakistan 5


Smart mini Compiler

Lexical analysis reflects a compiler's first step.It takes the modified source code from
preprocessor language and is written in the form of sentences. The lexical analyzer splits these
syntaxes into a set of tokens by eliminating some space in the source code or comments.

Symbol table is an important data structure that compilers build and retain to store information
on the existence of various entities such as variable names, function names etc. All the pieces of
the research and the creation of a compiler using the symbol table. Use Lex, we developed a
lexical analyzer for the C language.

It takes a C code as input, and outputs a tokens line. The output tokens shown contain keywords,
identifiers, signed / unsigned integer / floating point constants, operators, special characters,
headers, data form specifications, list, single-line statement, multi-line statement, pre-processor
instruction, pre-defined functions (print and scan), user-defined functions and the main feature.
The token, token sort, and token line number are seen in the C code.
The line number is shown so that the code can be debugged more quickly for errors. Errors in
single-line comments are shown along with multi-line comments and line numbers. The output
also includes the tokens symbol list, and their sort. Using hash organization, the symbol table is
created.

1.1. COMPILER DESIGN PHASES

A compiler is a software program that takes statements written in a different programming


language and converts them into a machine language or code that processors use with a
computer. A compiler may be divided loosely into two stages, depending on how they are
compiled.

1.2. Analysis Phase


The compiler's analytical process, known as the compiler's front-end, reads the source file,
splits it into key sections and then searches for lexical, grammar and syntax errors. The analysis

Faculty of CS&IT, The Superior College Lahore, Pakistan 6


Smart mini Compiler

process produces an intermediate representation of the source file and symbol table. This step
is generated by:
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
4. Intermediate Code Generation

1.3. Synthesis Phase


Described as the compiler's back-end, the synthesis process uses intermediate source code
representation and symbol table to produce the target script. This step is defined by:
1. Code Optimization
2. Code Generator

1.4. GRAPICAL REPRESENTATION:

Faculty of CS&IT, The Superior College Lahore, Pakistan 7


Smart mini Compiler

Faculty of CS&IT, The Superior College Lahore, Pakistan 8


Smart mini Compiler

Chapter 2
Lexical Analysis

Faculty of CS&IT, The Superior College Lahore, Pakistan 9


Smart mini Compiler

Chapter 2: Lexical Analysis

2.

2.1. Introduction

Lexical analysis reflects a compiler's first step. It takes the modified source code from
preprocessor language and is written in the form of sentences. The lexical analyzer splits these
syntaxes into a set of tokens by eliminating some space in the source code or comments.
If a token is considered invalid by the lexical analyzer, it will produce an error. The lexical
analyzer interacts similarly with the analyzer of syntaxes. It reads character streams from
source code, checks for valid tokens, and when it requests, sends the data to the syntax
analyzer.

Diagram

2.2. THE LEXICAL ANALYSIS:

In computer science the method of transforming a sequence of characters (such as in a


computer system or web page) into a sequence of tokens (strings with an specified

Faculty of CS&IT, The Superior College Lahore, Pakistan 10


Smart mini Compiler

"meaning") is lexical analysis. A software that performs lexical analysis can be called a lexer,
tokenizer, or scanner (although the first step of a lexer is often referred to as "scanner").
In general, such a lexer is paired with a parser, which analyzes the grammar of programming
languages, web sites, and so on.
The script we write is a computer program named "lex," which produces lexical analysers
("scanners" or "lexers"). Lex reads an input stream that defines the source code for the
lexical analyser and outputs that execute the lexer in the programming language of C.

The structure of the lex program consists of three sections:


{definition section}
%%
{rules section}
%%
{C code section}

2.3. THE LEX CODE

%{
int lineno = 1;
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define AUTO 1
#define BREAK 2
#define CASE 3
#define CHAR 4
#define CONST 5
#define CONTINUE 6
#define DEFAULT 7
#define DO 8
#define DOUBLE 9
#define ELSE 10
#define ENUM 11

Faculty of CS&IT, The Superior College Lahore, Pakistan 11


Smart mini Compiler

#define EXTERN 12
#define FLOAT 13
#define FOR 14
#define GOTO 15
#define IF 16
#define INT 17
#define LONG 18
#define REGISTER 19
#define RETURN 20
#define SHORT 21
#define SIGNED 22
#define SIZEOF 23
#define STATIC 24
#define STRUCT 25
#define SWITCH 26
#define TYPEDEF 27
#define UNION 28
#define UNSIGNED 29
#define VOID 30
#define VOLATILE 31
#define WHILE 32
#define IDENTIFIER 33
#define SLC 34
#define MLCS 35
#define MLCE 36
#define LEQ 37
#define GEQ 38
#define EQEQ 39
#define NEQ 40
#define LOR 41
#define LAND 42
#define ASSIGN 43
#define PLUS 44
#define SUB 45
#define MULT 46
#define DIV 47
#define MOD 48
#define LESSER 49
#define GREATER 50
#define INCR 51
#define DECR 52
#define COMMA 53
#define SEMI 54
#define HEADER 55

Faculty of CS&IT, The Superior College Lahore, Pakistan 12


Smart mini Compiler

#define MAIN 56
#define PRINTF 57
#define SCANF 58
#define DEFINE 59
#define INT_CONST 60
#define FLOAT_CONST 61
#define TYPE_SPEC 62
#define DQ 63
#define OBO 64
#define OBC 65
#define CBO 66
#define CBC 67
#define HASH 68
#define ARR 69
#define FUNC 70
#define NUM_ERR 71
#define UNKNOWN 72
#define CHAR_CONST 73
#define SIGNED_CONST 74
#define STRING_CONST 75
%}
alpha [A-Za-z]
digit [0-9]
und [_]
space [ ]
tab [ ]
line [\n]
char \'.\'
at [@]
string \"(.^([%d]|[%f]|[%s]|[%c]))\"
%%
{space}* {}
{tab}* {}
{string} return STRING_CONST;
{char} return CHAR_CONST;
{line} {lineno++;}
auto return AUTO;
break return BREAK;
case return CASE;
char return CHAR;
const return CONST;
continue return CONTINUE;
default return DEFAULT;
do return DO;

Faculty of CS&IT, The Superior College Lahore, Pakistan 13


Smart mini Compiler

double return DOUBLE;


else return ELSE;
enum return ENUM;
extern return EXTERN;
float return FLOAT;
for return FOR;
goto return GOTO;
if return IF;
int return INT;
long return LONG;
register return REGISTER;
return return RETURN;
short return SHORT;
signed return SIGNED;
sizeof return SIZEOF;
static return STATIC;
struct return STRUCT;
switch return SWITCH;
typedef return TYPEDEF;
union return UNION;
unsigned return UNSIGNED;
void return VOID;
volatile return VOLATILE;
while return WHILE;
printf return PRINTF;
scanf return SCANF;
{alpha}({alpha}|{digit}|{und})* return IDENTIFIER;
[+-][0-9]{digit}*(\.{digit}+)? return SIGNED_CONST;
"//" return SLC;
"/*" return MLCS;
"*/" return MLCE;
"<=" return LEQ;
">=" return GEQ;
"==" return EQEQ;
"!=" return NEQ;
"||" return LOR;
"&&" return LAND;
"=" return ASSIGN;
"+" return PLUS;
"-" return SUB;
"*" return MULT;
"/" return DIV;
"%" return MOD;
"<" return LESSER;

Faculty of CS&IT, The Superior College Lahore, Pakistan 14


Smart mini Compiler

">" return GREATER;


"++" return INCR;
"--" return DECR;
"," return COMMA;
";" return SEMI;
"#include<stdio.h>" return HEADER;
"#include <stdio.h>" return HEADER;
"main()" return MAIN;
{digit}+ return INT_CONST;
({digit}+)\.({digit}+) return FLOAT_CONST;
"%d"|"%f"|"%u"|"%s" return TYPE_SPEC;
"\"" return DQ;
"(" return OBO;
")" return OBC;
"{" return CBO;
"}" return CBC;
"#" return HASH;
{alpha}({alpha}|{digit}|{und})*\[{digit}*\] return ARR;
{alpha}({alpha}|{digit}|{und})*\(({alpha}|{digit}|{und}|{space})*\) return FUNC;
({digit}+)\.({digit}+)\.({digit}|\.)* return NUM_ERR;
({digit}|{at})+({alpha}|{digit}|{und}|{at})* return UNKNOWN;
%%
struct node
{
char token[100];
char attr[100];
struct node *next;
};
struct hash
{
struct node *head;
int count;
};
struct hash hashTable[1000];
int eleCount = 1000;
struct node * createNode(char *token, char *attr)
{
struct node *newnode;
newnode = (struct node *) malloc(sizeof(struct node));
strcpy(newnode->token, token);
strcpy(newnode->attr, attr);
newnode->next = NULL;
return newnode;
}

Faculty of CS&IT, The Superior College Lahore, Pakistan 15


Smart mini Compiler

int hashIndex(char *token)


{
int hi=0;
int l,i;
for(i=0;token[i]!='\0';i++)
{
hi = hi + (int)token[i];
}
hi = hi%eleCount;
return hi;
}
void insertToHash(char *token, char *attr)
{
int flag=0;
int hi;
hi = hashIndex(token);
struct node *newnode = createNode(token, attr);
/* head of list for the bucket with index "hashIndex" */
if (hashTable[hi].head==NULL)
{
hashTable[hi].head = newnode;
hashTable[hi].count = 1;
return;
}
struct node *myNode;
myNode = hashTable[hi].head;
while (myNode != NULL)
{
if (strcmp(myNode->token, token)==0)
{
flag = 1;
break;
}
myNode = myNode->next;
}
if(!flag)
{
//adding new node to the list
newnode->next = (hashTable[hi].head);
//update the head of the list and no of nodes in the current bucket
hashTable[hi].head = newnode;
hashTable[hi].count++;
}
return;

Faculty of CS&IT, The Superior College Lahore, Pakistan 16


Smart mini Compiler

}
void display()
{
struct node *myNode;
int i,j, k=1;
printf("-------------------------------------------------------------------");
printf("\nSNo \t|\tToken \t\t|\tToken Type \t\n");
printf("-------------------------------------------------------------------\n");
for (i = 0; i < eleCount; i++)
{
if (hashTable[i].count == 0)
continue;
myNode = hashTable[i].head;
if (!myNode)
continue;
while (myNode != NULL)
{
printf("%d\t\t", k++);
printf("%s\t\t\t", myNode->token);
printf("%s\t\n", myNode->attr);
myNode = myNode->next;
}
}
return;
}
int main()
{
int scan, slcline=0, mlc=0, mlcline=0, dq=0, dqline=0;
yyin = fopen("isPrime.c","r");
printf("\n\n");
scan = yylex();
while(scan)
{
if(lineno == slcline)
{
scan = yylex();
continue;
}
if(lineno!=dqline && dqline!=0)
{
if(dq%2!=0)
printf("\n******** ERROR!! INCOMPLETE STRING at Line %d
********\n\n", dqline);
dq=0;

Faculty of CS&IT, The Superior College Lahore, Pakistan 17


Smart mini Compiler

}
if((scan>=1 && scan<=32) && mlc==0)
{
printf("%s\t\t\tKEYWORD\t\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "KEYWORD");
}
if(scan==33 && mlc==0)
{
printf("%s\t\t\tIDENTIFIER\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "IDENTIFIER");
}
if(scan==34)
{
printf("%s\t\t\tSingleline Comment\t\tLine %d\n", yytext, lineno);
slcline = lineno;
}
if(scan==35 && mlc==0)
{
printf("%s\t\t\tMultiline Comment Start\t\tLine %d\n", yytext, lineno);
mlcline = lineno;
mlc = 1;
}
if(scan==36 && mlc==0)
{
printf("\n******** ERROR!! UNMATCHED MULTILINE COMMENT END %s at
Line %d ********\n\n", yytext, lineno);
}
if(scan==36 && mlc==1)
{
mlc = 0;
printf("%s\t\t\tMultiline Comment End\t\tLine %d\n", yytext, lineno);
}
if((scan>=37 && scan<=52) && mlc==0)
{
printf("%s\t\t\tOPERATOR\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "OPERATOR");
}
if((scan==53||scan==54||scan==63||(scan>=64 && scan<=68)) && mlc==0)
{
printf("%s\t\t\tSPECIAL SYMBOL\t\t\tLine %d\n", yytext, lineno);
if(scan==63)
{
dq++;
dqline = lineno;

Faculty of CS&IT, The Superior College Lahore, Pakistan 18


Smart mini Compiler

}
insertToHash(yytext, "SPECIAL SYMBOL");
}
if(scan==55 && mlc==0)
{
printf("%s\tHEADER\t\t\t\tLine %d\n",yytext, lineno);
}
if(scan==56 && mlc==0)
{
printf("%s\t\t\tMAIN FUNCTION\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "IDENTIFIER");
}
if((scan==57 || scan==58) && mlc==0)
{
printf("%s\t\t\tPRE DEFINED FUNCTION\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "PRE DEFINED FUNCTION");
}
if(scan==59 && mlc==0)
{
printf("%s\t\t\tPRE PROCESSOR DIRECTIVE\t\tLine %d\n", yytext, lineno);
}
if(scan==60 && mlc==0)
{
printf("%s\t\t\tINTEGER CONSTANT\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "INTEGER CONSTANT");
}
if(scan==61 && mlc==0)
{
printf("%s\t\t\tFLOATING POINT CONSTANT\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "FLOATING POINT CONSTANT");
}
if(scan==62 && mlc==0)
{
printf("%s\t\t\tTYPE SPECIFIER\t\t\tLine %d\n", yytext, lineno);
}
if(scan==69 && mlc==0)
{
printf("%s\t\t\tARRAY\t\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "ARRAY");
}
if(scan==70 && mlc==0)
{
printf("%s\t\t\tUSER DEFINED FUNCTION\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "USER DEFINED FUNCTION");

Faculty of CS&IT, The Superior College Lahore, Pakistan 19


Smart mini Compiler

}
if(scan==71 && mlc==0)
{
printf("\n******** ERROR!! CONSTANT ERROR %s at Line %d
********\n\n", yytext, lineno);
}
if(scan==72 && mlc==0)
{
printf("\n******** ERROR!! UNKNOWN TOKEN %s at Line %d
********\n\n", yytext, lineno);
}
if(scan==73 && mlc==0)
{
printf("%s\t\t\tCHARACTER CONSTANT\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "CHARACTER CONSTANT");
}
if(scan==74 && mlc==0)
{
printf("%s\t\t\tSIGNED CONSTANT\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "SIGNED CONSTANT");
}
if(scan==75 && mlc==0)
{
printf("%s\t\t\tSTRING CONSTANT\t\t\tLine %d\n", yytext, lineno);
insertToHash(yytext, "STRING CONSTANT");
}
scan = yylex();
}
if(mlc==1)
printf("\n******** ERROR!! UNMATCHED COMMENT STARTING at Line %d
********\n\n",mlcline);
printf("\n");
printf("\n\t******** SYMBOL TABLE ********\t\t\n");
display();
printf("-------------------------------------------------------------------\n\n");
}
int yywrap()
{
return 1;
}
Input file (isPrime.c)
#include<stdio.h>
int main()
{

Faculty of CS&IT, The Superior College Lahore, Pakistan 20


Smart mini Compiler

int a,i,flag=0;
printf("Input no");
scanf("%d",&a);
i=2;
while(i <= a/2)
{
if(a%i == 0)
{
flag=1;
break;
}
i++;
}
if(flag==0)
printf("%d Prime", a);
else
printf("Not Prime");
return 0;
}

Faculty of CS&IT, The Superior College Lahore, Pakistan 21


Smart mini Compiler

Chapter 3
SYNTAX ANALSIS

Faculty of CS&IT, The Superior College Lahore, Pakistan 22


Smart mini Compiler

Chapter 3: SYNTAX ANALYSIS

3.1 Introduction

Syntax analysis in computer science is the method of searching for the syntactic consistency of
the code. Syntax analysis or parsing is meant to verify that we have a correct sequence of
tokens. Tokens are true symbol sets, keywords, signatures etc. The parser must be capable of
processing the infinite number of potential legitimate programs that can be given to him. The
standard way to describe the language is by defining a grammar.

A grammar is a collection of rules (or productions) which determine the language syntax (i.e.,
what is a valid language sentence). For a particular language there can be more than one
grammar. To find any errors in the code the parser analyzes the source code (token stream)
against the output rules. This phase's performance is a parse-tree.
Through writing two scripts, the syntax analyzer for the C language functions as a lexical
analyzer (lexer) and generates a stream of tokens, while the other one functions as a parser.
The lexer's called the lex software. Lex reads an input stream that defines the source code for
the lexical analyser and outputs that execute the lexer in the programming language of C.

3.2 Structure of Lex Program:


The structure of the lex program consists of three sections:
{definition section}
%%
{rules section}
%%
{C code section}

Faculty of CS&IT, The Superior College Lahore, Pakistan 23


Smart mini Compiler

The section on description describes macros, and imports header files in C. Any C code that
must be copied verbatim into the created source file can also be written here.

The portion of rules connects standard patterns of speech to claims of C. If the lexer sees text
matching a given pattern in the data, then the corresponding C code will be executed.

The section on C code includes C statements and functions that are copied verbatim to source
file created. Such claims probably contain code in the Laws section named by the laws.

3.3 Structure of Yacc Program:


The written parser is known as the Yacc program. The Yacc file structure is identical to the
lexer's, which consists of three sections:
{declarations}
%%
{rules}
%%
{routines}
The declarations portion of a yacc file that consist of the following:
% token — identifies the token names recognized by the yacc file
%start — identifies a non-terminal name for the start symbol
% right — identifies tokens associated with other tokens
% left — identifies tokens associated with other tokens.
%nonassoc—Identifies tokens rather then just associative tokens

The rules section consists of the context-free grammar used for parse tree creation. A general
rule is organized according to the following:
nonterminal
: sentential form
| sentential form

Faculty of CS&IT, The Superior College Lahore, Pakistan 24


Smart mini Compiler

.................
| sentential form
;
Actions can be aligned with rules and are performed while following the corresponding
sentential type.
The section on routines can include the C program defining the input file, the operation
routines and other user-defined functions

Lex Program :

alpha [A-Za-z_]
fl (f|F|l|L)
ul (u|U|l|L)*
digit [0-9]
space []
hex [a-fA-F0-9]
exp [Ee][+-]?{digit}+

%{
char datatype[100] = "dummy";
int tl;
char next;
#include <stdio.h>
#include <string.h>
%}

%%
\n { yylineno++; }
"/*" { multicomment(); }
"//" { singlecomment(); }

Faculty of CS&IT, The Superior College Lahore, Pakistan 25


Smart mini Compiler

"#include<"({alpha})*".h>" {}

"#define"({space})""({alpha})""({alpha}|{digit})*""({space})""({digit})+""
{ return DEFINE;}
"#define"({space})""({alpha}({alpha}|{digit})*)""({space})""(({digit}+)\.({digit}+))""
{ return DEFINE;}
"#define"({space})""({alpha}({alpha}|{digit})*)""({space})""({alpha}({alpha}|{digit})*)""
{ return DEFINE;}

{digit}+ { insertToConstTable(yytext, yylineno, "INT"); return


CONSTANT; }
({digit}+)\.({digit}+) { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
0[xX]{hex}+{ul}? { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
{digit}+{ul}? { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
'(\\.|[^\\'])+' { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
{digit}+{exp}{fl}? { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
{digit}*"."{digit}+({exp})?{fl}? { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }
{digit}+"."{digit}*({exp})?{fl}? { insertToConstTable(yytext, yylineno, "FLOAT"); return
CONSTANT; }

{alpha}?\"(\\.|[^\\"])*\" { insertToConstTable(yytext, yylineno, "STRING"); return


STRING_LITERAL; }

"->" { return PTR_OP; }

Faculty of CS&IT, The Superior College Lahore, Pakistan 26


Smart mini Compiler

"++" { return INC_OP; }


"--" { return DEC_OP; }
"<<" { return LEFT_OP; }
">>" { return RIGHT_OP; }
"<=" { return LE_OP; }
">=" { return GE_OP; }
"==" { return EQ_OP; }
"!=" { return NE_OP; }
"&&" { return AND_OP; }
"||" { return OR_OP; }
"*=" { return MUL_ASSIGN; }
"/=" { return DIV_ASSIGN; }
"%=" { return MOD_ASSIGN; }
"+=" { return ADD_ASSIGN; }
"-=" { return SUB_ASSIGN; }
"<<=" { return LEFT_ASSIGN; }
">>=" { return RIGHT_ASSIGN; }
"&=" { return AND_ASSIGN; }
"^=" { return XOR_ASSIGN; }
"|=" { return OR_ASSIGN; }

"auto" { return AUTO; }


"break" { return BREAK; }
"case" { return CASE; }
"char" { return CHAR; }
"const" { return CONST; }
"continue" { return CONTINUE; }
"default" { return DEFAULT; }
"do" { return DO; }
"double" { return DOUBLE; }
"else" { return ELSE; }

Faculty of CS&IT, The Superior College Lahore, Pakistan 27


Smart mini Compiler

"enum" { return ENUM; }


"extern" { return EXTERN; }
"float" { strcpy(datatype, "FLOAT"); tl = yylineno; return FLOAT; }
"for" { return FOR; }
"goto" { return GOTO; }
"if" { return IF; }
"int" { strcpy(datatype, "INT"); tl = yylineno; return INT; }
"long" { return LONG; }
"register" { return REGISTER; }
"return" { return RETURN; }
"short" { return SHORT; }
"signed" { return SIGNED; }
"sizeof" { return SIZEOF; }
"static" { return STATIC; }
"struct" { return STRUCT; }
"switch" { return SWITCH; }
"typedef" { return TYPEDEF; }
"union" { return UNION; }
"unsigned" { return UNSIGNED; }
"void" { return VOID; }
"volatile" { return VOLATILE; }
"while" { return WHILE; }

";" { strcpy(datatype, "dummy"); return(';'); }


("{"|"<%") { return('{'); }
("}"|"%>") { return('}'); }
"," { return(','); }
":" { return(':'); }
"=" { return('='); }
"(" { return('('); }

Faculty of CS&IT, The Superior College Lahore, Pakistan 28


Smart mini Compiler

")" { return(')'); }
("["|"<:") { return('['); }
("]"|":>") { return(']'); }
"." { return('.'); }
"&" { return('&'); }
"!" { return('!'); }
"~" { return('~'); }
"-" { return('-'); }
"+" { return('+'); }
"*" { return('*'); }
"/" { return('/'); }
"%" { return('%'); }
"<" { return('<'); }
">" { return('>'); }
"^" { return('^'); }
"|" { return('|'); }
"?" { return('?'); }
"printf"|"scanf" { insertToHash(yytext,"PROCEDURE",yylineno); return IDENTIFIER; }
"main" { insertToHash(yytext,"PROCEDURE",yylineno); return
IDENTIFIER; }
{alpha}({alpha}|{digit})* {
if(strcmp(datatype, "dummy")==0)
return IDENTIFIER;
else
{
insertToHash(yytext,datatype,yylineno);
return IDENTIFIER;
}
}
[ \t\v\n\f] { }
. { /* ignore bad characters */ }

Faculty of CS&IT, The Superior College Lahore, Pakistan 29


Smart mini Compiler

%%

struct cnode
{
char num[50];
//int lno;
char type[20];
};
struct cnode ctable[100];
int ccount = 0;

void insertToConstTable(char *num, int l, char *type)


{
strcpy(ctable[ccount].num, num);
strcpy(ctable[ccount].type, type);
//ctable[ccount].lno = l;
ccount++;
}

void disp()
{
int i;
printf("\n\n------------------------------CONSTANT TABLE------------------------------\n");
printf("--------------------------------------------------------------------------\n");
printf("Value \t\t|\tData Type\t\t\n");
printf("--------------------------------------------------------------------------\n");
for(i=0;i<ccount;i++)
{
printf("%s\t\t\t", ctable[i].num);
printf("%s\t\t\t\n", ctable[i].type);
//printf("\t%d\t\n", ctable[i].lno);

Faculty of CS&IT, The Superior College Lahore, Pakistan 30


Smart mini Compiler

}
printf("\n\n");
}

struct node
{
char token[100];
char attr[100];
//int line[100];
int line_count;
struct node *next;
};

struct hash
{
struct node *head;
int hash_count;
};

struct hash hashTable[1000];


int eleCount = 1000;

struct node * createNode(char *token, char *attr, int l)


{
struct node *newnode;
newnode = (struct node *) malloc(sizeof(struct node));
strcpy(newnode->token, token);
strcpy(newnode->attr, attr);
//newnode->line[0] = l;
newnode->line_count = 1;
newnode->next = NULL;

Faculty of CS&IT, The Superior College Lahore, Pakistan 31


Smart mini Compiler

return newnode;
}

int hashIndex(char *token)


{
int hi=0;
int l,i;
for(i=0;token[i]!='\0';i++)
{
hi = hi + (int)token[i];
}
hi = hi%eleCount;
return hi;
}

void insertToHash(char *token, char *attr, int l)


{
int flag=0;
int hi;
hi = hashIndex(token);
struct node *newnode = createNode(token, attr, l);
/* head of list for the bucket with index "hashIndex" */
if (hashTable[hi].head==NULL)
{
hashTable[hi].head = newnode;
hashTable[hi].hash_count = 1;
return;
}
struct node *myNode;
myNode = hashTable[hi].head;
while (myNode != NULL)

Faculty of CS&IT, The Superior College Lahore, Pakistan 32


Smart mini Compiler

{
if (strcmp(myNode->token, token)==0)
{
flag = 1;
//myNode->line[(myNode->line_count)++] = l;
if(strcmp(myNode->attr, attr)!=0)
{
strcpy(myNode->attr, attr);
}
break;
}
myNode = myNode->next;
}
if(!flag)
{
//adding new node to the list
newnode->next = (hashTable[hi].head);
//update the head of the list and no of nodes in the current bucket
hashTable[hi].head = newnode;
hashTable[hi].hash_count++;
}
return;
}

void display()
{
struct node *myNode;
int i,j, k=1;
printf("\n-----------------------------------------Symbol
Table---------------------------------------------\n");

Faculty of CS&IT, The Superior College Lahore, Pakistan 33


Smart mini Compiler

printf("--------------------------------------------------------------------------------------------------");
printf("\nToken \t\t|\tToken Type \t\t\t\t\t \n");
printf("--------------------------------------------------------------------------------------------------\n"
);
for (i = 0; i < eleCount; i++)
{
if (hashTable[i].hash_count == 0)
continue;
myNode = hashTable[i].head;
if (!myNode)
continue;
while (myNode != NULL)
{
//printf("%d\t\t", k++);
printf("%s\t\t\t", myNode->token);
printf("%s\t\t\t", myNode->attr);
/*for(j=0;j<(myNode->line_count);j++)
printf("%d ",myNode->line[j]);*/
printf("\n");
myNode = myNode->next;
}
}

printf("--------------------------------------------------------------------------------------------------\n");
return;
}

yywrap()
{
return(1);

Faculty of CS&IT, The Superior College Lahore, Pakistan 34


Smart mini Compiler

}
multicomment()
{
char c, c1;
while ((c = input()) != '*' && c != 0);
c1=input();
if(c=='*' && c1=='/')
{
c=0;
}
if (c != 0)
putchar(c1);
}
singlecomment()
{
char c;
while(c=input()!='\n');
if(c=='\n')
c=0;
if(c!=0)
putchar(c);
}

Faculty of CS&IT, The Superior College Lahore, Pakistan 35


Smart mini Compiler

3.4 Parse Tree:

Faculty of CS&IT, The Superior College Lahore, Pakistan 36


Smart mini Compiler

Faculty of CS&IT, The Superior College Lahore, Pakistan 37


Smart mini Compiler

Faculty of CS&IT, The Superior College Lahore, Pakistan 38


Smart mini Compiler

Chapter 4
Conclusion

Faculty of CS&IT, The Superior College Lahore, Pakistan 39


Smart mini Compiler

Chapter 5: Conclusion

For a subset of C language, the lexical analyzer and the syntax analyzer is created which
includes selection statements, compound statements, iteration statements, jumping
statements, user-defined functions and primary expressions. It is important to
remember that if due caution is not taken when defining the context-free grammar for
the language, conflicts (shift-reduce and reduce-reduce) will arise in the case of a syntax
analyser. We will also define unambiguous grammar for proper operation of the parser.

Faculty of CS&IT, The Superior College Lahore, Pakistan 40

You might also like