PRINCIPLES OF COMPILER DESIGN
PRACTICAL – 1
Submitted by: 22bce194 (C4 batch)
Introduction to Lexical Analyzer generator tool lex/flex
Objective:
Learn the role of a Lexical Analyzer in the compiler design process.
Understand the usage of Lex/Flex as a tool to generate lexical analyzers.
Perform installation and configuration of Flex and GCC.
Write, compile, and run a basic Lex program.
Introduction:
The Lexical Analysis phase is the first step in the compilation
process, responsible for scanning the source code and converting it
into a stream of tokens (keywords, identifiers, constants, operators,
etc.).
Lex/Flex is a popular tool for generating lexical analyzers
automatically based on user-defined patterns.
Lex = Original UNIX tool.
Flex = A faster, widely used open-source version that runs on
Windows and Linux.
The .l file contains three sections:
1. Definitions: Regular expressions and declarations.
2. Rules: Patterns with associated actions.
3. User Code: Main function and C code.
Software Requirements:
Flex (win_flex or flex.exe) for Windows
GCC compiler (MinGW)
Command Prompt / Terminal
Installation Steps:
1. Install Flex: Download win_flex for Windows and place it in a
directory added to PATH.
2. Install GCC: Install MinGW or Dev-C++ to compile the
generated C code.
3. Verify Installation:
flex --version
gcc --version
4. Set Environment Variables if required for gcc and flex.
Basic Lex Program:
File: Program1.l
%{
#include <stdio.h>
%}
%%
[0-9]+ { printf("%s is NUM\n", yytext); }
\n { /* skip newline */ }
[^0-9\n]+ { printf("%s is invalid token\n", yytext); }
%%
int yywrap() {
return 1;
}
int main() {
yylex();
return 0;
}
Execution Steps:
1. Create a file Program1.l with the above code.
2. Open Command Prompt in the folder and run:
flex Program1.l
gcc lex.yy.c -o lexer.exe
lexer.exe
3. Input:
hello 123 world
Press Ctrl+Z then Enter to signal end of input (EOF on Windows).
4. Output:
hello  is invalid token
123 is NUM
 world is invalid token
(Spaces are not digits, so the [^0-9\n]+ rule sweeps them into the
neighbouring invalid-token lexemes.)
Program 2:
%{
#include <stdio.h>
%}
%%
[0-9]+"."[0-9]+ { printf("FNUM "); }
[0-9]+ { printf("NUM "); }
[_a-zA-Z][_a-zA-Z0-9]* { printf("ID "); }
[-+=*/] { printf(" %c \n", yytext[0]); }
[ \t\n] { /* Ignore whitespace */ }
. { printf("UNKNOWN "); }
%%
int yywrap() { return 1; }
int main()
{
yylex(); // Call lexical analyzer
return 0;
}
Program 5:
%{
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Token codes
#define LT 256
#define LE 257
#define EQ 258
#define GT 259
#define IF 260
#define THEN 261
#define ELSE 262
#define ID 263
#define NUMBER 264
#define RELOP 265
int yylval;
int install_id() {
printf("Installing identifier: %s\n", yytext);
return 1; // Just return dummy index
}
int install_num() {
printf("Installing number: %s\n", yytext);
return atoi(yytext); // Convert string to integer
}
%}
delim [ \t\n]
ws {delim}+
letter [A-Za-z]
digit [0-9]
id {letter}({letter}|{digit})*
number {digit}+(\.{digit}+)?(E[+\-]?{digit}+)?
%%
{ws} { /* Ignore whitespace */ }
if { printf("Token: IF\n"); return IF; }
then { printf("Token: THEN\n"); return THEN; }
else { printf("Token: ELSE\n"); return ELSE; }
{id} { yylval = install_id(); printf("Token: ID\n"); return ID; }
{number} { yylval = install_num(); printf("Token: NUMBER\n"); return NUMBER; }
"<=" { yylval = LE; printf("Token: RELOP <=\n"); return RELOP; }
"<" { yylval = LT; printf("Token: RELOP <\n"); return RELOP; }
"=" { yylval = EQ; printf("Token: RELOP =\n"); return RELOP; }
">" { yylval = GT; printf("Token: RELOP >\n"); return RELOP; }
. { printf("Unrecognized: %s\n", yytext); }
%%
int yywrap() { return 1; }
int main()
{
printf("Lexical Analysis Started...\n");
int token;
while ((token = yylex()) != 0) {
// Optionally print token codes
printf("Token code: %d\n", token);
}
printf("Lexical Analysis Finished.\n");
return 0;
}
Input: if count <= 100 then result = 42 else result > 5
Result:
Successfully installed Flex and GCC, and executed a basic Lex
program to recognize words and numbers.
Conclusion:
The experiment introduced the Lexical Analyzer and demonstrated
how Lex/Flex automates token generation using patterns. It also
provided practical exposure to setting up tools, writing .l files, and
compiling them to executable format.