
Program – 1

Aim – Practice and study of LEX/YACC for compiler writing.

Theory –

Flex: Flex is used to create .l files (lexical analyzer specifications) that define
lexical rules for text pattern matching. After a .l file is written, it is processed with the Flex
tool, which generates a C program; compiling that program yields an executable that transforms
input text according to the defined patterns.
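
For example, assuming the rules are saved in a file named scanner.l (the name is arbitrary), the
typical command sequence with Flex and GCC installed would be:

flex scanner.l           # generates lex.yy.c from the rules in scanner.l
gcc lex.yy.c -o scanner  # compiles the generated scanner into an executable
./scanner                # runs the lexical analyzer on standard input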

Flex is a tool for generating lexical analyzers (scanners or lexers). Written in C by Vern Paxson
circa 1987, Flex is designed to produce lexical analyzers that are faster than the original Lex
program. Today it is often used along with the Berkeley Yacc or GNU Bison parser generators. Both
Flex and Bison are more flexible, and produce faster code, than their ancestors Lex and Yacc.

In this process, Bison creates a parser from the input file provided by the user. In turn, Flex creates
the function yylex() from a .l file, which contains the rules of lexical analysis. The function
yylex(), generated automatically by Flex, retrieves tokens from the input stream and is invoked by the
parser to analyze the input text.

Note: yylex() is the main function generated by Flex. It applies the rules in the rules section
of the .l file to carry out the lexical analysis.
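
A .l file itself has three sections separated by %% markers. The following is a minimal,
self-contained sketch (the pattern is only a placeholder) showing where the rules executed by
yylex() live:

%{
/* definitions section: C declarations copied verbatim into lex.yy.c */
#include <stdio.h>
%}
%%
[0-9]+ { printf("number\n"); /* rules section: pattern { action } pairs */ }
.|\n ;
%%
/* user code section: copied to the end of lex.yy.c */
int yywrap() { return 1; }
int main() { return yylex(); }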

Bison: A Bison application complements Flex by handling grammar and parsing. After Flex
generates a lexical analyzer from a .l file, Bison creates a parser that processes the
tokenized input produced by that analyzer. Together, they build a complete compiler or interpreter,
enabling complex syntax analysis and execution of programming languages or custom
commands.
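
Conceptually, the glue between the two generated files is small. A sketch of the driver code,
assuming the conventional yyparse()/yylex() names used by Bison and Flex, looks like this:

int yylex(void);    /* defined in the Flex-generated lex.yy.c */
int yyparse(void);  /* defined in the Bison-generated parser */

int main() {
    return yyparse();  /* yyparse() calls yylex() whenever it needs the next token */
}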

Dev C++: Dev C++ is an IDE that helps in building and running the scanners produced from .l
files. After Flex processes the .l file, it produces a C file. Dev C++ can then be used to compile
this C file into an executable, allowing the lexical analyzer to run and process input based on the
defined patterns. Dev C++ also provides debugging tools, enabling you to identify and fix errors in
the generated C code. It offers an integrated environment that streamlines the compilation
process, making it easier to execute and test the .l file's output efficiently.

(.l) File: A .l file is a source file used in lexical analysis, typically associated with the Flex
tool (Fast Lexical Analyzer). It contains rules and patterns that define how input text should
be tokenized, or broken down into meaningful symbols known as tokens.
lex.yy.c: lex.yy.c is the C source code file generated by the Flex (or Lex) tool from a .l
file. When you run Flex on a .l file, it translates the lexical rules and patterns defined in that
file into a C program, which is saved as lex.yy.c. This file contains the implementation of a
lexical analyzer that processes input text, identifies tokens based on the defined patterns, and
can be compiled into an executable program.
yywrap(): The yywrap() function in a .l (Lex) file is a routine that Lex calls when it
reaches the end of the input file. By default, it returns 1, signaling the end of input. If
yywrap() returns 0, Lex continues scanning with the same rules on more input. This
function is useful when dealing with multiple input files or when you want to reset the input
state. If your program doesn't require special handling at the end of input, a yywrap() that
simply returns 1 is sufficient.
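
As an illustration, a hypothetical yywrap() that switches scanning to a second input file (the
file name next_input.txt is made up for this sketch) could look like this:

int yywrap() {
    static int switched = 0;
    if (!switched) {
        switched = 1;
        yyin = fopen("next_input.txt", "r");  /* hypothetical second input file */
        if (yyin) return 0;  /* 0 tells Lex to keep scanning with the same rules */
    }
    return 1;  /* 1 signals that all input is exhausted */
}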

YACC (Yet Another Compiler Compiler): YACC serves as a powerful parser generator. In
essence, it is a tool that takes in a grammar specification and transforms it into an
executable module capable of structuring input tokens into a coherent syntax tree,
in line with the prescribed grammar rules. YACC was developed by Stephen C. Johnson
in the 1970s. It was initially written in the B programming language and was soon rewritten
in C, and it was originally designed to be used together with Lex. YACC has since been
reimplemented in OCaml, Ratfor, ML, Ada, Pascal, Java, Python, Ruby and Go. The input of YACC
in compiler design is a set of grammar rules, and its output is a C program.
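
For instance, a minimal grammar specification (a sketch, not a complete language) that YACC turns
into a C parser could look like this; running yacc -d on it produces y.tab.c and y.tab.h:

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%token NUMBER
%%
line : NUMBER '\n' { printf("read a number\n"); }
     ;
%%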

Relationship between LEX and YACC:


In compiler design, LEX and YACC work together to automate the process of building language
translators. LEX (Lexical Analyzer) is used to generate the lexical analyzer (scanner) that reads the
input source code and breaks it into tokens based on patterns defined using regular expressions.
YACC (Yet Another Compiler Compiler) generates the syntax analyzer (parser) that takes these
tokens as input and checks whether they follow the grammar rules of the language, typically defined
using context-free grammar. In short, LEX handles the tokenization phase, while YACC handles the
parsing phase, and the two integrate seamlessly—LEX sends recognized tokens to YACC, enabling a
structured and efficient compiler workflow.

Aspect          | LEX                                | YACC
Purpose         | Tokenizer (lexical analysis)       | Parser (syntax analysis)
Input           | Regular expressions (patterns)     | Context-free grammar (BNF rules)
Output          | Tokens (like ID, NUM, PLUS, etc.)  | Parse tree / syntax validation
Role            | Breaks input into tokens           | Checks token sequence against grammar
Connection      | Passes tokens to YACC via yylex()  | Calls yylex() to get the next token
Error Handling  | Detects invalid characters         | Detects syntax errors
Generated File  | lex.yy.c                           | y.tab.c (and y.tab.h)
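
The usual build pipeline for a combined LEX/YACC project reflects this division of work (the file
names calc.l and calc.y are assumed here):

yacc -d calc.y                # produces y.tab.c (parser) and y.tab.h (token codes)
lex calc.l                    # produces lex.yy.c (scanner); it typically includes y.tab.h
gcc y.tab.c lex.yy.c -o calc  # links scanner and parser into one executable
./calc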

Source Code –

%{
#include <stdio.h>
#include <stdlib.h>
%}
%%
[0-9]+ { printf("Welcome\n"); exit(0);
. { printf("Wrong\n"); exit(0); }
%%

int yywrap() {
return 1;
}

int main() {
printf("Enter:");
yylex();
return 0;
}

Output –
Program – 2
Aim – Write a program to count the number of tokens in a string.

Software Used – Flex, C Compiler (gcc)

Theory – A token refers to a meaningful unit of input that the lexical analyzer identifies during the
scanning phase. These tokens can represent keywords, operators, identifiers, literals, or symbols in
the source code. The lexer uses regular expressions to define patterns that match specific types of
tokens. When a match is found in the input stream, the corresponding token is returned to the parser,
often along with an associated value (such as the actual text or a numerical value).

For instance, the string "123" might be recognized as a NUMBER token with a value of 123. In
programs that also use tools like Yacc or Bison for parsing, tokens act as a bridge between the lexical
analyzer and the parser, helping to structure and interpret the input according to grammar rules.
Essentially, tokens are the building blocks of the syntax tree that the parser constructs from the input.
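
For example, a rule such as the following (a sketch that assumes the NUMBER token code is defined
in a Yacc-generated y.tab.h) returns a NUMBER token and passes its integer value to the parser
through yylval:

%{
#include <stdlib.h>
#include "y.tab.h"  /* assumed: defines the NUMBER token code and yylval */
%}
%%
[0-9]+ { yylval = atoi(yytext); return NUMBER; }
[ \t\n]+ ;
%%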

lex.yy.c: lex.yy.c is the C source code file generated by the Flex (or Lex) tool from a .l file. When you
run Flex on a .l file, it translates the lexical rules and patterns defined in that file into a C program,
which is saved as lex.yy.c. This file contains the implementation of a lexical analyzer that processes
input text, identifies tokens based on the defined patterns, and can be compiled into an executable
program.

yywrap(): The 'yywrap()' function in a Lex file signals the end of input processing. It is called when
'yylex()' reaches the end of input. By default, 'yywrap()' returns 1 to indicate that no more input files
are available. If you need to process multiple files, 'yywrap()' can be customized to handle additional
files or perform cleanup tasks. In most simple applications, returning 1 is sufficient to indicate that
no further processing is required.

Source code –
%{
#include <stdio.h>
#include <stdlib.h>
int tokens = 0;
%}

%%

[a-zA-Z_][a-zA-Z0-9_]* { tokens++; /* identifier or keyword */ }
[0-9]+ { tokens++; /* integer literal */ }
[\t\n ]+ ; /* skip whitespace */
. { tokens++; /* any other symbol, e.g. an operator */ }

%%

int yywrap() {
return 1;
}

int main() {
    printf("Enter: ");
    yylex();
    printf("Number of tokens: %d\n", tokens);
    return 0;
}

Output –
Program – 3
Aim – Write a calculator program that can evaluate arithmetic expressions with integers.

Software Used – Flex, C Compiler (gcc)

Theory – Arithmetic expression evaluation using Lex and Yacc is an important concept in compiler
design. Lex serves as a lexical analyzer that breaks input into tokens such as numbers, operators, and
parentheses. Yacc acts as a parser, applying grammar rules to evaluate expressions while handling
operator precedence and associativity. Together, they process arithmetic inputs like (8+2)*3, ensuring
accurate results and demonstrating the collaboration of lexical analysis and parsing.
As the expression is scanned, numbers are pushed onto the operand stack, and operators are pushed
onto the operator stack according to precedence rules. Parentheses control the order of operations. At
the end, the operators are applied in the correct order to compute the final result.
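
The lab program below uses Lex alone and handles a single binary operation. In a full Yacc-based
version, precedence and associativity are declared rather than managed with explicit stacks; a
sketch of such a grammar (token names and layout assumed) is:

%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }
%}
%token NUMBER
%left '+' '-'  /* lowest precedence, left-associative */
%left '*' '/'  /* higher precedence, left-associative */
%%
input : expr '\n' { printf("Result = %d\n", $1); }
      ;
expr  : expr '+' expr { $$ = $1 + $3; }
      | expr '-' expr { $$ = $1 - $3; }
      | expr '*' expr { $$ = $1 * $3; }
      | expr '/' expr { $$ = $1 / $3; }
      | '(' expr ')'  { $$ = $2; }
      | NUMBER        { $$ = $1; }
      ;
%%
int main() { return yyparse(); }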

Source code –
%{
#include <stdio.h>
#include <stdlib.h>

int num1, num2;  /* operands; globals are zero-initialized */
int count = 0;   /* how many numbers have been read so far */
char op;
%}

%%
[0-9]+ {
    if(count++ == 0)
        num1 = atoi(yytext);
    else
        num2 = atoi(yytext);
}

[+\-*/] { op = yytext[0]; }
\n { return 0; }
. ;
%%

int yywrap() { return 1; }

int main() {
printf("Enter : ");
yylex();
switch(op) {
case '+': printf("Result = %d\n", num1 + num2); break;
case '-': printf("Result = %d\n", num1 - num2); break;
case '*': printf("Result = %d\n", num1 * num2); break;
case '/':
if(num2 != 0) printf("Result = %d\n", num1 / num2);
else printf("Error: Division by zero\n");
break;
default:
printf("Invalid operator!\n");
}
return 0;
}

Output –
