Practical File: Computer Network and Security
Submitted By
Tsewang Norboo
UE223109
Submitted To
Nitin Bhatia
Faculty, Computer Science and Engineering
Experiment No. 1
Aim : To implement a Lexical Analyzer that converts source code into tokens.
Introduction
A Lexical Analyzer (Lexer) is the first phase of a compiler. It processes the source code and converts it into
tokens. A token is the smallest unit of a program that has meaning, such as keywords, identifiers, operators,
constants, and symbols.
The Lexical Analyzer works as a scanner that reads the input character stream and groups them into
meaningful sequences, which are then passed to the Syntax Analyzer (Parser) for further processing.
Code :
import re

token_patterns = [
    (r'\b(if|else|while|for|return|int|float|char)\b', 'KEYWORD'),
    (r'\b[A-Za-z_][A-Za-z0-9_]*\b', 'IDENTIFIER'),
    (r'\d+', 'NUMBER'),
    (r'[+\-*/=<>!]+', 'OPERATOR'),
    (r'[(){},;]', 'SYMBOL'),
    (r'\s+', None),                     # whitespace: matched but not emitted
]

def lexical_analyzer(code):
    tokens = []
    index = 0
    while index < len(code):
        match_found = False
        for pattern, token_type in token_patterns:
            regex = re.compile(pattern)
            match = regex.match(code, index)
            if match:
                if token_type:          # None means skip the lexeme
                    tokens.append((match.group(), token_type))
                index = match.end()
                match_found = True
                break
        if not match_found:
            raise ValueError(f"Illegal character: {code[index]!r}")
    return tokens

def print_token_table(tokens):
    print("Lexeme\tToken")
    print("-" * 20)
    for lexeme, token in tokens:
        print(f"{lexeme}\t{token}")

code_snippet = "int x = 10; if (x > 5) return x;"
tokens = lexical_analyzer(code_snippet)
print_token_table(tokens)
Output :
Conclusion
• Lexical Analysis is the first step of compilation.
• It converts source code into tokens.
• Lex simplifies the implementation of a Lexical Analyzer by automatically generating token
recognition code.
Experiment No. 2
Aim : Given a regular expression as input, obtain the corresponding NFA with null-moves.
Introduction
An NFA with null-moves (ε-NFA) is a finite automaton that may change state without consuming any input symbol. Thompson's construction converts a regular expression into such an automaton: it builds small NFAs for individual symbols and combines them with ε-transitions to represent concatenation, alternation, and repetition. The resulting ε-NFA recognizes exactly the language of the regular expression and can later be converted to a DFA for efficient matching.
Code :
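A minimal Python sketch of Thompson's construction illustrates the idea: it builds an ε-NFA bottom-up from a postfix regular expression, using '.' for concatenation, '|' for alternation, and '*' for closure. The `NFA` class, the state numbering, and the `eps` label are assumptions of this sketch.

```python
# Thompson's construction: build an epsilon-NFA from a postfix regular
# expression ('.' = concatenation, '|' = alternation, '*' = closure).
EPS = 'eps'  # label used for a null-move

class NFA:
    def __init__(self, start, accept, transitions):
        self.start, self.accept = start, accept
        self.transitions = transitions  # list of (state, symbol, state)

counter = 0
def new_state():
    global counter
    counter += 1
    return counter

def postfix_to_nfa(postfix):
    stack = []
    for ch in postfix:
        if ch == '.':                      # concatenation: link accept(a) -> start(b)
            b, a = stack.pop(), stack.pop()
            trans = a.transitions + b.transitions + [(a.accept, EPS, b.start)]
            stack.append(NFA(a.start, b.accept, trans))
        elif ch == '|':                    # alternation: new start/accept, 4 eps-moves
            b, a = stack.pop(), stack.pop()
            s, f = new_state(), new_state()
            trans = a.transitions + b.transitions + [
                (s, EPS, a.start), (s, EPS, b.start),
                (a.accept, EPS, f), (b.accept, EPS, f)]
            stack.append(NFA(s, f, trans))
        elif ch == '*':                    # Kleene closure: loop plus bypass
            a = stack.pop()
            s, f = new_state(), new_state()
            trans = a.transitions + [
                (s, EPS, a.start), (s, EPS, f),
                (a.accept, EPS, a.start), (a.accept, EPS, f)]
            stack.append(NFA(s, f, trans))
        else:                              # literal input symbol
            s, f = new_state(), new_state()
            stack.append(NFA(s, f, [(s, ch, f)]))
    return stack.pop()

nfa = postfix_to_nfa('ab.b|*')  # the regex (ab | b)*
print("start:", nfa.start, "accept:", nfa.accept)
for t in nfa.transitions:
    print(t)
```

Each operator adds only ε-transitions, so the structure of the regular expression maps directly onto the null-moves of the automaton.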
Output :
Conclusion
By converting a regular expression into an NFA with null-moves (ε-moves), we effectively represent its
transitions and operations like concatenation, alternation, and repetition. This forms the foundation for
lexical analysis and pattern matching, enabling efficient string processing in compilers and search
algorithms.
Experiment No. 3
Introduction to LEX
LEX is a lexical analyzer generator that is used to create scanners or tokenizers for processing structured
text. It is widely used in compiler construction for tokenizing input based on predefined patterns.
Advantages of LEX
• Automates lexical analysis in compiler design.
• Uses regular expressions for efficient pattern matching.
• Generates optimized C code, which can be compiled with GCC.
• Helps in preprocessing, filtering, and data extraction.
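As an illustration, a small LEX specification in this style classifies numbers and words on standard input (the token classes and counters are assumptions for this sketch); it is built with `lex scanner.l && cc lex.yy.c -o scanner`:

```lex
%{
#include <stdio.h>
int words = 0, numbers = 0;
%}
%%
[0-9]+      { numbers++; printf("NUMBER: %s\n", yytext); }
[A-Za-z]+   { words++;   printf("WORD: %s\n", yytext); }
[ \t\n]     ;   /* skip whitespace */
.           ;   /* ignore any other character */
%%
int main(void) { yylex(); printf("%d words, %d numbers\n", words, numbers); return 0; }
int yywrap(void) { return 1; }
```

LEX translates each pattern into token-recognition C code, so only the regular expressions and their actions need to be written by hand.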
Conclusion
LEX programs play a crucial role in text processing and compiler construction. By defining tokenizing
rules, they help automate text analysis, syntax highlighting, and input filtering in programming
languages and tools.
Experiment No. 4
Introduction
Parsing is a fundamental process in compiler design and language processing, converting source code into a
structured representation for analysis. One of the simplest parsing techniques is Recursive Descent Parsing,
which consists of a set of recursive functions for each non-terminal in the grammar.
Recursive Descent Parsing is a top-down parsing method where each function attempts to match input
tokens with a production rule in a recursive manner. It is commonly used for simple and unambiguous
grammars such as LL(1) grammars, where parsing decisions are made using only one lookahead token.
Theory
Recursive Descent Parsing operates by implementing a function for each non-terminal in a given grammar.
The parser follows these principles:
1. Top-down parsing: It starts from the start symbol and recursively expands non-terminals using
production rules.
2. One-token lookahead: It makes decisions based on the next available token in the input.
3. Backtracking avoidance: LL(1) grammars ensure that parsing choices are deterministic without
needing to backtrack.
Algorithm
1. Define a function for each non-terminal in the grammar.
2. Read the input token and match it with expected grammar rules.
3. Recursively call functions for corresponding non-terminals.
4. If an input mismatch occurs, return an error.
5. If the input is parsed successfully, return a valid parse tree.
Code :
#include <stdio.h>

#define SUCCESS 1
#define FAILED  0

/* Grammar (after removing left recursion):
   E  -> T E'        E' -> + T E' | $
   T  -> F T'        T' -> * F T' | $
   F  -> ( E ) | i                        */
int E(void), Edash(void), T(void), Tdash(void), F(void);

const char *cursor;
char string[64];

int E(void)                       /* E -> T E' */
{
    return T() && Edash() ? SUCCESS : FAILED;
}

int Edash(void)                   /* E' -> + T E' | epsilon */
{
    if (*cursor == '+') {
        cursor++;
        return T() && Edash() ? SUCCESS : FAILED;
    }
    return SUCCESS;               /* epsilon move */
}

int T(void)                       /* T -> F T' */
{
    return F() && Tdash() ? SUCCESS : FAILED;
}

int Tdash(void)                   /* T' -> * F T' | epsilon */
{
    if (*cursor == '*') {
        cursor++;
        return F() && Tdash() ? SUCCESS : FAILED;
    }
    return SUCCESS;               /* epsilon move */
}

int F(void)                       /* F -> ( E ) | i */
{
    if (*cursor == '(') {
        cursor++;
        if (E() && *cursor == ')') {
            cursor++;
            return SUCCESS;
        }
        return FAILED;
    }
    if (*cursor == 'i') {
        cursor++;
        return SUCCESS;
    }
    return FAILED;
}

int main(void)
{
    printf("Enter the string to parse (use 'i' for id): ");
    scanf("%63s", string);
    cursor = string;
    puts("--------------------------------");
    if (E() && *cursor == '\0')
        puts("String is successfully parsed");
    else
        puts("Error in parsing string");
    puts("--------------------------------");
    return 0;
}
Output :
Conclusion
Recursive Descent Parsing is a simple and effective method for parsing deterministic, context-free grammars
such as LL(1) grammars. This implementation demonstrates parsing arithmetic expressions using recursive
function calls while maintaining operator precedence. Although it is limited in handling ambiguous or left-
recursive grammars, it is widely used in lightweight language parsers and compilers.
Further enhancements can include:
• Support for more operators and complex expressions.
• Implementing a symbol table for variable recognition.
• Improving efficiency by eliminating unnecessary recursion.
Experiment No. 5
Introduction:
In compiler design, syntax analysis is a crucial phase where a parser verifies the syntactical structure of the
source code. For top-down parsing, FIRST and FOLLOW sets are foundational concepts that help in
constructing parsing tables and developing LL(1) parsers.
Objective:
To implement:
1. FIRST set for a given grammar
2. FOLLOW set for a given grammar
3. LL(1) Parsing Table and simulate LL(1) parsing
Theory:
FIRST Set:
The FIRST set of a symbol is the set of terminals that begin the strings derivable from that symbol.
Rules:
• If X → aα, then a ∈ FIRST(X)
• If X → ε, then ε ∈ FIRST(X)
• If X → Y₁Y₂…Yn, and ε ∈ FIRST(Y₁), …, FIRST(Yk), then add FIRST(Yk+1) (excluding ε) to
FIRST(X)
• If all Yi can derive ε, then ε ∈ FIRST(X)
FOLLOW Set:
The FOLLOW set of a non-terminal A is the set of terminals that can appear immediately to the right of A in
some sentential form.
Rules:
• Place $ in FOLLOW(S) where S is the start symbol
• If A → αBβ, then everything in FIRST(β) (except ε) is in FOLLOW(B)
• If A → αB or A → αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in
FOLLOW(B)
LL(1) Parsing Table:
An LL(1) parser uses a parsing table to decide which production to use, based on the current input symbol
and top of the stack.
• For each production A → α:
o For each terminal a in FIRST(α), add A → α to table[A][a]
o If ε ∈ FIRST(α), then for each b in FOLLOW(A), add A → α to table[A][b]
Algorithm:
FIRST and FOLLOW:
1. Input the grammar and initialize FIRST and FOLLOW sets as empty.
2. Recursively calculate FIRST sets for each non-terminal.
3. After FIRST is complete, compute FOLLOW sets using rules until no more additions occur.
LL(1) Parsing:
1. Use FIRST and FOLLOW sets to construct the parsing table.
2. Simulate parsing with a stack:
o Initialize stack with $ and start symbol.
o At each step, check the top of stack and current input symbol.
o Use the parsing table to apply a production, or report error.
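The fixed-point computation in steps 1–2 can be sketched in Python for the expression grammar E → T E′, E′ → + T E′ | ε, T → F T′, T′ → * F T′ | ε, F → ( E ) | id. The encoding is an assumption of this sketch: X and Y stand for E′ and T′, '#' for ε, and 'i' for id.

```python
# Fixed-point computation of FIRST sets for a toy LL(1) grammar.
# Uppercase letters are non-terminals; '#' denotes epsilon.
grammar = {
    'E': ['TX'],          # E  -> T E'
    'X': ['+TX', '#'],    # E' -> + T E' | eps
    'T': ['FY'],          # T  -> F T'
    'Y': ['*FY', '#'],    # T' -> * F T' | eps
    'F': ['(E)', 'i'],    # F  -> ( E ) | id
}

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                       # iterate until no set grows
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                before = len(first[nt])
                if prod == '#':
                    first[nt].add('#')
                else:
                    for ch in prod:
                        if ch not in grammar:        # terminal begins the string
                            first[nt].add(ch)
                            break
                        first[nt] |= first[ch] - {'#'}
                        if '#' not in first[ch]:     # ch cannot vanish; stop here
                            break
                    else:                            # every symbol derived epsilon
                        first[nt].add('#')
                if len(first[nt]) != before:
                    changed = True
    return first

first = compute_first(grammar)
for nt in sorted(first):
    print(nt, '->', sorted(first[nt]))
```

For this grammar the computation yields FIRST(E) = FIRST(T) = FIRST(F) = { (, i } and FIRST(E′) = { +, # }, FIRST(T′) = { *, # }, matching the rules above.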
Code:
#include <iostream>
#include <map>
#include <set>
#include <vector>
#include <string>
using namespace std;

map<char, vector<string>> grammar;        // non-terminal -> productions ('#' = epsilon)
map<char, set<char>> firstSet, followSet;
map<pair<char, char>, string> parsingTable;
char startSymbol;

bool isTerminal(char c) {
    return !(c >= 'A' && c <= 'Z') || c == '#';
}

void computeFirst(char symbol) {
    if (firstSet.count(symbol)) return;              // already computed
    set<char> result;
    for (const string& prod : grammar[symbol]) {
        if (prod == "#") { result.insert('#'); continue; }
        bool allEpsilon = true;
        for (char ch : prod) {
            if (isTerminal(ch)) { result.insert(ch); allEpsilon = false; break; }
            computeFirst(ch);
            for (char f : firstSet[ch])
                if (f != '#') result.insert(f);
            if (!firstSet[ch].count('#')) { allEpsilon = false; break; }
        }
        if (allEpsilon) result.insert('#');          // every symbol can derive epsilon
    }
    firstSet[symbol] = result;
}

void computeFollow(char symbol) {
    if (followSet.count(symbol)) return;             // already computed
    set<char>& follow = followSet[symbol];
    if (symbol == startSymbol) follow.insert('$');
    for (auto& rule : grammar) {
        char lhs = rule.first;
        for (const string& prod : rule.second) {
            for (size_t i = 0; i < prod.size(); i++) {
                if (prod[i] != symbol) continue;
                bool addFollowOfLhs = true;
                for (size_t j = i + 1; j < prod.size(); j++) {
                    char next = prod[j];
                    if (isTerminal(next)) { follow.insert(next); addFollowOfLhs = false; break; }
                    for (char f : firstSet[next])    // FIRST(beta) minus epsilon
                        if (f != '#') follow.insert(f);
                    if (!firstSet[next].count('#')) { addFollowOfLhs = false; break; }
                }
                if (addFollowOfLhs && lhs != symbol) {   // A -> alpha B, or beta =>* epsilon
                    computeFollow(lhs);
                    follow.insert(followSet[lhs].begin(), followSet[lhs].end());
                }
            }
        }
    }
}

void buildParsingTable() {
    for (auto& rule : grammar) {
        char lhs = rule.first;
        for (const string& prod : rule.second) {
            set<char> firstProd;                     // FIRST of this production
            bool allEpsilon = true;
            for (char ch : prod) {
                if (ch == '#') break;
                if (isTerminal(ch)) { firstProd.insert(ch); allEpsilon = false; break; }
                for (char f : firstSet[ch])
                    if (f != '#') firstProd.insert(f);
                if (!firstSet[ch].count('#')) { allEpsilon = false; break; }
            }
            if (allEpsilon)                          // epsilon in FIRST: use FOLLOW(lhs)
                firstProd.insert(followSet[lhs].begin(), followSet[lhs].end());
            for (char t : firstProd)
                parsingTable[{lhs, t}] = prod;
        }
    }
}

void parseInput(string input) {
    input += '$';
    vector<char> stk = {'$', startSymbol};
    size_t i = 0;
    cout << "\nParsing Steps:\n";
    cout << "Stack\t\tInput\t\tAction\n";
    while (!stk.empty()) {
        string stackStr(stk.begin(), stk.end());
        cout << stackStr << "\t\t" << input.substr(i) << "\t\t";
        char top = stk.back();
        char curr = input[i];
        if (top == curr) {                           // match terminal (accept on '$')
            stk.pop_back(); i++;
            cout << (top == '$' ? "Accept" : "Match") << "\n";
            if (top == '$') return;
        } else if (isTerminal(top) || !parsingTable.count({top, curr})) {
            cout << "Error\n";
            return;
        } else {                                     // expand by the table entry
            string prod = parsingTable[{top, curr}];
            cout << top << " -> " << prod << "\n";
            stk.pop_back();
            if (prod != "#")
                for (auto it = prod.rbegin(); it != prod.rend(); ++it)
                    stk.push_back(*it);
        }
    }
}

int main() {
    int n;
    cout << "Enter number of productions: ";
    cin >> n;
    cout << "Enter grammar (Format: A->xyz | A-># for epsilon):\n";
    for (int i = 0; i < n; i++) {
        string line;
        cin >> line;
        char lhs = line[0];
        if (i == 0) startSymbol = lhs;
        grammar[lhs].push_back(line.substr(3));      // text after "A->"
    }
    for (auto& rule : grammar) computeFirst(rule.first);
    for (auto& rule : grammar) computeFollow(rule.first);
    buildParsingTable();
    string input;
    cout << "Enter input string to parse: ";
    cin >> input;
    parseInput(input);
    return 0;
}
Output :
Conclusion :
This practical demonstrates how FIRST and FOLLOW sets guide the construction of LL(1) parsing tables.
LL(1) parsers are simple and efficient for grammars that satisfy the LL(1) condition (no ambiguity and no left recursion).
Experiment No. 6
Theory:
• Bottom-Up Parsing is also called shift-reduce parsing. It attempts to construct a parse tree for an
input string starting from the leaves (input) and working up to the root (start symbol).
• In Operator Precedence Parsing, parsing decisions are made by comparing precedence relations
between terminals using a precedence table.
• The parser uses three precedence relations:
o E1 < E2 (Shift): Push E2 to stack.
o E1 > E2 (Reduce): Apply production to reduce.
o E1 = E2 (Match): Used mainly for parentheses.
• Grammar used:
E → E + E | E * E | (E) | id
• We use a precedence table that defines the relations between +, *, id, (, ), and $.
Algorithm:
1. Input the expression ending with $.
2. Replace all id with i (for simplification).
3. Initialize a stack with $.
4. Loop through the input:
o Compare the top of stack with current input symbol using precedence table.
o If relation is < or =, perform Shift.
o If relation is >, perform Reduce.
o If input is fully consumed and stack has only start symbol, Accept.
5. Report errors for invalid sequences.
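The loop in step 4 can be sketched in Python. This is a simplified version of the table-driven decisions: on '>' it reduces either a single terminal such as i, or a whole parenthesized unit, rather than performing full handle finding.

```python
# Operator precedence parsing for E -> E+E | E*E | (E) | id (id written 'i').
# Table entries: '<' = shift, '>' = reduce, '=' = match; '$' marks both ends.
prec = {
    ('$','i'):'<', ('$','+'):'<', ('$','*'):'<', ('$','('):'<',
    ('i','+'):'>', ('i','*'):'>', ('i',')'):'>', ('i','$'):'>',
    ('+','i'):'<', ('+','+'):'>', ('+','*'):'<', ('+','('):'<', ('+',')'):'>', ('+','$'):'>',
    ('*','i'):'<', ('*','+'):'>', ('*','*'):'>', ('*','('):'<', ('*',')'):'>', ('*','$'):'>',
    ('(','i'):'<', ('(','+'):'<', ('(','*'):'<', ('(','('):'<', ('(',')'):'=',
    (')','+'):'>', (')','*'):'>', (')',')'):'>', (')','$'):'>',
}

def parse(tokens):
    """Return True if tokens (a list ending in '$') are accepted."""
    stack = ['$']
    i = 0
    while i < len(tokens):
        top, curr = stack[-1], tokens[i]
        if top == '$' and curr == '$':
            return True                  # input consumed, stack emptied: Accept
        rel = prec.get((top, curr))
        if rel in ('<', '='):            # Shift
            stack.append(curr)
            i += 1
        elif rel == '>':                 # Reduce the handle on top of the stack
            popped = stack.pop()
            if popped == ')':            # reduce ( E ) as one unit
                while stack[-1] != '(':
                    stack.pop()
                stack.pop()
        else:
            return False                 # no relation defined: error
    return False

print(parse(list('i+i*i$')))
print(parse(list('(i+i$')))
```

On i+i*i$ the parser shifts while precedence rises and reduces when it falls, accepting the string; on the unbalanced (i+i$ it reaches a pair with no defined relation and reports an error.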
Code:
#include <iostream>
#include <map>
#include <stack>
#include <string>
using namespace std;

// Operator precedence table: precedence[a][b] is the relation between
// the terminal a on top of the stack and the lookahead b.
map<char, map<char, char>> precedence = {
    {'+', {{'+','>'}, {'*','<'}, {'i','<'}, {'(','<'}, {')','>'}, {'$','>'}}},
    {'*', {{'+','>'}, {'*','>'}, {'i','<'}, {'(','<'}, {')','>'}, {'$','>'}}},
    {'(', {{'+','<'}, {'*','<'}, {'i','<'}, {'(','<'}, {')','='}}},
    {')', {{'+','>'}, {'*','>'}, {')','>'}, {'$','>'}}},
    {'i', {{'+','>'}, {'*','>'}, {')','>'}, {'$','>'}}},
    {'$', {{'+','<'}, {'*','<'}, {'i','<'}, {'(','<'}}}
};

int main() {
    string input;
    cout << "Enter expression (use 'i' for id, end with $): ";
    cin >> input;
    stack<char> st;
    st.push('$');
    size_t i = 0;
    while (i < input.size()) {
        // Display current stack and remaining input
        stack<char> temp = st;
        string stack_content;
        while (!temp.empty()) { stack_content = string(1, temp.top()) + stack_content; temp.pop(); }
        cout << stack_content << "\t" << input.substr(i) << "\t";
        char top = st.top(), curr = input[i];
        if (top == '$' && curr == '$') { cout << "Accept\n"; return 0; }
        char rel = precedence[top].count(curr) ? precedence[top][curr] : 'e';
        if (rel == '<' || rel == '=') {          // Shift
            st.push(curr); i++;
            cout << "Shift\n";
        } else if (rel == '>') {                 // Reduce
            char popped = st.top(); st.pop();
            if (popped == ')') {                 // reduce ( E ) as one handle
                while (st.top() != '(' && st.top() != '$') st.pop();
                if (st.top() == '(') st.pop();
            }
            cout << "Reduce\n";
        } else {
            cout << "Error\n"; return 1;
        }
    }
    cout << "Error: input must end with $\n";
    return 1;
}
Output :
Conclusion : The Operator Precedence Parser successfully parses arithmetic expressions using shift-reduce
operations guided by a precedence table. It is efficient for operator grammars and resolves the grammar's
ambiguity through the defined precedence relations.
Experiment No. 7
Aim : To implement an LR (shift-reduce) parser for a simplified grammar using ACTION and GOTO tables.
Code :
#include <iostream>
#include <map>
#include <stack>
#include <string>
using namespace std;

// Grammar: S -> A B ; A -> a ; B -> b
map<int, map<char, string>> ACTION;
map<int, map<char, int>> GOTO;

void buildParsingTable() {
    ACTION[0]['a'] = "S2";              // shift 'a', go to state 2
    ACTION[2]['b'] = "Reduce A -> a";   // A -> a .
    ACTION[3]['b'] = "S4";              // shift 'b', go to state 4
    ACTION[4]['$'] = "Reduce B -> b";   // B -> b .
    ACTION[5]['$'] = "Reduce S -> AB";  // S -> A B .
    ACTION[1]['$'] = "Accept";
    GOTO[0]['A'] = 3;
    GOTO[3]['B'] = 5;
    GOTO[0]['S'] = 1;
}

void parseInput(string input) {
    stack<int> stateStack;
    stack<char> symbolStack;
    stateStack.push(0);
    input += '$';
    int i = 0;
    while (true) {
        int state = stateStack.top();
        char curr = input[i];
        if (ACTION[state].find(curr) == ACTION[state].end()) {
            cout << "Error: No action for state " << state << " and symbol " << curr << endl;
            break;
        }
        string action = ACTION[state][curr];
        cout << "State: " << state << ", Input: " << input.substr(i) << ", Action: " << action << endl;
        if (action == "Accept") break;
        if (action[0] == 'S') {                  // shift
            stateStack.push(action[1] - '0');
            symbolStack.push(curr);
            i++;
        } else {                                 // reduce by "X -> rhs"
            string prod = action.substr(7);      // e.g. "A -> a"
            char lhs = prod[0];
            int popCount = (int)prod.size() - 5; // length of the right-hand side
            for (int k = 0; k < popCount; k++) { stateStack.pop(); symbolStack.pop(); }
            symbolStack.push(lhs);
            stateStack.push(GOTO[stateStack.top()][lhs]);
        }
    }
}

int main() {
    buildParsingTable();
    string input;
    cout << "Enter input string over {a, b}: ";
    cin >> input;
    parseInput(input);
    return 0;
}
Output :
Conclusion : In this experiment, we successfully implemented an LR parser in C++ for a simplified grammar. The parser
correctly processes input strings by shifting and reducing according to ACTION and GOTO tables. LR parsing is efficient and
suitable for real-world programming language syntax analysis.
Experiment No. 8
For this experiment, we’ll evaluate arithmetic expressions (like 3+5*2) using syntax-directed definitions during parsing. The
translation is done alongside parsing using stack-based evaluation.
E→E+T
E→T
T→T*F
T→F
F → (E)
F → num
Algorithm:
1. Read the input expression.
2. Convert the expression into postfix (or parse using stack with translation rules).
3. Use a stack to simulate semantic actions (e.g., evaluating operations).
4. When reducing by a rule, apply the semantic action:
o For example, E → E + T: compute E.val = E1.val + T.val.
5. Continue parsing until the entire expression is evaluated.
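The semantic actions in step 4 can also be realized by recursive descent over the left-recursion-free form of the grammar above: each parsing function returns its nonterminal's synthesized value attribute. Only + and * are handled here, matching the given grammar; the helper names are assumptions of this sketch.

```python
# Recursive-descent evaluation of:  E -> T { + T } ; T -> F { * F } ;
# F -> ( E ) | num.  Each function returns the computed value (E.val etc.).
import re

def evaluate(expr):
    tokens = re.findall(r'\d+|[+*()]', expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        t = tokens[pos]
        pos += 1
        return t

    def E():              # E -> T { + T } : E.val = E1.val + T.val
        val = T()
        while peek() == '+':
            eat()
            val += T()
        return val

    def T():              # T -> F { * F } : T.val = T1.val * F.val
        val = F()
        while peek() == '*':
            eat()
            val *= F()
        return val

    def F():              # F -> ( E ) | num
        if peek() == '(':
            eat()
            val = E()
            eat()         # consume ')'
            return val
        return int(eat())

    return E()

print(evaluate('3+5*2'))      # -> 13
print(evaluate('(3+5)*2'))    # -> 16
```

Because T is expanded before the surrounding E consumes '+', multiplication binds tighter than addition, exactly as in the grammar.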
Code:
#include <iostream>
#include <stack>
#include <string>
#include <cctype>
#include <stdexcept>
using namespace std;

int precedence(char op) {
    if (op == '+' || op == '-') return 1;
    if (op == '*' || op == '/') return 2;
    return 0;
}

int applyOp(int a, int b, char op) {
    switch (op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': if (b == 0) throw runtime_error("Division by zero");
                  return a / b;
    }
    return 0;
}

// Evaluate an infix expression using SDT logic: every "reduce" applies the
// rule's semantic action (compute the value) on a stack of attribute values.
int evaluate(const string& expr) {
    stack<int> values;   // holds E.val, T.val, F.val attributes
    stack<char> ops;
    for (size_t i = 0; i < expr.length(); i++) {
        if (expr[i] == ' ') continue;
        if (isdigit(expr[i])) {              // F -> num : push num.val
            int val = 0;
            while (i < expr.length() && isdigit(expr[i])) {
                val = val * 10 + (expr[i] - '0');
                i++;
            }
            values.push(val);
            i--;                             // step back; the loop will advance
        } else if (expr[i] == '(') {         // open a subexpression
            ops.push(expr[i]);
        } else if (expr[i] == ')') {         // F -> ( E ) : reduce until '('
            while (!ops.empty() && ops.top() != '(') {
                int val2 = values.top(); values.pop();
                int val1 = values.top(); values.pop();
                char op = ops.top(); ops.pop();
                values.push(applyOp(val1, val2, op));
            }
            ops.pop();                       // discard '('
        } else {                             // operator: reduce higher precedence first
            while (!ops.empty() && precedence(ops.top()) >= precedence(expr[i])) {
                int val2 = values.top(); values.pop();
                int val1 = values.top(); values.pop();
                char op = ops.top(); ops.pop();
                values.push(applyOp(val1, val2, op));
            }
            ops.push(expr[i]);
        }
    }
    while (!ops.empty()) {                   // apply the remaining operators
        int val2 = values.top(); values.pop();
        int val1 = values.top(); values.pop();
        char op = ops.top(); ops.pop();
        values.push(applyOp(val1, val2, op));
    }
    return values.top();
}

int main() {
    string expression;
    cout << "Enter arithmetic expression: ";
    getline(cin, expression);
    try {
        cout << "Result: " << evaluate(expression) << endl;
    } catch (const exception& e) {
        cerr << "Error: " << e.what() << endl;
    }
    return 0;
}
Output :
Conclusion:
In this experiment, we successfully implemented syntax-directed translation for arithmetic expressions. The program evaluates the
expression by applying semantic rules during parsing. This technique is essential in compilers for generating intermediate code
and performing computations during syntax analysis.