
Unit-1

1. Differences between compiler and interpreter

Feature           | Compiler                                  | Interpreter
Execution         | Translates entire program before running  | Translates and runs program line by line
Speed             | Faster execution after compilation        | Slower, due to real-time translation
Output            | Generates an executable file              | No separate output file
Error Handling    | Displays all errors after compilation     | Stops at the first error
Example Languages | C, C++                                    | Python, JavaScript

2. What are the parts of a compiler?

●​ Lexical Analyzer
●​ Syntax Analyzer (Parser)
●​ Semantic Analyzer
●​ Intermediate Code Generator
●​ Code Optimizer
●​ Code Generator
●​ Symbol Table Manager
●​ Error Handler

3. Differences between passes and phases


Feature    | Phases                                 | Passes
Definition | Logical divisions of compilation work  | Number of times the source code is scanned
Count      | Fixed (typically 6-7)                  | Can be single or multiple
Example    | Lexical analysis, parsing, etc.        | Single-pass or multi-pass compilers

4. Is the compiler or interpreter efficient? Write reasons.

●​ Compiler is more efficient in terms of execution time because it translates the entire
program into machine code beforehand.​

●​ Interpreter is more flexible, useful for debugging and quick scripting.​

5. Do you prefer to have more passes in the compiler? Justify.

Yes, more passes allow:

● Better error checking
● More sophisticated optimizations
● Easier modular design

However, more passes may increase compilation time.

6. Front end of the compiler / language dependent phases

These phases depend on the source language's syntax and semantics:

●​ Lexical Analysis
●​ Syntax Analysis
●​ Semantic Analysis
●​ Intermediate Code Generation

7. Give examples for compiled and interpreted languages

● Compiled Languages: C, C++, Rust, Go
● Interpreted Languages: Python, JavaScript, Ruby

8. Back end of the compiler / language independent phases

These phases focus on target machine and are mostly language-independent:

● Code Optimization
● Code Generation

9. Cousins of the compiler

● Assembler – Converts assembly to machine code
● Linker – Links multiple object files
● Loader – Loads programs into memory
● Preprocessor – Processes directives (e.g., macros in C)

10. Examples of single pass and multi pass compilers and which one is efficient

● Single Pass: Pascal, early versions of C
● Multi Pass: Modern C, Java, FORTRAN
● Efficiency:
   ○ Single-pass: faster, uses less memory
   ○ Multi-pass: better error checking and optimization

11. When do you prefer a single pass compiler?

● When compilation speed is crucial (e.g., embedded systems)
● When the code size is small
● When only limited resources are available

12. When do you prefer a multi-pass compiler?

● For large, complex programs
● When high optimization is needed
● When precise error handling is required

Unit-2

1. Differences between DFA and NFA

Aspect           | DFA (Deterministic Finite Automaton) | NFA (Nondeterministic Finite Automaton)
Transitions      | One transition per input symbol      | Multiple transitions allowed
Epsilon Moves    | Not allowed                          | Allowed
Determinism      | Deterministic                        | Non-deterministic
Implementation   | Easier to implement                  | Harder to implement
Speed            | Faster due to determinism            | Slower in general
Expressive Power | Equal to NFA                         | Equal to DFA

2. What is meant by a regular set?

A regular set is a set of strings that can be described by a regular expression and recognized by
a finite automaton. Regular sets are used to define the lexical syntax of programming
languages.

3. What are tokens? Give examples.

Tokens are the smallest units of a program that have meaning. They are generated during
lexical analysis.

Examples:

● Keywords: int, while
● Identifiers: sum, count
● Operators: +, -, *
● Literals: 100, "hello"
● Symbols: ;, (, )

4. Write regular expressions for the given strings.

Let’s assume these:

●​ Integer: [0-9]+​

●​ Identifier: [a-zA-Z_][a-zA-Z0-9_]*​

●​ Floating-point: [0-9]+\.[0-9]+​

●​ Whitespace: [ \t\n]+​

●​ Operators: (\+|\-|\*|\/|==|!=|<=|>=)

5. What are the different buffering techniques? Which is best?

Buffering techniques:

● Unbuffered: read one character at a time (inefficient; one read per character).
● Single Buffer: input is read into a single buffer; a lexeme that straddles a refill boundary is awkward to handle.
● Double Buffer: two halves of a buffer are reloaded alternately, usually with sentinel characters (best, due to efficiency and reduced I/O).

Best: Double buffering – minimizes disk access delays.
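
A minimal C sketch of the sentinel-based double-buffering scheme (buffer halves of size N, each followed by a one-byte sentinel; the names load and next_char are illustrative, and text input without NUL bytes is assumed):

#include <stdio.h>

#define N 4096                       /* size of each buffer half */
static char buf[2*N + 2];            /* two halves plus one sentinel slot each */
static char *forward;                /* scanning pointer */
static FILE *src;

/* Fill one half with up to N bytes and plant a '\0' sentinel after them. */
static void load(char *half) {
    size_t n = fread(half, 1, N, src);
    half[n] = '\0';
}

static void init(FILE *f) { src = f; load(buf); forward = buf; }

/* Return the next character; when the sentinel at a half boundary is
   crossed, refill the idle half, so I/O happens once per N characters. */
static int next_char(void) {
    char c = *forward++;
    if (c != '\0') return (unsigned char)c;
    if (forward == buf + N + 1) {    /* consumed sentinel after a full first half */
        load(buf + N + 1);           /* refill the second half */
        return next_char();
    }
    if (forward == buf + 2*N + 2) {  /* consumed sentinel after a full second half */
        load(buf);                   /* refill the first half */
        forward = buf;
        return next_char();
    }
    return EOF;                      /* sentinel marked the real end of input */
}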

6. Transition Diagram
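
A typical example, written textually, is the transition diagram for identifiers (state 0 is the start state; double parentheses mark the accepting state):

start → (0) --[a-zA-Z_]--> (1)
        (1) --[a-zA-Z0-9_]--> (1)     loop on letters, digits, _
        (1) --other--> ((2))          accept; retract one character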

7. Regular definitions for the following:

●​ Whitespace: ws → [ \t\n]+
●​ Digit: d → [0-9]
●​ Integer: int → d+
●​ Float: float → d+\.d+
●​ Identifier: id → [a-zA-Z_][a-zA-Z0-9_]*
●​ Operators: op → (\+|\-|\*|\/|==|=)​
8. Lex for simple applications.

Example: Lex program to recognize identifiers and numbers.

%{
#include <stdio.h>
%}
%%
[0-9]+                  { printf("Integer: %s\n", yytext); }
[a-zA-Z_][a-zA-Z0-9_]*  { printf("Identifier: %s\n", yytext); }
[ \t\n]+                { /* ignore whitespace */ }
.                       { printf("Unknown: %s\n", yytext); }
%%
int yywrap(void) { return 1; }
int main(void) { yylex(); return 0; }
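
Assuming the program is saved as scanner.l, it can be built with flex: flex scanner.l && cc lex.yy.c -o scanner, then fed input on standard input.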

9. For the given code, write the tokens.

For code: int sum = a + b;

Tokens:

●​ int → Keyword
●​ sum → Identifier
●​ = → Operator
●​ a → Identifier
●​ + → Operator
●​ b → Identifier
●​ ; → Delimiter

10. Various error recovery strategies in lexical analysis

● Panic Mode: Skip characters until a known delimiter is found.
● Phrase-level Recovery: Replace or insert symbols to continue parsing.
● Error Productions: Use grammar rules that match common errors.
● Global Correction: Tries minimal changes to make the input valid (costly).

11. Why is lexical analysis separated from syntax analysis?

● Simplicity: Each phase focuses on a single task.
● Efficiency: Lexical analysis can be optimized separately.
● Modularity: Easier to debug and maintain.
● Reusability: The same lexer can be used with different parsers.

12. Patterns, lexemes, and tokens for the given input.

Example input: while (a < 10) a = a + 1;

Lexeme | Token      | Pattern
while  | KEYWORD    | while
(      | SYMBOL     | \(
a      | IDENTIFIER | [a-zA-Z_][a-zA-Z0-9_]*
<      | OPERATOR   | <
10     | NUMBER     | [0-9]+
)      | SYMBOL     | \)
a      | IDENTIFIER | same as above
=      | OPERATOR   | =
a      | IDENTIFIER | same as above
+      | OPERATOR   | +
1      | NUMBER     | [0-9]+
;      | SYMBOL     | ;

13. Transition table

Example for recognizing a simple identifier (id → [a-zA-Z_][a-zA-Z0-9_]*):

State | Input        | Next State
0     | [a-zA-Z_]    | 1
1     | [a-zA-Z0-9_] | 1
1     | other        | – (accept)
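
A small C sketch that simulates this table (the function name is_identifier is illustrative):

#include <ctype.h>
#include <stdbool.h>

/* State 0 moves to 1 on [a-zA-Z_]; state 1 loops on [a-zA-Z0-9_];
   any other character means there is no transition, so we reject. */
bool is_identifier(const char *s) {
    int state = 0;
    for (; *s != '\0'; s++) {
        if (state == 0 && (isalpha((unsigned char)*s) || *s == '_'))
            state = 1;
        else if (state == 1 && (isalnum((unsigned char)*s) || *s == '_'))
            ;                        /* stay in state 1 */
        else
            return false;
    }
    return state == 1;               /* accept only after at least one character */
}
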
Unit-3

1. When is a grammar said to be ambiguous? Give an example.

A grammar is ambiguous if there exists at least one string with more than one parse
tree (or leftmost/rightmost derivation).

Example:

E → E + E | E * E | id

The string id + id * id can be parsed in multiple ways due to unclear precedence, making the
grammar ambiguous.

2. Left recursion elimination

Grammar with left recursion:

A → Aα | β

Eliminated form:

A → βA'
A' → αA' | ε

Example: E → E + T | T becomes

E → T E'
E' → + T E' | ε

3. Left factoring examples

Used to make grammar suitable for predictive parsing.

Before:

S → if E then S else S | if E then S

After left factoring:

S → if E then S S'
S' → else S | ε

4. Comparison between CFG and regular expressions


Feature         | Regular Expressions | Context-Free Grammar (CFG)
Expressiveness  | Less expressive     | More expressive
Used in         | Lexical analysis    | Syntax analysis
Recognizes      | Regular languages   | Context-free languages
Handles nesting | No                  | Yes

5. Give example for LL(1) grammar


E → T E'
E' → + T E' | ε
T → id

This grammar is LL(1) as it has no left recursion and no common prefixes.

6. LL(1) grammar properties

●​ No left recursion
●​ Should be left factored
●​ Can be parsed using predictive parser
●​ Parsing table has at most one entry per cell​

7. Give example for not an LL(1) grammar


S → a A | a B
A → x
B → y

Conflict: Both a A and a B start with the same token a – not LL(1).
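
Left factoring repairs it:

S → a S'
S' → A | B

Now FIRST(A) = {x} and FIRST(B) = {y} are disjoint, so the parser can choose a production with one token of lookahead.
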
8. Operator grammar

● A grammar used for operator precedence parsing.
● No production's right side contains ε or two adjacent non-terminals.
● Precedence relations <·, ≐, and ·> are defined between terminals.

Example:

E → E + E | E * E | id

9. Drawbacks of recursive descent parser

● Cannot handle left-recursive grammars
● Backtracking may be required if the grammar is not LL(1)
● May be inefficient for complex grammars

10. Error heuristics in predictive parser

● Panic mode: skip input until a synchronizing token is found
● Error productions: add rules that match common mistakes
● Synchronizing sets: defined for each non-terminal (typically its FOLLOW set)

11. Different actions of shift reduce parser

● Shift: Move the next input symbol onto the stack
● Reduce: Replace symbols on top of the stack with a non-terminal
● Accept: Input is successfully parsed
● Error: Parsing fails due to invalid input
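
A sample trace for the input id + id, assuming the grammar E → E + T | T, T → id:

Stack    | Input     | Action
$        | id + id $ | shift
$ id     | + id $    | reduce T → id
$ T      | + id $    | reduce E → T
$ E      | + id $    | shift
$ E +    | id $      | shift
$ E + id | $         | reduce T → id
$ E + T  | $         | reduce E → E + T
$ E      | $         | accept
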
12. FIRST rules or algorithm

FIRST(X) = set of terminals that begin strings derived from X.

Rules:

● If X is a terminal: FIRST(X) = {X}
● If X → ε is a production: add ε to FIRST(X)
● If X → Y1 Y2 ... Yn: add FIRST(Y1) − {ε} to FIRST(X); if ε ∈ FIRST(Y1), continue with Y2, and so on. If ε is in every FIRST(Yi), add ε to FIRST(X).
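
Worked example, using the LL(1) grammar from question 5 (E → T E', E' → + T E' | ε, T → id):

FIRST(T) = {id}
FIRST(E') = {+, ε}
FIRST(E) = FIRST(T) = {id}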

13. FOLLOW rules or algorithm

FOLLOW(A) = set of terminals that can appear immediately to the right of A in some sentential
form.

Rules:

● If A is the start symbol: add $ to FOLLOW(A)
● If there is a rule B → αAβ: add FIRST(β) − {ε} to FOLLOW(A)
● If FIRST(β) contains ε, or β is empty: add FOLLOW(B) to FOLLOW(A)
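
Worked example, for the same grammar as in FIRST above:

FOLLOW(E) = {$}                              (E is the start symbol)
FOLLOW(E') = FOLLOW(E) = {$}                 (E' ends both E → T E' and E' → + T E')
FOLLOW(T) = (FIRST(E') − {ε}) ∪ FOLLOW(E') = {+, $}   (since E' can derive ε)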

14. Kernel items and nonkernel items

● Kernel items: the initial item [S' → ·S], plus all items whose dot is not at the left end, i.e. [A → α·β] with α ≠ ε.
● Non-kernel items: all other items, i.e. those of the form [A → ·α] with the dot at the start of the production; these are added by the closure operation.

15. Differences between top-down and bottom-up parsers
Feature      | Top-Down Parsing         | Bottom-Up Parsing
Start Symbol | Starts from start symbol | Starts from the input
Derivation   | Leftmost derivation      | Rightmost derivation (in reverse)
Examples     | LL(1), recursive descent | LR(0), SLR, LALR, CLR
Complexity   | Simpler                  | More powerful, more complex

16. Which is the powerful parser?

Bottom-up parsers (LR parsers) are more powerful than top-down parsers because:

● They can parse a larger class of grammars.
● They handle left recursion directly, and parser generators built on them can resolve ambiguity with precedence rules.

17. How will you implement precedence relations?

Using operator precedence parsing, define:

● Build a precedence table with the relations <·, ≐, and ·> between operators.
● The parser compares the top terminal of the stack with the lookahead and uses the relation to decide whether to shift or reduce.

18. When will a predictive parser encounter an error?

●​ If the current token does not match the entry in the parsing table
●​ Or the entry is empty, meaning no rule applies
19. Can we construct a parser using ambiguous grammar? Justify.

Technically yes, but:

●​ It’s not advisable because it leads to multiple parse trees, making syntax analysis and
semantic interpretation unclear.
●​ Parser generators like YACC can resolve ambiguity with precedence and associativity
rules.

20. What is YACC?

YACC (Yet Another Compiler Compiler) is a tool used to:

● Generate LALR(1) parsers
● Accept a grammar specification and produce C code for the parser
● Often used together with LEX for lexical analysis

Unit IV

1. Types of Attributes

Attributes are used in syntax-directed translation.

●​ Synthesized Attributes: Computed from the children nodes and passed upwards in the
parse tree.
●​ Inherited Attributes: Passed downwards or across siblings in the parse tree.​

2. Types of Intermediate Languages

1. Three-Address Code (TAC)
   Example: t1 = a + b
2. Quadruples
   Format: (op, arg1, arg2, result)
   Example: (+, a, b, t1)
3. Triples
   Format: (op, arg1, arg2), where the result is implied by the position
   Example: (+, a, b)
4. Indirect Triples
   Uses a list of pointers to triples, so instructions can be reordered without renumbering.

3. What are the types of Intermediate Representations? Which is the best one?

Types:

● Syntax Trees
● Postfix Notation
● Three-Address Code
● Quadruples
● Triples

Best: Three-Address Code (TAC)
Reason: Easy to translate into machine code and to analyze for optimization.
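
For example, a = b + c * d in TAC:

t1 = c * d
t2 = b + t1
a = t2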

4. What is meant by Backpatching?

Backpatching is the process of filling in jump addresses (like in if or while statements) later
when they become known.

Used in code generation for boolean expressions and control flow.
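
A sketch (instruction numbers are illustrative): while translating if a < b then x = 1, the exit target is still unknown when the jumps are emitted:

100: if a < b goto 102
101: goto ___          (target left blank)
102: x = 1

When the end of the statement is reached, the hole at 101 is backpatched with 103.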

5. Different Storage Allocation Strategies

● Static Allocation: Memory assigned at compile time.
● Stack Allocation: Uses a runtime stack for local variables.
● Heap Allocation: For dynamically allocated memory (e.g., malloc).

The stack is used for function calls; the heap is for objects/data with unknown lifetimes.
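
A small C illustration of the three strategies:

#include <stdlib.h>

int g;                            /* static allocation: address fixed at compile time */

void f(void) {
    int local = 0;                /* stack allocation: freed when f returns */
    int *p = malloc(sizeof *p);   /* heap allocation: lives until free() */
    *p = local + g;
    free(p);
}
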
Unit V

1. DAG Construction for the Given Expression

DAG (Directed Acyclic Graph) helps detect:

●​ Common subexpressions
●​ Dead code
●​ Reordering of computations

For a = b + c + b + c, DAG avoids recomputing b + c.
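
Assuming the expression is grouped as (b + c) + (b + c), the DAG shares one node for b + c and the generated code computes it once:

t1 = b + c
a = t1 + t1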

2. Next Use Information

It tells where a variable will be used next. Helps in:

●​ Register allocation
●​ Dead code elimination

Each instruction keeps track of:

●​ Live/dead status
●​ Next use position​

3. Structure Preserving Optimizations

These transform a basic block without changing the set of expressions it computes.

Examples:

● Common subexpression elimination
● Dead code elimination
● Renaming of temporary variables
● Interchange of two independent adjacent statements

4. Function Preserving Optimizations

These maintain functional behavior but may change structure.

Examples:

●​ Constant folding
●​ Strength reduction
●​ Dead code elimination
●​ Copy propagation​

5. How to Calculate the Cost of an Instruction?

Based on:

● Number of machine cycles taken
● Instruction type and addressing modes (register vs memory operands)
● Target architecture
● Register usage and number of memory accesses

Example: for a = b + c, the cost depends on whether b and c are held in registers or in memory.

6. Equation for Data Flow Analysis

General equations for reaching definitions:

OUT[B] = GEN[B] ∪ (IN[B] − KILL[B])
IN[B] = ⋃ OUT[P] over all predecessors P of B

Used in:

● Live variable analysis
● Available expressions
● Reaching definitions
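
A small worked instance for reaching definitions: if block B contains the definition d1: x = 1 (GEN[B] = {d1}) and a predecessor supplies d2: x = 2 (IN[B] = {d2}, KILL[B] = {d2}), then OUT[B] = {d1} ∪ ({d2} − {d2}) = {d1}.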

7. What is Copy Propagation?

It replaces occurrences of variables with the values they were copied from.

Example:

x = y
z = x + 1

Becomes:

z = y + 1

8. Common Subexpression Elimination

Avoids recomputation of expressions that are computed earlier and still valid.

Before:

t1 = a + b
t2 = a + b

After:

t1 = a + b
t2 = t1

9. Strength Reduction

Replaces expensive operations with cheaper ones.

Example:

x = i * 2 → x = i + i
x = a^2 → x = a * a
10. Code Optimization Techniques

●​ Constant folding
●​ Dead code elimination
●​ Common subexpression elimination
●​ Loop unrolling
●​ Strength reduction
●​ Copy propagation
●​ Peephole optimization​

11. Loop Optimization Techniques

● Loop Invariant Code Motion: Move calculations that do not change across iterations outside the loop (see the sketch below).
● Loop Unrolling: Reduce the number of iterations and loop-control overhead.
● Induction Variable Elimination
● Strength Reduction in Loops
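
A C sketch of loop-invariant code motion (function and variable names are illustrative):

/* Before: x * y is recomputed on every iteration. */
void fill(int *a, int n, int x, int y) {
    for (int i = 0; i < n; i++)
        a[i] = x * y + i;
}

/* After: the invariant product is hoisted out of the loop. */
void fill_hoisted(int *a, int n, int x, int y) {
    int t = x * y;                /* computed once */
    for (int i = 0; i < n; i++)
        a[i] = t + i;
}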
