DEPARTMENT OF COMPUTER ENGINEERING
EXPERIMENT NO: - 07
ROLL NO:
Aim: Program to implement 2 Pass Assembler.
Theory:
An assembler is a program that converts instructions written in low-level assembly language into
relocatable machine code and generates the accompanying information for the loader.
User-written programs must be converted into machine code before they can run. This translation
from a symbolic language into machine language is performed by system software. An assembler is
the program that translates an assembly language program into a machine language program. A self
assembler, also known as a resident assembler, runs on a computer and produces machine code for
that same machine. A cross assembler runs on one computer and produces machine code for a
different computer.
The assembler generates instructions by evaluating the mnemonics (symbols) in the operation field
and finding the values of symbols and literals to produce machine code. If the assembler does all
this work in one scan of the source, it is called a single-pass assembler; if it needs several scans, it
is called a multi-pass assembler. A two-pass assembler divides these tasks between two passes:
Pass-1:
1. Define symbols and literals and record them in the symbol table and literal table
respectively.
2. Keep track of the location counter (LC).
3. Process pseudo-operations.
4. Assign memory addresses to the variables, so that Pass-2 can translate the source code
into machine code.
Pass-2:
1. Generate object code by converting each symbolic op-code into its numeric op-code.
2. Generate data for literals and look up the values of symbols.
3. Read the source code a second time, using the tables built in Pass-1.
4. Translate the source code into object code.
Let's take a look at how this program works:
1. START: This directive sets the starting address of the program to location 200, and the label
on START provides the name of the program (JOHN is the name of the program).
2. MOVER: It moves the value of the literal (='3') into register operand R1.
3. MOVEM: It moves the content of register R1 into memory operand X.
4. MOVER: It moves the value of the literal (='2') into register operand R2; this statement
carries the label L1.
5. LTORG: It assigns addresses to literals, starting at the current LC value.
6. DS (Declare Storage): It reserves a data space of 1 for symbol X.
7. END: It marks the end of the program.
Working of Pass-1:
Build the symbol table and the literal table with their addresses. Note: a literal's address is
assigned at LTORG or END.
Step-1: START 200
(No symbol or literal is found here, so both tables are empty.)
Step-2: MOVER R1, ='3' 200
(='3' is a literal, so it is entered in the literal table.)
Step-3: MOVEM R1, X 201
X is a symbol referenced before its declaration, so it is stored in the symbol table with a blank
address field.
Step-4: L1 MOVER R2, ='2' 202
L1 is a label and ='2' is a literal, so they are stored in their respective tables.
Step-5: LTORG 203
The first literal is assigned the address given by the LC value, i.e., 203.
Step-6: X DS 1 204
This is a data declaration statement: X is assigned a data space of 1. X was referenced in Step-3
but is defined only in Step-6. This situation is called the forward reference problem: a symbol is
referenced before it is declared. A single-pass assembler handles it by back-patching; a two-pass
assembler handles it by completing the symbol table in Pass-1, so that the address is known when
Pass-2 generates code. The assembler now assigns X the address given by the current LC value, 204.
Step-7: END 205
The program ends here, and the remaining literal is assigned the address given by the LC value of
the END statement, 205. Here are the complete symbol and literal tables built by Pass-1 of the
assembler, as they follow from the steps above:
Symbol table:
Symbol   Address
X        204
L1       202
Literal table:
Literal  Address
='3'     203
='2'     205
The tables generated by Pass-1, along with their LC values, then go to Pass-2 of the assembler for
further processing of pseudo-opcodes and machine op-codes.
Working of Pass-2:
Pass-2 of the assembler generates machine code by converting symbolic machine op-codes into their
respective bit configurations (machine-understandable form). It looks up each machine op-code in
the MOT (machine op-code table), which holds the symbolic code, its length, and its bit
configuration. It also processes pseudo-ops, which are held in the POT (pseudo-op table). Pass-2
requires several databases, including the MOT, the POT, and the symbol and literal tables built by
Pass-1.
Algorithm:
Pass 1 Algorithm:
1. Initialization:
Open the input assembly file for reading.
Create and initialize necessary data structures (such as symbol table, location counter,
etc.).
Set up any necessary flags or variables.
2. Read Source Code Line by Line:
Read each line of the assembly code from the input file.
Parse each line to identify labels, mnemonic instructions, and operands.
Increment the location counter appropriately for each instruction or directive
encountered.
3. Process Labels:
If a label is found, add it to the symbol table along with its address (location counter
value).
Handle any forward references appropriately (if permitted by the assembly language).
4. Handle Directives:
Process assembler directives such as ORG, EQU, DS, DC, etc.
Adjust the location counter based on the directives encountered.
5. Error Handling:
Check for syntax errors, invalid instructions, or other issues.
Report any errors encountered during the pass.
6. Output Intermediate Data:
Generate intermediate data structures as needed, such as the symbol table.
Store relevant information for use in Pass 2.
7. Cleanup:
Close the input file.
Finalize any data structures.
Pass 2 Algorithm:
1. Initialization:
Open the input assembly file for reading.
Create and initialize necessary data structures.
Set up any necessary flags or variables.
2. Read Source Code Line by Line:
Read each line of the assembly code from the input file.
Parse each line to identify labels, mnemonic instructions, and operands.
3. Process Instructions:
Convert mnemonic instructions and operands into machine code.
Resolve symbols and calculate absolute addresses using the symbol table created in
Pass 1.
4. Generate Object Code:
Assemble the machine code instructions and data into object code.
Handle any relocation or modification required based on the assembler directives.
5. Output Object Code:
Write the generated object code to the output file or memory.
6. Error Handling:
Check for any errors during the assembly process.
Report any errors encountered during Pass 2.
7. Cleanup:
Close input and output files.
Finalize any data structures.
Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 100
#define MAX_LINE_LENGTH 100
#define MAX_OPCODES 10

/* One symbol table entry: a label and the address assigned to it in Pass 1. */
typedef struct {
    char label[20];
    int address;
} Symbol;

/* One opcode table entry: a mnemonic and its opcode as two hex digits. */
typedef struct {
    char mnemonic[10];
    char opcode[3];
} Opcode;

Symbol symbolTable[MAX_SYMBOLS];
int symbolCount = 0;

/* SIC-style opcodes. END is a directive, so its opcode field is empty. */
Opcode opcodeTable[MAX_OPCODES] = {
    {"LDA", "00"}, {"STA", "0C"}, {"LDCH", "50"}, {"STCH", "54"},
    {"ADD", "18"}, {"SUB", "1C"}, {"MUL", "20"}, {"DIV", "24"},
    {"COMP", "28"}, {"END", ""}
};
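
/* Return the address of a label, or -1 if it is not in the symbol table. */
int searchSymbol(const char *label) {
    for (int i = 0; i < symbolCount; ++i) {
        if (strcmp(symbolTable[i].label, label) == 0)
            return symbolTable[i].address;
    }
    return -1;
}

/* Add a label unless it is already present or the table is full. */
void addToSymbolTable(const char *label, int address) {
    if (searchSymbol(label) == -1 && symbolCount < MAX_SYMBOLS) {
        /* Bounded copy, so an over-long label cannot overflow the entry. */
        snprintf(symbolTable[symbolCount].label,
                 sizeof symbolTable[symbolCount].label, "%s", label);
        symbolTable[symbolCount].address = address;
        symbolCount++;
    }
}

/* Return the two-digit hex opcode for a mnemonic, or NULL if it is unknown. */
char *getOpcode(const char *mnemonic) {
    for (int i = 0; i < MAX_OPCODES; i++) {
        if (strcmp(opcodeTable[i].mnemonic, mnemonic) == 0)
            return opcodeTable[i].opcode;
    }
    return NULL; // Not found
}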

/* Pass 1: assign an address to every label and print the symbol table. */
void pass1(FILE *input) {
    char line[MAX_LINE_LENGTH], label[20], mnemonic[20], operand[20];
    int locCounter = 0, startAddress = 0;

    printf("=== PASS 1: Symbol Table Generation ===\n");
    while (fgets(line, MAX_LINE_LENGTH, input) != NULL) {
        // Field widths keep long tokens from overflowing the 20-byte buffers
        int scanned = sscanf(line, "%19s %19s %19s", label, mnemonic, operand);
        if (scanned < 1) continue; // Ignore empty lines
        if (scanned < 3) { // No label: the first token is the mnemonic
            strcpy(operand, scanned == 2 ? mnemonic : "");
            strcpy(mnemonic, label);
            label[0] = '\0';
        }
        if (strcmp(mnemonic, "START") == 0) {
            startAddress = (int)strtol(operand, NULL, 16);
            locCounter = startAddress;
            if (label[0] != '\0') addToSymbolTable(label, locCounter);
            printf("[START] Program begins at address %04X\n", locCounter);
        } else if (strcmp(mnemonic, "END") == 0) {
            break;
        } else {
            if (label[0] != '\0') { // Label present
                addToSymbolTable(label, locCounter);
            }
            if (getOpcode(mnemonic)) {
                locCounter += 3; // SIC instructions are 3 bytes
            } else if (strcmp(mnemonic, "WORD") == 0) {
                locCounter += 3;
            } else if (strcmp(mnemonic, "RESW") == 0) {
                locCounter += 3 * atoi(operand);
            } else if (strcmp(mnemonic, "RESB") == 0) {
                locCounter += atoi(operand);
            } else if (strcmp(mnemonic, "BYTE") == 0) {
                // C'EOF': one byte per character; X'F1': one byte per two hex digits
                int len = (int)strlen(operand) - 3; // Drop the prefix and both quotes
                if (len > 0) locCounter += (operand[0] == 'X') ? len / 2 : len;
            }
        }
    }
    printf("\nSymbol Table:\n");
    for (int i = 0; i < symbolCount; i++) {
        printf("%s -> %04X\n", symbolTable[i].label, symbolTable[i].address);
    }
}
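
/* Pass 2: reread the source and print header, text, and end records. */
void pass2(FILE *input) {
    char line[MAX_LINE_LENGTH], label[20], mnemonic[20], operand[20];
    int locCounter = 0, address;

    printf("\n=== PASS 2: Object Code Generation ===\n");
    while (fgets(line, MAX_LINE_LENGTH, input) != NULL) {
        int scanned = sscanf(line, "%19s %19s %19s", label, mnemonic, operand);
        if (scanned < 1) continue; // Ignore empty lines
        if (scanned < 3) { // No label: the first token is the mnemonic
            strcpy(operand, scanned == 2 ? mnemonic : "");
            strcpy(mnemonic, label);
            label[0] = '\0';
        }
        char *opcode = getOpcode(mnemonic);
        if (strcmp(mnemonic, "START") == 0) {
            locCounter = (int)strtol(operand, NULL, 16);
            printf("H^%s^%06X\n", label, (unsigned)locCounter);
        } else if (strcmp(mnemonic, "END") == 0) {
            address = searchSymbol(operand);
            if (address == -1) address = 0; // No valid entry point given
            printf("E^%06X\n", (unsigned)address);
            break;
        } else if (opcode) {
            address = searchSymbol(operand);
            if (address == -1) address = 0; // Handle undefined symbols
            // Text record: instruction address, length, opcode + operand address
            printf("T^%06X^03^%s%04X\n", (unsigned)locCounter, opcode, (unsigned)address);
            locCounter += 3;
        } else if (strcmp(mnemonic, "WORD") == 0) {
            printf("T^%06X^03^%06X\n", (unsigned)locCounter,
                   (unsigned)atoi(operand) & 0xFFFFFFu);
            locCounter += 3;
        } else if (strcmp(mnemonic, "RESW") == 0) {
            locCounter += 3 * atoi(operand); // Reserves storage, no object code
        } else if (strcmp(mnemonic, "RESB") == 0) {
            locCounter += atoi(operand); // Reserves storage, no object code
        } else if (strcmp(mnemonic, "BYTE") == 0) {
            int len = (int)strlen(operand) - 3; // Characters between the quotes
            if (len < 0) len = 0;
            int bytes = (operand[0] == 'X') ? len / 2 : len;
            printf("T^%06X^%02X^", (unsigned)locCounter, (unsigned)bytes);
            for (int i = 0; i < len; i++) {
                if (operand[0] == 'X') putchar(operand[2 + i]); // Already hex digits
                else printf("%02X", (unsigned char)operand[2 + i]);
            }
            putchar('\n');
            locCounter += bytes;
        }
    }
}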

int main(void) {
    FILE *input = fopen("input.txt", "r");
    if (input == NULL) {
        printf("Error opening input file.\n");
        return 1;
    }
    pass1(input);
    rewind(input); // Pass 2 rereads the same source from the beginning
    pass2(input);
    fclose(input);
    return 0;
}
Output:
Conclusion: We have studied and implemented a two-pass assembler.