Symbol Tables
CSC 3205: Compiler Design
                                Marriette Katarahweire
                                 26th February 2020
CSC 3205: Compiler Design                                1/13
    Symbol Tables
                Symbol tables are data structures that are used by compilers
                to hold information about source-program constructs.
                It is an important data structure created and maintained by
                the compiler in order to keep track of semantics of variable
                The information is collected incrementally by the analysis
                phases of a compiler
                Used by the synthesis phases to generate the target code.
                Entries in the symbol table contain information about an
                identifier eg. its character string (or lexeme), type,position in
                storage and any other relevant information.
                Symbol tables typically need to support multiple declarations
                of the same identifier within a program.
                It is used by compiler to achieve compile time efficiency.
CSC 3205: Compiler Design                                                           2/13
    Symbol Table and other Compiler Phases
                Lexical Analysis: Creates new table entries in the table,
                example entries about token.
                Syntax Analysis: Adds information regarding attribute type,
                scope, dimension, line of reference, use, etc in the table.
                Semantic Analysis: Uses available information in the table to
                check for semantics i.e. to verify that expressions and
                assignments are semantically correct(type checking) and
                update it accordingly.
                Intermediate Code generation: Refers symbol table for
                knowing how much and what type of run-time is allocated
                and table helps in adding temporary variable information.
                Code Optimization: Uses information present in symbol table
                for machine dependent optimization.
                Target Code generation: Generates code by using address
                information of identifier present in the table.
CSC 3205: Compiler Design                                                       3/13
    Items stored in Symbol table
                Variable names and constants
                Procedure and function names
                Literal constants and strings
                Compiler generated temporaries
                Labels in source languages
                What information is to be associated with the name?
                How do we access this information?
                Some symbol tables also include the keywords in the same
                table while others use a separate table for keywords.
CSC 3205: Compiler Design                                                  4/13
    Symbol Tables: Scopes
                The scope of a declaration is the portion of a program to
                which the declaration applies.
                Scopes are implemented by setting up a separate symbol table
                for each scope.
                A program block with declarations will have its own symbol
                table with an entry for each declaration in the block.
                This approach also works for other constructs that set up
                scopes; for example, a class would have its own table, with an
                entry for each field and method.
                Declarations may be constants, type declarations, variable
                declarations, procedure declarations, class declarations
CSC 3205: Compiler Design                                                        5/13
    Symbol Table Purpose
         A symbol table may serve the following purposes depending upon
         the language
                To store the names of all entities in a structured form at one
                place.
                To verify if a variable has been declared.
                To implement type checking, by verifying assignments and
                expressions in the source code are semantically correct.
                To determine the scope of a name (scope resolution)
CSC 3205: Compiler Design                                                        6/13
    Who creates the Symbol Table?
         Who creates the symbol table, the lexical, syntax or semantic
         analysis phase?
         - It is the semantic analysis phase which creates the symbol table
         because it is not until the semantic analysis phase that enough
         information is known about a name to describe it.
         - Many compilers set up a table at lexical analysis time for the
         various variables in the program, and fill in information about the
         symbol later during semantic analysis when more information
         about the variable is known.
         - The parser can create entries. With its knowledge of the
         syntactic structure of a program, a parser is often in a better
         position than the lexical analyzer to distinguish among different
         declarations of an identifier.
         - In some cases, a lexical analyzer can create a symbol-table entry
         as soon as it sees the characters that make up a lexeme.
         - More often, the lexical analyzer can only return to the parser a
         token, say id, along with a pointer to the lexeme.
CSC 3205: Compiler Design                                                      7/13
    Class and Related Attributes
                A name in a program can represent a variable, a constant, a
                parameter, a record or union type, a field in a record or union,
                a procedure or function, a macro, an array, a label, or a file
                etc. These are values for a symbol’s class.
                Not all languages have all these possibilities. The possible
                classes for a name vary with the language, as do the other
                attributes. Once a name’s class is known, the rest of the
                information may vary depending on the value of the class
                attribute.
                If a name is of class variable, or constant, or parameter, there
                is another attribute which records the name’s type.
                For a name whose class is procedure or function, other
                attributes indicate the number of parameters, the parameters,
                themselves, and the result type for functions.
                For a name whose class is array, other attributes are the
                number of dimensions and the array bounds (if known)
CSC 3205: Compiler Design                                                          8/13
    Symbol Table Operations
                There are two main operations on symbol tables: Insert (or
                Enter ) and Lookup (or Retrieval)
                Most languages require declaration of names. When the
                declaration is processed, the name is inserted into the symbol
                table.
                For languages not requiring declarations, e.g., FORTRAN and
                SNOBOL, a name is inserted into the symbol table on its first
                use in the program. Each subsequent use of a name causes a
                symbol table lookup operation.
                Given the characters of a name, searching finds the attributes
                associated with that name. The search time depends on the
                data structure used to represent the symbol table.
CSC 3205: Compiler Design                                                        9/13
    Implementation of Symbol Tables
                An unordered list would enter each name sequentially as it is
                declared. The lookup operation must then search linearly.
                Inefficient unless it is known that the number of entries will be
                exceedingly small.
                An Ordered List: ordered according to the characters in the
                name. Can be implemented as an array or as a linked list.
                A Binary Tree: combines the fast search time of an ordered
                array with the insertion ease of a linked list.
                Hash tables are the best method for efficiency. Used by most
                production-quality compilers. More space is generally needed
                so it is not as space-efficient as binary trees or list structures
                A stack where a pointer is kept to the top of the stack for
                each block. Names are pushed onto the stack as they are
                encountered. This is an easy structure to implement, but
                relatively inefficient in operation.
                Combinations of the data structures may be used. eg a
                hashed symbol table implemented as a stack.
CSC 3205: Compiler Design                                                            10/13
    Scope Management
         . . .
         int value=10;
         void pro_one(){
            int one_1;
            int one_2;
             {
                     int one_3;   // inner scope 1
                     int one_4;
             }
             int one_5;
             {
                     int one_6;     //inner scope 2\\
                     int one_7;
             }
         }
         void pro_two(){
            int two_1;
            int two_2;
             {
                     int two_3;    // inner scope 3
                     int two_4;
                 }
             int two_5;
         }
CSC 3205: Compiler Design                               11/13
    Hierarchical Symbol Table
CSC 3205: Compiler Design       12/13
    Hierarchical Symbol Table
         whenever a name needs to be searched in a symbol table, it is
         searched using the following algorithm:
                first a symbol will be searched in the current scope, i.e.
                current symbol table.
                if a name is found, then search is completed, else it will be
                searched in the parent symbol table until, either the name is
                found or global symbol table has been searched for the name.
CSC 3205: Compiler Design                                                       13/13