Mastering Algorithms With C - Kyle Loudon [72]
A chained hash table is a good way to implement a symbol table because, in addition to being an efficient way to store and retrieve information, we can use it to store a virtually unlimited amount of data. This is important for a compiler since it is difficult to know how many symbols a program will contain before lexical analysis.
The runtime complexity of lex is O (1), assuming next_token runs in a constant amount of time. This is because lex simply calls chtbl_insert, which is an O (1) operation.
Example 8.4. Header for a Simple Lexical Analyzer
/*****************************************************************************
* *
* --------------------------------- lex.h -------------------------------- *
* *
*****************************************************************************/
#ifndef LEX_H
#define LEX_H
#include "chtbl.h"
/*****************************************************************************
* *
* Define the token types recognized by the lexical analyzer. *
* *
*****************************************************************************/
typedef enum Token_ {lexit, error, digit, other} Token;
/*****************************************************************************
* *
* --------------------------- Public Interface --------------------------- *
* *
*****************************************************************************/
Token lex(const char *istream, CHTbl *symtbl);
#endif
Example 8.5. Implementation of a Simple Lexical Analyzer
/*****************************************************************************
* *
* --------------------------------- lex.c -------------------------------- *
* *
*****************************************************************************/
#include #include #include #include "chtbl.h" #include "lex.h" #include "symbol.h" /***************************************************************************** * * * ---------------------------------- lex --------------------------------- * * * *****************************************************************************/ Token lex(const char *istream, CHTbl *symtbl) { Token token; Symbol *symbol; int length, retval, i; /***************************************************************************** * * * Allocate space for a symbol. * * * *****************************************************************************/ if ((symbol = (Symbol *)malloc(sizeof(Symbol))) == NULL) return error; /***************************************************************************** * * * Process the next token. * * * *****************************************************************************/ if ((symbol->lexeme = next_token(istream)) == NULL) { /************************************************************************** * * * Return that there is no more input. * * * **************************************************************************/ free(symbol); return lexit; } else { /************************************************************************** * * * Determine the token type. * * * **************************************************************************/ symbol->token = digit; length = strlen(symbol->lexeme); for (i = 0; i < length; i++) { if (!isdigit(symbol->lexeme[i])) symbol->token = other; } memcpy(&token, &symbol->token, sizeof(Token)); /************************************************************************** * * * Insert the symbol into the symbol table. * * * **************************************************************************/ if ((retval = chtbl_insert(symtbl,