Beautiful Code [139]
Top Down Operator Precedence > Symbol Table
9.2. Symbol Table
We will use a symbol table to drive our parser:
var symbol_table = {};
The original_symbol object will be the prototype for all other symbols. It contains methods that report errors. These will usually be overridden with more useful methods:
var original_symbol = {
    nud: function ( ) {
        this.error("Undefined.");
    },
    led: function (left) {
        this.error("Missing operator.");
    }
};
Let's define a function that defines symbols. It takes a symbol id and an optional binding power that defaults to zero. It returns a symbol object for that id. If the symbol already exists in the symbol_table, it returns that symbol object, first raising its binding power if the new one is at least as large. Otherwise, it makes a new symbol object that inherits from original_symbol, stores it in the symbol table, and returns it. A symbol object initially contains an id, a value, a left binding power, and the methods it inherits from original_symbol:
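var symbol = function (id, bp) {
    var s = symbol_table[id];
    bp = bp || 0;
    if (s) {
        // Existing symbol: only ever raise its binding power.
        if (bp >= s.lbp) {
            s.lbp = bp;
        }
    } else {
        // New symbol: inherit from original_symbol and register it.
        s = object(original_symbol);
        s.id = s.value = id;
        s.lbp = bp;
        symbol_table[id] = s;
    }
    return s;
};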
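The lookup-or-create behavior can be tried out with a small runnable sketch. It repeats the definitions above and fills in two things this section does not show: the object helper, assumed here to make a new object that inherits from its argument (written with Object.create), and an error method, which is only a stand-in that throws. The binding powers 60 and 10 are arbitrary illustrative values.

```javascript
// Assumption: object(o) returns a new object whose prototype is o.
var object = function (o) {
    return Object.create(o);
};

var symbol_table = {};

var original_symbol = {
    // Stand-in error reporter; the real one is not shown in this section.
    error: function (message) {
        throw new Error(message + " (" + this.id + ")");
    },
    nud: function () {
        this.error("Undefined.");
    },
    led: function (left) {
        this.error("Missing operator.");
    }
};

var symbol = function (id, bp) {
    var s = symbol_table[id];
    bp = bp || 0;
    if (s) {
        if (bp >= s.lbp) {
            s.lbp = bp;
        }
    } else {
        s = object(original_symbol);
        s.id = s.value = id;
        s.lbp = bp;
        symbol_table[id] = s;
    }
    return s;
};

var plus = symbol("+");           // new symbol, lbp defaults to 0
var samePlus = symbol("+", 60);   // same object, lbp raised to 60
var stillPlus = symbol("+", 10);  // a lower binding power is ignored

console.log(plus === samePlus);   // true
console.log(plus.lbp);            // 60
```

Because the table hands back the same object each time, later definitions can attach nud or led methods to a symbol that an earlier call created only as a separator.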
The following symbols are popular separators and closers:
symbol(":");
symbol(";");
symbol(",");
symbol(")");
symbol("]");
symbol("}");
symbol("else");
The (end) symbol indicates that there are no more tokens. The (name) symbol is the prototype for new names, such as variable names. They are spelled strangely to avoid collisions:
symbol("(end)");
symbol("(name)");
The (literal) symbol is the prototype for all string and number literals:
var itself = function ( ) {
    return this;
};
symbol("(literal)").nud = itself;
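A short sketch may help show why returning this is enough. The literal object below is a hand-built stand-in for the (literal) entry in the symbol table, and the token is created with Object.create rather than the chapter's object helper; both are assumptions made to keep the example self-contained.

```javascript
var itself = function () {
    return this;
};

// Stand-in for the "(literal)" entry of the symbol table.
var literal = {id: "(literal)", nud: itself};

// A token for the number 42 inherits from the literal symbol.
var token = Object.create(literal);
token.value = 42;
token.arity = "literal";

// nud is found on the prototype, but `this` is the token itself,
// so the token becomes its own leaf in the parse tree.
var node = token.nud();

console.log(node === token);  // true
console.log(node.value);      // 42
```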
The this symbol is a special variable. In a method invocation, it is the reference to the object:
symbol("this").nud = function ( ) {
    scope.reserve(this);
    this.arity = "this";
    return this;
};
9.3. Tokens
We assume that the source text has been transformed into an array of simple token objects (tokens), each containing a type member that is a string ("name", "string", "number", "operator") and a value member that is a string or number.
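As an illustration of that shape, here is what the token array might look like for a one-line source text. The source line and its names are hypothetical, and the sketch assumes that punctuation such as ; is delivered with the type "operator".

```javascript
// Hypothetical token array for the source text:  total = price * 2;
var tokens = [
    {type: "name",     value: "total"},
    {type: "operator", value: "="},
    {type: "name",     value: "price"},
    {type: "operator", value: "*"},
    {type: "number",   value: 2},
    {type: "operator", value: ";"}
];

// Every token carries exactly the two members described above.
tokens.forEach(function (t) {
    console.log(t.type + " " + t.value);
});
```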
The token variable always contains the current token:
var token;
The advance function makes a new token object and assigns it to the token variable. It takes an optional id parameter, which it checks against the id of the token it is about to leave behind, reporting an error on a mismatch. The new token object's prototype will be a name token in the current scope or a symbol from the symbol table. The new token's arity will be "name", "literal", or "operator". Its arity may be changed later to "binary", "unary", or "statement" when we know more about the token's role in the program:
var advance = function (id) {
    var a, o, t, v;
    // Optionally verify the token we are leaving behind.
    if (id && token.id !== id) {
        token.error("Expected '" + id + "'.");
    }
    // Out of input: the current token becomes (end).
    if (token_nr >= tokens.length) {
        token = symbol_table["(end)"];
        return;
    }
    t = tokens[token_nr];
    token_nr += 1;
    v = t.value;
    a = t.type;
    // Pick the prototype for the new token from its type.
    if (a === "name") {
        o = scope.find(v);
    } else if (a === "operator") {
        o = symbol_table[v];
        if (!o) {
            t.error("Unknown operator.");
        }
    } else if (a === "string" || a === "number") {
        a = "literal";
        o = symbol_table["(literal)"];
    } else {
        t.error("Unexpected token.");
    }
    token = object(o);
    token.value = v;
    token.arity = a;
    return token;
};
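To watch advance step through a token array, the sketch below wires it to deliberately minimal stand-ins. None of the scaffolding is the chapter's real machinery: object is assumed to be a thin wrapper over Object.create, symbol is cut down to the bare minimum, scope.find resolves every name to the (name) symbol, and errors are simply thrown. Simple tokens here are plain objects with no error method, so the sketch throws directly where the code above calls t.error.

```javascript
var object = function (o) {       // assumed helper
    return Object.create(o);
};

var symbol_table = {};
var original_symbol = {
    error: function (message) {   // stand-in error reporter
        throw new Error(message);
    }
};
var symbol = function (id) {      // simplified: no binding powers
    var s = object(original_symbol);
    s.id = s.value = id;
    symbol_table[id] = s;
    return s;
};
["(end)", "(name)", "(literal)", "=", ";"].forEach(function (id) {
    symbol(id);
});

// Stand-in scope: every name resolves to the "(name)" symbol.
var scope = {
    find: function (name) {
        return symbol_table["(name)"];
    }
};

// Hypothetical token array for the source text:  x = 1;
var tokens = [
    {type: "name",     value: "x"},
    {type: "operator", value: "="},
    {type: "number",   value: 1},
    {type: "operator", value: ";"}
];
var token_nr = 0;
var token;

var advance = function (id) {
    var a, o, t, v;
    if (id && token.id !== id) {
        token.error("Expected '" + id + "'.");
    }
    if (token_nr >= tokens.length) {
        token = symbol_table["(end)"];
        return;
    }
    t = tokens[token_nr];
    token_nr += 1;
    v = t.value;
    a = t.type;
    if (a === "name") {
        o = scope.find(v);
    } else if (a === "operator") {
        o = symbol_table[v];
        if (!o) {
            throw new Error("Unknown operator.");
        }
    } else if (a === "string" || a === "number") {
        a = "literal";
        o = symbol_table["(literal)"];
    } else {
        throw new Error("Unexpected token.");
    }
    token = object(o);
    token.value = v;
    token.arity = a;
    return token;
};

// Walk the whole array, recording each token's arity and value.
var seen = [];
advance();
while (token.id !== "(end)") {
    seen.push(token.arity + ":" + token.value);
    advance();
}
console.log(seen.join(" "));  // name:x operator:= literal:1 operator:;
```

Each token produced this way is a fresh object, so changing its arity or attaching child nodes later does not disturb the shared symbol it inherits from.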
9.4. Precedence
Tokens are objects that bear methods that allow them to make precedence decisions, match other tokens, and build trees (and in a more ambitious project also check types, optimize,