Beautiful Code [7]
I suggested to Rob that we find the smallest regular expression package that would illustrate the basic ideas while still recognizing a useful and nontrivial class of patterns. Ideally, the code would fit on a single page.
Rob disappeared into his office. As I remember it now, he emerged in no more than an hour or two with the 30 lines of C code that subsequently appeared in Chapter 9 of The Practice of Programming. That code implements a regular expression matcher that handles the following constructs.
Character Meaning
c Matches any literal character c.
. (period) Matches any single character.
^ Matches the beginning of the input string.
$ Matches the end of the input string.
* Matches zero or more occurrences of the previous character.
This is quite a useful class; in my own experience of using regular expressions on a day-to-day basis, it easily accounts for 95 percent of all instances. In many situations, solving the right problem is a big step toward creating a beautiful program. Rob deserves great credit for choosing a very small yet important, well-defined, and extensible set of features from among a wide set of options.
Rob's implementation itself is a superb example of beautiful code: compact, elegant, efficient, and useful. It's one of the best examples of recursion that I have ever seen, and it shows the power of C pointers. Although at the time we were most interested in conveying the important role of good notation in making a program easier to use (and perhaps easier to write as well), the regular expression code has also been an excellent way to illustrate algorithms, data structures, testing, performance enhancement, and other important topics.
Top Down Operator Precedence > JavaScript
9. Top Down Operator Precedence
Douglas Crockford
In 1973, vaughan pratt presented "top down operator precedence"[*] at the first annual Principles of Programming Languages Symposium in Boston. In the paper, Pratt described a parsing technique that combines the best properties of Recursive Descent and the Operator Precedence syntax technique of Robert W Floyd.[]. He claimed that the technique is simple to understand, trivial to implement, easy to use, extremely efficient, and very flexible. I will add that it is also beautiful.
[*] Pratt's paper is available at http://portal.acm.org/citation.cfm?id=512931; more information about Pratt himself can be found at http://boole.stanford.edu/pratt.html.
[] For a description of Floyd, see "Robert W Floyd, In Memoriam," Donald E. Knuth, http://sigact.acm.org/floyd
It might seem odd that such an obviously utopian approach to compiler construction is completely neglected today. Why is this the case? Pratt suggested in the paper that preoccupation with BNF grammars and their various offspring, along with their related automata and theorems, have precluded development in directions that are not visibly in the domain of automata theory.
Another explanation is that his technique is most effective when used in a dynamic, functional programming language. Its use in a static, procedural language would be considerably more difficult. In his paper, Pratt used LISP and almost effortlessly built parse trees from streams of tokens.
But parsing techniques are not greatly valued in the LISP community, which celebrates the Spartan denial of syntax. There have been many attempts since LISP's creation to give the language a rich, ALGOL-like syntax, including:
Pratt's CGOL
http://zane.brouhaha.com/~healyzh/doc/cgol.doc.txt
LISP 2
http://community.computerhistory.org/scc/projects/LISP/index.html#LISP_2_
MLISP
ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/68/92/CS-TR-68-92.pdf
Dylan
http://www.opendylan.org
Interlisp's Clisp
http://community.computerhistory.org/scc/projects/LISP/interlisp/Teitelman-3IJCAI.pdf
McCarthy's original M-expressions
http://www-formal.stanford.edu/jmc/history/lisp/lisp.html
All failed to find acceptance. The functional programming community found the correspondence between programs and