Beautiful Code [191]
Example 13-3. The filter method for association type columns that handle wildcards
static struct genePos *wildAssociationFilter(
        struct slName *wildList, boolean orLogic, struct column *col,
        struct sqlConnection *conn, struct genePos *list)
/* Filter associations that match any of a list of wildcards. */
{
    /* Group associations by gene ID. */
    struct assocGroup *ag = assocGroupNew(16);
    struct sqlResult *sr = sqlGetResult(conn, col->queryFull);
    char **row;
    while ((row = sqlNextRow(sr)) != NULL)
        assocGroupAdd(ag, row[0], row[1]);
    sqlFreeResult(&sr);

    /* Look for matching associations and put them on passHash. */
    struct hash *passHash = newHash(16); /* Hash of items passing filter */
    struct genePos *gp;
    for (gp = list; gp != NULL; gp = gp->next)
    {
        char *key = (col->protKey ? gp->protein : gp->name);
        struct assocList *al = hashFindVal(ag->listHash, key);
        if (al != NULL)
        {
            if (wildMatchRefs(wildList, al->list, orLogic))
                hashAdd(passHash, gp->name, gp);
        }
    }

    /* Build up filtered list, clean up, and go home. */
    list = weedUnlessInHash(list, passHash);
    hashFree(&passHash);
    assocGroupFree(&ag);
    return list;
}
The function prototype is followed by a one-sentence comment that summarizes what the function does. The code within the function is broken into "paragraphs," each starting with a comment that summarizes, in English, what the block does.
Programmers can read this function at several different levels of detail. For some, the name itself tells them all they need. Others will want to read the opening comment as well. Still others will read all the comments, ignoring the code. Those interested in the full details will read every line.
Because human memory is so strongly associative, once a reader has read the function at one level of detail, reading it at a higher level will generally be enough to recall the more detailed levels. This happens in part because the higher levels form a framework for organizing your memory of the function even as you are reading the lower levels.
In general, the larger the programming entity, the more documentation it deserves. A variable needs at least a word, a function at least a sentence, and larger entities such as modules or objects perhaps a paragraph. It's very helpful if a program as a whole can have a few pages of documentation providing an overview.
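As a rough illustration of this scaling, the sketch below documents a tiny hypothetical module at three levels: a short paragraph for the file, a sentence for the function, and a few words for each field. The names intervalSize and struct interval are invented for this example and are not taken from the code discussed in this chapter.

```c
/* interval - a hypothetical module for half-open genomic intervals.
 * Coordinates are zero-based, and the end position is one past the
 * last base covered, so an interval's size is simply end - start.
 * Everything here is illustrative only. */

#include <assert.h>
#include <stddef.h>

struct interval
{
    int start;  /* First base, zero-based */
    int end;    /* One past the last base */
};

int intervalSize(const struct interval *iv)
/* Return the number of bases covered by iv. */
{
    assert(iv != NULL);
    return iv->end - iv->start;
}
```

The amount of commentary grows with the size of the thing being described, and a reader can stop at whichever level answers the question at hand.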
It's possible to have too much documentation as well as too little. Documentation is of no use if people don't read it, and people tend to avoid reading long text, especially if it is repetitious.
Humans tend to remember the important things best, though a few people are blessed (or cursed) with a good memory for trivia. The words used in a programming name are important, but whether the style is varName, VarName, varname, var_name, VARNAME, vrblnam, or Variable_Name is not so important. What is important is that a single convention be adopted and followed consistently, so that the programmer need not waste time and memory remembering which style is used in any particular case.
Other keys to keeping code understandable are:
Use a scope as local as possible. Never use a global variable when an object variable will do, and never use an object variable when a local variable will do.
Minimize side effects. In particular, avoid altering any variables except the return value in a function. A function that obeys this rule is called "reentrant," and is a thing of beauty. Not only is it easy to understand, it is automatically thread-safe and capable of being used recursively. Beyond readability, code with few side effects is easier to reuse in different contexts.
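The two rules above can be sketched with a pair of small functions that compute the same thing. This is a minimal illustration, not code from the program described in this chapter; runningTotal, addToRunningTotal, and totalLength are hypothetical names. The first function accumulates into a file-scope variable, so its result depends on every earlier call. The second keeps all of its state in local variables and changes nothing but its return value.

```c
#include <assert.h>

/* Hypothetical file-scope accumulator: every caller shares it. */
static int runningTotal = 0;

void addToRunningTotal(const int *starts, const int *ends, int count)
/* Side-effecting version: adds each interval's length to runningTotal. */
{
    int i;
    for (i = 0; i < count; ++i)
        runningTotal += ends[i] - starts[i];
}

int totalLength(const int *starts, const int *ends, int count)
/* Reentrant version: returns the summed length and touches only locals. */
{
    int total = 0;
    int i;
    assert(count >= 0);
    for (i = 0; i < count; ++i)
        total += ends[i] - starts[i];
    return total;
}
```

Calling totalLength twice on the same arrays gives the same answer both times, while calling addToRunningTotal twice doubles runningTotal. A reader of the second version has to know the history of the whole program to predict its effect; a reader of the first needs only the arguments.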
These days, many programmers are well aware of the negative impact of global variables on code reuse. Another thing that can discourage code reuse is dependence on data structures. The object-oriented programming style sometimes can end up backfiring in this regard. If useful code