HTML, XHTML and CSS All-In-One for Dummies - Andy Harris [261]
If you choose to have multiple power fields, the problem doesn’t go away. You still have to determine how many skills the hero can have. If you designate ten skill fields and one of your heroes learns an eleventh power, you’ve got a problem.
The obvious solution is to provide far more room than anybody needs. If it’s a text field, make it huge; and if it’s multiple fields, make hundreds of them. Both solutions are wasteful. Remember, a database can often have hundreds or thousands of records, and each one has to be the same size. If you make your record definition bigger than it needs to be, this waste is multiplied hundreds or thousands of times.
You may argue that this is not the 1970s. Processor power and storage space are really cheap today, so why am I worrying about saving a few bytes here and there? Well, cheap is still not free. Programmers tend to be working with much larger data sets than they did in the early days, so efficiency still matters. And here’s another important change. Today, data is much more likely to be transmitted over the Internet. The big deal today isn’t really processor or storage efficiency. Today’s problem is transmission efficiency, which comes down to the same principle: Don’t store unnecessary data.
When databases have listed fields, you tend to see other problems. If the field doesn’t have enough room for all the data, people will start abbreviating. If you’re looking for a hero with invisibility, you can’t simply search for “invisibility” in the powers field because it may be “inv,” “in,” or “invis” (or even “can’t see”). If you desperately need an invisible hero, the search can be frustrating, and you may miss a result because you didn’t guess all the possible abbreviations.
If the database uses the listed fields model, you have another problem. Now, your search has to look through all ten (or hundreds of) power fields because you don’t know which one holds the “invisible” power. This problem makes your search queries far more complicated than they would have been otherwise.
Another so-called solution you sometimes see is to have a whole bunch of Boolean fields: Invisibility, Super-speed, X-ray vision, and so on. This fix solves part of the problem because Boolean data is small. It’s still troublesome, though, because now the data developer has to anticipate every possible power. You may have an other field, but it then reintroduces the problem of listed fields.
Listed fields are a nightmare.
Repetition and reliability
Another common problem with data comes with repetition. If you allow data to be repeated in your database, you can have some really challenging side effects. Refer to Table 3-1, earlier in this chapter, and get ready to answer some questions about it. . . .
What is the Slime Master’s evil plot?
This question seems simple enough, but Table 3-1 provides an ambiguous response. If you look at the first row (The Plumber), the plot is Overcome Chicago with slime. If you look at The Janitor, you see that the plot is to Overcome New York with slime. Which is it? Presumably, it’s the same plot, but in one part of the database, New York is the target, and elsewhere, it’s Chicago. From the database, you can’t really tell which is correct or if it could be both. I was required to type in the plot in two different records. It’s supposed to be the same plot, but I typed it differently. Now, the data has a conflict, and you don’t know which record to trust.
Is it possible the plots were supposed to be different? Sure, but you don’t want to leave that assumption to chance. The point of data design is to ask exactly these questions and to design your data scheme to reinforce the rules of your organization.
Here’s a related question. What if you needed to get urgent information to any hero fighting the Septic Slime Master? You’d probably write a query like
SELECT * FROM hero WHERE villain = ‘Septic Slime Master’
That query is a pretty reasonable request, but it wouldn’t work. The villain in The Janitor record