Online Book Reader

Home Category

HTML, XHTML and CSS All-In-One for Dummies - Andy Harris [260]

By Root 1551 0

linus@gnuWho.org

4

Andy Harris

Wiley Press

andy@aharrisBooks.net

XML is commonly used as a common data language, especially in AJAX applications.

Chapter 3: Normalizing Your Data

In This Chapter

Understanding why single-table databases are inadequate

Recognizing common data anomalies

Creating entity-relationship diagrams

Using MySQL Workbench to create data diagrams

Understanding the first three normal forms

Defining data relationships


Databases can be deceptive. Even though databases are pretty easy to create, beginners usually run into problems as soon as they start working with actual data.

Computer scientists (particularly a gentleman named E. F. Codd in the 1970s) have studied potential data problems and defined techniques for organizing data. This scheme is called data normalization. In this chapter, you discover why single-table databases rarely work for real-world data and how to create a well-defined data structure according to basic normalization rules.

On the CD-ROM, I include a script called buildHero.sql that builds all the tables in this chapter. Feel free to load that script into your MySQL environment to see all these tables for yourself.


Recognizing Problems with Single-Table Data

Packing everything you’ve got into a single table is tempting. Although you can do it pretty easily (especially with SQL), and it seems like a good solution, things can go wrong pretty quickly.

Table 3-1 shows a seemingly simple database describing some superheroes.

Table 3-1 A Sample Database

Name

Powers

Villain

Plot

Mission

Age

The Plumber

Sewer snake of doom, unclogging, ability to withstand smells

Septic Slime Master

Overcome Chicago with slime

Stop the Septic Slime

37

Binary Boy

Hexideci-mation beam, obfuscation

Octal

Eliminate the numerals 8 and 9

Make the world safe for binary representation

19

The Janitor

Mighty Mop

Septic Slim Master

Overcome New York with slime

Stop the Septic Slime

41

It seems that not much can go wrong here because the database is only three records and six fields. The data is simple, and there isn’t that much of it. Still, a lot of trouble is lurking just under the surface. The following sections outline potential problems.


The identity crisis

What’s Table 3-1 about? At first, it seems to be about superheroes, but some of the information isn’t about the superhero as much as things related to the superhero, such as villains and missions. This issue may not seem like a big deal, but it causes all kinds of practical problems later on. A table should be about only one thing. When it tries to be about more than that, it can’t do its job as well.

Every time a beginner (and, often, an advanced data developer) creates a table, the table usually contains fields that don’t belong there. You have to break things up into multiple tables so that each table is really about only one thing. The process for doing so solves a bunch of other problems, as well.


The listed powers

Take a look at the powers field. Each superhero can have more than one power. Some heroes have tons of powers. The problem is, how do you handle a situation where one field can have a lot of values? You frequently see the following solutions:

♦ One large text field: That’s what I did in this case. I built a massive (255 character) VARCHAR field and hoped it would be enough. The user just has to type all the possible skills.

♦ Multiple fields: Sometimes, a data designer just makes a bunch of fields, such as power1, power2, and so on.

Both these solutions have the same general flaw. You never know how much room to designate because you never know exactly how many items will be in the list. Say that you choose the large text field approach. You may have a really clever hero with a lot of powers, so you fill up the entire field with a list of powers. What happens if your hero learns one more power? Should you

Return Main Page Previous Page Next Page

®Online Book Reader