Online Book Reader

Home Category

Facebook Cookbook - Jay Goldman [116]

By Root 598 0
in the high-throughput scenario. You are taking referential integrity into your own hands.

Cached counters


One of the most common database tasks is to discover how many child records a parent record has, resulting in redundant and expensive counting operations. Most modern database platforms do an impressive job of caching query results under normal load, but at some point it’s better to track this count ourselves.

Create columns on the parent table in which to store these counts, and verify that your application contains hooks to update this value any time a child record is added or removed. For example, given Items.list_id fk> List.id, you could add List.items_count to your table and eliminate the requirement to incur a count operation.

Precalculated sums


Any expensive data transformation operations should be performed at the time a record is written, regardless of whether the operation is triggered programmatically or through SQL. However, although SQL supports functions like SUM(), these tools are not encouraged for use during a highly concurrent read operation.

Similar to implementing a cached counter, you can add columns to store these precalculated values. Generally this is assuming that there is a parent and child foreign key relationship in place. Any time a dependent record is changed, these calculated sum values on the parent table will need to be regenerated.

Discussion


First off, give yourself a quick refresher of normalization and its awkward cousin, denormalization:

http://en.wikipedia.org/wiki/Database_normalization

http://en.wikipedia.org/wiki/Denormalization

To paraphrase Wikipedia, a denormalized data model is not the same as a data model that has not been normalized, and denormalization should only take place after a satisfactory level of normalization has taken place.

Using database constraints to enforce referential integrity on a set of write-heavy, normalized tables might prove slower than using a join! Don’t ever forget that your relational database management system (RDBMS) is trying to do its best to speed things along, regardless of your attempts at optimization.

Note that database views might seem like a shortcut to denormalization, with all of the benefits of a join and none of the syntax requirements. However, most RDBMS systems provide views so that a security model can be attached; they still do all of the ugly joins behind the scenes.

Some object-relational mapping (ORM) toolkits, such as ActiveRecord, abstract traditional concepts such as foreign keys and constraints. This is fine—if yours is the only application that will be accessing this data.

Not all of these solutions will work for every application. You should extensively profile your application using tools like ab (Apache Benchmark) to verify a positive change before giving up the advantages of a normalized schema.

Next steps


It wasn’t that long ago that programmers were not encouraged to have an opinion on database infrastructure, which was traditionally the realm of the database admin (DBA). Today, this is no longer the case. Developer Rowan Hick has some pointers for optimizing your database access:

If you are using MySQL, bookmark this excellent resource: http://www.mysqlperformanceblog.com/.

If you are using MySQL or Postgres, look into the EXPLAIN statement. It is invaluable for spotting where you have missed creating a key index, or are inadvertently doing full table scans. You should always watch for queries that take longer than you expect by looking at slow query logs, if your database supports them.

Make sure that you understand table locking, with the goal of reducing updates to a table or even offloading updates to another denormalized table. See http://dev.mysql.com/doc/refman/5.0/en/table-locking.html.

Eventually you might need to pursue more dramatic strategies, such as sharding. Sharding splits data across multiple tables and servers. For example, you might have users A–M on one server and N–Z on another.

If you are using Rails, check out “How to avoid

Return Main Page Previous Page Next Page

®Online Book Reader