Thursday, October 30, 2008

Denormalization

I'm convinced that denormalization is a word used mostly by people who don't know how to normalize a schema in the first place. They generally fall into the pose of being a "real-world' dba as opposed to a ridiculous academic or relational purist. Often, the real-world dba's say they have denormalized a database when they have merely failed to normalize it.

Of course, denormalization is something that is often useful and occasionally necessary. However, my belief is that you should first put the schema into 3rd normal form and only then consider denormalization. Until you find a normal form for the schema how can you know whether you have properly understood the process you are trying to model?

Often denormalizers do not understand the long-term implications of what they are doing and build serious long-term performance problems into their design. Typically, you will find that a lot of cursors, colossal union queries, table-valued functions, etc are necessary to patch together the data in a poorly normalized design.

After a database has been in service for months or years it is almost impossible to make substantial schema changes to correct the problems. Often man-years of development have gone into the applications accessing the database and making all the changes that would be necessary is just not an affordable course of action. Developers are reduced to putting band-aids on band-aids to keep their applications chugging along.

No comments: