jacksonmacd
This is about an application developed by another division of the
company where I work, so I don't really hold any influence over the
database structure. However, I have provided my opinion on occasion --
but it is generally ignored. Go figure...
It's about a table containing a large number (up to millions) of time-interval records. One attribute of the table is the StopCode -- a code number that explains what happened during the time interval. StopCategories are related one-to-many to StopCodes: the number of StopCodes would be in the tens to hundreds, and the number of StopCategories would be fewer than 10. So far, so good.
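In a normalized design I would expect something roughly like this (the table and column names are my own guesses, not the real schema):

CREATE TABLE StopCategories (
    StopCategoryID  INTEGER      NOT NULL PRIMARY KEY,
    CategoryName    VARCHAR(50)  NOT NULL
);

CREATE TABLE StopCodes (
    StopCode        INTEGER      NOT NULL PRIMARY KEY,
    StopCategoryID  INTEGER      NOT NULL REFERENCES StopCategories (StopCategoryID),
    Description     VARCHAR(100)
);

CREATE TABLE TimeIntervals (
    IntervalID      INTEGER      NOT NULL PRIMARY KEY,
    StartTime       DATETIME     NOT NULL,
    StopTime        DATETIME     NOT NULL,
    StopCode        INTEGER      NOT NULL REFERENCES StopCodes (StopCode)
);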
However, the table is flattened -- it has columns for both StopCode and StopCategory. The developers/custodians believe the violation of normalization rules is justified because it will "improve performance because the reporting engine does not need to look up the StopCategory for every record."
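In other words, their interval table carries both values, something like this (again, my guess at the names):

CREATE TABLE TimeIntervals (
    IntervalID      INTEGER   NOT NULL PRIMARY KEY,
    StartTime       DATETIME  NOT NULL,
    StopTime        DATETIME  NOT NULL,
    StopCode        INTEGER   NOT NULL,
    StopCategoryID  INTEGER   NOT NULL   -- copied from the StopCodes lookup when the row is written
);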
I believe that's just bunk! However, I can *sorta* see their point -- the lookup is done once, and the result is stored in the table forever. With normalization, a join needs to be performed every time the database is queried. Makes some kind of sense...
Then it occurred to me that the database engine would (probably?) retrieve the StopCategories once during any query, retrieve only one column from the big table, and do the join in memory. But with the flattened structure, it would need to retrieve two columns from the table for every record. That eliminates a join but increases the amount of disk activity -- it trades an in-memory activity for an on-disk activity, which *must* be slower.
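To make the comparison concrete, the kind of report I have in mind is a count of intervals per category. Under the two designs the query would look roughly like this (sketches only -- how the optimizer actually executes them is exactly what I'm asking about):

-- Normalized: join the big interval table to the small StopCodes lookup
SELECT sc.StopCategoryID, COUNT(*) AS IntervalCount
FROM TimeIntervals AS ti
INNER JOIN StopCodes AS sc ON ti.StopCode = sc.StopCode
GROUP BY sc.StopCategoryID;

-- Flattened: no join, but every interval row carries the extra category column
SELECT ti.StopCategoryID, COUNT(*) AS IntervalCount
FROM TimeIntervals AS ti
GROUP BY ti.StopCategoryID;

The StopCodes lookup is at most a few hundred rows -- a few kilobytes that should stay cached -- while the extra column in the flattened table adds a few bytes to every one of the millions of interval rows, and that extra width has to come off the disk on every scan.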
Does my logic make sense? Proper/appropriate indexing would obviously be required, and YMMV depending on any number of factors, but I am interested in the principle. Does the same answer apply whether the database is Jet or SQL Server? I would like to be in a position of knowledge to counter their "joins are too slow" argument.
Thanks.