jacksonmacd
This is about an application developed by another division of the
company where I work, so I don't really hold any influence over the
database structure. However, I have provided my opinion on occasion --
but it is generally ignored. Go figure...
It's about a table containing a large number (up to millions) of time-interval records. One attribute of the table is the StopCode -- a code number that explains what happened during the time interval. StopCategories are related one-to-many to StopCodes: the number of StopCodes would be in the tens to hundreds, and the number of StopCategories would be fewer than 10. So far, so good.
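In a normalized design I would expect something roughly like this (the table and column names are my own guesses, not the real schema):

CREATE TABLE StopCategories (
    StopCategoryID  INTEGER      NOT NULL PRIMARY KEY,
    CategoryName    VARCHAR(50)  NOT NULL
);

CREATE TABLE StopCodes (
    StopCode        INTEGER      NOT NULL PRIMARY KEY,
    StopCategoryID  INTEGER      NOT NULL REFERENCES StopCategories (StopCategoryID),
    Description     VARCHAR(100)
);

CREATE TABLE TimeIntervals (
    IntervalID      INTEGER      NOT NULL PRIMARY KEY,
    StartTime       DATETIME     NOT NULL,
    StopTime        DATETIME     NOT NULL,
    StopCode        INTEGER      NOT NULL REFERENCES StopCodes (StopCode)
);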
However, the table is flattened -- it has columns for both StopCode and StopCategory. The developers/custodians believe the violation of normalization rules is justified because it will "improve performance because the reporting engine does not need to look up the StopCategory for every record."
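In other words, their interval table carries both values, something like this (again, my guess at the names):

CREATE TABLE TimeIntervals (
    IntervalID      INTEGER   NOT NULL PRIMARY KEY,
    StartTime       DATETIME  NOT NULL,
    StopTime        DATETIME  NOT NULL,
    StopCode        INTEGER   NOT NULL,
    StopCategoryID  INTEGER   NOT NULL   -- copied from the StopCodes lookup when the row is written
);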
I believe that's just bunk! However, I can *sorta* see their point -- the lookup is done once, and the result is stored in the table forever. With normalization, a join needs to be performed every time the database is queried. Makes some kind of sense...
Then it occurred to me that the database engine would (probably?) retrieve the StopCategories once during any query, retrieve only one column from the big table, and do the join in memory. But with the flattened structure, it would need to retrieve two columns from the table for every record. That eliminates a join but increases the amount of disk activity -- it trades an in-memory activity for an on-disk activity, which *must* be slower.
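To make the comparison concrete, the kind of report I have in mind is a count of intervals per category. Under the two designs the query would look roughly like this (sketches only -- how the optimizer actually executes them is exactly what I'm asking about):

-- Normalized: join the big interval table to the small StopCodes lookup
SELECT sc.StopCategoryID, COUNT(*) AS IntervalCount
FROM TimeIntervals AS ti
INNER JOIN StopCodes AS sc ON ti.StopCode = sc.StopCode
GROUP BY sc.StopCategoryID;

-- Flattened: no join, but every interval row carries the extra category column
SELECT ti.StopCategoryID, COUNT(*) AS IntervalCount
FROM TimeIntervals AS ti
GROUP BY ti.StopCategoryID;

The StopCodes lookup is at most a few hundred rows -- a few kilobytes that should stay cached -- while the extra column in the flattened table adds a few bytes to every one of the millions of interval rows, and that extra width has to come off the disk on every scan.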
Does my logic make sense? Proper/appropriate indexing would obviously be required, and YMMV depending on any number of factors, but I am interested in the principle. Does the same answer apply whether the database is Jet or SQL Server? I would like to be in a position of knowledge to counter their "joins are too slow" argument.
Thanks.