Hi Steve,
Am I then going to add a "meaningless" autonumber field to this table?
Eh? And then have two fields in the table, both uniquely identifying each
record? No.
That's certainly your prerogative, but I would still add the autonumber field to this table.
Why? Consider the effects on database size, speed and network traffic in a multi-user database.
Your 4-character code will take 4 bytes of storage if Unicode compression is set, or 8 bytes of
storage without it. So, as long as you have Unicode compression set, there would not be any
penalty for size versus a long integer data type. But what about speed? The Access Help file
states the following:
"When a field's Unicode Compression property is set to Yes, any character whose first byte is 0
is compressed when it is stored and uncompressed when it is retrieved."
I'm willing to bet that uncompressing this data is not free, as far as CPU time spent to
accomplish this operation.
On the tour operator's product id field, a value of KUON0204C1 would require 10 bytes of storage
in the primary table, and 10 bytes of storage for every related record in any child tables, if
Unicode compression was enabled. Lets assume there are 5000 unique product id's in the main
table, and that each product id has, on average, 500 related records in a child table. If I'm
doing my math correctly, using this text-based primary / foreign key would require:
5000 (10 + (500)(10)) = 25,050,000 bytes = 24,463 KB = 23.9 MB. (with Unicode compression)
or
5000 (20 + (500)(20)) = 50,100,000 bytes = 48,926 KB = 47.8 MB. (without Unicode compression)
On the other hand, if you add a "meaningless" autonumber field, so that you now have two fields
in the table, both uniquely identifying each record, the storage requirement would be:
5000 ((4 + 10) + (500)(4)) = 10,070,000 bytes = 9, 834 KB = 9.6 MB.
The more records a database has, the more dramatic the difference becomes.
Tom
_______________________________________
Tom,
I make extensive use of AutoNumber primary key fields. However, if
there is a single field in the existing natural data which is unique, I
use this as the primary key. For example, in one of my databases I have
a table of Schools. Each school has a 4-character code which is
allocated by the educational authorities to each school in the country,
and this is used by schools and associated bodies in many ways. Am I
then going to add a "meaningless" autonumber field to this table? Eh?
And then have two fields in the table, both uniquely identifying each
record? No. And I'd be willing to bet that Tony Toews or Luke Chung
wouldn't either. Same goes for one of my databases for a tour operator.
They have a product id field where they enter stuff like KUON0204C1.
This means something to them. It is the primary key of this table, and
the basis of relationships with over 20 other tables. It works fine.
Anyway, my intention was not to re-kindle a debate which has been done
to death many times (and if we did, you'd find I was closer to you in
practice than many others would be). I just wanted to add a little
balance and breadth to the information being provided to Jack in answer
to his question.
--
Steve Schapel, Microsoft Access MVP
_______________________________________
Tom Wickerath wrote:
Hi Steve,
My response to the argument that a "natural" key should be used as a primary key is that as soon
as you have a primary key that has meaning, the need will likely come up to change it's value
sometime in the future. I think you will agree that a well chosen primary key should NEVER need
to be changed, right?
Take, for example, the 5-character primary key that you will find in the Northwind Customer's
table. It represents the name of the company, so clearly it has meaning. What if the company
decides to change it's name? You'll either have a primary key that doesn't match the pattern of
all your other primary keys, or you'll need to change the PK value. Yes, you can use Cascade
Update to cascade this change to all related tables, but, in doing so, you are putting more
stress on the database. See the following advice from Tony Toews, in the section titled: "Why I
don't like cascade updates and deletes"
http://www.granite.ab.ca/access/bestpractices.htm
I also refer you to a paper written by Luke Chung, President of FMS, Inc. See the section titled:
"Use Meaningless Field for the Key Field":
http://www.fmsinc.com/tpapers/genaccess/databasenorm.html
If one has a meaningless PK, there should never be a need to change its value. Thus, you won't
run into the issue that Tony discusses. I prefer using an autonumber data type, which the user
never sees on any forms or reports. Another benefit of doing this is that I never need to check
for a duplicate key. Why make your job harder by insisting on "natural" keys?
Text-based primary and foreign keys tend to require more bytes, on average, than a meaningless
autonumber, a long integer, which requires 4 bytes. I believe they are also slower to process,
in operations involving searching, sorting, etc.
Tom