Relational Design Question

  • Thread starter Thread starter Ben
  • Start date Start date
B

Ben

Hi group-
I'm seeking input on a design question. I have a plan of
attack, but before implementing it, I want to make sure
there are no gotchas I haven't thought of, or bad ideas I
didn't notice.
I have a situation where I can have a number of different
types of documents tied to a case (not literal documents,
but logical groups of information). They all have some of
the same information, but each type has special
information all its own. The old design had one big table
with fields for every type (like 130/140 fields) - this
resulted in a lot of null fields and many other
headaches, and just seems like poor relational design.

My thought is to have a table that relates a case to a
document called tblDocuments, this has an identity field,
DocumentID, the case ID, fields that are common to all
documents, and a document type field. Then a table for
each type of document with the specialized fields. These
tables have a DocumentID field that matches up to
tblDocuments.DocumentID (I would call this a 1 to <1
relationship I guess). tblDocuments.DocumentType tells
(through a lookup table) what DocumentType table to look
at for the specialized information.

I realize that this probably would cause a coing mess,
except for the fact that each case can only have *one*
type of document asscociated with it (though it can have
multiple incidences of that type).

Thanks in advance, any and all input is much appreciated.
Ben
 
relationship I guess). tblDocuments.DocumentType tells
(through a lookup table) what DocumentType table to look
at for the specialized information.


No, it either matches exactly to ONE document type in
all your document type tables, either because you
manually make sure that each document type is unique,
or because you have one master document type table
that has the unique document type identifiers.

If that is too much work, you can have a table
identifier INSIDE each Document Type table, (so that
you can have a multiple field unique document identifier)

But you SHOULD not just have a table that tells
you what table to look in: YOU NEED TO HAVE A UNIQUE
DOCUMENT TYPE ID FOR EACH DOCUMENT TYPE.

Believe me, I've had to work with systems that did
not have unique identifiers across multiple tables,
and you don't want to go down that path.....

(david)
 
My thought is to have a table that relates a case to a
document called tblDocuments, this has an identity field,
DocumentID, the case ID, fields that are common to all
documents, and a document type field. Then a table for
each type of document with the specialized fields.

This is a very normal scheme, known as Subclassing (the * is a PK, the + is
a foreign key):

Documents(*DocNum, CaseNo+, EditedBy+, OriginDate, etc)

HandWritten(*DocNum+, WrittenBy, IsLegible, WhereStored,..)

FaxRecvd(*DocNum+, DateRcvd, FromNumber, HasFollowUp,..)

EMailSent(*DocNum+, SentTo, WasFollowedUp,...)
tblDocuments.DocumentType tells
(through a lookup table) what DocumentType table to look
at for the specialized information.

No need for a DocumentType field: the presence of a record in the
appropriate table tells you what type of document it is. What happens if
you have an update error and the DType says FaxReceived and there is no
record in the FaxReceived table? -- it's redundant information and
therefore dangerous.

Unfortunately, Access cannot guarantee that you only can have a particular
DocNum in one of the subclassing tables -- for that you need a proper DBMS
that provides triggers. As long as you can restrict all data input and
editing to your forms, and keep people out of the tables and queries then
you can get quite close, though.

HTH


Tim F
 
Thank you for your input. I think I phrased myself poorly
because you said better what I plan to do that I did. I
plan to have one table with the primary keys and pass
those primary key values on to the other tables as thier
PKs. I call it a 1 to <1 because there isn't a record in
each of the sub tables that matches the primary table,
but the sum of all the records in all the sub tables
equals the number of records in the primary table.
Believe me, I've had to work with systems that did
not have unique identifiers across multiple tables,
and you don't want to go down that path.....

Me too, that's what I'm trying to fix ;)

Thanks again,
Ben
 
Why not put all of the fields from those 4 tables into one table?

You can't use a one to many join to multiple tables anyway.

I would take all of those fields, and put them into one table. It is no big
deal that *some* of the fields will not be used.

Clearly, from a normalizing point of view, you have things like:

Docnum, createdby, etc.

There is fairly large number of attributes that are common to those tables.

I would work very hard to avoid multiple tables here. Relations are one to
many records, NOT one to many tables (trying to relate multiple tables is a
real no no here). Further, those 4 tables can be replaced by one field
called doctype:

doctype: Document, HandWritten, FaxRecvd, EMailSent ...etc.

Further, over time you can extend this to add more types without adding new
tables.

We all know that designs that requite you to add a new table for a new type
of "anything" is a sign that things are REALLY wrong. If you have separate
tables for each type of document, then each type of "search" that you make
will have to be custom built. Same goes for reporting. Each new type of
document will need a different report. You will wind up with 4 times the
number of reports, and very possibly 4 times the amount of forms.

Normalzing says you should not have to do this.

If you trying to decide between using several tables, or having one table
with some empty fields....go the empty field route, or normalize even
further, but don't use a complete new table for each type of document.
 
We all know that designs that requite you to add a new table for a new
type of "anything" is a sign that things are REALLY wrong.

I hate to disagree with a MVP, but in this case I am afraid that this is
just plain wrong. Of course you add a new table in order to incorporate a
new type of thing: a table represents a real-world thing. This is very
different from adding a new table for a new group of records (Sales1999,
Sales2000, etc). If you have desks, you have a Desks table; if you then
want Cats you have to have a Cats table. You don't add Whiskers and
TailLength to desks.
If you have
separate tables for each type of document, then each type of "search"
that you make will have to be custom built.

Of course, and that's the point. There is no point in querying Faxes for
EmailAddress, nor HandWrittenNotes for DeletedFromMailBox. If there's
anything you want to look at all documents for, then it's in the Documents
table.

It should be noted that this wide=table approach gets a bit nearer to
solving the problem of only having one type per document, by creating a
monstrous Table Validation like

(DType="Fax" XOR FaxNumber IS NULL) AND
(DType="Fax" XOR IsUrgent IS NULL) AND
(DType="Email" XOR FromAddress IS NULL)...

To return to the issue of adding new document types: say that someone
invents the LaserSkyWritingMessage system. Now, Ben can either create a new
table with LaserColour, CloudCover etc. Otherwise, he could add all these
attributes to the Documents table, rebuild the database, rewrite all the
queries and re-design the forms, discover he's gone over 255 fields and has
to work out what to remove.... Talk about separating Data from Processing?
Oh, and don't forget to update and debug the TableValidation rule too.
Same goes for reporting.
Each new type of document will need a different report. You will wind
up with 4 times the number of reports, and very possibly 4 times the
amount of forms.

Well, of course it does: there are four times as many incongruent objects.
How often do you design an interface that handles Desks, Cats, Faxes and
MowingTheLawn in one form?
Normalizing says you should not have to do this.
And this, of course, has nothing to do with Normal theory: there is nothing
there that looks at dependencies on the key of _another_ entity.

All the best


Tim F
 
You make a great case here Tim!
Of course, and that's the point. There is no point in querying Faxes for
EmailAddress, nor HandWrittenNotes for DeletedFromMailBox. If there's
anything you want to look at all documents for, then it's in the Documents
table.

The above is true, but then again, if the doc type is a field, then that
search type can now be a combo box, or even a multi-select listbox. The
coding to search for this will be much easer then multi-tables. So, we want
a design where we JUST have to add new records...but never new tables, or
new fields.
To return to the issue of adding new document types: say that someone
invents the LaserSkyWritingMessage system. Now, Ben can either create a new
table with LaserColour, CloudCover etc. Otherwise, he could add all these
attributes to the Documents table, rebuild the database

Well, yes, but if we normalizing the design, then we don't ever add
attributes by adding fields...but can do so by adding new records. Adding
new fields is a no no. I did not mean to hint that we ever add new fields. I
perhaps was not 100% clear in my other post. So, to add the above, we simply
would add a new record to the Document type table. We would also just have
to add a new records to the other fields that we have. All of our existing
screens, reports etc would still function, and not have to be modified at
all. The key, or goal is to NOT have to add new fields or tables (my
apologies for confusing this!).

So, given our goal of no new fields, and no new tables, we wish to be able
to add new types of "things" that we want to file/store in the office.

We start to get something like:
tblDocument:
DocNum, DocType, CreationType, CreationInfo, IsLegible, PhysicalType,
DeliveryType

So, for above, we have

docTypes: ms-Word, HandWritten, Fax, Email, LaserSkyWritingMessage.

CreationTypes: Email, InternalPerson, WrittenBy, EmailBy, Singing tellagram,

CreatintType: ( text for above type field ).

IsLegible

PhysicalType: Paper, Electrong, MicroFilum, Tape, Disk, CD, Stone Tablets

DeliveryType: Electrong, PostMan, FedEx

WhereStored:

Anyway, you can see that I did the above as I just typed this email. If one
sits down and starts really normalizing out the structure, in a few hours of
work, I am sure a table structure that will handle most physical things you
can store in a office from books to dat tapes can be dealt with in the above
type structure. Of course, I am implying that any of those "lists" above can
be extended, and they of course are simple lookups to tables. Note how I can
add new types of things that I plan to store, and NOT have to add new
tables, or new fields. That should be the goal here.
 
Albert & Tim-
Thanks again for your thoughtful insight. I think I know
how to get this thing into 3NF now, maybe my idea can be
confirmed.
I made a mistake in the original post by being too
generic with 'documents'. I'm building an app that
manages and creates filings for government forms. I'm
actually storing what is in the forms on the tables for
auditing and reporting purposes. The app actually
generates the form with the data. So, alot of these
documents are similar in that they share sections, but
they each have some different sections. So, I need to
keep track of the data, which type of form was filed,
if/when/how it was filed, and what was on the form.
So, to get to 3NF, and never need to add fields, only
records, I think I would need a lookup table with
FieldID, DocumentID, FormFieldName, and maybe
FormFieldDatatype. Then a table of document data that has
FieldID and FieldData which is basically strings and all
validation happens in code. Finally, a table resolving
the CaseID with the DocumentID. There'd also have to a
table resolving a type of document with what fields go on
it.
This sounds ideal, and the most flexible, I'm just not
sure if I can migrate the existing table (213 fields, I
checked) into this more normalized structure.
Thanks for your time and your thoughts,
Ben
 
The app actually
generates the form with the data. So, alot of these
documents are similar in that they share sections, but
they each have some different sections. So, I need to
keep track of the data, which type of form was filed,
if/when/how it was filed, and what was on the form.

This is indeed starting to sound much more like a survey type of setup:

Documents(*DocNumber, TypeOfDocument, etc..)

DocAttributes(*AttCode, English)

DocDetails(*DocNumber+, *AttCode+, Value)

and perhaps some structural stuff to control what kinds of DocAttributes
can be allocated to what TypeofDocument, and even what valid Values are.
So, to get to 3NF, and never need to add fields, only
records, I think I would need a lookup table with
FieldID, DocumentID, FormFieldName, and maybe
FormFieldDatatype.

Except for the Name (which is dependent only on the FieldID and not the
whole primary key) this is pretty much it.

It has the disadvantage of needing a lot of programming and a lot of UI
magic to make it work for the user: when I made a system like this I ended
up using Excel for the front end because it was so much more flexible than
Access forms.

As Albert says, though, subclassing does make it a bit messier to add
document types as the price for a much simpler UI design. It comes down, in
the end, to how stable the sections and document details are; and how
capable your users are at learning a complex user interface.

All the best


Tim F
 
Unfortunately, Access cannot guarantee that you only can have a
DocNum in one of the subclassing tables -- for that you need a


You have a master table with a unique index, and each
subtable key is constrained to be a member of that set.

You can also enforce the condition that there is no overlap
between the subtable keys, but I lack the patience to write
that essay....

(david)
 
Back
Top