Table Design - Many Data Fields

  • Thread starter Thread starter Fiaz Idris
  • Start date Start date
F

Fiaz Idris

I am an MSExcel user migrating to MSAccess.

For each day of the year, there are about 237 data fields to
keep track and I need to generate reports every week, month, year
from these data points. The data fields to keep track may increase
in the future.

Not every fields are related in some way.

What I mean is...
fld01 to fld30 may be related to the same machine
and the other fields have nothing to do with fld01 to fld30.


Example:

fldDate, fld01, fld02, ..., fld237
01Jan04, 1500, 37, ..., 30%
02Jan04, 600, 50, ..., 20%
....
....
31Nov04, 300, 500, ..., 50%
....
....

I have one table design in mind but I don't know if it is any good.

Design (for example):

tblData01To25
fldDate, fld01, ..., fld25

tblData25To50
fldDate, fld26, ..., fld50

....
....

and so on for all the 237 fields with one-to-one relationship.

Could someone please guide me how I might go on with this.

I am an access newbie.
 
Fiaz

Excel is a spreadsheet, and pretty much limits your data structure to one
large table (sheet).

Access is a relational database, and eliminates the need to throw everything
into a single large table. Ditto for using one table per object (one sheet
per machine). In fact, you will have major headaches and heartburn if you
try to use Access as you've been using Excel.

A scan of the tablesdbdesign newsgroup will reveal a strong consensus that
any table design that includes more than 40 (30, 50, opinions differ) fields
is probably not well-normalized. To make good use of Access' strengths,
you'll need to read up on normalization, and you'll want to do some design
work before creating table structure. The general notion is that you will
be using Access tables to model objects, with characteristics, and
relationships to other objects.

From your description, it sounds like one such object (entity) is
"machines". But your suggestion of using one table per machine still leaves
problems, unless (and your description isn't clear on this) each "machine"
has totally different attributes/characteristics, as they might if machine1
was a pencil, machine2 was a automobile, machine3 was a lathe, ... Somehow,
I doubt this, as your suggested field layout seemed to include the same
"type" of fields.

Without further information, the following is only a wild guess at a
structure for Access that would allow you to readily query your data for
daily, weekly, ... reports. I've used generic field names because you
haven't provided concrete examples.

tblMachine
MachineID
MachineCharacteristic1 (this is something about a machine that
rarely/never changes)
MachineCharacteristic2 (ditto)
...

tblReadings
ReadingID
MachineID (a "foreign key", from tblMachine, indicating the machine for
which the reading applies)
ReadingDate (the date the readings were taken)
TotalPoundsProduced (I don't have any idea what these machines are
doing...)
HoursRun (ditto)
ElectricityConsumed (ditto)
...

NOTE: none of these fields probably need to be "calculated" fields --
Access handles calculations nicely in queries. By not including calculated
fields (?% fields?), you gain the benefit of avoiding having to keep all
fields related to the calculation in synchronization.
 
Thanks for your reply. More details as below:

We are running the Baggage Handling System at an airport and there
are several routing a baggage can take to reach the flight loading
points.

So, there are Check In Counters 1 to 18 with daily bag count for each
of them. And there are bags inducted on to sorters at 12 induction points.
Baggage tag read rate on those induction by an automated scanner (as a
percentage of total bags) for each the 12 induction points. Bag counts
for about 100 tilting locations and so on. Number of times the sorters
are conveyors stopped due to faults, etc. etc.

We are not tracking the count for some of the baggage exiting points
or faults, but in future we might include them.

Mainly the data are bag counts, percentage no read, fault counts for several
object in the system.

I will be plotting graphs at the end of a month, for example:

Check-In Counter Bag Counts
Read Rate On Sorter Inductions
Bag Count To Fault Ratio
....
....

If you have any more to add to what you have said, please...

And I will not be storing any calculated fields in the table. The percentages
are automatically generated at the source of data collection.
 
Thanks for your reply. More details as below:
We are running the Baggage Handling System at an airport and there
are several routing a baggage can take to reach the flight loading
points.

So, there are Check In Counters 1 to 18 with daily bag count for each
of them. And there are bags inducted on to sorters at 12 induction points.
Baggage tag read rate on those induction by an automated scanner (as a
percentage of total bags) for each the 12 induction points. Bag counts
for about 100 tilting locations and so on. Number of times the sorters
are conveyors stopped due to faults, etc. etc.

We are not tracking the count for some of the baggage exiting points
or faults, but in future we might include them.

Mainly the data are bag counts, percentage no read, fault counts for several
object in the system.

I will be plotting graphs at the end of a month, for example:

Check-In Counter Bag Counts
Read Rate On Sorter Inductions
Bag Count To Fault Ratio
...
...

If you have any more to add to what you have said, please...

And I will not be storing any calculated fields in the table. The percentages
are automatically generated at the source of data collection.

This would suggest the tables:
tblCounter: CounterNum
tblInduction: InductionNum
tblTilting: TiltingNum
tblTrans: DateTime, TypeCount (counter/induction/tilting),
StationNum(CounterNum/InductionNum/TiltingNum), NoBagsCounted, other
variables.
where DateTime is to the resolution required - ie hourly, every minute,
whatever.

One way, anyway.
Marc
 
Thanks for your reply. More details as below:

We are running the Baggage Handling System at an airport and there
are several routing a baggage can take to reach the flight loading
points.

So, there are Check In Counters 1 to 18 with daily bag count for each
of them. And there are bags inducted on to sorters at 12 induction points.
Baggage tag read rate on those induction by an automated scanner (as a
percentage of total bags) for each the 12 induction points. Bag counts
for about 100 tilting locations and so on. Number of times the sorters
are conveyors stopped due to faults, etc. etc.

The problem you're experiencing is that you're storing *data* - the
checkin counter number - in a fieldname. That's the wrong place!

You should instead have a field CheckInCounter; each bag would have a
value in this field, 1 to 18 (at present; when you add two more
counters you'll be able to insert 19 or 20 rather than having to
redesign the structure of your database).
We are not tracking the count for some of the baggage exiting points
or faults, but in future we might include them.

Similarly, if you track exit points, you need an ExitPoint field in
your table. Just store the exit point *as data*.
Mainly the data are bag counts, percentage no read, fault counts for several
object in the system.

I will be plotting graphs at the end of a month, for example:

Check-In Counter Bag Counts
Read Rate On Sorter Inductions
Bag Count To Fault Ratio

Very easy, using Totals queries and grouping by CheckInCounter.
 
I have come up with this finally, so could someone please proofread.

Using Three Tables:

Table1: tblDates
Field1: fldDate -Date/Time (Primary Key) ##(Every single day of
the year)

Table2: tblObjects
Field1: fldObjectName -Text[20] (Primary Key) ##(The objects I want
to keep track of)

Table3: tblValues
Field1: fldDate -Date/Time (Foreign Key)
Field2: fldObjectName -Text[20](Foreign Key)
Field3: fldValue -Double ##(The values for each of the objects for
everyday)

Relationships:

tblDates [1] to [MANY] tblValues
tblObjects [1] to [MANY] tblValues

Questions that are bothering me:

tblDates will remain fixed for 366 cells / year
tblObjects will remain fixed for 237 cells / year

and tblValues for 237 objects I need to keep track everyday, I will be
using
237 rows x 3 cols = 711 cells every single day in tblValues bringing
the
total to 260226 cells / year (is this too high? or low?).

So, Am I doing something wrong or maybe right????
How about the file size and performance at the end of the year using
the
database method?
I plan to store at least 6 years of data in a single .mdb.

Current practice using 1 Excel Workbook per month with 31 sheets
from 010104 to 310104 for example and each sheet have an object name
and a value box for operator to type in and the file size is approx.
370KBytes and of course 12 .xls files for the whole year.

I hope to have some insight from you pros.

Thanks.
 
I have come up with this finally, so could someone please proofread.

Using Three Tables:

Table1: tblDates
Field1: fldDate -Date/Time (Primary Key) ##(Every single day of
the year)

This would not ordinarily be needed unless there is some information
that you're storing about July 7 that is different than the
information you stored about July 6. You can have such a table, of
course, but if it has no fields other than fldDate I don't see any
real point in it!
Table2: tblObjects
Field1: fldObjectName -Text[20] (Primary Key) ##(The objects I want
to keep track of)

What is an "object" in this context, just to help me understand?
Table3: tblValues
Field1: fldDate -Date/Time (Foreign Key)
Field2: fldObjectName -Text[20](Foreign Key)
Field3: fldValue -Double ##(The values for each of the objects for
everyday)

I would suggest making Field1 and Field2 a joint, two-field Primary
Key. This can be done even if you don't have tblDates; you can make
fldDate automatically fill in today's date by setting its Default
property to Date() if you wish.
Relationships:

tblDates [1] to [MANY] tblValues
tblObjects [1] to [MANY] tblValues
fine

Questions that are bothering me:

tblDates will remain fixed for 366 cells / year

So the database will be scrapped on December 31? There are only 366
dates this year, but there were 365 last year and will be 365 more
next year!

tblObjects will remain fixed for 237 cells / year

So there are - and always will be - exactly and only 237. You'll never
add a new object, you'll never delete an object? And what do you mean
by "per year" in this context? Do you get a new and different set of
objects next January?
and tblValues for 237 objects I need to keep track everyday, I will be
using
237 rows x 3 cols = 711 cells every single day in tblValues bringing
the
total to 260226 cells / year (is this too high? or low?).

You are still thinking in spreadsheet mode.

ACCESS TABLES DO NOT HAVE CELLS.

Cells are for spreadsheets!

Access tables have *records* which consist of *fields*, to use the
proper jargon. And you might or might not have 237 records every day -
do you need to record a value for every single one of the objects,
every single day? If not, just don't create a record.

And if you do, you'll have 86,742 records over the course of a year
(each consisting of 16 bytes plus a little overhead). This is not
tiny, but it's not big; a database can contain 2 *billion* bytes, and
in ten years you'll have fewer than 16 million. A 10,000,000 row table
is getting pretty big for Access but you're nowhere close.
 
John Vinson said:
This would not ordinarily be needed unless there is some information
that you're storing about July 7 that is different than the
information you stored about July 6. You can have such a table, of
course, but if it has no fields other than fldDate I don't see any
real point in it!

I am just using the table to store the dates for everyday
that is linked with tblValues.

If you say, it is not needed, then I guess it is NOT needed.

Table2: tblObjects
Field1: fldObjectName -Text[20] (Primary Key) ##(The objects I want
to keep track of)

What is an "object" in this context, just to help me understand?

fldObjectName is the various objects I am keeping track of like
and the corresponding values will be in the tblValues

fldObjectName --> fldValue in tblValues
-------------
Sorter 1 Bag Jam --> Count of bag jams
Sorter 2 Bag Jam --> Count of bag jams
....
....
Sorter 1 Induct 1 --> Bag Count
Sorter 1 Induct 2 --> Bag Count
....
....
Exit 1 --> Bag Count
Exit 2 --> Bag Count
....
....
Sorter 1 Induct 1 --> Read Rate (this is a percentage value
Sorter 1 Induct 2 --> Read Rate (and i don't if it is good to
Sorter 2 Induct 1 --> Read Rate (store it as a double and what if I
Sorter 2 Induct 2 --> Read Rate (need to store some text instead as for
Sorter 1 Operator --> Name (<<<the object in this line ?????
....
....
and so on

Table3: tblValues
Field1: fldDate -Date/Time (Foreign Key)
Field2: fldObjectName -Text[20](Foreign Key)
Field3: fldValue -Double ##(The values for each of the objects for
everyday)

I would suggest making Field1 and Field2 a joint, two-field Primary
Key. This can be done even if you don't have tblDates; you can make
fldDate automatically fill in today's date by setting its Default
property to Date() if you wish.

OK. actually I could understand the technical setup of what you say
above. But, I don't know what I have to do and if is this how I
want it. Yes. I am dumb. :-)

Relationships:

tblDates [1] to [MANY] tblValues
tblObjects [1] to [MANY] tblValues
fine

Questions that are bothering me:

tblDates will remain fixed for 366 cells / year

So the database will be scrapped on December 31? There are only 366
dates this year, but there were 365 last year and will be 365 more
next year!

tblObjects will remain fixed for 237 cells / year

So there are - and always will be - exactly and only 237. You'll never
add a new object, you'll never delete an object? And what do you mean
by "per year" in this context? Do you get a new and different set of
objects next January?

Yes. There are 237 items or objects (as I have put above) I need to
keep track of at this point in time. Maybe I will include some more
later on.
You are still thinking in spreadsheet mode.

ACCESS TABLES DO NOT HAVE CELLS.

Cells are for spreadsheets!

Access tables have *records* which consist of *fields*, to use the
proper jargon. And you might or might not have 237 records every day -
do you need to record a value for every single one of the objects,
every single day? If not, just don't create a record.

Yes I have 237 records every single day each with a corresponding value
assigned to it like a dictionary.

fldObjectName ==> fldValue
in tblObjects in tblValues

Object ==> Value
Object ==> Value
....
....
for 237 records in tblObjects

and I don't know what I am going to do for the Value part because
it is not only numbers I am entering, sometimes it is a percentage,
or a text...
And if you do, you'll have 86,742 records over the course of a year
(each consisting of 16 bytes plus a little overhead). This is not
tiny, but it's not big; a database can contain 2 *billion* bytes, and
in ten years you'll have fewer than 16 million. A 10,000,000 row table
is getting pretty big for Access but you're nowhere close.

Yes that's what I thought.

Thanks for your time. I hope you could just bear with me for a few more
postings (atleast one)
 
I am just using the table to store the dates for everyday
that is linked with tblValues.

If you say, it is not needed, then I guess it is NOT needed.

It might be needed if, for instance, you want to restrict what dates
may legitimately be entered into tblValues. If you want any date to be
valid, and you're not storing any information other than the date
itself, it probably isn't worth the overhead of the additional table.
Table2: tblObjects
Field1: fldObjectName -Text[20] (Primary Key) ##(The objects I want
to keep track of)

What is an "object" in this context, just to help me understand?

fldObjectName is the various objects I am keeping track of like
and the corresponding values will be in the tblValues

fldObjectName --> fldValue in tblValues

Again, it sounds like you're "thinking spreadsheet".

You're storing TWO PIECES of information - which sorter was involved,
and the type of incident, in the same field; fields *should be atomic*
and should have only one value.

Rather than storing a text string "Sorter 1 Bag Jam" you should
consider having a table of Sorters, and a field for sorter; and a
table of EventTypes (contining values such as "bag count", "bag jam",
etc.) and a field EventType. tblValues would then contain records like

EventDate; Sorter; EventType; Count
7/11/04; 1; "Bag Count"; 63
7/11/04; 1; "Bag Jam"; 2

...
...
Sorter 1 Induct 1 --> Bag Count
Sorter 1 Induct 2 --> Bag Count
...
...
Exit 1 --> Bag Count
Exit 2 --> Bag Count
...
...
Sorter 1 Induct 1 --> Read Rate (this is a percentage value
Sorter 1 Induct 2 --> Read Rate (and i don't if it is good to
Sorter 2 Induct 1 --> Read Rate (store it as a double and what if I
Sorter 2 Induct 2 --> Read Rate (need to store some text instead as for
Sorter 1 Operator --> Name (<<<the object in this line ?????
...

I'm sorry,now you've lost me completely.
and I don't know what I am going to do for the Value part because
it is not only numbers I am entering, sometimes it is a percentage,
or a text...

Well, a percentage is a number; often a percentage need not be stored
in the table at all, if it can be calculated on demand from data in
the table. You could also have a number field and a text field, and
use one or the other or both as appropriate.
 
I have tried various ways of setting up the tables and relationships
from your clues, but get stuck in one place or the other.

So, kindly visit the following link to download the original excel
sample file which I want to convert to Access and send your sample.mdb
file to my e-mail address.

The URL is http://www.geocities.com/fiazidris/jv.html

Thank you.
 
Back
Top