tables with lots of zeroes

  • Thread starter Thread starter buggirl
  • Start date Start date
B

buggirl

Hi everybody,

I'm trying to design a table for my invertebrate data. I have many samples
and over 70 taxa identified. However, most of these taxa are rare and only
occur in one or two samples - therefore I end up with a table that contains
many, many zeroes, (this is a common issue in ecology).

In Excel, I always store my samples as ROWS and my taxa as COLUMNS.

I'm looking for a more efficient way of storing this data. I want to avoid
all of those zeroes!!

Any suggestions?

I would also like to be able to link this DATA table to a table containing
taxonomic information (hierarchical classification, making it easier to group
organisms as, say, 'beetles' or 'flies'). Hopefully, if I set up the DATA
table correctly then the TAXONOMIC table will be a piece of cake!

Thanks,

buggirl
 
Using "repeating columns" is a very spreadsheetly way to handle your
situation.

If you want to get the best use of Access' relationally-oriented
features/functions, though, you'll need to brush up on "relational" and
"normalization".

For example, with your design, you'd have to modify your table structure,
your query(ies), your form(s), your report(s), your code procedures,
everything if you decided to add just one more category/taxa.

With a more-normalized design, you'd have one table of taxa, one table of
"entities", and one table to show valid combinations of "individual" and
category/taxa. Your table structure might look something like:

tblEntities
EntityID
EntityTitle (what DO you call bugs, anyway?<g>)
EntityDescription
(... any other entity-specific info)

tlkpTaxa
TaxaID
TaxaTitle
TaxaDescription

trelEntityTaxa
EntityTaxaID
EntityID
TaxaID

Note that using this third table would mean you would ONLY need to add as
many records (for a given entity) as it had applicable taxa. No need to add
the "empty" (zero) records.

Good luck!

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
Hi buggirl

Forget what you did in Excel! Access is NOT a spreadsheet application, it
is a relational database application. It is NOT Excel on steroids. It
requires a different approach from the row/column setup where every
combination has a cell.

Here, for a start, you need three tables:

Species
-------
SpcID (primary key - could be an autonumber)
SpcName (text)
SpcGroup (we'll talk about this later)
... any other details which relate directly to the SPECIES

Samples
--------
SmpID (primary key - could be an autonumber)
SmpDate
SmpLocation (could be text, or if the same location is sampled
repeatedly
then it would be better to have a foreign key to a
Locations table)
... any other details which relate directly to the SAMPLE

SampleCounts
-------------
SampleFK (foreign key to SmpID)
SpeciesFK (foreign key to SpcID)
SampleCount (number)

You set up a one-to-many relationship between:
SmpID and SampleFK
SpcID and SpeciesFK

Now, if a particular species is found in a particular sample, then there
will be a corresponding record in SampleCounts. If not, there will be no
record. Simple! No more zeroes!

For your taxonomy, it really depends on how complex you want your
classification to be.

The simplest would be to have one level of species grouping (beetles, flies,
other insects, birds, etc). This would mean another table, SpeciesGroups,
with:
SpgID (PK, autonumber)
SpgName (text)

You then have a relationship between SpgID and SpcGroup.

If you want to get into the various levels of the taxonomy, you could have
several "layers" of one-to-many related tables:
Kingdoms
Phyla
Class
Order
Family
Genus
Species

However, this doesn't really work in a real taxonomy because there are all
sorts of other levels which may or may not be present in a particular
taxonomic chain, such as subphylum, superfamily, tribe, subspecies, and
variety. Also, a "node" in the tree can often go by many names - for
example, Aves/birds, or Cetoniinae/goliath beetles/flower beetles.

This might seem a bit scary! If so, I'm sorry - it was not my intention - I
just got a little bit carried away :-)

I suggest you start first with five tables: SpeciesGroups, Species,
Locations, Samples, and SampleCounts, and if you want to take the taxonomy
bit further then you can add that later.
--
Good Luck :-)

Graham Mandeno [Access MVP]
Auckland, New Zealand
 
Hi Steve

Are you joking? If you are serious then you must think Jeff is an
exceptionally fast typist. Look at his posting time - one minute after you.

All three of us answered pretty much the same thing in the space of four
minutes, because this is a standard answer to a standard problem.
 
Steve said:
Did you read my post first then plagarize it?

Steve


You are the last one who should be accusing anyone of plagarism. Are you
still trying to sell a CD of all the code you gleaned from these newsgroups
without giving credit to the original authors?

John...
 
You give yourself too much credit ... I rarely bother to read your posts!

"Plagarize" may be an actionable term.

If you bothered to check the date/time stamps you'd see that my response to
the original post happened within one minute of your response. Does it
strike you as likely or even possible that I first read yours, then, as you
accused, "plagarized" your work?

Jeff
 
Since you do not have a leg to stand on bring it on. I'm sure there are
people with deeper pockets who would be more than willing to going after you
for your abuse of these newsgroups.

John... Visio MVP
 
Ignore Steve.
This time he gave you about 40% of a correct answer which is better than
average. Usually, when he tries to answer a question without offering to do
it for you for a reasonable fee, his answer is incorrect.
 
Hooray! Thank you. I have a wonderful new database that works swimmingly.

cheers,

buggirl

Graham Mandeno said:
Hi buggirl

Forget what you did in Excel! Access is NOT a spreadsheet application, it
is a relational database application. It is NOT Excel on steroids. It
requires a different approach from the row/column setup where every
combination has a cell.

Here, for a start, you need three tables:

Species
-------
SpcID (primary key - could be an autonumber)
SpcName (text)
SpcGroup (we'll talk about this later)
... any other details which relate directly to the SPECIES

Samples
--------
SmpID (primary key - could be an autonumber)
SmpDate
SmpLocation (could be text, or if the same location is sampled
repeatedly
then it would be better to have a foreign key to a
Locations table)
... any other details which relate directly to the SAMPLE

SampleCounts
-------------
SampleFK (foreign key to SmpID)
SpeciesFK (foreign key to SpcID)
SampleCount (number)

You set up a one-to-many relationship between:
SmpID and SampleFK
SpcID and SpeciesFK

Now, if a particular species is found in a particular sample, then there
will be a corresponding record in SampleCounts. If not, there will be no
record. Simple! No more zeroes!

For your taxonomy, it really depends on how complex you want your
classification to be.

The simplest would be to have one level of species grouping (beetles, flies,
other insects, birds, etc). This would mean another table, SpeciesGroups,
with:
SpgID (PK, autonumber)
SpgName (text)

You then have a relationship between SpgID and SpcGroup.

If you want to get into the various levels of the taxonomy, you could have
several "layers" of one-to-many related tables:
Kingdoms
Phyla
Class
Order
Family
Genus
Species

However, this doesn't really work in a real taxonomy because there are all
sorts of other levels which may or may not be present in a particular
taxonomic chain, such as subphylum, superfamily, tribe, subspecies, and
variety. Also, a "node" in the tree can often go by many names - for
example, Aves/birds, or Cetoniinae/goliath beetles/flower beetles.

This might seem a bit scary! If so, I'm sorry - it was not my intention - I
just got a little bit carried away :-)

I suggest you start first with five tables: SpeciesGroups, Species,
Locations, Samples, and SampleCounts, and if you want to take the taxonomy
bit further then you can add that later.
--
Good Luck :-)

Graham Mandeno [Access MVP]
Auckland, New Zealand





buggirl said:
Hi everybody,

I'm trying to design a table for my invertebrate data. I have many samples
and over 70 taxa identified. However, most of these taxa are rare and only
occur in one or two samples - therefore I end up with a table that
contains
many, many zeroes, (this is a common issue in ecology).

In Excel, I always store my samples as ROWS and my taxa as COLUMNS.

I'm looking for a more efficient way of storing this data. I want to avoid
all of those zeroes!!

Any suggestions?

I would also like to be able to link this DATA table to a table containing
taxonomic information (hierarchical classification, making it easier to
group
organisms as, say, 'beetles' or 'flies'). Hopefully, if I set up the DATA
table correctly then the TAXONOMIC table will be a piece of cake!

Thanks,

buggirl
 
That's great news! This is one database I hope won't be bug-free ;-)

Post back later if you are interested in more of the taxonomy stuff.
--
Good Luck :-)

Graham Mandeno [Access MVP]
Auckland, New Zealand

buggirl said:
Hooray! Thank you. I have a wonderful new database that works swimmingly.

cheers,

buggirl

Graham Mandeno said:
Hi buggirl

Forget what you did in Excel! Access is NOT a spreadsheet application,
it
is a relational database application. It is NOT Excel on steroids. It
requires a different approach from the row/column setup where every
combination has a cell.

Here, for a start, you need three tables:

Species
-------
SpcID (primary key - could be an autonumber)
SpcName (text)
SpcGroup (we'll talk about this later)
... any other details which relate directly to the SPECIES

Samples
--------
SmpID (primary key - could be an autonumber)
SmpDate
SmpLocation (could be text, or if the same location is sampled
repeatedly
then it would be better to have a foreign key to
a
Locations table)
... any other details which relate directly to the SAMPLE

SampleCounts
-------------
SampleFK (foreign key to SmpID)
SpeciesFK (foreign key to SpcID)
SampleCount (number)

You set up a one-to-many relationship between:
SmpID and SampleFK
SpcID and SpeciesFK

Now, if a particular species is found in a particular sample, then there
will be a corresponding record in SampleCounts. If not, there will be no
record. Simple! No more zeroes!

For your taxonomy, it really depends on how complex you want your
classification to be.

The simplest would be to have one level of species grouping (beetles,
flies,
other insects, birds, etc). This would mean another table,
SpeciesGroups,
with:
SpgID (PK, autonumber)
SpgName (text)

You then have a relationship between SpgID and SpcGroup.

If you want to get into the various levels of the taxonomy, you could
have
several "layers" of one-to-many related tables:
Kingdoms
Phyla
Class
Order
Family
Genus
Species

However, this doesn't really work in a real taxonomy because there are
all
sorts of other levels which may or may not be present in a particular
taxonomic chain, such as subphylum, superfamily, tribe, subspecies, and
variety. Also, a "node" in the tree can often go by many names - for
example, Aves/birds, or Cetoniinae/goliath beetles/flower beetles.

This might seem a bit scary! If so, I'm sorry - it was not my
intention - I
just got a little bit carried away :-)

I suggest you start first with five tables: SpeciesGroups, Species,
Locations, Samples, and SampleCounts, and if you want to take the
taxonomy
bit further then you can add that later.
--
Good Luck :-)

Graham Mandeno [Access MVP]
Auckland, New Zealand





buggirl said:
Hi everybody,

I'm trying to design a table for my invertebrate data. I have many
samples
and over 70 taxa identified. However, most of these taxa are rare and
only
occur in one or two samples - therefore I end up with a table that
contains
many, many zeroes, (this is a common issue in ecology).

In Excel, I always store my samples as ROWS and my taxa as COLUMNS.

I'm looking for a more efficient way of storing this data. I want to
avoid
all of those zeroes!!

Any suggestions?

I would also like to be able to link this DATA table to a table
containing
taxonomic information (hierarchical classification, making it easier to
group
organisms as, say, 'beetles' or 'flies'). Hopefully, if I set up the
DATA
table correctly then the TAXONOMIC table will be a piece of cake!

Thanks,

buggirl
 
Back
Top