Hi Tom. Thanks for taking the time to help me with this. I am trying to
compare a genus/species list from the Dallas World Aquarium with a large
taxonomic database from the National Center for Biotechnology Information
(NCBI). The goal is to produce taxonomic hierarchies for the 114 species at
the aquarium/zoo, using the parentID structure of NCBI's data.
I currently have an Access database that can generate the hierarchies
(thanks, Graham Mandeno!), so my first step is to match genera between the
small and large datasets in order to assign NCBI NodeIDs to the new
organisms. Sorry to bore you with the details, but this is why I need to be
able to run a find duplicates query.
The names in the large database contain up to four words each. Anyway, I
tried the new statement and it ran, but it returned no records. My first
question is: will this statement return only the first word it finds in the
column, or does it also include some "duplicate finding" code? I ran it only
against the NCBI NodeName column, hoping it would return just the first
words, which I could then paste into a new table.
I'm sorry if I have been unclear here. Here are the steps that I hope
clarify my goal.
1. I first need a query to return the first word of each record in the
NodeName column (190,000 records).
2. I would like to paste these new single-word records into a new table.
3. Work out how to "find duplicates" between the small table (114
records) and the newly created one.
4. Assign the NCBI NodeIDs to the 114 aquarium names.
5. Generate taxonomic hierarchies for the 114 critters (monkeys, stingrays,
toucans, etc.)
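
In case it helps to show what I mean, here is a rough sketch of steps 1 to 4
written in Python rather than as an Access query. It is only an illustration
of the logic, not the statement I ran. The function names (first_word,
match_genera) and the sample rows are made up for this example, and the
NodeIDs in it are invented, not real NCBI values. It also assumes that the
first word of a NodeName is the genus, which may not hold for every record.

```python
def first_word(name):
    """Step 1: return the first whitespace-separated word of a name."""
    parts = name.split()
    return parts[0] if parts else ""


def match_genera(ncbi_nodes, aquarium_names):
    """Steps 2-4: index NCBI rows by first word, then look up each aquarium name.

    ncbi_nodes     -- iterable of (node_id, node_name) pairs
    aquarium_names -- iterable of "Genus species" strings
    Returns a dict mapping each aquarium name to a list of candidate NodeIDs.
    """
    # Step 2: the "new table" of first words, kept here as a dictionary.
    ids_by_first_word = {}
    for node_id, node_name in ncbi_nodes:
        key = first_word(node_name).lower()
        ids_by_first_word.setdefault(key, []).append(node_id)

    # Steps 3 and 4: match on the first word (case-insensitive) and
    # collect every NodeID that shares it. An empty list means no match.
    return {
        name: ids_by_first_word.get(first_word(name).lower(), [])
        for name in aquarium_names
    }


# Invented sample rows; the IDs are placeholders, not real NCBI NodeIDs.
sample_ncbi = [
    (1, "Ramphastos"),
    (2, "Ramphastos toco"),
    (3, "Potamotrygon motoro"),
]
sample_aquarium = ["Ramphastos toco", "Saimiri sciureus"]

for name, node_ids in match_genera(sample_ncbi, sample_aquarium).items():
    print(name, node_ids)
```

Matching on the first word alone can return several NodeIDs for one
aquarium name (the genus row plus every species in that genus), so I would
still need to pick the right one from the candidates.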
Nick