Unnecessary indexes

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I've been inherited an Access application to support, and I noticed that lots
of tables have unnecessary indexes. In other words, some indexes have the
Primary property to No, Unique = No, and Ignore Nulls = No. So I ask myself
if all these three properties are No, then why is there an index? Does Access
create these indexes "behind the scenes"?
 
Access my create indexes on any fields with names containing "code", "ID",
"Num". This is determined in the Tools->Options menu.
You should have indexes on fields that might be used in joins, sorting, and
criteria.
 
Ray said:
I noticed that lots
of tables have unnecessary indexes. In other words, some indexes have the
Primary property to No, Unique = No, and Ignore Nulls = No. So I ask myself
if all these three properties are No, then why is there an index?

First, you place indexes on columns or groups of columns to make accessing
them faster in queries. This is especially useful on joined columns and
sorts on the columns that aren't the primary key of the table. Placing an
index on columns used in where clauses can also speed up queries.

Second, why would you think that disallowing nulls in an index would make the
index unnecessary? Disallowing nulls means there *has* to be a value in that
column (or index) which is often desired in an index even if it's not a
primary key or unique index. Pretty useful since you can't join on nulls.
Does Access
create these indexes "behind the scenes"?

Access automatically puts indexes on new columns like ID, Code, and Num for
new/imported tables by default unless you change this setting in Tools |
Options | Tables/Queries. Access also automatically creates an index on
relations where referential integrity is enforced.
 
No, what I mean was why have ALL three properties at No. I understand that
disallowing Nulls is important but I ask why is it necessary if all three
properties are at No. After all, if it's not a Primary key and that you allow
duplicates (Unique=No) AND Ignore Nulls=No....then why is there an index? I
guess what I'm asking is about the Unique property specifically. If its No,
then the index really isn't an index that helps find the row faster...

Ray
 
I guess what I'm asking is about the Unique property specifically. If its No,
then the index really isn't an index that helps find the row faster...

You're mistaken, there! Nonunique indexes do indeed make both
searching and sorting faster, often dramatically faster. Access can
(for example) use the index to return an array containing ONLY the
desired records, before even looking at the table. If you don't have
an Index JET must run a full table scan, retrieving every record in
the table.

John W. Vinson[MVP]
 
Maybe it's the way I'm visualizing an index. I always thought an index is a
copy of the entire table...only sorted differently by the fields in the index.
 
Maybe it's the way I'm visualizing an index. I always thought an index is a
copy of the entire table...only sorted differently by the fields in the index.

Absolutely not. It's a "B-Tree", a highly optimized, compact index to
the positions of the records on disk.

John W. Vinson[MVP]
 
Ray

Consider...

Is the index in a box just the entire contents of the book resorted? Or a
collection of "pointers" to help you find something faster?

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
Ray

My fingers were faster than my brain.

That line should have read

"Is the index in a book just the ..."

Makes a bit more sense, no?

Jeff
 
Ray said:
I noticed that lots
of tables have unnecessary indexes. In other words, some indexes have the
Primary property to No, Unique = No, and Ignore Nulls = No. So I ask myself
if all these three properties are No, then why is there an index?

You appear to be confusing constraints (not null, unique, primary key)
and indexes. Constraints are for data integrity, indexes are for
performance. There is also a differentiation between the logical
(constraints) and physical implementation (indexes). Constraints form
part of the schema catalog, indexes do not (and therefore are absent
from the SQL-92 standard to which Jet aspires).

The confusion is caused by the Access user interface, which makes no
attempt to differentiate between constraints and indexes. So the
current implementation of Jet may use an index to implement a UNIQUE
constraint - so what? It's one of those 'under the hood' things that I
don't need to know. This is the beauty of a declarative language such
as SQL: I state "What, not how" e.g. I declare a constraint (e.g.
UNIQUE values) and the SQL platform decides how best to satisfy my
requirement (tree index, bit vector index, hashing system, whatever).

Jamie.

--
 
Ok, now I'm visualizing an index in a much better way. It makes a bit more
sense now. Thanks!

But that doesn't resolve my original question. If we use the book as an
analogy, then I'm to assume that each item in the index is Unique. But there
is a property in the index called Unique that can be set to No. If there can
be duplicate items in the index, then how is the item found faster? Usually
when I open a book, the index has unique items with a page number. But if
there are 50 identical items each pointing to different pages, how does this
speed up the search? Remember, the Unique property is No.

Ray
 
Ray, if a library has 50 copies of one book, you can still look in an index
to find the book.

Before computers, libraries typically maintained 3 sets of index cards for
finding a book: by author, by title, and by subject. The books on the
shelves are sorted in a different order again, but the index card told you
where to look. In the same way, Access is able to look in the index to find
the matching record(s), and quickly jump to each in turn, or sort/filter the
matches.
 
Ray said:
No, what I mean was why have ALL three properties at No.

My apologies. I didn't mean to imply you're only concerned with ignoring
nulls in an index. I'm sure you're aware indexes don't have to be primary
keys or unique to be useful so I focused my explanation on the *ignoring
nulls* part.
I ask why is it necessary if all three
properties are at No.

Ok, say I have a table of 10,000 customers which includes their addresses.
If I put an index on the state column, it won't be the primary key and it
won't be unique, will it? No, so both those settings will be no. If I also
set ignore nulls to no on this index I'll have to include the state for every
record. If I set ignore nulls to yes I can leave some of those records
without a state for the customer's address. Let's see what would happen if I
set this index to ignore nulls (= yes):

In queries I won't be able to join on the state column and get *all* the
records where the state is null because null is never equal to anything, not
even to null. Being indexed if I use state in the criteria (the where column)
, Jet doesn't have to do a full table scan to find all the records with NY in
the state column. Jet just gets the records whose index is 'NY' and ignores
the other records in the table. If most or all the records are 'NY' then
this won't save time over a full table scan, but if there are fewer then
retrieving *just* the 'NY' records with the index will be faster.

What if I set ignore nulls to no? I can still use the index to retrieve
those 'NY' records faster than using a full table scan would retrieve those
'NY' records. It's just that *all* the records will have a state identified
in the customer's address in this case (which means that I might have more
records with 'NY' as the state than if I allowed nulls in this column, but
that's not a good enough reason to always use a full table scan to pick out
the 'NY' records in the table).

So the index would still be useful to speed up my queries with all three
settings on no. Whether I want to allow nulls or not in that index is a
separate issue.
 
Ok, now I'm visualizing an index in a much better way. It makes a bit more
sense now. Thanks!

But that doesn't resolve my original question. If we use the book as an
analogy, then I'm to assume that each item in the index is Unique.

Certainly not!

Consider a Bible Concordance (a basic index). You'll find more than
one reference to Moses, or to Israel, or to Jesus.
But there
is a property in the index called Unique that can be set to No. If there can
be duplicate items in the index, then how is the item found faster? Usually
when I open a book, the index has unique items with a page number. But if
there are 50 identical items each pointing to different pages, how does this
speed up the search? Remember, the Unique property is No.

Reading 50 pages is faster than reading all 1124 pages... right?


John W. Vinson[MVP]
 
The confusion is caused by the Access user interface, which makes no
attempt to differentiate between constraints and indexes.

Thanks, Jamie! I think you've hit the mark, and in addition you've
clarified some murky concepts in my mind!

John W. Vinson[MVP]
 
Back
Top