Update Problem

Fabio Furukawa

Hi
I have an mdb with a table holding almost 1 million records. I need to update a
column in this table with a calculation. To do that, I use a module to do the
calculation and I update the column using the DAO Edit / Update methods (one
record at a time). But when it reaches about 400,000 records the mdb reaches the
size of 2 GB.
Then I need to compact and repair and restart the procedure for all the
remaining records. Is there a way to "turn off" this growth until the module ends?
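In rough terms the module does something like this (just a sketch; tblScores, Score, ID and CalcScore are made-up names, not my real ones):

    ' Row-by-row update with DAO, roughly as described above.
    Dim db As DAO.Database
    Dim rs As DAO.Recordset

    Set db = CurrentDb
    Set rs = db.OpenRecordset("tblScores", dbOpenTable)

    Do Until rs.EOF
        rs.Edit
        rs!Score = CalcScore(rs!ID)   ' calculation done in VBA for each record
        rs.Update                     ' one write per record, and the file keeps growing
        rs.MoveNext
    Loop

    rs.Close
    Set rs = Nothing
    Set db = Nothing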

Regards
Fabio
 
The maximum size of an Access db is 2 GB... I don't believe you can surpass
that even temporarily. You might possibly be able to use a counter to stop
at 375,000 or 400,000 records, do a compact, and then resume at whatever specified
record (that is, assuming a compact/repair can be done via code... not sure
about that though).
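For what it's worth, DAO does let you run a compact from code with DBEngine.CompactDatabase, but not on the database that is currently open. A rough sketch, assuming the big table sits in a separate back-end file (the paths here are made up):

    ' Compact the back-end from the front-end; the source file must be closed.
    Dim strBE As String, strTmp As String

    strBE = "C:\Data\BigData.mdb"
    strTmp = "C:\Data\BigData_compact.mdb"

    DBEngine.CompactDatabase strBE, strTmp   ' writes a compacted copy
    Kill strBE                               ' swap the copy in for the original
    Name strTmp As strBE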

Maybe someone else knows a way, but I've never heard of anyone being able to
exceed 2gb size.

You can, if you are using a back-end/front-end setup, link as many tables in
the FE from as many dbs as you would like, which helps a lot of people keep
more than 2 GB of data. I don't know if that would work here, since it seems that
the majority of your data is in one table(?), but it's a thought...

-jack
 
Hi
I might not have explained it correctly.
It does not exceed 2 GB; when I compact/repair, it returns to an acceptable
size.
Is there a way to avoid creating this bloat?

Note: if I run an UPDATE query in SQL instead, this does not happen.

Regards
 
If at all possible use an update query to update the tables. It leads to less
bloat in many cases and is almost always faster.

Since you did not give any details on what you are doing, it is impossible to
recommend any specific query solution.
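Just to show the general shape, though (the table, field and function names below are placeholders, since your actual structure isn't known):

    ' One set-based statement instead of a recordset loop. CalcScore would be
    ' a public function in a standard module, which Access can call from a query.
    CurrentDb.Execute _
        "UPDATE tblScores SET Score = CalcScore([ID]);", dbFailOnError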

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
If I run an UPDATE query in SQL instead, this does not happen.

So... why not do it that way?

It's faster. It's easier to program. It doesn't bloat as much. Why not use it?
 
There's a bigger issue at hand. You mention that you're updating a
calculation in a table. The generally accepted principles of database design
(sorry, I was reading up on Arthur Andersen and the 'g.a.p.' stuck) call for
the avoidance of storing calculated data in lieu of the basis for the
calculation. For example, you would never store a person's age, but rather
their DOB, using a query to calculate the age on the fly as it's needed.
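Something along these lines (a sketch with made-up names; the function can be called from a query):

    ' Calculate-on-demand: store DOB, derive the age wherever it's needed.
    Public Function AgeAt(dtmDOB As Date, dtmAsOf As Date) As Integer
        ' DateDiff counts year boundaries, so subtract 1 (True = -1) when the
        ' birthday hasn't come around yet in the as-of year.
        AgeAt = DateDiff("yyyy", dtmDOB, dtmAsOf) + _
                (Format(dtmAsOf, "mmdd") < Format(dtmDOB, "mmdd"))
    End Function

    ' e.g. in a query:  SELECT PersonID, AgeAt([DOB], Date()) AS Age FROM tblPeople;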
 
There's a bigger issue at hand. You mention that you're updating a
calculation in a table. The generally accepted principles of database design
(sorry, I was reading up on Arthur Andersen and the 'g.a.p.' stuck) call for
the avoidance of storing calculated data in lieu of the basis for the
calculation. For example, you would never store a person's age, but rather
their DOB, using a query to calculate the age on the fly as it's needed.

Well... with a million row table, infrequent need to update the field, and
frequent need to sort by it or search by it, this *might* be one of the cases
where it's ok to break the rules and store the calculated value in an indexed
field. I'd still do so using an Update Query rather than a recordset, though.

And I'd insist on *objectively demonstrated* inadequate performance using the
normalized design, with the value being calculated on the fly.
 
And sometimes AGE is perfectly valid to store.

Mother's Age at Time of Child's birth is an example. That age will not
change. Mom will always have been 21 on the day junior was born.

And there are other times when age will not change. So while I agree
that DOB and calculation are often better, I've got to say "it all depends".

'====================================================
John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
'====================================================
 
Thanks John,

That's my problem. The calculation I mentioned is a score that has to be
calculated monthly.

To do so, I have to use some historical information that
is stored in other tables. I have to use more than 15 variables in this
operation. Each of them has a specific rule (ex: I have date_1 in table_1 and
date_2 in table_2; to calculate Age on Books, I use the lowest date between
date_1 and date_2, and if Age on Books is 10 to 20 months I use 1 for this
variable in the calculation).

Because of performance issues (this operation can take some time to finish,
almost 2 seconds per record), I decided to calculate it in "batch" mode.
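Roughly, that one rule looks like this (a sketch with made-up names; the other 14+ variables are left out):

    ' Age on Books factor, as described: take the earliest of the two dates,
    ' and a 10-20 month age counts as 1 in the score.
    Public Function AgeOnBooksFactor(date_1 As Date, date_2 As Date) As Integer
        Dim dtmEarliest As Date
        Dim lngMonths As Long

        ' use the lowest (earliest) of the two dates
        If date_1 < date_2 Then
            dtmEarliest = date_1
        Else
            dtmEarliest = date_2
        End If

        ' age on books, in whole months, as of today
        lngMonths = DateDiff("m", dtmEarliest, Date)

        ' 10 to 20 months counts as 1; other bands follow the real rules
        If lngMonths >= 10 And lngMonths <= 20 Then
            AgeOnBooksFactor = 1
        Else
            AgeOnBooksFactor = 0
        End If
    End Function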
 
1) I did state that the generally accepted principles of database design* call
for the 'avoidance' of storing calculated fields, as opposed to the
'prohibition' of such.

2) "And I'd insist on *objectively demonstrated* inadequate performance
using the
normalized design, with the value being calculated on the fly. " I'm not
getting your post. Are you saying that you prefer performance in a
non-normalized schema to be tested first for comparision?

3) I used AGE within the context of the person's age since DOB. I would say
that the age at which a woman gave birth to a child would be appropriate to store
as it represents a fixed point in time unlikely to change. (But if you're
dealing with a medical application, all medical events have an associated
billing code, and as such the date of the birth could be entered as a child
record with the billing code. The mother's age could then be calculated. My
dad was a doc.)

4) I'd have to concur about storing something calculated if there's
sorting and searching on it for a huge number of records. I personally
believe that a 100% normalized database is a theoretical objective, but not
practical in the real world, especially given that there are times when you might
want to duplicate data for audit trail purposes. Or, if you're dealing with an
accounting application, having the client's account balance in a header table
for easy lookup as opposed to having to total the detail. In another
example, if user dch3 enters a record, you might want to permanently capture
the user name 'David' as the person who entered it, to avoid a situation where
the user name dch3 is later changed to a different name or deleted, destroying
that piece of the audit trail. (I worked with a PMS where the structure of the
database was such that a user could be created, transactions executed, and the
user subsequently deleted or assigned to another person, breaking the audit
trail. BAD! BAD! BAD!)






*The phrase 'generally accepted principles of accounting' has been stuck in
my head for some inane reason.
 
Sorry, I made a mistake. It takes almost 2 seconds to process 10 records.

Well, that certainly meets my objection: "And I'd insist on *objectively
demonstrated* inadequate performance using the normalized design, with the
value being calculated on the fly." For a million rows that would be almost
two and a half days (1,000,000 records × 0.2 seconds ≈ 55 hours)!

Does the update query take this long, or is the 0.2 seconds/record using your
Recordset approach? Does the update query work *at all*?
 
Hi John

The update query is very fast because I'm using the primary key and I'm updating
only one column. But to update one record I have to make some calculations,
and this part of the process takes 0.2 seconds to finish.
 
Speed can be affected by a variety of issues, such as the speed of the machine
that you're working on, the speed of the network, and network traffic. Toss in
there that if you're having to access data in order to perform the
calculation, all of those variables impact just looking up the data to begin
with, not to mention the update.
 