line input vs whole file

  • Thread starter Thread starter James Gaylord via AccessMonster.com
  • Start date Start date
J

James Gaylord via AccessMonster.com

I have a large file that I import on a weekly basis. Currently it is almost
196,000 records and grows each week. I am running through the following code
and I noticed afdter about 70,000 records it starts slowing way down. I
believe it is because I am reading in 1 line at a time and once it reaches
this point it is haiving to go back out to the hard drive and read additional
data. Is there a way and does it make sense to read the entire file in
before I start parsing the records.

I thought I had seen somehting in another post about reading in the entire
file first and then doing the parsing, but I can't locate this anywhere now.

TIA

Dim iFile As Long
Dim db As DAO.Database
Dim rs As DAO.Recordset
Dim nLine, st2nd, st3rd, stUOM, stDesc1, stDesc2 As String
Dim lngItem As Long
Dim tStart As Single
Dim x As Long


' Deletes the current table before running the update
On Error GoTo comUpdate_Click_Error

tStart = Timer
DoCmd.OpenQuery "qry:DelJDEItems", acViewNormal, acEdit

Set db = CurrentDb()
Set rs = db.OpenRecordset("JDEItemNumbers")

iFile = FreeFile

Open "Z:\VMITMMST.csv" For Input Access Read As iFile
x = 1
Do Until EOF(iFile)

Me.lblRecCount.Caption = x & " records processed"
Me.Repaint


Line Input #iFile, nLine

st2nd = Split(nLine, ",")(0)

If Left(st2nd, 1) = """" Then
st2nd = Mid(st2nd, 2, Len(st2nd) - 2)
End If

st3rd = Split(nLine, ",")(1)
If Left(st3rd, 1) = """" Then
st3rd = Mid(st3rd, 2, Len(st3rd) - 2)
End If

lngItem = Split(nLine, ",")(2)

stUOM = Split(nLine, ",")(3)
If Left(stUOM, 1) = """" Then
stUOM = Mid(stUOM, 2, Len(stUOM) - 2)
End If

stDesc1 = Split(nLine, ",")(4)
If Left(stDesc1, 1) = """" Then
stDesc1 = Mid(stDesc1, 2, Len(stDesc1) - 2)
End If
stDesc2 = Split(nLine, ",")(5)
If Len(stDesc2) > 2 And Left(stDesc2, 1) = """" Then
stDesc2 = Mid(stDesc2, 2, Len(stDesc2) - 2)
Else
stDesc2 = ""
End If
rs.AddNew
rs![2NDItemNumber] = st2nd
rs![3RDIteNumber] = st3rd
rs![ItemNumber(Short)] = lngItem
rs!UOM = stUOM
rs!Desc1 = stDesc1
rs!Desc2 = stDesc2

rs.Update
x = x + 1
Loop

rs.Close

db.Close

Close iFile

--
James B Gaylord
From the Wolf Comes the Strength of the Pack
From the Pack Comes the Strength of the Wolf
- R Kipling

Message posted via AccessMonster.com
 
But would this work when there are unmatched quotes (") in the fields. That
is why I am doing the parse the way I am. When I try just doing a straight
impot I get unparsable records that go into an error file.

Alex said:
Hi,
if this is a CSV file - perhaps you can import it into temporary table using
docmd.TransferText?

then run append query to copy data to necessary tables
I have a large file that I import on a weekly basis. Currently it is
almost
[quoted text clipped - 86 lines]
Close iFile
 
I am not convinced the slow down is due to the file size, and line input.

Try running your code without ANY recordset update code..and see how fast
just a plane Jane loop with line-inputs goes.

My guess is the total time for the full file read will be VERY VERY small.

What this means is eliminating the repeated line-inputs will NOT help your
performance at all...


Try the above simple test....I don't think the size of the text file, or
number of line inputs is the issue here..it is likely the indexing, and the
speed of the recordset code that adds the new records that is really slowing
things down.

You could also try adding records to a table that has NO indexes at all, and
see if that helps the speed issue....
 
You were right paring it down to just the line input loop it did it in under
18 seconds. What seems to really slow it down in the updating of the screen
each loop. I ran the code with the label and repaint and it added over 60
seconds. Man it is hard to believe just refreshing a screen can add that
much time. I guess I can do without the refreshing. When I added the record
update it increased the time to 72 secoonds with a 20 second decrease in time
when I remove all the indexes. Since I wan to keep the indexing I figured I
would remove the indexes and re add them after I have uploaded.
I am not convinced the slow down is due to the file size, and line input.

Try running your code without ANY recordset update code..and see how fast
just a plane Jane loop with line-inputs goes.

My guess is the total time for the full file read will be VERY VERY small.

What this means is eliminating the repeated line-inputs will NOT help your
performance at all...

Try the above simple test....I don't think the size of the text file, or
number of line inputs is the issue here..it is likely the indexing, and the
speed of the recordset code that adds the new records that is really slowing
things down.

You could also try adding records to a table that has NO indexes at all, and
see if that helps the speed issue....

--
James B Gaylord
From the Wolf Comes the Strength of the Pack
From the Pack Comes the Strength of the Wolf
- R Kipling

Message posted via AccessMonster.com
 
Well, not sure, but you can try defining import spec which will work for you

--
Best regards,
___________
Alex Dybenko (MVP)
http://accessblog.net
http://www.PointLtd.com

James Gaylord via AccessMonster.com said:
But would this work when there are unmatched quotes (") in the fields.
That
is why I am doing the parse the way I am. When I try just doing a
straight
impot I get unparsable records that go into an error file.

Alex said:
Hi,
if this is a CSV file - perhaps you can import it into temporary table
using
docmd.TransferText?

then run append query to copy data to necessary tables
I have a large file that I import on a weekly basis. Currently it is
almost
[quoted text clipped - 86 lines]
Close iFile

--
James B Gaylord
From the Wolf Comes the Strength of the Pack
From the Pack Comes the Strength of the Wolf
- R Kipling
 
OK, I can't figure how to delete my Index that happnes to be the primary key
also. Here is the line of code I get and I get an Error 3265 Item not found
in this collection

db.TableDefs("JDEItemNumbers").Indexes.Delete "[ItemNumber(Short)]"

Is it because it is the Primary Key, and if so how do I delete it.

TIA

I am not convinced the slow down is due to the file size, and line input.

Try running your code without ANY recordset update code..and see how fast
just a plane Jane loop with line-inputs goes.

My guess is the total time for the full file read will be VERY VERY small.

What this means is eliminating the repeated line-inputs will NOT help your
performance at all...

Try the above simple test....I don't think the size of the text file, or
number of line inputs is the issue here..it is likely the indexing, and the
speed of the recordset code that adds the new records that is really slowing
things down.

You could also try adding records to a table that has NO indexes at all, and
see if that helps the speed issue....

--
James B Gaylord
From the Wolf Comes the Strength of the Pack
From the Pack Comes the Strength of the Wolf
- R Kipling

Message posted via AccessMonster.com
 
A common method of dealing with slow-downs due to screen updates is to only
update every nth record. For example, if you're counting records with "i"
and you want to update every 10th record, use something like:

If i Mod 10 = 0 Then
Label.Text = i
End If


Rob
 
Back
Top