Importing Data Access

tomwesnick · May 3, 2006

I have a fairly large fix delimited text file I would like to import
into Access. When I import the txt file it doesn't go as smoothly,
resulting in numerous errors. I have verified my import/export specs
and they do not seem to be causing the problem. I opened the text file
using Notepad and there are little vertical rectangular "squares" were
my errors are occurring. Was wondering how they got there and what
could I do in the future (besides manually deleting these "shapes" from
the file). Any helped would be appreciated
Tom

Guest · May 4, 2006

Hi,
it depends on where you got the data from.
Normally square characters represent the line feed and carriage return
characters (Chr(10) & Chr(13) I think!).
HTH
Good luck

John Nurick · May 4, 2006

Hi Tom,

Probably the "squares" are non-printing characters: maybe carriage
returns and/or line feeds, maybe something else. If they were cr+lf
pairs they would appear as line breaks in Notepad, not as squares.)

Without knowing more anything about the file it's hard to be specific (I
don't know what you mean by "fix delimited"). Whjat is the structure of
the file? What errors are you getting in the import?

What I'd do is to use a hex editor to inspect the bytes in the text file
to work out just what are the characters that Notepad presents as
squares. (Some - e.g. older versions of Hexedit - are available for free
download.) It would then be possible to decide what to do about them.

tomwesnick · May 4, 2006

John-
What I meant by "fix" was that the file was a fixed width text file.
All the fields are aligned with x amount of space between each field.
Not sure what you meant by file structure but here is the field layout
for the file.

Field# Description Picture Alpha Num Starting Position
Field Length

1 Media Create Date dd-mmm-yyyy A 1
11
2 Parcel ID x(7) A 12
7
3 Transfer # x(9) A 19
9
4 Old Year S + 9(4) N 28
5
5 Old Sequence # S + 9(3) N 33
4
6 New Year S + 9(4) N 37
5
7 New Sequence # S + 9(3) N 42
4
8 Transaction Date dd-mmm-yyyy A 46
11
9 Book # x(8) A 57
8
10 Page # x(8) A 65
8
11 Old Owner Name x(40) A 73
40
12 Owner Name 1 x(40) A 113
40
13 Sale Date dd-mmm-yyyy A 153
11
14 Sale Price S + 9(10) A 164
11
15 Source Code x(1) A 175
1
16 Sale Type Code (Land/Bldg) x(1) A 176
1
17 Sale Validity Code x(2) A 177
2
18 Note 1 x(40) A 179
40
19 Note 2 x(40) A 219
40
20 Last Maintenance Date dd-mmm-yyyy A 259
11
21 Record Separator - x'0a' x(1) A 270
1

The errors that I am getting are ".................Not all of your data
was successfully imported. Error descriptions with associated row
numbers of bad records........." As mentioned before these errors crop
up were ever these "squares" appear in Notepad. Although I would
consider myself a novice I inspected the file this morning using Hex
Workshop. I did find something that may be significant. Where ever the
error occurred the hex value was 0D and had a 8 bit signed byte value
of 13 as opposed to the non-error locations which had a hex value of 20
and a 32 value for the 8 bit signed byte. Not sure if that means
anything.
Tom

tomwesnick · May 4, 2006

To help I've upload an image file of the Notepad with the shape I've
been refering to. . The link is below

http://img268.imageshack.us/my.php?image=error6nd.png

Tom

John Nurick · May 4, 2006

Hi Tom,

Hex 0D is a carriage return (CR) and 20 is a space.

In a text file, Access normally expects 0D to occur only at the end of a
record (line), which is signalled by 0D and 0A (carriage return followed
by line feed). 0D (CR) on its own is confusing things.

I guess the text file was created by non-Windows software that uses 0D
to indicate a line break within a text field: does that fit the facts?

The best thing, if possible, is to get the software that creates the
file to produce something that Access can handle, such as a CSV
(comma-separated values) file with CR+LF (0D0A) for line breaks.

Otherwise, it's basically a matter of replacing every 0D that is not
followed by an 0A with something else: either a space or some other
character that doesn't appear anywhere in the data (e.g. "\").

There's no simple way of doing this in Visual Basic. If you have Perl
and the file's not too big you can use this

perl -0 -w -pe"s/\x0D([^\x0A])/\\$1/g" oldfile.txt > newfile.txt

at the command line (if an 0D is not followed by an 0A, it replaces it
with a "\"). Otherwise, post back here and come the weekend I can give
you a hand coding it in VBA.

tomwesnick · May 5, 2006

Hi John-
I'm not sure what operating system or software is being used to create
the file. However the original file has .DAT file extension
encapsulated in a zip file. I simply change the extension to .TXT after
extraction. The file in question is slightly over 120MB.

Tom

John Nurick · May 6, 2006

Hi Tom,

Here's a rough VBA procedure that should do what's needed if you use it
to read the .DAT file and write to a .TXT file. It only processes one
line at a time, so the 120MB shouldn't be a problem.

Sub TestSubFixedWidth( _
SourceFile As String, _
DestinationFile As String)

'Searches a fixed width text file. If any record contains
'a CR that is not followed by a LF, replaces it with
'SUBST_CHAR.

Const SUBST_CHAR = "\" 'must be a single character.

Const REC_LENGTH = ??? 'Number of bytes in fixed-width record
'including CR+LF record separator.
Const DEBUG_FLAG = True 'Control whether debugging output is
'printed to the Immediate pane.

Dim S As String * REC_LENGTH 'Buffer for working on records.
Dim fIn As Long
Dim fOut As Long
Dim FilePos As Long 'only actually used for debugging output.
Dim oRE As Object 'VBScript_RegExp_55.RegExp

If DEBUG_FLAG Then
Debug.Print "--- " & SourceFile & " to " _
& DestinationFile & " ---"
End If

'Open input file
fIn = FreeFile()
Open SourceFile For Random As #fIn Len = REC_LENGTH

'Create output file
If Len(Dir(DestinationFile)) > 0 Then Kill DestinationFile
fOut = FreeFile()
Open DestinationFile For Random As #fOut Len = REC_LENGTH

'Create regular expression object
Set oRE = CreateObject("VBScript.Regexp")
With oRE
.Pattern = "\x0d([^\x0a])"
.Global = True
.Multiline = True
End With

Do Until EOF(fIn)
FilePos = Seek(fIn)
Get #fIn, , S 'read the next record

If DEBUG_FLAG Then
Debug.Print FilePos, Left(S, 10), Asc(Right(S, 2)) _
& " " & Asc(Right(S, 1))
'First item is record number, then the first 10 bytes
'of the record, and finally the ANSI values of the last
'two bytes. Except in the last record, these must be "13 10".
'If not, we have set the wrong record length.
End If

If Right(S, 2) <> vbCrLf Then 'either
' 1 we've set the wrong record length
' 2 it's not the kind of file we though it was
' 3 we're past the 0d0a that terminates the
' last record in the file.
'Whichever of these applies, there's no point continuing.
'(NB: it's conceivable that we might one day encounter
'a file in which the last record is not terminated. In
'that case this will have to be modified to check whether
'S contains an actual record.
Exit Do
End If

'Do the replacement and write to output file
S = oRE.Replace(S, SUBST_CHAR & "$1")

Put #fOut, , S

Loop

'Tidy up
Set oRE = Nothing
Close #fIn
Close #fOut
If DEBUG_FLAG Then
Debug.Print "-------------------------------------------"
End If
End Sub

tomwesnick · May 6, 2006

Hi John-
Thanks for the code.
Tom

Importing Data Access

tomwesnick

Guest

John Nurick

tomwesnick

tomwesnick

John Nurick

tomwesnick

John Nurick

tomwesnick