Trouble importing foreign language accents into Access 2003

  • Thread starter: efortin

efortin

Hi,

In an earlier post, a solution to another developer's description of the
same problem I have was to set the Import Specification to "All"
and Code Page to "Auto-select".

This solution does not solve the same problem for me. I am importing a
tab-delimited text file into Access 2003, with Import Specification of "All"
and Code Page of "Auto-select" (though I have tried Western Europe Windows
and several others), to no avail. The text value 'Sören' appears in Access as
'SÃ¶ren' in every case.

Please help!
Thanks,
Elaine
 
When you refer to another post, you should give a reference to it so that
people can take a look at it and see what you are doing or try to follow.

In your case, when you see that your accented letters have been replaced
by a combination of two characters, the first one being Ã, you should suspect
that your file is in 8-bit Unicode, that is, UTF-8. Try setting your code
page to 65001.
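
To make the symptom concrete, here is a small Python sketch (Python is only
used for illustration here, it is not part of Access) that reproduces it:
the UTF-8 bytes of 'Sören' read as if they were code page 1252. The variable
names are made up for the example.

```python
# Reproduce the symptom: UTF-8 bytes misread as Windows-1252 (code page 1252).
original = "Sören"
utf8_bytes = original.encode("utf-8")    # 'ö' becomes the two bytes C3 B6
misread = utf8_bytes.decode("cp1252")    # each byte is shown as its own character
# misread is now 'SÃ¶ren': C3 displays as 'Ã' and B6 as '¶'

# Reversing the wrong step recovers the original text.
repaired = misread.encode("cp1252").decode("utf-8")
```

Seeing 'Ã' followed by a second odd character is therefore a strong hint that
UTF-8 data was read with a single-byte Western code page.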

You should also check whether your file begins with a BOM (Byte Order Mark),
as its presence or absence could throw off the Auto-select.
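
If you don't trust what the editor reports, the BOM can be checked by looking
at the first bytes of the file. A minimal Python sketch, assuming only the
UTF-8 and UTF-16 marks matter here; `detect_bom` and `detect_bom_in_file` are
hypothetical helper names:

```python
import codecs

# Byte order marks for the encodings discussed in this thread.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

def detect_bom_in_file(path):
    """Read only the first few bytes of the file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```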

Also, it's quite possible that Access won't be able to import Unicode encoded
as UTF-8. If this is the case, re-save your text file, but this time use
16-bit Unicode (UTF-16) or another code page. In the case of UTF-16, try
with and without a BOM (also called a signature in some dialog windows).
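
The re-save can also be scripted rather than done by hand in an editor. A
rough Python sketch, assuming the source really is UTF-8 (with or without a
BOM); `resave_as_utf16` and its parameters are names invented for this
example:

```python
def resave_as_utf16(src_path, dst_path, src_encoding="utf-8-sig"):
    """Rewrite a text file as UTF-16 with a BOM.

    "utf-8-sig" reads UTF-8 and skips a leading BOM if one is present.
    newline="" keeps the original line endings untouched.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    # Python's "utf-16" codec writes a BOM in the machine's byte order.
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)
```

To write UTF-16 without a BOM, "utf-16-le" or "utf-16-be" could be used as
the output encoding instead.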

--
Sylvain Lafontaine, ing.
MVP - Windows Live Platform
Blog/web site: http://coding-paparazzi.sylvainlafontaine.com
Independent consultant and remote programming for Access and SQL-Server
(French)
 
Merci beaucoup, Sylvain!

Saving the file in ANSI/ASCII format solved most of the problems with
foreign characters. I am still getting some junk in the file which Access
complains about, but with good reason. I believe it was some weird character
keyed by mistake in data entry.

Elaine
 
Hello again,

I thought the problem was solved but it is not. When I reduce the file down
to only a field that has the accents, and I save it first in UTF-8, then
close UltraEdit, re-open the file and save it as ANSI/ASCII, the accent
conversion occurs. But when I do the same steps on the whole file, no accents
are converted. I have also tried UTF-16, with and without BOM.

Any help is much appreciated.

Elaine
 
Just saw your message. Besides with or without a BOM, there are two types of
UTF-16 (big-endian and little-endian), for a total of four combinations. Did
you try the other possibilities as well? It can also go by another name, such
as simply Unicode.

I don't know UltraEdit, so I'm not really sure whether your problem is with
UltraEdit or with the import routine.

Also, when you import data, if you leave it to Access, Access will usually
look at only the first 255 characters or so before determining the encoding
of the file. This is possibly what's happening to you when you add the other
fields. You should really tell Access explicitly that this is a Unicode file.
Trying a specific code page other than Unicode could also help.
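
To see whether this could apply to a given file, one can look for the
position of the first byte that is not plain ASCII. A small Python sketch;
the 255 figure above is only an estimate, and `first_non_ascii_offset` is a
hypothetical helper:

```python
def first_non_ascii_offset(data: bytes):
    """Return the offset of the first byte >= 0x80, or None if all are ASCII."""
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return offset
    return None
```

If the offset is well past the start of the file, any detection that samples
only the beginning sees nothing but ASCII and has no way to tell UTF-8 from a
single-byte code page.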

I will try to run some tests later, but at the moment I'm reduced to making
wild guesses about your problem.

--
Sylvain Lafontaine, ing.
MVP - Windows Live Platform
Blog/web site: http://coding-paparazzi.sylvainlafontaine.com
Independent consultant and remote programming for Access and SQL-Server
(French)
 
When I import the file using File/Get External Data/Import and specify my
Import Specification file, and using the drop-down box for Language to be
'All' and for Code Page to be 'Auto-Select', the accents are correct in the
table. How can I get them to import correctly using DoCmd.TransferText?

Code Page is the last parameter but must be a number value. Where can I find
the number values for the Code Page formats in the drop-down list? I found
and tried UTF-8 as 65001 and ANSI as 1252 but neither retained the accents.

Thanks,
Elaine
 
efortin said:
When I import the file using File/Get External Data/Import and specify my
Import Specification file, and using the drop-down box for Language to be
'All' and for Code Page to be 'Auto-Select', the accents are correct in the
table. How can I get them to import correctly using DoCmd.TransferText?

Code Page is the last parameter but must be a number value. Where can I find
the number values for the Code Page formats in the drop-down list? I found
and tried UTF-8 as 65001 and ANSI as 1252 but neither retained the accents.

UTF-8 is 65001, and if you are seeing 'SÃ¶ren' instead of 'Sören' then your
file is in UTF-8; code page 1252, or any other than 65001, won't cut the
mustard. I don't understand why it doesn't work, but many programs don't
understand code page 65001 because it's not a real code page; maybe this
is the case with DoCmd.TransferText. Using UTF-8 instead of 1252 or UTF-16
is a real pain on Windows. You should try replacing UTF-8 with something
else, if possible, when you are saving your file.
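
As for the numbers themselves, they are the standard Windows code page
identifiers. A short Python sketch listing a few common ones next to the
matching Python codec names; whether DoCmd.TransferText accepts every one of
them is not something this thread confirms, and `decode_with_code_page` is a
hypothetical helper:

```python
# A few Windows code page identifiers and the Python codecs that match them.
CODE_PAGES = {
    437: "cp437",         # OEM United States
    850: "cp850",         # OEM Multilingual Latin 1
    1200: "utf-16-le",    # Unicode UTF-16, little-endian
    1201: "utf-16-be",    # Unicode UTF-16, big-endian
    1252: "cp1252",       # ANSI Latin 1, Western European (Windows)
    28591: "iso-8859-1",  # ISO 8859-1 Latin 1
    65001: "utf-8",       # Unicode UTF-8
}

def decode_with_code_page(data: bytes, code_page: int) -> str:
    """Decode raw bytes the way a reader using that code page would."""
    return data.decode(CODE_PAGES[code_page])
```

Decoding the same bytes with 65001 and with 1252 shows at once which of the
two reproduces the garbled text seen in the table.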

Did you try to use your import specification file with the
DoCmd.TransferText command, or to create and use an Import/Export
specification? See the reference http://support.microsoft.com/kb/208991 and
the very end of the following article to learn how to create an
Export/Import specification:

http://www.experts-exchange.com/Microsoft/Development/MS_Access/Q_22942586.html

One thing you could try would be to open the file with VBA code and read the
file line by line to extract the information and store it in the database
yourself, instead of trying to use DoCmd.TransferText.
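
The shape of that approach, sketched in Python rather than VBA purely for
illustration: open the file with an explicit encoding, split each line on
tabs, and hand the fields to whatever stores them. `read_rows` is a
hypothetical name, and the database insert itself is left out:

```python
import csv

def read_rows(path, encoding="utf-8-sig"):
    """Yield each line of a tab-delimited text file as a list of fields.

    The encoding is stated explicitly instead of being guessed;
    "utf-8-sig" accepts UTF-8 with or without a BOM.
    """
    with open(path, "r", encoding=encoding, newline="") as f:
        # QUOTE_NONE: treat quote characters as ordinary data.
        for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
            yield row
```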

--
Sylvain Lafontaine, ing.
MVP - Windows Live Platform
Blog/web site: http://coding-paparazzi.sylvainlafontaine.com
Independent consultant and remote programming for Access and SQL-Server
(French)
 
Sylvain, I want you to know how very much I appreciate your help with this.
It is now working. The creation of an export specification which was defined
to use 'All' languages and 'Unicode UTF-8', used as the import specification
was the solution.

Thanks again,
Elaine
 
Glad to have been of some help. However, I would urge you to learn how to
read files, both text and binary, in VBA, so that you won't have to deal with
DoCmd.TransferText anymore if necessary. You're always better served by
yourself.

For this, you can use the standard file functions of VBA or the File System
Object (also known as FSO), which is available to all scripting languages.
Search for both on the internet. As a start, you can look into the following
reference:

http://www.applecore99.com/gen/gen029.asp

--
Sylvain Lafontaine, ing.
MVP - Windows Live Platform
Blog/web site: http://coding-paparazzi.sylvainlafontaine.com
Independent consultant and remote programming for Access and SQL-Server
(French)
 
The solution found last December worked beautifully until early April.
Since then, most files which are supposed to be UTF-8, downloaded from a
website in Ireland to my application in Massachusetts, no longer import into
Access 2003 using the code page for Unicode UTF-8. Once again, I am getting
"SÃ¶ren" instead of "Sören", no matter if I use IE 7 or Firefox, on two
different PCs both running Windows XP.

I have tried importing the file with a dozen different encoding formats and
none interprets the accents correctly.

Is anyone else having this problem?

Did something change in the encoding world in April?

Thanks in advance,
Elaine
 
Maybe they have changed something on their side?

If you can, check for the presence or absence of a UTF-8 BOM (Byte Order
Mark) at the beginning of the file before and after the change.
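
Beyond the BOM, it can help to check what the downloaded bytes actually
contain. The Python sketch below sorts a file's bytes into a few rough
categories. The "possibly-double-encoded" case (UTF-8 text that was misread
as code page 1252 and then saved as UTF-8 again) is only one guess at what
could have changed, nothing in this thread confirms it, and `classify_bytes`
is a hypothetical helper:

```python
def classify_bytes(data: bytes) -> str:
    """Roughly classify the bytes of a text file that should be UTF-8."""
    if all(byte < 0x80 for byte in data):
        return "ascii"                 # no accented characters at all
    try:
        text = data.decode("utf-8")    # strict: any invalid sequence raises
    except UnicodeDecodeError:
        return "not-utf-8"
    try:
        # If the text maps back to cp1252 bytes that are themselves valid
        # UTF-8, it was probably encoded twice.
        text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        return "utf-8"
    return "possibly-double-encoded"
```

Running it on an old file that imported fine and on a new one that does not
would show whether the two really differ at the byte level.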

--
Sylvain Lafontaine, ing.
MVP - Windows Live Platform
Blog/web site: http://coding-paparazzi.sylvainlafontaine.com
Independent consultant and remote programming for Access and SQL-Server
(French)
 
Hi Sylvain,

I see no BOM at the top of either file. I checked using UltraEdit and
Notepad. No BOM. I also checked files from months ago that imported fine and
they also have no BOM.

They are checking on their side as well.

Thanks,
Elaine
 