Problem with encoding in dataload/-export

  • Thread starter: Frank M

Frank M

I am writing an application with a dataload class. The
development is done on Windows XP, and on the development
PC I can do a dataload and export without any problems if
I use System.Text.UTF8Encoding on the data stream for
writing. The data file is still in the standard Windows
code page, including the special Scandinavian chars
(e.g. æ, ø, å).

However, when I transfer my application to a PC running
Windows 2000, the special chars are converted to
gibberish (the lower ASCII chars are handled just fine).
To handle it I have tried running the export
with UTF7Encoding, ASCIIEncoding and finally
UnicodeEncoding - none of them works for the special
chars, and some do not work at all. Is there any other
encoding I should use?

I have also thought that the problem might not be in the
export but in the loading of the data (i.e. is the
data file read correctly?). For the dataload I use the
OleDbDataReader. As far as I can see, neither OleDbDataReader
nor OleDbConnection includes a property to
control the encoding for the load. I guess it could also be
something that is set as part of the connection
string and not a regular property. But on Windows XP,
where I develop and test, I only need to control the
encoding for the StreamWriter and not for the load.

Help will be much appreciated.


With kind regards,

Frank M.
 
Hi Frank M,

This is of course Jay B.'s territory, and I don't think he will be active
in this newsgroup until tomorrow afternoon (our time).

But my thought was: did you check the data in your database?

I never know what the characters get converted to either, but a lot of
formats are converted at that point.

Just a thought.

Cor
 
Frank,
What do you mean by a 'dataload class'?

What does 'do a dataload' mean?

What does 'do an export' mean?

Which provider are you using with the OleDbConnection?

The provider itself would handle the code page of the data that is stored,
or you would need to provide the code page as part of the connection string.
For some databases, such as Access & SQL Server, the code page is part of the
database, table, or column itself, so you do not need to specify it to the
OleDbConnection. For others, such as an AS/400, you can specify it on the
database, table, or column itself, but there is also a property on the
OleDbConnection to override and/or qualify it.

Hope this helps
Jay

 
Hi Cor,

The database for the load is in fact only a comma-separated
text file. I use the OleDbConnection with the Jet provider
for text files to read it. I have checked the data file
with an editor on both computers, and it is OK, char-wise.
It's the output (again to text files) which is the
problem. On XP it's okay, but not on Windows 2000.

I'll just post it again tomorrow then.

Regards,

Frank
 
Frank,
Can you debug the code on the Windows 2000 machine?

Are the values in the variables the values you expect?

It sounds like the Jet Provider is not encoding or decoding the characters
correctly. Unfortunately, I do not have a reference for everything that the
Jet Provider supports in a connection string, so I cannot say how to convince
it to use a different code page when decoding the text file.

Have you tried asking "down the hall" in the
microsoft.public.dotnet.framework.adonet newsgroup? Not sure if one of the
Access newsgroups would be a better choice or not, as Jet is used primarily
by Access.

Hope this helps
Jay
 
Hi Jay,
The dataload class is just one I have for the dataload;
it is not really important to the problem.
The dataload is an import from a text file through an
OleDbConnection:
"Provider=Microsoft.Jet.OLEDB.4.0; " _
& "Data Source=" & msImpDir & ";" _
& "Extended Properties=""Text;HDR=YES;FMT=Delimited"""

From what you say, I guess the problem could be fixed by
specifying the code page in the connection string. How do
you do that? Can you direct me to a place with some
information about it?


Regards,

Frank
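
On the code page question: as far as I know, the Jet text driver does not take the code page from the provider part of the connection string but from a schema.ini file placed in the same folder as the text file (the folder given as Data Source). A minimal sketch, assuming the import file is called data.csv (a made-up name, since the thread does not give the real one) and that it is stored in the Windows ANSI code page:

```ini
[data.csv]
Format=CSVDelimited
ColNameHeader=True
CharacterSet=ANSI
```

The section name has to match the file name exactly. CharacterSet=OEM is the alternative for files written in the DOS code page. Treat this as a starting point to check against the Jet documentation, not as a confirmed fix for this thread.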
 
Hi Jay,

I have worked a bit more on the problem. I now see that
the problem is not in the load of the data from the
text file; it is in the export to the new data file.

Just before export I can see that the string exported
with StreamWriter.WriteLine is "Præst..." (I see it in
the debugger Watch window),
but in the data file itself it appears as "Præs..", so
it must be the encoding of the StreamWriter which should
be different. I want the standard Windows code page.

I'll post this as a new thread, as I can be more specific
about the problem.


Regards,

Frank
 
Frank,
> it must be the encoding of the streamwriter which should
> be different. I want the standard Windows code page.

If you want to create a file with the system's current ANSI code page, then
you need to use System.Text.Encoding.Default when you open the StreamWriter,
something like:

Imports System.IO
Imports System.Text

Dim writer As New StreamWriter("My output File.txt", False, Encoding.Default)

> I'll post this as a new thread, as I can be more specific
> about the problem.

There is no real need to open a new thread, as that just confuses the issue.

Hope this helps
Jay
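
The symptom in Frank's last post ("Præst" turning into "Præs..") is what one would expect when text written as UTF-8 is opened in an editor that assumes the Windows ANSI code page. A minimal sketch of that effect, assuming a Western European system where the ANSI code page is Windows-1252 and the classic .NET Framework (on .NET Core and later, code page 1252 is only available after registering the System.Text.Encoding.CodePages provider). The module and variable names are made up for illustration:

```vb
Imports System
Imports System.Text

Module EncodingDemo
    Sub Main()
        ' "Præst", built with ChrW so the source file's own encoding does not matter.
        Dim word As String = "Pr" & ChrW(&HE6) & "st"

        ' UTF-8 stores æ as two bytes (C3 A6), so five chars become six bytes.
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(word)

        ' Assumption: Windows-1252 stands in for "the standard Windows code page".
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        ' Reading the UTF-8 bytes as Windows-1252 turns C3 A6 into "Ã¦".
        Dim misread As String = ansi.GetString(utf8Bytes)
        Dim expected As String = "Pr" & ChrW(&HC3) & ChrW(&HA6) & "st"

        Console.WriteLine(utf8Bytes.Length)            ' 6
        Console.WriteLine(ansi.GetBytes(word).Length)  ' 5 (æ is the single byte E6)
        Console.WriteLine(misread = expected)          ' True
    End Sub
End Module
```

Writing with the ANSI encoding, as in Jay's StreamWriter example, keeps æ as a single byte, which is what an ANSI editor expects. The sketch does not explain why the XP machine behaved differently from the Windows 2000 one; the thread leaves that open.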

 