Export table to UTF-8 textfile

  • Thread starter: Christian Nein

Christian Nein

Hi,

I have an Access 2003 database with a table containing Unicode (UTF-8)
characters (Chinese) in some fields. I tried to export the table to a
text file with the code below, but the file contained only question marks.
For some reason, the Unicode characters are not written correctly to the
file.

Do I have to use some Windows DLL to write Unicode text files?

Thanx & best regards
Christian

Function ExportToCsv(sRecordset As String) As Boolean
    On Error GoTo Err_ExportToCsv

    Dim DB As DAO.Database
    Dim RS As DAO.Recordset
    Dim s As String
    Dim line As String
    Dim f As DAO.Field

    ExportToCsv = True

    Set DB = CurrentDb
    Set RS = DB.OpenRecordset(sRecordset, dbOpenSnapshot)

    ' Note: native VBA file I/O converts strings to the ANSI code page
    ' when writing, which is why the Chinese text arrives as "?".
    Open "C:\test.csv" For Output Access Write As #1

    ' Looping until EOF also copes with an empty recordset
    Do Until RS.EOF
        line = ""
        For Each f In RS.Fields
            ' Reset s so a Null field does not repeat the previous value
            If IsNull(f.Value) Then s = "" Else s = f.Value
            line = line & s & ","
        Next f
        Write #1, line
        RS.MoveNext
    Loop

Exit_ExportToCsv:
    Close #1
    If Not RS Is Nothing Then RS.Close
    Exit Function

Err_ExportToCsv:
    ExportToCsv = False
    MsgBox Err.Description
    Resume Exit_ExportToCsv

End Function
 
Hi Christian,

Normally one would do this by creating a query that formats the data the way
you want, and then exporting the query using DoCmd.TransferText.

To specify UTF-8, export the query once manually (File|Export), and as you
do so click the Advanced... button in the text export wizard. Make the
necessary settings there (in particular, choose UTF-8 as the code page) and
save the result as an import/export specification, whose name you can
subsequently pass to TransferText.

As far as I know there's no other built-in way of getting from Unicode text
stored "natively" - i.e. the 16-bit subset of UTF-16 - in an Access text or
memo field to UTF-8.
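
Once the specification has been saved, the export can be run from code. The following is only a minimal sketch: the specification name "Utf8ExportSpec" and the query name "qryExport" are made up for illustration and need to be replaced with whatever names you actually saved.

```vba
Sub ExportQueryAsUtf8()
    ' "Utf8ExportSpec" and "qryExport" are hypothetical names:
    ' use the specification saved from the export wizard and your own query.
    ' The last argument (True) writes the field names as the first row.
    DoCmd.TransferText acExportDelim, "Utf8ExportSpec", "qryExport", _
                       "C:\test.csv", True
End Sub
```

The encoding comes from the saved specification, so the code itself does not have to deal with character sets.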
 
John's example elsewhere in this thread is probably the best way to do this.
If there's any reason why you can't do it that way though, then it seems
from the documentation that you should probably be able to do this using an
ADO Stream object. The ADO API Reference says of the Charset property of the
Stream object ...

<quote>
Sets or returns a String value that specifies the character set into which
the contents of the Stream will be translated. The default value is
"Unicode". Allowed values are typical strings passed over the interface as
Internet character set strings (for example, "iso-8859-1", "Windows-1252",
etc.). For a list of the character set strings that is known by a system,
see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows
Registry.
</quote>

On my system, "utf-8" is found in one of the above-mentioned registry
sub-keys.

Unfortunately, the example given in the help topic does not stand alone
sufficiently for me to be able to test it easily.

You can find the ADO API Reference on-line at ...

http://msdn.microsoft.com/library/en-us/ado270/htm/mdmscadoapireference.asp
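
For what it's worth, here is a rough sketch of how the Stream approach might look. It is not taken from the help topic; the procedure name WriteUtf8File and its parameters are my own, and it uses late binding so that no reference to the ADO library has to be set. Treat it as a starting point rather than tested production code.

```vba
' Hypothetical helper: writes sText to sPath as UTF-8 via an ADO Stream.
Sub WriteUtf8File(ByVal sPath As String, ByVal sText As String)
    Dim stm As Object

    Set stm = CreateObject("ADODB.Stream")
    stm.Type = 2                  ' adTypeText
    stm.Charset = "utf-8"         ' translate the VBA string to UTF-8
    stm.Open
    stm.WriteText sText, 1        ' 1 = adWriteLine (appends a line separator)
    stm.SaveToFile sPath, 2       ' 2 = adSaveCreateOverWrite
    stm.Close
    Set stm = Nothing
End Sub
```

To export a whole table this way, the lines built in the original loop would be passed to WriteText one at a time before a single SaveToFile call. Note that the Stream object puts a byte order mark at the start of the file, which some programs reading the CSV may not expect.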
 