Converting Values to Unicode Characters

  • Thread starter Thread starter Gary Smith
  • Start date Start date
G

Gary Smith

The function CHAR converts a value in the range 1-255 to to an ANSI
character. Is there a way to convert values in the range 1-65,342 to
Unicode characters?
 
You can use VLOOKUP but you'll have to create your own lookup table of
the unicode characters and I think you'll have to paste it onto the
same spreadsheet.... Maybe you can find a lookup table on the
internet somewhere that you can easily paste into your spreadsheet.
 
tsides said:
You can use VLOOKUP but you'll have to create your own lookup table of
the unicode characters and I think you'll have to paste it onto the
same spreadsheet.... Maybe you can find a lookup table on the
internet somewhere that you can easily paste into your spreadsheet.

Unfortunately, that solution works only if you already have the characters
available. What I need for the general case is something that convert a
numeric value to a character. This becomes especially critical as the
number of defined Unicode characters is now too large to fit into a single
row. I only need about 16,000 at the moment, but I like to paln ahead
wherever possible.
 
Hi Gary,
I think you would need to get to the code tables for the
translation table (the font Codepage) that you need.

I don't know if a table of unicode characters would be of
any value to you but here is a method to generate the
HTML to create such a table.

Method to Generate a Table of UNICODE characters

A1: '-- B1: 0 C1: 1 D1: 2 E1: 3 ... U1: 19
A2: 20
B2: ="&#" & TEXT(B$2,"0000") & ";"

propagate B2 across to U2 with the fill handle
propagate row 1 down through row 8360 or further

€ is the Euro

Then using a dumb conversion from Excel to HTML that
does not know anything about HTML
http://www.mvps.org/dmcritchie/excel/xl2html.htm

You can generate at Table that looks like
http://www.mvps.org/dmcritchie/rexx/unicode.htm
which if you want you can copy and paste special back to Excel.
I only went a bit beyond 8360 for the above example will
also generate another to go to your 16000 characters.
http://www.mvps.org/dmcritchie/rexx/unicode_16000.htm

Related information in
http://www.mvps.org/dmcritchie/rexx/symbols.htm

If you have to go to another newsgroup or get additional
information from somewhere else, please let us know
additional information (feedback) so that everyone can
be informed as well.
 
Hi Gary, Unicode Table in HTML
The following table through unicode 8379 is a 165KB HTML file
http://www.mvps.org/dmcritchie/rexx/unicode.htm

You requested through 16000 the following would be 313 KB
but I think that is a bit much to put out to my site for this purpose
http://www.mvps.org/dmcritchie/rexx/unicode_16000.htm

So instead you will find it as a zipped file 46KB
http://www.mvps.org/dmcritchie/rexx/unicode_16000.zip
I don't see anything beyond 9340 but you asked for 16000.

If you do not have a program to unzip you can get UltimateZip

UltimateZip - Download, reportedly better than WinZip.
UltimateZip is free unlike Winzip. Can unzip to new folder same
name, same folder, or anywhere else. Can look at files without
permanently unzipping. i.e. click on 16000.zip above, Open
http://www.ultimatezip.com/
http://www.ultimatezip.com/download.htm
 
Thanks for the response. I don't get what you're trying to do with the
entries below. ="&#" & TEXT(B$2,"0000") & ";" generates a circular
reference when entered in B2. Sadly, I didn't move quickly enough. The
folder http://www.mvps.org/dmcritchie/rexx/ is empty now.

I solved my immediate problem by creating a Unicode data file on another
system (OS/390 using HLASM) and transferrinf it as a binary file via FTP.
With a bit of fiddling I was able to import it into Excel as a text file
and then copy the column I needed to the intended target. I'm still
interested in a more general method, though. Excel *should* have a
built-in UCHAR function.


David McRitchie said:
Hi Gary,
I think you would need to get to the code tables for the
translation table (the font Codepage) that you need.
I don't know if a table of unicode characters would be of
any value to you but here is a method to generate the
HTML to create such a table.
Method to Generate a Table of UNICODE characters
A1: '-- B1: 0 C1: 1 D1: 2 E1: 3 ... U1: 19
A2: 20
B2: ="&#" & TEXT(B$2,"0000") & ";"
propagate B2 across to U2 with the fill handle
propagate row 1 down through row 8360 or further
€ is the Euro
Then using a dumb conversion from Excel to HTML that
does not know anything about HTML
http://www.mvps.org/dmcritchie/excel/xl2html.htm
You can generate at Table that looks like
http://www.mvps.org/dmcritchie/rexx/unicode.htm
which if you want you can copy and paste special back to Excel.
I only went a bit beyond 8360 for the above example will
also generate another to go to your 16000 characters.
http://www.mvps.org/dmcritchie/rexx/unicode_16000.htm
If you have to go to another newsgroup or get additional
information from somewhere else, please let us know
additional information (feedback) so that everyone can
be informed as well.
Gary Smith said:
Unfortunately, that solution [VLOOKUP table] works only if you already
have the characters
available. What I need for the general case is something that convert a
numeric value to a character. This becomes especially critical as the
number of defined Unicode characters is now too large to fit into a single
row. I only need about 16,000 at the moment, but I like to paln ahead
wherever possible.
 
Hi Gary, (posted to public.excel with email copy)
Very sorry about leaving out the /htm/ directory.
Here are the corrections in creating a Unicode
table of characters (first 256 characters depend on
font codepage in effect).

You can generate a Table that looks like
http://www.mvps.org/dmcritchie/rexx/htm/Unicode.htm
which if you want you can copy and paste special back to Excel.
I only went a bit beyond 8360 for the above example will
also generate another to go to your 16000 characters.
http://www.mvps.org/dmcritchie/rexx/htm/unicode_16000.htm
Related information in
http://www.mvps.org/dmcritchie/rexx/htm/symbols.htm
http://www.mvps.org/dmcritchie/rexx/htm/fontss.htm

formulas:
B1: 0 C1: 1 ...thru.... U1: 19
A2: 20
B2: ="&#" & TEXT($A2+B$1,"0000")&";"
C2: ="&#" & TEXT($A2+C$1,"0000")&";"
....
U2: ="&#" & TEXT($A2+U$1,"0000")&";"
---
A1:E5 range appears as:
0 1 2 3 4
20    
40 ( ) * +
60 < = > ?
80 P Q R S

HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
 
Thanks, that's helped quite a bit. Using a mnor variation on your
technique I was able to generate the characters I need without taking the
data to another system and back.

David McRitchie said:
Hi Gary, (posted to public.excel with email copy)
Very sorry about leaving out the /htm/ directory.
Here are the corrections in creating a Unicode
table of characters (first 256 characters depend on
font codepage in effect).
You can generate a Table that looks like
http://www.mvps.org/dmcritchie/rexx/htm/Unicode.htm
which if you want you can copy and paste special back to Excel.
I only went a bit beyond 8360 for the above example will
also generate another to go to your 16000 characters.
http://www.mvps.org/dmcritchie/rexx/htm/unicode_16000.htm
Related information in
http://www.mvps.org/dmcritchie/rexx/htm/symbols.htm
http://www.mvps.org/dmcritchie/rexx/htm/fontss.htm
formulas:
B1: 0 C1: 1 ...thru.... U1: 19
A2: 20
B2: ="&#" & TEXT($A2+B$1,"0000")&";"
C2: ="&#" & TEXT($A2+C$1,"0000")&";"
...
U2: ="&#" & TEXT($A2+U$1,"0000")&";"
---
A1:E5 range appears as:
0 1 2 3 4
20    
40 ( ) * +
60 < = > ?
80 P Q R S
HTH,
David McRitchie, Microsoft MVP - Excel [site changed Nov. 2001]
My Excel Pages: http://www.mvps.org/dmcritchie/excel/excel.htm
Search Page: http://www.mvps.org/dmcritchie/excel/search.htm
 
You're welcome, the variation that you mention is
that something that would be better for the rest of us.
---
 
David McRitchie said:
You're welcome, the variation that you mention is
that something that would be better for the rest of us.

For my actual application, I had a list of codes (15,738 of them) for
which I needed the corresponding characters. My initial sheet had a
related code in column A, the character code in hexadecimal in column B,
and a description in column C. A short sample:

A B C
212321 3000 CJK space per ANSI Z39.64
212328 FF08 Ideographic left parenthesis
212329 FF09 Ideographic right parenthesis
21232D FF0D Ideographic hyphen minus
212A46 3013 Ideographic geta symbol

I could have converted the hex values to their decimal equivalents, but
nummeric character references work just fine in hex. I inserted two
helper columns between B and C and entered a variation on your formula in
the new C1: ="&#x" & B1 & ";"

Copying C1 down the column gave me:

A B C
212321 3000  
212328 FF08 (
212329 FF09 )
21232D FF0D -
212A46 3013 〓

I then used your terrific XL2HTML macro to export column C as an HTML
file. (This takes a while with such a large file.) Having had experience
with copying and pasting from HTML files before, I knew that the simpler
the content and the nearer the layout was to the intended result, the
easier it would be. Once the HTML file had been created, it was a simple
matter of copying the entire column and pasting it back into column D
using Edit > Paste special > Unicode text.

This isn't as nice as having a function that could be used in a formula
but it gets the job done.
 
HI Gary,

This should make it possible to everything in Excel.

Excel coding, Although there is no CharW Worksheet Function, a UDF can
be created and can be used on the worksheet much like usage in HTML.

Function CharW(dec As Long) As String
CharW = ChrW(dec)
End Function

=personal.xls!CharW(9835)

Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
' ♫ -- Beamed Eighth Notes, Is available on my system
' ♫ (no Escape character) ♫ Ok in Excel and HTML
[a2] = ChrW(9835)
[b2] = ChrW(Val("&H266B"))
[c2] = ChrW(CDbl("&H266B"))
[d2] = "=pesonal.xls!CharW(9835)"
End Sub

http://www.fileformat.info/info/unicode/char/2295/
http://www.fileformat.info/info/unicode/char/266B/

http://evolt.org/article/ala/17/21234/

http://www.mvps.org/dmcritchie/rexx/htm/symbols.htm
http://www.mvps.org/dmcritchie/rexx/htm/unicode.htm

I was looking in a book for some of the names, but if I'll add the
url for the spelled out names if that's what you used to start with
to my unicode.htm pages..
 
That looks like it could be very useful. I'll give it a try at the first
oportunity. Thanks for al your help.


David McRitchie said:
This should make it possible to everything in Excel.
Excel coding, Although there is no CharW Worksheet Function, a UDF can
be created and can be used on the worksheet much like usage in HTML.
Function CharW(dec As Long) As String
CharW = ChrW(dec)
End Function
=personal.xls!CharW(9835)

Private Sub Worksheet_BeforeDoubleClick(ByVal Target As Range, Cancel As Boolean)
' ♫ -- Beamed Eighth Notes, Is available on my system
' ♫ (no Escape character) ♫ Ok in Excel and HTML
[a2] = ChrW(9835)
[b2] = ChrW(Val("&H266B"))
[c2] = ChrW(CDbl("&H266B"))
[d2] = "=pesonal.xls!CharW(9835)"
End Sub

I was looking in a book for some of the names, but if I'll add the
url for the spelled out names if that's what you used to start with
to my unicode.htm pages..
 
Back
Top