Bill,
I would extend the pattern to match the square brackets as well, then
modify the MatchEvaluator function to handle either the first
escape sequence or the second escape sequence...
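
Something like the following might be a starting point. It is only a rough sketch: the names EntityDecoder, Decode and TranslateEscape are made up for illustration, and the bracket branch is a placeholder, since I don't know what translation the bracketed characters actually need. It also assumes the digits are a 4-digit decimal code point.

```vbnet
Imports System.Text.RegularExpressions

Module EntityDecoder

    ' Matches either a 4-digit decimal character reference such as &#7913;
    ' (named group "num") or a bracketed run such as [x] (named group "br").
    Private ReadOnly parser As New Regex( _
        "&#(?<num>\d{4});|\[(?<br>[^\]]*)\]", RegexOptions.Compiled)

    ' Hypothetical helper: replaces every escape sequence found in the input.
    Public Function Decode(ByVal input As String) As String
        Return parser.Replace(input, AddressOf TranslateEscape)
    End Function

    Private Function TranslateEscape(ByVal m As Match) As String
        If m.Groups("num").Success Then
            ' First escape sequence: treat the digits as a decimal code point.
            Return ChrW(CInt(m.Groups("num").Value)).ToString()
        End If
        ' Second escape sequence: placeholder only (assumption). This returns
        ' the bracketed text unchanged, without the brackets; put the real
        ' translation here once you know what it should be.
        Return m.Groups("br").Value
    End Function

End Module
```

You would then call Decode(yourString) on the text read from the database. Using named groups lets the one evaluator tell which of the two alternatives matched without inspecting the raw text.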
--
Hope this helps
Jay B. Harlow [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
| Jay;
|
| If you look at the string again, you'll see that it's not only the 4-digit
| group that needs to be translated but other characters as well (those in
| square brackets below):
|
| Nghi[ê]n C&#7913;u - Ph[ê ]B[ì]nh
|
| I'm using phpWebsite and a MySQL database from an ISP (IpowerWeb.com).
| Input text is Unicode when a webpage is created/updated.
| The text string above is what gets stored in the MySQL table instead.
| I guess I have to convert the text back to Unicode to view/edit it, then
| put it back. MySQL probably converts the text to the above format by itself.
|
| Any suggestion on how to accomplish this?
|
| Thanks again
|
| Bill
|
|
| > Bill,
| > You could use a RegEx to convert the char escape codes to chars.
| >
| > You could implement what Herfried suggested with something like:
| >
| >     ' Requires: Imports System.Text.RegularExpressions
| >     Const input As String = "Nghiên C&#7913;u - Phê Bình"
| >
| >     ' Matches a 4-digit character reference such as &#7913;
| >     Const pattern As String = "\&\#\d{4}\;"
| >     Static parser As New Regex(pattern, RegexOptions.Compiled)
| >     Dim output As String = parser.Replace(input, AddressOf MatchEvaluator)
| >
| > Private Function MatchEvaluator(ByVal input As Match) As String
| >     ' Skip the leading "&#" and take the four digits
| >     Dim value As String = input.Value.Substring(2, 4)
| >     Return ChrW(CInt(value))
| > End Function
| >
| >
| > Does the 7913 represent a 4-digit decimal or hexadecimal number? You may
| > need to change the call to CInt accordingly...
| >
| > --
| > Hope this helps
| > Jay B. Harlow [MVP - Outlook]
| > .NET Application Architect, Enthusiast, & Evangelist
| > T.S. Bradley - http://www.tsbradley.net
| >
| >
| > | Herfried;
| > |
| > | I don't know if this will work, but I need help to try it:
| > | here's sample of the text string
| > |
| > | "Nghiên C&#7913;u - Phê Bình"
| > |
| > | I need to read each byte in the text string, then use chrW to convert
| > | it to Unicode.
| > |
| > | I tried chrW(ascW(textString)) but it only converts the 1st letter.
| > |
| > | Is there a function to read all bytes in the text string in 1 pass?
| > | Thanks
| > |
| > | Bill
| > |
| > |
| > |
| > | >> I'm getting data from a MySQL database (default char set = UTF-8).
| > | >> I need to display data in Unicode but got only Mongolian characters
| > | >> like this: Ph&#7841;m Th&#7883; Ng&#7885;c
| > | >>
| > | >> I changed the textbox font to Arial Unicode MS but still not
| > | >> working.
| > | >>
| > | >> Do I need conversion of data stored in MySQL database before
| > | >> displaying?
| > | >
| > | > Windows Forms controls cannot directly convert character entities
| > | > like '&#7841;' to the appropriate character. You may want to replace
| > | > the string "&#<number>;" with the value of 'ChrW(<number>)', or simply
| > | > not encode the characters in the database that way.
| > | >
| > | > --
| > | > M S Herfried K. Wagner
| > | > M V P <URL:http://dotnet.mvps.org/>
| > | > V B <URL:http://classicvb.org/petition/>
| > |
| > |
| >
| >
|
|