Losing Shaping going from Arabic to Unicode

Guest · Nov 17, 2005

Here's my setup:
- My Regional and Language options are set to Arabic (Saudi Arabia) on both
the Regional Options tab and the Advanced tab
- My keyboard is set to Arabic Saudi Arabia (MS-101)
- My browser encoding is set to UTF-8 (I've also tried all available Arabic
encodings on the browser)
- I'm testing with a web page that displays the Unicode code points for
characters that you enter:
http://people.w3.org/rishida/scripts/uniview/conversion

Arabic characters change shape depending on where they are positioned in a
word. When I type the characters sin, ya, and dal (English keyboard
equivalent 'sd]'), they change shape on the display as they are supposed
to. When I hit the button to show the Unicode and UTF-8 equivalents, I
expect to see the hexadecimal code points 'FEB3 FEF3 62F', which correspond
to the 'shaped' versions of the characters I typed. Instead I get the
hexadecimal code points '633 64A 62F', which correspond to the stand-alone
versions of those same characters.
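To make the two sets of code points concrete, here is a minimal Python sketch (not part of my web application; the names `typed`, `expected`, `describe`, and `fold_presentation_forms` are made up for illustration). It prints the Unicode names behind each code point and shows that NFKC normalization maps the presentation-form code points I expected back onto the base letters I actually get:

```python
import unicodedata

# What the keyboard produces: base Arabic letters (U+0633, U+064A, U+062F).
typed = "\u0633\u064A\u062F"
# What I expected to see: presentation forms for sin and ya, plus plain dal.
expected = "\uFEB3\uFEF3\u062F"

def describe(text):
    """Return 'U+XXXX NAME' for every character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

def fold_presentation_forms(text):
    """Map presentation forms back to base letters via NFKC normalization."""
    return unicodedata.normalize("NFKC", text)

if __name__ == "__main__":
    for line in describe(typed) + describe(expected):
        print(line)
    # The shaped code points are compatibility equivalents of the base letters.
    print(fold_presentation_forms(expected) == typed)
```

The names show that 633, 64A, and 62F are the letters themselves, while FEB3 and FEF3 are "INITIAL FORM" compatibility characters for the same letters.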

I'm using this Unicode Converter page to illustrate the problem my Web
application is showing. Both behave the same way as far as converting the
Arabic to Unicode. This is affecting data entry to a mainframe database as
the shaping information is getting lost in the transfer.

What do I need to do to be able to type in Arabic data to a browser window
and have the correct translation to Unicode?
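In case the mainframe side really does need presentation-form code points, the following is a rough sketch of what a conversion step might look like, assuming the shaping could be done in code after the text leaves the browser. Everything here is hypothetical (`shape`, `NO_FORWARD_JOIN`, `NON_JOINING`, and `FORMS` are illustrative names), and it is heavily simplified: it only covers the basic letters U+0621 to U+064A, treats alef maksura as never joining forward, and ignores diacritics, tatweel, and lam-alef ligatures.

```python
import unicodedata

# Letters that never connect to the following letter (simplified list).
NO_FORWARD_JOIN = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648\u0649"
)
# Hamza does not connect to either neighbour.
NON_JOINING = {"\u0621"}

# (joins to previous letter, joins to next letter) -> form name
FORMS = {
    (False, False): "ISOLATED",
    (True, False): "FINAL",
    (False, True): "INITIAL",
    (True, True): "MEDIAL",
}

def is_arabic_letter(ch):
    """True for basic Arabic letters; False for '', Latin text, tatweel, marks."""
    return "\u0621" <= ch <= "\u064A" and unicodedata.category(ch) == "Lo"

def shape(text):
    """Replace base Arabic letters with presentation forms chosen by context."""
    shaped = []
    for i, ch in enumerate(text):
        if not is_arabic_letter(ch):
            shaped.append(ch)
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        joins_prev = (ch not in NON_JOINING and is_arabic_letter(prev)
                      and prev not in NO_FORWARD_JOIN)
        joins_next = (ch not in NO_FORWARD_JOIN and is_arabic_letter(nxt)
                      and nxt not in NON_JOINING)
        name = f"{unicodedata.name(ch)} {FORMS[(joins_prev, joins_next)]} FORM"
        try:
            shaped.append(unicodedata.lookup(name))
        except KeyError:
            shaped.append(ch)  # no presentation form by that name; keep as typed
    return "".join(shaped)

if __name__ == "__main__":
    print(" ".join(f"{ord(c):04X}" for c in shape("\u0633\u064A\u062F")))
```

For sin, ya, dal this sketch yields FEB3 FEF4 FEAA, because it treats ya as a medial form and dal as a final form, so the target system's expectations would need to be checked against that.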

David Candy · Nov 17, 2005

Ask in an Arabic group. Why would you ask in an English group?

Guest · Nov 17, 2005

Sorry for the confusion. I'm a newbie.

I posted here because I'm on an English version of Windows, because I didn't
find my answer in any of the other questions posted to the English TechNet
community about Arabic codepages and scripts, and because I wasn't aware
there was a place on TechNet specific to Arabic questions. I will find a
different place to post my question.

Thanks for pointing me in the right direction.

David Candy · Nov 17, 2005

microsoft.public.arabic.webdeveloper

There are lots of Arabic groups, though the one above isn't high traffic.

--
--------------------------------------------------------------------------------------------------
Read David defending the concept of violence.
http://margokingston.typepad.com/harry_version_2/2005/10/entering_the_ga.html#more
=================================================
