HTML entities from input fields

  • Thread starter: chernyshevsky

chernyshevsky

How do I force IE to encode characters outside of the current code-page
as HTML entities? Right now, when I enter some Cyrillic text into an
ISO-8859-1 form, the text submitted ends up being CP1251. If I enter
some Polish letters, the text is CP1252. This behavior is too weird! I
need IE to do things one way and one way only.
 
How do I force

You don't, on the WWW.

Please don't crosspost pointlessly. Either your question is about HTML
authoring for the WWW and therefore belongs to c.i.w.a.h., or it is not and
does not belong here (there). Please make up your mind so that others won't
need to do that for you. I'm guessing this is about WWW authoring, so I set
followups to c.i.w.a.h.
IE to encode characters outside of the current code-page
as HTML entities?

You cannot force such grossly incorrect behavior. You just need to be
prepared to get form data encoded that way, from IE and perhaps other
browsers as well.

What is your _original_ question, as opposed to an assumed solution that
itself aims at forcing browsers to misbehave?
Right now, when I enter some Cyrillic text into an
ISO-8859-1 form, the text submitted ends up being CP1251. If I enter
some Polish letters, the text is CP1252. This behavior is too weird!

It's weird too, but maybe not technically incorrect (the specs are fuzzy).
I need IE to do things one way and one way only.

You can't. You just need to live with it.

If you wish to be prepared to get arbitrary character data (as a form
designer should be, right?), make the page containing the form UTF-8 encoded.
Browsers will then send the data in UTF-8 format (though of course, some old
browsers may fail to do this - but there is little hope with them anyway).
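
On the receiving end, the UTF-8 approach might be handled roughly as follows. This is only a minimal sketch in Python, assuming the form is submitted as application/x-www-form-urlencoded; the field name `q` and the helper `decode_utf8_form` are hypothetical, chosen purely for illustration.

```python
from urllib.parse import parse_qs, quote


def decode_utf8_form(raw_query: str) -> dict:
    """Parse a urlencoded form body, treating percent-escaped bytes as UTF-8.

    errors="strict" makes non-UTF-8 submissions (for example from an old
    browser that ignored the page encoding) raise UnicodeDecodeError
    instead of silently producing garbage.
    """
    parsed = parse_qs(raw_query, keep_blank_values=True,
                      encoding="utf-8", errors="strict")
    # parse_qs returns a list of values per field; keep only the first.
    return {name: values[0] for name, values in parsed.items()}


# Simulate what a browser would send for a Cyrillic value typed into a
# form on a UTF-8 encoded page (hypothetical field name "q").
submitted = "q=" + quote("Москва", encoding="utf-8")
fields = decode_utf8_form(submitted)
```

Decoding strictly is a design choice: a submission that is not valid UTF-8 is rejected early rather than stored as mangled text.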

The usual tutorial in matters like this is Alan's
http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
 
How do I force IE to encode characters outside of the current code-page
as HTML entities?

Do you mean numeric character references like &#1040;? Or what?
And why do you want them?
Right now, when I enter some Cyrillic text into an ISO-8859-1 form,

Are you the author? Change the encoding (charset) to UTF-8.
Otherwise write to the author.
the text submitted ends up being CP1251. If I enter
some Polish letters, the text is CP1252.

I don't believe that. Rather, Internet Explorer converts the
characters to numeric character references. See
http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
"Win MSIE (various versions)".

You end up with something like this:
<http://google.com/search?q=&#1052;&#1086;&#1089;&#1082;&#1074;&#1072;>
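
If numeric character references like these do reach the server, they can be turned back into characters. The following is a rough sketch in Python using the standard library's `html.unescape`; the field name `q` is taken from the URL above, while `decode_ncr_field` is a hypothetical helper and the percent-encoded query string is an assumed example of what would travel over the wire.

```python
import html
from urllib.parse import parse_qs


def decode_ncr_field(raw_query: str, field: str) -> str:
    """Return one form field with numeric character references resolved."""
    # parse_qs undoes the percent-encoding (%26 -> "&", %23 -> "#", %3B -> ";").
    parsed = parse_qs(raw_query, keep_blank_values=True)
    value = parsed.get(field, [""])[0]
    # html.unescape turns "&#1052;" and similar references into characters.
    return html.unescape(value)


# The references from the URL above, percent-encoded as in a real request.
raw = ("q=%26%231052%3B%26%231086%3B%26%231089%3B"
       "%26%231082%3B%26%231074%3B%26%231072%3B")
decoded = decode_ncr_field(raw, "q")
```

Note that the server cannot tell a reference inserted by the browser from the same sequence typed literally by the user, which is one more reason to prefer a UTF-8 form.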
 