Reading ISO filenames

Jan Eliasen · May 4, 2006

Hi

I am receiving some XML from a component that I cannot change. This
component reads emails from a POP3 server, takes the body and
attachments, and writes them in an XML format for me.

Now, the filenames of the attachments are also in the XML, but
unfortunately, I sometimes get this:
"=?iso-8859-1?Q?=C5tertagande=2Epdf?=" instead of the correct
filename. I understand that this is because there was a Swedish
character in the filename, and therefore, the filename has been
encoded.

But how do I get back to the correct filename?

Thanks in advance!

--
Eliasen Jr. representing himself and not the company he works for.

Private email: (e-mail address removed)

"Ford," he said, "you're turning into a penguin. Stop it."

Kevin Yu [MSFT] · May 5, 2006

Hi Eliasen,

This might be because you're not using the correct decoder to decode the
stream. It seems the filename is encoded with iso-8859-1. You can try to use
that to decode. HTH.

Kevin Yu
Microsoft Online Community Support

============================================================================
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
============================================================================

(This posting is provided "AS IS", with no warranties, and confers no
rights.)

Jan Eliasen · May 5, 2006

> This might be because you're not using the correct decoder to decode the
> stream. It seems the filename is encoded with iso-8859-1. You can try to use
> that to decode. HTH.

Hi

Well, I don't have a stream or a bytearray or anything.

I have an XmlDocument that has the string in an element. So the stream
that originally came via POP3 has already been converted to a string.
Perhaps the guy who creates the XML should have done it differently,
but that is too late now :-)

And I need to be able to understand this
string.

Any ideas?


Göran Andersson · May 5, 2006

Yes, the string should have been decoded when read from the mail.

The format of the string is defined in the MIME specifications (RFC 2047
calls it an "encoded-word"), but I think that you can figure it out by
just looking at it.

It starts with =? and ends with ?=. After the =? you have the encoding
name followed by ?Q?. The string contains character codes in the form
=xx where xx is a hexadecimal number.
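
The layout described above can be pulled apart with a single pattern. This is only a rough sketch in Python, not the .NET code discussed in this thread; the helper name split_encoded_word is made up for illustration, and accepting a "B" (base64) marker alongside "Q" is an assumption based on RFC 2047, not something the original string needs.

```python
import re

# Layout: "=?" charset "?" encoding-letter "?" payload "?="
# (hypothetical helper; "B" is accepted only so the split stays general)
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([QqBb])\?([^?]*)\?=")

def split_encoded_word(text):
    """Return (charset, encoding, payload), or None if text is not an encoded word."""
    match = ENCODED_WORD.fullmatch(text)
    if match is None:
        return None
    charset, encoding, payload = match.groups()
    # Normalise case so callers can compare against "iso-8859-1" and "Q".
    return charset.lower(), encoding.upper(), payload

print(split_encoded_word("=?iso-8859-1?Q?=C5tertagande=2Epdf?="))
# -> ('iso-8859-1', 'Q', '=C5tertagande=2Epdf')
```

Returning None for anything that does not match makes it easy to leave ordinary filenames untouched.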

Create an Encoding object using the encoding name, and use that to
convert the =xx character codes into characters.

You could use a RegEx object that identifies the codes using a
"=([\dA-F]{2})" pattern and uses a MatchEvaluator delegate to convert
each code to a character.
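
For readers outside .NET, the same idea might look roughly like this in Python, where re.sub with a callback plays the role of Regex.Replace with a MatchEvaluator. The function name decode_q_word is hypothetical. Two assumptions are worth stating: each =xx code is decoded on its own, which only works for single-byte charsets such as iso-8859-1, and underscores are turned into spaces as RFC 2047 prescribes for Q-encoded words.

```python
import re

# One "=xx" code, hex digits in either case.
HEX_CODE = re.compile(r"=([0-9A-Fa-f]{2})")

def decode_q_word(text):
    """Decode a single Q-encoded word; return other input unchanged."""
    match = re.fullmatch(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=", text)
    if match is None:
        return text
    charset, payload = match.groups()

    # RFC 2047: "_" stands for a space inside a Q-encoded word.
    payload = payload.replace("_", " ")

    def to_char(code):
        # Assumes a single-byte charset: one =xx code is one character.
        return bytes([int(code.group(1), 16)]).decode(charset)

    return HEX_CODE.sub(to_char, payload)

print(decode_q_word("=?iso-8859-1?Q?=C5tertagande=2Epdf?="))
# -> Återtagande.pdf
```

For multi-byte charsets such as UTF-8 the codes would have to be collected into one byte string before decoding. Python's standard library also has email.header.decode_header, which may be the safer choice in real code.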

Jan Eliasen · May 5, 2006

> Yes, the string should have been decoded when read from the mail.

I thought so :-)

Well, nothing to do there.

Anyway, I have now learned something about Regex and MatchEvaluator...
and it works great. Thanks!

