Possible Text Encoding Question

  • Thread starter: Mezzrow

Hey all.


I've seen something that confuses me a bit, and was wondering if
someone could shed some light on the issue.


I have an XML file as an embedded resource in a C# Windows Forms
application. The file was added directly from a file on my desktop.


I load it into a stream...


....
Assembly a = Assembly.GetExecutingAssembly();
Stream resStream = a.GetManifestResourceStream(resourceName);
....


Read it into an XML document


....
XmlDocument doc = new XmlDocument();
doc.Load(resStream);
....


And write it back out without changes.


....
doc.Save(xmlFileName);
....


The file is written, and when I open both files in Notepad, they look
identical.
However, the new file is half the size of the original.


Anyone know why this is or how I can correct this?
Thanks for your time.
-Mezz
 
Well, your subject may be correct. What encoding was used for the original
file? What encoding are you using for the new file? (If you don't specify
one, the default should be UTF-8.)
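
If you want to choose the output encoding yourself instead of relying on the default, something like the following might work. It is only a sketch: the class and method names (XmlSaveSketch, SaveWithEncoding) are made up for illustration, and it assumes you want the writer's encoding to take precedence over whatever the document's XML declaration says.

```csharp
using System.Text;
using System.Xml;

static class XmlSaveSketch
{
    // Hypothetical helper: writes the document to 'path' using the
    // encoding passed in, rather than the default picked by Save(string).
    public static void SaveWithEncoding(XmlDocument doc, string path, Encoding encoding)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = encoding, // e.g. Encoding.Unicode for UTF-16 LE
            Indent = true
        };

        // The using block flushes and closes the file when done.
        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            doc.Save(writer);
        }
    }
}
```

With the code from the question, the call would look like XmlSaveSketch.SaveWithEncoding(doc, xmlFileName, Encoding.Unicode) if the goal is to get a UTF-16 file back.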

If, on the off chance, the original file was encoded as UTF-16 and its contents
are mostly ASCII characters, you could see the file size change you mention:
UTF-16 stores each of those characters in two bytes, while UTF-8 needs only one.
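
To see the size difference for yourself, you can compare byte counts for the same text under both encodings. This is a small sketch; the class and method names are invented for the example, and the counts exclude any byte order mark.

```csharp
using System.Text;

static class EncodingSizeSketch
{
    // Size in bytes of 'text' when encoded as UTF-16 (little endian).
    public static int Utf16Size(string text)
    {
        return Encoding.Unicode.GetByteCount(text);
    }

    // Size in bytes of 'text' when encoded as UTF-8.
    public static int Utf8Size(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }
}
```

For an ASCII-only string such as "<root>hello</root>" (18 characters), Utf16Size gives 36 and Utf8Size gives 18, which is the halving described in the question.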

You can check the encoding by looking at the first two to four bytes of the
file. If it is Unicode with a byte order mark, they should match one of these:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_42jv.asp
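
If you would rather check in code than in a hex editor, a rough sketch along these lines might do. The names BomSketch, DescribeBom and DescribeFile are hypothetical, and a file with no byte order mark simply reports that none was found, which does not by itself tell you the encoding.

```csharp
using System;
using System.IO;

static class BomSketch
{
    // Hypothetical helper: names the encoding implied by a leading
    // byte order mark, based on the standard Unicode BOM byte values.
    public static string DescribeBom(byte[] b)
    {
        // Test the 4-byte UTF-32 marks before the 2-byte UTF-16 ones,
        // because FF FE 00 00 also begins with the UTF-16 LE mark FF FE.
        if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00) return "UTF-32 LE";
        if (b.Length >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF) return "UTF-32 BE";
        if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return "UTF-8";
        if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE) return "UTF-16 LE";
        if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF) return "UTF-16 BE";
        return "no BOM found";
    }

    // Reads up to the first four bytes of the file and classifies them.
    public static string DescribeFile(string path)
    {
        byte[] head = new byte[4];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(head, 0, head.Length);
        }

        // Keep only the bytes actually read (the file may be very short).
        byte[] prefix = new byte[read];
        Array.Copy(head, prefix, read);
        return DescribeBom(prefix);
    }
}
```

Running DescribeFile on both the original file and the saved copy should show whether the two really differ in encoding.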
 