Reading Unix Text Files?

Bill Cohagan

I've got a c# WinForms app that reads an input text file using a
StreamReader, using the ReadLine() method. The input files are Unix type
text files; i.e., they have a single x0D character to mark end-of-line. This
app runs fine on my Windows XP Pro development system, but fails on the
target Win98 system. Actually I can demonstrate the problem just using
Notepad so I'm guessing the .Net Framework 1.1 is not the issue.

Using Notepad on my XP system I can open these text files and they look
right; i.e., the lines terminate in the expected places. Using Notepad on
the Win98 system however produces a view with "little black boxes" where the
lines should end.

So, it appears that XP itself is a little more Unix friendly. Is there any
support in the Framework for handling this gracefully? I'll need to be able
to handle Windows/DOS produced text files as well so I can't *depend* on the
files coming from Unix.

Thanks in advance,
Bill
 
Jon Skeet [C# MVP]

Bill Cohagan said:
I've got a c# WinForms app that reads an input text file using a
StreamReader, using the ReadLine() method. The input files are Unix type
text files; i.e., they have a single x0D character to mark end-of-line. This
app runs fine on my Windows XP Pro development system, but fails on the
target Win98 system. Actually I can demonstrate the problem just using
Notepad so I'm guessing the .Net Framework 1.1 is not the issue.

Using Notepad on my XP system I can open these text files and they look
right; i.e., the lines terminate in the expected places. Using Notepad on
the Win98 system however produces a view with "little black boxes" where the
lines should end.

So, it appears that XP itself is a little more Unix friendly. Is there any
support in the Framework for handling this gracefully? I'll need to be able
to handle Windows/DOS produced text files as well so I can't *depend* on the
files coming from Unix.

I would expect StreamReader to handle it just fine, coping with either
kind of line break. From the docs:

<quote>
A line is defined as a sequence of characters followed by a line feed
("\n") or a carriage return immediately followed by a line feed
("\r\n").
</quote>

However, 0x0d isn't \n, it's \r, which is unusual for a Unix box to
use. I believe Macs tend to use just \r though - could that be the
problem?
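
One quick way to check this is to feed StreamReader a buffer containing each kind of line break. The following is only a minimal sketch: the class name LineEndingDemo and the helper ReadLines are made up for illustration, and it uses generics, so it assumes a newer compiler than the one shipped with Framework 1.1. Current .NET documentation also lists a lone carriage return ("\r") as a line terminator; whether Framework 1.1 behaves identically on Win98 is not confirmed here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical demo class; not part of the original poster's application.
public static class LineEndingDemo
{
    // Reads every line from the given bytes through a StreamReader.
    public static List<string> ReadLines(byte[] data)
    {
        List<string> lines = new List<string>();
        using (StreamReader reader = new StreamReader(new MemoryStream(data), Encoding.ASCII))
        {
            string line;
            // ReadLine returns null once the end of the stream is reached.
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // One buffer mixing 0x0A, 0x0D 0x0A, and a lone 0x0D as line breaks.
        byte[] data = Encoding.ASCII.GetBytes("unix\nwindows\r\nmac\rlast");
        foreach (string line in ReadLines(data))
        {
            // Brackets make any stray control characters easy to spot.
            Console.WriteLine("[" + line + "]");
        }
    }
}
```

If each bracketed line comes out clean on both machines, the line endings are probably not the cause.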
 
Bill Cohagan

Jon-
Thanks for the quick response. Actually I described it incorrectly. The
files are using a single 0x0A char as end-of-line rather than the expected
0D0A. I will investigate further; however as I said the program runs fine
under XP, but not under Win98 (both with Framework 1.1).

Bill
 
Bill Cohagan

Jon
Given a Unix type text file (0x0A end-of-line) should Win98's Notepad
display it with proper formatting or with "black box" chars?

Thanks,
Bill
 
Jon Skeet [C# MVP]

Bill Cohagan said:
Thanks for the quick response. Actually I described it incorrectly. The
files are using a single 0x0A char as end-of-line rather than the expected
0D0A. I will investigate further; however as I said the program runs fine
under XP, but not under Win98 (both with Framework 1.1).

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.

I'm surprised by it not working though - I really would have expected
it to work fine.
 
Jon Skeet [C# MVP]

Bill Cohagan said:
Given a Unix type text file (0x0A end-of-line) should Win98's Notepad
display it with proper formatting or with "black box" chars?

Well, "should" is an interesting question - but I believe it *does*
show the "black box" chars on normal W98 systems :)

(Whereas .NET *really* should cope.)
 
Bill Cohagan

Jon-
It turns out that the end-of-line character(s) were not related to the
problem I was having. I was misled initially by a poorly worded error
message -- then once I noticed the file format difference (via Notepad) I
latched on to that as the probable culprit. How easy it is to slip on a pair
of blinders!

It turns out that what differed between the XP platform and the Win98 was
the behavior of the WindowsIdentity object. In XP the wi.Name property
returns something of the form <machine>\userid (or <domainname>\userid). In
Win98 it returns an empty string (or null) and *that* was causing my
application some grief.
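
For anyone hitting the same thing, one way to guard against it is sketched below. This is only an illustration under stated assumptions: the class UserNameHelper and its methods are hypothetical names, and falling back to Environment.UserName is one possible choice, not necessarily what the application ended up doing.

```csharp
using System;
using System.Security.Principal;

// Hypothetical helper; the names here are illustrative only.
public static class UserNameHelper
{
    // Returns the name unchanged when present, otherwise the fallback.
    public static string NameOrFallback(string name, string fallback)
    {
        return (name == null || name.Length == 0) ? fallback : name;
    }

    // Tries WindowsIdentity first, then falls back to Environment.UserName.
    public static string CurrentUserName()
    {
        string name = null;
        try
        {
            WindowsIdentity wi = WindowsIdentity.GetCurrent();
            if (wi != null)
            {
                name = wi.Name;
            }
        }
        catch (PlatformNotSupportedException)
        {
            // WindowsIdentity is unavailable on this platform; use the fallback.
        }
        return NameOrFallback(name, Environment.UserName);
    }

    public static void Main()
    {
        Console.WriteLine(CurrentUserName());
    }
}
```

Note that the fallback has no <machine>\ or <domainname>\ prefix, so any code that splits the name on the backslash needs to allow for that.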

Thanks for the help and sorry for the wasted time.

Regards,
Bill
 
Jon Skeet [C# MVP]

Bill Cohagan said:
It turns out that the end-of-line character(s) were not related to the
problem I was having. I was misled initially by a poorly worded error
message -- then once I noticed the file format difference (via Notepad) I
latched on to that as the probable culprit. How easy it is to slip on a pair
of blinders!

Oh absolutely. Keeping an open mind to what the cause of a problem is
when you've got an idea about what it might be is a really tricky thing
to do.

It turns out that what differed between the XP platform and the Win98 was
the behavior of the WindowsIdentity object. In XP the wi.Name property
returns something of the form <machine>\userid (or <domainname>\userid). In
Win98 it returns an empty string (or null) and *that* was causing my
application some grief.

Ah...

Thanks for the help and sorry for the wasted time.

Not a problem at all.
 
Jeffrey Tan[MSFT]

Hi Bill,

I am glad you found the problem yourself.

If you have any further problems, please feel free to post. Thanks.

Best regards,
Jeffrey Tan
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
 