Reading Unix Text Files?

  • Thread starter: Bill Cohagan

Bill Cohagan

I've got a C# WinForms app that reads an input text file with a
StreamReader, using the ReadLine() method. The input files are Unix-type
text files; i.e., they have a single 0x0D character to mark end-of-line. This
app runs fine on my Windows XP Pro development system, but fails on the
target Win98 system. Actually, I can demonstrate the problem just using
Notepad, so I'm guessing the .NET Framework 1.1 is not the issue.

Using Notepad on my XP system I can open these text files and they look
right; i.e., the lines terminate in the expected places. Using Notepad on
the Win98 system however produces a view with "little black boxes" where the
lines should end.

So, it appears that XP itself is a little more Unix friendly. Is there any
support in the Framework for handling this gracefully? I'll need to be able
to handle Windows/DOS produced text files as well so I can't *depend* on the
files coming from Unix.

Thanks in advance,
Bill
 
Bill Cohagan said:
I've got a C# WinForms app that reads an input text file with a
StreamReader, using the ReadLine() method. The input files are Unix-type
text files; i.e., they have a single 0x0D character to mark end-of-line. This
app runs fine on my Windows XP Pro development system, but fails on the
target Win98 system. Actually, I can demonstrate the problem just using
Notepad, so I'm guessing the .NET Framework 1.1 is not the issue.

Using Notepad on my XP system I can open these text files and they look
right; i.e., the lines terminate in the expected places. Using Notepad on
the Win98 system however produces a view with "little black boxes" where the
lines should end.

So, it appears that XP itself is a little more Unix friendly. Is there any
support in the Framework for handling this gracefully? I'll need to be able
to handle Windows/DOS produced text files as well so I can't *depend* on the
files coming from Unix.

I would expect StreamReader to handle it just fine, coping with either
kind of line break. From the docs:

<quote>
A line is defined as a sequence of characters followed by a line feed
("\n") or a carriage return immediately followed by a line feed
("\r\n").
</quote>

However, 0x0d isn't \n, it's \r, which is unusual for a Unix box to
use. I believe Macs tend to use just \r though - could that be the
problem?
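
For anyone who wants to check this quickly, here is a minimal sketch that feeds the same three lines to StreamReader.ReadLine() with Unix (LF) and Windows (CRLF) terminators. The class and method names (LineEndingDemo, ReadLines) are made up for illustration, and the code is written against a current .NET rather than Framework 1.1 (it uses generics), so treat it as a sanity check, not as the original poster's program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineEndingDemo
{
    // Reads every line from the given bytes via StreamReader.ReadLine().
    public static List<string> ReadLines(byte[] data)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(new MemoryStream(data), Encoding.ASCII))
        {
            string line;
            // ReadLine() returns null once the end of the stream is reached.
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // The same three lines with Unix (LF) and Windows (CRLF) terminators.
        byte[] unix = Encoding.ASCII.GetBytes("one\ntwo\nthree\n");
        byte[] dos = Encoding.ASCII.GetBytes("one\r\ntwo\r\nthree\r\n");

        Console.WriteLine(string.Join("|", ReadLines(unix))); // one|two|three
        Console.WriteLine(string.Join("|", ReadLines(dos)));  // one|two|three
    }
}
```

If both lines print the same thing on the target machine, the line terminators are probably not what is breaking the app.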
 
Jon-
Thanks for the quick response. Actually I described it incorrectly. The
files are using a single 0x0A char as end-of-line rather than the expected
0D0A. I will investigate further; however as I said the program runs fine
under XP, but not under Win98 (both with Framework 1.1).

Bill
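
Since the terminator bytes were misreported at first, one way to settle what a file really contains is to count the terminators directly. The sketch below is only an illustration: LineEndingCounter and Count are hypothetical names, and it assumes a single-byte-compatible encoding such as ASCII or UTF-8 (it would miscount UTF-16 text).

```csharp
using System;
using System.IO;

public static class LineEndingCounter
{
    // Returns the number of CRLF pairs, lone LF bytes, and lone CR bytes, in that order.
    public static int[] Count(byte[] data)
    {
        int crlf = 0, lf = 0, cr = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] == 0x0D)
            {
                if (i + 1 < data.Length && data[i + 1] == 0x0A)
                {
                    crlf++;
                    i++; // skip the LF that belongs to this CRLF pair
                }
                else
                {
                    cr++;
                }
            }
            else if (data[i] == 0x0A)
            {
                lf++;
            }
        }
        return new int[] { crlf, lf, cr };
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: LineEndingCounter <file>");
            return;
        }
        int[] c = Count(File.ReadAllBytes(args[0]));
        Console.WriteLine("CRLF: {0}, LF only: {1}, CR only: {2}", c[0], c[1], c[2]);
    }
}
```

A pure Unix file should report only "LF only" terminators, and a Windows/DOS file only CRLF pairs.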
 
Jon
Given a Unix type text file (0x0A end-of-line) should Win98's Notepad
display it with proper formatting or with "black box" chars?

Thanks,
Bill
 
Bill Cohagan said:
Thanks for the quick response. Actually I described it incorrectly. The
files are using a single 0x0A char as end-of-line rather than the expected
0D0A. I will investigate further; however as I said the program runs fine
under XP, but not under Win98 (both with Framework 1.1).

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.

I'm surprised by it not working though - I really would have expected
it to work fine.
 
Bill Cohagan said:
Given a Unix type text file (0x0A end-of-line) should Win98's Notepad
display it with proper formatting or with "black box" chars?

Well, "should" is an interesting question - but I believe it *does*
show the "black box" chars on normal W98 systems :)

(Whereas .NET *really* should cope.)
 
Jon-
It turns out that the end-of-line character(s) were not related to the
problem I was having. I was misled initially by a poorly worded error
message -- then once I noticed the file format difference (via Notepad) I
latched on to that as the probable culprit. How easy it is to slip on a pair
of blinders!

It turns out that what differed between the XP platform and the Win98 was
the behavior of the WindowsIdentity object. In XP the wi.Name property
returns something of the form <machine>\userid (or <domainname>\userid). In
Win98 it returns an empty string (or null) and *that* was causing my
application some grief.

Thanks for the help and sorry for the wasted time.

Regards,
Bill
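
For readers who hit the same difference, a defensive check along these lines may help. This is a sketch only: UserNames.Resolve is a hypothetical helper, not part of the Framework, and what counts as a sensible fallback value is an assumption that depends on the application.

```csharp
using System;

public static class UserNames
{
    // Returns identityName unless it is null or empty, in which case
    // the caller-supplied fallback is returned instead.
    public static string Resolve(string identityName, string fallback)
    {
        if (string.IsNullOrEmpty(identityName))
        {
            return fallback;
        }
        return identityName;
    }

    public static void Main()
    {
        // Illustrative values only; "WORKSTATION\bill" stands in for <machine>\userid.
        Console.WriteLine(Resolve(@"WORKSTATION\bill", "unknown")); // WORKSTATION\bill
        Console.WriteLine(Resolve("", "unknown"));                  // unknown
        Console.WriteLine(Resolve(null, "unknown"));                // unknown
    }
}
```

In the application itself the first argument would come from WindowsIdentity.GetCurrent().Name. Environment.UserName is one possible fallback, though it is untested here whether it returns anything useful on Win98.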
 
Bill Cohagan said:
It turns out that the end-of-line character(s) were not related to the
problem I was having. I was misled initially by a poorly worded error
message -- then once I noticed the file format difference (via Notepad) I
latched on to that as the probable culprit. How easy it is to slip on a pair
of blinders!

Oh absolutely. Keeping an open mind to what the cause of a problem is
when you've got an idea about what it might be is a really tricky thing
to do.

Bill Cohagan said:
It turns out that what differed between the XP platform and the Win98 was
the behavior of the WindowsIdentity object. In XP the wi.Name property
returns something of the form <machine>\userid (or <domainname>\userid). In
Win98 it returns an empty string (or null) and *that* was causing my
application some grief.

Ah...

Bill Cohagan said:
Thanks for the help and sorry for the wasted time.

Not a problem at all.
 
Hi Bill,

I am glad you found the problem yourself.

If you have any further problems, please feel free to post. Thanks.

Best regards,
Jeffrey Tan
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
 