Newline in Text Property

  • Thread starter: runningdog
Hi Tom,

I did not want to start a discussion in this thread about CR and LF.

That is why I cut this morning's message much shorter than it originally
was. My point was that Bill said something about .NET language standards
that do not include vbCrLf.

To me this sounded, again, like advice to use only the single basic .NET
namespace in VB.NET and to forgo the benefits of the other .NET namespaces.

I also agree with Armin that avoiding vbCrLf looks to me like a Don
Quixote exercise. It would be wise if only LF were used for line breaks;
however, that is not the case.

Just some thoughts,

Cor

I'm not trying to say avoid ControlChars - I'm saying I try not to hard
code the line terminator. I would suggest using ControlChars.NewLine, if
you want to use the VB.NET-specific constant :) I'm not sure if it
will have the same behavior - since it doesn't seem to work on mbas yet
(the mono vb.net compiler) - but I suspect that it will.
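
A small sketch of the "don't hard-code the terminator for output" idea. It is written in Python rather than VB.NET, purely as an assumption for illustration: `os.linesep` plays roughly the role that a platform newline constant plays in .NET, and `join_lines` is a hypothetical helper, not anything from the Framework.

```python
import os


def join_lines(lines, terminator=os.linesep):
    """Join lines, ending each one with the given terminator.

    The default is the current platform's convention; pass an explicit
    terminator only when the target format demands one.
    """
    return "".join(line + terminator for line in lines)


# Platform default: whatever the running system uses.
platform_text = join_lines(["first", "second"])

# Explicit choice, e.g. for a file that Windows tools must read.
crlf_text = join_lines(["first", "second"], terminator="\r\n")
```

The explicit form is the one to reach for when the data is headed to another platform, which is the case discussed later in the thread.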
 
Tom Shelton said:
And that is pretty much the case. There are exceptions of course,
but the general rule on DOS is CR/LF and *nix LF.

Despite these exceptions you are blindly using environment.newline?
Well it's built right into the standard libraries.

I'm talking about conventions in general, not only about the Framework.
There are apps not built on the Framework.
There are exceptions - but they are willful exceptions.

Even if there was only 1 exception among 100,000 cases, I'd have to check
the format in each case.
If interoperability is the case - then yes, you have to make a
decision.

The file format - and this is only an example where line terminators are
used - is independent from managed code, unmanaged code, interop,
programming language ...
That is true. Believe me, I am very aware of this situation.
When dealing with cross platform data, you have to be aware of the
format.

You /always/ have to be aware of the format, not only when dealing with
cross-platform data.
It is one of the banes of the newline problem and a lack of a
universal definition of a newline.

Right, just because of this lack, you always have to know what to expect.
There are platform specific conventions.

As long as there is no guarantee that a file (or the programmer writing the
file) obeys the convention, you can not blindly use environment.newline.
But when interoperability is the goal, then yes, picking a specific line
terminator and specifying it is important, because a programmer has to
know what to expect when reading the data on another platform. In those
cases, I usually choose CR/LF, since Windows is much more universal. But,
as I said, each platform has its own convention for line termination.
Here is the list for the big 3:

DOS/Windows - CR/LF
*nix - LF
Mac (just looked this one up) - CR

1. Even if you know the platform, you will fail if you rely on this. My
attitude is that a high probability is not enough. That's why I say that I
have to check each case.
2. You sometimes don't know on which platform the file has been created. I
don't have to know it but I have to know the separator used.
"Environment.newline" says absolutely nothing.
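
One way to "check each case" is to count the terminators actually present in the data instead of assuming the platform convention. The following is only a sketch in Python (chosen for illustration, not taken from the thread); `count_terminators` is a hypothetical name.

```python
def count_terminators(data: bytes) -> dict:
    """Count CR/LF pairs, lone LFs and lone CRs in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CR/LF pair contains one LF and one CR, so subtract the
        # pairs to get the terminators that stand alone.
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
    }


counts = count_terminators(b"a\r\nb\rc\nd\r\n")
```

A file with more than one non-zero count is exactly the mixed case where relying on the platform default goes wrong.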


I also say this because I recently read a file using the streamreader
(readline method) and found out that the streamreader also recognizes a
single CR as a separator. Hmmm...... According to your definition it's a
bug, isn't it? (BTW, ran on a Win32 platform ;-) )
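
The effect described here can be reproduced outside .NET. As an analogy only (this is Python, not the StreamReader), `str.splitlines` also treats a lone CR as a line boundary, while splitting on the two-character sequence keeps an embedded CR inside its record. The sample data is made up for the sketch.

```python
sample = "field one\rstill the same record\r\nsecond record\r\n"

# "Universal" splitting: CR, LF and CR/LF all end a line.
universal = sample.splitlines()

# Strict splitting: only the CR/LF pair ends a record.
strict = [part for part in sample.split("\r\n") if part]
```

Here `universal` has three entries and `strict` has two, because the embedded CR breaks the first record apart in the universal case.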


Sorry that my first post was a reply to *yours*. It should actually be a
general question/issue.
 
Despite these exceptions you are blindly using environment.newline?

Those exceptions are the cases where the programmer purposely overrode
the convention. I have never had that happen, since most people use the
standard IO conventions of their platform. And I only use
Environment.NewLine for output. Not for reading, especially since
StreamReader is usually capable of discerning the source file's newline
character.
I'm talking about conventions in general, not only about the Framework.
There are apps not built on the Framework.

I'm not talking about the Framework either. I'm talking about the
standard system libraries on *nix and windows systems. I'm talking
about the low level IO.
Even if there was only 1 exception among 100,000 cases, I'd have to check
the format in each case.

In general, I think you're going to find that the ratio is even smaller
than that.
The file format - and this is only an example where line terminators are
used - is independent from managed code, unmanaged code, interop,
programming language ...

Yes. You have to be aware of what the terminator is going to be if you
are getting data from other platforms. Like I said, usually when I do file
IO on *nix, the files are intended to be used with windows software - so
I generally use the CR/LF sequence in those cases.
You /always/ have to be aware of the format, not only when dealing with
cross-platform data.

Not really - unless you specifically override the terminator. And there
would have to be a very good reason for me to do that if I knew the data
was going to another *nix system, since every standard tool on that
system is going to expect lf as the terminator character. The only time
that is an issue is if you're dealing with different platforms.
Right, just because of this lack, you always have to know what to expect.

Yes - if you're receiving data from other platforms, then yes you have to
be aware of the issue. I've never had a file from a windows source that
did not use the CR/LF convention.
As long as there is no guarantee that a file (or the programmer writing the
file) obeys the convention, you can not blindly use environment.newline.

The only time this is an issue is in receiving data. You have to know
when reading data what the terminator was when writing. As I said, most
of the time the data that I write on *nix is destined for windows boxes
- so I usually use a CR/LF explicitly so that the file will work with
the standard windows tools - such as notepad. I have noticed that
wordpad is smart enough to recognize a single lf though :)
1. Even if you know the platform, you will fail if you rely on this. My
attitude is that a high probability is not enough. That's why I say that I
have to check each case.

In very, very few cases. I'm not advocating that you not check your
source file.
2. You sometimes don't know on which platform the file has been created. I
don't have to know it but I have to know the separator used.
"Environment.newline" says absolutely nothing.

Of course not. It only tells you the newline character of the current
platform - not the one where the file originated. For cross platform
data - you do have to know the format it was written in. But the
newline character used is a very good indication of the originating
platform.
I also say this because I recently read a file using the streamreader
(readline method) and found out that the streamreader also recognizes a
single CR as a separator. Hmmm...... According to your definition it's a
bug, isn't it? (BTW, ran on a Win32 platform ;-) )

I don't think so. I think that ReadLine is a bit smarter than that. It
says in the docs that it looks for LF ("\n") or CRLF ("\r\n") as the
line terminator. Which means it is not solely using Environment.NewLine.
 
We've got different opinions, so I drop the quote this time. :-)

I don't think so. I think that ReadLine is a bit smarter than that.
It says in the docs that it looks for LF ("\n") or CRLF ("\r\n") as
the line terminator. Which means it is not solely using
Environment.NewLine.

I expect the streamreader to use the platform standard. A single CR is not
the platform standard.
Don't call this "smart", call it a bug. ;-)
 
We've got different opinions, so I drop the quote this time. :-)

It's ok to have differing opinions... That's what makes the world go
round :)
I expect the streamreader to use the platform standard. A single CR is not
the platform standard.
Don't call this "smart", call it a bug. ;-)

Why? It means less work for me :)
 
Tom Shelton said:
Why? It means less work for me :)

It meant more work for me because I had CR *within* a line and I could not
use the streamreader therefore. What would have been your solution? I read
the whole file in a string and split it on my own (because the String.Split
function doesn't take /two/ chars as /one/ separator, and the VB.split
function occupied too much memory).
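
The workaround described above (read everything, then split on the two-character sequence only) can be sketched as follows. This is Python rather than VB.NET and only illustrates the approach; `split_records`, `read_records` and `demo` are hypothetical names, and UTF-8 is an assumed encoding.

```python
import os
import tempfile


def split_records(text, terminator="\r\n"):
    """Split text on the exact terminator, leaving lone CRs in place."""
    records = text.split(terminator)
    # A trailing terminator produces one empty string at the end.
    if records and records[-1] == "":
        records.pop()
    return records


def read_records(path, terminator="\r\n"):
    """Read a whole file without newline translation and split it."""
    # newline="" stops Python from translating line endings on read.
    with open(path, "r", newline="", encoding="utf-8") as handle:
        return split_records(handle.read(), terminator)


def demo():
    """Write a small CR/LF file with an embedded CR and read it back."""
    raw = b"name\rwith embedded CR\r\nsecond record\r\n"
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(raw)
        path = handle.name
    try:
        return read_records(path)
    finally:
        os.remove(path)
```

Like the approach it mirrors, this holds the entire file in memory, so it only suits files of moderate size.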


--
Armin

How to quote and why:
http://www.plig.net/nnq/nquote.html
http://www.netmeister.org/news/learn2quote.html
 
Armin!

Armin Zingler said:
It meant more work for me because I had CR *within* a line and I could not
use the streamreader therefore. What would have been your solution? I read
the whole file in a string and split it on my own (because the String.Split
function doesn't take /two/ chars as /one/ separator, and the VB.split
function occupied too much memory).

I would expect a property where you can set the line terminator string
(defaults to 'Environment.NewLine').
 
Herfried K. Wagner said:
It meant more work for me because I had CR *within* a line and I
could not use the streamreader therefore. [...]

I would expect a property where you can set the line terminator
string (defaults to 'Environment.NewLine').

Yep. That's what I was also looking for but I didn't find it.
 
Armin Zingler said:
It meant more work for me because I had CR *within* a line and I
could not use the streamreader therefore. [...]

I would expect a property where you can set the line terminator
string (defaults to 'Environment.NewLine').

Yep. That's what I was also looking for but I didn't find it.

Alternatively, there could be an overload for 'ReadLine' which accepts
the separator character. Maybe that's not a good idea because it may
decrease the performance if the separator has to be passed in every
call.
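
A reader with a configurable terminator, as wished for above, might look roughly like this. The sketch is in Python and is not a .NET API; `read_lines`, its `terminator` and `chunk_size` parameters are all hypothetical. It sets the terminator once per reader rather than per call, and it reads in chunks so the whole file never has to sit in memory.

```python
import io


def read_lines(stream, terminator="\r\n", chunk_size=4096):
    """Yield records from a text stream, split on the exact terminator."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        parts = buffer.split(terminator)
        # The last part may be incomplete (or end in half a terminator),
        # so keep it in the buffer until more data arrives.
        buffer = parts.pop()
        yield from parts
    if buffer:
        # Final record that was not followed by a terminator.
        yield buffer


# newline="" keeps the CR and LF characters untranslated.
source = io.StringIO("a\rb\r\nc\r\n", newline="")
records = list(read_lines(source, chunk_size=1))
```

A chunk size of 1 is used in the example only to exercise the case where a CR/LF pair straddles two reads.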
 
It meant more work for me because I had CR *within* a line and I could not
use the streamreader therefore. What would have been your solution? I read
the whole file in a string and split it on my own (because the String.Split
function doesn't take /two/ chars as /one/ separator, and the VB.split
function occupied too much memory).

Interesting situation... Actually, that goes against the docs - since
it only mentions looking for a single lf or a cr/lf. So, maybe it could
be classified as a bug.

I think I agree with Herfried... There should be a property or some way
to override this behavior when it is undesirable. Of course, I've never
had text data that had a single CR embedded in the data that wasn't
meant as a newline :)
 