zero element array creation

  • Thread starter: Zytan
Time for a revisit on this one, Zytan. I had a bit of time this afternoon and
I was getting a tad curious.

This page here http://msdn2.microsoft.com/en-us/library/7ee5a7s1(VS.80).aspx
is the key to understanding why these forms are both valid.

Dim emptyArray() as Byte = New Byte() {}

Dim emptyArray as Byte() = New Byte() {}

It won't tell you in so many words, but the content of that page and some of
its links show the following.

The page states that to specify that a variable is an array, you follow its
variable name immediately with parentheses (as in the first form).

Other information shows that Byte and Byte() are distinct data types.

Byte is a value type.

Byte() is a reference type derived from System.Array.

If you attempt to code:

Dim emptyArray() as Byte() = New Byte() {}

you get a compiler error that reads:

Array modifiers cannot be specified on both a variable and its type.

More specifically, the compiler error code is BC31087 which has the longer
description of:

Array modifiers are present on both the variable and its type, indicating
an array of arrays.

and the recommended fix is to remove the array modifier from the type
specifier. This is also what the IntelliSense error-correction option
recommends for this condition.

This means that the form recommended by both the documentation and the
compiler is:

Dim emptyArray() as Byte = New Byte() {}

If you code:

Dim emptyArray as Byte() = New Byte() {}

then because the variable name is not followed by () and because Byte() is a
distinct type, it follows that the declaration is quite legal.
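
A quick way to check that the two legal forms end up as the same thing is to compare their runtime types. This is only a minimal sketch; the module name and the variable names a and b are made up for illustration and are not from the MSDN page.

```vb
Imports System

Module ArrayFormsDemo
    Sub Main()
        ' Array modifier on the variable name.
        Dim a() As Byte = New Byte() {}
        ' Array modifier on the type.
        Dim b As Byte() = New Byte() {}

        ' Both variables hold a Byte() instance, so the runtime types match.
        Console.WriteLine(a.GetType() Is b.GetType())   ' True
        Console.WriteLine(a.Length)                     ' 0
        Console.WriteLine(b.Length)                     ' 0
    End Sub
End Module
```

Either spelling declares a variable of type Byte(); only the position of the parentheses differs.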

The documentation further recommends that a zero-length array be declared
as:

Dim emptyArray(-1) As Byte
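
To see what the -1 form actually gives you, something along these lines can be run. It is a sketch only, and the module name is hypothetical.

```vb
Imports System

Module ZeroLengthDemo
    Sub Main()
        ' An upper bound of -1 leaves no valid indexes, so the array
        ' exists (it is not Nothing) but holds zero elements.
        Dim emptyArray(-1) As Byte

        Console.WriteLine(emptyArray Is Nothing)        ' False
        Console.WriteLine(emptyArray.Length)            ' 0
        Console.WriteLine(emptyArray.GetUpperBound(0))  ' -1
    End Sub
End Module
```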

If you attempt to code:

Dim emptyArray As Byte(-1)

you get compiler error code BC30638: Array bounds cannot appear in type
specifiers.

This reinforces the recommendations we came across earlier.

As to why we appeared to not know these things off the top of our heads, we
probably did, but because using these 'features' has become second nature,
the whys and wherefores are probably in an archived part of memory and not
readily accessible.
 
This page here http://msdn2.microsoft.com/en-us/library/7ee5a7s1(VS.80).aspx
is the key to understanding why these forms are both valid.
[snip]
The page states that to specify that a variable is an array, you follow its
variable name immediately with parentheses (as in the first form).

Yup, I can see that.

Also, from that page, it links to "How to: Create an Array with No
Elements"
http://msdn2.microsoft.com/en-us/library/18e9wyy0(VS.80).aspx
which says "Declare one of the array's dimensions to be -1", as you
mention later in your post.
Other information shows that Byte and Byte() are distinct data types.

Byte is a value type.

Byte() is a reference type derived from System.Array.

Ok, so they really are two different things.
If you attempt to code:

Dim emptyArray() as Byte() = New Byte() {}

you get a compiler error that reads:

Array modifiers cannot be specified on both a variable and its type.

I tried this, too. And that answer indicates that they are one and
the same.
This means that the form recommended by both the documentation and the
compiler is:

Dim emptyArray() as Byte = New Byte() {}
Yup.

If you code:

Dim emptyArray as Byte() = New Byte() {}

then because the variable name is not followed by () and because Byte() is a
distinct type, it follows that declaration is quite legal.

Yes. This still doesn't explain why the two produce the same IL.
But, now, at least, I follow the recommended method, since you are
saying "emptyArray is an array of type Byte", which follows: "Dim
emptyArray() as Byte", although they could have borrowed some syntax
ideas from Pascal to make it a little more clear.
If you attempt to code:

Dim emptyArray As Byte(-1)

you get compiler error code BC30638: Array bounds cannot appear in type
specifiers.

This reinforces the recommendations we came across earlier.

Yup, I tried that, too, and I agree with you. So, now, it is
definitely more 'proper' to use the () on the variable, not the type.
You've changed my mind.
As to why we appeared to not know these things off the top of our heads, we
probably did, but because using these 'features' has become second nature,
the whys and wherefores are probably in an archived part of memory and not
readily accessible.

Totally makes sense. But, I'm sure you can see how, by understanding
(getting an answer to 'why?'), it allows a beginner to get to the
position you are in.

Thanks, Stephany,

Zytan
 
Actually, VB .NET's attempt to ease code porting from VB6 by declaring
Two answers; on the above I don't agree with you.
And before you misunderstand, I have had much discussion with a very
longtime C# guy on this board. In my opinion VB was using 1 as the starting
index the way it should be: the figure for first is 1. Numbers are
represented by a character set from 0 to 9. However, oldies like me are used
to 0 as the first index, although in my eyes it is in fact completely
foolish. Before somebody starts to write why it is zero: I know that.

I know counting from 1 makes more sense to humans. I've used BASIC
before VB, and I was happy with option base 1. Learning about
pointers and C and Pascal made me realize why 0 is better. But, this
is all *irrelevant* to the issue I was raising:

If I want 10 elements, I want to write 10. VB .NET says I need to
write 9. That's the issue I was bringing up.

In fact, my point was that the issue of 0-based indexing or 1-based
indexing was a very unconvincing argument to start using the maximum
index instead of the array length.
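
For anyone following along, here is roughly what that looks like in practice. It is a sketch with made-up names (a, b, c), not code from the article:

```vb
Imports System

Module BoundsDemo
    Sub Main()
        ' The number in parentheses is the highest index, not the element count.
        Dim a(9) As Integer
        Console.WriteLine(a.Length)            ' 10
        Console.WriteLine(a.GetUpperBound(0))  ' 9

        ' Writing 10 therefore yields eleven elements, indexes 0 through 10.
        Dim b(10) As Integer
        Console.WriteLine(b.Length)            ' 11

        ' VB 2005 and later also accept an explicit "0 To" lower bound.
        Dim c(0 To 9) As Integer
        Console.WriteLine(c.Length)            ' 10
    End Sub
End Module
```

The "0 To 9" spelling at least makes the bounds explicit, though the lower bound still has to be 0.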
The other point is that an array in .NET is in fact only useful for
fixed-size arrays. For the rest, the IList- and ICollection-implementing
classes are much better, including the generic ones. An array has to be
ReDimmed every time. Therefore I did not see the reason to use a fixed
array with a length of really 0.

In my case, the array is created once. No ReDim. It returns a length
of 0, or a length of x, x = number of solutions.

But, yes, for something that is continually ReDimmed, collections
are better. I know little about them other than that is what they are
designed to improve.

Zytan
 
Zytan said:
This page here http://msdn2.microsoft.com/en-us/library/7ee5a7s1(VS.80).aspx
is the key to understanding why these forms are both valid.
[snip]
The page states that to specify that a variable is an array, you follow its
variable name immediately with parentheses (as in the first form).

Yup, I can see that.

Also, from that page, it links to "How to: Create an Array with No
Elements"
http://msdn2.microsoft.com/en-us/library/18e9wyy0(VS.80).aspx
which says "Declare one of the array's dimensions to be -1", as you
mention later in your post.
Other information shows that Byte and Byte() are distinct data types.

Byte is a value type.

Byte() is a reference type derived from System.Array.

Ok, so they really are two different things.

'Byte' and 'Byte()' are different, but 'Dim b() As Byte' and 'Dim b As
Byte()' will both declare an array of type "array of 'Byte'" ('Byte()'),
which contains items of type 'Byte'.
 
I think people are misinterpreting my post. Just to be clear again:

I am not stating anything about what option base the language should
be. I just think, as with every other language in existence, when I
want 10 elements, I should type "10" somewhere in the statement
(unless I specify the min AND max bounds, like "0 to 9" or "11 to 20",
which also indicates the number of elements I will get).

This is independent of the Option Base status.
One hears a lot of people ask "Why isn't BASIC/VB more like C/C++?".

This would be a fair question if BASIC was "invented" after C. But it was
the other way around and therefore the more appropriate question should be
"Why isn't C/C++ more like BASIC/VB?".

Right. And there's good answers to that last question. But, until I
knew them, I liked 1-based indexing better. And I still see no reason
why BASIC today can't have Option Base 1.
As for zero or 1 based indexing, we Homo sapiens do not intuitively start
counting from zero. You don't hear someone say "I've made my zeroth
million.". Likewise there was no year zero between BCE and CE which is also
why the year 2000 was in the 20th century and is not in the 21st century.

Of course. The fact that you're explaining this to me indicates you
misunderstood my post. Because I totally understand and agree. I
wouldn't have any problem with VB starting its indexing at 1. Every
BASIC I used did that, and I was happy.
Certainly computers count from zero quite happily because all bits off just
happens to be zero. Once it was determined that indexing from zero was
actually a very good idea the Option Base was introduced to BASIC to allow
it without destroying existing code, but this still predates VB.

Yes, and I never used Option Base, because I liked to count from 1, as
all humans do, until they become programmers and realize what's
happening under the hood. Option Base was a great option, it allowed
people to 'move forward' with 0-based indexing, and as you said,
allowed people to not destroy older code.
Another aspect to consider is that, unlike a lot of other languages, BASIC
is NOT controlled by a committee. COBOL is controlled by CODASYL, C++ is
controlled by ANSI, etc. Those languages MUST, at least, comply with the
ratified standard, and anything else is a compiler-specific extension. What
VB/VB.NET does (or doesn't do) is purely up to Microsoft.

Yup. Microsoft wrote a lot of different BASICs back in the day. I'm
not sure if people today realize this. They were certainly meant to be
grasped quickly, which is what BASIC stands for in both its name and
its meaning. And to do that, you have to count starting at 1.
Certainly, Option Base is no longer supported in VB.NET (2005), but arrays
are still 'sized' by supplying the value for the upper-bound, the same way
it has been since before C was invented.

No. Arrays have always been sized by their length. Always. Every
other language other than VB .NET is consistent with this.

It is convenient to argue that "VB arrays have always been made
using the highest index", since it is *technically true*. But, this
coincidental truth is only true as a result of their decision to
support old code. Just because it's true, it doesn't mean it has
weight. It holds no meaning. It just happened to be.

When reading code, people think, "that array is 10 elements in size",
not "the highest index is 10". Many other 0-based languages use this
same concept. They make you write "10" even though you get indexes 0
to 9. Pascal does this, and it is a language based on pseudo-code
which is something meant for everyone to read, regardless of their
language of choice.

We could argue which (length or highest index), if not both, the
original concept was intended to be. But, there is no argument as to
what people believed it meant, or what is intuitively correct, which
is what is important. There is overwhelming evidence that the vast
majority of programmers thought of the number they were typing as
'length', not 'highest index'. All other languages fit that mold.
Even when 0-based languages came about and made length <> highest
index, people still made 0 to 9 index arrays by typing the length
(10), not highest index (9). Only VB has broken the mold by making
you use the highest index. And further evidence is the backlash of
everyone who hates it. And people who don't hate it are wondering why
there's an extra element in all of their arrays. The article I posted
states this would be a known consequence, but it's better than
breaking old code, which MS does care about.

Zytan
 
"Why isn't C/C++ more like BASIC/VB?".
Because BASIC wasn't intended to be used to write operating systems. Why
isn't a Mercedes-Benz more like the Ford Model-T since the Ford came first?


It's an odd perspective that suggests that modern things should be more like
the things cavemen used because they were first. Why aren't MP3s more like
33 1/3 LPs, can't they make them scratch and warp?


Homo sapiens do not calculate divisions accurate to 64 bits, nor do they run at
65 MPH. These are machines. They aren't supposed to emulate people, we
have people for that.


I hear them say "my bad" so this should be the error message when the CPU
overflows? :-)

Well, Tom, I follow everything you are saying here, but I do believe
that if VB is intended to be BASIC - Beginner's All-purpose Symbolic
Instruction Code, then BASIC really should count from 1.

Because it's FOR humans.

I don't personally care, myself, since I know how to count from 0. My
mind is wrapped around it, as is yours.

Maybe even for the star programmer starting out at 4 years old, it's
best to have him understand 0-based indexing right away, but that's
not what BASIC was originally meant for.
The article points out (if I read it correctly) that the arrays are sized
one larger than expected. Expect a bank transfer error any day now and it
will be due to "oh yeah the array was one larger than we thought."

Exactly. The code could be accessing an out of bounds element, but
since you have 1 more than expected, you won't ever get the error
during debugging. Ouch.

And this is because when people type "10" they are thinking "length"
not "highest index". I think VB forcing us to think "highest index"
is absurd. AGAIN, they did it so that (most) older code will not
break. It's a trade off. That's why VB has mistakes like this. It's
ok to admit to them. It wouldn't have happened if VB6 didn't exist.

Saying it was always like this because of a coincidental truth that VB
arrays are always made by typing the 'highest index' doesn't negate
the truth of the matter. (And I'm not even sure if that coincidental
truth is actually true: When using Option Base 0, for old BASIC, did
you really state the array size by its highest index? or was it still
by array length?)

Zytan
 
'Byte' and 'Byte()' are different, but 'Dim b() As Byte' and 'Dim b As
Byte()' will both declare an array of type "array of 'Byte'" ('Byte()'),
which contains items of type 'Byte'.

Yes, I know.

My previous post was pointing out a seeming contradiction, in that
they are different, yet result in the same IL, so they are the same.

Zytan
 
Zytan,

This was just a silly decision at the time the conversion was made from VB6
to VB.NET.

Now both kinds of methods work: those that start at 1 (Left) and
those that start at zero (.Substring).

In my opinion a better job could have been done then.

Cor
 
Zytan said:
Well, Tom, I follow everything you are saying here, but I do believe
that if VB is intended to be BASIC - Beginner's All-purpose Symbolic
Instruction Code, then BASIC really should count from 1.

Because it's FOR humans.

Discussions of this type tend to avoid defining (in any meaningful way) what
"basic" or "easy" means with each person operating against their own
definition. Keep in mind that BASIC supported the GOTO statement, not as
GOTO <label> (which one might accept) but rather as GOTO <line number>, and of
course line numbers change as the code is edited. But furthermore BASIC
supported "computed GOTOs". In any case does this make GOTO statements
"easy and basic" or "difficult, error prone and ultimately time consuming?"

What (I suggest) would need to be done is a study on how many people-hours
were wasted overcoming a feature designed to be easy. If the majority of
the "target users" actually make mistakes using a language feature then it
is (by definition) not very easy.

Personally I'm not affected by most of these design choices, and that is in
large part the reason I suggest people avoid certain constructs even when a
language permits them. These "features" have been determined (at least by
me) to cause the problems that waste the time, and eventually the money.

Take care,
Tom
 
Zytan said:
I agree and disagree. I disagree that features that allow mistakes
means they are difficult. GOTO is easy. It can be taught in 2
seconds. Most people never screw it up. I never did. I never
changed line numbers except with the automatic renumbering that
handles GOTOs, as well. I agree that if something is prone to error,
then it should be rethought. But, removed? I am not sure. To help
the 1% of people who screw up their GOTOs by making it many times
harder by introducing labels perhaps isn't a worthwhile tradeoff.

We're going to have to disagree. I can see no way that "line number 3500 is
the SAVE routine" could be easier than remembering that SAVE is the label
for the save routine and I'm left wondering why more languages don't operate
this way if that is the case.

As far as you not messing it up. I'd be surprised if you didn't type in the
wrong line number once in awhile but are you talking about 15 developers
working on a 500,000 line project or you working alone on a small utility?

If you have anything which backs up the claim that people learned the GOTO
statement in 2 seconds, that only 1% of the developers screwed up and that
labels are "many times harder" I'd love to see it. Are functions and
procedures harder as well? And aren't they specialized labels?
 
We're going to have to disagree. I can see no way that "line number 3500 is
the SAVE routine" could be easier than remembering that SAVE is the label
for the save routine and I'm left wondering why more languages don't operate
this way if that is the case.

I never said "3500" is easier to remember than "SAVE". "SAVE" is
easier to remember. Anybody would agree with that.

Definition, remember? Let me try to explain again:

Explain to someone JUST STARTING to be capable of understanding program
flow. You're talking about subroutines. I'm talking 4 year old kids.

10 PRINT "FIRST"
20 PRINT "NEXT"
30 PRINT "THIS HAPPENS LAST"

then, add program flow direction:

10 PRINT "HELLO"
20 GOTO 10

That's easier than labels.

Procedures come into play on a higher level. First, you have to
comprehend the computer does one thing at a time, in order.

The above is what BASIC was for. Really. People actually write (or used
to write) 8-line BASIC programs. It's used in school math
books everywhere.
As far as you not messing it up. I'd be surprised if you didn't type in the
wrong line number once in awhile but are you talking about 15 developers
working on a 500,000 line project or you working alone on a small utility?

Hahahaha, sorry, we are on different wavelengths. I'm talking 100 to
200 line length programs MAX when I was a kid. GOTOs were on the same
screen. Of course MUCH more code could fit onscreen at once, like 5
or 10x more than today's languages using the colon ":" (it still
exists, no one uses it, though), since it helped speed it up given
that it was interpreted. So, I'm talking elementary, basic BASIC.
Not making OS's. Here's an idea of some code:
http://en.wikipedia.org/wiki/BASIC
If you have anything which backs up the claim that people learned the GOTO
statement in 2 seconds, that only 1% of the developers screwed up and that
labels are "many times harder" I'd love to see it. Are functions and
procedures harder as well? And aren't they specialized labels?

I have no evidence. I just know I learned GOTO in 2 seconds. Label
adds two things to use, rather than one with GOTO. I think it's easy
to see that a 4 year old would understand GOTO 10 much more quickly
than using a label. It's just less stuff to take in all at once. Of
course to anyone older, it's not that big of a deal, but still,
however easy, it's harder. I'm talking relative, you're talking
absolute.

Yes, functions and procs are harder, of course. Were you serious
about that? They are an order of magnitude more difficult! I grasped
them in seconds, having programmed without them for years, and thus
immediately realizing exactly why they were good. But, go look at any
first year programming course, especially in non-comp sci degrees, and
see how quickly people grasp functions. It's WAY harder.

I can only assume you are speaking from the perspective of an
experienced programmer. Yes, I would agree with you that to you and
me and other experienced people it's all elementary.

Zytan
 
Sorry again... this only highlights the need for a definition. I just about
never describe programming constructs as easy or hard in terms of children.
To me it isn't relevant any more than suggesting that this year's Academy
Award for best picture be awarded based upon the movie kids under 5 think is
the best.

Would you agree that it would be easier for a 4-year old if you removed the
need for line numbers? If "easy for anybody" is the goal why not just
process code in the order it encounters it? How about if you type PRNIT
"FIRST" it figures out you meant PRINT, wouldn't that be easier, ergo an
even better language?

I don't happen to think so but the kid in the crib across the street surely
does. :-)
 
Sorry again... this only highlights the need for a definition. I just about
never describe programming constructs as easy or hard in terms of children.
To me it isn't relevant any more than suggesting that this year's Academy
Award for best picture be awarded based upon the movie kids under 5 think is
the best.

Yes, you're right. I guess I assumed we were talking about children
since we were talking about BASIC and GOTO [linenumber].
Would you agree that it would be easier for a 4-year old if you removed the
need for line numbers?

Perhaps. But labels still require two things, where line numbers
don't. The line numbers are labels, yes, but they signify each line.
Labels don't. So, it requires something extra to understand. I could
see something asked such as "what does that (the label) program
statement do?" "nothing, it's just a label that another statement
references".

Also, a linenumberless BASIC needs an IDE, which is more complex than
using the environment for coding and running (like the immediate
window).

The only way to really know is to test it with kids.
If "easy for anybody" is the goal why not just
process code in the order it encounters it?

I think line numbers clarify exactly what each statement is. But, you
may be right.
How about if you type PRNIT
"FIRST" it figures out you meant PRINT, wouldn't that be easier, ergo an
even better language?

That's true. But that may teach it as a language like English where
it's ok as long as you're close. That's not really a good thing for
logic languages. You wouldn't want that in the language of math.

Zytan
 