CeZaR said:
Tomas, can you please explain how the compiler works here.
Sounds interesting...
I'm not Tomas, but I can supply a short foray into the grammar of C++. To
begin, let's look at how the grammar forms a simple integer array. Consider
this simple array declaration at global scope:
int x[100];
Starting with translation-unit, where all C++ programs begin their parse
tree, this is the parse tree for that declaration:
translation-unit
|
declaration-seq
|
declaration
|
block-declaration
|
simple-declaration
|
decl-specifier-seq      init-declarator-list ;
|                       |
decl-specifier          init-declarator
|                       |
type-specifier          declarator
|                       |
simple-type-specifier   direct-declarator
|                       |
int                     direct-declarator [ constant-expression ]
                        |                   |
                        declarator-id       100
                        |
                        identifier
                        |
                        x
The most interesting grammar productions in this tree start with declarator.
Here are those productions from the C++ standard:
declarator:
    direct-declarator
    ptr-operator declarator
direct-declarator:
    declarator-id
    direct-declarator ( parameter-declaration-clause ) cv-qualifier-seq_opt exception-specification_opt
    direct-declarator [ constant-expression_opt ]
    ( declarator )
ptr-operator:
    * cv-qualifier-seq_opt
    &
    ::_opt nested-name-specifier * cv-qualifier-seq_opt
So, the first thing to note is that arrays come from the third production of
direct-declarator. Since the brackets always follow a direct-declarator, you
can see why a simple mistake like the following results in a syntax error:
int[100] x;
error C2143: syntax error : missing ';' before '['
error C2059: syntax error : 'constant'
Now, turning to Standard C++, 8.3.5/6 says "Functions shall not have a return
type of type array or function, although they may have a return type of type
pointer or reference to such things." Let's start with how you create a
reference to an array. Looking at the grammar, the ptr-operators come first,
so you might first think this is the correct syntax:
int x[100];
int &r[100] = x;
The compiler will tell you that you cannot have an array of references,
though. Looking at the parse tree for this, that makes sense (to read a type
from a grammar, you usually start from the bottom right). To fix this, you
can parenthesize the reference:
int x[100];
int (&r)[100] = x;
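As a quick sanity check on that reading, here is a minimal sketch. The names x and r come from the snippet above; the helper writes_through_reference is hypothetical, and I'm assuming a C++11 compiler for static_assert and <type_traits>. It only confirms that r is a reference to the whole array rather than an array of references.

```cpp
#include <cassert>
#include <type_traits>

int x[100];
int (&r)[100] = x;   // r: reference to an array of 100 ints

// The type is "reference to int[100]", not "array of references".
static_assert(std::is_same<decltype(r), int (&)[100]>::value,
              "r should be a reference to int[100]");

// sizeof applied to a reference yields the size of the referenced type.
static_assert(sizeof(r) == 100 * sizeof(int), "r refers to the whole array");

// Hypothetical helper: writes through r and checks that x changed.
bool writes_through_reference() {
    r[5] = 42;
    return x[5] == 42 && &r[0] == &x[0];
}
```

Calling writes_through_reference() from any function should return true, since r and x name the same storage.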
Now, how do we get from here to a function returning a reference to an
array? Well, again we put the array declaration at the end of the function
declaration because the grammar requires the name to come before the
brackets. Then we use the other production in direct-declarator to get the
argument list to the function, and then we name the function with
declarator-id. Then we put the & for the reference, and then the return
type. So, we might come up with the following:
int &f()[100];
With that, we run into the same problem as before: we have an array of
int references (clearly not allowed). On top of that, we have a function
returning an array. To fix both, we parenthesize again:
int (&f())[100];
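To see that this declaration really means "function returning a reference to an array of 100 ints", here is a small sketch. The names storage, IntArray100, g and returns_same_array are mine, made up for illustration, and I'm assuming C++11 for static_assert and <type_traits>. It also spells the same type with a typedef, which tends to be easier to read.

```cpp
#include <cassert>
#include <type_traits>

int storage[100];        // hypothetical array for the functions to hand back

int (&f())[100];         // the declaration from the text

// The same function type, spelled by naming the array type first.
typedef int IntArray100[100];
IntArray100& g();

// Both declarations have the same function type.
static_assert(std::is_same<decltype(f), decltype(g)>::value,
              "f and g should have identical types");
// Calling f yields a reference to int[100].
static_assert(std::is_same<decltype(f()), int (&)[100]>::value,
              "f should return a reference to int[100]");

int (&f())[100] { return storage; }
IntArray100& g() { return storage; }

// Hypothetical helper: a write through f() is visible through g().
bool returns_same_array() {
    f()[3] = 7;
    return g()[3] == 7 && sizeof(f()) == 100 * sizeof(int);
}
```

The typedef version is not a different mechanism; it just moves the brackets into a named type so the function declaration reads left to right.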
And that's it. The design for standard C++ puts array returns at the end of
the function. That's what led the old syntax to do the same thing. Since CLR
arrays have a way to be returned from functions, the restriction from the
part of the standard I quoted above was lifted.
Now, I'll be the first in line to say that array declaration in C++ is
beyond difficult. I have to question other people on the compiler team on
occasion just to figure out what some code is. I'll pose a few challenges
below that demonstrate the difficulty of either figuring out how to write a
particular piece of code or even reading it. That should show why we
chose the syntax that we did in the new design.
Hopefully, spending enough time understanding what kind of code you can
write with arrays in C++ will convince you how much more elegant the syntax
is that we chose for CLR arrays.
I'm going to cover this and other array topics in a forthcoming blog entry
(yes, I'm actually writing a new blog entry after a nine-month hiatus). As a
puzzle in the meantime, here's a challenge for anyone with enough time:
For each of the following, write the syntax for accomplishing this with
standard C++ arrays and for accomplishing it with CLR arrays:
1. An array of function pointers.
2. A function pointer taking an array argument by reference.
3. A function pointer returning a reference to an array.
As a reminder, a function pointer looks like this: void (*pf)();
And for the super challenge, what is this? (Hint: use typedefs)
void (*(& x(void (*(*)[10])(void (*(&)[10])())))[10])(void (*(&)[10])());
How would this last thing be expressed using CLR arrays?
Have fun!