Allocating memory for part of a C struct

Chris Saunders

Here is the declaration of a struct from WinIoCtl.h:

//
// Structures for FSCTL_TXFS_READ_BACKUP_INFORMATION
//

typedef struct _TXFS_READ_BACKUP_INFORMATION_OUT {
    union {

        //
        // Used to return the required buffer size if return code is
        // STATUS_BUFFER_OVERFLOW
        //

        DWORD BufferLength;

        //
        // On success the data is copied here.
        //

        BYTE Buffer[1];
    } DUMMYUNIONNAME;
} TXFS_READ_BACKUP_INFORMATION_OUT, *PTXFS_READ_BACKUP_INFORMATION_OUT;

Now Buffer can be various lengths and I know of only one way to allocate
memory for it (by allocating memory for the whole struct). I have seen
other similar structs that are not a union and have multiple arrays with a
similar declaration. Is there any way to allocate memory for such arrays?

Regards
Chris Saunders
 
Chris said:
Now Buffer can be various lengths and I know of only one way to allocate
memory for it (by allocating memory for the whole struct). I have seen
other similar structs that are not a union and have multiple arrays with a
similar declaration. Is there any way to allocate memory for such arrays?

Allocating memory is easy - you just use Marshal.AllocHGlobal or
stackalloc (though you have to do the size calculation yourself,
accounting for extra bytes, and mind the layout, particularly the
padding at the end).

The problem is working with such a thing. The .NET P/Invoke marshaller
doesn't understand VLA structs, so the best you can do is the same
trick as in C++: declare a "fixed" single-element array at the end
and then, after you allocate enough memory for the struct, treat it as
an N-element array. C# doesn't do any bounds checks on fixed arrays, so
this works just fine.
 
Pavel said:
Allocating memory is easy - you just use Marshal.AllocHGlobal or
stackalloc (though you have to do the size calculation yourself,
accounting for extra bytes, and mind the layout, particularly the
padding at the end).

The problem is working with such a thing. .NET P/Invoke marshaller
doesn't understand VLA structs, so the best you can get is the same
trick as in C++ - declare a "fixed" single-element array at the end,
and then, after you allocate enough memory for the struct, treat it as
N-element array. C# doesn't do any bound checks for fixed arrays, so
you can do it just fine.

Why the C# and p/invoke information in this group, and why use
Marshal.AllocHGlobal?

C++ interop is so much easier.
 
Ben Voigt said:
Why the C# and p/invoke information in this group, and why use
Marshal.AllocHGlobal?

I apologize - the groups come right after one another, and the question is
rather atypical for a C/C++ developer (they usually just use malloc/new), so
I got confused there.

Ben Voigt said:
C++ interop is so much easier.

In fact, re-reading the original post, I see now that it doesn't even
reference any kind of interop at all - not even C++/CLI. So it was probably
a generic C++ question, which means that it boils down to the usual
non-standard-but-who-cares trick:

TXFS_READ_BACKUP_INFORMATION_OUT* trbio =
    (TXFS_READ_BACKUP_INFORMATION_OUT*)malloc(
        sizeof(TXFS_READ_BACKUP_INFORMATION_OUT) + extra_bytes);
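
For readers who want the allocation spelled out, here is a minimal C sketch of the size calculation behind the malloc call above. It assumes a stand-in struct (backup_info_out, with uint32_t and unsigned char in place of DWORD and BYTE, and a union named u in place of DUMMYUNIONNAME) so that it builds without the Windows headers. backup_info_size and backup_info_alloc are hypothetical helper names, not part of any API. The sketch only shows how the block is sized; whether indexing Buffer past element 0 is sanctioned by the standard is the separate question argued in this thread.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */
#include <stdlib.h>   /* calloc, free */

/* Hypothetical stand-in for TXFS_READ_BACKUP_INFORMATION_OUT:
   uint32_t plays the role of DWORD, unsigned char the role of BYTE,
   and the union is named u instead of DUMMYUNIONNAME. */
typedef struct backup_info_out {
    union {
        uint32_t      BufferLength; /* required size on overflow */
        unsigned char Buffer[1];    /* payload starts here on success */
    } u;
} backup_info_out;

/* Bytes needed for a block whose payload holds payload_len bytes.
   Never less than sizeof(backup_info_out), so BufferLength always
   fits inside the block. */
static size_t backup_info_size(size_t payload_len)
{
    size_t need = offsetof(backup_info_out, u.Buffer) + payload_len;
    return need < sizeof(backup_info_out) ? sizeof(backup_info_out) : need;
}

/* Allocate one zeroed block covering the struct and its payload.
   The caller releases it with free(). Returns NULL on failure. */
static backup_info_out *backup_info_alloc(size_t payload_len)
{
    size_t total = backup_info_size(payload_len);
    assert(total >= sizeof(backup_info_out));
    return calloc(1, total);
}
```

Compared with sizeof(...) + extra_bytes, the offsetof form does not count the bytes already inside the struct a second time. Either form allocates enough memory.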
 
Pavel said:
I apologize - the groups come right after one another, and the question is
rather atypical for a C/C++ developer (they usually just use
malloc/new), so I got confused there.

In fact, re-reading the original post, I see now that it doesn't even
reference any kind of interop at all - not even C++/CLI.

True, but this is the C++/CLI group. So including some information on
interop might be useful (e.g. you can't use a buffer on the managed heap
unless you also pin it).

Pavel said:
So it was probably a generic C++ question, which means that it boils down
to the usual non-standard-but-who-cares trick:

TXFS_READ_BACKUP_INFORMATION_OUT* trbio =
    (TXFS_READ_BACKUP_INFORMATION_OUT*)malloc(
        sizeof(TXFS_READ_BACKUP_INFORMATION_OUT) + extra_bytes);

I don't see anything nonstandard about that. It's exactly what I would do,
except that in C++ I would use a cast keyword instead of the C-style cast.
 
Ben Voigt said:
True, but this is the C++/CLI group. So including some information on
interop might be useful (e.g. you can't use a buffer on the managed heap
unless you also pin it).

Well, there's always stackalloc... er, _malloca

Ben Voigt said:
I don't see anything nonstandard about that. It's exactly what I would
do, except that in C++ I would use a cast keyword instead of the C-style
cast.

It is non-standard in the sense that the ISO C++ standard prescribes undefined
behavior for this construct. That is precisely why C99 treated it specially,
in the form of flexible array members, to guarantee that this idiom was
actually blessed by the Standard.
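
For comparison, a minimal sketch of the C99 form mentioned above, in which the trailing array is declared with no size at all. The struct packet type and the packet_new helper are hypothetical names made up for illustration. Note that this is a C99 feature; ISO C++ does not include it, although some compilers accept it as an extension.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* memcpy */

/* Hypothetical message type, for illustration only. */
struct packet {
    size_t        len;     /* number of valid bytes in data */
    unsigned char data[];  /* C99 flexible array member, must come last */
};

/* Allocate a packet and copy len bytes from src into it.
   Returns NULL on allocation failure; the caller frees the result. */
static struct packet *packet_new(const void *src, size_t len)
{
    struct packet *p;

    assert(src != NULL || len == 0);

    /* sizeof *p does not include data[], so add the payload size. */
    p = malloc(sizeof *p + len);
    if (p == NULL)
        return NULL;

    p->len = len;
    if (len > 0)
        memcpy(p->data, src, len);
    return p;
}
```

With this declaration, indexing p->data[i] for any i below len stays inside what the C standard treats as the array, which is the guarantee the single-element version lacks.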
 
Pavel said:
Well, there's always stackalloc... er, _malloca

That wouldn't put a buffer on the managed heap.

Pavel said:
It is non-standard in the sense that the ISO C++ standard prescribes
undefined behavior for this construct. That is precisely why C99
treated it specially, in the form of flexible array members, to guarantee
that this idiom was actually blessed by the Standard.

Well, I'm pretty sure you are allowed to use the memory returned by malloc
as the TXFS_READ_BACKUP_INFORMATION_OUT structure, as long as the size is at
least sizeof TXFS_READ_BACKUP_INFORMATION_OUT. I suppose the issue is that
the compiler is not required to lay out the structure with the array at the
end, so you might be walking over a footer or some other field when you
reference using the array name? Explicitly referencing the memory
immediately past the structure should be ok by the standard as long as the
requirement that the extra space be allocated by the caller is stated,
right?
 
Ben Voigt said:
That wouldn't put a buffer on the managed heap.

Why would it need to be on the managed heap, if it's just used for
interop?

Ben Voigt said:
Well, I'm pretty sure you are allowed to use the memory returned by malloc
as the TXFS_READ_BACKUP_INFORMATION_OUT structure, as long as the size is at
least sizeof TXFS_READ_BACKUP_INFORMATION_OUT. I suppose the issue is that
the compiler is not required to lay out the structure with the array at the
end, so you might be walking over a footer or some other field when you
reference using the array name? Explicitly referencing the memory
immediately past the structure should be ok by the standard as long as the
requirement that the extra space be allocated by the caller is stated,
right?

Unfortunately, no. The Standard explicitly states that accessing an
array using an out-of-bounds index is U.B., without making any
exceptions (such as for the case where the accessed memory is allocated).
In fact, you don't even have to dereference to get U.B. - it already
happens at the pointer arithmetic stage. Recall that (p + i), where
p is a pointer and i is an index, is only valid pointer arithmetic when
you stay within the same object or one past its end (object here defined
as either the array or the individual value that the pointer is pointing
to - not the allocated memory block to which it belongs).
 
Pavel said:
Why would it need to be on the managed heap, if it's just used for
interop?

You presented it as a counter-example to my statement that you can't use a
buffer on the managed heap without pinning. Or that's how I understood it.

Pavel said:
Unfortunately, no. The Standard explicitly states that accessing an
array using an out-of-bounds index is U.B., without giving any
exceptions (such as the fact that memory accessed is allocated, etc).
In fact, you don't even have to dereference to get U.B. - it's already
happening at the pointer arithmetic stage. Recall that (p + i), where
p is pointer, and i is index, is only valid pointer arithmetics when
you're within the same object (object here defined as either array or
an individual value that the pointer is pointing to - not the
allocated memory block to which it belongs).

I disagree. You can cast back to the original allocated object type, which
is array-of-char, and then all indices within the originally requested size
are legal and well-defined, including those past the sub-range of memory
which has been used as an object of class type.
 
Ben Voigt said:
I disagree. You can cast back to the original allocated object type, which
is array-of-char, and then all indices within the originally requested size
are legal and well-defined, including those past the sub-range of memory
which has been used as an object of class type.

You can do that, because the Standard explicitly gives you a guarantee
that you can treat any POD as a char array via a cast. But we are
dealing with a different case here. Let me spell it out again for
convenience:

#include <cstddef>  // offsetof
#include <cstdlib>  // malloc, free

struct Foo {
    int fl;     // fixed-length part
    char vl[1]; // variable-length part
};

void demo() {
    Foo* foo = (Foo*)malloc(sizeof(Foo) + 9);
    foo->vl[0] = 123; // okay
    foo->vl[1] = 123; // U.B.
    *((char*)foo + offsetof(Foo, vl) + 1) = 123; // okay
    free(foo);
}

You cannot invoke the treat-object-as-char-array clause here, because
that's not what we're trying to do. Rather, we're indexing an array
(or, more precisely, doing pointer arithmetic on the pointer it decays
to) which is not itself "derived" from some object via a cast or union.
And, aside from those cases, the Standard is very explicit: any
out-of-bounds array access is U.B., and whether an adjacent memory
block is guaranteed to exist at the location you're trying to
dereference is absolutely irrelevant. This was discussed at length on
comp.std.c++ and comp.lang.c++.moderated several times, and the
conclusion of the language lawyers was invariably the same. The
original intent of the language designers is also known and consistent
with that analysis - have a look at this c.l.c++.m post:

http://groups.google.com/group/comp.lang.c++.moderated/msg/14e281c2c3ad97cc
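
As a small sketch of how the form marked "okay" above might be packaged, the following C version of Foo routes every access to the variable-length part through a pointer computed from the start of the allocated block, instead of indexing foo->vl directly. The helper names foo_new and foo_vl_at are hypothetical, and the sketch takes the analysis in this post as its assumption rather than settling the disagreement.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* calloc, free */

struct Foo {
    int  fl;     /* fixed-length part */
    char vl[1];  /* variable-length part */
};

/* Allocate a zeroed Foo with room for payload_len bytes starting at vl.
   Returns NULL on failure; the caller releases it with free(). */
static struct Foo *foo_new(size_t payload_len)
{
    size_t total;

    assert(payload_len >= 1);
    total = offsetof(struct Foo, vl) + payload_len;
    if (total < sizeof(struct Foo))
        total = sizeof(struct Foo);
    return calloc(1, total);
}

/* Address of the i-th payload byte, derived from the block's base
   address rather than from the one-element array vl. */
static char *foo_vl_at(struct Foo *foo, size_t i)
{
    return (char *)foo + offsetof(struct Foo, vl) + i;
}
```

The caller is still responsible for keeping i below the payload length passed to foo_new; nothing in the struct records it.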
 