Struct inside class

Sneil · May 30, 2006

Example:
namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }
    public class C
    {
        public S s;
    }
    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();
            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? From the stack?
            // Or from the heap?
        }
    }
}

I'm just not sure...

Barry Kelly · May 30, 2006

Sneil said:
Example:
namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }
    public class C
    {
        public S s;
    }
    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();

            // At this point, memory is allocated for the struct s inside the
            // new instance of C.

            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? From the stack?
            // Or from the heap?

The memory for the struct s in this example is part of (i.e. fully
contained within) the heap-allocated object myClass.

        }
    }
}

-- Barry
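
A minimal sketch of the point above, reusing S and C from the question. The wrapper class InlineStructDemo and its Run method are hypothetical names added only for illustration. It relies on the struct being embedded in each C instance: s is usable right after new C() without any separate allocation for S, and each instance carries its own copy.

```csharp
public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical helper, not part of the original thread.
public static class InlineStructDemo
{
    // Returns { i1 before assignment, i1 after assignment, i1 of a second instance }.
    public static int[] Run()
    {
        C myClass = new C();          // allocates the C object; s lives inside it
        int before = myClass.s.i1;    // fields of s start out zeroed, no "new S()" needed

        myClass.s.i1 = 999;           // writes into the memory of the C object itself
        int after = myClass.s.i1;

        C other = new C();            // a second object has its own embedded s
        return new[] { before, after, other.s.i1 };
    }
}
```

Calling InlineStructDemo.Run() from any Main method should give 0, 999 and 0.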

Sneil · May 30, 2006

Barry said:
The memory for the struct s in this example is part of (i.e. fully
contained within) the heap-allocated object myClass.

OK, I think the same. BUT! In other words, myClass.s.i1 is already on the
heap, yes? Now, what is boxing? Boxing is creating a special packaged
version of a value type on the heap. In our case i1 is ALREADY on the heap. So...
object o = myClass.s.i1; // _not_ boxing here?
???

Göran Andersson · May 30, 2006

Sneil said:
OK, I think the same. BUT! In other words, myClass.s.i1 is already on the
heap, yes? Now, what is boxing? Boxing is creating a special packaged
version of a value type on the heap. In our case i1 is ALREADY on the heap. So...
object o = myClass.s.i1; // _not_ boxing here?
???

Yes, it's boxed. You are not storing the myClass.s.i1 variable in the
object, you are storing a copy of the value of the myClass.s.i1 variable.

Jon Skeet [C# MVP] · May 31, 2006

Göran Andersson said:
Yes, it's boxed. You are not storing the myClass.s.i1 variable in the
object, you are storing a copy of the value of the myClass.s.i1 variable.

No, there's no boxing going on there. Boxing is creating a separate
object on the heap for a value type. That's not happening here.

Göran Andersson · May 31, 2006

Jon said:
No, there's no boxing going on there. Boxing is creating a separate
object on the heap for a value type. That's not happening here.

Yes, it is.

myClass.s.i1 is an integer variable. The value copied from that integer
variable can not be stored in the variable o, as it is an object
reference, so a separate object has to be created on the heap where the
value can be stored, and the reference to that object is stored in the
variable o.
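
A small sketch of that copy behaviour, reusing S and C from the question. BoxingCopyDemo and Run are hypothetical names. If o were somehow pointing at the field itself, a later write to the field would show through o; because boxing copies the value into a separate object, it does not.

```csharp
public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical helper, not part of the original thread.
public static class BoxingCopyDemo
{
    // Returns { value held in the box, current value of the field }.
    public static int[] Run()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        object o = myClass.s.i1;   // boxing: the value 999 is copied into a new object
        myClass.s.i1 = 5;          // changes the field, not the box

        return new[] { (int)o, myClass.s.i1 };
    }
}
```

BoxingCopyDemo.Run() should give 999 for the box and 5 for the field.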

Michael D. Ober · May 31, 2006

You're missing the point.

All instance data inside a class is stored on the heap, but within the
memory allocation for that object. The compiler computes the
instance data layout, just as it would for the local variables in a
procedure, and then references all the instance data relative to the instance
base address. This is very simple to code in machine assembler (MOV AX,
WORD PTR [DX], for example). Boxing is a runtime feature and is used only
when an object's type is unknown at compile time. When you put a struct
inside a class, the compiler knows the types at compile time, except when you
declare a structure component as a generic "Object".

Here's a simple set of rules for how storage is allocated:

Reference types, including anything derived from "Object": Heap pointer
from either a stack frame, global compiler-allocated storage, or another
object on the heap.
All value types, including "Struct": Stored on the stack, in global
compiler-allocated storage, or relative to the base address of an object on
the heap.

Where this gets confusing is when you use a value type as part of the
instance data of a reference type. In this case, the value type is stored
on the heap, but inside the allocated space for the reference type instance.
When you use a reference type as instance data of a reference type, all that
is allocated inside the containing reference type instance is a pointer to
the instance data. This complexity is why C and C++ programs tend to have
memory leaks and why .NET requires a good garbage collector.
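
The difference between the two kinds of instance data can be sketched as follows. The Node class, the extra field n, and the FieldKindsDemo wrapper are hypothetical additions for illustration; only S comes from the thread. Reading a struct field copies its contents, while reading a class-typed field copies only the reference.

```csharp
public struct S { public int i1, i2; }

// Hypothetical reference type used as instance data.
public class Node { public int value; }

public class C
{
    public S s;                    // stored inline in the C object
    public Node n = new Node();    // only a reference is stored in the C object
}

// Hypothetical helper, not part of the original thread.
public static class FieldKindsDemo
{
    // Returns { a.s.i1, a.n.value } after modifying a struct copy and a reference copy.
    public static int[] Run()
    {
        C a = new C();

        S sCopy = a.s;        // copies the struct's contents
        Node nAlias = a.n;    // copies the reference; both point at the same Node

        sCopy.i1 = 1;         // does not affect a.s
        nAlias.value = 2;     // visible through a.n

        return new[] { a.s.i1, a.n.value };
    }
}
```

FieldKindsDemo.Run() should give 0 for the struct field and 2 for the Node.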

Note that I referenced "global compiler allocated storage". This is the
memory allocated by the compiler for globally accessible public variables as
well as static (VB Shared) variables associated with objects. On the x86
platforms, this memory is referenced relative to the "DS" register.

In your original example,

namespace _111_
{
    public struct S
    {
        public int i1, i2;
    }
    public class C
    {
        public S s;
    }
    class Program
    {
        static void Main(string[] args)
        {
            C myClass = new C();
            myClass.s.i1 = 999;
            myClass.s.i2 = 888;
            // At this point some memory must have been assigned for
            // myClass.s.i1 and myClass.s.i2.
            // Question: where was this memory taken from? From the stack?
            // Or from the heap?
        }
    }
}

The class C is allocated from the heap. Since struct S is a value type, it
is stored entirely in the class. The named components of S are also value
types, so they are stored inside S:

Offset  Value  Code Reference  Comments
0000    i1                     base of class C, start of struct S; int i1 is
                               laid down first by the compiler
0004    i2                     int i2
0008                           next object starts here
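
As a rough cross-check of the 0/4/8 figures, the sketch below asks the runtime for the size and field offsets of S. LayoutDemo and Run are hypothetical names. Note the assumptions: Marshal reports the unmanaged (marshalled) layout, which for a simple struct of two ints is expected to match the in-memory layout, and the offsets are relative to the start of S, not to the start of the containing C object, which also holds the runtime's own per-object bookkeeping.

```csharp
using System.Runtime.InteropServices;

public struct S { public int i1, i2; }

// Hypothetical helper, not part of the original thread.
public static class LayoutDemo
{
    // Returns { size of S in bytes, offset of i1, offset of i2 }.
    public static int[] Run()
    {
        int size = Marshal.SizeOf<S>();
        int offsetI1 = Marshal.OffsetOf<S>("i1").ToInt32();
        int offsetI2 = Marshal.OffsetOf<S>("i2").ToInt32();
        return new[] { size, offsetI1, offsetI2 };
    }
}
```

With 4-byte ints, LayoutDemo.Run() should give 8, 0 and 4.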

In code:
// myClass = allocate(C)
MOV EAX, 8                  // C is 8 bytes long; GC_ALLOCATE will return
                            // the offset in EAX as well
CALL GC_ALLOCATE            // Have the garbage collector allocate 8 bytes
                            // of storage. The GC will add object overhead,
                            // but at negative offsets from the returned
                            // address.
MOV [ESP], EAX              // The variable myClass is on the stack at
                            // offset 0 relative to the current stack frame BP
MOV EAX, myClass            // myClass's address is actually stored on the
                            // stack; this instruction can be optimized out
                            // in this case
MOV DWORD PTR [EAX], 999    // myClass.s.i1 = 999
MOV DWORD PTR [EAX+4], 888  // myClass.s.i2 = 888

First - I don't guarantee the syntax (I haven't written in x86 assembler in
several years), but this is close enough for discussion.
Second - GC_ALLOCATE returns an offset into the heap. This is a "magic"
number that the memory management subsystem handles for you. It is relative
to the value in register DS, which is set at application startup by the
memory manager. As you should be able to see from the assembler code, there
is no "boxing" of the variables. Boxing requires a rather expensive call to
determine an object's actual data type.

All reference types are allocated in a similar manner to the above example.
Value types are allocated by the compiler and don't require the call to
GC_ALLOCATE at run time.

Note the magic occurring in "GC_ALLOCATE" - it allocates memory from the
heap and returns an offset into the heap that is then used by later code for
reference to the memory. The actual object size will be 8 bytes plus the
garbage collector management buffer. The GC management buffer will be at
negative offsets to the returned address, thus making the rest of the
compiler easier to write. If GC_ALLOCATE can't allocate the requested
memory, it calls GC_COLLECT, which compacts accessible heap memory and
resets the allocation pointer for GC_ALLOCATE, which then tries again. If
GC_ALLOCATE still can't allocate memory, it asks the OS to extend the heap.
The OS returns the new heap size to GC_ALLOCATE. If GC_ALLOCATE still can't
allocate the requested memory, it throws an Out of Memory exception.

Mike Ober.

Göran Andersson · May 31, 2006

No, you are missing the point.

This is the code in question:

object o = myClass.s.i1;

From the previous discussion, we know that i1 is a member variable of a
struct in a class, so it's stored on the heap.

The question was if boxing occurs or not, considering that the i1
variable is stored on the heap.

The answer is that boxing does occur, because it's not the i1 variable
that is stored in the object, but the value from the i1 variable. It
doesn't matter where the i1 variable is stored.

Carl Daniel [VC++ MVP] · May 31, 2006

Göran Andersson said:
No, you are missing the point.

This is the code in question:

object o = myClass.s.i1;

From the previous discussion, we know that i1 is a member variable of
a struct in a class, so it's stored on the heap.

The question was if boxing occurs or not, considering that the i1
variable is stored on the heap.

The answer is that boxing does occur, because it's not the i1 variable
that is stored in the object, but the value from the i1 variable. It
doesn't matter where the i1 variable is stored.

Absolutely correct. o is a reference to a boxed int that contains the same
value as the variable myClass.s.i1.

The only way to not incur boxing in a case like this is to hold an "interior
pointer" to myClass.s.i1. An ordinary object reference is not an interior
pointer - it always references a complete object.

-cd
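
In current C# one way to hold such a reference into the middle of an object is a ref local. This is an assumption about what is meant by "interior pointer" here, and the feature (C# 7.0 and later) postdates this thread. RefLocalDemo and Run are hypothetical names; S and C are from the question. No box is created: the ref local aliases the field itself.

```csharp
public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical helper, not part of the original thread.
public static class RefLocalDemo
{
    // Returns the value of the field after writing through the ref local.
    public static int Run()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        ref int alias = ref myClass.s.i1;   // refers to the field inside the C object
        alias = 5;                          // writes straight into myClass.s.i1

        return myClass.s.i1;
    }
}
```

RefLocalDemo.Run() should give 5, which is the opposite of what happens with a boxed copy.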

Jon Skeet [C# MVP] · May 31, 2006

Göran Andersson said:
Yes, it is.

myClass.s.i1 is an integer variable. The value copied from that integer
variable can not be stored in the variable o, as it is an object
reference

Ah, yes, I missed the storage in the variable o. Yes, this involves
boxing. The struct within a class bit is entirely irrelevant.

Sneil · May 31, 2006

WOW! That's an amazing answer, almost a little article.

Many thanks for it!

Michael D. Ober said:
It is relative to the value in register DS, which is set at
application startup by the memory manager. As you should be able to
see from the assembler code, there is no "boxing" of the variables.
Boxing requires a rather expensive call to determine an object's
actual data type.

Yes, you are absolutely right: up to this point there is no boxing. But my
second question was:
......
myClass.s.i1 = 999;
myClass.s.i2 = 888;
object o = myClass.s.i1; //<< is there boxing _here_?
Now I am sure that there IS. I can see it in Reflector:
......
L_001d: ldc.i4 888
L_0022: stfld int32 _111_.S::i2
L_0027: ldloc.0
L_0028: ldflda _111_.S _111_.C::s
L_002d: ldfld int32 _111_.S::i1
L_0032: box int32
L_0037: stloc.1
......
So, in spite of the fact that i1 is already on the heap, it cannot be stored
directly in the variable o, so boxing is inevitable.

Michael D. Ober · May 31, 2006

In the OP's original source, it isn't obvious that the OP is referring to a
generic object variable. In this case, boxing must be done. The runtime cannot
operate on a generic object variable without the boxing. I suspect that the
concrete object will still be laid out in the same manner by the compiler, but
that the compiler will have to generate additional calls into the memory
manager to box and unbox every reference to the object "o". That's why I
copied the OP's original sample.

Mike.

Michael D. Ober · Jun 1, 2006

I missed the second question as I couldn't get a clean download until this
morning. Sorry about that. As you have already discovered, boxing requires
additional code and an additional call into the memory manager. There will
be analogous code on the outbound side of the box as well. Boxing not only
takes additional code, but it also takes additional memory since the runtime
must store the metadata for the variable as well.

Mike.
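
The "outbound side" can be sketched like this. UnboxDemo and its two methods are hypothetical names. Unboxing is the explicit cast back to the value type, and the runtime checks the stored type: a boxed int unboxes to int, but a direct cast to long fails with InvalidCastException even though int converts to long implicitly when it is not boxed.

```csharp
using System;

// Hypothetical helper, not part of the original thread.
public static class UnboxDemo
{
    // Unboxes an object known to hold an int.
    public static int UnboxInt(object o)
    {
        return (int)o;
    }

    // Reports whether the object can be unboxed directly as a long.
    public static bool CanUnboxAsLong(object o)
    {
        try
        {
            long value = (long)o;   // succeeds only if o holds a boxed long
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }
}
```

For example, with object o = 999, UnboxInt(o) should return 999 and CanUnboxAsLong(o) should return false.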

Michael D. Ober · Jun 1, 2006

Actually, the struct inside a class is relevant. The compiler uses this
information to generate the metadata required by the boxing.

Mike.

Göran Andersson · Jun 1, 2006

No, it's not. It's just an integer value that is stored in the boxing
object. Where the value came from originally is totally irrelevant for
how the boxing is done.

If you compare these two statements:

object o = myClass.s.i1;

and

object p = 999;

The values that will be stored in the boxing objects will be identical,
and the boxing will be performed in exactly the same way.
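
A short sketch of that comparison, reusing S and C from the question. SameBoxDemo and Run are hypothetical names. Both boxes report the same runtime type and compare equal by value, yet they are two separate objects.

```csharp
public struct S { public int i1, i2; }
public class C { public S s; }

// Hypothetical helper, not part of the original thread.
public static class SameBoxDemo
{
    // Returns { both boxes are System.Int32, boxes are equal by value, boxes are the same object }.
    public static bool[] Run()
    {
        C myClass = new C();
        myClass.s.i1 = 999;

        object o = myClass.s.i1;   // boxes the field's value
        object p = 999;            // boxes a literal

        return new[]
        {
            o.GetType() == typeof(int) && p.GetType() == typeof(int),
            o.Equals(p),
            object.ReferenceEquals(o, p)
        };
    }
}
```

SameBoxDemo.Run() should give true, true and false.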

Michael D. Ober · Jun 2, 2006

The relevance comes from the compiler itself having to know which data type
metadata to feed to the boxing routine. In the case of the struct, the
compiler must know the structure's contained datatypes or it can't box. In
the second case, the compiler also determines the datatype to give the
boxing routines. You are correct that it's not relevant at runtime, but it
is relevant at compile time.

Mike Ober.

Jon Skeet [C# MVP] · Jun 3, 2006

Michael D. Ober said:
Actually, the struct inside a class is relevant. The compiler uses this
information to generate the metadata required by the boxing.

The fact that the value originally came from inside a class is
irrelevant. The boxing just creates a boxed System.Int32, regardless of
the origin of the value. The type of the value and the evaluated value
are the only important things.

Jon Skeet [C# MVP] · Jun 3, 2006

Michael D. Ober said:
The relevance comes from the compiler itself having to know which data type
metadata to feed to the boxing routine. In the case of the struct, the
compiler must know the structure's contained datatypes or it can't box. In
the second case, the compiler also determines the datatype to give the
boxing routines. You are correct that it's not relevant at runtime, but it
is relevant at compile time.

Yes, it has to know the type - but that's true whatever you're doing.
Boxing a value from inside a struct which is inside a class is exactly
the same as boxing a value of the same type which is evaluated in a
different way.

The compiler is able to traverse the expression to work out the type
required, but that's orthogonal to boxing.

Michael D. Ober · Jun 3, 2006

Agreed.

Mike.

Jon Skeet said:
The fact that the value originally came from inside a class is
irrelevant. The boxing just creates a boxed System.Int32, regardless of
the origin of the value. The type of the value and the evaluated value
are the only important things.
