Why is the bool type 4 bytes? It's a waste of memory.

Tony Johansson · May 28, 2010

Hi!

Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

//Tony

Peter Duniho · May 28, 2010

Tony said:
Hi!

Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

For a variety of reasons. I would guess that the most important is that
there's no compelling reason for it _not_ to be 4 bytes, and making it 4
bytes keeps it consistent with other data types.

While 4 bytes obviously takes up more room than 1 byte, the fact is that
large arrays of booleans aren't really all that common, and there's
already a 1 byte type anyway (it's called 'byte'…duh). On the other
hand, people generally do care about efficient access to booleans, as
well as preserving atomic semantics in multi-threaded scenarios.

On current hardware architectures, it's more efficient to move 32 bits
around than just 8. And because of that, there's the possibility that
writing to a 1-byte boolean would require first reading 4 bytes,
modifying a single byte within those 4 bytes, and then writing the 4
bytes back. Obviously that's harder for .NET to make atomic than just
writing to a single 32-bit word, and of course the efficiency aspect
alone is a decent enough reason.

When you think about it, you might as well ask why a boolean isn't just
1 _bit_ in size. The issues are actually quite similar.

Pete

Patrice · May 28, 2010

Hi,

I mean, it's just a waste of memory.

As with most design decisions, this is a tradeoff: you usually buy memory
at the price of speed, or speed at the price of memory.

Here they likely considered how data are aligned in memory. From
http://en.wikipedia.org/wiki/Data_structure_alignment :

"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance due to
the way the CPU handles memory. To align the data, it may be necessary to
insert some meaningless bytes between the end of the last data structure and
the start of the next, which is data structure padding."

If needed, there are specialized classes that let you save some space,
likely at the cost of speed, such as
http://msdn.microsoft.com/en-us/library/system.collections.bitarray.aspx or
http://msdn.microsoft.com/en-us/library/system.collections.specialized.bitvector32_members.aspx.
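
A minimal sketch of the BitArray option, assuming a simple set of 32 flags. The class name BitArrayDemo and the helper MakeFlags are hypothetical and exist only for illustration; BitArray itself stores one bit per flag.

```csharp
using System;
using System.Collections;

public static class BitArrayDemo
{
    // Hypothetical helper: builds 'count' flags and sets every third one.
    public static BitArray MakeFlags(int count)
    {
        BitArray flags = new BitArray(count); // all bits start out false
        for (int i = 0; i < count; i += 3)
        {
            flags[i] = true;
        }
        return flags;
    }

    public static void Main()
    {
        BitArray flags = MakeFlags(32);
        Console.WriteLine(flags[0]);     // True
        Console.WriteLine(flags[1]);     // False
        Console.WriteLine(flags.Length); // 32
    }
}
```

Each indexer access does a shift and a mask internally, which is the speed cost mentioned above.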

Arne Vajhøj · Jun 4, 2010

Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

Somebody at Microsoft made a decision.

One potential explanation could be that BOOL in traditional
Win32 API programming is typedef'ed to int (4 bytes), so making
bool 4 bytes will make it more Win32 compatible.

It uses more space, but huge bool arrays are very rare.

Arne

Arne Vajhøj · Jun 4, 2010

On current hardware architectures, it's more efficient to move 32 bits
around than just 8. And because of that, there's the possibility that
writing to a 1-byte boolean would require first reading 4 bytes,
modifying a single byte within those 4 bytes, and then writing the 4
bytes back. Obviously that's harder for .NET to make atomic than just
writing to a single 32-bit word, and of course the efficiency aspect
alone is a decent enough reason.

But the old x86 and x86-64 architectures are actually killing the
modern IA-64 architecture.

When you think about it, you might as well ask why a boolean isn't just
1 _bit_ in size. The issues are actually quite similar.

In Pascal, a packed array of boolean usually uses a single bit per boolean.

Arne

Arne Vajhøj · Jun 4, 2010

As with most design decisions, this is a tradeoff: you usually buy memory
at the price of speed, or speed at the price of memory.

Here they likely considered how data are aligned in memory. From
http://en.wikipedia.org/wiki/Data_structure_alignment :

"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance due to
the way the CPU handles memory. To align the data, it may be necessary to
insert some meaningless bytes between the end of the last data structure and
the start of the next, which is data structure padding."

That is not a sufficient argument:

             1-byte bool (incl. padding)   4-byte bool (incl. padding)
1 boolean    4 bytes                       4 bytes
4 booleans   4 bytes                       16 bytes

Arne

Peter Duniho · Jun 4, 2010

Arne said:
But the old x86 and x86-64 architectures are actually killing the
modern IA-64 architecture.

Itanium seems to be on its way out, yes.

Word-sized memory access is more critical on Itanium (maybe one reason
people don't use it as much), but even though no exception occurs on
x86 for unaligned data access, that doesn't mean there's no advantage to
doing so. And if committed to writing more than one byte at once, the
issue of non-atomic byte-sized writes becomes an issue.

For performance reasons, it might well make sense to avoid having to
decide between the in-CPU overhead and the out-of-CPU overhead and just
create a data size large enough that neither matters.

In Pascal, a packed array of boolean usually uses a single bit per boolean.

If you specify "PACKED", yes...the data winds up packed.

And? This is C#, not Pascal. And there's a reason C# doesn't have
bitfields nor packed bit-array data structures. The same reason applies
to not making booleans 1 bit wide as to not making booleans 1 byte wide.

Whatever that reason is.

Pete

Patrice · Jun 4, 2010

Hello,

             1-byte bool (incl. padding)   4-byte bool (incl. padding)
1 boolean    4 bytes                       4 bytes
4 booleans   4 bytes                       16 bytes

But then those booleans are not all aligned any more in memory while they
are still aligned with the more expensive layout.

Anyway, my main point was to tell the OP that he has other options if
he needs to save space, likely at the price of speed (even going down to
one bit per boolean), since knowing why it is done this way is unlikely
to change anything.

Arne Vajhøj · Jun 6, 2010

Itanium seems to be on its way out, yes.

Word-sized memory access is more critical on Itanium (maybe one reason
people don't use it as much), but even though no exception occurs on
x86 for unaligned data access, that doesn't mean there's no advantage to
doing so. And if committed to writing more than one byte at once, the
issue of non-atomic byte-sized writes becomes an issue.

Neither solution involves any unaligned data access.

There is no reason to expect 4 bytes to be faster than 1 byte.

I ran some tests on several systems:

x86, x86-64 and Power : approx. same speed
Alpha and IA-64 : 4 byte faster for write operations

Unless one is using .NET on IA-64, the performance
benefit does not exist.

If you specify "PACKED", yes...the data winds up packed.

And? This is C#, not Pascal. And there's a reason C# doesn't have
bitfields nor packed bit-array data structures. The same reason applies
to not making booleans 1 bit wide as to not making booleans 1 byte wide.

Whatever that reason is.

Not really.

Packing to bits will have a real performance impact.
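
A minimal sketch of where that cost comes from, assuming flags packed into a 32-bit word. The PackedBits class and its method names are hypothetical, used only for illustration: every read needs a shift and a mask, and every write is a read-modify-write of the whole word.

```csharp
using System;

public static class PackedBits
{
    // Reads flag 'index' (0..31) from a 32-bit word: one shift plus one mask.
    public static bool Get(uint word, int index)
    {
        return (word & (1u << index)) != 0;
    }

    // Returns 'word' with flag 'index' set to 'value'.
    // The caller has to read the old word, modify it, and store the result.
    public static uint Set(uint word, int index, bool value)
    {
        uint mask = 1u << index;
        return value ? (word | mask) : (word & ~mask);
    }

    public static void Main()
    {
        uint word = 0;
        word = Set(word, 5, true);
        Console.WriteLine(Get(word, 5)); // True
        Console.WriteLine(Get(word, 4)); // False
    }
}
```

A whole-byte or whole-word bool needs neither the mask nor the read-back, which is why the two cases differ.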

Arne

Arne Vajhøj · Jun 6, 2010

But then those booleans are not all aligned any more in memory while they
are still aligned with the more expensive layout.

They are naturally aligned in both cases.

Anyway, my main point was to tell the OP that he has other options if
he needs to save space, likely at the price of speed (even going down to
one bit per boolean), since knowing why it is done this way is unlikely
to change anything.

Unless MS was interested in Itanium (which they actually may have been
back in 2001-2002, when stuff like this was decided!), performance
could not justify the decision.

Arne

Rudy Velthuis · Jun 15, 2010

Sorry, I can't see the original message, but (at least on my 32-bit
Windows) sizeof(bool) is 1, so I wonder what this discussion is about. <g>

--
Rudy Velthuis http://rvelthuis.de

"Most of you are familiar with the virtues of a programmer.
There are three, of course: laziness, impatience, and hubris."
-- Larry Wall

Rudy Velthuis · Jun 15, 2010

Patrice said:
Hello,

But then those booleans are not all aligned any more

Natural alignment means that a type is aligned on a multiple of its own
size (in bytes), so bytes are always naturally aligned, by definition.

Rudy Velthuis · Jun 15, 2010

Arne said:
Somebody at Microsoft made a decision.

Er... I just checked, and

Console.WriteLine(sizeof(bool));

printed 1 for me. On 32 bit Windows. I don't quite understand what the
fuss is all about. <g>

Of course, if the bool is part of an aligned struct, the padding bytes
may make the offset of the next member (say, an Int32)
<offset of boolean> + 4. But a bool itself is only 1 byte in size,
AFAICT. If the next member is a double, the padding can even be 7
bytes, but that does not make the bool 8 bytes in size.

IOW, this probably has a size of 16 bytes:

struct Foo
{
    bool b;
    double d;
}
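
A minimal sketch for checking that estimate. The LayoutDemo class is hypothetical, and the fields are made public only so the example is easy to inspect. Note the assumption: Marshal.SizeOf and Marshal.OffsetOf report the unmanaged (marshalled) layout, where the bool is marshalled as 4 bytes rather than 1, but the 8-byte alignment of the double pushes d to the same offset either way on typical x86/x64 runtimes.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Foo
{
    public bool b;
    public double d;
}

public static class LayoutDemo
{
    // Size of Foo in its marshalled layout, padding included.
    public static int FooSize()
    {
        return Marshal.SizeOf(typeof(Foo));
    }

    // Byte offset of the double field within the marshalled layout.
    public static int OffsetOfD()
    {
        return Marshal.OffsetOf(typeof(Foo), "d").ToInt32();
    }

    public static void Main()
    {
        Console.WriteLine(FooSize());   // 16
        Console.WriteLine(OffsetOfD()); // 8
    }
}
```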

Tom Shelton · Jun 15, 2010

Rudy Velthuis wrote on 6/14/2010 :

Er... I just checked, and

Console.WriteLine(sizeof(bool));

printed 1 for me. On 32 bit Windows. I don't quite understand what the
fuss is all about. <g>

Of course, if the bool is part of an aligned struct, the padding bytes
may make the offset of the next member (say, an Int32)
<offset of boolean> + 4. But a bool itself is only 1 byte in size,
AFAICT. If the next member is a double, the padding can even be 7
bytes, but that does not make the bool 8 bytes in size.

IOW, this probably has a size of 16 bytes:

struct Foo
{
    bool b;
    double d;
}

sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller converts
a bool to 4 bytes when passed to native code...
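
A minimal sketch of the two measurements side by side. The BoolSizeDemo class and its method names are hypothetical; sizeof and Marshal.SizeOf are the calls discussed above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BoolSizeDemo
{
    // Managed size: what the runtime uses for a bool field or array element.
    public static int ManagedSize()
    {
        return sizeof(bool);
    }

    // Marshalled size: what a bool occupies by default when passed to native code.
    public static int MarshalledSize()
    {
        return Marshal.SizeOf(typeof(bool));
    }

    public static void Main()
    {
        Console.WriteLine(ManagedSize());    // 1
        Console.WriteLine(MarshalledSize()); // 4
    }
}
```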

Arne Vajhøj · Jun 15, 2010

Sorry, I can't see the original message, but (at least on my 32-bit
Windows) sizeof(bool) is 1, so I wonder what this discussion is about. <g>

Good question.

I guess the best answer is: Tony's book !

:-)

Arne

Arne Vajhøj · Jun 15, 2010

Natural alignment means that a type is aligned on a multiple of its own
size (in bytes), so bytes are always naturally aligned, by definition.

We already covered that part a week ago.

Arne

Arne Vajhøj · Jun 15, 2010

Rudy Velthuis wrote on 6/14/2010 :

sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller converts
a bool to 4 bytes when passed to native code...

But that means that BOOL in Win32 C is 4 bytes, not that bool
in C# is 4 bytes.

But it may be the background for Tony's book.

Arne

Rudy Velthuis · Jun 15, 2010

Tom said:
sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller
converts a bool to 4 bytes when passed to native code...

Ah, that's a different case. If you push a boolean on the stack, as a
function parameter, it will take up 4 bytes, indeed (in a 32 bit
context).

Rudy Velthuis · Jun 15, 2010

Arne said:
We already covered that part a week ago.

I missed that, sorry.

Arne Vajhøj · Jun 16, 2010

I missed that, sorry.

Your main point was still a very good catch!

Arne
