Why is the bool type 4 bytes? It's a waste of memory.

  • Thread starter: Tony Johansson

Tony Johansson

Hi!

Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

//Tony
 
Tony said:
Hi!

Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

For a variety of reasons. I would guess that the most important is that
there's no compelling reason for it _not_ to be 4 bytes, and making it 4
bytes keeps it consistent with other data types.

While 4 bytes obviously takes up more room than 1 byte, the fact is that
large arrays of booleans aren't really all that common, and there's
already a 1 byte type anyway (it's called 'byte'…duh :) ). On the other
hand, people generally do care about efficient access to booleans, as
well as preserving atomic semantics in multi-threaded scenarios.

On current hardware architectures, it's more efficient to move 32 bits
around than just 8. And because of that, there's the possibility that
writing to a 1-byte boolean would require first reading 4 bytes,
modifying a single byte within those 4 bytes, and then writing the 4
bytes back. Obviously that's harder for .NET to make atomic than just
writing to a single 32-bit word, and of course the efficiency aspect
alone is a decent enough reason.

When you think about it, you might as well ask why a boolean isn't just
1 _bit_ in size. The issues are actually quite similar.

Pete
 
Hi,
I mean, it's just a waste of memory.

As with most design decisions, this is a tradeoff: you usually buy memory
at the price of speed, or speed at the price of memory.

Here they likely considered how data are aligned in memory. From
http://en.wikipedia.org/wiki/Data_structure_alignment :

"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance due to
the way the CPU handles memory. To align the data, it may be necessary to
insert some meaningless bytes between the end of the last data structure and
the start of the next, which is data structure padding."

If needed, there are specialized classes that let you save space, likely
at the cost of some speed, such as
http://msdn.microsoft.com/en-us/library/system.collections.bitarray.aspx or
http://msdn.microsoft.com/en-us/library/system.collections.specialized.bitvector32_members.aspx.
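As a rough illustration of the space-saving option mentioned above, here is a minimal sketch using System.Collections.BitArray. The class name PackedFlags and the helper Build are made up for this example, and the byte count at the end only reflects the int[] the bits are copied into, not a measurement of the BitArray object itself.

```csharp
using System;
using System.Collections;

public static class PackedFlags
{
    // Hypothetical helper: builds 'count' flags with every third flag set.
    public static BitArray Build(int count)
    {
        var flags = new BitArray(count); // all bits start out false
        for (int i = 0; i < count; i += 3)
            flags[i] = true;
        return flags;
    }

    public static void Main()
    {
        BitArray flags = Build(1000);
        Console.WriteLine(flags.Length); // 1000
        Console.WriteLine(flags[3]);     // True
        Console.WriteLine(flags[4]);     // False

        // Copy the bits into an int[]: 1000 bits fit into 32 ints.
        var backing = new int[(flags.Length + 31) / 32];
        flags.CopyTo(backing, 0);
        Console.WriteLine(backing.Length * sizeof(int)); // 128
    }
}
```

A bool[] of the same length would hold 1000 one-byte elements, so the packed form is roughly eight times smaller, at the cost of a shift and mask on every access.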
 
Does anyone have a good explanation for why the designers of .NET
made a bool 4 bytes?
I mean, it's just a waste of memory.

Somebody at Microsoft made a decision.

One potential explanation could be that BOOL in traditional
Win32 API programming is typedef'ed to int (4 bytes), so making
bool 4 bytes will make it more Win32 compatible.

It uses more space, but huge bool arrays are very rare.

Arne
 
On current hardware architectures, it's more efficient to move 32 bits
around than just 8. And because of that, there's the possibility that
writing to a 1-byte boolean would require first reading 4 bytes,
modifying a single byte within those 4 bytes, and then writing the 4
bytes back. Obviously that's harder for .NET to make atomic than just
writing to a single 32-bit word, and of course the efficiency aspect
alone is a decent enough reason.

But the old architectures, x86 and x86-64, are actually killing the
modern architecture, IA-64.

When you think about it, you might as well ask why a boolean isn't just
1 _bit_ in size. The issues are actually quite similar.

In Pascal, a packed array of boolean usually uses a single bit per boolean.

Arne
 
As with most design decisions, this is a tradeoff: you usually buy memory
at the price of speed, or speed at the price of memory.

Here they likely considered how data are aligned in memory. From
http://en.wikipedia.org/wiki/Data_structure_alignment :

"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance due to
the way the CPU handles memory. To align the data, it may be necessary to
insert some meaningless bytes between the end of the last data structure and
the start of the next, which is data structure padding."

That is not a sufficient argument:

             1 byte incl. padding    4 bytes incl. padding
1 boolean    4 bytes                 4 bytes
4 booleans   4 bytes                 16 bytes

Arne
 
Arne said:
But the old architectures, x86 and x86-64, are actually killing the
modern architecture, IA-64.

Itanium seems to be on its way out, yes.

Word-sized memory access is more critical on Itanium (maybe one reason
people don't use it as much :) ), but even though no exception occurs on
x86 for unaligned data access, that doesn't mean there's no advantage to
aligned access. And once you're committed to writing more than one byte at
a time, non-atomic byte-sized writes become an issue.

For performance reasons, it might well make sense to avoid having to
decide between the in-CPU overhead and the out-of-CPU overhead and just
create a data size large enough that neither matters.

In Pascal, a packed array of boolean usually uses a single bit per boolean.

If you specify "PACKED", yes...the data winds up packed.

And? This is C#, not Pascal. And there's a reason C# has neither
bitfields nor packed bit-array data structures. The same reason applies
to not making booleans 1 bit wide as to not making booleans 1 byte wide.

Whatever that reason is. ;)

Pete
 
Hello,
             1 byte incl. padding    4 bytes incl. padding
1 boolean    4 bytes                 4 bytes
4 booleans   4 bytes                 16 bytes

But then those booleans are not all aligned any more in memory while they
are still aligned with the more expensive layout.

Anyway, my main point was rather to tell the OP that he has other options if
he needs to save space (knowing why it is done this way is unlikely to
change anything), likely at the price of speed (even going down to one bit
per boolean).
 
Itanium seems to be on its way out, yes.

Word-sized memory access is more critical on Itanium (maybe one reason
people don't use it as much :) ), but even though no exception occurs on
x86 for unaligned data access, that doesn't mean there's no advantage to
aligned access. And once you're committed to writing more than one byte at
a time, non-atomic byte-sized writes become an issue.

Neither solution involves any unaligned data access.

There is no reason to expect 4 bytes to be faster than 1 byte.

I ran some tests on a few systems:

x86, x86-64 and Power : approx. same speed
Alpha and IA-64 : 4 bytes faster for write operations

Unless one is using .NET on IA-64, the performance
benefit does not exist.

If you specify "PACKED", yes...the data winds up packed.

And? This is C#, not Pascal. And there's a reason C# has neither
bitfields nor packed bit-array data structures. The same reason applies
to not making booleans 1 bit wide as to not making booleans 1 byte wide.

Whatever that reason is. ;)

Not really.

Packing to bits will have a real performance impact.

Arne
 
But then those booleans are not all aligned any more in memory while they
are still aligned with the more expensive layout.

They are naturally aligned in both cases.

Anyway, my main point was rather to tell the OP that he has other options if
he needs to save space (knowing why it is done this way is unlikely to
change anything), likely at the price of speed (even going down to one bit
per boolean).

Unless MS was interested in Itanium (which they actually may have been
back in 2001-2002, when stuff like this was decided!), performance
could not justify the decision.

Arne
 
Sorry, I can't see the original message, but (at least on my Windows
32bit) sizeof(bool) is 1, so I wonder what this discussion is about. <g>

--
Rudy Velthuis http://rvelthuis.de

"Most of you are familiar with the virtues of a programmer.
There are three, of course: laziness, impatience, and hubris."
-- Larry Wall
 
Patrice said:
Hello,


But then those booleans are not all aligned any more

Natural alignment means that a type is aligned on a multiple of its own
size (in bytes), so bytes are always naturally aligned, by definition.
 
Arne said:
Somebody at Microsoft made a decision.

Er... I just checked, and

Console.WriteLine(sizeof(bool));

printed 1 for me. On 32 bit Windows. I don't quite understand what the
fuss is all about. <g>

Of course, if the bool is part of an aligned struct, the padding bytes
may make the offset of the next member (say, an Int32)
<offset of boolean> + 4. But a bool itself is only 1 byte in size,
AFAICT. If the next member is a double, the padding can even be 7
bytes, but that does not make the bool 8 bytes in size.

IOW, this probably has a size of 16 bytes:

struct Foo
{
    bool b;
    double d;
}
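
One way to check the padding claim above is sketched below. It assumes a runtime where System.Runtime.CompilerServices.Unsafe is available (built into recent .NET versions; older frameworks need a separate package), and the class name FooLayout is made up for the example. The fields are made public only so the sketch compiles without warnings. On common runtimes the struct size is expected to come out as 16, but the exact layout is up to the runtime.

```csharp
using System;
using System.Runtime.CompilerServices;

public struct Foo
{
    public bool b;
    public double d;
}

public static class FooLayout
{
    // Sum of the field sizes with no padding: 1 + 8.
    public static int SumOfFields() => sizeof(bool) + sizeof(double);

    // Size the runtime actually reserves for one Foo, padding included.
    public static int StructSize() => Unsafe.SizeOf<Foo>();

    public static void Main()
    {
        Console.WriteLine(SumOfFields()); // 9
        Console.WriteLine(StructSize());  // larger than 9 because of padding
    }
}
```

The difference between the two numbers is padding that belongs to the struct, not to the bool, which stays 1 byte.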
 
Rudy Velthuis wrote on 6/14/2010 :
Er... I just checked, and

Console.WriteLine(sizeof(bool));

printed 1 for me. On 32 bit Windows. I don't quite understand what the
fuss is all about. <g>

Of course, if the bool is part of an aligned struct, the padding bytes
may make the offset of the next member (say, an Int32)
<offset of boolean> + 4. But a bool itself is only 1 byte in size,
AFAICT. If the next member is a double, the padding can even be 7
bytes, but that does not make the bool 8 bytes in size.

IOW, this probably has a size of 16 bytes:

struct Foo
{
    bool b;
    double d;
}

sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller converts
a bool to 4 bytes when passed to native code...
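
The two numbers can be seen side by side with a minimal sketch like the following. The class name BoolSizes and its two helper methods are made up for the example; sizeof and Marshal.SizeOf are the calls discussed above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BoolSizes
{
    // Size of a bool as the runtime stores it in managed memory.
    public static int ManagedSize() => sizeof(bool);

    // Size the interop marshaller uses by default when a bool
    // is handed to native code (it is treated like a Win32 BOOL).
    public static int MarshalledSize() => Marshal.SizeOf(typeof(bool));

    public static void Main()
    {
        Console.WriteLine(ManagedSize());    // 1
        Console.WriteLine(MarshalledSize()); // 4
    }
}
```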
 
Sorry, I can't see the original message, but (at least on my Windows
32bit) sizeof(bool) is 1, so I wonder what this discussion is about. <g>

Good question.

I guess the best answer is: Tony's book !

:-)

Arne
 
Natural alignment means that a type is aligned on a multiple of its own
size (in bytes), so bytes are always naturally aligned, by definition.

We already covered that part a week ago.

Arne
 
Rudy Velthuis wrote on 6/14/2010 :

sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller converts
a bool to 4 bytes when passed to native code...

But that means that BOOL in Win32 C is 4 bytes, not that bool
in C# is 4 bytes.

But it may be the background for Tony's book.

Arne
 
Tom said:
sizeof represents the .net runtime size. It will return 1.
Marshal.SizeOf (typeof(bool)) will return 4, as the marshaller
converts a bool to 4 bytes when passed to native code...

Ah, that's a different case. If you push a boolean on the stack, as a
function parameter, it will take up 4 bytes, indeed (in a 32 bit
context).
 