Collections od KEYS only : BOOLEAN VS OBJECT, False Vs Null/Nothing

pamela fluente · Oct 1, 2007

In a previous question I asked if there were collections
similar to SortedList or Dictionary which would hold KEYS only, No
values.

It seems that the framework does not have such "reduced" collections,
but one can use
the standard one, adding as value a Null/Nothing value or a boolean
(of creating a collection
by inheritance).

I am wondering what is most advisable to use as type for the values:
Boolean or Object ?

For instance:
C#
SortedList<MyKey, bool> MyListB = new SortedList<MyKey, bool>();
MyListB.add(new MyKey(), false);
or
SortedList<MyKey, object> MyListO = new SortedList<MyKey, object>();
MyListO.add(new MyKey(), null);

VB
dim MyListB as new SortedList(of MyKey, Boolean)
MyListB.add(New MyKey, false)
or
dim MyListO as new SortedList(of MyKey, Object)
MyListO.add(New MyKey, Nothing)

I am wondering if using a value type (boolean) instead of a reference
type
would be more inefficient ? Your opinion ?

-P

=?ISO-8859-1?Q?G=F6ran_Andersson?= · Oct 1, 2007

pamela said:
In a previous question I asked if there were collections
similar to SortedList or Dictionary which would hold KEYS only, No
values.

It seems that the framework does not have such "reduced" collections,
but one can use
the standard one, adding as value a Null/Nothing value or a boolean
(of creating a collection
by inheritance).

I am wondering what is most advisable to use as type for the values:
Boolean or Object ?

For instance:
C#
SortedList<MyKey, bool> MyListB = new SortedList<MyKey, bool>();
MyListB.add(new MyKey(), false);
or
SortedList<MyKey, object> MyListO = new SortedList<MyKey, object>();
MyListO.add(new MyKey(), null);

VB
dim MyListB as new SortedList(of MyKey, Boolean)
MyListB.add(New MyKey, false)
or
dim MyListO as new SortedList(of MyKey, Object)
MyListO.add(New MyKey, Nothing)

I am wondering if using a value type (boolean) instead of a reference
type
would be more inefficient ? Your opinion ?

-P

It doesn't really matter, as a reference itself is a value type.

The garbage collector, however, scans all references that exist when
it's doing a garbage collection. As the references will all be null, it
won't take long to ignore them, but if you don't use references the
garbage collector doesn't have to be bothered with them at all.

pamela fluente · Oct 1, 2007

It doesn't really matter, as a reference itself is a value type.

The garbage collector, however, scans all references that exist when
it's doing a garbage collection. As the references will all be null, it
won't take long to ignore them, but if you don't use references the
garbage collector doesn't have to be bothered with them at all.

--
Göran Andersson
_____http://www.guffa.com- Nascondi testo tra virgolette -

- Mostra testo tra virgolette -

Ok, so as to the GC, you are saying, if I use boolean I will save it
some little work. Right ?
Actually these list can be pretty long, so it's not a bad idea to cut
references.

But what about memory use ? In any case both collection vould have a
list
of Values. Right ? Now, is it possible that all Nothing would waste
less memory than a tons of FALSE values ?

Or do the occupy the same ? And what about the Add(Key, Value) or
other operations, would it make any difference
to deal with a Value instead of a Reference ?

-P

=?ISO-8859-1?Q?G=F6ran_Andersson?= · Oct 1, 2007

pamela said:
Ok, so as to the GC, you are saying, if I use boolean I will save it
some little work. Right ?
Right.

Actually these list can be pretty long, so it's not a bad idea to cut
references.

But what about memory use ? In any case both collection vould have a
list
of Values. Right ? Now, is it possible that all Nothing would waste
less memory than a tons of FALSE values ?

On a 32 bit system it will be the same. A reference is four bytes, and a
boolean uses one byte but will require four bytes in the KeyValuePair
structure.

Or do the occupy the same ? And what about the Add(Key, Value) or
other operations, would it make any difference
to deal with a Value instead of a Reference ?

No difference at all. A reference is treated just as an integer, until
the point where you actually dereference it to reach the object that
it's pointing to.

pamela fluente · Oct 1, 2007

On a 32 bit system it will be the same. A reference is four bytes, and a
boolean uses one byte but will require four bytes in the KeyValuePair
structure.

No difference at all. A reference is treated just as an integer, until
the point where you actually dereference it to reach the object that
it's pointing to.

Thank you Goran. Now It's much more clear.

I remain with a last doubt. Please, do not laugh at me :-)

)

I had "imagined" a key value pair like something like that:

Key , Reference ---------> Object

Now I was thinking that:

Key, EmptyReference (4 bytes, for the empty reference)
Key, Reference -----> BooleanValue (4bytes + 1 byte )

where the second would involve 1 Byte more (the boolean value).
So it would be convenient to use an empty reference !

Why is this naive representation of mine wrong and instead the 2 above
involve exactly the same memory occupation ?

Is perhaps because in the second case there is no reference to a
boolean and directly

Key, BooleanValue ? (1 byte, for the value)

But if this is so we would have a gain using boolean.

I am confused !! :-)

So what is actually the correct
representation ?

-P

Muj Beg · Oct 1, 2007

By the way, just wanted to mention: If you want a collection of "keys" only,
and not key/value pairs, why not just use the appropiate list class, such
as:

List<string> mykeys.

Thanks,
MB

Jon Skeet [C# MVP] · Oct 1, 2007

Muj Beg said:
By the way, just wanted to mention: If you want a collection of "keys" only,
and not key/value pairs, why not just use the appropiate list class, such
as:

List<string> mykeys.

Because checking whether a list contains an element or not is an O(n)
operation instead of an O(1) operation.

pamela fluente · Oct 2, 2007

Because checking whether a list contains an element or not is an O(n)
operation instead of an O(1) operation.

Yes and also because we are talking here about sorted list and
dictionaries.

=?ISO-8859-1?Q?G=F6ran_Andersson?= · Oct 2, 2007

pamela said:
Thank you Goran. Now It's much more clear.

I remain with a last doubt. Please, do not laugh at me )

I had "imagined" a key value pair like something like that:

Key , Reference ---------> Object

Now I was thinking that:

Key, EmptyReference (4 bytes, for the empty reference)
Key, Reference -----> BooleanValue (4bytes + 1 byte )

where the second would involve 1 Byte more (the boolean value).
So it would be convenient to use an empty reference !

Why is this naive representation of mine wrong and instead the 2 above
involve exactly the same memory occupation ?

Is perhaps because in the second case there is no reference to a
boolean and directly

Key, BooleanValue ? (1 byte, for the value)

But if this is so we would have a gain using boolean.

I am confused !! So what is actually the correct
representation ?

-P

The KeyValuePair structure will contain the value itself, not a
reference to the value. When the value is a reference type, the
reference is stored, but when the value is a value type, the value
itself is stored.

A boolean variable uses one byte, but in a structure the data will be
padded to an even multiple of four bytes, so the boolean still requires
four bytes in the structure.

pamela fluente · Oct 2, 2007

The KeyValuePair structure will contain the value itself, not a
reference to the value. When the value is a reference type, the
reference is stored, but when the value is a value type, the value
itself is stored.

A boolean variable uses one byte, but in a structure the data will be
padded to an even multiple of four bytes, so the boolean still requires
four bytes in the structure.

--
Göran Andersson
_____http://www.guffa.com- Nascondi testo tra virgolette -

- Mostra testo tra virgolette -

Ah, I understand now. Very clear and useful.

Thanks Göran

-P

Muj Beg · Oct 4, 2007

Ah, I see.

Although one could do list.Sort() (once), followed by list.BinarySearch() to
gain seom of the same advantages.

pamela fluente · Oct 4, 2007

Ah, I see.

Although one could do list.Sort() (once), followed by list.BinarySearch() to
gain seom of the same advantages.

Actually, I do not think you can replace, in most cases a sorted list
with a list + Ordering
without a huge performace dropdown. I mean in the cases where a sorted
list is actually needed.

For instance, you cannot be sorting after any add...

-P

Jon Skeet [C# MVP] · Oct 4, 2007

Muj Beg said:
Ah, I see.

Although one could do list.Sort() (once), followed by list.BinarySearch() to
gain seom of the same advantages.

True. That would give O(log(N)) performance instead of O(N) performance
- but with a decently chosen hashcode, a hashtable type of structure
will be much quicker for large sets.

Collections od KEYS only : BOOLEAN VS OBJECT, False Vs Null/Nothing

pamela fluente

=?ISO-8859-1?Q?G=F6ran_Andersson?=

pamela fluente

=?ISO-8859-1?Q?G=F6ran_Andersson?=

pamela fluente

Muj Beg

Jon Skeet [C# MVP]

pamela fluente

=?ISO-8859-1?Q?G=F6ran_Andersson?=

pamela fluente

Muj Beg

pamela fluente

Jon Skeet [C# MVP]