Why doesn't .NET framework has linked list and treemap?

  • Thread starter Thread starter Altramagnus
  • Start date Start date
I always assumed it is because the garbage collector can handle big arrays a
lot better than objects linked through references. You always have the start
of the list in an older generation than the end, which degrades collection
performance, and if the GC decides to compact the heap, many more references
have to be adjusted (assuming a doubly-linked-list).

Niki
 
Why doesn't .NET framework Collections does not have linked list and

See this comment by Julian Bucknall:

http://www.boyet.com/FixedArticles/EZDSL.html

"To be honest, it needs rewriting completely. When I wrote EZDSL, I
used the best knowledge I had at the time, but it's starting to look
long in the tooth.

* The linked lists won't get converted: using an ArrayList or a TList
is a much better and more efficient solution."

Just use an ArrayList in C# - it should give you all you need, really.

Marc
 
Marc Scheuner said:
See this comment by Julian Bucknall:

http://www.boyet.com/FixedArticles/EZDSL.html

"To be honest, it needs rewriting completely. When I wrote EZDSL, I
used the best knowledge I had at the time, but it's starting to look
long in the tooth.

* The linked lists won't get converted: using an ArrayList or a TList
is a much better and more efficient solution."

Just use an ArrayList in C# - it should give you all you need, really.

Well, unless you need a list which is efficient for
insertions/removals/enumeration, but don't need fast random access. The
times when it's worth using a linked list are few, IME, but they do
exist.
 
Jon Skeet said:
Well, unless you need a list which is efficient for
insertions/removals/enumeration, but don't need fast random access. The
times when it's worth using a linked list are few, IME, but they do
exist.

Really? If I remove an element of a list and throw it away, the GC will have
to move (in average) half of the list's elements around when compacting the
heap. So where's the advantage in comparison to the ArrayList?
Next: Insertion; Well, it is pretty fast to insert an element in the middle
of a linked list, but then you have a GC gen 0 object with references to and
from gen 1/2 objects, i.e. the GC will always have to do scan through all
generations. I'd guess the ArrayList beats the linked list here, too.
Last point: enumeration. Let's say I want to call a method on all objects in
an ArrayList; I need to resolve one pointer per object (the one pointing to
the object), and due to the memory layout of the array, the number of cache
misses is quite low. For enumerating a linked list, on the other hand, I
have to resolve two pointers (one to the object, one to the next list item)
and the chances for cache misses is quite high (at least if items are
inserted in the middle of the list - and that was the whole reason to use
the list at all).

Did I miss anything?

Niki
 
Niki Estner said:
Really? If I remove an element of a list and throw it away, the GC will have
to move (in average) half of the list's elements around when compacting the
heap. So where's the advantage in comparison to the ArrayList?

That's assuming the GC isn't smart - it could just put the last node
where the "middle" (removed) node previous was. If you're going to
assume the GC does things in a stupid way, then chances are that most
of the generation will have to be moved on every compaction, which I
suspect isn't the case.
Next: Insertion; Well, it is pretty fast to insert an element in the middle
of a linked list, but then you have a GC gen 0 object with references to and
from gen 1/2 objects, i.e. the GC will always have to do scan through all
generations. I'd guess the ArrayList beats the linked list here, too.

GC will certainly be slower that first time, until the next GC when the
gen0 object will be promoted to gen1. Is that definitely going to be
slower than doing a large copy? For instance, take the code of:

ArrayList list = new ArrayList();
for (int i=0; i < 100000; i++)
{
list.Insert(0, new object());
}

That has to copy the contents of the list every time. With a linked
list, you just need to create a new node and link it to the head. When
the next GC of gen0 occurs, there'll be a lot to be bumped up to gen1,
but that needn't be slower than the bunch of copies involved in the
above. I'd want to see benchmarks to be sure, of course, but I suspect
that using an ArrayList would be slower than using a LinkedList in the
above case. (Obviously just putting the elements efficiently into the
ArrayList to start with would be better, but it's not always as simple
as the above.)


Here's a very quick example program. Obviously the linked list class
doesn't do an awful lot - like allow actual access to the items! - but
I believe it doesn't miss out on anything which is vital to the test.
Note that changing the 0 in the ArrayList.Insert call to list.Count
makes it basically instantaneous, so it's definitely the copying which
takes the time. On my box, the ArrayList takes about 7 seconds, the
LinkedList takes about 0.03 seconds. Note the call to GC.Collect() in
both cases to take into account the problem you mentioned.

using System;
using System.Collections;

class Test
{
static void Main()
{
{
DateTime start = DateTime.Now;
ArrayList list = new ArrayList();
for (int i=0; i < 100000; i++)
{
list.Insert(0, new object());
}
GC.Collect();
DateTime end = DateTime.Now;
Console.WriteLine (end-start);
GC.KeepAlive(list);
}

{
DateTime start = DateTime.Now;
LinkedList list = new LinkedList();
for (int i=0; i < 100000; i++)
{
list.Insert(new object());
}
GC.Collect();
DateTime end = DateTime.Now;
Console.WriteLine (end-start);
GC.KeepAlive(list);
}
}
}

class LinkedList
{
class Node
{
internal object value;
internal Node next;
}

Node head=null;

internal void Insert(object value)
{
Node node = new Node();
node.value=value;
node.next = head;
head = node;
}
}
Last point: enumeration. Let's say I want to call a method on all objects in
an ArrayList; I need to resolve one pointer per object (the one pointing to
the object), and due to the memory layout of the array, the number of cache
misses is quite low. For enumerating a linked list, on the other hand, I
have to resolve two pointers (one to the object, one to the next list item)
and the chances for cache misses is quite high (at least if items are
inserted in the middle of the list - and that was the whole reason to use
the list at all).

I wasn't trying to say that enumeration of a linked list was faster -
it was more that if you want frequent random access you *certainly*
don't want a linked list.
 
Altramagnus,
In addition to the other comments.

System.Collections.Specialized.ListDictionary is a singly linked list.

System.Collections.Specialized.HybridDictionary will start out using
ListDictionary then reverts to using a HashTable.

Hope this helps
Jay
 
Back
Top