Collection sorted on last accessed object

  • Thread starter: Vani Murarka

Vani Murarka

Hi Everyone,

Does .NET offer any collection class that tracks which objects were
last *accessed*, so that I can build a least-recently-used cache that
kills off objects that haven't been used for a while?

Or is there any other way to implement this kind of a cache /
collection where one can do this kind of cleanup based on
least-recently-used objects?

Java has a collection class called LinkedHashSet which enables one to
do this. Is there something similar in .NET - or some other way of
doing this?

Any pointers will be really appreciated.

Thanks a ton

Regards

Vani
 
Just move it to the front on access and kill those at the back. Any
collection class that allows the move and the kill will do for this, like
ArrayList.
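
A minimal sketch of the move-to-front idea, assuming a fixed maximum size. MoveToFrontList and its members are hypothetical names made up for illustration; only ArrayList itself comes from the framework. Note that every step here is a linear-time operation on the list.

```csharp
using System;
using System.Collections;

// Hypothetical helper, not part of the .NET class library.
public class MoveToFrontList
{
    private readonly ArrayList items = new ArrayList();
    private readonly int capacity;

    public MoveToFrontList(int capacity) { this.capacity = capacity; }

    public int Count { get { return items.Count; } }

    // Record an access: the item becomes the most recently used one.
    public void Touch(object item)
    {
        int index = items.IndexOf(item);      // linear search
        if (index >= 0)
            items.RemoveAt(index);            // shifts the tail down by one
        items.Insert(0, item);                // shifts every element up by one
        if (items.Count > capacity)
            items.RemoveAt(items.Count - 1);  // kill the one at the back
    }

    // Both properties throw if the list is empty.
    public object MostRecent { get { return items[0]; } }
    public object LeastRecent { get { return items[items.Count - 1]; } }
}
```

Calling Touch on every read and every write keeps the front of the list "hot" and lets the back fall off once the capacity is exceeded.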
 
at said:
Just move it to the front on access and kill those at the back. Any
collection class that allows the move and the kill will do for this, like
ArrayList.

You may run into performance problems; ArrayList really doesn't perform
that well on anything but appending.

You could look into using a splay tree data structure. It fits your
requirements pretty nicely and is really easy to implement.
 
That interests and surprises me, I have not measured the ArrayList's
performance for moving elements around but could you provide some links to
confirm your statement? I do not contradict your statement but I would like
some confirmation.

Then, an ArrayList comes standard, meaning less code to write, and as long
as its performance is OK, why not stick with it? One can always change to
another container as the need arises; get the process itself up and
running first and then see what the performance is.
 
at said:
That interests and surprises me, I have not measured the ArrayList's
performance for moving elements around but could you provide some links to
confirm your statement? I do not contradict your statement but I would like
some confirmation.

Output of attached source (measured execution time since start):

00:00:00 mutation from end...
00:00:00.0100144 mutation from end... done
00:00:00.0100144 mutation from start...
00:00:12.3777984 mutation from start... done

It's not really surprising, since lists implemented as arrays have to copy
the tail of the list when inserting/removing.

at said:
Then, an ArrayList comes standard, meaning less code to write, and as long
as its performance is OK, why not stick with it? One can always change to
another container as the need arises; get the process itself up and
running first and then see what the performance is.

Yes, the initial implementation can easily be done using ArrayList, and
if the profiler shows a performance problem there you can
re-implement... but I have implemented caching in the past, and
arrays really aren't a good data structure for it.

BTW: How are you going to search the cache? If it gets moderately large
you should probably have a separate index on it, by hashing for example.

If my memory serves me right wrt. LinkedHashSet (Java), there is no way to
rearrange the ordering, moving most-used elements to the front, so it
really isn't very good for caching either.

--
Helge Jensen
-=> Sebastian cover-music: http://ungdomshus.nu <=-

using System;
using System.Collections;

namespace ArrayListPerformance
{
    class ArrayListPerfomanceTest
    {
        public static void Main()
        {
            int count = 100000;
            DateTime start = DateTime.UtcNow;

            // Insert at the end: no existing elements have to be shifted.
            IList l = new ArrayList();
            Console.WriteLine("{0} mutation from end...", DateTime.UtcNow-start);
            for (int i = 0; i < count; ++i)
                l.Insert(i, i);
            Console.WriteLine("{0} mutation from end... done", DateTime.UtcNow-start);

            // Insert at the start: all existing elements are shifted every time.
            l = new ArrayList();
            Console.WriteLine("{0} mutation from start...", DateTime.UtcNow-start);
            for (int i = 0; i < count; ++i)
                l.Insert(0, i);
            Console.WriteLine("{0} mutation from start... done", DateTime.UtcNow-start);
        }
    }
}
 
I thought ArrayList was backed by a doubly linked list; I guess I was wrong.
If it is implemented using fixed-size arrays, you are completely right.

Whatever the data structure, as long as it has the operations that a doubly
linked list has (implemented as such or as some tree flavour), the one most
in front is the most recently accessed one, the next one is the one accessed
before that, and so on. From the other end it works the same way, hence the
requirement for doubly linked list semantics.

I am not considering random access here, just access starting from head and
starting from tail and step from there.

Well, thanks anyway, for pointing out the ArrayList inefficiency.
 
But I tried the following. I measured:

ArrayList al = new ArrayList();
for(int i = 0; i < 100000; i++)
{
    al.Add(new TestItem(i));
}

TestItem ti;
int j = 99999;
Console.WriteLine("{0} starting turn around", DateTime.UtcNow);
// Take the element at index j and move it to the front, 100000 times.
for(int i = 0; i < 100000; i++)
{
    ti = (TestItem)al[j];
    al.RemoveAt(j);
    al.Insert(0, ti);
    j--;
}
Console.WriteLine("{0} turn around finished", DateTime.UtcNow);

and

public class TestItem
{
    private int m;

    public TestItem(int i)
    {
        m = i;
    }
}

With the following result:

3/26/2005 4:18:20 PM starting turn around
3/26/2005 4:18:59 PM turn around finished

That is about 0.0004 seconds per move. I'd say that is better than fast
enough; at least it is sufficiently fast that, if moving an element to the
front is all that is needed, I would initially just use an ArrayList.

Regards,

At
 
It's backed by an Array :)

Based on experimental evidence, the array is reallocated when full. I'm
guessing it reallocates by multiplying the current size by some factor
(O(n) amortized for n inserts at the end).
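
One way to check that guess is to watch ArrayList.Capacity while appending. This is only a sketch: CapacityProbe and CountReallocations are hypothetical names, and the code deliberately assumes nothing about the actual growth factor, which is an implementation detail of the framework version in use.

```csharp
using System;
using System.Collections;

// Hypothetical probe, not part of the .NET class library.
public static class CapacityProbe
{
    // Returns how many times the backing array changed size
    // while appending `count` items one by one.
    public static int CountReallocations(int count)
    {
        ArrayList list = new ArrayList();
        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; ++i)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                // Capacity changed, so a larger array was allocated
                // and the existing elements were copied into it.
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }
}
```

If the capacity grows geometrically, the count returned for 100000 appends stays small (on the order of log n), which is what makes appending cheap on average even though each individual reallocation copies the whole array.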

Didn't you mention a cache? What do you base the lookup on?

Well, it may or may not be a problem... at least you know why it could be
slow.
 
at said:
That is about 0.0004 seconds per move. I'd say that is better than fast
enough; at least it is sufficiently fast that, if moving an element to the
front is all that is needed, I would initially just use an ArrayList.

Good for you.

The expected cost of random remove/insert operations would be
O((n/2)^2), that is, quadratic in the number of cached items.

If the cache is smaller than 10k this may be an acceptable delay for
you, especially if the cached calculation is very expensive or lookups
are infrequent.

Of course you can always change to a "better" implementation later.
 
If the important element is expiration of older items, rather than
sorting by access, you can use the Cache object from
System.Web.Caching. It supports timed expirations, both fixed and
sliding, as well as callbacks fired on removed items.

I don't know what the associated overhead is, but I hope that helps.

Good luck.

~ Jeff
 
Hi Everyone,

Thanks for all the responses.

Firstly, the Java class I had intended to mention is the
LinkedHashMap, because I do need to keep key-value pairs and refer to
them for my operations apart from the clean-up task of removing items
which have not been accessed for long.

Regarding using System.Web.Caching -
This is a small server component (say S1) that I am making, which is
to be used by several server business-logic components (A, B, C etc.)
exposed via a web service, which in turn will be called by the main
web application ASPX pages.

A, B, C components will all be calling S1, which remains commonly
available (S1 will be a Singleton - a non-instantiable class with all
methods static). The web application is not supposed to be aware of S1,
nor is S1 to be aware of the web pages of the web application.

In such a scenario, I guess System.Web.Caching cannot be used - right?

What I am doing at present is -
Keeping the information in a private static DataTable of S1. First
column is the primary key column by which I normally have to get
information. Second column is LastAccessed - which is updated with
current server time every time that item is read, inserted or
updated.

For the normal operations of S1, the DataTable will be used by its
first column which is being defined as the primary key.

For the clean-up task to remove items from the DataTable, I am
thinking of keeping another thread running, which will, from time to
time, look for items (filter via DataView) where (current time -
LastAccessed) is greater than a set timeout value and delete those
items.
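
The same timestamp-and-sweep idea can be sketched without a DataTable, assuming an in-memory dictionary guarded by a lock. ExpiringCache, Put, Get and RemoveExpired are hypothetical names chosen for illustration; the current time is passed in explicitly so the behaviour is easy to check, and the periodic caller (a timer or background thread) is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; class and member names are illustrative only.
public class ExpiringCache
{
    private class Entry
    {
        public object Value;
        public DateTime LastAccessed;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly object sync = new object();
    private readonly TimeSpan timeout;

    public ExpiringCache(TimeSpan timeout) { this.timeout = timeout; }

    // Insert or update: both count as an access.
    public void Put(string key, object value, DateTime now)
    {
        lock (sync)
        {
            Entry e = new Entry();
            e.Value = value;
            e.LastAccessed = now;
            entries[key] = e;
        }
    }

    // Returns null when the key is absent; a hit refreshes the timestamp.
    public object Get(string key, DateTime now)
    {
        lock (sync)
        {
            Entry e;
            if (!entries.TryGetValue(key, out e))
                return null;
            e.LastAccessed = now;
            return e.Value;
        }
    }

    // Meant to be called from time to time by the clean-up thread.
    // Returns the number of entries that were removed.
    public int RemoveExpired(DateTime now)
    {
        lock (sync)
        {
            // Collect keys first: a dictionary cannot be modified while enumerating it.
            List<string> stale = new List<string>();
            foreach (KeyValuePair<string, Entry> pair in entries)
                if (now - pair.Value.LastAccessed > timeout)
                    stale.Add(pair.Key);
            foreach (string key in stale)
                entries.Remove(key);
            return stale.Count;
        }
    }
}
```

The sweep is linear in the number of entries, which is usually acceptable when it only runs occasionally; the lock is there because the clean-up thread and the callers touch the same dictionary.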

Is there any better way to do this?

ArrayList will not be appropriate I think because moving items around
might be an overhead and because I need to access items by that key
value.

Regards

Vani
 
Vani said:
A, B, C components will all be calling S1, which remains commonly
available (S1 will be a Singleton - a non-instantiable class with all
methods static).

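That's not the Singleton pattern from the GoF book.

The singleton from GoF would be:

public class S1 {
    // The one static member: the shared instance.
    public static readonly S1 Instance = new S1();
    // Non-public constructor, so callers cannot create further instances.
    protected S1() {}
    // Everything else is an ordinary instance method (pseudocode signature):
    public T f(...);
}

Only one static, the "Instance" member.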

Vani said:
For the clean-up task to remove items from the DataTable, I am
thinking of keeping another thread running, which will, from time to
time, look for items (filter via DataView) where (current time -
LastAccessed) is greater than a set timeout value and delete those
items.

Usually caching means keeping N instances around, not just removing
"too-old" ones.

Vani said:
Is there any better way to do this?

Dunno... why don't you keep the cache in memory?

Vani said:
ArrayList will not be appropriate I think because moving items around
might be an overhead and because I need to access items by that key
value.

You could use two data structures: an IDictionary for the key-based
lookup and another where the objects are sorted by their
last-accessed property (this is also what a database would do for
multiple-indexed data).
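
A minimal sketch of that two-structure approach, assuming .NET 2.0 or later so that the generic Dictionary and LinkedList classes are available. LruCache, TryGet and Put are hypothetical names; the dictionary maps each key to its list node, and the list keeps the nodes ordered from most to least recently used.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not a framework class.
public class LruCache<TKey, TValue>
{
    private readonly int capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> index
        = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order
        = new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCache(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        this.capacity = capacity;
    }

    public int Count { get { return index.Count; } }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (!index.TryGetValue(key, out node))
        {
            value = default(TValue);
            return false;
        }
        // Move the node to the front; both steps are constant time.
        order.Remove(node);
        order.AddFirst(node);
        value = node.Value.Value;
        return true;
    }

    public void Put(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (index.TryGetValue(key, out node))
        {
            order.Remove(node);               // replacing an existing entry
        }
        else if (index.Count >= capacity)
        {
            // Full: evict the least recently used entry from the back.
            LinkedListNode<KeyValuePair<TKey, TValue>> oldest = order.Last;
            order.RemoveLast();
            index.Remove(oldest.Value.Key);
        }
        node = new LinkedListNode<KeyValuePair<TKey, TValue>>(
            new KeyValuePair<TKey, TValue>(key, value));
        order.AddFirst(node);
        index[key] = node;
    }
}
```

Lookup, move-to-front and eviction are all constant time here, at the cost of one extra node object per entry. The sketch is not thread-safe; a shared server component would need a lock around TryGet and Put.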

You could also do something like in the attached file, using a bit more
memory (3 refs per node) to get a runtime-efficient implementation, but if
you can live with having the cache data in a database-like thingy you
probably don't need that.

--
Helge Jensen

using System;
using System.Collections;

namespace cache
{
public class LinkedValueDictionary: IDictionary
{
protected IDictionary backend;
protected Value first;
protected Value last;
public class Value
{
public readonly object V;
public readonly object K;
public Value Previous;
public Value Next;
public Value(object k, object v, Value prev, Value next)
{
this.K = k;
this.V = v;
this.Previous = prev;
this.Next = next;
}
}
public LinkedValueDictionary(IDictionary backend) { this.backend = backend; }

#region IDictionary Members
public bool IsReadOnly { get { return backend.IsReadOnly; } }

class DictionaryEnumerator: IDictionaryEnumerator
{
IDictionaryEnumerator it;
public DictionaryEnumerator(IDictionaryEnumerator it) { this.it = it; }
#region IDictionaryEnumerator Members
public object Key { get { return it.Key; } }
public object Value { get { return ((Value)it.Value).V; } }
public DictionaryEntry Entry { get { return new DictionaryEntry(Key, Value); } }
#endregion
#region IEnumerator Members
public void Reset() { it.Reset(); }
public object Current { get { return Entry; } }
public bool MoveNext() { return it.MoveNext(); }
#endregion
}

public IDictionaryEnumerator GetEnumerator() { return new DictionaryEnumerator(backend.GetEnumerator()); }

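protected void BringToFront(Value v)
{
// Unlink v (callers guarantee v is not already first), fixing both neighbours.
v.Previous.Next = v.Next;
if ( v.Next == null )
last = v.Previous;
else
v.Next.Previous = v.Previous;
// Relink v as the new head of the list.
v.Previous = null;
v.Next = first;
first.Previous = v;
first = v;
}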

public virtual object this[object key]
{
get
{
Value v = (Value)backend[key];
if ( v.Previous != null )
BringToFront(v);
return v.V;
}
set
{
if ( Contains(key) )
Remove(key);
Value v = new Value(key, value, null, first);
if ( first == null )
{
first = v;
last = v;
}
else
first.Previous = v;
first = v;
// Index the new node; without this the backend never sees the entry.
backend[key] = v;
}
}

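public void Remove(object key)
{
Value v = (Value)backend[key];
if ( v == null )
return; // assumes the backend returns null for unknown keys, as Hashtable does
// Unlink the node from the list, then drop it from the index.
if ( v.Next == null )
last = v.Previous;
else
v.Next.Previous = v.Previous;
if ( v.Previous == null )
first = v.Next;
else
v.Previous.Next = v.Next;
backend.Remove(key);
}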

public bool Contains(object key) { return backend.Contains(key); }

public void Clear()
{
backend.Clear();
first = null;
last = null;
}


class ValueCollection: ICollection
{
public readonly LinkedValueDictionary LVD;
public ValueCollection(LinkedValueDictionary lvd) { this.LVD = lvd; }
#region ICollection Members
public bool IsSynchronized { get { return LVD.backend.Values.IsSynchronized; } }
public int Count { get { return LVD.backend.Values.Count; } }
public void CopyTo(Array array, int index)
{
// Copy the stored values (not the list nodes), starting at index itself.
foreach ( Value v in LVD.backend.Values )
array.SetValue(v.V, index++);
}

public object SyncRoot { get { return LVD.backend.Values.SyncRoot; } }
#endregion

#region IEnumerable Members
class Enumerator: IEnumerator
{
IEnumerator it;
public Enumerator(IEnumerator it) { this.it = it; }
#region IEnumerator Members
public void Reset() { it.Reset(); }
public object Current { get { return ((Value)it.Current).V; } }
public bool MoveNext() { return it.MoveNext(); }
#endregion
}

// Enumerate the backend's values; going through LVD.Values here would recurse forever.
public IEnumerator GetEnumerator() { return new Enumerator(LVD.backend.Values.GetEnumerator()); }
#endregion
}
public ICollection Values { get { return new ValueCollection(this); } }

public void Add(object key, object value)
{
if ( Contains(key) )
throw new ArgumentException("Already contains key");
else
this[key] = value;
}
public ICollection Keys { get { return backend.Keys; } }
public bool IsFixedSize { get { return backend.IsFixedSize; } }
#endregion
#region ICollection Members
public bool IsSynchronized { get { return backend.IsSynchronized; } }
public int Count { get { return backend.Count; } }
public void CopyTo(Array array, int index)
{
foreach ( DictionaryEntry e in this )
array.SetValue(e, index++);
}

public object SyncRoot { get { return backend.SyncRoot; } }
#endregion
#region IEnumerable Members
IEnumerator System.Collections.IEnumerable.GetEnumerator() { return new DictionaryEnumerator(backend.GetEnumerator()); }
#endregion

public class LinkedCollection: ICollection
{
public readonly LinkedValueDictionary LVD;
public LinkedCollection(LinkedValueDictionary lvd) { this.LVD = lvd; }
#region ICollection Members
public bool IsSynchronized { get { return LVD.IsSynchronized; } }
public int Count { get { return LVD.Count; } }
public void CopyTo(Array array, int index) { LVD.Values.CopyTo(array, index); }
public object SyncRoot { get { return LVD.SyncRoot; } }
#endregion
#region IEnumerable Members
abstract public class Enumerator: IEnumerator
{
public readonly LinkedValueDictionary LVD;
public IEnumerator it; // Used to guarantee exception on mutation
public Value current;
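protected Enumerator(LinkedValueDictionary lvd)
{
this.LVD = lvd;
// The backend enumerator must exist before Reset() touches it.
this.it = lvd.backend.GetEnumerator();
Reset();
}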
#region IEnumerator Members
public void Reset()
{
it.Reset();
current = null;
}

public object Current
{
get
{
if ( current == null )
throw new IndexOutOfRangeException();
else
return current;
}
}

protected abstract void Next();
public bool MoveNext()
{
bool hasnext = it.MoveNext();
if ( hasnext )
Next();
return hasnext;
}
#endregion
}
public class Forward: Enumerator
{
public Forward(LinkedValueDictionary lvd): base(lvd) {}
protected override void Next()
{
if ( current == null )
current = LVD.first;
else
current = current.Next;
}
}
public class Backward: Enumerator
{
public Backward(LinkedValueDictionary lvd): base(lvd) {}
protected override void Next()
{
if ( current == null )
current = LVD.last;
else
current = current.Previous;
}
}

public IEnumerator GetEnumerator() { return GetForwardEnumerator(); }
#endregion
public Forward GetForwardEnumerator() { return new Forward(LVD); }
public Backward GetBackwardEnumerator() { return new Backward(LVD); }
}
public LinkedCollection Linked { get { return new LinkedCollection(this); } }
}

public class RecentlyReadOrdered: LinkedValueDictionary
{
public RecentlyReadOrdered(IDictionary backend): base(backend) {}
public override object this[object key]
{
get
{
Value v = (Value)backend[key];
if ( v.Previous != null )
BringToFront(v);
return v.V;
}
set { base[key] = value; }
}
}
}
 