.NET 2.0 performance bug in ArrayList.Sort

  • Thread starter Thread starter Guest
  • Start date Start date
Greg said:
Hey guys I have already submitted a bug report based upon this ..

the exact behavior I submitted is what you are getting at now ...

Array.Sort(array,0,length, null) //fast
Array.Sort(array,0,length,DefaultComparer)//slow
ArrayList.Sort(0,length,null) //should generate same call as first (fast
generates slow call)

Okay, so what would happen now - bug will be checked and you will get
some kind of feedback on if/how/when it will be fixed? If so please
share it, for now I am satisfied with available workarounds.

cheers,

Alex
 
The StringComaparer did have an improvement Frans as I pointed out earlier
(64 seconds to 8 seconds here) .. that is however not the major problem. I
have placed a bug report on this already.
 
Array.Sort(array,0,length, null) //fast
ArrayList.Sort(0,length,null) //should generate same call as first (fast
generates slow call)
These two do not call the same method

The same as:
Array.Sort(string[] items)
Array.Sort(string[] items, 0, items.length-1, Comparer.Default)

Are not the same methods

the compiler sees Array.Sort(string[]) and realises it can instead call
the generic version Array.Sort<string>(...)

The same happens when you use the comparer as null. Because null does
not have a type it can match null to IComparer<string>, thus can then
call Array.Sort<string>(string[], int, int, IComparator<string) instead
of Array.Sort(string, int, int, IComparator)

Try calling:
Array.Sort(array, 0, length, (IComparer)null)
now we're giving null an explicit type so that we for it to use the
non-generic version of Sort.

I've tried calling TrySZSort with a string array and it cannot sort it.
And Array.Sort with a string uses ArrayObjectSorter.

==========================

// check null comparer here
oTime = DateTime.Now;
Array.Sort(Items2, 0, Items.Length, (IComparer)null);
Console.WriteLine("Time to sort with Null Comparer: {0}
msec",
(DateTime.Now.Ticks - oTime.Ticks) /
TimeSpan.TicksPerMillisecond);

Time to sort with Null Comparer: 29859 msec


With both Array.Sort and Array.Sort<T> you really have to becarful which
version is being called, as even if you don't specify a generic, the
compiler can infer it from the parameters.

Thus if you had the generic method:

void Push<T>(IList<T> list, params T items[])
....

IList<int> items = new IList<int>();

Push<int>(items, 1, 2 3);

OR

Push(items, 1, 2 3)

In the second instance the compiler sees that all the parameters match
for the required type <T> thus it infers the call to Push<int>, even if
there is also a non generic method call Push as long as the signature
isn't an exact match.

To demonstrate:
static void Main(string[] args) {
Test<int>(1);
Test(1);
Test((object)1);
Test<string>("abc");
Test("abc");
Test((object)"abc");
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}

static void Test<T>(T value) {
Console.WriteLine("Test<T>(T), value = {0}", value);
}

static void Test(object value) {
Console.WriteLine("Test(object), value = {0}", value);
}

static void Test(string value) {
Console.WriteLine("Test(string), value = {0}", value);
}


OUTPUT:
Test<T>(T), value = 1
Test<T>(T), value = 1
Test(object), value = 1
Test<T>(T), value = abc
Test(string), value = abc
Test(object), value = abc
Press ENTER to exit
 
I understand type inference ...

Array<string>.Sort(string[] items) ..

Correct me if I am wrong but the last time I checked, array was not
implemented as a generic type.

Cheers,

Greg

20.6.4: When a generic method is called without specifying type arguments, a
type inference process attempts to infer type arguments for the call. The
presence of type inference allows a more convenient syntax to be used for
calling a generic method, and allows the programmer to avoid specifying
redundant type information. For example, given the method declaration:

class Util
{
static Random rand = new Random();



static public T Choose<T>(T first, T second) {
return (rand.Next(2) == 0)? first: second;
}
}

it is possible to invoke the Choose method without explicitly specifying a
type argument:

int i = Util.Choose(5, 213); // Calls Choose<int>

string s = Util.Choose("foo", "bar"); // Calls Choose<string>





Chris Chilvers said:
I am not sure where you are coming up with this ..
not quite Array.Sort(string[] items) actually calls:
Array<string>.Sort(string[] items)

Referer to 20.6.4 of
http://download.microsoft.com/downl...e-f87a44af3db9/SpecificationVer2.doc#01000021
 
Sorry I said that wrong ... Array.Sort is not implemented as a generic
method in this case.

also the array being passed (from an arraylist) will be object [] not string
[]

Cheers,

Greg
Greg Young said:
I understand type inference ...

Array<string>.Sort(string[] items) ..

Correct me if I am wrong but the last time I checked, array was not
implemented as a generic type.

Cheers,

Greg

20.6.4: When a generic method is called without specifying type arguments,
a type inference process attempts to infer type arguments for the call.
The presence of type inference allows a more convenient syntax to be used
for calling a generic method, and allows the programmer to avoid
specifying redundant type information. For example, given the method
declaration:

class Util
{
static Random rand = new Random();



static public T Choose<T>(T first, T second) {
return (rand.Next(2) == 0)? first: second;
}
}

it is possible to invoke the Choose method without explicitly specifying a
type argument:

int i = Util.Choose(5, 213); // Calls Choose<int>

string s = Util.Choose("foo", "bar"); // Calls Choose<string>





Chris Chilvers said:
I am not sure where you are coming up with this ..

not quite Array.Sort(string[] items) actually calls:
Array<string>.Sort(string[] items)

Referer to 20.6.4 of
http://download.microsoft.com/downl...e-f87a44af3db9/SpecificationVer2.doc#01000021
 
Chris ..

The two have identical signatures ...

ArrayList.Sort in turn calls ...

reflectorred :
public virtual void Sort(int index, int count, IComparer comparer)
{
if ((index < 0) || (count < 0))
{
throw new ArgumentOutOfRangeException((index < 0) ? "index" :
"count", Environment.GetResourceString("ArgumentOutOfRange_NeedNonNegNum"));
}
if ((this._size - index) < count)
{
throw new
ArgumentException(Environment.GetResourceString("Argument_InvalidOffLen"));
}
Array.Sort(this._items, index, count, comparer);
this._version++;
}
equates to ...
Array.Sort(object [], int, int, null)
original call waswhich is
Array.Sort(object [], int, int, null)

in both cases the array is of type object [].

Greg

Chris Chilvers said:
Array.Sort(array,0,length, null) //fast
ArrayList.Sort(0,length,null) //should generate same call as first (fast
generates slow call)
These two do not call the same method

The same as:
Array.Sort(string[] items)
Array.Sort(string[] items, 0, items.length-1, Comparer.Default)

Are not the same methods

the compiler sees Array.Sort(string[]) and realises it can instead call
the generic version Array.Sort<string>(...)

The same happens when you use the comparer as null. Because null does
not have a type it can match null to IComparer<string>, thus can then
call Array.Sort<string>(string[], int, int, IComparator<string) instead
of Array.Sort(string, int, int, IComparator)

Try calling:
Array.Sort(array, 0, length, (IComparer)null)
now we're giving null an explicit type so that we for it to use the
non-generic version of Sort.

I've tried calling TrySZSort with a string array and it cannot sort it.
And Array.Sort with a string uses ArrayObjectSorter.

==========================

// check null comparer here
oTime = DateTime.Now;
Array.Sort(Items2, 0, Items.Length, (IComparer)null);
Console.WriteLine("Time to sort with Null Comparer: {0}
msec",
(DateTime.Now.Ticks - oTime.Ticks) /
TimeSpan.TicksPerMillisecond);

Time to sort with Null Comparer: 29859 msec


With both Array.Sort and Array.Sort<T> you really have to becarful which
version is being called, as even if you don't specify a generic, the
compiler can infer it from the parameters.

Thus if you had the generic method:

void Push<T>(IList<T> list, params T items[])
...

IList<int> items = new IList<int>();

Push<int>(items, 1, 2 3);

OR

Push(items, 1, 2 3)

In the second instance the compiler sees that all the parameters match
for the required type <T> thus it infers the call to Push<int>, even if
there is also a non generic method call Push as long as the signature
isn't an exact match.

To demonstrate:
static void Main(string[] args) {
Test<int>(1);
Test(1);
Test((object)1);
Test<string>("abc");
Test("abc");
Test((object)"abc");
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}

static void Test<T>(T value) {
Console.WriteLine("Test<T>(T), value = {0}", value);
}

static void Test(object value) {
Console.WriteLine("Test(object), value = {0}", value);
}

static void Test(string value) {
Console.WriteLine("Test(string), value = {0}", value);
}


OUTPUT:
Test<T>(T), value = 1
Test<T>(T), value = 1
Test(object), value = 1
Test<T>(T), value = abc
Test(string), value = abc
Test(object), value = abc
Press ENTER to exit
 
Greg said:
The StringComaparer did have an improvement Frans as I pointed out
earlier (64 seconds to 8 seconds here) .. that is however not the
major problem. I have placed a bug report on this already.

As both use Array.Sort(), I'd be surprised they will find this a bug.
As I said: you're not comparing 2 equal things.

FB
and >> > see for yourself). So it's not the sort that's flawed, as
it's >> > precisely the same routine that gets executed.no >> so much slower performance of ArrayList.Sort that bothers me,
its >> much slower performance than same code for .NET 1.1,


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
After reading some suggestions about the possibility of differring call
which I at first shrugged off then accepted I went and started munging
through the IL in question ..


slow ...
IL_0083: ldsfld class [mscorlib]System.Collections.Comparer
[mscorlib]System.Collections.Comparer::Default

IL_0088: call void [mscorlib]System.Array::Sort(class
[mscorlib]System.Array,

int32,

int32,

class [mscorlib]System.Collections.IComparer)

fast

IL_00c9: ldnull

IL_00ca: call void [mscorlib]System.Array::Sort<object>(!!0[],

int32,

int32,

class [mscorlib]System.Collections.Generic.IComparer`1<!!0>)

Changes to USE a generic?!?! given the null? This is very interesting ...

I saw the following ..

if (comparer == null)
{
comparer = Comparer.Default;
}


in reflector and figured that would be picked up later ... but I guess the
compiler realized it...

if we flip a little class such as ..
public class TestComparer : IComparer<object>

{

public int Compare(object x, object y)

{

return ((IComparable) x).CompareTo(y);

}

}

We can then force the first slow array method to use the generic version.
and it speeds up from 60 seconds to 300 ms

Now for the kicker ... we have ArrayList.Sort

The one with a parameter always uses the slow call

IL_0000: ldarg.0

IL_0001: ldc.i4.0

IL_0002: ldarg.0

IL_0003: callvirt instance int32 System.Collections.ArrayList::get_Count()

IL_0008: ldsfld class System.Collections.Comparer
System.Collections.Comparer::Default

IL_000d: callvirt instance void System.Collections.ArrayList::Sort(int32,

int32,

class System.Collections.IComparer)

IL_0012: ret



Parameterless

IL_004b: ldarg.3

IL_004c: call void System.Array::Sort(class System.Array,

int32,

int32,

class System.Collections.IComparer)

is the ArrayLists parameterless sort (which if it passed null would use the
fast version) and would be just as correct as passing Comparer.Default
(since that is the default anyways:))

The arraylist.Sort however is doomed to always be extremely slow as there is
no way for us to force this to happen :(
I am also questioning a bit whether there is another bug somewhere deeper
.... 60 seconds vs 250 ms is not a small problem

So after all of this I would say tht yes we have a bug since passing null in
ArrayList.Sort() instead of Comparer.Default would give us a net gain of
about 19,900% on my machine.

Cheers,

Greg

Chris Chilvers said:
Array.Sort(array,0,length, null) //fast
ArrayList.Sort(0,length,null) //should generate same call as first (fast
generates slow call)
These two do not call the same method

The same as:
Array.Sort(string[] items)
Array.Sort(string[] items, 0, items.length-1, Comparer.Default)

Are not the same methods

the compiler sees Array.Sort(string[]) and realises it can instead call
the generic version Array.Sort<string>(...)

The same happens when you use the comparer as null. Because null does
not have a type it can match null to IComparer<string>, thus can then
call Array.Sort<string>(string[], int, int, IComparator<string) instead
of Array.Sort(string, int, int, IComparator)

Try calling:
Array.Sort(array, 0, length, (IComparer)null)
now we're giving null an explicit type so that we for it to use the
non-generic version of Sort.

I've tried calling TrySZSort with a string array and it cannot sort it.
And Array.Sort with a string uses ArrayObjectSorter.

==========================

// check null comparer here
oTime = DateTime.Now;
Array.Sort(Items2, 0, Items.Length, (IComparer)null);
Console.WriteLine("Time to sort with Null Comparer: {0}
msec",
(DateTime.Now.Ticks - oTime.Ticks) /
TimeSpan.TicksPerMillisecond);

Time to sort with Null Comparer: 29859 msec


With both Array.Sort and Array.Sort<T> you really have to becarful which
version is being called, as even if you don't specify a generic, the
compiler can infer it from the parameters.

Thus if you had the generic method:

void Push<T>(IList<T> list, params T items[])
...

IList<int> items = new IList<int>();

Push<int>(items, 1, 2 3);

OR

Push(items, 1, 2 3)

In the second instance the compiler sees that all the parameters match
for the required type <T> thus it infers the call to Push<int>, even if
there is also a non generic method call Push as long as the signature
isn't an exact match.

To demonstrate:
static void Main(string[] args) {
Test<int>(1);
Test(1);
Test((object)1);
Test<string>("abc");
Test("abc");
Test((object)"abc");
Console.WriteLine("Press ENTER to exit");
Console.ReadLine();
}

static void Test<T>(T value) {
Console.WriteLine("Test<T>(T), value = {0}", value);
}

static void Test(object value) {
Console.WriteLine("Test(object), value = {0}", value);
}

static void Test(string value) {
Console.WriteLine("Test(string), value = {0}", value);
}


OUTPUT:
Test<T>(T), value = 1
Test<T>(T), value = 1
Test(object), value = 1
Test<T>(T), value = abc
Test(string), value = abc
Test(object), value = abc
Press ENTER to exit
 
Back
Top