How does IOrderedEnumerable work under the hood?

  • Thread starter Thread starter jehugaleahsa
  • Start date Start date
J

jehugaleahsa

Hello:

I'm curious about how OrderBy and ThenBy work. I have been letting the
cogs spin in my head and I can't figure this one out. I get the
impression it is probably a no-brainer. Please give me the dirty
details.

Thanks,
Travis
 
Hello:

I'm curious about how OrderBy and ThenBy work. I have been letting the
cogs spin in my head and I can't figure this one out. I get the
impression it is probably a no-brainer. Please give me the dirty
details.

For the _dirty_ details, look at the .NET code. Either with Reflector, or
with the Microsoft source server download.

The short answer is that it does just what you'd expect: it sorts the
input collection. OrderBy() creates a primary sort, while ThenBy() allows
you to add sort criteria.

The actual sort is deferred until you actually enumerate the result. I
haven't checked the source, but I assume that means that the framework is
able to accumulate the sort criteria so that when you ultimately enumerate
a sorted collection that you've created using multiple calls (i.e.
OrderBy() followed by one or more calls to ThenBy()), it actually only has
to do one sort.

Pete
 
For the _dirty_ details, look at the .NET code.  Either with Reflector,or  
with the Microsoft source server download.

The short answer is that it does just what you'd expect: it sorts the  
input collection.  OrderBy() creates a primary sort, while ThenBy() allows  
you to add sort criteria.

The actual sort is deferred until you actually enumerate the result.  I 
haven't checked the source, but I assume that means that the framework is 
able to accumulate the sort criteria so that when you ultimately enumerate  
a sorted collection that you've created using multiple calls (i.e.  
OrderBy() followed by one or more calls to ThenBy()), it actually only has  
to do one sort.

Pete

So under the hood it is accumulating Comparison<T> or something. The
last ordering simply takes the Comparison<T>'s from the other
orderings and combines them. I was wondering if it sorted by grouping
by keys, then sorting each key group, so on and so on. I figured that
would be really inefficient. I figured the IOrderedEnumerable
interface was used somehow under the hood to skip multiple sorts. I'll
see if I can find the code.
 
So under the hood it is accumulating Comparison<T> or something. The
last ordering simply takes the Comparison<T>'s from the other
orderings and combines them. I was wondering if it sorted by grouping
by keys, then sorting each key group, so on and so on. I figured that
would be really inefficient. I figured the IOrderedEnumerable
interface was used somehow under the hood to skip multiple sorts. I'll
see if I can find the code.

I think what you need to "find" is Red Gate's Reflector. The code is
already on your computer. :)

(Like I said, Microsoft's source code download server is another option,
but frankly I have found it to not work as well as it ought to. As near
as I can tell, mainly because Microsoft hasn't kept the source code in
sync with the released versions of .NET, so the IDE source download
feature winds up not doing anything. So, Reflector is still the easiest
way to look at implementation details).

As far as the other comments, since I haven't inspected the implementation
personally, I obviously don't know for sure what's in there. But, it's
clear that the code is written to avoid sorting until absolutely
necessary. So yes, until the enumeration is actually enumerated, all that
calling the ordering methods does is add a new sort criteria (via the
passed in comparison function) to the state of the enumeration.

Once you enumerate, then it has to sort, and it would certainly do that by
calling the accumulated sort comparison functions in the proper order as
necessary. That is, the enumeration must keep the comparison functions in
an ordered list, and for each comparison during the sort, call those
functions in order until finding one that compares as non-equal.

You can confirm that behavior by creating special comparison functions
that can be controlled after instantiation. For example, put them in a
class that can be set to sort ascending or descending, depending on a
property you set. Provide the comparison functions to the various
ordering methods as you're building up the ordered enumeration. Then
switch the order before actually enumerating the enumeration.

I'm sure that you will find that the order you get from the enumeration
matches the switched state of your comparison functions, rather than the
state it was in when you were creating the ordered enumeration.

Note that if you aggregate the sorting operations by enumerating after
each ordering operation, then .NET will necessarily have to do multiple
sorts. But, the sorts are documented as being stable, which means you'll
still get the results you expect, albeit in a less efficient way.

Pete
 
Back
Top