Value types - Beware of Guid in mscorlib.dll

Thread starter: Arthur Mnev

It seems to me Microsoft got caught in a boxing/unboxing situation; no
other reason comes to mind...

It is a well-known fact that when a value type needs to be passed as
an object it gets boxed, and then unboxed when it is cast back to the
value type. The solution? Do not force the framework to convert value
types to object and back. Now, back to Guid...
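
To make the box/unbox round trip concrete, here is a minimal C# sketch. The class and method names (BoxingDemo, RoundTrip, EachConversionMakesNewBox) are made up for illustration and are not taken from the framework.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes an int, changes the original, then unboxes the copy.
    public static int RoundTrip()
    {
        int n = 42;
        object boxed = n;       // box: the value is copied into a new heap object
        n = 7;                  // only the local changes; the box keeps its own copy
        int back = (int)boxed;  // unbox: the value is copied back out of the box
        return back;            // still 42
    }

    // Two conversions of the same value produce two distinct boxes.
    public static bool EachConversionMakesNewBox()
    {
        int n = 42;
        object first = n;
        object second = n;
        return !ReferenceEquals(first, second);
    }
}
```

Calling BoxingDemo.RoundTrip() returns 42 even though the local was changed to 7 after boxing, which shows that the box holds a separate copy of the data.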

A few IL quotes from mscorlib.dll:

.method public hidebysig specialname static bool
op_Equality(valuetype System.Guid a, valuetype System.Guid b) cil
managed
{
// Code size 14 (0xe)
.maxstack 8
IL_0000: ldarga.s a
IL_0002: ldarg.1
IL_0003: box System.Guid
IL_0008: call instance bool System.Guid::Equals(object)
IL_000d: ret
} // end of method Guid::op_Equality


.method public hidebysig virtual instance bool Equals(object o) cil
managed
{
// Code size 214 (0xd6)
.maxstack 2
.locals (valuetype System.Guid V_0)
IL_0000: ldarg.1
IL_0001: brfalse.s IL_000b
IL_0003: ldarg.1
IL_0004: isinst System.Guid
IL_0009: brtrue.s IL_000d
IL_000b: ldc.i4.0
IL_000c: ret
IL_000d: ldarg.1
IL_000e: unbox System.Guid
IL_0013: ldobj System.Guid
IL_0018: stloc.0
IL_0019: ldloca.s V_0
IL_001b: ldfld int32 System.Guid::_a
IL_0020: ldarg.0
IL_0021: ldfld int32 System.Guid::_a
IL_0026: beq.s IL_002a
IL_0028: ldc.i4.0
IL_0029: ret

// taken out for readability, repetitive comparison code //
IL_00d2: ldc.i4.0
IL_00d3: ret
IL_00d4: ldc.i4.1
IL_00d5: ret
} // end of method Guid::Equals


For those who do not read IL, let me translate:
"op_Equality" is what the compiler emits for "public static bool
operator ==(Guid a, Guid b)".

So what does the function do? It calls the Equals function, as shown
on line IL_0008. Everything would be great, except that Equals takes
an object as its parameter; therefore our value-typed Guid structure
gets boxed (line IL_0003 of op_Equality) and then unboxed again inside
Equals (line IL_000e).
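
Roughly, the quoted IL corresponds to C# like the sketch below. This is a reconstruction of the pattern only, not the real mscorlib source: BoxingId is a hypothetical stand-in for Guid, and it is cut down to two int fields to keep the example short.

```csharp
using System;

// Hypothetical stand-in for Guid; NOT the actual framework code.
public struct BoxingId
{
    private readonly int _a;
    private readonly int _b;

    public BoxingId(int a, int b) { _a = a; _b = b; }

    // Mirrors the quoted op_Equality: the only Equals available takes
    // object, so 'b' has to be boxed before the call.
    public static bool operator ==(BoxingId a, BoxingId b)
    {
        return a.Equals(b); // binds to Equals(object): box happens here
    }

    public static bool operator !=(BoxingId a, BoxingId b)
    {
        return !(a == b);
    }

    // Mirrors the quoted Equals(object): null/type check, unbox, compare.
    public override bool Equals(object o)
    {
        if (o == null || !(o is BoxingId)) return false;
        BoxingId other = (BoxingId)o; // unbox and copy into a local
        return other._a == _a && other._b == _b;
    }

    public override int GetHashCode() { return _a ^ _b; }
}
```

With this shape, every use of == on two BoxingId values goes through one box and one unbox, which is the behaviour described above.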

By contrast, if you look at Decimal and some other value types, you
will find that the Equals function defined there is(!) overloaded.
Once called, the overloaded function in Decimal and the other base
types goes on to call a Compare function that takes two decimals as
parameters (i.e. no boxing).
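
A sketch of that overloaded style, again using a hypothetical two-field struct (TypedId) rather than any real framework type. It assumes IEquatable<T>, which is available from .NET 2.0 onward; on an older framework the typed Equals overload works the same way without the interface.

```csharp
using System;

// Hypothetical struct for illustration; not a framework type.
public struct TypedId : IEquatable<TypedId>
{
    private readonly int _a;
    private readonly int _b;

    public TypedId(int a, int b) { _a = a; _b = b; }

    // Typed overload: the argument stays a value, so nothing is boxed.
    public bool Equals(TypedId other)
    {
        return other._a == _a && other._b == _b;
    }

    // Still needed for callers that only hold an object reference.
    public override bool Equals(object o)
    {
        return o is TypedId && Equals((TypedId)o);
    }

    // Overload resolution prefers Equals(TypedId) over Equals(object),
    // so the operator compares the fields directly.
    public static bool operator ==(TypedId a, TypedId b)
    {
        return a.Equals(b);
    }

    public static bool operator !=(TypedId a, TypedId b)
    {
        return !a.Equals(b);
    }

    public override int GetHashCode() { return _a ^ _b; }
}
```

The operator body is the same text as in a version that boxes; the only difference is that a typed Equals overload exists for the compiler to bind to.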

Boxing is an expensive operation: the data is copied from the stack
onto the heap, and a reference to the heap copy is then operated on.
When the boxed "object" is cast back to the value type, the data is
copied from the heap back to the local stack. In the case of Guid, if
we need to run a comparison we will end up with an unnecessary:

Allocation
16 byte copy to heap
16 byte copy from heap
Deallocation


The conclusion? This is not life-threatening; however, it is
something that people expecting high performance should pay attention
to.
I'm not sure if this is a single "memory stress testing feature" or if
it is repeated throughout the framework. I'll check some other value
types and will post the list if it grows.
 
I read your post and I'm not trying to be sarcastic when I say... what is
your point? I don't mean to trivialize this point or to dispute what you are
saying. I just mean that boxing is an issue that developers should be aware
of, and they should understand its consequences.
 
There are a few points. First and foremost, developers should be
aware that the Guid type (specifically), and possibly more types
within the framework, use a, should I say, less than optimal way of
performing a function. Guid comparison as implemented by Microsoft
runs about 30% slower than what is possible just by using proper types.
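
Anyone who wants to check a figure like this on their own runtime could use a small timing harness along the following lines. It is only a sketch: ViaObject, ViaTyped and EqualsTiming are hypothetical names, the structs are two-field stand-ins rather than Guid, and no particular result is claimed, since the numbers depend on the runtime and JIT in use.

```csharp
using System;
using System.Diagnostics;

// Hypothetical struct whose only Equals takes object (argument is boxed).
public struct ViaObject
{
    public int A, B;
    public override bool Equals(object o)
    {
        return o is ViaObject && ((ViaObject)o).A == A && ((ViaObject)o).B == B;
    }
    public override int GetHashCode() { return A ^ B; }
}

// Hypothetical struct with a typed Equals overload (no boxing needed).
public struct ViaTyped
{
    public int A, B;
    public bool Equals(ViaTyped other) { return other.A == A && other.B == B; }
    public override bool Equals(object o) { return o is ViaTyped && Equals((ViaTyped)o); }
    public override int GetHashCode() { return A ^ B; }
}

public static class EqualsTiming
{
    // Returns the number of matches so the comparisons cannot be discarded.
    public static int CountObjectPath(int iterations)
    {
        ViaObject x = new ViaObject { A = 1, B = 2 };
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            ViaObject y = new ViaObject { A = 1, B = i & 3 };
            if (x.Equals(y)) hits++; // binds to Equals(object)
        }
        return hits;
    }

    public static int CountTypedPath(int iterations)
    {
        ViaTyped x = new ViaTyped { A = 1, B = 2 };
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            ViaTyped y = new ViaTyped { A = 1, B = i & 3 };
            if (x.Equals(y)) hits++; // binds to Equals(ViaTyped)
        }
        return hits;
    }

    // Prints elapsed milliseconds for both paths; run it several times.
    public static void Report(int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int objectHits = CountObjectPath(iterations);
        long objectMs = sw.ElapsedMilliseconds;

        sw = Stopwatch.StartNew();
        int typedHits = CountTypedPath(iterations);
        long typedMs = sw.ElapsedMilliseconds;

        Console.WriteLine("Equals(object): {0} ms, {1} hits", objectMs, objectHits);
        Console.WriteLine("Equals(typed):  {0} ms, {1} hits", typedMs, typedHits);
    }
}
```

Something like EqualsTiming.Report(50000000) in a release build gives a rough comparison; the first run includes JIT time, so repeated runs are more trustworthy.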

Guid itself is probably not that big of a deal, as it is not the most
commonly used type. I do suspect, however, that there are more
mistakes of this kind within the core libraries, possibly in types
that are used more frequently than Guid.

The second point I was trying to make is that the framework is not
optimized as much as some of us hoped and it always makes sense to pay
attention to what the system is actually doing, as opposed to blindly
re-using code provided (as 95% of the people do). Boxing is certainly
something developers do (hopefully) understand and that is precisely
why I posted it.

Cheers
 
With anything as large and young as .NET, one expects to see these kinds of
things. At the end of the day, the issue is probably of interest mostly
inasmuch as we can track when they optimise it. I think MS's infamous/famous
ability to get things mostly right around version 3 is likely going to hold
true even for .NET.

I very seldom use any code without focusing on its efficiencies, but that is
a luxury. I suspect that the real reason people do use poorly optimised code
is that they are time-stressed or simply don't know better. Either way, it
is important to share this kind of higher-level information to keep the
minds of the community sharp. And, one never knows when this kind of
knowledge might have relevance.

So, in short, thanks for the information.

Frank Buchan
 