What would be a faster alternative to boxing/unboxing?

  • Thread starter Thread starter Mountain Bikn' Guy
  • Start date Start date
M

Mountain Bikn' Guy

I have a situation where an app writes data of various types (primitives and
objects) into a single dimensional array of objects. (This array eventually
becomes a row in a data table, but that's another story.) The data is
written once and then read many times. Each primitive read requires
unboxing. The data reads are critical to overall app performance. In the
hopes of improving performance, we have tried to find a way to avoid the
unboxing and casting. So far, nothing has worked well.

One solution we tried was to create our own simple object to wrap the
primitives. In this object we used a public field that allowed reading of
the primitive data value without a property or unboxing. However, overall
performance went down. We are still investigating, but our current
speculations are that:
1. the overhead of creating the object was significantly greater than the
overhead of boxing and this offset any gains we might have gotten on the
reads (a guess).
2. wrapping each primitive forced us to introduce a new virtual method call
in place of the boxing/casting. Maybe the overhead of this call is what
caused the performance to go down. (Another guess.)

BTW, performance went down 15%.

Does anyone have any ideas on how to implement the following:

A 1-dimension array (with normal indexing) containing heterogenous strongly
typed data that allows reading/writing without casting and without
boxing/unboxing.

All thoughts/suggestions are appreciated.
 
Mountain said:
Does anyone have any ideas on how to implement the following:

A 1-dimension array (with normal indexing) containing heterogenous strongly
typed data that allows reading/writing without casting and without
boxing/unboxing.

All thoughts/suggestions are appreciated.
Have to excuse me as I'm new to C# :)

My understanding is you can't have an array of objects with value types
without boxing. As the instant you do obArray[x] = 32 it boxes it.

Can you use a second array in parallel with your first array ; casting
data if you need to ; this might be quicker. Wasteful of storage :)

Other thing is to hack into the system and write inline code. Don't know
if you can do this in C# even if it is a good idea.

Sorry if these ideas are dumb :)
 
Hi Paul,
Have to excuse me as I'm new to C# :)

My understanding is you can't have an array of objects with value types
without boxing. As the instant you do obArray[x] = 32 it boxes it.
Not quite true.
If you have

struct MyStruct
{
......
}

MyStruct [] arr = new MyStruct[XXX];
arr[0] = new MyStruct() ; //that won't cause any boxing.
MyStruct ms = arr[0]; //won't cause any unboxing because nothing has been
boxed

Anyway, it might not cause boxing, but the value will be copied which will
introduce some overhead compared to the case if we use reference type
instead of value type.

B\rgds
100
 
This suggestion is almost exactly what I referred to in my original posting.
Don't know why, but it was slower than boxing/unboxing. Copying isn't an
issue because I'm dealing with value types (primitives) which will be copied
in any case -- and this is the behavior I desire.

100 said:
Hi Paul,
Have to excuse me as I'm new to C# :)

My understanding is you can't have an array of objects with value types
without boxing. As the instant you do obArray[x] = 32 it boxes it.
Not quite true.
If you have

struct MyStruct
{
.....
}

MyStruct [] arr = new MyStruct[XXX];
arr[0] = new MyStruct() ; //that won't cause any boxing.
MyStruct ms = arr[0]; //won't cause any unboxing because nothing has been
boxed

Anyway, it might not cause boxing, but the value will be copied which will
introduce some overhead compared to the case if we use reference type
instead of value type.

B\rgds
100
 
Hi,
What do you mean by primitives? Do you mean real primitives like (int,
float, char, etc) or you are talking about value types in general (which
could be primitives as well as your own defined structures)?
1. the overhead of creating the object was significantly greater than the
overhead of boxing and this offset any gains we might have gotten on the
reads (a guess).

Wrapping primitives as long as I can see is a process of:

1. Creating an reference type calling its constructor, which I assume accept
the primitives as a parameter. This constructor then saves the primitive in
internal data member of the same type as the primitive (to avoid boxing)
Now let see how many times we copy our primitive: once calling the
constuctor we are making a copy in the stack, twise when we copy it in the
internal variable.

I asked if your promitives are real primitives because if they are they are
not so big. Anyway, even if you use reference types you have to copy the
reference (address - 4 bytes) which is not a big difference. In addition
the JITer will optimize the calls using the CPU registers. If you have
structures, though, it might be too big to be optimized and they will be
indeed copied. You will end up with two copy operation + all work done to
create the wrapping object + calling the constructor. If you have used
boxing you would have - creating object in the heap + one copy (no calling
constructors).

2. Then when you read data again you will have at leas one copy operation of
the primitive. As far as I understand you use virtual method to retrieve the
data so the method is too general to be used for all primitives you may
have. So type conversion(which means boxing + unboxing) is necessary. Again
only one unboxing is better.

It is not surprising (at least to me) that the wrapping classes are slower
than boxing/unboxing

" 2. wrapping each primitive forced us to introduce a new virtual method
call
in place of the boxing/casting. Maybe the overhead of this call is what
caused the performance to go down. (Another guess.)

I don't thing with virtual method you can avoid boxing/casting. IMHO this
will lead to boxing/casting
May be I am wrong. A code snipped will be of help here.
Anyway don't worry about the performance of virtual methods. In the way CLR
works the difference is not as big as in C++ for example. Even non-virtual
methods are called indirectly. Virtual methods need only one step of
indirection more + it doesn't adjust the pointer as C++ does. The diference
is not as big as if you compare to C++.

BTW, performance went down 15%.

Does anyone have any ideas on how to implement the following:

A 1-dimension array (with normal indexing) containing heterogenous strongly
typed data that allows reading/writing without casting and without
boxing/unboxing.

Strongly-typed and heterogenous? IMHO those two guys are mutullay exclusive.

B\rgds
100
 
mountain bikn' guy,

your question:

"Does anyone have any ideas on how to implement the following:

A 1-dimension array (with normal indexing) containing heterogenous strongly
typed data that allows reading/writing without casting and without
boxing/unboxing."

really can't be solved with the .net framework's type safety policies. in
c++, when you knew a particular type was somewhere in memory, you could
just go straight to that chunk of memory and get it. but the framework
cannot allow memory to be accessed like that while guaranteeing type safety.

so the best you cand do is control the way the compiler manipulates your
code 'behind the scenes'. as a general suggestion, try playing with the
tool ildasm.exe, which should be in your framework sdk directory. if you
point ildasm.exe at a managed assembly, it will show you the il code the
compiler generated when building your app. it may sound yucky to look at,
but i can't stress how useful this tool is to me when i'm trying to tweak
performance out of my code. the compiler does so many things behind the
scenes that often this is the only way i can see exactly what's going on.

but back to your question: could you describe more specifically how you
are using this array? are you storing a bunch of objects in the array when
writing, then when reading using the ToString() method? in particular, i'm
interested in what you're doing with the data when/after you read from the
array. this is the place where your code can likely be optimized the most.

get back with me and hopefully i can offer some specific advice to help
your problem.

jeff.

--

This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm

Note: For the benefit of the community-at-large, all responses to this
message are best directed to the newsgroup/thread from which they
originated.
 
Thanks for all your tips and thoughts. Your comments helped me see why some
alternatives we tried were slower than boxing/unboxing.

More responses inline.

Strongly-typed and heterogenous? IMHO those two guys are mutullay exclusive.

Yes, but that doesn't mean some creative thinking can't result in a
satisfactory solution. When I was in college, I created a data structure
that allows (psuedo) random access into dynamically allocated memory without
any external indexes. My goal was to have array style random access into
dynamically allocated (non-contiguous) memory -- two goals that seem
incompatible. The result of my effort was a data structure that
out-performed height-balanced AVL trees. So I'm hoping to have some luck and
come up with something creative that lets me get the benefits of strong
typing in a heterogenous array. ... maybe it will happen, maybe not.
 
Mountain (can I call you Mountain...),

The overhead from boxing should be roughly equivalent to the overhead of
creating the wrapper class, as they're doing nearly exactly the same thing.
I'm not sure how to contrast the overhead of the two options in #2 - my
guess is that virtual dispatch is roughly as expensive as an unbox, but if
you do the virtual every time and the unbox only on value types, it could
explain your results.

You should also probably spend some time using the CLR profiler to examine
your application.

http://www.microsoft.com/downloads/...52-D7F4-4AEB-9B7A-94635BEEBDDA&displaylang=en




--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://blogs.gotdotnet.com/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.
 
Hi Eric,

I agree that using wrapping classes is probably as slow (even it could be
slower) than boxing/unboxing. And I gave my point in my prevoius post.
However I'm not qiute sure what you are saying here
guess is that virtual dispatch is roughly as expensive as an unbox

In .NET dispatching of virtual-method calls is pretty simple and doesn't
take much work than the calls of non-virtual methods. So if what you are
saying is true it means that every function calls is as expensive as unbox
operation.

B\rgds
100
 
Hi,
Yes, but that doesn't mean some creative thinking can't result in a
satisfactory solution.

Don't get me wrong. I'm not arguing against creativity. I just wanted to say
that heterogenous storage means (as far as I understand that word) storage
that contains different types of data. Strongly-typed would mean (in most of
the casses) storage that contains only one type of data and doesn't accept
any other.

But in my second thought it could be called like that if you have storage
that accept only certain types of data (say integers, floats and my Foo
class) and reject the others.

So if that is what you mean. I see one posible solution which may satisfy
your needs of speed.
For each value type you may have designated array to keep the values. In
this case you won't have boxing/unboxing operations going behind the scene.
Reference types you may hold in one common array (or any other kind of
storage).

Provide as many overloads of the methods for adding and retrieving data as
many different types you have

something like this (suppose we want to accept int, double and strings(this
is my reference type))

class Storage
{
int[] intArr= new int[10];
double[] dblArr = new int[10];
object[] refArr = new object[10];

public int this[int i]
{
get{return intArr;}
set{intArr = value;}
}

public double this[int i]
{
get{return dblArr;}
set{dblArr = value;}
}

public string this[int i]
{
get{return (string)refArr;} //Conversion - it is not as bad as
boxing/unboxing
set{refArr = value;}
}

.......
}
Of course we have to implement logic to expand the storage when the capacity
is exceeded and so on.
We waste too much memory; to be more concrete you'll have N*M unused slots
where N is the number of elemens in the storage and M is the number of the
value types which can be stored. It could be huge amount of unused memory if
you are planing to store a lot of data, but for small quantities it could be
acceptable and we won't have any boxing/unboxing.

B\rgds
100
 
Hi Eric,
You can certainly call me Mountain ;) I have a small mountain bike track
outside my back door (it's more like a BMX track). About the only time I
leave the computer is when I go ride my mountain bike (or when my wife
forces me to go to sleep).

Thanks for your input and the link to the CLR profiler. I think using the
CLR profiler is a good next step.
Regards,
Dave
 
100 said:
Hi Paul,
Have to excuse me as I'm new to C# :)

My understanding is you can't have an array of objects with value types
without boxing. As the instant you do obArray[x] = 32 it boxes it.

Not quite true.
If you have

struct MyStruct
{
.....
}

MyStruct [] arr = new MyStruct[XXX];
arr[0] = new MyStruct() ; //that won't cause any boxing.
MyStruct ms = arr[0]; //won't cause any unboxing because nothing has been
boxed

Anyway, it might not cause boxing, but the value will be copied which will
introduce some overhead compared to the case if we use reference type
instead of value type.

Hm. So structs are this sort of wierd interim type between values and
objects ? I find the whole thing not confusing but maybe inconsistent.

If you can assign a struct to an object reference what happens if you have

sometype somefunc(object[] a)
{
a[1] = new MyStruct()
}

My understanding is that MyStruct[] is allocated off the stack not the
heap. So when the stack frame is removed, what does a[1] point to ?

It doesn't seem very "safe" - unless it does a struct copy (as in
struct1=struct2) and allocates that on the heap ?

Sorry if these are dumb questions :) I only started C# a couple of days ago.
 
100 said:
class Storage
{
int[] intArr= new int[10];
double[] dblArr = new int[10];
object[] refArr = new object[10];

public int this[int i]
{
get{return intArr;}
set{intArr = value;}
}

public double this[int i]


I thought that C# couldn't do polymorphism on return types only ? Am I
wrong ?
 
Hi Paul,

No, the question is not dumb at all.
Hm. So structs are this sort of wierd interim type between values and
objects ? I find the whole thing not confusing but maybe inconsistent.

No, the struct are value types. If you look at the docs value types are:
primitives (int32, int64, char, etc), enums and structures. So take a look
at MSDN. Int32 for example is decalred as a *struct*. They are called
primitives because IL has special instructions to work with them directly.
That is why you won't see any overloads of the arithmetic operators defined
for them. Decimal, though, is not a primitive so it has overloaded the
arithmetic operators and this is the reason why the performance is worse
when you work with them compared with double for example.
If you can assign a struct to an object reference what happens if you have

sometype somefunc(object[] a)
{
a[1] = new MyStruct()
}

My understanding is that MyStruct[] is allocated off the stack not the
heap. So when the stack frame is removed, what does a[1] point to ?

It doesn't seem very "safe" - unless it does a struct copy (as in
struct1=struct2) and allocates that on the heap ?

Sorry if these are dumb questions :) I only started C# a couple of days ago.

Value types may be in two states: boxed and unboxed, how you know. When they
are in boxed state they are always in the managed heap. When they are in
unboxed state they might be in the heap or in the stack. In c# we can have
unboxed value type in the heap only when they are part of the reference
object (which is allocated always in the heap). I said in C# because for
CLR's point of view the operation of unboxing doesn't do the copy of the
value type from the heap to the stack. You might have boxed value type in
the heap and then doing *unbox* it gets a pointer to the data part of the
object (pointer to the unboxed object) and pushed it in the evaluation
stack. It doesn't copy the data from the heap. However, because in most of
the cases the opertation of unboxing is followed by operatation of copying
the data C# designers desided not to provide unboxing in the heap. In C#
unboxing is always followed by copy. You can do it in VC++ or ILAsm, though.

So arrays are reference types. As reference types they reside in the heap.
But if they are of type *value type* the items are in unboxed state in the
heap as if they were data members of a class.
So, if you have

int[] intArr = new int[10];
intArr[0] = 10;
the value type 10 won't be boxed in the array it will be copied in the heap
in unboxed state.

What happens here
sometype somefunc(object[] a)
{
a[1] = new MyStruct()
}

new MyStruct will create indeed the strucutre in the stack. But then the
assignment will copy this value form the stack to the heap. The original one
which is in the stack will go away when the stack frame gets removed but its
verbatim copy in the heap will stay as part of the array.

HTH
B\rgds
100
 
Hi Paul,
You are hundred percent right. That was my fault. I wrote this minutes
before I went home yesterday and its my fault that I posted this without any
revision.

Thanks for bringing this out. We can't use indexer for that. What we can do
is to use separate operations SetData and GetData with as many overloads as
different types we want to accept. GetData cannot return value because ot
will introduce the same problem that we have already with the indexer. So we
can have prototype like

void GetData(int index, out <type> value)
{
}

this should fix the problem.

B\rgds
100

Paul Robson said:
100 said:
class Storage
{
int[] intArr= new int[10];
double[] dblArr = new int[10];
object[] refArr = new object[10];

public int this[int i]
{
get{return intArr;}
set{intArr = value;}
}

public double this[int i]


I thought that C# couldn't do polymorphism on return types only ? Am I
wrong ?
 
100 said:
What happens here
sometype somefunc(object[] a)
{
a[1] = new MyStruct()
}


new MyStruct will create indeed the strucutre in the stack. But then the
assignment will copy this value form the stack to the heap. The original one
which is in the stack will go away when the stack frame gets removed but its
verbatim copy in the heap will stay as part of the array.

HTH
B\rgds
100

Thanks ! It's much clearer now :)
 
The real problem with this idea is - how we can possibly know which type is
the object at given index in order to know what overload to use to retrieve
the data. We can provide method that returns the type of an object by index
value. But then we need to check this type and call the appropriate method
which will introduce more overhead when reading data. It may turns that
boxing/unboxing is the best solution and the problem with the prformance is
not there.

That's why I think we can't come up with good idea unless Mountain gives us
an example how he is planning to use this storage. He may not need to have
the objects ordered and then this solution will work and we won't have the
problem of wasting the memory.

B\rgds
100

100 said:
Hi Paul,
You are hundred percent right. That was my fault. I wrote this minutes
before I went home yesterday and its my fault that I posted this without any
revision.

Thanks for bringing this out. We can't use indexer for that. What we can do
is to use separate operations SetData and GetData with as many overloads as
different types we want to accept. GetData cannot return value because ot
will introduce the same problem that we have already with the indexer. So we
can have prototype like

void GetData(int index, out <type> value)
{
}

this should fix the problem.

B\rgds
100

Paul Robson said:
100 said:
class Storage
{
int[] intArr= new int[10];
double[] dblArr = new int[10];
object[] refArr = new object[10];

public int this[int i]
{
get{return intArr;}
set{intArr = value;}
}

public double this[int i]


I thought that C# couldn't do polymorphism on return types only ? Am I
wrong ?

 
Well, it is just a guess, but here's my reasoning.

The virtual dispatch by itself isn't a big deal - it's only an extra few
instructions. But because it's virtual (and there could be derived classes
loaded later), the JIT doesn't inline virtual calls, so you do pay a
non-trivial penalty there.

Unbox, on the other hand, is a runtime type check (fairly cheap), and then a
copy of memory from the reference location to the stack location. That's
pretty cheap, and my guess is that it's cheaper than the lost inline

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://blogs.gotdotnet.com/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.
 
Hi Eric,
Well, it is just a guess, but here's my reasoning.

The virtual dispatch by itself isn't a big deal - it's only an extra few
instructions. But because it's virtual (and there could be derived classes
loaded later), the JIT doesn't inline virtual calls, so you do pay a
non-trivial penalty there.

Unbox, on the other hand, is a runtime type check (fairly cheap), and then a
copy of memory from the reference location to the stack location. That's
pretty cheap, and my guess is that it's cheaper than the lost inline

Pretty wild guess I think ;)

1. There is no guarantee that a non-virtual method will be inlined.

2. Unboxing is a cheep operation. The copy operation which c# generates
always after unboxing could be expensive depending on the size of the value
type data. Calling virtual function is like indirect call with two
indirections and if the method doesn't have any parameters (which have to be
copied in the stack) - like simple GetMethod - I don't think it would be
more expensive then unboxing (event if it does have paramters if they are
primitives or reference types the fist two will be transfered via CPU
registeres, the return value as well could be passed back thru
register(s) ).

Of course this guess is as wild as yours ;)


B\rgds
100
 
Here's the answer to your question (I hope):

The ordering is determined by computational dependencies. Just before
processing starts, an ordering is determined dynamically. Then during
processing, items are written to the array in that order. There is actually
an external index that is used to find items in the array.

Does that answer your question?

I do currently use 'out' parameters in the overloaded methods that retreive
data from the array, as you figured out.

I've written code for other approaches too. I have lots of alternative
implementations that I have tested. As a side note -- which isn't really
relevant to the main topic -- when I need to have overloaded return types, I
use this approach:

<type> GetData(int index, <type> value)
{
return (type)valueList[index];//cast to <type>
}

What this does is use a parameter to chose an overload, and each overload
has a different return type. It is very similar to the 'out' approach except
the data is returned in a more familiar way.

I hope mentioning this doesn't divert this thread from the main topic. My
concern is still more with a design that might solve the problem more
creatively. I have an interesting approach that I tested a couple days ago.
I'll post some example code this weekend.

Thanks for your interest in the topic!
Mountain

100 said:
The real problem with this idea is - how we can possibly know which type is
the object at given index in order to know what overload to use to retrieve
the data. We can provide method that returns the type of an object by index
value. But then we need to check this type and call the appropriate method
which will introduce more overhead when reading data. It may turns that
boxing/unboxing is the best solution and the problem with the prformance is
not there.

That's why I think we can't come up with good idea unless Mountain gives us
an example how he is planning to use this storage. He may not need to have
the objects ordered and then this solution will work and we won't have the
problem of wasting the memory.

B\rgds
100

100 said:
Hi Paul,
You are hundred percent right. That was my fault. I wrote this minutes
before I went home yesterday and its my fault that I posted this without any
revision.

Thanks for bringing this out. We can't use indexer for that. What we can do
is to use separate operations SetData and GetData with as many overloads as
different types we want to accept. GetData cannot return value because ot
will introduce the same problem that we have already with the indexer.
So
we
can have prototype like

void GetData(int index, out <type> value)
{
}

this should fix the problem.

B\rgds
100

Paul Robson said:
100 wrote:

class Storage
{
int[] intArr= new int[10];
double[] dblArr = new int[10];
object[] refArr = new object[10];

public int this[int i]
{
get{return intArr;}
set{intArr = value;}
}

public double this[int i]

I thought that C# couldn't do polymorphism on return types only ? Am I
wrong ?


 
Back
Top