Performance of array copy in C#? Thanks

  • Thread starter: linuxfedora

linuxfedora

What is the difference in performance between the following two pieces of code?

Code 1:

char[] data = new char[100];
...
class1.SendArray(data);

public void SendArray(char[] data)
{
    SocketHandle.Send(data);
}

Code 2:

char[] data = new char[100];
...
class1.SendArray(ref data);

public void SendArray(ref char[] data)
{
    SocketHandle.Send(data);
}

Will Code 2 run faster since it passes the array by reference, so that
there is no need to copy the data? Thanks.
If so, what is the drawback of using Code 2? Thanks.
 
(e-mail address removed) said:
> Will Code 2 run faster since it passes the array by reference, so that
> there is no need to copy the data? Thanks.
No.
There is one difference in functionality.
If, inside SendArray, you do:
data = null;
then in Code 2 the "data" variable outside of SendArray will also become
null. In Code 1 the outside variable (also named data) will NOT change.
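
A minimal sketch of that difference, for anyone who wants to try it. The class and method names (RefDemo, ClearByValue, ClearByRef) are made up for illustration and are not from the original code:

```csharp
using System;

public static class RefDemo
{
    // Hypothetical helper: receives a copy of the reference,
    // so the assignment only affects the local parameter.
    public static void ClearByValue(char[] data) { data = null; }

    // Hypothetical helper: receives the caller's variable itself,
    // so the assignment is visible to the caller.
    public static void ClearByRef(ref char[] data) { data = null; }

    public static void Main()
    {
        char[] a = new char[100];
        char[] b = new char[100];

        ClearByValue(a);
        ClearByRef(ref b);

        Console.WriteLine(a == null); // False
        Console.WriteLine(b == null); // True
    }
}
```

In neither call are the 100 chars copied; only the reference (or the address of the variable holding it) is passed.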
 
Hi!

Well, I'm not an expert, but I guess that since your Code 2 uses a
reference it will be faster, because it's always faster to pass a memory
address to a function than to copy the entire array.

Igor
 
Igor Diniz said:
> Well, I'm not an expert but I guess since your Code2 will be using reference
> it will be faster because
> it's always faster to pass a memory address to a function than copying the
> entire array.

Yes, you are no expert.
An array variable is itself a reference. There is no reason to pass a
reference to a reference.
 
> What is the difference in the performance for the following 2 codes?

The difference in performance is so small that you won't even be able to
measure it.

> Will Code 2 run faster since it passes the array by reference, so that
> there is no need to copy the data? Thanks.

Both methods are using _a_ reference, and neither is copying the data.
You pass a reference to both methods, but to the second method you are
passing a reference _by_ reference.

You have to distinguish between _a_ reference and sending a variable
_by_ reference. Although they sound similar, they are completely
different. The ref keyword is used to specify that you send a variable
by reference, but the variable doesn't have to be a reference type. You
can send an int variable by reference; that only means that you can
change the caller's int variable from inside the method.
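
To make the distinction concrete, here is a small sketch. The names PassingDemo, SetFirst and Increment are hypothetical, chosen only for this example. It shows that a method taking the array without ref still works on the caller's array (nothing is copied), and that ref applies equally to a value type such as int:

```csharp
using System;

public static class PassingDemo
{
    // Receives a copy of the reference; the array itself is not copied,
    // so the write is visible to the caller.
    public static void SetFirst(char[] data) { data[0] = 'x'; }

    // Receives the caller's int variable by reference.
    public static void Increment(ref int value) { value++; }

    public static void Main()
    {
        char[] data = new char[100];
        SetFirst(data);
        Console.WriteLine(data[0]); // x

        int n = 1;
        Increment(ref n);
        Console.WriteLine(n); // 2
    }
}
```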

> If so, what is the drawback when using Code 2? Thanks

Unless you have to replace the array inside the method, there is no
reason to send it by reference.

An example of where an array is sent by reference, is the Array.Resize
method. As arrays can't be resized in .NET, the method creates a new
array, copies the data to the new array, and replaces the reference in
the array variable with a reference to the new array.
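
A short sketch of that behaviour, assuming nothing beyond the standard Array.Resize method (the class name ResizeDemo and the variable names are illustrative). Note that a second variable that referred to the original array is not updated; it still sees the old, shorter array:

```csharp
using System;

public static class ResizeDemo
{
    public static void Main()
    {
        char[] data = { 'a', 'b', 'c' };
        char[] alias = data;         // second reference to the same array

        // Allocates a new array, copies the elements, and stores the
        // new reference in the variable passed by ref.
        Array.Resize(ref data, 5);

        Console.WriteLine(data.Length);                  // 5
        Console.WriteLine(alias.Length);                 // 3
        Console.WriteLine(ReferenceEquals(data, alias)); // False
        Console.WriteLine(data[2]);                      // c
    }
}
```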
 
> An example of where an array is sent by reference, is the Array.Resize
> method. As arrays can't be resized in .NET, the method creates a new
> array, copies the data to the new array, and replaces the reference in
> the array variable with a reference to the new array.

By the way, that is an imperfection of the framework; resizing an array
in place would be possible.
 
Doker said:
> By the way, that is an imperfection of the framework; resizing an array
> in place would be possible.

It would only be possible to do efficiently if the memory after the
array is unused. Otherwise the array would have to be moved, or whatever
was residing in the memory after the array would have to be moved.
That would mean that resizing an array could sometimes be done
efficiently and sometimes not. In the cases where the memory would have
to be rearranged, all threads in the application would have to be
frozen and all references in the application would have to be scanned,
which could be a considerable performance hit.
 
> It would only be possible to do efficiently if the memory after the
> array is unused. Otherwise the array would have to be moved, or whatever
> was residing in the memory after the array would have to be moved. That
> would mean that resizing an array could sometimes be done efficiently
> and sometimes not. In the cases where the memory would have to be
> rearranged, all threads in the application would have to be frozen and
> all references in the application would have to be scanned, which could
> be a considerable performance hit.

Yes, locking the whole heap would be necessary. Still, I would appreciate
this.
But wait. Isn't it true that, one way or another, the heap must be locked
to prevent the same part of memory from being allocated twice by two
different threads? I think it is. So a method of the Array class called
TryStretch would be good, possible, and appreciated.
 
Doker said:
> Yes, locking the whole heap would be necessary. Still, I would appreciate
> this.

Göran isn't talking about locking the heap. He's pointing out that the
_entire application_ would have to be paused, so that the array could be
moved.

Memory allocation does already involve coordination between multiple
threads accessing the heap. Memory allocations are thread-safe. But
that's not what Göran is pointing out.

His point, and it's correct, is that if you move the array, you need to
go through the application's entire data and update any references to
that array so that they point to the new location. Before the array is
moved, all of the threads need to be stopped so that they don't execute
code that would use an out-of-date version of the array (or worse,
unallocated memory).

The .NET memory management does, I believe, include functionality to
defragment the heap. So it does already have this ability to stop
everything and move the objects around. But it's not something you'd
want to do under normal circumstances. IMHO, resizing an array isn't
sufficient cause to justify incurring the penalty of performing that
sort of operation.

Now, all of this is a basic consequence of the way the array data
structure works. Because the instance is itself the location where the
data is stored, you can't resize the array without these hassles.

Other collection classes don't have this limitation. If you want a
resizable collection, there are those that exist, like the generic
List<T> class. Even there, when you resize the storage for the
collection, you run the risk of having to perform a copy on the entire
data within the collection, resulting in a performance hit.
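
As a rough illustration of when that copy happens, the following sketch watches the Capacity property of a List<char> while items are added. The class name ListGrowthDemo is made up, and the exact growth steps are an implementation detail of the runtime, so no particular output is promised here:

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    public static void Main()
    {
        var list = new List<char>();
        int lastCapacity = list.Capacity;

        for (int i = 0; i < 100; i++)
        {
            list.Add('x');
            if (list.Capacity != lastCapacity)
            {
                // Capacity changed: the list allocated a larger internal
                // array and copied the existing items into it.
                Console.WriteLine($"Count {list.Count}: capacity {lastCapacity} -> {list.Capacity}");
                lastCapacity = list.Capacity;
            }
        }
    }
}
```

If the final size is known up front, passing it to the List<T>(int capacity) constructor avoids the intermediate copies.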

But this performance hit is relatively small, involving just the
overhead of copying the data, rather than pausing all of the threads of
the entire application and searching all the data structures (or doing
whatever it is the CLR does when it has to defragment the heap...I don't
actually know the specifics).

Furthermore, this can be done in a well-defined manner, rather than
requiring the entire application to be paused while it happens. The
class isn't thread-safe, so if you have multiple threads using the class
you'd have to explicitly synchronize those threads while any
modification was being made, but that's a much smaller, more
well-defined mechanism than rearranging the heap itself.
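
A minimal sketch of that kind of explicit synchronization, assuming a shared List<int> guarded by a single lock object. SyncDemo, Sync, Items and AddItem are illustrative names, not anything from the framework:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SyncDemo
{
    static readonly object Sync = new object();
    public static readonly List<int> Items = new List<int>();

    // Only one thread at a time may modify the list, so any internal
    // resize (allocate and copy) happens while the others wait.
    public static void AddItem(int value)
    {
        lock (Sync) { Items.Add(value); }
    }

    public static void Main()
    {
        Parallel.For(0, 10000, i => AddItem(i));
        Console.WriteLine(Items.Count); // 10000
    }
}
```

Only the threads that touch this particular list wait on the lock; the rest of the application keeps running.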

Hope that helps.

Pete
 
Doker said:
> Yes, locking the whole heap would be necessary. Still, I would appreciate
> this.
> But wait. Isn't it true that, one way or another, the heap must be locked
> to prevent the same part of memory from being allocated twice by two
> different threads? I think it is.

Yes, of course. But the code for allocating memory is very simple, so
it's not a problem that one thread at a time can allocate memory. The
code is basically something like:

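// Simplified pseudocode; not the actual CLR implementation.
IntPtr MemAlloc(int size) {
    IntPtr ptr;
    lock (_sync) {
        // If the request does not fit, ask for more heap space first.
        if (_heapTop - _heapPointer < size) GC.GimmeMoreHeap(size);
        ptr = _heapPointer;     // the new object starts at the current pointer
        _heapPointer += size;   // move the pointer past the new object
    }
    return ptr;
}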

There is of course a bit more code to handle the large objects heap and
such, but that's basically it. The allocation part is really simple.
Check that there is room on the heap, and increase a pointer.

> So a method of the Array class called TryStretch
> would be good, possible, and appreciated.

As the garbage collector compacts the heap putting the objects back to
back, I think that a TryStretch method would fail almost every time.
 