Optimization needed... Is Assembly code available?

  • Thread starter: ThunderMusic

ThunderMusic

Hi,
I need to optimize a tight loop. Usually I would use assembly, but I don't
even know if it's available in VB.NET. I'll explain the case and maybe some
of you will be able to advise me.

I have a byte() (byte array) containing about 100,000 elements (a bit less,
always around 80,000-95,000). I need to read the data from it as
int16 values and compare each int16 to a reference value.

For now, I'm using a MemoryStream created from the byte() and a BinaryReader
that reads from the MemoryStream (ReadInt16). I compare each int16 to the
reference value and return false if the int16 is greater than the reference
value. If all the values are lower than the reference value, the
function returns true. If it returns after the first iteration or so, it's OK,
but if it runs through all the elements it takes way too long. In assembly it
would take far less time, because I could move the reference value
into one of the CPU's registers, leave it there as long as my loop runs, and
iterate directly over the byte() to get 2 bytes at a time.

In VB.NET the function fits in about 10 lines of code (maybe even
fewer), but it takes up to 50% of the CPU (as shown in Task Manager),
which is way too much because it runs very often. When it does not run (many
other things are running in the app), the app takes about 1%-2% of the CPU;
when this tight loop starts being called, the app gets to 30%-50% of the
CPU, which is excessive for a function that is called
about 8-10 times a second.
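
For reference, the approach described above might look roughly like the following. This is a C# reconstruction from the description, not the actual VB.NET function; the names StreamCheck and AllWithinLimit are made up for illustration.

```csharp
using System;
using System.IO;

static class StreamCheck
{
    // Returns false as soon as one 16-bit value exceeds the reference value,
    // true if every value is less than or equal to it.
    public static bool AllWithinLimit(byte[] data, short reference)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new BinaryReader(stream))
        {
            int count = data.Length / 2;   // a trailing odd byte is ignored
            for (int i = 0; i < count; i++)
            {
                // ReadInt16 consumes two bytes in little-endian order.
                if (reader.ReadInt16() > reference)
                {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Every value goes through a method call on the reader and the stream, which is the per-element overhead the replies below try to remove.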

thanks for your help

ThunderMusic
 
FYI, there's a group dedicated to performance issues;
microsoft.public.dotnet.framework.performance

I need to optimize a tight loop. Usually I would use assembly, but I don't
even know if it's available in VB.NET.

It isn't, but you can certainly write assembly code, compile to a DLL
and call that from VB.NET if you want.

You could also try using pointers in C# (unsafe code) or C++ to walk
the array.

In any case I wouldn't worry too much about the CPU utilization, more
about the time spent in the loop. Use a profiler!


Mattias
 
Mattias Sjögren said:
FYI, there's a group dedicated to performance issues;
microsoft.public.dotnet.framework.performance

Thanks a lot. I didn't know about it, because this is the first time I've had
a real performance issue, so I'll use a profiler (as advised) and try posting
there.

ThunderMusic
 
ThunderMusic said:
I need to optimize a tight loop. Usually I would use assembly, but I
don't even know if it's available in VB.NET. I'll explain the case
and maybe some of you will be able to advise me.

To answer your original question, it is not normally possible to write
inline assembly in a .NET language, because .NET assemblies are intended to
be portable between platforms. However, there are techniques by which you
can include inline CLR bytecode, which is handy when either you need to
optimize a hotspot or you need access to a CLR feature that your language
does not give access to. There is a tool available here, written by Mike
Stall, that enables this:

http://blogs.msdn.com/jmstall/archive/2005/02/21/377806.aspx
 
ThunderMusic said:
I need to optimize a tight loop. Usually I would use assembly, but I don't
even know if it's available in VB.NET. I'll explain the case and maybe some
of you will be able to advise me.

I have a byte() (byte array) containing about 100,000 elements (a bit less,
always around 80,000-95,000). I need to read the data from it as
int16 values and compare each int16 to a reference value.

For now, I'm using a MemoryStream created from the byte() and a BinaryReader
that reads from the MemoryStream (ReadInt16). I compare each int16 to the
reference value and return false if the int16 is greater than the reference
value. If all the values are lower than the reference value, the
function returns true. If it returns after the first iteration or so, it's OK,
but if it runs through all the elements it takes way too long. In assembly it
would take far less time, because I could move the reference value
into one of the CPU's registers, leave it there as long as my loop runs, and
iterate directly over the byte() to get 2 bytes at a time.

I think that using a MemoryStream and a BinaryReader is adding a lot of
overhead. Try the following code (C#, but I'm sure you can convert it
to VB.NET without too much hassle):

using System;

class Test
{
    const int Iterations = 1000;
    const int Size = 100000;

    static void Main()
    {
        // Fill the buffer with random bytes and pick a random threshold.
        byte[] array = new byte[Size];
        Random rng = new Random();
        rng.NextBytes(array);
        ushort threshold = (ushort)rng.Next(ushort.MaxValue);

        DateTime start = DateTime.Now;
        int counter = 0;
        for (int i = 0; i < Iterations; i++)
        {
            for (int j = 0; j < array.Length; j += 2)
            {
                // The parentheses matter: + binds tighter than <<.
                int value = (array[j] << 8) + array[j+1];
                if (value > threshold)
                {
                    counter++;
                }
            }
        }
        DateTime end = DateTime.Now;

        Console.WriteLine("Counter: {0}", counter);
        Console.WriteLine("Time taken: {0}", end - start);
    }
}

On my computer, that takes about 6 seconds for *10,000* iterations (of
all 50,000 shorts). That means that if you call it 10 times a second,
it would still only take about 6ms of each second.

The important line is:

int value = (array[j] << 8) + array[j+1];

I don't know what the shift operators are in VB.NET, but that's the way
I'd go. (You may need to reverse which byte is shifted, of course -
that depends on the endianness of your data.)
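
Applied to the original question (return false as soon as a value exceeds the reference), the same shifting idea could be sketched as below. This is only an illustration: ShiftCheck and AllWithinLimit are hypothetical names, and the byte order is assumed to be little-endian so that the values match what BinaryReader.ReadInt16 returns.

```csharp
using System;

static class ShiftCheck
{
    // Returns false as soon as one 16-bit value exceeds the reference value.
    public static bool AllWithinLimit(byte[] data, short reference)
    {
        // Stop before a trailing odd byte, if there is one.
        for (int i = 0; i + 1 < data.Length; i += 2)
        {
            // Assumed little-endian: low byte first, high byte second.
            // Swap the two indices if the data is big-endian.
            short value = unchecked((short)(data[i] | (data[i + 1] << 8)));
            if (value > reference)
            {
                return false;
            }
        }
        return true;
    }
}
```

The cast to short keeps the sign, so the bytes 0xFF 0xFF are read as -1 rather than 65535, the same as ReadInt16 would give.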
 
Hello,

Have you tried using Buffer.BlockCopy?

I would create an Int16 (Short) array that was long enough to hold all the 2-byte pairs in the byte array and use Buffer.BlockCopy
to just copy those bytes into the Int16 array directly. Then loop through the Int16 array instead of using the
MemoryStream/BinaryReader combination. I would think that would be faster.

Is that something that would work?

Kelly
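
A minimal C# sketch of that suggestion, assuming the caller keeps a short[] scratch buffer around and reuses it between calls. The names BlockCopyCheck, AllWithinLimit and scratch are made up for illustration. Note that Buffer.BlockCopy copies raw bytes, so the values come out in the machine's native byte order (little-endian on x86, which matches BinaryReader.ReadInt16).

```csharp
using System;

static class BlockCopyCheck
{
    // Copies the bytes into a reusable short[] and scans that array instead.
    public static bool AllWithinLimit(byte[] data, short[] scratch, short reference)
    {
        int count = data.Length / 2;   // a trailing odd byte is ignored
        if (scratch.Length < count)
        {
            throw new ArgumentException("scratch buffer is too small", "scratch");
        }

        // The last argument of Buffer.BlockCopy is a byte count, not an element count.
        Buffer.BlockCopy(data, 0, scratch, 0, count * 2);

        for (int i = 0; i < count; i++)
        {
            if (scratch[i] > reference)
            {
                return false;
            }
        }
        return true;
    }
}
```

Allocating the scratch array once (for example, sized for the largest expected input) avoids creating a new array on each of the 8-10 calls per second.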
 
Kelly Ethridge said:
Have you tried using Buffer.BlockCopy?

I would create an Int16 (Short) array that was long enough to hold
all the 2-byte pairs in the byte array and use Buffer.BlockCopy to
just copy those bytes into the Int16 array directly. Then loop
through the Int16 array instead of using the
MemoryStream/BinaryReader combination. I would think that would be
faster.

Is that something that would work?

I've just tried that, and it's about twice as fast as my "manual
shifting" solution - at least if the ushort array can be reused
(repopulated, but not recreated). Oddly enough, I'd looked at Buffer
but not worked out the best way of using it.
 
Ya, there's always a catch.

Jon Skeet said:
I've just tried that, and it's about twice as fast as my "manual
shifting" solution - at least if the ushort array can be reused
(repopulated, but not recreated). Oddly enough, I'd looked at Buffer
but not worked out the best way of using it.
 
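Use pointers!

// Needs an unsafe context (compile with /unsafe); array is an existing byte[].
fixed (byte* pbyte = array)          // pin the array so the GC cannot move it
{
    short* pshort = (short*)pbyte;   // read the same bytes as 16-bit values
}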

--
Regards,
Lloyd Dupont

NovaMind development team
NovaMind Software
Mind Mapping Software
<www.nova-mind.com>
 
Thanks a lot, I'll try both your solutions and give you feedback on how
well/fast it worked.

Thanks

ThunderMusic

Kelly Ethridge said:
Ya, there's always a catch.
 
OK, I tried the Buffer.BlockCopy solution, and it worked extremely well. In
fact, when the method runs, the CPU usage does not go up at all, not even by
1%; it stays constant. My whole app is taking between 0% and 2% of CPU,
whereas before, with the MemoryStream and BinaryReader, it was taking up to 50%
of the CPU. Now it's totally seamless when the loop runs.

Thanks a lot

ThunderMusic


ThunderMusic said:
Thanks a lot, I'll try both your solutions and give you feedback on how
well/fast it worked.

Thanks

ThunderMusic
 