Very bad managed C++ performance

  • Thread starter: Omid Hodjati

Omid Hodjati

Hi all,
I implemented an encryption algorithm in C#, native C++,
and managed C++, then measured the CPU time each
implementation used to execute it. The managed C++ version
was the worst! It is 20 times slower than the native
version and 6 times slower than the C# version!
Because I wanted the automatic memory management offered by
the .NET Framework, I used the managed classes available;
e.g. I used the "Array" class instead of a simple
"unsigned char []" construct. This replacement seems to be
the main cause of the slowdown. (I won't even get into the
inability of Managed Extensions to support jagged arrays
for now... ;-)
- Do I have to pay such a big penalty for managed
execution?
- Does anyone have a better replacement for
"unsigned char []" (other than __gc[], which ultimately
yields Array too)?

Thanks in advance
 
Omid said:
The managed C++ version was the worst! It is 20 times
slower than the native version and 6 times slower than the
C# version!

Hi Omid,
Based on these results, the difference between using Managed Extensions for
C++ and C# is suspicious. The performance of the two should be very similar.

- Do I have to pay such a big penalty for managed
execution?

Of course, the first question whenever talking about performance -- have you
profiled the different applications? It is possible that the arrays may be a
contributing factor to the slowdown, but they may not be the only root cause.
Certainly, profiling the C# version and the C++ (managed) version should
illustrate why there is such a big difference.

- Does anyone have a better replacement for
"unsigned char []" (other than __gc[], which ultimately
yields Array too)?

Without knowing exactly what you are using these arrays for, it's not easy
to suggest an alternative. I do have a few tips for you when using C++. If
verifiability is not an issue, you can use interior pointers to iterate over
the array. This saves you from having bounds checking every time the array
is accessed. If you translate the C# code directly into C++, it will not use
interior pointers and will incur bounds checking.
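
To make the bounds-checking point concrete, here is a minimal sketch in standard C++ rather than Managed C++ (an assumption, made so that it builds with any C++ compiler). It is only an analogy: `xor_checked` uses `std::vector::at`, which validates the index on every access, much as an indexed managed array access does, while `xor_unchecked` walks raw pointers and validates nothing. All function names here are hypothetical and do not come from the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checked variant: at() validates every index and throws std::out_of_range
// on a bad one, roughly analogous to an indexed managed array access.
void xor_checked(const std::vector<unsigned char>& a,
                 const std::vector<unsigned char>& b,
                 std::vector<unsigned char>& res,
                 std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i)
        res.at(i) = static_cast<unsigned char>(a.at(i) ^ b.at(i));
}

// Unchecked variant: pointer iteration with no per-element validation.
// The caller must guarantee that all three buffers hold at least len bytes.
void xor_unchecked(const unsigned char* a,
                   const unsigned char* b,
                   unsigned char* res,
                   std::size_t len)
{
    for (const unsigned char* end = a + len; a != end; ++a, ++b, ++res)
        *res = static_cast<unsigned char>(*a ^ *b);
}

// Wrapper: validate the sizes once, outside the loop (debug builds only),
// then run the unchecked loop.
void xor_fast(const std::vector<unsigned char>& a,
              const std::vector<unsigned char>& b,
              std::vector<unsigned char>& res)
{
    assert(a.size() == b.size() && res.size() == a.size());
    xor_unchecked(a.data(), b.data(), res.data(), a.size());
}
```

The wrapper shows the usual compromise: one size check before the loop instead of one per element. Whether the difference matters in a given program is something only a profiler can tell.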

I hope that gives you some ideas. Cheerio!
 
Thank you for your attention, Mr. Bray.
The following is one of the simplest functions in the program.
It XORs two arrays of bytes. It is very simple, yet
the managed version runs 20 times slower than the unmanaged
version. As you mentioned, range checking on the array may
be the main problem. I have tried __gc[] too. ILDasm shows
that both are compiled the same way, so the result
is the same.
I have profiled my application (using Rational
Quantify). The tool shows that the Array class is not a big
factor in performance.

- What could be wrong with the approach I have used in the
managed version?

- How can a managed pointer help me bypass range
checking? (The restrictions on using managed
pointers are somewhat confusing.)

- I would appreciate revised source code for this sample
function.

Thanks in advance,

Omid Hodjati.


////*******C++ unmanaged version******************///
void rvtEncrypt::XOR(unsigned char* ary1, unsigned char* ary2,
                     int len, unsigned char* res)
{
    for (int j = 0; j < len; ++j)
        res[j] = ary1[j] ^ ary2[j];
}

////*********C++ managed version***************///
void ManagedEncryption::XOR(Array* ary1, Array* ary2, int len, Array* res)
{
    for (int i = 0; i < len; ++i)
    {
        Buffer::SetByte(res, i, Buffer::GetByte(ary1, i) ^
                                Buffer::GetByte(ary2, i));
    }
}

//************C# version **********************////
private void XOR(byte[] Ary1, byte[] Ary2, int Len, byte[] Res)
{
    for (int i = 0; i < Len; ++i)
        Res[i] = (byte)(Ary1[i] ^ Ary2[i]);
}
 
Hello,

What stops you from writing exactly the same code as the C# version?

For example, in MC++:

void XOR(Byte Ary1[], Byte Ary2[], int Len, Byte Res[])
{
    for (int i = 0; i < Len; ++i)
        Res[i] = (Ary1[i] ^ Ary2[i]);
}

I get the same results for C# and MC++ in VS2003.

In your original code you at the very least add the overhead of calling
SetByte/GetByte/GetByte.
 
Thank you for your attention, Leon.

My biggest problem with C++ is jagged arrays. We cannot
create jagged arrays in Managed C++. In other parts of
the code I need to create jagged arrays (e.g. an array of
keys), but Managed C++ stops me from creating such structures
on the managed heap. I can use "Byte [][] keys" in C#, but
there is currently no equivalent in Managed C++.
Multidimensional arrays cannot fix the problem, because I
want to pass "one row" of the array to a function (such as
XOR), and I think it is impossible to pass one row of the array.

- I would appreciate a better solution.

Thanks in advance
 
Thank you, dear Brandon,

Your solution was really a cure! It was simple and
straightforward. Profiling shows that your version
executes much faster. But...

1- Using this approach in my whole program requires
some more structure. I have an array of keys in the
program, and I want to pass the keys one by one to some
functions (such as the XOR you saw before). I create the
key array using "Byte [][] keys" in C#, and I can create it
the same way in unmanaged code too. But I don't know the
equivalent structure in the managed world; I cannot create
jagged arrays in Managed C++. Do you know of any replacement
structure that does not lead to the inefficient Array structure
(something to replace "Array* __gc[]", I mean)?

2- Another question, dear Brandon: can you help me understand what
happens when I compile my unmanaged C++ code using the /clr
option? When I compile my unmanaged C++ project with /clr,
the performance of the algorithm degrades by 20%. I have no
managed code in this project, so I expect no
extra JIT or verification penalties. I have excluded
startup time from my measurements too, so I do not
expect that overhead. I do not allocate memory either, so I
do not expect GC overhead... It is really confusing
to me.

3- I appreciate your attention and help in advance.

-----Original Message-----
Omid said:
Thank you for your attention, Mr. Bray.

Hi Omid, you are welcome! :-)

I have profiled my application (using Rational
Quantify). The tool shows that the Array class is not a big
factor in performance.

The array class probably won't show up in a profiler, but the function
containing the loop you wrote might. Does that show up?

- What could be wrong with the approach I have used in the
managed version?

Well, as Leon pointed out, using the strongly typed arrays in Managed C++
helps tremendously. The equivalent of "int[] x" in C# is "int x __gc[]" in
C++. System::Array is the base class of all arrays. By using System::Array
as the type, you're paying an overhead for dynamic type checking every time
you use it.

- How can a managed pointer help me bypass range
checking? (The restrictions on using managed
pointers are somewhat confusing.)

Here's an example that uses interior pointers to avoid the bounds checking.
I'd only use this if profiling determines this is a hot spot. By avoiding
bounds checking, you run the danger of passing the wrong length into the
function (or more problematic, passing arrays of different length into the
function).

void rvtEncrypt::XOR(unsigned char ary1 __gc[],
                     unsigned char ary2 __gc[],
                     int len,
                     unsigned char res __gc[])
{
    // Interior pointers to the first element of each managed array
    unsigned char __gc* pary1 = &ary1[0];
    unsigned char __gc* pary2 = &ary2[0];
    unsigned char __gc* pres = &res[0];

    for (int j = 0; j < len; ++j)
    {
        // Dereference the pointers, not the array handles
        *pres = (*pary1) ^ (*pary2);
        pary1++;
        pary2++;
        pres++;
    }
}

- I would appreciate revised source code for this sample
function.

Because arrays contain their length, passing the length into a function is
actually not necessary. For example, "ary1->Length" returns the length of
ary1.
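
The same idea, taking the length from the array rather than passing it separately, can be sketched in standard C++ (again an assumption; the code in this thread is Managed C++, where the property is `ary1->Length`). The hypothetical `xor_bytes` below reads the length from its inputs and rejects mismatched sizes up front, so there is no separate `len` to get wrong.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: the length comes from the containers themselves,
// so there is no separate len parameter that could disagree with them.
std::vector<unsigned char> xor_bytes(const std::vector<unsigned char>& a,
                                     const std::vector<unsigned char>& b)
{
    if (a.size() != b.size())
        throw std::invalid_argument("xor_bytes: inputs differ in length");

    std::vector<unsigned char> res(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        res[i] = static_cast<unsigned char>(a[i] ^ b[i]);

    assert(res.size() == a.size());  // postcondition: same length as inputs
    return res;
}
```

Checking the sizes once before the loop also addresses the danger mentioned earlier of passing arrays of different lengths into an unchecked loop.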

Hope that helps. Cheerio!

--
Brandon Bray Visual C++ Compiler
This posting is provided AS IS with no warranties, and confers no rights.


 
So your main question is about jagged arrays, and we can set aside
your previous code snippet, which showed a 1D array.

Multidimensional arrays cannot fix the problem, because I
want to pass "one row" of the array to a function (such as
XOR), and I think it is impossible to pass one row of the array.

Here is one way:

void XOR(Byte Ary1[,], Byte Ary2[,], int row, int Len, Byte Res[])
{
    for (int i = 0; i < Len; ++i)
        Res[i] = (Ary1[row,i] ^ Ary2[row,i]);
}

This runs at least 2x-2.5x slower than the 1D array version, but many times
faster than Buffer::SetByte/Buffer::GetByte^Buffer::GetByte.

This runs as fast as (or faster than) the original C# version with a 1D array:

void XOR(Byte Ary1[,], Byte Ary2[,], int row, int Len, Byte Res[])
{
    // Pin the start of each row, then index through the pinned pointers
    unsigned char __pin* pinAry1 = &Ary1[row,0];
    unsigned char __pin* pinAry2 = &Ary2[row,0];
    for (int i = 0; i < Len; ++i)
        Res[i] = (pinAry1[i] ^ pinAry2[i]);
}

I'm sure there are other ideas depending on your design.
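
One such idea, sketched in standard C++ rather than Managed C++ (an assumption, chosen so the example builds with any compiler): keep all keys in one contiguous row-major buffer and hand a function a pointer to the start of one row. The `KeyTable` type and the `xor_row` function are hypothetical names. The row pointer plays the same role as `&Ary1[row,0]` in the pinned version above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: `rows` keys of `cols` bytes each, stored contiguously.
struct KeyTable {
    std::size_t rows;
    std::size_t cols;
    std::vector<unsigned char> bytes;  // row-major: (r, c) is bytes[r * cols + c]

    KeyTable(std::size_t r, std::size_t c) : rows(r), cols(c), bytes(r * c) {}

    // Pointer to the first byte of one row; valid for `cols` elements.
    unsigned char* row(std::size_t r) {
        assert(r < rows);
        return bytes.data() + r * cols;
    }
    const unsigned char* row(std::size_t r) const {
        assert(r < rows);
        return bytes.data() + r * cols;
    }
};

// XOR row r of each table into res, which must hold at least a.cols bytes.
void xor_row(const KeyTable& a, const KeyTable& b, std::size_t r,
             unsigned char* res)
{
    assert(a.cols == b.cols);
    const unsigned char* pa = a.row(r);
    const unsigned char* pb = b.row(r);
    for (std::size_t i = 0; i < a.cols; ++i)
        res[i] = static_cast<unsigned char>(pa[i] ^ pb[i]);
}
```

This sketch assumes every key has the same length. Truly ragged rows would need a per-row offset table on top of the flat buffer.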

Hope that helps


 