Converting const char * -> System::String^ without gcnew.

  • Thread starter: DaTurk

Hi,

I have several interfaces in C++/CLI that I access from C#. My problem is
that down in the unmanaged C++, which the C++/CLI layer sits on top of, I
have a lot of c_str() calls. But all of my methods in C++/CLI return
System::String^. I originally just gcnew'd System::String^ passing in
the c_str(). But I can't really have as many gcnew's as I'm using for
overhead and for fear of leaks.

So my question is this, how can I get the char* coming from c_str()
call to return as a System::String^ without actually calling gcnew?
The reason this is an issue is that these libraries are pushed quite
hard, possibly with strings coming through more than 100 times a
second, and will most likely never be turned off. Any ideas?

Thanks in advance.
 
DaTurk said:
I originally just gcnew'd System::String^ passing in
the c_str(). But I can't really have as many gcnew's as I'm using for
overhead and for fear of leaks.

There is no leak: System::String^ is garbage collected. Even if you
wanted to, you couldn't control the memory deallocation of managed
strings. It's always managed by the .NET framework.

The overhead is a given; you can't avoid it. char* is an unmanaged string
with 8 bits per character, while System::String is a managed string with
16 bits per character. At a minimum, the string's value must be copied
and widened from one representation to the other. Even when you assign
one std::string to another, a byte-for-byte copy has to be made. And
std::string itself calls malloc and free internally, which are typically
slower than gcnew. When enough memory is available, gcnew can be as fast
as incrementing a pointer.
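
As a rough illustration of the "incrementing a pointer" remark, here is a minimal bump-allocator sketch in standard C++. It is not the CLR's allocator; the class name BumpArena and its members are made up for this example. It only shows why allocation from a pre-reserved block can be so cheap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: a fixed-size arena where allocating
// means advancing an offset. There is no per-object free; a real garbage
// collector reclaims and compacts memory in a separate step.
class BumpArena {
public:
    explicit BumpArena(std::size_t bytes) : storage_(bytes), next_(0) {}

    // Returns a pointer to `bytes` bytes, or nullptr if the arena is full.
    void* allocate(std::size_t bytes)
    {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t remaining = storage_.size() - next_;
        if (bytes > remaining) return nullptr;
        // Round the request up so the next allocation stays aligned.
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (rounded > remaining) return nullptr;
        void* p = storage_.data() + next_;
        next_ += rounded;               // the whole cost of an allocation
        return p;
    }

    std::size_t used() const { return next_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t next_;
};
```

The expensive part of a garbage-collected heap is the collection, not the allocation, which is why a short-lived string per call is usually less costly than it looks.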

DaTurk said:
So my question is this, how can I get the char* coming from c_str()
call to return as a System::String^ without actually calling gcnew?

You can not.

DaTurk said:
The reason this is an issue is that these libraries are pushed quite
hard

If the native->managed transition is proven to cause a major
performance problem, you have to write your code 100% managed, or 100%
native. In fact, std::string itself is not nearly as fast as working
with char* directly (see strncpy, itoa, snprintf, etc.).

Tom
 
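DaTurk said:
So my question is this, how can I get the char* coming from c_str()
call to return as a System::String^ without actually calling gcnew?

Tamas Demjen said:
You can not.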

So what if you can't make a String without gcnew? You can do even better.

Require your caller to pass you either a preallocated System::Char[]
(cli::array<wchar_t>^) or a StringBuilder. Then you needn't reallocate
memory on each call.

But memory allocation in .NET is optimized and should be a very cheap
operation, unless you are allocating large objects in a loop. Using
String::Concat iteratively to slowly build a large string is very bad; use
a StringBuilder for that sort of thing and reserve a sufficient Capacity
up front.

Tamas Demjen said:
If the native->managed transition is proven to cause a major performance
problem, you have to write your code 100% managed, or 100% native. In
fact, std::string itself is not nearly as fast as working with char*
directly (see strncpy, itoa, snprintf, etc.).

But of course you may work directly with a managed array and interior
pointers with no extra overhead, and only a little extra if you need to
pin_ptr it to pass to a native function.
 
Ben said:
Require your caller to pass you either a preallocated System::Char[]
(cli::array<wchar_t>^) or a StringBuilder.

That sounds reasonable. It still requires a copy operation, which could
be slower than what you save by eliminating gcnew from the loop.

Ben said:
But of course you may work directly with a managed array and interior
pointers with no extra overhead

I like this idea. So you recommend returning an IntPtr to the char*:

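// C++/CLI library:
#include <string>
using namespace System;

struct Unmanaged
{
    Unmanaged() : some_string("test") { }
    std::string some_string;
};

public ref class Lib
{
public:
    Lib() : unmanaged(new Unmanaged()) { }
    // The destructor (Dispose) forwards to the finalizer.
    ~Lib() { this->!Lib(); }
    // Null the pointer so a second call is harmless.
    !Lib() { delete unmanaged; unmanaged = nullptr; }
    // The void* -> IntPtr conversion is explicit, so construct it by hand.
    IntPtr GetString() { return IntPtr(&unmanaged->some_string[0]); }
private:
    Unmanaged* unmanaged;
};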

// C# application:
unsafe void ProcessString()
{
    using (Lib lib = new Lib())
    {
        IntPtr ip = lib.GetString();
        byte* c = (byte*)ip.ToPointer();
        // access Unmanaged::some_string's characters directly from C#;
        // the pointer is only valid until lib is disposed
    }
}

This can only be used with the unsafe keyword and the /unsafe compiler
switch.

Tom
 
Tamas Demjen said:
Ben said:
Require your caller to pass you either a preallocated System::Char[]
(cli::array<wchar_t>^) or a StringBuilder.

That sounds reasonable. It still requires a copy operation, which could be
slower than what you save by eliminating gcnew from the loop.

How? gcnew will require the same copy operation, as well as creating an
additional garbage-collected object. But the C++ code can most likely work
with Unicode directly, and avoid the copy operation.

Tamas Demjen said:
I like this idea. So you recommend returning an IntPtr to the char*:

No, I recommended totally avoiding any allocation inside the function, and
having the caller provide an existing buffer, so that one buffer allocation
can serve multiple calls into the C++ code.

Let the caller pass in a (C#) byte[] if working with ASCII data, or a char[]
if working with Unicode. Getting a System::String of the data eventually
involves a new instance, because String objects are immutable -- each
distinct content requires a distinct instance. Of course, you can also try
to share string instances across multiple calls that return the same content
(this also helps future comparison).
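
Sharing instances across calls could look roughly like the following standard C++ sketch. StringCache is a made-up name, std::u16string stands in for the immutable managed string, and ASCII input is assumed. A C++/CLI version would hold String^ values in a managed dictionary instead.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache: identical content always yields the same shared,
// immutable 16-bit string, so the widening copy happens once per
// distinct value rather than once per call.
class StringCache {
public:
    std::shared_ptr<const std::u16string> get(const char* s)
    {
        assert(s != nullptr);
        auto it = cache_.find(s);       // builds a temporary std::string key
        if (it != cache_.end()) return it->second;

        std::u16string wide;
        for (const char* p = s; *p != '\0'; ++p)
            wide.push_back(static_cast<char16_t>(static_cast<unsigned char>(*p)));

        auto shared = std::make_shared<const std::u16string>(std::move(wide));
        cache_.emplace(s, shared);
        return shared;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::unordered_map<std::string, std::shared_ptr<const std::u16string>> cache_;
};
```

Two results can then be compared by pointer before falling back to a character-by-character comparison. The cache in this sketch grows without bound, so it only pays off when the set of distinct strings is small and repeats often.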
 
Ben Voigt wrote:
Let the caller pass in a (C#) byte[] if working with ASCII data, or a char[]
if working with Unicode. Getting a System::String of the data eventually
involves a new instance, because String objects are immutable -- each
distinct content requires a distinct instance. Of course, you can also try
to share string instances across multiple calls that return the same content
(this also helps future comparison).

What about marshaling? What if you marshal the c_str() result to a
std::string? Then couldn't you just do something like String^
test = std::string test?
 
DaTurk said:
What about marshaling? What if you marshal the c_str() result to a
std::string? Then couldn't you just do something like String^
test = std::string test?

First, a std::string is not compatible in any way with a System::String^;
the conversion has to go through char* first. Second, marshaling is an
expensive operation that requires multiple copies, and would only be
appropriate for interprocess communication. With C++/CLI, the managed and
unmanaged code share the same process, the same memory space, even the same
thread, so marshaling is definitely not needed.
 