CLI string question

news.chi.sbcglobal.net

I have a question about string literals with C++/CLI.

Is there a difference between the following two lines?

String ^s1 = gcnew String("Hello");
String ^s2 = gcnew String(L"Hello");

I have been playing around with the compiler options for "Character Set" and
"Common Language Runtime Support" and those two lines compile fine. I am
thinking that when compiling with /clr, "Hello" is a string of Unicode
characters and the L prefix is no longer needed. Am I wrong about that?
 

At first glance, one might think that the first line creates a String using
the constructor String(const char*) and the second uses
String(const wchar_t*). Both constructors do exist, and they were
used in MC++. I wrote a simple test program to see what happens; here is
the relevant IL:
IL_0004: ldstr "Hello"
IL_0009: stloc.1
IL_000a: ldstr "Hello"
IL_000f: stloc.0

As you can see, the compiler optimizes the constructor call away: both lines
compile to a plain ldstr. This is helpful because string interning can then
ensure that there is only one managed copy of the literal.
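
To illustrate what interning means, here is a minimal sketch in standard C++ (not C++/CLI). It is only an analogy and makes no claim about how the CLR implements its intern pool; the function name intern is hypothetical. Equal strings are stored once, and every request for the same text returns the address of that single copy.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical intern pool: returns the address of the single stored copy
// of a string. This is an analogy, not the CLR's implementation.
inline const std::string* intern(const std::string& s) {
    static std::unordered_set<std::string> pool;
    // insert() leaves the pool unchanged if an equal string is already
    // present and returns an iterator to the existing element.
    return &*pool.insert(s).first;
}
```

Calling intern("Hello") twice yields the same pointer, which mirrors the idea that two identical managed literals can end up referring to one string object.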

I would use neither alternative. Have a look at Stan Lippman's blog: in
C++/CLI, there is a trivial conversion from a string literal to a String^.
Look at this code:

void f( String^ );      // managed overload
void f( const char* );  // native overload

void bar()
{
    f( "abc" );  // resolves to f( String^ )
}

In this code, f(String^) will be called. "abc" is of type const char[4]. In
C++/CLI, the conversion from const char[4] to String^ is defined to be better
than the conversion from const char[4] to const char*.
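
For contrast, here is a sketch of the same overload set in standard C++ (compiled without /clr). ManagedLike is a hypothetical stand-in for a string class with an implicit constructor; it is not System::String, and chosen_overload is a made-up helper name. In plain C++ the literal prefers const char*, because array-to-pointer decay is a standard conversion and outranks a user-defined conversion. That is why the C++/CLI ranking described above is a special rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a managed string type; NOT System::String.
struct ManagedLike {
    std::string text;
    ManagedLike(const char* s) : text(s) {}  // implicit user-defined conversion
};

// Each overload reports which one was selected.
inline const char* f(ManagedLike) { return "f(ManagedLike)"; }
inline const char* f(const char*) { return "f(const char*)"; }

// Returns the name of the overload chosen for a narrow string literal.
inline std::string chosen_overload() {
    // Array-to-pointer decay is a standard conversion, so f(const char*)
    // wins over the overload that needs the ManagedLike constructor.
    return f("abc");
}
```

In standard C++, chosen_overload() reports f(const char*), the opposite of what the C++/CLI snippet does with String^.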

Marcus Heege
 
Hi, Marcus.

From what you said and from what I read on Stan Lippman's blog, L"Hello"
(unicode because of the prefix) and "Hello" (ansi) are both going to have
System::String as their underlying types anyway. I guess I don't have to
worry about it.

Woo Hoo!




 
Jeff said:
Hi, Marcus.

From what you said and from what I read on Stan Lippman's blog, L"Hello"
(unicode because of the prefix) and "Hello" (ansi) are both going to have
System::String as their underlying types anyway.

Strictly speaking, string literals still have native types. Compile and run
the code below.

// stringTypes.cpp
// compile with "cl /EHsc stringTypes.cpp"

#include <typeinfo>
#include <iostream>
using namespace std;

int main()
{
    // name() returns an implementation-defined spelling of the type
    cout << typeid("123").name() << endl;
    cout << typeid(L"123").name() << endl;
}
////////////////////////////////////

If you run the app, you will get the output

char const [4]
wchar_t const [4]
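
Because the spelling returned by typeid(...).name() differs between compilers, the same fact can also be checked at compile time. The sketch below is standard C++11 and assumes nothing about /clr; literal_length is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to a
// const array whose size includes the terminating null character.
static_assert(std::is_same<decltype("123"), const char (&)[4]>::value,
              "a narrow literal has type const char[4]");
static_assert(std::is_same<decltype(L"123"), const wchar_t (&)[4]>::value,
              "a wide literal has type const wchar_t[4]");

// Hypothetical helper: number of array elements in a literal,
// including the terminating null character.
template <typename T, std::size_t N>
constexpr std::size_t literal_length(const T (&)[N]) { return N; }
```

If either static_assert failed, the file would not compile, so a successful build confirms the native array types.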

But there is a trivial conversion to String^, which allows you to use a
literal like a managed string in most cases.

Jeff said:
I guess I don't have to worry about it.

I am aware of only two scenarios in which you have to worry about it:

1) If both f(char[4]) and f(String^) exist, f("123") will call f(char[4]).
2) If you throw "123", it will not be caught as System::String^.

Both scenarios are extremely rare. Therefore, your conclusion "I guess I
don't have to worry about it." is fine with me.
 
Yes, but you are using standard C++, not C++/CLI. My question is about how
those string literals map to managed types. So if you are not compiling with
/clr then the example does not apply because it isn't .NET.

From what I read last night, I believe that C++/CLI prefers to map those
string literals to System::String^ instead of const char* or const wchar_t*.
So I think what I said was correct.
 
Jeff Suddeth said:
Yes, but you are using standard C++, not C++/CLI. My question is about how
those string literals map to managed types. So if you are not compiling
with /clr then the example does not apply because it isn't .NET.

From what I read last night, I believe that C++/CLI prefers to map those
string literals to System::String^ instead of const char* or const
wchar_t*. So I think what I said was correct.

Apart from one exception, which can be ignored in practice, you are right.
The code below shows the exception. In the betas there was one other
exception, but it seems to have been changed.

In short: consider a string literal to be either a managed string literal
or a native string literal, depending on the context in which it is used.

// stringTypes.cpp
// compile with "cl /clr stringTypes.cpp"

#include <typeinfo>

using namespace System;

int main()
{
    // Even under /clr, typeid reports the native array types of the literals.
    Console::WriteLine(gcnew String(typeid("123").name()));
    Console::WriteLine(gcnew String(typeid(L"123").name()));

    try
    {
        throw "123"; // thrown as a native exception, not as a String^
    }
    catch (String^ str)
    {
        Console::WriteLine("This text will not be written");
    }
    catch (System::Runtime::InteropServices::SEHException^ ex)
    {
        Console::WriteLine("This output will prove that a string literal is not caught by String^");
        Console::WriteLine("String literals are thrown as native exceptions.");
    }
}
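
The native side of this behavior can be sketched in standard C++ (no /clr). Here std::string plays the role of a string class purely as an analogy to String^, and catch_literal is a hypothetical helper name. The thrown literal decays to a const char*, and handler matching applies no converting constructors, so only the const char* handler fits.

```cpp
#include <cassert>
#include <string>

// Throws a narrow string literal and reports which handler caught it.
inline std::string catch_literal() {
    try {
        throw "123";  // the array decays, so the exception object is a const char*
    } catch (const std::string&) {
        // Not reached: no user-defined conversion is applied when
        // matching an exception against a handler.
        return "std::string";
    } catch (const char* s) {
        return std::string("const char*: ") + s;
    }
}
```

This matches the point of the /clr program above: a thrown literal keeps its native type, so a handler for a string class does not see it.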
 