Special character

  • Thread starter: Guest

Hi

Can anyone tell me how I can convert a StringBuilder variable containing some special characters such as CHAR(27) (which is returned from an API DLL) into a string variable (which is in Unicode) correctly for further processing?

Regards
RR
 
RR said:
Can anyone tell me how I can convert a StringBuilder variable
containing some special characters such as CHAR(27) (which is
returned from an API DLL) into a string variable (which is in
Unicode) correctly for further processing?

Just using ToString() will do it. Note that a StringBuilder only
contains Unicode as well - if you've got your data into there properly
to start with, it should all be fine.
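As a minimal sketch of that point (the class and method names are made up for illustration), a control character such as ESC, char 27, is an ordinary UTF-16 character inside a StringBuilder, and ToString() copies it across unchanged:

```csharp
using System.Text;

static class ToStringDemo
{
    // Builds a buffer containing an ESC (char 27) and copies it into a string.
    public static string BuildWithEsc()
    {
        var sb = new StringBuilder();
        sb.Append("AB").Append((char)27).Append("CD");
        return sb.ToString(); // five characters; index 2 is ESC
    }
}
```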
 
Hi,

IMO, it is best to use a String, not StringBuilder, when you are doing this
sort of parsing. The advantages of StringBuilder over String (the immutable
nature of strings) are lost as soon as you use ToString, then start
manipulating data at the character level.

StringBuilder is best applied when you want to either append or delete
substrings, but to do no other manipulation (and, certainly NOT if you want
to search for a substring).

Dick

--
Richard Grier (Microsoft Visual Basic MVP)

See www.hardandsoftware.net for contact information.

Author of Visual Basic Programmer's Guide to Serial Communications, 3rd
Edition ISBN 1-890422-27-4 (391 pages) published February 2002.
 
Dick Grier said:
IMO, it is best to use a String, not StringBuilder, when you are doing this
sort of parsing. The advantages of StringBuilder over String (the immutable
nature of strings) are lost as soon as you use ToString, then start
manipulating data at the character level.

StringBuilder is best applied when you want to either append or delete
substrings, but to do no other manipulation (and, certainly NOT if you want
to search for a substring).

I don't see that there's anything wrong with doing other manipulation
with a StringBuilder. For example:

o Setting individual characters
o Replacing all occurrences of a specified character or string with
another

I'd agree that searching for substrings isn't really appropriate there,
but there's no problem with the above.
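A small sketch of the two operations listed above, assuming nothing beyond the standard StringBuilder indexer and Replace(char, char); the class name and sample data are illustrative only:

```csharp
using System.Text;

static class InPlaceEdits
{
    public static string Demo()
    {
        var sb = new StringBuilder("a\u001Bb\u001Bc");
        sb[0] = 'A';               // set an individual character via the indexer
        sb.Replace('\u001B', ' '); // replace every ESC with a space, in place
        return sb.ToString();      // "A b c"
    }
}
```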
 
I use StringBuilder to get the "manipulated" value for that variable from the DLL function. I know that using ToString() will convert the value to a string in Unicode, but the problem is that the value returned to the StringBuilder variable from the DLL function seems to be in ASCII codes, so it cannot be read into the string variable properly. Is there any solution?

RR

----- Jon Skeet [C# MVP] wrote: ----

RR said:
Can anyone tell me how I can convert a StringBuilder variable
containing some special characters such as CHAR(27) (which is
returned from an API DLL) into a string variable (which is in
Unicode) correctly for further processing?

Just using ToString() will do it. Note that a StringBuilder only
contains Unicode as well - if you've got your data into there properly
to start with, it should all be fine.
 
Hi,

Parsing is the issue, not replacement or deletion. As soon as you use the
ToString method... You have created a string. And to parse you need a
string. Usually, though not always, a parsing operation requires multiple
strings/substrings. All the value of the StringBuilder comes from its NOT
being a string, and as soon as you have to convert it to one to use it, that
value has expired.

I use a StringBuilder object when I am appending new data (for display, for
example). However, if I have to fiddle with the data, perhaps to filter out
special characters, or equivalent, then, as the Sopranos might say,
"Forgedaboudit."

Dick
 
Dick Grier said:
Parsing is the issue, not replacement or deletion.

Certainly if you *do* do any parsing, StringBuilder is inappropriate. I
see nothing in the original post which talks about parsing though.

Dick Grier said:
As soon as you use the
ToString method... You have created a string. And to parse you need a
string. Usually, though not always, a parsing operation requires multiple
strings/substrings. All the value of the StringBuilder comes from its NOT
being a string, and as soon as you have to convert it to one to use it, that
value has expired.

Indeed, although I'm not sure how that's relevant to the original
problem.

Dick Grier said:
I use a StringBuilder object when I am appending new data (for display, for
example). However, if I have to fiddle with the data, perhaps to filter out
special characters, or equivalent, then, as the Sopranos might say,
"Forgedaboudit."

Filtering out certain characters is fine within a StringBuilder -
that's what its Replace methods are for. Why would you want to convert
to a string before doing the replacements? (This is particularly
relevant when you need to do multiple replacements - it's more
efficient to do them "inline" in a StringBuilder than to build a new
string for each replacement.)
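To illustrate the multiple-replacement case, here is a hedged sketch; the particular characters being filtered are an assumption chosen for the example, not something from the original post. StringBuilder.Replace returns the same builder, so the calls can be chained, and passing an empty replacement string removes every occurrence:

```csharp
using System.Text;

static class Filter
{
    // Cleans the buffer in place, then takes one string copy at the end.
    public static string Clean(StringBuilder sb)
    {
        sb.Replace("\u001B", "")  // drop every ESC character
          .Replace("\r\n", "\n")  // normalise line endings
          .Replace('\t', ' ');    // turn tabs into spaces
        return sb.ToString();
    }
}
```

Doing the same with String.Replace would produce an intermediate string for each step.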
 
RR said:
I use StringBuilder to get the "manipulated" value for that variable
from the DLL function. I know that using ToString() will convert the
value to string in UniCode

No, it won't. It will just copy the *existing* unicode data from the
StringBuilder into a new string.

RR said:
but the problem is that the value returned
to the StringBuilder variable from the DLL function seems to be in
ASCII codes so that it cannot be read into the string variable
properly. So will there be any solution?

The StringBuilder will contain Unicode characters, period. That's what
it does.

Now, if your DLL is incorrectly adding characters to the StringBuilder,
that's a different matter - and you should look at your interop call,
not manipulating the contents of the StringBuilder afterwards, if
that's the problem.
 
The DLL function is as follows:

ic_execute(input_command as LPSTR, input_command_length as LPLONG, text_buffer as LPSTR, text_buffer_len as LPLONG, text_len as LPLONG, return_code as LPLONG, return_code_2 as LPLONG, cond_code as LPLONG)

It seems that the returned value (i.e. text_buffer in the above function) for the StringBuilder is not in Unicode format (I'm not sure if it is in UTF-8 format), as part of the string (e.g. a special character) cannot be stored properly. Is there any way to convert it correctly into a string variable (which is in Unicode format)?

----- Jon Skeet [C# MVP] wrote: ----

RR said:
I use StringBuilder to get the "manipulated" value for that variable
from the DLL function. I know that using ToString() will convert the
value to string in UniCode

No, it won't. It will just copy the *existing* unicode data from the
StringBuilder into a new string.

RR said:
but the problem is that the value returned
to the StringBuilder variable from the DLL function seems to be in
ASCII codes so that it cannot be read into the string variable
properly. So will there be any solution?

The StringBuilder will contain Unicode characters, period. That's what
it does.

Now, if your DLL is incorrectly adding characters to the StringBuilder,
that's a different matter - and you should look at your interop call,
not manipulating the contents of the StringBuilder afterwards, if
that's the problem.
 
Hi Jon,
>> Can anyone tell me how I can convert a StringBuilder variable containing
some special characters such as CHAR(27) (which is returned from an API DLL)
into a string variable (which is in Unicode) correctly for further
processing
<<

The search for an EOF character is a standard step in parsing. This often
is done in communications streams -- the further processing is the
remainder.

I see nothing in the question that IS NOT parsing. Of course, I might be on
the wrong track.

Dick

 
RR said:
The DLL function is as follows:

ic_execute(input_command as LPSTR, input_command_length as LPLONG,
text_buffer as LPSTR, text_buffer_len as LPLONG, text_len as LPLONG,
return_code as LPLONG, return_code_2 as LPLONG, cond_code as LPLONG)

It seems that the returned value (i.e. text_buffer in the above
function) for the StringBuilder is not in unicode format (I'm not
sure if it is in UTF-8 format) as part of the string cannot be stored
properly (e.g. special character). Is there any way to convert it
correctly into a string variable (which is in Unicode format)?

I'm afraid I don't know much about interop myself - but I suggest you
ask in an interop newsgroup, as they're likely to come across this
problem all the time.
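For what it's worth, one common way to take control of the conversion is to receive text_buffer as raw bytes and decode it explicitly. The following is only a sketch under loud assumptions: the DLL file name "ic_api.dll" is invented, the void return type and default calling convention are guesses, and text_len is assumed to be the number of bytes written. Only the parameter list comes from the signature quoted above (LPSTR as ANSI text, LPLONG as ref int).

```csharp
using System.Runtime.InteropServices;
using System.Text;

static class IcInterop
{
    // ASSUMPTIONS: "ic_api.dll", the void return type and the calling
    // convention are hypothetical; check the vendor's header before use.
    [DllImport("ic_api.dll", CharSet = CharSet.Ansi)]
    private static extern void ic_execute(
        string inputCommand, ref int inputCommandLength,
        [In, Out] byte[] textBuffer, ref int textBufferLen, ref int textLen,
        ref int returnCode, ref int returnCode2, ref int condCode);

    // Decodes the first 'length' bytes with an explicitly chosen encoding, so
    // ESC (27) and any bytes above 127 are converted deliberately.
    public static string Decode(byte[] buffer, int length, Encoding encoding)
    {
        return encoding.GetString(buffer, 0, length);
    }

    public static string Execute(string command, Encoding encoding)
    {
        var buffer = new byte[4096]; // arbitrary capacity for this sketch
        int commandLen = command.Length;
        int bufferLen = buffer.Length;
        int textLen = 0, rc = 0, rc2 = 0, cond = 0;
        ic_execute(command, ref commandLen, buffer, ref bufferLen,
                   ref textLen, ref rc, ref rc2, ref cond);
        // Return codes are ignored here; real code should check them.
        return Decode(buffer, textLen, encoding);
    }
}
```

Which encoding to pass depends on what the DLL actually writes: Encoding.ASCII turns bytes above 127 into '?', while Encoding.UTF8 or a specific code page may be the right choice if the buffer holds non-ASCII text.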
 
Dick Grier said:
>> Can anyone tell me how I can convert a StringBuilder variable containing
some special characters such as CHAR(27) (which is returned from an API DLL)
into a string variable (which is in Unicode) correctly for further
processing
<<

The search for an EOF character is a standard step in parsing. This often
is done in communications streams -- the further processing is the
remainder.

I see nothing in the question that IS NOT parsing. Of course, I might be on
the wrong track.

Converting some characters to others is very easy with StringBuilder,
and doesn't require converting to a string first.

However, I don't believe this question is really about that parsing at
all - it's an interop problem of getting the right data into the
StringBuilder in the first place.
 
Hi,
>> Converting some characters to others is very easy with StringBuilder,
and doesn't require converting to a string first.
<<

True.

The question asked about searching for an ESC character. This character
often is used as a delimiter in a packetized data stream. The most common
use of ESC characters (that I have encountered) is the escape sequences used
by various terminal emulations. These must be extracted by very complex
state machines -- these employ pattern matching, but also much more.
Character replacement will not be an issue, but discovery of substrings
will be (an action will be taken on a subsequent substring, until the next
ESC sequence is encountered, which then alters the current state).

>> However, I don't believe this question is really about that parsing at
all - it's an interop problem of getting the right data into the
StringBuilder in the first place.
<<

Perhaps parsing isn't involved. However, that was my first reaction.
Interpreting these questions often involves its own form of parsing, with
less than complete information.

Dick

 
Dick Grier said:
>> Converting some characters to others is very easy with StringBuilder,
and doesn't require converting to a string first.
<<

True.

The question asked about searching for an ESC character. This character
often is used as a delimiter in a packetized data stream. The most common
use of ESC characters (that I have encountered) is the escape sequences used
by various terminal emulations. These must be extracted by very complex
state machines -- these employ pattern matching, but also much more.
Character replacement will not be an issue, but discovery of substrings
will be (an action will be taken on a subsequent substring, until the next
ESC sequence is encountered, which then alters the current state).

Sure - and at *that* stage, convert it to a string. I was really just
taking issue with your original statement that StringBuilders were
*only* useful for insertions and deletions.
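A rough sketch of that hand-off, with made-up names: the buffer is converted to a string once, and the search for ESC-delimited pieces then happens on the string. A real terminal-emulation parser would need a proper state machine, as described above; this only shows the splitting step.

```csharp
using System.Text;

static class EscSplitter
{
    // Converts the builder to a string once, then splits on ESC (char 27).
    public static string[] SplitOnEsc(StringBuilder buffer)
    {
        string text = buffer.ToString();
        return text.Split('\u001B');
    }
}
```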
>> However, I don't believe this question is really about that parsing at
all - it's an interop problem of getting the right data into the
StringBuilder in the first place.
<<

Perhaps parsing isn't involved. However, that was my first reaction.
Interpreting these questions often involves its own form of parsing, with
less than complete information.

True.
 