To All: String reverse problem

  • Thread starter Thread starter compulim
  • Start date Start date
C

compulim

Hi Tomer,

If there's a non-recursive alternative, you should never use recursive
function. (easier to maintain, performance reason...)

The following implementation provide non-recursive approach with no hacking
(or "special case handling"), lesser function calls and zero IF statement.

public static string StrRev( string s ) {
char[] charArray;

charArray = s.ToCharArray();
Array.Reverse( charArray );

return ( new string( charArray ) );
}


Compulim @ kasuei


Hi,

I've check on the web for a string reverse function and this one on many
places:

public static string StrRev(string s)
{
if (s.Length == 1)
return s;
else
return StrRev( s.Substring(1) ) + s.Substring(0,1);
}
Now, this doesn't work when you put an empty string and produce a
StackOverflowException, so we add a condition at the begining of the
function:
public static string StrRev(string s)
{
if (s.Length == 0)
return s;
if (s.Length == 1)
return s;
else
return StrRev( s.Substring(1) ) + s.Substring(0,1);
}
Hope this helps, Tomer.
 
Tomer said:
I've check on the web for a string reverse function and this one on many places:

public static string StrRev(string s)
{

if (s.Length == 1)

return s;

else

return StrRev( s.Substring(1) ) + s.Substring(0,1);

}

Now, this doesn't work when you put an empty string and produce a
StackOverflowException, so we add a condition at the begining of the
function:

public static string StrRev(string s)

{

if (s.Length == 0)

return s;

if (s.Length == 1)

return s;

else

return StrRev( s.Substring(1) ) + s.Substring(0,1);

}

Alternatively, you could do something which *won't* throw a
StackOverflowException (or use horrendous amounts of memory for no good
reason):

public static string ReverseString (string s)
{
if (s==null || s.Length < 2)
{
return s;
}

StringBuilder builder = new StringBuilder(s.Length);
for (int i=s.Length-1; i >= 0; i--)
{
builder.Append(s);
}
return builder.ToString();
}

This is slightly more expensive in time but less expensive in memory
than creating a new char array and then a new string from that. Either
way, it's *much* better than the recursive version.
 
There are 10 million ways to reverse a string and all of this has nothing to
do with the CF specifically but what the heck, I'll chip in...

The checks at the beginning of the method are not "special cases", just
verifying the preconditions of the method... While we are talking about
them, both examples given would fail if they were passed a null string...

Faster than both of the methods above is to create StringBuilder and append
to it the characters of the input string (walking backwards)... If
performance was not an issue you could use
Microsoft.VisualBasic.Strings.StrReverse

Cheers
Daniel
 
Hi Tomer,

If there's a non-recursive alternative, you should never use recursive
function. (easier to maintain, performance reason...)

The following implementation provide non-recursive approach with no hacking
(or "special case handling"), lesser function calls and zero IF statement.

public static string StrRev( string s ) {
char[] charArray;

charArray = s.ToCharArray();
Array.Reverse( charArray );

return ( new string( charArray ) );
}

I don't see why special-casing empty or single character strings is a
bad idea. (The above still fails with null. If you want it to throw an
exception, ArgumentNullException would be a better one than the
NullReferenceException currently thrown.)

While it's also better with memory than the recursive version, it still
creates a "wasted" character array.
 
The original one also fails with null value. But for performance reason, I
agree adding "null" and "length < 2" condition.

But more method invocation will leads to poor performance. The temporary
charArray in my approach won't take up much memory and able to GC once it
left the method.

What do you think?


Compulim @ kasuei

Jon Skeet said:
Hi Tomer,

If there's a non-recursive alternative, you should never use recursive
function. (easier to maintain, performance reason...)

The following implementation provide non-recursive approach with no
hacking
(or "special case handling"), lesser function calls and zero IF
statement.

public static string StrRev( string s ) {
char[] charArray;

charArray = s.ToCharArray();
Array.Reverse( charArray );

return ( new string( charArray ) );
}

I don't see why special-casing empty or single character strings is a
bad idea. (The above still fails with null. If you want it to throw an
exception, ArgumentNullException would be a better one than the
NullReferenceException currently thrown.)

While it's also better with memory than the recursive version, it still
creates a "wasted" character array.
 
Hi,

I've check on the web for a string reverse function and this one on many places:

public static string StrRev(string s)
{

if (s.Length == 1)

return s;

else

return StrRev( s.Substring(1) ) + s.Substring(0,1);

}

Now, this doesn't work when you put an empty string and produce a StackOverflowException, so we add a condition at the begining of the function:

public static string StrRev(string s)

{

if (s.Length == 0)

return s;

if (s.Length == 1)

return s;

else

return StrRev( s.Substring(1) ) + s.Substring(0,1);

}

Hope this helps, Tomer.
 
The original one also fails with null value. But for performance reason, I
agree adding "null" and "length < 2" condition.

But more method invocation will leads to poor performance. The temporary
charArray in my approach won't take up much memory and able to GC once it
left the method.

What do you think?

Obviously the recursion is a bad idea, but the version I posted is,
IMO, better than your alternative. The temporary char array will take
up the same amount of space as the original string, which *might* be
very large. Although it will be *eligible* for garbage collection as
soon as the method returns, that doesn't mean it's "free" - you don't
want to stress the GC if you don't have to.

The speed will of my solution partly depends on whether calls to
StringBuilder.Append(char) is inlined or not. One alternative is to
prepopulate the StringBuilder with the requisite number of chars, and
then use the indexer to set the characters - I would imagine the
indexer is more likely to be inlined.
 
Jon Skeet said:
Obviously the recursion is a bad idea, but the version I posted is,
IMO, better than your alternative. The temporary char array will take
up the same amount of space as the original string, which *might* be
very large. Although it will be *eligible* for garbage collection as
soon as the method returns, that doesn't mean it's "free" - you don't
want to stress the GC if you don't have to.

The speed will of my solution partly depends on whether calls to
StringBuilder.Append(char) is inlined or not. One alternative is to
prepopulate the StringBuilder with the requisite number of chars, and
then use the indexer to set the characters - I would imagine the
indexer is more likely to be inlined.

I decided to run a little benchmark on the desktop framework, and
calling Append repeatedly is a bit faster than using the indexer.

Using Array.Reverse is significantly faster than either of them, which
surprised me.

I'd certainly want to test on the Compact Framework if I thought this
was critical though - as the difference between the two is largely
going to be about garbage collection times and inlining, the difference
in framework could make a significant difference in timing.
 
Jon Skeet said:
I'd certainly want to test on the Compact Framework if I thought this
was critical though - as the difference between the two is largely
going to be about garbage collection times and inlining, the difference
in framework could make a significant difference in timing.

Just for kicks, I tried it on the Compact Framework, and it turns out
that the situations are reversed:

Desktop:
Array.Reverse: 4.5 seconds
StringBuilder.Append: 13.3 seconds
Indexer in StringBuilder: 16.6 seconds

Compact Framework: (far fewer iterations, of course)
Array.Reverse: 114 seconds
StringBuilder.Append: 25 seconds
Indexer in StringBuilder: 20 seconds


From being about 3 times as fast, Array.Reverse becomes about 5 or 6
times as slow as the other methods!

Just goes to show what a difference it makes being on a different
framework...
 
What about using the Microsoft.VisualBasic.Strings.StrReverse? How fast is
that?

And what is the indexer method you use? I only saw the append method...

Tomer.
 
Tomer said:
What about using the Microsoft.VisualBasic.Strings.StrReverse? How fast is
that?

On the Compact Framework, it takes a bit longer than using an indexer
and a bit shorter than using Append.

On the desktop, it takes a bit longer than the Append and a bit shorter
than using an indexer.
And what is the indexer method you use? I only saw the append method...

Here it is:

static string ReverseIndex (string x)
{
if (x==null || x.Length < 2)
{
return x;
}
StringBuilder builder = new StringBuilder(x.Length);
builder.Append(' ', x.Length);
for (int i=0; i < x.Length; i++)
{
builder[x.Length-1-i]=x;
}
return builder.ToString();
}


Now, I've just had a bit of fun with some unsafe code, namely:

static string ReverseUnsafe (string x)
{
if (x==null || x.Length < 2)
{
return x;
}
string y = x.Substring(0, x.Length);
unsafe
{
fixed (char *start = y)
{
char *p = start;
for (int i=x.Length-1; i >= 0; i--)
{
*p++ = x;
}
}
}
return y;
}

According to the following blog entry, this is dangerous:
http://weblogs.asp.net/brada/archive/2004/06/03/148144.aspx

However, due to using x.Substring (0, x.Length) we know that the string
already contains the same characters as it did before - so unless I'm
mistaken (having looked at the Rotor source) it *should* be okay.
That's assuming that Substring doesn't return "this" of course (like
Clone() does).

On the desktop it's just over twice as fast as the previous champion
(Array.Reverse). On the Compact Framework it fails to compile with the
interesting error message of:
"Missing compiler required member
'System.Runtime.CompilerServices.RuntimeHelpers.OffsetToStringData'"
 
Holy cow! Array.Reverse sucks!

The reality is that if you want fast, you're best route is probably to pin
the string, create a byte array and use pointers to reverse it. I bet
that's faster (and I would have guessed that would be how Array.Reverse
worked - evidently not).

-Chris
 
Chris Tacke said:
Holy cow! Array.Reverse sucks!

Well, either that or ToCharArray, or creating a new string from a char
array...
The reality is that if you want fast, you're best route is probably
to pin the string, create a byte array and use pointers to reverse
it. I bet that's faster (and I would have guessed that would be how
Array.Reverse worked - evidently not).

The difficult bit is getting a pointer to the start of the string - it
works fine in the desktop framework, but not in the compact framework
:(
 
I just did some more benchmarking. I tested 2 methods, then used your results to normalize.

Uisng Array.Reverse took ~100ms to reverse a 8148-character string.
Using pointers took ~8ms

so normalizing that to your data:
Array.Reverse: 114 seconds
StringBuilder.Append: 25 seconds
Indexer in StringBuilder: 20 seconds
pointer: 9 seconds


public unsafe static string StrRevPtr(string s)
{
int len = s.Length;

char[] srcchars = s.ToCharArray();
char[] destchars = new char[len];

fixed(char *src = srcchars, dest = destchars)
{
for(int i = 0 ; i < len ; i++)
{
dest[len-i-1] = src;
}
}

return new string(destchars);
}
 
Ya, I also think Array.Reverse you some kind of low-level functions to do
the inversion. But the result turns out it's not.

Surprisingly CF Array.Reverse is much slower than desktop. But isn't method
invocation costs much CPU power?

Very nice discussion and learnt a lot. Thanks everyone.

Chris Tacke said:
Holy cow! Array.Reverse sucks!

The reality is that if you want fast, you're best route is probably to pin
the string, create a byte array and use pointers to reverse it. I bet
that's faster (and I would have guessed that would be how Array.Reverse
worked - evidently not).

-Chris
 
And using 1/2 the memory as before, I get another 10% speed gain:

public unsafe static string StrRevPtr(string s)
{
int len = s.Length;
char[] srcchars = s.ToCharArray();
char b;
fixed(char *src = srcchars)
{
for(int i = 0 ; i < len / 2 ; i++)
{
b = src;
src = src[len-i-1];
src[len-i-1] = b;
}
}
return new string(srcchars);
}

-Chris
 
Chris Tacke said:
I just did some more benchmarking. I tested 2 methods, then used your results to normalize.

Uisng Array.Reverse took ~100ms to reverse a 8148-character string.
Using pointers took ~8ms

so normalizing that to your data:
Array.Reverse: 114 seconds
StringBuilder.Append: 25 seconds
Indexer in StringBuilder: 20 seconds
pointer: 9 seconds


public unsafe static string StrRevPtr(string s)
{
int len = s.Length;

char[] srcchars = s.ToCharArray();
char[] destchars = new char[len];

fixed(char *src = srcchars, dest = destchars)
{
for(int i = 0 ; i < len ; i++)
{
dest[len-i-1] = src;
}
}

return new string(destchars);
}


Nicely done. Now if only we could get the char* of an actual string, we
could do it without creating the extra char array...
 
Yep, that would save both time and resources again.

-Chris


Jon Skeet said:
Chris Tacke said:
I just did some more benchmarking. I tested 2 methods, then used your results to normalize.

Uisng Array.Reverse took ~100ms to reverse a 8148-character string.
Using pointers took ~8ms

so normalizing that to your data:
Array.Reverse: 114 seconds
StringBuilder.Append: 25 seconds
Indexer in StringBuilder: 20 seconds
pointer: 9 seconds


public unsafe static string StrRevPtr(string s)
{
int len = s.Length;

char[] srcchars = s.ToCharArray();
char[] destchars = new char[len];

fixed(char *src = srcchars, dest = destchars)
{
for(int i = 0 ; i < len ; i++)
{
dest[len-i-1] = src;
}
}

return new string(destchars);
}


Nicely done. Now if only we could get the char* of an actual string, we
could do it without creating the extra char array...
 
Nicely done. Now if only we could get the char* of an actual string, we
could do it without creating the extra char array...

This could be done using a GCHandle to get the pointer to the string, then
using unsafe code to manipulate the data. The only thing to take care of is
to locate the character data within the string which is not particularly
safe since there is no OffsetToStringData in NETCF. This worked for me but
the size of the string header data is probably vairable so may not be too
hot:-

public unsafe static string StrRevPtr(string s)

{

int len = s.Length;

//skip past initial string data

int offset = 10;

GCHandle hString = GCHandle.Alloc(s, GCHandleType.Pinned);


char b;

char* src = (char*)hString.AddrOfPinnedObject().ToPointer();

for(int i = 0 ; i < len / 2 ; i++)

{

b = src[offset+i];

src[offset+i] = src[offset+len-i-1];

src[offset+len-i-1] = b;

}

hString.Free();

return s;

}

The benefits of doing it this way would probably only become apparent with
larger strings, since we are working directly on the characters within the
string and not creating a separate array.

Peter


--
Peter Foot
Windows Embedded MVP
www.inthehand.com | www.opennetcf.org

Do have an opinion on the effectiveness of Microsoft Windows Mobile and
Embedded newsgroups? Let us know!
https://www.windowsembeddedeval.com/community/newsgroups
 
Peter Foot said:
This could be done using a GCHandle to get the pointer to the string, then
using unsafe code to manipulate the data. The only thing to take care of is
to locate the character data within the string which is not particularly
safe since there is no OffsetToStringData in NETCF. This worked for me but
the size of the string header data is probably vairable so may not be too
hot:-

The benefits of doing it this way would probably only become apparent with
larger strings, since we are working directly on the characters within the
string and not creating a separate array.

I think it's a very bad idea to reverse the *actual* string that was
passed in. Creating a clone of the string (eg using Substring) and
reversing that is much cleaner - it maintains the immutable nature of
strings, effectively. The difficult bit (which the code you've given
would solve) is only creating *one* copy of the original data.

It would be nice to be able to create a string with the requisite
amount of space, a given set of flags at the top of the length
indicator, and then copy the data in manually, but failing that, taking
a copy of the original data and then reversing it isn't too bad.
 
Back
Top