String class design anomaly

Edward Diener · Jan 30, 2004

While most of the design of the string class is not too bad, the one member
function which really sticks out as being incorrectly designed is the
Replace member function. After an Insert member function which quite
reasonably inserts another string starting at a specific location and a
Remove function which removes a number of characters from the string
starting at a specific location, I would have assumed, without having looked
first, that a Replace function would replace a number of characters of a
string starting at a specific location with another string. Instead what we
get for the Replace function has nothing to do with that, but is a function
whose name would have best been ReplaceAllInstances, and which replaces all
instances within the string of either a character with another character or
a substring with another substring.

While a ReplaceAllInstances is certainly a useful and worthy function (
along with a RemoveAllInstances which is not provided in the string class ),
and while a normal Replace, as I have described it, can be done with a
Remove followed by an Insert, I think that the designer(s) of the string
class really fell asleep by not providing a Replace function which acts as I
have described and which would be the orthogonal equivalent of the design of
the Remove and Insert functions. As it is, the string Replace function is
misnamed and has nothing to do with the meaning of a normal string Replace.
Was everyone at MS asleep the day this anomaly was created, or was the
designer of this member function just too proud to create a Replace function
which would have the functionality which one expects in a well-designed
string class ? ( rhetorical question )

This anomaly just goes to show that even at a company with the reputation of
technical excellence as MS, people sometimes simply don't know how to design
classes and components correctly.

Scott M. · Jan 30, 2004

Does the fact that strings are immutable have anything to do with this?

Edward Diener · Jan 30, 2004

Scott said:
Does the fact that strings are immutable have anything to do with
this?

No, it is irrelevant since all the string class functionality involving
changes returns another string with the change made. A more orthogonal
Replace could just make the changes internally generating new strings and
return the changed string after doing Remove followed by an Insert.
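
As a rough illustration of the Remove-followed-by-Insert composition described above, a positional replace might be sketched as follows. The names `StringSketch` and `ReplaceSection` are hypothetical and not part of the string class; the sketch assumes only the existing `String.Remove(int, int)` and `String.Insert(int, string)` methods.

```csharp
using System;

public static class StringSketch
{
    // Hypothetical helper (not a framework method): replaces `count` characters
    // of `source`, starting at `startIndex`, with `replacement`.
    // Strings are immutable, so each step below returns a new string.
    public static string ReplaceSection(string source, int startIndex, int count, string replacement)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (replacement == null) throw new ArgumentNullException("replacement");

        // Remove validates startIndex and count; after the removal,
        // startIndex is still a valid insertion point in the shorter string.
        return source.Remove(startIndex, count).Insert(startIndex, replacement);
    }
}
```

For example, `StringSketch.ReplaceSection("Hello, world", 7, 5, "there")` gives "Hello, there". Two intermediate strings are created along the way, which is the cost of composing the two existing methods.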

The Replace member function is just a bad design for its name, but fine for
a ReplaceAllInstances member function. Of course along with
ReplaceAllInstances a RemoveAllInstances would have been logical and
orthogonal also.

All this shows is that MS's designers can be as weak, or weaker, as
programming designers anywhere else in the world <g> .

Christoph Wienands · Jan 30, 2004

Hi Edward,

The Replace member function is just a bad design for its name, but fine for
a ReplaceAllInstances member function. Of course along with
ReplaceAllInstances a RemoveAllInstances would have been logical and
orthogonal also.

All this shows is that MS's designers can be as weak, or weaker, as
programming designers anywhere else in the world <g> .

The Replace method works fine, which means your only problem is the name. In
my opinion the name is just fine because it does what the name says,
replaces strings matching the first parameter with the second parameter.

We can even turn it around: If the method did what you wanted it to do, then
we'd have to call it ReplaceFirstInstance, heh.

Granted, there could have been an overloaded method that has an additional
Count parameter, or a start- and an end position, but honestly, isn't this
picking raisins?

No offence ;-)

Christoph

Edward Diener · Jan 30, 2004

Christoph said:
Hi Edward,

The Replace method works fine, which means your only problem is the
name.

I never said it didn't work fine.

In my opinion the name is just fine because it does what the
name says, replaces strings matching the first parameter with the
second parameter.

That's a fair way to look at it. However both the Remove and Insert methods
involve positions in the string and number of characters where a single
removal and insertion would be done. I would expect the Replace method to
also involve itself in a position in the string and a number of characters
where a replacement would be done. In the string implementations which I
have previously used, replacement always follows this general paradigm. Why
not argue that Remove should have been a function which removes all
occurrences of a specified substring or character from a string ? Because
that is not the expectation of a Remove function. Similarly for Replace.

We can even turn it around: If the method did what you wanted it to
do, then we'd have to call it ReplaceFirstInstance, heh.

See my remarks on Remove and Insert.

Granted, there could have been an overloaded method that has an
additional Count parameter, or a start- and an end position, but
honestly, isn't this picking raisins?

No. My point was orthogonality. One has the traditional operations for
adding, removing, and changing an object or part of an object. In this case
it is a string and the names are Insert, Remove, and Replace respectively. I
would expect that they all dealt with the same general paradigm of how this
works for a string. Instead Insert and Remove do, while Replace means
something totally different. I have no argument with the useful
functionality of the Replace method as it is, only with the name which I
feel should have been something else, and the lack of a Replace which is
orthogonal with the Insert and Remove. In my programming I have found it
much more common to have a string Replace which replaces a segment of the
string at a particular offset and with a particular length with another
string, than to have a function which replaces all occurrences of a
character with another character or a substring with another substring.
Notice I said "common". Having the functionality of the current Replace is
definitely very useful, but it definitely should have been called something
different and the normal Replace logically implemented as I suggested.

No offence ;-)

Certainly no offense taken.

Eddie

Jon Skeet [C# MVP] · Jan 31, 2004

Edward Diener said:
While most of the design of the string class is not too bad, the one member
function which really sticks out as being incorrectly designed is the
Replace member function. After an Insert member function which quite
reasonably inserts another string starting at a specific location and a
Remove function which removes a number of characters from the string
starting at a specific location, I would have assumed, without having looked
first, that a Replace function would replace a number of characters of a
string starting at a specific location with another string.

Frankly, if you (or anyone else) can't be bothered to even look at the
first line of the summary documentation for a pretty commonly used
method, you deserve all you get. I don't see what's wrong with calling
it Replace (just as it is in Java, for the single char version).

Instead what we
get for the Replace function has nothing to do with that, but is a function
whose name would have best been ReplaceAllInstances, and which replaces all
instances within the string of either a character with another character or
a substring with another substring.

I'd personally call the kind of replace that you're after
"ReplaceSection" or something similar.

While a ReplaceAllInstances is certainly a useful and worthy function (
along with a RemoveAllInstances which is not provided in the string class ),
and while a normal Replace, as I have described it, can be done with a
Remove followed by an Insert, I think that the designer(s) of the string
class really fell asleep by not providing a Replace function which acts as I
have described and which would be the orthogonal equivalent of the design of
the Remove and Insert functions. As it is, the string Replace function is
misnamed and has nothing to do with the meaning of a normal string Replace.

Normal by whose definition? It's normal by my understanding of it, and
while I've seen lots of people referring to it in other posts, I
haven't seen anyone else being confused by it.

This anomaly just goes to show that even at a company with the reputation of
technical excellence as MS, people sometimes simply don't know how to design
classes and components correctly.

Or maybe it just shows a difference in personal opinion?

Jon Skeet [C# MVP] · Jan 31, 2004

Edward Diener said:
That's a fair way to look at it. However both the Remove and Insert methods
involve positions in the string and number of characters where a single
removal and insertion would be done. I would expect the Replace method to
also involve itself in a position in the string and a number of characters
where a replacement would be done. In the string implementations which I
have previously used, replacement always follows this general paradigm. Why
not argue that Remove should have been a function which removes all
occurrences of a specified substring or character from a string ? Because
that is not the expectation of a Remove function. Similarly for Replace.

It's not *your* expectation of a Replace function - it's certainly
mine, however.

Note that the actual functionality you're after isn't even *in* the
string class - presumably because it's not a commonly-needed operation,
whereas "normal" (to my mind) replacement of one substring with another
*is* a common operation.

In my programming I have found it
much more common to have a string Replace which replaces a segment of the
string at a particular offset and with a particular length with another
string, than to have a function which replaces all occurrences of a
character with another character or a substring with another substring.

We clearly have different experiences then. You may find it interesting
to note that String.Replace is often mentioned on newsgroups as being
the answer to someone's question (or at least used as part of the
answer), but I can't off-hand remember anyone asking for the kind of
replacement operation you're after. If this is not just my memory
playing tricks, it would suggest that my experience has more in common
with other developers than yours.

Edward Diener · Jan 31, 2004

Jon said:
It's not *your* expectation of a Replace function - it's certainly
mine, however.

Then why not have the Remove function that removes each occurrence of a
character or substring within the string ? Or is this not *your* expectation
?

I have clearly pointed out the difference between the Remove and Insert
functions operating on a single portion of the string as defined by a
starting position and a length, while the Replace function is not related to
that idea at all. It is just as "common" to Replace an area of a string
denoted by a starting position and a length with another string as it is to
Remove or Insert an area of a string denoted by a starting position and a
length with another string.

Note that the actual functionality you're after isn't even *in* the
string class - presumably because it's not a commonly-needed
operation, whereas "normal" (to my mind) replacement of one substring
with another *is* a common operation.

This is an absurd argument. I point out that the Replace function is
misnamed, i.e. it's the wrong one, and then to prove *your* point you tell me
that the Replace function which I deem normal is not even in the string
class so therefore it must not be a commonly-needed operation. Duh ! Clearly
my whole point is that the string class is missing a commonly needed
operation and has been designed poorly in the sense that the commonly needed
Replace is missing and an alternative, which would better have been called
under a different name, is there instead.

We clearly have different experiences then. You may find it
interesting to note that String.Replace is often mentioned on
newsgroups as being the answer to someone's question (or at least
used as part of the answer), but I can't off-hand remember anyone
asking for the kind of replacement operation you're after. If this is
not just my memory playing tricks, it would suggest that my
experience has more in common with other developers than yours.

Functionality is no doubt in the eye of the beholder ? I will largely agree
with you on that. However what I was pointing out, quite clearly, is the
lack of orthogonality between the Remove and Insert functions on one hand,
and the Replace functions on the other hand. As far as your idea of a
Replace function is concerned, I too find that useful, but a better name
would have been ReplaceAll or ReplaceAllInstances. Needless to say a
DeleteAll, or a DeleteAllInstances, was not provided in the string class.
Does that prove to you that its functionality is not "common" ?

Daniel O'Connell [C# MVP] · Jan 31, 2004

Edward Diener said:
Then why not have the Remove function that removes each occurrence of a
character or substring within the string ? Or is this not *your* expectation
?

Hehe, interestingly, you'd do this with the current Replace implementation,
instead of Remove. You need to remember that Replace is simply an easy way to
perform removes & inserts, period. There is *NO* reason that Replace should
work under the same set of rules as Remove and Insert, it isn't an
additional function so much as a utility function for the lazy. For
realistic use of the class, Remove and Insert must exist, there is no
requirement for Replace, they are different things.

I have clearly pointed out the difference between the Remove and Insert
functions operating on a single portion of the string as defined by a
starting position and a length, while the Replace function is not related to
that idea at all. It is just as "common" to Replace an area of a string
denoted by a starting position and a length with another string as it is to
Remove or Insert an area of a string denoted by a starting position and a
length with another string.

This is an absurd argument. I point out that the Replace function is
misnamed, i.e. it's the wrong one, and then to prove *your* point you tell me
that the Replace function which I deem normal is not even in the string
class so therefore it must not be a commonly-needed operation. Duh ! Clearly
my whole point is that the string class is missing a commonly needed
operation and has been designed poorly in the sense that the commonly needed
Replace is missing and an alternative, which would better have been called
under a different name, is there instead.

The point he was trying to make is that it is apparently a "commonly-needed
operation" that no one ever uses. I know I've only used it like twice (how
often do you really work on strings like that? What are you working on that
requires it so drastically?).
When it comes down to it, the word "Replace" is going to mean "replace every
instance of 'x'" to *far* more people than "replace the text at this place"
is going to. Replace when it comes to text means "find string A, delete it,
and put string B in its place", just look at any word processor, text
editor, or anything else that uses the word Replace and you'll notice that.

Functionality is no doubt in the eye of the beholder ? I will largely agree
with you on that. However what I was pointing out, quite clearly, is the
lack of orthogonality between the Remove and Insert functions on one hand,
and the Replace functions on the other hand. As far as your idea of a
Replace function is concerned, I too find that useful, but a better name
would have been ReplaceAll or ReplaceAllInstances. Needless to say a
DeleteAll, or a DeleteAllInstances, was not provided in the string class.
Does that prove to you that its functionality is not "common" ?

Again, DeleteAll would be string.Replace("stringA","");, useless to add as
the functionality is there, it just isn't named that way. At that, it's
better not to name things ReplaceAll or anything of that matter; it's ugly
and not particularly clean. If you must add it, an overload would be a better
idea (taking an index and a count like most other string methods). There is
a lot to be said about clean naming, and Replace & ReplaceAll would be a
considerable violation of that.

Jon Skeet [C# MVP] · Jan 31, 2004

Edward Diener said:
Then why not have the Remove function that removes each occurrence of a
character or substring within the string ? Or is this not *your* expectation
?

No, it's not my expectation. Either of the methods could work either
way, to be honest - so the first thing I do is read the docs. That
seems to be the sensible thing to do, in my view.

I have clearly pointed out the difference between the Remove and Insert
functions operating on a single portion of the string as defined by a
starting position and a length, while the Replace function is not related to
that idea at all. It is just as "common" to Replace an area of a string
denoted by a starting position and a length with another string as it is to
Remove or Insert an area of a string denoted by a starting position and a
length with another string.

Not in my experience it's not. I can't remember the last time I wanted
to do that, to be honest.

This is an absurd argument. I point out that the Replace function is
misnamed, i.e. it's the wrong one, and then to prove *your* point you tell me
that the Replace function which I deem normal is not even in the string
class so therefore it must not be a commonly-needed operation. Duh ! Clearly
my whole point is that the string class is missing a commonly needed
operation and has been designed poorly in the sense that the commonly needed
Replace is missing and an alternative, which would better have been called
under a different name, is there instead.

You hadn't actually put that point, as far as I'd seen - you'd only
talked about the current Replace being misnamed.

Functionality is no doubt in the eye of the beholder ? I will largely agree
with you on that. However what I was pointing out, quite clearly, is the
lack of orthogonality between the Remove and Insert functions on one hand,
and the Replace functions on the other hand.

But I don't see why all methods in the class *should* have exactly the
same convention, when it's more convenient (in my view) to keep the
short names as they are.

As far as your idea of a
Replace function is concerned, I too find that useful, but a better name
would have been ReplaceAll or ReplaceAllInstances. Needless to say a
DeleteAll, or a DeleteAllInstances, was not provided in the string class.
Does that prove to you that its functionality is not "common" ?

No, because it's easy to do with Replace:

string x = y.Replace("x", "");

removes all 'x's from the string. Indeed, you'll find this is often
given as a solution when people *do* want to remove all instances of a
particular string.
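
To make the idiom above concrete, here is a minimal sketch of a "remove all instances" helper built on the existing Replace. The names `RemoveSketch` and `RemoveAllInstances` are hypothetical; nothing by that name exists in the string class.

```csharp
using System;

public static class RemoveSketch
{
    // Hypothetical helper (not a framework method): removes every occurrence
    // of `value` from `source` by replacing it with the empty string.
    public static string RemoveAllInstances(string source, string value)
    {
        if (source == null) throw new ArgumentNullException("source");

        // String.Replace(string, string) rejects a null or empty search string,
        // so those cases surface as exceptions from the call below.
        return source.Replace(value, "");
    }
}
```

For example, `RemoveSketch.RemoveAllInstances("xaxbxc", "x")` gives "abc". Note that the string overload is needed even for a single character, since the char overload of Replace has no "empty character" to replace with.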

Chris Capel · Feb 3, 2004

Then why not have the Remove function that removes each occurrence of a
character or substring within the string ? Or is this not *your* expectation
?

Mainly because myString.Replace(stringToReplace, "") offers exactly the same
functionality.

Chris Capel · Feb 3, 2004

Oops. I wouldn't have said that if I had finished the other two posts. But
since it had already been said a couple posts higher without being commented
on I went ahead and commented on it.

Eric Newton · Feb 11, 2004

um anybody here heard of the StringBuilder class?

StringBuilder does what you want, implementing an internal char array to get
around the string immutability, which is IMO, a near-perfect scenario

Only "near perfect" in that there's a few missing operations in
StringBuilder, and it's sealed :-(

But I guess that doesn't preclude me from just building a new StringBuilder
wrapper... oh the options, the options!
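
A minimal sketch of the in-place variant with StringBuilder, assuming the positional replace discussed earlier in the thread is what is wanted. `BuilderSketch` and `ReplaceSection` are hypothetical names; StringBuilder itself has no such method, so the sketch chains its existing `Remove(int, int)` and `Insert(int, string)`.

```csharp
using System;
using System.Text;

public static class BuilderSketch
{
    // Hypothetical helper (not a framework method): overwrites `count`
    // characters at `startIndex` with `replacement`, mutating the builder.
    public static StringBuilder ReplaceSection(StringBuilder builder, int startIndex, int count, string replacement)
    {
        if (builder == null) throw new ArgumentNullException("builder");

        // Remove and Insert both modify the builder and return the same
        // instance, so the calls can be chained without creating new strings.
        return builder.Remove(startIndex, count).Insert(startIndex, replacement);
    }
}
```

Called on a builder holding "Hello, world" with arguments `(7, 5, "there")`, the builder afterwards contains "Hello, there". Because StringBuilder is sealed, a static helper like this (or a wrapper class) is the way to add the operation.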
