Thinking out loud about pass by value semantics

  • Thread starter: Alvin Bruney

Alvin Bruney

The more knowledgeable I get about this .NET world, the more questions I
have.

.NET uses pass by reference for all objects....uhhh I mean pass by value.
(Couldn't resist this jab)

Consider a string object whose value is passed as a parameter to a function.
The reference points back to memory allocated on the heap where the
contents of the string live. Inside the function, I adjust the value, say
string += "value", which on the surface means that the string object in
memory gets changed. That's the 10,000 foot view. It's not correct, but it is
widely repeated. Strings are immutable, so what exactly is going on
under the hood in this scenario?

Is it right that the run-time must now allocate new memory and copy the old
contents into this new location in order to honor the immutability of
strings?

If that's true, then the run-time must know explicitly that the memory
location behaves differently for a string so it must mark that block of
memory as special. Other objects would not behave this way hence the need
for special treatment.

I thought memory wasn't treated that way. Isn't it true for other languages
and environments that memory is just a block which gets no special
treatment? Why has .NET chosen to go with this meandering? What are the
benefits of this special treatment? It looks to me, from where I sit,
that the overhead associated with this approach isn't worth the price of
admission.

I've been wrong before so don't humor me. Give it to me straight, I'm old
enough to take it.
 
| The more knowledgeable I get about this .NET world, the more questions
| I have.
|
| .NET uses pass by reference for all objects....uhhh I mean pass by
| value. (Couldn't resist this jab)
|
| Consider a string object whose value is passed as a parameter to a
| function. The reference is pointing back to memory allocated on the
| heap where the contents of the string live. Inside the function, I
| adjust the value, say string += "value", which means on the surface
| that the string object in memory gets changed. That's the 10,000 foot
| view. It's not correct, but widely repeated.

Is that really widely repeated?

| Strings are immutable,
| so what exactly is going on underneath the hood with this scenario?

Think about it this way...

public void foo()
{
    string s = "Hello world";
    bar(s);
}

public void bar(string t)
{
    t += " and hello again";
}


You've got one string object and two references to that object. 's' is
a reference to the string object, and 't' is a separate reference to
that object. Now, when we execute the "+=", .NET allocates a new string
object and makes 't' a reference to the new object. But 's' still
points to the old object.

| Is it right that the run-time must now allocate new memory and copy the old
| contents into this new location in order to honor the immutability of
| strings?

Right.

| If that's true, then the run-time must know explicitly that the memory
| location behaves differently for a string so it must mark that block
| of memory as special. Other objects would not behave this way hence
| the need for special treatment.

No, actually all classes work this way. Remember that

t += " and hello again" ;

is equivalent to...

t = t + " and hello again";

All you've done is change what object the reference is pointing to, you
haven't done anything to the original object at all.

IMO, what makes this a little confusing and misleading is the fact that
you don't have to explicitly create a new string, the framework does it
for you behind the scenes, so sometimes you forget you're dealing with a
new object. For example, if we had written

public void bar(string t)
{
    t = new string((t + " and hello again").ToCharArray());
}

That's basically the same thing, but now you can explicitly see that
we've created a new string in the bar function and assigned 't' to be a
reference to it. Does that make things clearer?

| I thought memory wasn't treated that way. Isn't it true for other
| languages and environments that memory is just a block which gets no
| special treatment? Why has .NET chosen to go with this meandering?
| What are the benefits of this special treatment because it looks to
| me, from where I sit that the overhead associated with this approach
| isn't worth the price of admission.

No meandering. Any class that implements operator+ will work exactly
this way. In fact, a good way to understand all this is to create
a simple class of your own that implements operator+, and then see how
the "+=" operator affects it.
 
When you write

string += value

you are really writing

string = string + value;

That is implemented by creating a new string that contains the values from
"string" and "value", and then assigning it back to the string reference.

Strings are immutable: the string class gives you no way to modify an
existing instance, so anything that looks like a change produces a new string.

Hope this helps.



--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://blogs.gotdotnet.com/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

.NET uses pass-by-value for all objects? No, not really. Your scenario
wouldn't work if that were the case. That is, if you do it like this:

public static void Main(string[] args)
{
    string a1 = "blah";
    blahblah(a1);
    Console.WriteLine("a1 is now = " + a1);
}

public static void blahblah(string s)
{
    s += "blah";
}

a1 after blahblah() will still be just "blah", exactly the way it would
happen in Java, because everything in Java is passed by value.

On the other hand, if you replace blahblah(string s) with blahblah(ref
string s) and blahblah(a1) with blahblah(ref a1), then it will work,
i.e.: a1 will be "blahblah" after the method call.

But then again it doesn't have anything to do with string being
immutable, because string is indeed still immutable. Like what the other
poster mentioned, s += "blah" creates a new string object ("blahblah")
and assigns it to s, such that s no longer points to the *previous*
instance ("blah"), which remains immutable, forever and ever.
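
For anyone who wants to try both variants side by side, here is a minimal sketch. The class and method names (RefDemo, AppendByValue, AppendByRef, RunByValue, RunByRef) are made up for illustration and are not from the posts above; the only assumption is a plain console program.

```csharp
using System;

static class RefDemo
{
    // Receives a copy of the caller's reference. Reassigning s here
    // only changes the local copy.
    static void AppendByValue(string s)
    {
        s += "blah";
    }

    // Receives the caller's variable itself. Reassigning s here
    // changes what the caller's variable refers to.
    static void AppendByRef(ref string s)
    {
        s += "blah";
    }

    public static string RunByValue(string start)
    {
        string a1 = start;
        AppendByValue(a1);
        return a1;
    }

    public static string RunByRef(string start)
    {
        string a1 = start;
        AppendByRef(ref a1);
        return a1;
    }

    static void Main()
    {
        Console.WriteLine(RunByValue("blah")); // blah
        Console.WriteLine(RunByRef("blah"));   // blahblah
    }
}
```

In both methods the original "blah" object is left untouched; the only difference is whose variable ends up pointing at the new string.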

Alvin Bruney wrote:
<snipped>

- --
Ray Hsieh (Djajadinata)
ray underscore usenet at yahoo dot com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQE/m7OFwEwccQ4rWPgRArAKAJwMbEZPNHv6VyuE4AqkT0/wKphY/gCfT4zs
qDStW/jRPC5mrvunJ5dY1G4=
=rEq4
-----END PGP SIGNATURE-----
 
I got most of that, and it is in line with my beliefs, but clarify this:

| public void bar(string t)
| {
|     t += " and hello again";
| }

Nothing is actually allocated, right? So there isn't any reference set aside,
so you can't actually say that

| You've got one string object and two references to that object.

When the call is made, it now becomes true.

Now here is where I am a little fuzzy.
When the call is actually made,

is this a copy of the reference pointing to the original string object, like
normal pass by value semantics, or has the runtime already allocated a new
string object and is passing a copy of this new reference to the function
because it knows that it is a string object and must be handled differently?
Put another way, what is bar(s) passing in?
A reference to the existing object, or has it already allocated and copied a
new reference in anticipation of s += " and hello again"?
 
| I got most of that and it is inline with my beliefs but clarify this
|
| nothing is actually allocated right? So there isn't any reference set aside
| so you can't actually say that
| When the call is made it now becomes true.

At the line that says 'bar(s)', .NET creates a second reference to the
original object. The 'bar(string t)' function receives this second
reference, which is still pointing to the original string. So at the
point in time *before* the line that says 't += ', we have one string
object, and two references pointing to that object.
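
One way to check this for yourself is object.ReferenceEquals, which reports whether two references point at the same object. The sketch below is only an illustration: the class name TwoReferences and the extra "original" parameter are invented here so the method can compare its parameter against the caller's reference.

```csharp
using System;

static class TwoReferences
{
    // Hypothetical helper: returns whether t refers to the same object as
    // original, first before and then after the += inside the method.
    public static bool[] Bar(string original, string t)
    {
        bool sameBefore = object.ReferenceEquals(original, t);

        t += " and hello again"; // t now refers to a newly allocated string

        bool sameAfter = object.ReferenceEquals(original, t);
        return new[] { sameBefore, sameAfter };
    }

    static void Main()
    {
        string s = "Hello world";
        bool[] result = Bar(s, s);

        Console.WriteLine(result[0]); // True
        Console.WriteLine(result[1]); // False
        Console.WriteLine(s);         // Hello world
    }
}
```

Nothing is copied at the call itself; the new string only appears once the += runs.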

| Now here is where I am a little fuzzy.
| When the call is actually made
|
| is this a copy of the reference pointing to original string object like
| normal pass by value semantics or the runtime has already allocated a new
| string object and is passing a copy of this new reference to the function
| because it knows that it is a string object and must be handled differently.

The first. It's a reference pointing to the original string object.

| Put another way, what is bar(s) passing in?
| A reference to the existing object or it has already allocated and copied a
| new reference in anticipation of s += " and hello again"

Step by step through the second function...

public void bar(string t)
{
    // At this point, t is another reference to our original string
    // object.

    t = t + " and hello again";
    // .NET allocates a new string object and makes t a reference to
    // that object.
}

(Actually, conceptually I suppose .NET creates a new string object
equal to " and hello again", then calls string.operator+() to add 't'
and this new string together. string.operator+() returns a new
string, and t is made a reference to this second new string object.
In the compiled code the C# compiler emits a call to String.Concat
rather than an operator method, but the effect is the same.)


Strings aren't immutable because of some kind of special handling from
the framework; they're immutable for the simple reason that the string
class doesn't provide any public functions that would change the
contents of the string. You could easily create your own immutable
class that would act exactly the same way.

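using System;

class MyClass
{
    string _s;

    public MyClass(string s)
    {
        _s = s;
    }

    // Returns a new MyClass; neither operand is modified.
    public static MyClass operator +(MyClass lhs, string i)
    {
        return new MyClass(lhs._s + i);
    }

    public override string ToString()
    {
        return _s;
    }

    public static void Main()
    {
        MyClass s = new MyClass("Hello world");
        bar(s);
        Console.WriteLine(s);   // prints "Hello world"
    }

    public static void bar(MyClass t)
    {
        // Same as t = t + " and hello again": t now refers to a new MyClass.
        t += " and hello again";
        Console.WriteLine(t);   // prints "Hello world and hello again"
    }
}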
 
*whistle* Skeeeeeeeeeeeeeeetttttttttttt
we got one here....
| .NET uses pass-by-value for all objects? No, not really.
--


-----------
Got TidBits?
Get it here: www.networkip.net/tidbits
 
Alvin Bruney said:
| *whistle* Skeeeeeeeeeeeeeeetttttttttttt
| we got one here....

Nope - he's right. It doesn't use pass-by-value for all objects,
because objects themselves are never passed, only references are. I
don't think Ray was actually suggesting that objects are passed by
reference. If he'd done that, I'd have jumped in the normal kind of way
:)
 
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Yep, I wasn't :) I was pointing out to Alvin that if he passes a
reference by value to a method, the method can modify the object that
reference points to (for a mutable type, that is; a string offers nothing
to modify), but it can't change where the caller's reference points (which
is what he attempts when he does s += "something" -- it'll compile OK but it
won't change anything outside the method). To modify the reference itself,
he'll need to pass the reference by reference.

Alvin, if you've done C before, it's a bit like passing a pointer to a
pointer when you need to change *where the latter points to* instead of
*the value at the location to which the latter currently points*.
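
To make the distinction concrete with a type that can actually be modified, here is a rough sketch using StringBuilder in place of string. The names MutateVersusRebind, Mutate, Rebind, RebindByRef and Run are invented for this example.

```csharp
using System;
using System.Text;

static class MutateVersusRebind
{
    // Changes the object the caller's reference points to.
    static void Mutate(StringBuilder sb)
    {
        sb.Append("blah");
    }

    // Changes only the local copy of the reference; the caller's
    // variable still points to the original object.
    static void Rebind(StringBuilder sb)
    {
        sb = new StringBuilder("replaced");
    }

    // Changes the caller's variable, because the reference itself
    // is passed by reference.
    static void RebindByRef(ref StringBuilder sb)
    {
        sb = new StringBuilder("replaced");
    }

    public static string[] Run()
    {
        var a = new StringBuilder("blah");
        Mutate(a);

        var b = new StringBuilder("blah");
        Rebind(b);

        var c = new StringBuilder("blah");
        RebindByRef(ref c);

        return new[] { a.ToString(), b.ToString(), c.ToString() };
    }

    static void Main()
    {
        foreach (string result in Run())
        {
            Console.WriteLine(result);
        }
        // blahblah
        // blah
        // replaced
    }
}
```

With string, only the second and third cases are possible, since there is no equivalent of Append that changes an existing string in place.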

Jon Skeet [C# MVP] wrote:

|
| Nope - he's right. It doesn't use pass-by-value for all objects,
| because objects themselves are never passed, only references are. I
| don't think Ray was actually suggesting that objects are passed by
| reference.

- --
Ray Hsieh (Djajadinata)
ray underscore usenet at yahoo dot com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQE/nUzMwEwccQ4rWPgRApooAJoCVGvay3eBBX7HRenP2GkmSEERvACeNWJN
5dNBNtkV+Y7mDGhl0jcRn/U=
=divQ
-----END PGP SIGNATURE-----
 