Structs vs. Classes

Jon Skeet [C# MVP] · May 6, 2007

Peter Duniho said:
Possibly. And yet a) most C# programmers, at least for the near future,
will in fact be C++ programmers migrating

I think I'd need to see some numbers before I believe that. I reckon VB
and Java programmers form a large proportion too - possibly larger than
C++ programmers.

and b) even if we reach a point
where most C# programmers are indeed ones who started out with C# and thus
have been indoctrinated in the language's subtle quirks, that does not in
my mind change the fact that having a "new" operator do two very different
things depending on the type of object on which it's operating is just not
a great idea.

Well, it does the same thing in some ways, and different things in
other ways. It does the "same thing" in the same way that both value
types and reference types are passed by value by default - if you
understand the difference between reference types and value types,
it makes sense, although if you don't, it doesn't.

Note that, at least in my view, there *isn't* consistency of how you
create new instances of a type. When you use the "new" operator with a
struct, you aren't really creating a new instance of that type. The new
instance of a struct is created when the containing object is created
(whether that's on the stack, or in a class, or another struct). The
"new" operator doesn't really create the instance that you're using.

I think it depends on exactly what one means by "instance", to be
honest. I nearly used the word "value" but thought that wouldn't be
clear. It's quite possible that we don't have adequate terminology at
the moment.

(Just to be clear, I'm sure we agree on what actually happens, just not
on what we call it.)

So I'm not really on board with respect to your thought that over time,
we'll be left with "the benefit of consistency of how you create new
instances of a type". I don't see that consistency today, and so don't
see that we'll be left with that in the future.

I suspect we'll have to agree to disagree on this, but I'm fine with
that.

It's a classic example of inappropriate operator overloading, IMHO. It
falls in with other C#/.NET "features", such as the use of the word
"event" to describe something very different from a "WaitableEvent", as
well as the use of the method name "Invoke()" to describe one thing in the
Form class, and something somewhat different on a delegate.

Yes - I've just been writing about delegates and trying to write
coherently about Control.Invoke vs Delegate.Invoke. It's insanely
difficult to make it readable.

It's certainly not a big deal...any person spending any significant amount
of time with C# can easily get accustomed to these issues. But they do
detract from the general theme of C# and .NET of trying to avoid apparent
ambiguity.

Whereas I quite like not having to have two different types of syntax
for calling constructors.

Out of interest, what would you propose for calling a value type
constructor? (Please not the C++ syntax!)

Peter Duniho · May 6, 2007

[...] that does not in
my mind change the fact that having a "new" operator do two very
different things depending on the type of object on which it's
operating is just not a great idea.

Well, it does the same thing in some ways, and different things in
other ways. It does the "same thing" in the same way that both value
types and reference types are passed by value by default

Yes, that's sort of true. And it does the "same thing" in the same way
that "Control.Invoke" and "Delegate.Invoke" do the "same thing". My point
is that in the case of the "new" operator and the "Invoke" method, the
"same thing" is different enough that using the same syntax obscures some
fundamental differences in what's actually happening.

(For the record, the parameter passing aspect doesn't bother me at all,
because I think one thing that C# does make clear is the difference
between representation of value types and reference types).

- if you
understand the difference between reference types and value types,
it makes sense, although if you don't, it doesn't.

Well, I *think* I understand the difference, and yet to me it doesn't make
sense to have to call "new" on a value type.

[...] The
"new" operator doesn't really create the instance that you're using.

I think it depends on exactly what one means by "instance", to be
honest. I nearly used the word "value" but thought that wouldn't be
clear. It's quite possible that we don't have adequate terminology at
the moment.

Maybe. To me, "instance" means the actual storage required by the
variable. There's a physical, real-world connection between an "instance"
and the place where that "instance" is stored. But with value types in
C#, by the time I get around to writing "ValueType variable = new
ValueType()", the storage has been allocated and all that "new" is doing
for me is initializing that storage.

I suspect we'll have to agree to disagree on this, but I'm fine with
that.

Me too. After all, it doesn't change how we use the language. I just
feel it's more confusing to new C# programmers than it needs to be. (And
personally, being a new C# programmer I feel that I have a better handle
on what's confusing and what's not than some of you more experienced C#
folks.)

[...]
Whereas I quite like not having to have two different types of syntax
for calling constructors.

Out of interest, what would you propose for calling a value type
constructor? (Please not the C++ syntax!)

Off the top of my head, I don't know. There's a reason I'm not a language
designer.

All I know is that this code:

ValueType valuevar = new ValueType();
RefType refvar = new RefType();

has two lines that look like they do the same thing, and yet they don't do
the same thing at all.
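
A minimal sketch of the difference under discussion. ValueThing and RefThing are made-up types for illustration, not anything from the posts above. It shows that storage for a struct exists before "new" is ever written, while a class variable holds only a null reference until "new" allocates an object.

```csharp
using System;

// Hypothetical types, for illustration only.
struct ValueThing { public int X; }
class RefThing { public int X; }

static class NewDemo
{
    // A struct array element exists as soon as the array does;
    // no "new ValueThing()" is needed before using it.
    public static int StructElementWithoutNew()
    {
        ValueThing[] values = new ValueThing[1];
        values[0].X = 42;   // writes into storage the array already owns
        return values[0].X;
    }

    // A class array element is just a null reference until "new"
    // allocates an object for it to refer to.
    public static bool ClassElementIsNullWithoutNew()
    {
        RefThing[] refs = new RefThing[1];
        return refs[0] == null;
    }

    // "new" applied to an existing struct variable resets its fields
    // to their defaults; the variable itself stays where it was.
    public static int NewOnStructResetsFields()
    {
        ValueThing v = new ValueThing();
        v.X = 7;
        v = new ValueThing();
        return v.X;
    }

    static void Main()
    {
        Console.WriteLine(StructElementWithoutNew());      // 42
        Console.WriteLine(ClassElementIsNullWithoutNew()); // True
        Console.WriteLine(NewOnStructResetsFields());      // 0
    }
}
```

Whether the last method counts as "creating a new instance" or "re-initializing existing storage" is exactly the terminology question in this thread; the code behaves the same under either reading.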

If there were some way to get a reference to a value type that took
advantage of the "new" syntax, maybe it wouldn't bug me so much.
Something related to boxing, for example. But as far as I know, there
isn't. Feel free to correct me if I'm wrong about that.

Pete

Jon Skeet [C# MVP] · May 6, 2007

Peter Duniho said:
Yes, that's sort of true. And it does the "same thing" in the same way
that "Control.Invoke" and "Delegate.Invoke" do the "same thing". My point
is that in the case of the "new" operator and the "Invoke" method, the
"same thing" is different enough that using the same syntax obscures some
fundamental differences in what's actually happening.

I'd say they're more similar than that - but it's the kind of thing
which wouldn't do much good to debate.

(For the record, the parameter passing aspect doesn't bother me at all,
because I think one thing that C# does make clear is the difference
between representation of value types and reference types).

Shame it's something that gets a lot of people confused.

Well, I *think* I understand the difference, and yet to me it doesn't make
sense to have to call "new" on a value type.

For me it creates a new "value" - but there we go.

[...] The
"new" operator doesn't really create the instance that you're using.

I think it depends on exactly what one means by "instance", to be
honest. I nearly used the word "value" but thought that wouldn't be
clear. It's quite possible that we don't have adequate terminology at
the moment.

Maybe. To me, "instance" means the actual storage required by the
variable. There's a physical, real-world connection between an "instance"
and the place where that "instance" is stored. But with value types in
C#, by the time I get around to writing "ValueType variable = new
ValueType()", the storage has been allocated and all that "new" is doing
for me is initializing that storage.

You see, to me, an instance is a collection of variable values. The
storage doesn't particularly bother me. Therefore even if I've already
got a value, with my way of looking at things I'm creating a new
instance which is replacing the old instance - whereas of course with
your way of looking at it I quite see that you're just replacing the
data within the existing instance.

Me too. After all, it doesn't change how we use the language. I just
feel it's more confusing to new C# programmers than it needs to be. (And
personally, being a new C# programmer I feel that I have a better handle
on what's confusing and what's not than some of you more experienced C#
folks.)

That would certainly be true if I hadn't helped rather a lot of other
newbie C# people over time (not trying to be immodest, just honest) -
and I think you have *much* more experience of other languages than
most of the C# neophytes I've seen, to be honest. I expect to be able
to explain things to you (on the rare occasions where it's necessary,
frankly) in terms which are defined elsewhere, point you at those
definitions and let you get on with it - rather than doing a lot of
hand-holding, comparing with real-world examples etc.

This is in no way to diminish the importance of the things you find
irritating/tricky/confusing/whatever.

Off the top of my head, I don't know. There's a reason I'm not a language
designer. All I know is that this code:

ValueType valuevar = new ValueType();
RefType refvar = new RefType();

has two lines that look like they do the same thing, and yet they don't do
the same thing at all.

Wise words from Obi-Wan: "Luke, you're going to find that many of the
truths we cling to depend greatly on our own point of view."

They're certainly different in terms of whether they need to allocate
storage or not. If you see the major point as being initializing an
instance, and use my view of an instance, it's a different story.

(I'm not going to claim the differences go away, just that I'd disagree
with the claim that "they don't do the same thing at all".)

If there were some way to get a reference to a value type that took
advantage of the "new" syntax, maybe it wouldn't bug me so much.
Something related to boxing, for example. But as far as I know, there
isn't. Feel free to correct me if I'm wrong about that.

Well, this would do it:
object o = new ValueType();

but it's really not what you want, most of the time
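
A small sketch of why boxing is usually not what you want here. Counter is a hypothetical struct. Assigning a struct to an object variable copies the value into a separate heap object, so the box and the original variable go their own ways afterwards.

```csharp
using System;

// Hypothetical struct, for illustration only.
struct Counter { public int Count; }

static class BoxingDemo
{
    // Boxing copies the value; later changes to the local do not reach the box.
    public static int BoxHoldsCopy()
    {
        Counter c = new Counter();
        c.Count = 1;
        object boxed = c;               // box: a copy of c now lives on the heap
        c.Count = 2;                    // changes only the local variable
        return ((Counter)boxed).Count;  // unboxing copies the boxed value back out
    }

    // Two boxed values are separate objects even when their contents are equal.
    public static bool TwoBoxesAreDistinctButEqual()
    {
        object a = new Counter();
        object b = new Counter();
        return !object.ReferenceEquals(a, b) && a.Equals(b);
    }

    static void Main()
    {
        Console.WriteLine(BoxHoldsCopy());                // 1
        Console.WriteLine(TwoBoxesAreDistinctButEqual()); // True
    }
}
```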

Peter Duniho · May 7, 2007

[...]

If there were some way to get a reference to a value type that took
advantage of the "new" syntax, maybe it wouldn't bug me so much.
Something related to boxing, for example. But as far as I know, there
isn't. Feel free to correct me if I'm wrong about that.

Well, this would do it:
object o = new ValueType();

but it's really not what you want, most of the time

Yeah, no...I wasn't really looking to box the value type per se. Just
something *like* that.

Though, I suppose in the context of C# that's going to accomplish almost
the same thing.

Anyway, thanks for the comments...I do see where you're coming from, and
mostly I'm of the mind that since the language is the way it is, how I
wish it to be isn't nearly as relevant as how it is today. Even if I'm
not entirely satisfied, I'm glad that the language does at least make
sense to someone, as that shows that the design isn't arbitrarily wrong or
anything like that.

And that's a good thing.

Pete

Jon Harrop · May 7, 2007

Diego said:
So Classes are faster than Structures..

In my FFT, using structs for complex numbers instead of classes is 3x
faster.

Bruce Wood · May 7, 2007

In my FFT, using structs for complex numbers instead of classes is 3x
faster.

Yes, of course... complex numbers are natural value types.
Particularly if you do a lot of calculations that involve intermediate
results, structs will be much quicker because they don't have to be
allocated on the heap, and they don't have to be GC'd.

On the other hand, if you make something large like a customer record
a struct, then it will be copied on every assignment and every time
it's passed to a method (unless you use "ref"), which eats up
substantial amounts of time and memory. Not to mention the horrible
mess that results as you try to fashion your code to _avoid_ having
the thing be copied all over the place.
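
The copying described above can be sketched as follows. CustomerRecord is a hypothetical, deliberately tiny stand-in for a large record; the helper method names are also made up. Each assignment or by-value call copies every field, and only "ref" lets the callee touch the caller's storage.

```csharp
using System;

// Hypothetical struct; imagine many more fields in a real record.
struct CustomerRecord
{
    public int Id;
    public decimal Balance;
}

static class CopyDemo
{
    // Receives a copy: the caller's record is untouched.
    static void AddByValue(CustomerRecord r, decimal amount) { r.Balance += amount; }

    // Receives an alias to the caller's variable: no copy is made.
    static void AddByRef(ref CustomerRecord r, decimal amount) { r.Balance += amount; }

    public static decimal AfterByValue()
    {
        CustomerRecord c = new CustomerRecord();
        AddByValue(c, 10m);
        return c.Balance;
    }

    public static decimal AfterByRef()
    {
        CustomerRecord c = new CustomerRecord();
        AddByRef(ref c, 10m);
        return c.Balance;
    }

    public static decimal AfterAssignment()
    {
        CustomerRecord a = new CustomerRecord();
        CustomerRecord b = a;   // copies all fields of a into b
        b.Balance = 5m;         // changes b only
        return a.Balance;
    }

    static void Main()
    {
        Console.WriteLine(AfterByValue());    // 0
        Console.WriteLine(AfterByRef());      // 10
        Console.WriteLine(AfterAssignment()); // 0
    }
}
```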

The moral: use the appropriate tool for the job. Classes are a lousy
representation for complex numbers. Structs are a lousy representation
for customers, stock items, etc.

By the way, Mr. Harrop, I take it this means that you've managed to
sort out the complex number thing in C#. Glad to hear it. :-)

Christof Nordiek · May 7, 2007

Dom said:
Well, fine. But I have to ask, what is the proper use of a Struct?

Structs are for simple, small, fixed-size types with value semantics, like the
predefined number types or things like DateTime etc.
String also has value semantics, but since it is variable-sized and can grow
very large, it is implemented as a class.

Would I ever use it instead of a Class?

Very probably, never!

Christof

Jon Harrop · May 7, 2007

Bruce said:
Yes, of course... complex numbers are natural value types.
Particularly if you do a lot of calculations that involve intermediate
results, structs will be much quicker because they don't have to be
allocated on the heap, and they don't have to be GC'd.

Yes and no. I do not yet understand why but switching between struct and
class for my low-dimensional vectors (2D and 3D) shows that classes are
slightly faster in that case, yet I would have expected their use to be
very similar to that of a complex number.

The moral: use the appropriate tool for the job. Classes are a lousy
representation for complex numbers. Structs are a lousy representation
for customers, stock items, etc.

Absolutely. Judging by the semantics, I'd always opt for a class by default
and only try a struct if it could work and might be faster, i.e. avoid
premature optimization.

By the way, Mr. Harrop, I take it this means that you've managed to
sort out the complex number thing in C#. Glad to hear it.

I had actually already written the whole thing when I posted that. I just
had the impression of reinventing the wheel by writing my own Complex
class...

Bruce Wood · May 8, 2007

Yes and no. I do not yet understand why but switching between struct and
class for my low-dimensional vectors (2D and 3D) shows that classes are
slightly faster in that case, yet I would have expected their use to be
very similar to that of a complex number.

Absolutely. Judging by the semantics, I'd always opt for a class by default
and only try a struct if it could work and might be faster, i.e. avoid
premature optimization.

Although I might be out of line given the domain you're working in, I
worry when people talk of struct versus class in terms of
"optimization". True, MS itself chose to go that route with Point and
Rectangle, but the most important thing for me is the change in
semantics between the two. The best example I have is that of a fellow
who posted here some time ago. He was trying to make a vector-based
graphics sketching program, and was frustrated that he couldn't seem
to get it to work with the already-available System.Drawing.Point
type.

The problem, of course, is that System.Drawing.Point is a struct.
This, in effect, makes it a commodity item, like int or double. You
make points, perform calculations that yield new points, and toss them
away when you're done with them. Points, ints, and doubles have very
low status within your program: you barely think about them, just make
new ones, copy them, transform them, and toss them. In particular (and
this was the fellow's problem) if you have two Point variables
containing (5, 5) and (5, 5), then there is no way to distinguish
them. That is to say, if the user drags a point from (4, 1) to (5, 5),
over top of another point already at (5, 5), then there's no way to
tell the two apart: value types have no identifying features apart
from their value.

What he wanted was a class. If you wrap System.Drawing.Point in a
class (or make your own class with x, y fields) then suddenly it
_does_ make sense to talk about "this point (5, 5)" versus "that point
(5, 5)", because a class has not only its state (aka value), but also
an address on the heap where it lives, so two points at (5, 5) _are_
distinguishable because they are stored in separate class instances.
Making these Point classes mutable causes no confusion, because that's
what we expect from reference semantics: if an object changes, all
references to that object see the change. It's all quite natural.
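
A rough sketch of the distinction, using made-up PointValue and PointObject types rather than System.Drawing.Point itself: two struct points at (5, 5) cannot be told apart, while two class points at (5, 5) remain separate objects, and a change made through one reference is visible through every other reference to the same object.

```csharp
using System;

// Hypothetical types, for illustration only.
struct PointValue
{
    public int X, Y;
    public PointValue(int x, int y) { X = x; Y = y; }
}

class PointObject
{
    public int X, Y;
    public PointObject(int x, int y) { X = x; Y = y; }
}

static class IdentityDemo
{
    // Two struct values with the same fields are simply equal; nothing else identifies them.
    public static bool StructsAreIndistinguishable()
    {
        PointValue a = new PointValue(5, 5);
        PointValue b = new PointValue(5, 5);
        return a.Equals(b);
    }

    // Two class instances with the same fields are still different objects.
    public static bool ObjectsAreDistinguishable()
    {
        PointObject a = new PointObject(5, 5);
        PointObject b = new PointObject(5, 5);
        return !object.ReferenceEquals(a, b);
    }

    // Dragging a class point from (4, 1) to (5, 5) is seen through every reference to it.
    public static int MutationSeenThroughAlias()
    {
        PointObject dragged = new PointObject(4, 1);
        PointObject alias = dragged;
        dragged.X = 5;
        dragged.Y = 5;
        return alias.X;
    }

    static void Main()
    {
        Console.WriteLine(StructsAreIndistinguishable()); // True
        Console.WriteLine(ObjectsAreDistinguishable());   // True
        Console.WriteLine(MutationSeenThroughAlias());    // 5
    }
}
```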

Changing from struct to class, or vice versa, dramatically changes how
the thing acts, and how you use it (or not). Of course, sometimes you
live with poor semantics because the difference in efficiency is
dramatic, but for the most part I decide between class and struct
based on how I want the thing to act.

Jon Harrop · May 8, 2007

Bruce said:
Although I might be out of line given the domain you're working in, I
worry when people talk of struct versus class in terms of
"optimization". True, MS itself chose to go that route with Point and
Rectangle, but the most important thing for me is the change in
semantics between the two.

Alas, I don't think there is any difference in semantics between the two for
me as I'm writing in F# (well, apart from that FFT ;-).

As a functional language, F# passes everything by reference so the only
effect of using a struct is to unbox the data structure which might be
faster. Uniformity certainly makes life easier...
