Serialization / ISerializable question for a Guru or two!


Robert Hooker

Given a class:

[Serializable]
public class MyClass : ISerializable
{
    //
    // Lots of fields and members
    //

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        // Write every serializable field under its own name.
        MemberInfo[] arMI =
            FormatterServices.GetSerializableMembers(this.GetType(), context);
        foreach (MemberInfo mi in arMI)
        {
            info.AddValue(mi.Name, ((FieldInfo)mi).GetValue(this));
        }
    }
}


I can serialize a collection of 10000 "MyClass" instances with no problems. It
takes about 2 seconds or so with a BinaryFormatter.

If I remove the "ISerializable" inheritance and serialize the collection, it
takes about half the time, around 1 second.

My Question:

I ask because, most of the time, I want the ISerializable implementation
(for change-tolerance during deserialization) but for some other 'non
persistence' operations I want a fast serialization (for deep copies and for
"undo redo" snapshots of object graphs)

Any help is greatly appreciated!
Rob.
 
Hello Rob,

Thanks for your post. I reviewed your description carefully; however, I am
not quite sure exactly what question you are asking. Still, I'd like to
share the following information with you:

Generally speaking, we use SerializableAttribute to indicate that a class
can be serialized. If we want to control the serialization process of a
class, we can implement the ISerializable interface. Please refer to the
following MSDN documentation:

SerializableAttribute Class
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemserializableattributeclasstopic.asp

ISerializable Interface
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemruntimeserializationiserializableclasstopic.asp

Does this answer your question?

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD

Get Secure! -- www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
 
Thanks for your reply - but no, it doesn't answer my question:

Restated:

I have a class which implements ISerializable (see the first post).
- Serialization of 10000 instances takes 4 seconds (for example), using
GetObjectData etc.

If I *remove* ISerializable:
- Serialization of 10000 instances takes 2 seconds (for example), using
.NET's built-in serialization

I want to know if I can leave ISerializable on my class and at runtime:
- sometimes have serialization use ISerializable (GetObjectData etc.), and
- other times ignore ISerializable and use .NET's built-in serialization


Rob.
 
Rob,
Could you just delegate this to the RemotingServices object? Something like:

public void GetObjectData(SerializationInfo info, StreamingContext context)
{
    if (criteriaToCustomSerialize)
    {
        // Custom path: write every serializable field under its own name.
        MemberInfo[] arMI = FormatterServices.GetSerializableMembers(this.GetType(), context);
        foreach (MemberInfo mi in arMI)
        {
            info.AddValue(mi.Name, ((FieldInfo)mi).GetValue(this));
        }
    }
    else
    {
        // Otherwise hand the work to the framework.
        RemotingServices.GetObjectData(this, info, context);
    }
}

It took me a few minutes to find that object. I knew it had to be there, but I had to use the object browser in the IDE to find it instead of the help documentation. I think that the call to RemotingServices happens when there is no ISerializable interface implemented by an object, so this just allows you to delegate if you want the framework to handle it. Hope this helps.

BTW, another reason why the custom serialization you are using might show slower performance is the use of foreach. It would be faster if you used a for(int i=0; i < arMI.Length; i++) instead. Please correct me if I am wrong there, but I always try to use that type of iteration with large numbers of objects.
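
One candidate for the criteriaToCustomSerialize check above is the StreamingContext that the caller hands to the formatter. The following is only a sketch, assuming a runtime where BinaryFormatter is still available (.NET Framework); Sample, its fields, FastPathCalls and CloneHelper are hypothetical names made up for illustration. Note that GetObjectData is still called in both cases; the context only lets it pick a cheaper branch. The check uses equality rather than a bit test because a default BinaryFormatter uses StreamingContextStates.All, which includes the Clone bit.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Sample : ISerializable
{
    public int Value;
    public string Name;

    // Hypothetical counter, only here to make the chosen branch observable.
    // Static fields are never part of the serialized data.
    public static int FastPathCalls;

    public Sample() { }

    // Deserialization constructor that ISerializable types must provide.
    protected Sample(SerializationInfo info, StreamingContext context)
    {
        Value = info.GetInt32("Value");
        Name = info.GetString("Name");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        if (context.State == StreamingContextStates.Clone)
        {
            // In-memory copy: write the fields directly, no reflection.
            FastPathCalls++;
            info.AddValue("Value", Value);
            info.AddValue("Name", Name);
        }
        else
        {
            // Persistence: the reflection-based loop from the original post.
            MemberInfo[] members =
                FormatterServices.GetSerializableMembers(GetType(), context);
            foreach (MemberInfo mi in members)
            {
                info.AddValue(mi.Name, ((FieldInfo)mi).GetValue(this));
            }
        }
    }
}

public static class CloneHelper
{
    // Round-trips the object through a MemoryStream with a Clone context.
    public static T DeepCopy<T>(T source)
    {
        var formatter = new BinaryFormatter(
            null, new StreamingContext(StreamingContextStates.Clone));
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, source);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }
}
```

A call such as CloneHelper.DeepCopy(new Sample { Value = 7, Name = "a" }) would take the direct branch, while a formatter created with new BinaryFormatter() or with StreamingContextStates.File would take the reflection branch. Whether the direct branch is measurably faster would need timing on the real class.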

John Sheppard
Missouri Botanical Garden

 
Hi Rob,

I apologize for the misunderstanding.

If we have implemented the ISerializable interface in a class, .NET will
use ISerializable and call GetObjectData for serialization and there is no
way to bypass it.

In addition, please kindly note that RemotingServices is used for
publishing remoted objects and proxies instead of serialization.
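
A possible exception worth testing: as far as the documented formatter behaviour goes, a serialization surrogate registered for a type is consulted before the type's own ISerializable implementation, so GetObjectData on the class is not called for that formatter instance. The sketch below assumes a runtime where BinaryFormatter is available; Tracked, FieldSurrogate and SnapshotHelper are hypothetical names, and the surrogate still uses reflection and name/value pairs, so this changes who does the work rather than guaranteeing a speed-up.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Tracked : ISerializable
{
    public int Value;

    // Hypothetical counter showing whether the class's own
    // GetObjectData ran. Static, so it is never serialized.
    public static int CustomCalls;

    public Tracked() { }

    protected Tracked(SerializationInfo info, StreamingContext context)
    {
        Value = info.GetInt32("Value");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        CustomCalls++;
        info.AddValue("Value", Value);
    }
}

// Copies all serializable fields, much as default serialization would.
public sealed class FieldSurrogate : ISerializationSurrogate
{
    public void GetObjectData(object obj, SerializationInfo info,
                              StreamingContext context)
    {
        MemberInfo[] members =
            FormatterServices.GetSerializableMembers(obj.GetType(), context);
        object[] values = FormatterServices.GetObjectData(obj, members);
        for (int i = 0; i < members.Length; i++)
        {
            info.AddValue(members[i].Name, values[i]);
        }
    }

    public object SetObjectData(object obj, SerializationInfo info,
                                StreamingContext context,
                                ISurrogateSelector selector)
    {
        MemberInfo[] members =
            FormatterServices.GetSerializableMembers(obj.GetType(), context);
        object[] values = new object[members.Length];
        for (int i = 0; i < members.Length; i++)
        {
            FieldInfo field = (FieldInfo)members[i];
            values[i] = info.GetValue(field.Name, field.FieldType);
        }
        return FormatterServices.PopulateObjectMembers(obj, members, values);
    }
}

public static class SnapshotHelper
{
    // Copies a Tracked instance using the surrogate instead of ISerializable.
    public static Tracked Snapshot(Tracked source)
    {
        var selector = new SurrogateSelector();
        selector.AddSurrogate(typeof(Tracked),
            new StreamingContext(StreamingContextStates.All),
            new FieldSurrogate());

        var formatter = new BinaryFormatter();
        formatter.SurrogateSelector = selector;
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, source);
            stream.Position = 0;
            return (Tracked)formatter.Deserialize(stream);
        }
    }
}
```

A formatter without the selector would still go through Tracked.GetObjectData, so the persistence path keeps its change tolerance while snapshots use the surrogate. A simple field-copying surrogate like this one may need more care for graphs with cycles, since referenced objects can still be incomplete when SetObjectData runs.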

Please feel free to let me know if you have any problems or concerns.

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 
Hello Rob,

How are things going? I would appreciate it if you could post here to let
me know the status of the issue. If you have any questions or concerns,
please don't hesitate to let me know. I look forward to hearing from you,
and I am happy to be of assistance.

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 
Hi,

I still haven't gotten a good answer as to what is going on under the hood
of serialization. (I suppose I shouldn't need to know.)

Can you pass this question/observation on to someone in the 'framework' team?
Maybe they can shed some light on what's going on.

**********
Having read a series of excellent articles by J Richter for MSDN mag, where
he explains what 'default' serialization is doing under the hood, I wrote an
ISerializable implementation which does the same. It produces the same
size/speed of serialized streams as 'default' serialization, but only
for 'simple' object graphs. I'd like to know the 'why' of this so that I can
design my graph in a better way.

In the 3 tests below my simple object graph was serialized twice. The first
time is with 'default' serialization only, and the second time I recompile
with ISerializable on the nodes and root.

Here's my graph:

*** Test1
Root object (contains an arraylist)
    Node1
    Node2
    Node3

Both methods result in an identical file: 90 bytes

*** Test2
Root object
    Node1 (contains a reference to "root")
    Node2 (contains a reference to Node1)
    Node3 (contains a reference to Node2)

Both methods result in an identical file: 204 bytes

*** Test3 - where I got the surprise
Root object
    Node1
    Node2 (**creates** and holds a reference to Node2a)
        Node2a
    Node3

Built-in serialization gives a file of 215 bytes, while my previously
identical ISerializable method goes to 480 bytes!!
In both cases, the graph is valid when I deserialize again, but clearly the
ISerializable version of the file contains data it doesn't need to.


The "serialize out" code I am using inside my ISerializable.GetObjectData
implementation is this:

MemberInfo[] arMI =
    FormatterServices.GetSerializableMembers(obj.GetType(), context);
for (int i = 0; i < arMI.Length; i++)
{
    // Each serializable member is a field; store it under its own name.
    FieldInfo fi = (FieldInfo)arMI[i];
    info.AddValue(fi.Name, fi.GetValue(obj));
}

Can anyone shed any light on this explosion in filesize?
Rob.
 
Hello Rob,

Thanks for your response. I am performing research on this issue and will
update you with my information.

Have a nice day!

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 
Hello Rob,

I reviewed your description carefully, and think more information is needed
before moving forward:

In Test3, do you have any nested reference between Node2 and Node2a?

I suggest that you use XML Serialization to serialize the nodes to an
XML file so that we are able to check what is going on.

I am standing by for your response.

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 
Hello Rob,

Thanks for your project and output files. I checked it on my side and now
share the following information with you:

As you know, when using custom serialization, we need to provide the name
to associate with each value. Based on my research, it's that name which
causes the size difference. I suggest that you open both output files
in the VS .NET IDE; you will see the following line repeated an extra 100
times in the output of custom serialization:

SerializationTest.ChildNodeval1parentsubNode

Does this answer your question?

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 
Hi,
> Based on my research, it's that name which causes the size difference.

I think you need to look at your research again! :-)

For the BinaryFormatter:
- I get *identical* file sizes in my serialized output from .NET's built-in
serializer and custom serialization
- so long as the ChildNode doesn't create a new object inside itself

- Opening up the resulting binary files in Notepad, you can *clearly* see the
"names" of the fields for **BOTH** built-in serialization and custom
serialization

- When I make ChildNode create a new object inside itself, the file sizes
become different by a factor of 2. **BOTH** still have the field names, but
the custom serialized file now appears to have a different structure than it
did previously, and is much larger.

> Does this answer your question?

So, no, it doesn't. The names you see (and you show below) appear in BOTH the
built-in and custom serialization files.
In fact, this article: http://msdn.microsoft.com/msdnmag/issues/02/04/net/
by J Richter also states that "built-in" serialization uses name/value pairs
as well.


So - why is builtin serialization sometimes the same as custom serialization
in its output, and sometimes not?

Rob.


 
Hi Rob,

> So - why is builtin serialization sometimes the same as custom serialization
> in its output, and sometimes not?

AFAIK, the built-in serialization optimizes for better performance under
certain circumstances. Custom serialization, on the other hand, has to
strictly follow the ISerializable.GetObjectData() you provided, which repeats
"SerializationTest.ChildNodeval1parentsubNode" an extra 100 times in this
case.

Regards,

HuangTM
Microsoft Online Partner Support
MCSE/MCSD
 