Bloated Files...

  • Thread starter: Michael Jones

Michael Jones

Hello,



We're moving a project from MFC to .NET, and we've been able to make some
great improvements along the way - especially on the GUI side (a big hurray
for the people who made the PropertyGrid).



But when it comes to serialization - well, I don't know.



Using BinaryFormatter, the file is 436 times larger than the file created
with MFC. And just a hint at the scale of the problem: the file with 2.2
million objects was 7MB and is now 3GB. (Oh - not to mention that the load
time was 30 seconds and is now well over 15 minutes.)



Now, I've read a lot about all the different options and have tested quite a
few of them. I love the way all the formatting pieces fit together, e.g. in
remoting, and I can see the merits.



But, just on the off chance that I am overlooking something - is there a
better serialization scheme hidden somewhere in .NET, or has somebody else
sat down and implemented something similar to the MFC way?





Kind Regards,

Michael
 
Understand that because MS offers a hammer (.NET), not everything is a nail.
There are many cases where other technologies are better suited.

I would have to see what you are doing with the BinaryFormatter to better
understand the issue, as the size seems rather large.

NOTE: Data that is string-oriented will be larger than it probably needs to
be, as it will be stored as Unicode characters. If you can use ASCII
characters, it would be better to convert strings into byte arrays before
formatting, as you will save 1 byte per character. In your case, that would
still leave 1.5GB, which is still huge.
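To put rough numbers on the Unicode point, here is a small sketch in Python (standing in for .NET, since .NET's in-memory `char` is a two-byte UTF-16 code unit; the text used is purely illustrative):

```python
# Compare the size of the same text stored as two-byte UTF-16 code units
# (the in-memory character format described above) versus one-byte ASCII.
# For pure-ASCII text, UTF-16 costs exactly twice as many bytes.
text = "serialize me " * 1000  # 13,000 ASCII characters

utf16_bytes = text.encode("utf-16-le")  # 2 bytes per character
ascii_bytes = text.encode("ascii")      # 1 byte per character

print(len(utf16_bytes), len(ascii_bytes))  # 26000 13000
```

Halving every string is exactly the "save 1 byte per character" saving described, which is why it only gets the 3GB file down to roughly 1.5GB.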

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

***************************
Think Outside the Box!
***************************
 
Michael Jones said:
Using BinaryFormatter, the file is 436 times larger than the file created
with MFC. And just a hint at the scale of the problem: the file with 2.2
million objects was 7MB and is now 3GB. (Oh - not to mention that the load
time was 30 seconds and is now well over 15 minutes.)

2.2 million objects in 7MB sounds *very* small - that's only just over
3 bytes per object. Are you sure those numbers are accurate? Or are you
really storing virtually nothing for each object?
 
Hello,



Well, I guess I was a bit imprecise there - 2.2 million objects in memory
when loaded. The (our) serialization implementation already removes a lot
of overhead, as well as aggregated or implicit objects. We also don't store
the unused-but-not-null objects in e.g. arrays, but have to reconstruct these
empty objects for the display processes. (Even MFC serialization needed a
bit of help to get there.) Also, I forgot that the smart pointers I use
get counted as extra objects, yet don't get stored at all.



I've started implementing a thin serialization and can now see light at the
end of the tunnel, as the file has now shrunk considerably.



Yet, while doing that, I am also studying the .NET-produced files, as I don't
really like reinventing the wheel, and I found something strange: most of the
time .NET (VS2005 Beta II, .NET 2.0) suppresses the object type information
for recurring types, yet at some stage it seems to forget this intention and
starts storing it every time an object gets stored. This seems to be the
reason for the explosion in file size. Has anybody observed something like
this?



Kind Regards

Michael
 