Hi Graham,
> I read a lot about serialising things to XML. Can someone give me some
> examples of why I would want to do this? I seem to be missing the point.
Serialization is a very helpful mechanism.
Traditionally, we create applications that either collect, analyze, or
transfer information. For some data collection systems, relational
databases are a terrific answer. The fields in each transaction are the
same and the existence of a value doesn't often drive the need for a
completely different data structure. Data can and should be collected in
rows and columns and stored in a relational database.
The problem is that databases are a bit too good. They are adaptable and
versatile, and we've been taught to use them for every data persistence
opportunity. "If all you have is a hammer, everything looks like a nail."
Some kinds and uses of data persistence are not well suited for pure
relational databases. What kinds are there? Look for the stuff that
"smells funny." Large data blocks stored as a blob in a database (not
including image files). Situations where a single key may refer to any of a
dozen tables (as a primary key) but never more than one. Situations where a
database table is created to have a set of fields of different types and an
indicator to tell the programmer which field contains the value, as a way of
allowing data of different types to be stored. Situations where database
records are created with an "OptionalField1", "OptionalField2" and
"OptionalField3" in their definition, to allow the user to define the
meanings of the fields (but you'd better not exceed three, or else!).
All these situations are ones that I see daily. They are all limitations of
the relational paradigm (and, in fact, the purely hierarchical database does
no better). The interesting thing is that if your data is Object Oriented,
the OO side of your app has little or no difficulty dealing with things like
this. The problem is not representing the data. The problem is persisting
it.
Applications that have to cope with data that "doesn't follow the rules" are
common. These applications become complex, especially in the data
persistence layer. Engineers call this the "Object-Relational Impedance
Mismatch". In other words, you have to write a LOT of code to get over the
fact that your objects cannot be easily converted to a relational database
format.
Serialization can be used as a way of persisting the data without storing it
to a relational database. You get none of the transactional features of a
relational database. However, in some cases, you don't need them. In those
cases, you can make your code considerably simpler by using serialization to
persist the data than you would be able to if you were writing to a
relational database.
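
To make that concrete, here is a minimal sketch in Python (standard library
only) of persisting an object by serializing it to XML and reading it back.
The Customer class, its fields, and the XML element names are all
hypothetical, made up for illustration. Note how the "extras" dictionary
replaces the OptionalField1..3 trick: the user can define as many fields as
they like.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Customer:
    # Hypothetical class, for illustration only.
    customer_id: int
    name: str
    # Any number of user-defined fields, instead of OptionalField1..3.
    extras: dict = field(default_factory=dict)


def to_xml(customer: Customer) -> str:
    """Serialize a Customer to an XML string."""
    root = ET.Element("customer", id=str(customer.customer_id))
    ET.SubElement(root, "name").text = customer.name
    for key, value in customer.extras.items():
        ET.SubElement(root, "extra", name=key).text = value
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Customer:
    """Rebuild a Customer from the XML produced by to_xml."""
    root = ET.fromstring(text)
    extras = {e.get("name"): e.text or "" for e in root.findall("extra")}
    return Customer(int(root.get("id")), root.findtext("name"), extras)
```

Writing the string returned by to_xml to a file, and passing the file's
contents to from_xml later, is all the "persistence layer" this object
needs. There is no transaction support here, which is exactly the trade-off
described above.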
In other cases, you can combine the two: you store some data in relational
tables, and add a column where you serialize the entire object or some
fragment of it, the fragment being the portion of your data that doesn't
"follow the rules". In SQL Server 2000, you had to create a 'text' field to
hold this data, but in the newest SQL Server (SQL Server 2005, code-named
Yukon), you can define the field to be of type 'XML' and not only store XML
data in it, but query the data in those fields using XQuery (which builds
on XPath), and enforce constraints on the data.
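
Here is a rough sketch of the combined approach, again in Python. It uses
SQLite with a plain TEXT column as a stand-in for SQL Server, so it
corresponds to the older 'text' field style rather than a native XML type
(SQLite cannot query inside the XML). The "orders" table, its columns, and
the helper functions are assumptions invented for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def save_order(conn, order_id, customer, details):
    """Store fixed fields in columns and the irregular part as XML text."""
    root = ET.Element("details")
    for key, value in details.items():
        ET.SubElement(root, "item", name=key).text = str(value)
    conn.execute(
        "INSERT INTO orders (order_id, customer, details_xml) VALUES (?, ?, ?)",
        (order_id, customer, ET.tostring(root, encoding="unicode")),
    )


def load_details(conn, order_id):
    """Read the XML column back into a dict."""
    row = conn.execute(
        "SELECT details_xml FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    root = ET.fromstring(row[0])
    return {item.get("name"): item.text for item in root.findall("item")}


# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, customer TEXT, details_xml TEXT)"
)
save_order(conn, 1, "Acme", {"gift_wrap": "yes", "engraving": "To Graham"})
```

The regular fields (order_id, customer) stay queryable with ordinary SQL,
while the part that doesn't "follow the rules" travels as one serialized
blob of XML that the application deserializes when it needs it.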
In other situations, like those involved in transmitting information between
systems (B-to-B or systems integration), a database is more of a problem
than a solution. In situations like this, you need to be able to take a
"snapshot" of a transaction in time, completely wrap it up (potentially sign
it), and send just one transaction, with all associated data, from one place
to another. In this case, an object can be composed in memory, serialized
to a single string, and transmitted as a self-contained business transaction
through a reliable messaging mechanism. This is a fundamental notion in
Service Oriented Architecture, or SOA.
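
As a sketch of that idea in Python: compose the transaction in memory,
serialize it to one string, sign it, and rebuild it on the other side. The
order structure and the pack/unpack names are hypothetical. The HMAC over a
pre-shared key is only a simple stand-in for whatever signing scheme you
would really use, and the messaging transport itself is not shown.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Assumption: sender and receiver share this key out of band.
SECRET = b"shared-secret-for-illustration"


def pack(order_id, lines):
    """Serialize one transaction to a single XML string plus a signature."""
    root = ET.Element("order", id=str(order_id))
    for sku, qty in lines:
        ET.SubElement(root, "line", sku=sku, qty=str(qty))
    payload = ET.tostring(root, encoding="unicode")
    signature = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return payload, signature


def unpack(payload, signature):
    """Verify the signature, then rebuild the transaction on the receiver."""
    expected = hmac.new(SECRET, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise ValueError("signature mismatch")
    root = ET.fromstring(payload)
    lines = [(line.get("sku"), int(line.get("qty"))) for line in root.findall("line")]
    return int(root.get("id")), lines
```

The sender calls pack and hands the two strings to the messaging layer; the
receiver calls unpack and gets the whole transaction back without touching
the sender's database.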
The ability to serialize and deserialize is something that has to be
hand-coded when it cannot be provided by the tools. I've seen countless
situations where buckets of code can be replaced with three calls to the
framework for serialization and deserialization. I cannot say enough how
important this is for reducing the complexity of the code and therefore,
reducing the cost to create, test, and maintain.
Hope this helps,
--- Nick
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik
Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--