XML document question

  • Thread starter: Lloyd Dupont

Lloyd Dupont

I have a class which looks like this:

using System.Collections.Generic;

class Stuff
{
    Dictionary<string, object> properties;
    List<Stuff> children;
}

I'm writing a home-made XML 'serialization'.
I put that in quotes because it's not real serialization: I'm writing the
XML writer myself (without System.Reflection).


I'm using a schema like this:
<stuff count="2" key1="value1" key2="value2" key3="value3">
    <stuff>
    .........
    </stuff>
    <stuff>
    .........
    </stuff>
</stuff>

where count = children.Count, and key1, key2, key3 are the keys in the
properties dictionary, with value1, value2, value3 as their values.

The properties dictionary is filled by the user with whatever (s)he wants.


Now my problem is:
==============

As the user can put whatever (s)he wants in the properties dictionary,
(s)he could just as well create a key named 'count', in which case the
'count' attribute would be present twice in the tag.

It's not a problem for my reader (which expects a first count followed by
any kind of attribute), but it might be for other people wanting to consume
my XML documents.

What do you think?

it's actually much easier this way...
(particularly because this sample output is far from the whole story)
 
Lloyd Dupont wrote:

> As the user can put whatever (s)he wants in the properties dictionary,
> (s)he could just as well create a key named 'count', in which case the
> 'count' attribute would be present twice in the tag.
>
> It's not a problem for my reader (which expects a first count followed by
> any kind of attribute), but it might be for other people wanting to
> consume my XML documents.

Any XML parser or XML API can find the number of child stuff elements by
counting them while parsing, so I am not sure it is necessary to have an
attribute giving that count.
As for having two attributes named 'count': that is not allowed in XML and
breaks the well-formedness rules, so a conforming parser will reject the
document. Also, the attributes of an XML element are not ordered, so a
consumer cannot rely on the 'real' count coming first.
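
To make both points concrete, here is a minimal sketch using System.Xml. The class and method names (DuplicateAttributeDemo, IsWellFormed, CountChildElements) are made up for illustration and are not part of the original poster's code.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, for illustration only.
static class DuplicateAttributeDemo
{
    // Returns true if the text parses as well-formed XML.
    // A repeated attribute name makes LoadXml throw an XmlException.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // Counts the direct child elements with the given name,
    // which is the information the 'count' attribute duplicates.
    public static int CountChildElements(string xml, string name)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        int n = 0;
        foreach (XmlNode node in doc.DocumentElement.ChildNodes)
        {
            if (node.NodeType == XmlNodeType.Element && node.Name == name)
                n++;
        }
        return n;
    }
}
```

With this, IsWellFormed fails for an element that carries 'count' twice, while CountChildElements recovers the number of children without any attribute at all.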

I am also not sure how an arbitrary object stored in a
Dictionary<string, object> is supposed to fit into an XML attribute value.
 
You may run into an issue when deserializing your dictionary, since the
values are of type Object.
Writing them to an XML stream won't be a problem, since you could just call
ToString() on each item, but reading them back might be, because you don't
know what type they were.

Could you take an approach like this?

<stuff>
<property name="propx" value="valx" type="System.String"/>
<property name="propy" value="valy" type="System.Int32"/>
<property name="propz" value="valz" type="System.DateTime"/>
....
</stuff>

You can get the types when serializing by using object.GetType(), and then
deserialize each value to the correct type. Also, when you think about it,
there really isn't a point in having the count attribute there.
When you deserialize this document, you can build your dictionary as the
appropriate nodes are read, without having to worry about the document's
state as a whole. Also, the document's structure can now be defined
explicitly through a DTD or an XML Schema, which means you can create your
XmlReader with an XmlSchema instance so that your input is validated while
it is being read into your application.
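
Something along these lines would do the round trip. This is only a sketch under a few assumptions: the names TypedPropertySketch, Write and Read are invented for the example, values are never null, every value type implements IConvertible (string, int, DateTime and so on), and the type lives in the core library so that Type.GetType can resolve it from its full name alone.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Xml;

// Hypothetical helper class, for illustration only.
static class TypedPropertySketch
{
    // Writes each entry as <property name=".." value=".." type=".."/>.
    // Assumes no value is null (GetType() would throw otherwise).
    public static string Write(Dictionary<string, object> properties)
    {
        var text = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter w = XmlWriter.Create(text, settings))
        {
            w.WriteStartElement("stuff");
            foreach (KeyValuePair<string, object> p in properties)
            {
                w.WriteStartElement("property");
                w.WriteAttributeString("name", p.Key);
                // Invariant culture keeps the text independent of the machine's locale.
                w.WriteAttributeString("value",
                    Convert.ToString(p.Value, CultureInfo.InvariantCulture));
                w.WriteAttributeString("type", p.Value.GetType().FullName);
                w.WriteEndElement();
            }
            w.WriteEndElement();
        }
        return text.ToString();
    }

    // Rebuilds the dictionary one <property> element at a time.
    public static Dictionary<string, object> Read(string xml)
    {
        var result = new Dictionary<string, object>();
        using (XmlReader r = XmlReader.Create(new StringReader(xml)))
        {
            while (r.Read())
            {
                if (r.NodeType != XmlNodeType.Element || r.Name != "property")
                    continue;
                // Throws if the type name cannot be resolved.
                Type type = Type.GetType(r.GetAttribute("type"), true);
                result[r.GetAttribute("name")] = Convert.ChangeType(
                    r.GetAttribute("value"), type, CultureInfo.InvariantCulture);
            }
        }
        return result;
    }
}
```

Because the property name is now an attribute value rather than an attribute name, a user key called 'count' (or any other reserved word) can no longer collide with anything. Note that the default string form of some types is lossy (DateTime drops fractions of a second, for example), so a real implementation would likely pick an explicit format per type.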
 
1. The count attribute has a performance 'raison d'être'.
There are a few instances where it could be big (more than a few thousand
children); if I don't specify it, I am going to run into many resizes.

This is really annoying, as I'm not using a List, which doubles its internal
capacity every time it grows, but (purposefully) a home-made collection
which grows its internal capacity linearly, hence lots of resizes/copies
will happen.

2. I thought of a format like
<tag>
<properties>
......
</properties>
<content>
......
</content>
</tag>

but it's going to completely blow up the size of my XML file (lots of tags
here!), which annoys me, whereas my format is 'compact'.

What do you think of this concern?
 