XML Deserialization Attribute vs. Element

Todd

I'm developing an application that accepts XML input from a diverse set of
clients. Some clients specify values as an attribute, and some specify values
as a sub-element.

The XmlSerializer doesn't seem to handle both cases (attribute vs. element)
by default. Is there a way to get the deserializer to recognize EITHER an
attribute or an element for the same class member?

If I add the [XmlAttribute] attribute to the member sex (below), then the
attribute version deserializes correctly, but the sub-element version fails.
It seems there's no way to get the deserializer to handle both cases?

According to the standard XML spec, specifying a value as either an element
or an attribute should be identical.



using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;

[Serializable]
public class person
{
public string sex;
public string firstname;
public string lastname;
}


public void Test()
{
string textWithElement =
@"<person>
<sex>female</sex>
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>";

string textWithAttribute =
@"<person sex=""female"">
<firstname>Anna</firstname>
<lastname>Smith</lastname>
</person>";


TestDeserialization(textWithElement);
TestDeserialization(textWithAttribute);

// Output
// person, sex=female, firstname=Anna, lastname=Smith
// person, sex=, firstname=Anna, lastname=Smith

}

public void TestDeserialization(string text)
{
    person p;
    XmlSerializer mySerializer = new XmlSerializer(typeof(person));
    MemoryStream mem = new MemoryStream(Encoding.UTF8.GetBytes(text));
    object o = mySerializer.Deserialize(mem);
    p = (person)o;

    // Keep the format string on one line; a regular string literal cannot span lines.
    System.Diagnostics.Debug.WriteLine(string.Format(
        "person, sex={0}, firstname={1}, lastname={2}", p.sex, p.firstname, p.lastname));
}
 
Todd said:
I'm developing an application that accepts XML input from a diverse set of
clients. Some clients specify values as an attribute, and some specify values
as a sub-element.

The XmlSerializer doesn't seem to handle both cases (attribute vs. element)
by default. Is there a way to get the deserializer to recognize EITHER an
attribute or an element for the same class member?

No, there isn't. In general, the XmlSerializer works best when there's one
single format that's tightly coupled to the class. This is most compatible
with the attributed model the serializer uses. If you need to deserialize
from heterogeneous sources, you can either write the deserialization logic
yourself (with simple classes like XPathNavigator), or you can use multiple
distinct serialization classes decoupled from the actual data object, or you
can use overrides, or you can modify your class (which is easiest, but tends
to get ugly).
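
As a rough illustration of the first option (writing the deserialization logic yourself), here is a minimal sketch using XPathNavigator. The Person class, the PersonReader helper, and the AttributeOrElement method are hypothetical names made up for this example; only the XML shape comes from the question. It assumes an empty attribute value can be treated the same as a missing one.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

// Plain data object; no serialization attributes needed for this approach.
public class Person
{
    public string Sex;
    public string FirstName;
    public string LastName;
}

public static class PersonReader
{
    // Returns the attribute value if present, otherwise the text of the
    // child element with the same name, otherwise null.
    // GetAttribute returns an empty string when the attribute is missing,
    // so an empty attribute is treated as absent here (an assumption).
    private static string AttributeOrElement(XPathNavigator node, string name)
    {
        string attribute = node.GetAttribute(name, "");
        if (attribute.Length > 0)
        {
            return attribute;
        }
        XPathNavigator child = node.SelectSingleNode(name);
        return child == null ? null : child.Value;
    }

    public static Person Parse(string xml)
    {
        XPathDocument document = new XPathDocument(new StringReader(xml));
        XPathNavigator person = document.CreateNavigator().SelectSingleNode("/person");
        if (person == null)
        {
            throw new FormatException("Expected a <person> root element.");
        }

        Person result = new Person();
        result.Sex = AttributeOrElement(person, "sex");
        result.FirstName = AttributeOrElement(person, "firstname");
        result.LastName = AttributeOrElement(person, "lastname");
        return result;
    }
}
```

With this helper every value may arrive in either form, at the cost of maintaining the mapping by hand.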

Here's an example of modifying the class:

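[Serializable]
[XmlRoot("person")] // XmlSerializer matches names case-sensitively
public class Person {
    [XmlAttribute(AttributeName = "sex")]
    public string SexAttribute;

    [XmlElement(ElementName = "sex")]
    public string SexElement;

    // Whichever form the client sent; not part of the XML mapping.
    [XmlIgnore]
    public string Sex {
        get { return SexElement ?? SexAttribute; }
    }

    [XmlElement(ElementName = "firstname")]
    public string FirstName;
    [XmlElement(ElementName = "lastname")]
    public string LastName;
}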

This also illustrates another problem: if you want to serialize this class,
you have to decide whether the "sex" field will be serialized as an element
or as an attribute, so it makes sense that these end up as separate properties.
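
To check that the two-member approach accepts both input forms, something like the following sketch can be used. The Demo class and its Load method are hypothetical helpers for this example, and the [XmlRoot]/[XmlElement] names are added on the assumption that the incoming XML uses the lowercase names from the question.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("person")]
public class Person
{
    // Filled when the client sends <person sex="...">.
    [XmlAttribute("sex")]
    public string SexAttribute;

    // Filled when the client sends <sex>...</sex>.
    [XmlElement("sex")]
    public string SexElement;

    // Convenience accessor; the element wins if both are present.
    [XmlIgnore]
    public string Sex
    {
        get { return SexElement ?? SexAttribute; }
    }

    [XmlElement("firstname")]
    public string FirstName;

    [XmlElement("lastname")]
    public string LastName;
}

public static class Demo
{
    // Deserializes one <person> document from a string.
    public static Person Load(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Person));
        using (StringReader reader = new StringReader(xml))
        {
            return (Person)serializer.Deserialize(reader);
        }
    }
}
```

Calling Demo.Load on either of the two sample documents from the question should give a Person whose Sex property is "female"; only the member that was actually populated (SexElement or SexAttribute) differs.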

Here's how we can do the same thing with overrides:

XmlAttributeOverrides overrides = new XmlAttributeOverrides();
XmlAttributes attributes = new XmlAttributes();
attributes.XmlAttribute = new XmlAttributeAttribute();
// Map "sex" to an attribute instead of an element
overrides.Add(typeof(person), "sex", attributes);
XmlSerializer mySerializer = new XmlSerializer(typeof(person), overrides);

Obviously, this solution doesn't work unless you know what format your
client will use in advance. The same objection applies to any solution that
uses XmlSerializer for deserialization, however.
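
If the format is not known in advance but can be detected from the document itself, one possible workaround is to inspect the root element first and then pick a matching serializer. The sketch below is only an illustration under that assumption; PersonLoader, ElementForm, and AttributeForm are hypothetical names. It parses the input into an XmlDocument before deserializing, which costs extra memory for large inputs.

```csharp
using System;
using System.Xml;
using System.Xml.Serialization;

public class person
{
    public string sex;
    public string firstname;
    public string lastname;
}

public static class PersonLoader
{
    // Serializers created with overrides are not cached by the framework,
    // so each one is built once and reused.
    private static readonly XmlSerializer ElementForm =
        new XmlSerializer(typeof(person));
    private static readonly XmlSerializer AttributeForm = CreateAttributeForm();

    // Builds a serializer that maps "sex" to an attribute.
    private static XmlSerializer CreateAttributeForm()
    {
        XmlAttributes attributes = new XmlAttributes();
        attributes.XmlAttribute = new XmlAttributeAttribute();
        XmlAttributeOverrides overrides = new XmlAttributeOverrides();
        overrides.Add(typeof(person), "sex", attributes);
        return new XmlSerializer(typeof(person), overrides);
    }

    public static person Load(string xml)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);

        // Detect the client's format from the root element.
        bool usesAttribute = document.DocumentElement.HasAttribute("sex");
        XmlSerializer serializer = usesAttribute ? AttributeForm : ElementForm;

        using (XmlNodeReader reader = new XmlNodeReader(document))
        {
            return (person)serializer.Deserialize(reader);
        }
    }
}
```

This keeps the data class free of duplicate members, but the detection logic has to grow with every member that clients may send in either form.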

Todd said:
According to the standard XML spec, specifying a value as either an element
or an attribute should be identical.
This is not true, or rather, you're misinterpreting it. An attribute is
*not* equivalent to a child element. They are equivalent insofar as they
allow for the same semantics, but they are not treated as the same thing.
You'll find that almost everything based on XML makes a clear distinction
between them. Unfortunately, this does lead to endless "attribute or
element" discussions that are often pointless.

It's far better to specify which one you're going to use and stick to it
than to allow clients to pass whatever they want.
 
Thank you, Jeroen, for your detailed answer! Your solution of having a
separate attribute member and element member will work well for my
application, and I'll likely go with that approach (I already tested a sample
class with success!).
Fortunately, I don't have to serialize anything, just accept their input and
deserialize.

I only wish I could go back to "the clients" and tell them to use just one
approach. But their formats are already "set in stone", so my app needs to be
flexible enough to handle both.

Sincerely yours,
-Todd
 