XML Merge

Tom · Jan 5, 2009

Geez,

You'd think this would be easy, but it's killing me. I have 2 XElement
objects. Both follow the same schema, and all I want to do is merge the
2 w/o having any duplicates. Is there an easy way to do this? I've tried
LINQ and the old XML API, but have been unsuccessful with both. If
somebody has an example in LINQ, that would be awesome.

Thanks.

Martin Honnen · Jan 5, 2009

Tom said:
You think this would be easy, but it's killing me. I have 2 XElement
objects. Both consist of the same schema, and all I want to do is merge
the 2 w/o having any duplicates. Is there an easy way to do this? I've
tried Linq, and the old XML API, but have been unsuccessful with both.
If somebody has an example in Linq, that would be awesome.

Well, how exactly do you want to identify duplicates? Based on a certain
element or attribute value, like an id element or attribute? Does the
schema define an id attribute or key elements?

Martin Honnen · Jan 5, 2009

Tom said:
You think this would be easy, but it's killing me. I have 2 XElement
objects. Both consist of the same schema, and all I want to do is merge
the 2 w/o having any duplicates. Is there an easy way to do this? I've
tried Linq, and the old XML API, but have been unsuccessful with both.
If somebody has an example in Linq, that would be awesome.

LINQ to XML provides a method XNode.DeepEquals
http://msdn.microsoft.com/en-us/library/system.xml.linq.xnode.deepequals.aspx
that might help you to compare nodes. But of course it all depends on
how exactly you want to identify duplicates.
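
As a rough illustration of the XNode.DeepEquals suggestion above, the sketch below compares two person elements. The class name DeepEqualsDemo and the method name SamePerson are made up for this example, and the person/personName element names are only assumed.

```csharp
using System;
using System.Xml.Linq;

public static class DeepEqualsDemo
{
    // Hypothetical helper: true when both elements have the same name,
    // the same attributes, and pairwise-equal child nodes (checked recursively).
    public static bool SamePerson(XElement a, XElement b)
    {
        return XNode.DeepEquals(a, b);
    }
}
```

Note that DeepEquals compares the whole subtree, so two persons with the same name but different children would not count as equal. Whether that matches the intended notion of "duplicate" is exactly the open question.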

Tom · Jan 5, 2009

Martin Honnen said:
Well, how exactly do you want to identify duplicates? Based on a certain
element or attribute value, like an id element or attribute? Does the
schema define an id attribute or key elements?

Well,

Here's how I build an XML file ...

public void RecursivelyBuildDescendentTree()
{
    XElement OriginalPerson =
        new XElement("Person",
            new XElement("Name", "Bob"));

    XElement Child1 =
        new XElement("Person",
            new XElement("Name", "John"));
    XElement Child2 =
        new XElement("Person",
            new XElement("Name", "Jane"));

    XElement Girlfriend1 =
        new XElement("Person",
            new XElement("Name", "Sally"));
    XElement Girlfriend2 =
        new XElement("Person",
            new XElement("Name", "Kelly"));
}

public void test()
{
    XElement root = new XElement("Root");
    XElement rootPerson = RecursivelyBuildDescendentTree();
}

I'm not building a family tree, but it's the same idea. The tree starts out
with a person, and each person can have 0 or more children, and 0 or more
girlfriends. I would basically want to merge 2 files like this based on the
person's name. I have no xsd file, the schema is built right into the code
(as shown above). I know I should be verifying it, but I'm not right now.
I'm just looking to get this working first.

Right now I'm working on adding some type of merge method that will take 2
XElement structures, and manually go through 1 and determine if a person is
not in the other. If not, then it will add the person and all of that
person's children and girlfriends.

Seems like there is a better way though. I also feel like this is going to
be really slow when I'm finished.
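
The manual merge described above could be sketched roughly as follows. This is only a sketch under assumptions: PersonMerger and MergeInto are hypothetical names, people are assumed to be Person elements nested directly inside their parent Person, and two people count as duplicates when they sit at the same level and have the same Name. A dictionary keyed by name avoids rescanning the target for every incoming person.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PersonMerger
{
    // Hypothetical helper: copies into 'target' every Person under 'source'
    // whose Name is not already present at the same level, and recurses
    // into people that exist on both sides.
    public static void MergeInto(XElement target, XElement source)
    {
        // Index the target's direct Person children by name for fast lookups.
        var byName = new Dictionary<string, XElement>();
        foreach (XElement person in target.Elements("Person"))
        {
            string name = (string)person.Element("Name") ?? "";
            if (!byName.ContainsKey(name))
                byName[name] = person;
        }

        foreach (XElement incoming in source.Elements("Person"))
        {
            string name = (string)incoming.Element("Name") ?? "";
            XElement existing;
            if (byName.TryGetValue(name, out existing))
            {
                // Same person on both sides: merge their children instead.
                MergeInto(existing, incoming);
            }
            else
            {
                // New person: add a deep copy, including all descendants.
                XElement copy = new XElement(incoming);
                target.Add(copy);
                byName[name] = copy;
            }
        }
    }
}
```

Calling PersonMerger.MergeInto(root1, root2) would modify root1 in place and leave root2 untouched. The sketch only compares siblings under matching parents; a person who appears at different depths in the two trees is not detected as a duplicate.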

Martin Honnen · Jan 5, 2009

Tom said:
Here's how I build an XML file ...

public void RecursivelyBuildDescendentTree()
{
    XElement OriginalPerson =
        new XElement("Person",
            new XElement("Name", "Bob"));

    XElement Child1 =
        new XElement("Person",
            new XElement("Name", "John"));
    XElement Child2 =
        new XElement("Person",
            new XElement("Name", "Jane"));

    XElement Girlfriend1 =
        new XElement("Person",
            new XElement("Name", "Sally"));
    XElement Girlfriend2 =
        new XElement("Person",
            new XElement("Name", "Kelly"));
}

So that function simply creates several XElements with name "Person",
each having a child XElement with name "Name". But those Person
XElements are not in any way related as you don't put one as a child of
the other. And the function does not even return anything so once it has
been called all those Person XElements are waiting to be garbage collected.

public void test()
{
    XElement root = new XElement("Root");
    XElement rootPerson = RecursivelyBuildDescendentTree();

That would not even compile, given your above
RecursivelyBuildDescendentTree which has a void return type.
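
To make the two points above concrete, here is a minimal sketch of what a builder that nests the children and returns the tree might look like. TreeBuilder, BuildDescendantTree, and BuildDocument are hypothetical names, not the original poster's actual code, and the girlfriend elements are left out because the thread does not yet say how they should be represented.

```csharp
using System.Xml.Linq;

public static class TreeBuilder
{
    // Hypothetical replacement for the quoted method: returns the root
    // Person with its children nested inside it instead of discarding them.
    public static XElement BuildDescendantTree()
    {
        return new XElement("Person",
            new XElement("Name", "Bob"),
            new XElement("Person", new XElement("Name", "John")),
            new XElement("Person", new XElement("Name", "Jane")));
    }

    // Wraps the person tree in a Root element, as the quoted test() intended.
    public static XElement BuildDocument()
    {
        XElement root = new XElement("Root");
        root.Add(BuildDescendantTree());
        return root;
    }
}
```

Because the child Person elements are passed as content to the parent's constructor, they become part of the returned tree and stay reachable through it.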

Tom · Jan 5, 2009

Martin Honnen said:
So that function simply creates several XElements with name "Person", each
having a child XElement with name "Name". But those Person XElements are
not in any way related as you don't put one as a child of the other. And
the function does not even return anything so once it has been called all
those Person XElements are waiting to be garbage collected.

That would not even compile, given your above
RecursivelyBuildDescendentTree which has a void return type.

Oops. Sorry Martin. That's obviously a bad example as you pointed out.
This machine isn't my dev machine, so I can't just copy and paste code.

Anyway, imagine there's a RecursivelyBuildDescendentTree function that just
builds a tree. This tree has elements and each element can have a child.

So with that said, that's what I'm dealing with. I was looking for a way to
merge XML files that would look something like this ...

xml file 1:
<root>
  <person>
    <personName>bill</personName>
    <girlfriend>sally</girlfriend>
    <person> (this would be the child. sorry, bad example earlier)
      <personName>tom</personName>
      <girlfriend>jill</girlfriend>
    </person>
  </person>
</root>

xml file 2:

<root>
  <person>
    <personName>bill2</personName>
    <girlfriend>sally2</girlfriend>
    <person>
      <personName>tom2</personName>
      <girlfriend>jill2</girlfriend>
      <person>
        <personName>tom3</personName>
        <girlfriend>jill3</girlfriend>
      </person>
    </person>
    <person>
      <personName>bill</personName>
      <girlfriend>sally</girlfriend>
      <person> (this would be a duplicate i don't want.)
        <personName>tom</personName>
        <girlfriend>jill</girlfriend>
      </person>
    </person>
  </person>
</root>

Martin Honnen · Jan 6, 2009

Tom said:
So with that said, that's what I'm dealing with. I was looking for a
way to merge XML files that would look something like this ...

xml file 1:
<root>
  <person>
    <personName>bill</personName>
    <girlfriend>sally</girlfriend>
    <person> (this would be the child. sorry, bad example earlier)
      <personName>tom</personName>
      <girlfriend>jill</girlfriend>
    </person>
  </person>
</root>

xml file 2:

<root>
  <person>
    <personName>bill2</personName>
    <girlfriend>sally2</girlfriend>
    <person>
      <personName>tom2</personName>
      <girlfriend>jill2</girlfriend>
      <person>
        <personName>tom3</personName>
        <girlfriend>jill3</girlfriend>
      </person>
    </person>
    <person>
      <personName>bill</personName>
      <girlfriend>sally</girlfriend>
      <person> (this would be a duplicate i don't want.)
        <personName>tom</personName>
        <girlfriend>jill</girlfriend>
      </person>
    </person>
  </person>
</root>

How should the merged XML document look?