Merge/Synchronize XML Files


Meelis Lilbok

Hi, is there a fast way to synchronize two XML files?

Let's say I have two XML files, 1.xml and 2.xml.

1.xml contains

<test>
<t id="1">Hello</t>
<t id="2">World</t>
<t id="3">Good bye!</t>
</test>

2.xml contains
<test>
<t id="1">Hello</t>
<t id="2">World</t>
</test>

After synchronizing, 2.xml must look like this
<test>
<t id="1">Hello</t>
<t id="2">World</t>
<t id="3">Good bye!</t>
</test>

At the moment I use
For Each
Next
and this is too slow if the file contains about 1000 <t> nodes.

Regards,
Meelis
 
Why don't you try writing an XSLT transformation that combines all the
nodes in <test> and then eliminates the duplicates? I think you could either
combine the two XML documents before the transform, or import the second
one from within the XSLT.

You can then do an xmlDoc.DocumentElement.Transform...

Rick
 
I suspect that a loop using .nextNode on the input, the target, or both
per iteration will suit your needs. Although XSL may still outperform a
script-based language at this, even 1000 nodes shouldn't take an excessive
amount of time.

Your example doesn't show why you don't simply replace 2.xml with 1.xml.
More detail showing the wider set of cases is needed to arrive at an
appropriate solution.

If id="1" were missing from 1.xml, should it be deleted from 2.xml?
If id="2" in 2.xml contained the word 'kosmos', should it contain 'world'
after the merge because it was replaced by id="2" from 1.xml?
In the real world, is t a complex element? If so, do you intend to merge the
contents of the t elements with the same id from each file, or simply
replace the t in 2.xml with the one in 1.xml?
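
The questions above amount to separate policy choices. As a rough illustration (not from the thread, and in Python rather than the VB used elsewhere here), the sketch below makes the first two explicit switches. The function name merge_by_id and the flags delete_missing and overwrite_text are hypothetical, and it assumes each t holds plain text only, so the third question (merging complex elements) is left open.

```python
import xml.etree.ElementTree as ET

def merge_by_id(source_xml, target_xml, delete_missing=False, overwrite_text=False):
    """Merge <t> elements of source into target, matching on the id attribute."""
    source = ET.fromstring(source_xml)
    target = ET.fromstring(target_xml)
    source_by_id = {t.get("id"): t for t in source.findall("t")}
    target_by_id = {t.get("id"): t for t in target.findall("t")}

    # Question 1: drop target entries whose id no longer exists in the source?
    if delete_missing:
        for key, node in list(target_by_id.items()):
            if key not in source_by_id:
                target.remove(node)
                del target_by_id[key]

    for key, node in source_by_id.items():
        if key not in target_by_id:
            # An id the target has never seen is always copied across.
            new_node = ET.SubElement(target, "t", {"id": key})
            new_node.text = node.text
        elif overwrite_text:
            # Question 2: replace the target text with the source text?
            target_by_id[key].text = node.text
    return target

SOURCE = '<test><t id="2">world</t><t id="3">Good bye!</t></test>'
TARGET = '<test><t id="1">Hello</t><t id="2">kosmos</t></test>'

merged = merge_by_id(SOURCE, TARGET)
print([(t.get("id"), t.text) for t in merged.findall("t")])
```

With both flags left off, existing entries are never touched and only new ids are appended; turning both on makes the target a copy of the source's content.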
 
Yes, I can't simply replace the file, because the node with id="2" can contain
"Hello" in one file and "Hallo" in the other.

Let me try to explain a little more.

File 1.xml is a "template" file containing strings/texts in the native
language (Estonian).
With my application, users can translate the strings into their own language.
When a user launches the translator application:

1) the template is loaded from the server
2) the application checks whether the template file contains new ids (nodes)
and adds those nodes to the user file.


[template.xml]
<test>
<t id="1">Tere</t>
<t id="2">Maailm</t>
<t id="3">Head aega!</t>
</test>

[user.xml]
<test>
<t id="1">Hallo</t>
<t id="2">World</t>
</test>


After synchronizing user.xml must look like this
<test>
<t id="1">Hello</t>
<t id="2">World</t>
<t id="3">Head aega!</t>
</test>




Meelis
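
The requirement described above (add template nodes whose id the user file lacks, leave existing translations alone) can be sketched as follows. This is an illustration in Python, not code from the thread. The function name add_missing_nodes is hypothetical, and it assumes the slowness comes from searching the user document once per template node, since the body of the original For Each loop is not shown. Collecting the user ids into a set first makes each membership check constant time.

```python
import copy
import xml.etree.ElementTree as ET

TEMPLATE = """<test>
<t id="1">Tere</t>
<t id="2">Maailm</t>
<t id="3">Head aega!</t>
</test>"""

USER = """<test>
<t id="1">Hallo</t>
<t id="2">World</t>
</test>"""

def add_missing_nodes(template_root, user_root):
    """Append to user_root a copy of every <t> whose id it does not have yet."""
    # One pass over the user file; later lookups are set membership tests.
    known_ids = {t.get("id") for t in user_root.iter("t")}
    added = 0
    for t in template_root.iter("t"):
        if t.get("id") not in known_ids:
            user_root.append(copy.deepcopy(t))
            known_ids.add(t.get("id"))
            added += 1
    return added

template_root = ET.fromstring(TEMPLATE)
user_root = ET.fromstring(USER)
added = add_missing_nodes(template_root, user_root)
print(added, [(t.get("id"), t.text) for t in user_root.iter("t")])
```

For real files, ET.parse(path).getroot() and ElementTree.write(path) would take the place of the in-memory strings.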
 
So if I've understood it correctly, all you really need is to add the new
nodes that have appeared at the end of 1.xml to the end of 2.xml? That sounds
a little simplistic, so I have probably not understood your requirement, but
if that is it, then:


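Option Explicit

' Load the template (xml1) and the user file (xml2).
Dim xml1 : Set xml1 = LoadDOM("g:\temp\xml1.xml")
Dim xml2 : Set xml2 = LoadDOM("g:\temp\xml2.xml")
Dim oLast, oMatch, oNew

' Take the last <t> in xml2 and find the element with the same id in xml1.
' Assumes xml2 is not empty and that its last id still exists in xml1.
Set oLast = xml2.documentElement.lastChild
Set oMatch = xml1.selectSingleNode("//t[@id='" & oLast.getAttribute("id") & "']")

' Append a copy of every <t> that follows that element in xml1.
For Each oNew In oMatch.selectNodes("following-sibling::t")
    xml2.documentElement.appendChild oNew.cloneNode(True)
Next

xml2.save "g:\temp\xml2.xml"

' Load a file synchronously into an MSXML DOM with XPath selection enabled.
Function LoadDOM(sFile)

    Set LoadDOM = CreateObject("MSXML2.DOMDocument.3.0")
    LoadDOM.async = False
    LoadDOM.setProperty "SelectionLanguage", "XPath"
    LoadDOM.load sFile

End Function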


This loops over only the new nodes at the end of xml1 that are not already
in xml2.
 
Meelis said:
Yes, I can't simply replace the file, because the node with id="2" can contain
"Hello" in one file and "Hallo" in the other.

Let me try to explain a little more.

[template.xml]
<test>
<t id="1">Tere</t>
<t id="2">Maailm</t>
<t id="3">Head aega!</t>
</test>

[user.xml]
<test>
<t id="1">Hallo</t>
<t id="2">World</t>
</test>


After synchronizing user.xml must look like this
<test>
<t id="1">Hello</t>

I assume you meant to type Hallo...
<t id="2">World</t>
<t id="3">Head aega!</t>
</test>

If you get rid of all the XML noise, you will be left with name-value pairs
(see DictionaryEntry in the help).

"1" "Hallo"
"2" "World"


If you then put the template DictionaryEntry items into a Hashtable (q.v.)
followed by the values extracted from the user.xml file *but taking note of
this from the Hashtable.Add method help*:

"The Item property can also be used to add new elements by setting the value
of a key that does not exist in the Hashtable. For example:
myCollection["myNonexistentKey"] = myValue. However, if the specified key
already exists in the Hashtable, setting the Item property overwrites the
old value. In contrast, the Add method does not modify existing elements."

then you will have a hashtable containing the merged data.
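
The same idea can be sketched in Python, where a dict plays the role of the Hashtable and plain assignment behaves like setting the Item property (it overwrites an existing key). This is an illustration only, not the .NET code. The helper name pairs is hypothetical.

```python
import xml.etree.ElementTree as ET

def pairs(xml_text):
    """Reduce a <test> document to a list of (id, text) pairs."""
    return [(t.get("id"), t.text) for t in ET.fromstring(xml_text).iter("t")]

template_pairs = pairs(
    '<test><t id="1">Tere</t><t id="2">Maailm</t><t id="3">Head aega!</t></test>'
)
user_pairs = pairs('<test><t id="1">Hallo</t><t id="2">World</t></test>')

# Start from the template strings...
merged = dict(template_pairs)
# ...then let every user translation overwrite the template text for its id.
for key, value in user_pairs:
    merged[key] = value

print(merged)
```

One difference worth keeping in mind: a Python dict keeps insertion order, whereas the enumeration order of a .NET Hashtable is not guaranteed, so if the order of the t elements matters the ids may need sorting before the file is written back.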


You can then take the data from the hashtable, add back in all the XML cr^W
stuff as you Append it to a StringBuilder, then write back to disk. The
whole operation should take about as long as it takes to double-click a
mouse button.

' Not checked, but this is how you'd rebuild the XML.
' Requires Imports System.Collections and Imports System.Text.
' Note: values containing & or < would need escaping first.
Dim sb As New StringBuilder("<test>" & vbLf)
For Each thing As DictionaryEntry In yourHashtable
    sb.Append(String.Format(" <t id=""{0}"">{1}</t>" & vbLf, _
        thing.Key.ToString(), thing.Value.ToString()))
Next
sb.Append("</test>")
' Now write the file.

Any use?

Andrew
 