Shredding XML & replacing sections..


Greg Linwood

I'm wondering what is the best way of replacing a set of nodes within an
XmlDocument with another set of nodes? I'm limited to VS2003 (VB) for this,
unfortunately.

I essentially need to replace a specific range of nodes with another range,
eg change this document:
<doc>
<node1>val1</node1>
<node2>val2</node2>
<node3>val3</node3>
<node4>val4</node4>
<node5>val5</node5>
<node6>val6</node6>
</doc>

to this:
<doc>
<node1>val1</node1>
<node2>val2</node2>
<node3>val30</node3>
<node3b>val3b</node3b>
<node4>val40</node4>
<node5>val5</node5>
<node6>val6</node6>
</doc>

I have the new range of XmlNodes (3, 3b & 4) in another XmlDocument but I'm
a little unsure about how to efficiently replace the current node range (3,
4) with the new range.

Any help greatly appreciated!

Regards,
Greg Linwood
SQL Server MVP
http://blogs.sqlserver.org.au/blogs/greg_linwood
Benchmark your query performance
http://www.SQLBenchmarkPro.com
 

I believe you can use XPath to select a set of nodes. From there you
may be able to iterate over the collection and replace the values with
what you need.
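
A minimal sketch of that idea, staying within what VS2003 offers (XmlDocument, no XML literals or LINQ). The module and procedure names are made up for illustration, and it assumes the range boundaries are direct children of the root, can be found by element name, and that the first boundary precedes the last one. SelectSingleNode returns only the first match, so non-unique names would need a more specific XPath expression.

```vb
Imports System
Imports System.Xml

Module ReplaceRange
    ' Hypothetical helper: replaces every child of the target root from the
    ' element named firstName through the element named lastName (inclusive)
    ' with copies of the children of the source root.
    Sub ReplaceNodeRange(ByVal target As XmlDocument, ByVal source As XmlDocument, _
                         ByVal firstName As String, ByVal lastName As String)
        Dim root As XmlNode = target.DocumentElement
        Dim first As XmlNode = root.SelectSingleNode(firstName)
        Dim last As XmlNode = root.SelectSingleNode(lastName)
        If first Is Nothing OrElse last Is Nothing Then
            Throw New ArgumentException("Range boundary not found.")
        End If

        ' Nodes belong to the document that created them, so each new node
        ' is imported into the target before it is inserted.
        Dim newNode As XmlNode
        For Each newNode In source.DocumentElement.ChildNodes
            root.InsertBefore(target.ImportNode(newNode, True), first)
        Next

        ' Remove the old range, capturing the next sibling before each removal.
        Dim current As XmlNode = first
        Do
            Dim nextNode As XmlNode = current.NextSibling
            Dim isLast As Boolean = (current Is last)
            root.RemoveChild(current)
            If isLast Then Exit Do
            current = nextNode
        Loop While Not current Is Nothing
    End Sub

    Sub Main()
        Dim target As New XmlDocument()
        target.LoadXml("<doc><node1>val1</node1><node2>val2</node2>" & _
                       "<node3>val3</node3><node4>val4</node4>" & _
                       "<node5>val5</node5><node6>val6</node6></doc>")
        Dim source As New XmlDocument()
        source.LoadXml("<new><node3>val30</node3><node3b>val3b</node3b>" & _
                       "<node4>val40</node4></new>")
        ReplaceNodeRange(target, source, "node3", "node4")
        Console.WriteLine(target.OuterXml)
    End Sub
End Module
```

The new nodes are inserted before the old range is removed so that the insertion point (the first old node) is still attached to the document when it is used.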
 
Hey Greg,

In that case, what is the rule?

You could do something like:


Dim doc = <doc>
<node1>val1</node1>
<node2>val2</node2>
<node3>val3</node3>
<node4>val4</node4>
<node5>val5</node5>
<node6>val6</node6>
</doc>


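' Append "0" to the value of the first <node3>, then add <node3b> after it
Dim el = doc.<node3>.First
el.Value &= "0"
el.AddAfterSelf(<node3b>val3b</node3b>)
' Do the same value change for the first <node4>
el = doc.<node4>.First
el.Value &= "0"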
 
Hi Bill

Sorry for the slow response, but we can't rely on the element values / inner
text having unique values (they often don't), so
el.AddAfterSelf(<node3b>val3b</node3b>) wouldn't work as there could be a
few elements with the value val3b unfortunately.

Any other possibilities?

Regards,
Greg Linwood



 
Hi Greg,

Not sure I get what you mean. The el.AddAfterSelf(<node3b>val3b</node3b>)
inserts a node, <node3b>val3b</node3b>, after el which is the <node3>
element, so it's not reliant on values, rather it is reliant on a node being
found.

To help more I'd really need to know what the logical rules are for the
problem. The code I showed is just one way to create the output given the
input.
 
Ah sorry Bill - I had created a dummy summary structure which wasn't quite
correct. Sorry for wasting your time there, but this is the actual
structure:

<doc>
<row>
<null/>
<string>2008.01.22 11:00</string>
<string>2008.01.22 12:00</string>
<string>2008.01.22 13:00</string>
<string>2008.01.22 14:00</string>
</row>
<row>
<string>perfmoncountername</string>
<number>1.000000</number>
<number>1.000000</number>
<number>0.112000</number>
<number>0</number>
</row>
</doc>

This XML feeds perfmon data to a flash based graphing control we use on our
perf monitoring tools & unfortunately can't be changed..

We construct the document using an aspx (which receives http post from
service clients) and every time we receive a post, we want to inject new
date / time ranges into both sections of the document - the first row
contains date/time graph categories & each following row contains series
information (per counter). So, with the above example, if we received a post
at 15:00, we'd need to add another <string> element to the first row, and a
correlating <number> element at the end of the second row.

We also need to be able to inject elements within existing ranges as there
are scenarios where data gets re-posted. Using the above example again, we
might receive a post at 12:30 & we'd have to inject a <string> element
between 12:00 & 13:00 as well as ordinally between the 2nd & 3rd <number>
elements in the second row.

Regards,
Greg Linwood
 
Hi Greg,

I don't like that xml structure one bit. You should never rely on order in
xml, rather add attributes to specify the row and column indexes, e.g:

<row idx="1">
<string col="0">
....

Then you'd have something safe to work with as you could explicitly search
for attributes.
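
As a rough illustration of what searching on attributes could look like, here is a small XmlDocument sketch. The idx and col attribute names come from the suggestion above; the sample document and the module name are invented for the example, since the real format has no such attributes.

```vb
Imports System
Imports System.Xml

Module AttributeLookup
    Sub Main()
        ' Hypothetical document that carries explicit row and column indexes.
        Dim doc As New XmlDocument()
        doc.LoadXml("<doc>" & _
                    "<row idx=""0""><null col=""0""/>" & _
                    "<string col=""1"">2008.01.22 11:00</string></row>" & _
                    "<row idx=""1""><string col=""0"">perfmoncountername</string>" & _
                    "<number col=""1"">1.000000</number></row>" & _
                    "</doc>")

        ' Address a cell by its indexes rather than by its position.
        Dim cell As XmlNode = doc.SelectSingleNode("/doc/row[@idx='1']/*[@col='1']")
        If Not cell Is Nothing Then
            Console.WriteLine(cell.InnerText)
        End If
    End Sub
End Module
```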

Given your current format, I'd have to make a heap of assumptions as to
what will be in the XML at any given time, and I'm not really comfortable
with that. But if you want to sign off on those assumptions and do the
appropriate tests to ensure they are true, then okay ;)

So seems the first thing you need to do is calculate the column, given a
time, and then use that column for later inserts in subsequent rows. I was
going to use a LINQ query for that, but it's pretty hard to do. For example,
I'm assuming there might be no columns other than the header, so there may
be no string elements in the first row. So a dreadful iteration is probably
simplest there:

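Dim datetoInsert = #1/22/2008 10:00:00 AM#
' Start at the leading <null/> element, i.e. column 0
Dim el = doc.<row>.First.<null>.First

For Each item In doc.<row>.First.<string>
    ' TODO: ensure we are only dealing with dates in here. If not, use
    ' IsDate or TryParse etc. (CDate may not accept the "2008.01.22 11:00"
    ' layout in every culture.)
    If CDate(item.Value) > datetoInsert Then Exit For
    el = item
Next

' Column of the last header element that is not later than the new date
Dim col = el.NodesBeforeSelf.Count

If col > 0 Then
    ' Column 1 corresponds to <number>(0), so insert after <number>(col - 1)
    doc.<row>(1).<number>(col - 1).AddAfterSelf(<number>23456</number>)
Else
    ' Earlier than every existing date: insert right after the counter name
    doc.<row>(1).<string>.First.AddAfterSelf(<number>23456</number>)
End If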

In the above, I've assumed the second row. You may want to locate the rows
by doing a query. Again, the structure of your document is not great for
that, at least not efficiently. doc.<row>.<string> would need to be filtered
to only when <string> is the first element, so you'd probably have to query
on doc.<row>. The following is really ugly, and can still blow up if no
match is found:

Dim row = (From r In doc.<row> _
Let item = r.Elements()(0) _
Where item.Name.LocalName = "string" _
AndAlso item.Value = "perfmoncountername" _
Select r).First

If you remove the First, it won't blow up, but you'll then need to ensure
there are items in the returned rows and select the first row.
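
One way to sketch that check is to use FirstOrDefault, which returns Nothing instead of throwing when there is no match. This needs VS2008 (LINQ to XML), like the query above; FindCounterRow is a made-up name for illustration.

```vb
Imports System.Linq
Imports System.Xml.Linq

Module FindRow
    ' Hypothetical helper: returns the first <row> whose first child is a
    ' <string> holding counterName, or Nothing when no such row exists.
    Function FindCounterRow(ByVal doc As XElement, ByVal counterName As String) As XElement
        Return (From r In doc.<row> _
                Let item = r.Elements().FirstOrDefault() _
                Where item IsNot Nothing _
                AndAlso item.Name.LocalName = "string" _
                AndAlso item.Value = counterName _
                Select r).FirstOrDefault()
    End Function
End Module
```

The caller then tests the result before using it, for example `Dim row As XElement = FindCounterRow(doc, "perfmoncountername")` followed by `If row Is Nothing Then ...`.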

Hopefully this should get you started :)
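
Since the original question is limited to VS2003, where XML literals and LINQ are not available, here is a rough XmlDocument version of the same column-insert idea. It is only a sketch under several assumptions: the header is the first row, the series row is passed in by index, every header cell after the leading null parses with the "yyyy.MM.dd HH:mm" layout seen in the sample, and every series row has the same number of cells as the header. The module and procedure names are hypothetical, and a re-post with an identical timestamp is not handled (it would add a second column).

```vb
Imports System
Imports System.Globalization
Imports System.Xml

Module InsertColumn
    ' Assumed timestamp layout, taken from the sample document.
    Const StampFormat As String = "yyyy.MM.dd HH:mm"

    ' Returns the index (among the header row's children) of the element
    ' after which the new timestamp belongs. 0 means directly after the
    ' leading <null/>.
    Function FindColumn(ByVal header As XmlNode, ByVal stamp As DateTime) As Integer
        Dim col As Integer = 0
        Dim i As Integer
        For i = 1 To header.ChildNodes.Count - 1
            Dim existing As DateTime = DateTime.ParseExact( _
                header.ChildNodes(i).InnerText, StampFormat, _
                CultureInfo.InvariantCulture)
            If existing > stamp Then Exit For
            col = i
        Next
        Return col
    End Function

    ' Inserts the timestamp into the header row and the value into the
    ' series row at rowIndex, both at the same column position.
    Sub InsertSample(ByVal doc As XmlDocument, ByVal rowIndex As Integer, _
                     ByVal stamp As DateTime, ByVal value As String)
        Dim rows As XmlNodeList = doc.DocumentElement.SelectNodes("row")
        Dim header As XmlNode = rows(0)
        Dim series As XmlNode = rows(rowIndex)
        Dim col As Integer = FindColumn(header, stamp)

        Dim s As XmlElement = doc.CreateElement("string")
        s.InnerText = stamp.ToString(StampFormat, CultureInfo.InvariantCulture)
        header.InsertAfter(s, header.ChildNodes(col))

        Dim n As XmlElement = doc.CreateElement("number")
        n.InnerText = value
        series.InsertAfter(n, series.ChildNodes(col))
    End Sub
End Module
```

With the sample document, a post at 12:30 gives column 2, so the new string lands between 12:00 and 13:00 and the new number between the 2nd and 3rd number elements of the series row.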
 