What's an efficient way of inserting elements into an XML document?

  • Thread starter: Nick Z.

Nick Z.

Let's say you have the following XML document:

<?xml version="1.0" encoding="utf-16"?>
<log>
<event>Some event #1</event>
</log>

How would I add another <event> element to this document?

Assume that the file size is already 200 KB or more.
Using the XmlDocument class is not very efficient, as far as I can see.
With XmlTextWriter you can only append to the end of the file,
which breaks the XML structure.

How would I do this?

Thanks,
Nick Z.
 
Despite not being "highly efficient", XmlDocument, along with XPath to
find the proper node for the insert, is still the best method. There may be
some rare cases where using a reader and parsing lines is faster, but I cannot
envision an algorithm that will beat XmlDocument, except perhaps one that
uses a known hash of a line or sticks to binary (which is far more complex).

If you can prebuild the XML snippets (nodes) for the insert, you could
generate an XSLT on the fly, but I do not think that would give you better
performance than XmlDocument.
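
For reference, a minimal sketch of the XmlDocument plus XPath approach, assuming the root element is <log> as in the original post. The class and method names (LogDom, AppendEvent) are made up for illustration. Note that it loads and rewrites the whole file on every call.

```csharp
using System.Xml;

public static class LogDom
{
    // Loads the whole file, appends one <event> under the root <log>, and saves it back.
    public static void AppendEvent(string path, string message)
    {
        var doc = new XmlDocument();
        doc.Load(path);                               // parses the entire file into memory

        XmlNode log = doc.SelectSingleNode("/log");   // XPath to the parent node
        if (log == null)
            throw new XmlException("Root <log> element not found.");

        XmlElement evt = doc.CreateElement("event");
        evt.InnerText = message;                      // special characters are escaped on save
        log.AppendChild(evt);

        doc.Save(path);                               // rewrites the whole file
    }
}
```

Usage would be something like LogDom.AppendEvent("log.xml", "Some event #2"), where "log.xml" is a placeholder path.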

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

***************************
Think Outside the Box!
***************************
 
But if the file is several megabytes, doesn't that mean that calling
XmlDocument.Load() will load all of it into memory each time you open
the document, then parse all of the data and build an in-memory
representation of the whole file? That seems incredibly inefficient in
a performance-critical situation.
 
How often are you expecting to need to insert elements into the
document? Have you considered using a format other than XML? XML really
isn't terribly friendly when it comes to appending.

Alternatively, consider having a file which contains *multiple* XML
documents, and a class which can build them up into a single one. You
could then combine the "fragments" every so often to restore it to a
proper XML file.
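
One possible way to do the "build them up into a single one" part is sketched below. It assumes the file holds a sequence of top-level elements with no XML declaration; FragmentReader and Combine are hypothetical names. XmlReader accepts such a file when ConformanceLevel is set to Fragment.

```csharp
using System.Xml;

public static class FragmentReader
{
    // Reads a file containing many top-level elements and returns one document
    // with all of them placed under a single root element.
    public static XmlDocument Combine(string fragmentPath, string rootName)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement(rootName);
        doc.AppendChild(root);

        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using (XmlReader reader = XmlReader.Create(fragmentPath, settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                    root.AppendChild(doc.ReadNode(reader));  // ReadNode advances past the element
                else
                    reader.Read();                           // skip whitespace between elements
            }
        }
        return doc;
    }
}
```

Calling doc.Save(...) on the result would then give you a proper XML file again.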
 
I think a complete XML document is not the best format for writing your log.
Writing only the XML elements to an XML-like file could be a good idea,
that is:

You can manually append <Event /> elements to the end of your element
file, so that the file looks like:

<Event>Some event #1</Event>
<Event>Some event #2</Event>
<Event>Some event #3</Event>

This way you can write log entries to your file at high speed, and you can
recreate the XML structure in memory with something as simple as:

XmlDocument doc = new XmlDocument();
doc.LoadXml("<Log>" + fileContent + "</Log>");

Regards,
ernesto
 
I wanted to keep the document as well-formed XML, but I guess it's
not that important. I'll have to write <event> elements directly.

Thanks,
Nick Z.
 
Something I did long before .NET was to write a log file in pseudo-HTML. That
way, the log file could be viewed in a browser. The other advantage is that
it would be relatively(?) easy to parse. My file looked something like:

<table width=100%>
<tr>
<td>some data</td>
<td>more data</td></tr>
<tr>...

Although I _never_ wrote the "</table>" tag, or even <body></body> tags, the
file still displayed in IE5 (I think) without any problems.

Just another option,
Scott
 
When writing things like log files in XML, I leave off the document-level tags,
so what you have is a long list of <event>Some event #1</event> elements and
nothing else. This allows you to append to the end of the file very
efficiently without having to use any XML objects at all, just a FileStream
object.

Then, when you want to consume the log, just read the file into a string,
put an opening document-level tag at the beginning and a closing one at the
end, and do what you will with it.
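
A rough sketch of that append-then-wrap idea follows. The names FragmentLog, Append and Load are invented for the example, and the UTF-8 encoding is an assumption. The message text is escaped first so that a stray < or & in a log entry cannot break the markup when the file is wrapped and parsed later.

```csharp
using System.IO;
using System.Security;
using System.Text;
using System.Xml;

public static class FragmentLog
{
    // UTF-8 without a byte order mark, so appended files start directly with markup.
    private static readonly Encoding Utf8NoBom = new UTF8Encoding(false);

    // Appends one <event> line to the end of the file without parsing anything.
    public static void Append(string path, string message)
    {
        // SecurityElement.Escape replaces <, >, &, " and ' with entity references.
        string line = "<event>" + SecurityElement.Escape(message) + "</event>";
        using (var writer = new StreamWriter(path, true, Utf8NoBom))  // true = append
        {
            writer.WriteLine(line);
        }
    }

    // Reads the fragments and wraps them in a root element to get a real document.
    public static XmlDocument Load(string path)
    {
        string content = File.ReadAllText(path, Utf8NoBom);
        var doc = new XmlDocument();
        doc.LoadXml("<log>" + content + "</log>");
        return doc;
    }
}
```

Appending this way costs the same no matter how large the file already is; only Load pays for the full parse.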
 
That's almost how I implemented it. The only difference is that I used
an XmlTextWriter instantiated with a StreamWriter; the combination seems to
work well.

I'm just wondering, what's the best way to keep the file size in check?
I can think of several ways, but none of them sounds terribly efficient. You
could 'trim' the log file every time you are about to add an event, but
that requires opening it, locating the oldest <event> (the first in the
file in my case), deleting it and closing the file. How would I
accomplish this efficiently?
 
Well, just how often are you going to need to do this? If you only need
to trim once an hour (or even once every few minutes) I'd suggest
loading the whole thing into an XML document and then rewriting just
the nodes you still want.
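
As an illustration of that kind of occasional trim, here is a sketch that assumes the fragment-style file discussed above (only <event> elements, no root element). LogTrimmer, Trim and maxEvents are hypothetical names. It is not safe against a crash in the middle of the rewrite.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class LogTrimmer
{
    // Keeps only the newest maxEvents entries of a fragment file.
    public static void Trim(string path, int maxEvents)
    {
        // Wrap the fragments in a temporary root so they can be parsed.
        var doc = new XmlDocument();
        doc.LoadXml("<log>" + File.ReadAllText(path) + "</log>");

        XmlNodeList events = doc.DocumentElement.SelectNodes("event");
        int first = Math.Max(0, events.Count - maxEvents);

        // Write back only the nodes we still want, one per line.
        var sb = new StringBuilder();
        for (int i = first; i < events.Count; i++)
            sb.AppendLine(events[i].OuterXml);

        File.WriteAllText(path, sb.ToString());
    }
}
```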
 
I just thought of something.

I think I'm going to use two files, say 'OldLog' and 'CurrentLog'.
First have only 'CurrentLog' then once it fills up rename it to 'OldLog'
and create a new, empty 'CurrentLog'. Once that fills up, delete the old
'OldLog' and rename the 'CurrentLog' to 'OldLog', then create a new
'CurrentLog'... etc.

This seems like an easier approach than trimming every set period of
time/posts.
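
Roughly, the rotation could look like the sketch below. LogRotator, RotateIfNeeded and the maxBytes threshold are names made up for the example, and "fills up" is taken to mean the file length reaching that threshold.

```csharp
using System.IO;

public static class LogRotator
{
    // Call before each append: once the current log reaches maxBytes it becomes
    // the old log, and a new, empty current log is created.
    public static void RotateIfNeeded(string currentPath, string oldPath, long maxBytes)
    {
        var info = new FileInfo(currentPath);
        if (!info.Exists || info.Length < maxBytes)
            return;                               // nothing to do yet

        if (File.Exists(oldPath))
            File.Delete(oldPath);                 // drop the previous 'OldLog'

        File.Move(currentPath, oldPath);          // 'CurrentLog' becomes 'OldLog'
        File.Create(currentPath).Dispose();       // new, empty 'CurrentLog'
    }
}
```

Checking FileInfo.Length only reads file metadata, so the size check itself stays cheap however large the log is.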

Thank you,
Nick Z.
 
Yup, that certainly sounds like it would work.
 