InnerText in an XML file

tshad

I have an XML file that I am reading in VS 2005. The code is something
like:

// spaces (a string) and sw (a StreamWriter) are declared earlier
XmlDocument itemDoc = new XmlDocument();
itemDoc.Load(@"c:\TestDocs\1004win.xml");

// iterate through top-level elements
foreach (XmlNode itemNode in itemDoc.DocumentElement.ChildNodes)
{
    // Attributes is null for non-element nodes such as comments
    if (itemNode.Attributes != null && itemNode.Attributes.Count > 0)
    {
        spaces += "    ";
        foreach (XmlAttribute xmlAttribute in itemNode.Attributes)
        {
            sw.WriteLine("\n{0}Attribute Name: {1} Value: {2}\n", spaces,
                xmlAttribute.Name, xmlAttribute.Value);
        }
        // strings are immutable, so the result of Remove must be assigned
        spaces = spaces.Remove(0, 4);
    }
    if (itemNode.ChildNodes.Count == 0)
        sw.WriteLine("(No additional Information)\n");
    else
    {
        spaces += "    ";
        foreach (XmlNode childNode in itemNode.ChildNodes)
        {
            sw.WriteLine("{0}Child Node: {1} Value: {2} InnerText: {3}\n",
                spaces,
                childNode.Name,
                childNode.Value,
                childNode.InnerText);
        }
        spaces = spaces.Remove(0, 4);
    }
}

My XML document looks like:

<?xml version="1.0" encoding="utf-8"?>
<REPORT VERSION="1.10" FILENUM="" DESCRIPTION="Form Utility XML: 3/18/2008
12:27:13 PM" MAJORFORM="100">
<ORDER></ORDER>
<TRACKING></TRACKING>
<FORMS>
<FORM NUM="1" FORMCODE="1004" SECCODE="1" >
<FIELDS>
<OTHERFILENUMBER>692</OTHERFILENUMBER>
<FNMA_FILENUMBER>693</FNMA_FILENUMBER>
<SUBPROPADDRESS>3</SUBPROPADDRESS>
</FIELDS>
<FORMPHOTOS></FORMPHOTOS>
<ATTACHMENTS></ATTACHMENTS>
</FORM>
</FORMS>
</REPORT>

I am trying to get the innerText of each of the fields below <FIELDS> so I
use childNode.InnerText to get it.

But I also found that if there are child nodes under your nodes the
innerText will include that as well.

In my example, the innerText for <Form> is 6926933 - which is the innerText
of <OTHERFILENUMBER>, <FNMA_FILENUMBER> and <SUBPROPADDRESS>???

Why is that and how can I get just the innerText of a particular element?

I would have assumed that the innerText of Form would have been "".

Thanks,

Tom
 
Try .FirstChild.Value instead of .InnerText!

Arne
 
Hi Tom,

XmlNode.InnerText is doing exactly what it is supposed to do: it "Gets or
sets the concatenated values of the node and all its child nodes."

As every XmlElement node with a text value has an XmlText node underneath
containing this text, Arne's solution of using .FirstChild.Value to obtain
this value should work. Just remember to skip the XmlText nodes themselves
when you iterate over ChildNodes elsewhere.
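
To make the difference concrete, here is a minimal sketch, assuming a cut-down copy of Tom's document loaded from a string. The class name InnerTextDemo and the helper FirstChildValue are made up for illustration; they are not part of System.Xml.

```csharp
using System;
using System.Xml;

public static class InnerTextDemo
{
    // Hypothetical helper: value of the first child node, or "" when the
    // node has no children or the first child has no value (an element).
    public static string FirstChildValue(XmlNode node)
    {
        return (node.FirstChild != null && node.FirstChild.Value != null)
            ? node.FirstChild.Value
            : "";
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<FORM NUM=\"1\"><FIELDS>" +
            "<OTHERFILENUMBER>692</OTHERFILENUMBER>" +
            "<FNMA_FILENUMBER>693</FNMA_FILENUMBER>" +
            "<SUBPROPADDRESS>3</SUBPROPADDRESS>" +
            "</FIELDS><FORMPHOTOS></FORMPHOTOS></FORM>");

        XmlNode form = doc.DocumentElement;
        XmlNode field = form.SelectSingleNode("FIELDS/OTHERFILENUMBER");

        // InnerText concatenates every descendant text node.
        Console.WriteLine(form.InnerText);               // 6926933
        // The first child of FORM is the FIELDS element, whose Value is null.
        Console.WriteLine(FirstChildValue(form) == "");  // True
        // The first child of a leaf element is its XmlText node.
        Console.WriteLine(FirstChildValue(field));       // 692
    }
}
```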

BTW, don't use \n for line breaks unless you have a good reason for doing
so. On MS Windows the newline sequence is \r\n, and you are not guaranteed
that a bare \n will be respected, even though most programs handle it.
Better still is Environment.NewLine wherever you want to break a line, as
it evaluates at run time to \r\n or \n depending on whether you are running
on Windows or on a Unix platform.
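
As a rough sketch of that advice: TextWriter.WriteLine already appends the writer's NewLine, which defaults to Environment.NewLine, so the embedded \n can simply be dropped and a blank line written with an empty WriteLine(). The StringWriter below is only a stand-in for Tom's sw, and NewLineDemo.Format is a made-up name.

```csharp
using System;
using System.IO;

public static class NewLineDemo
{
    // Hypothetical helper: one labelled line followed by a blank line,
    // with no hard-coded "\n" in the format string.
    public static string Format(string name, string value)
    {
        StringWriter sw = new StringWriter();
        sw.WriteLine("Child Node: {0} Value: {1}", name, value);
        sw.WriteLine();   // blank line using the platform newline
        return sw.ToString();
    }

    public static void Main()
    {
        Console.Write(Format("OTHERFILENUMBER", "692"));
    }
}
```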
 
Arne Vajhøj said:
Try .FirstChild.Value instead of .InnerText!

I assume that you are saying that the innerText is actually a child of the
node?

I changed the code to show:

foreach (XmlNode childNode in itemNode.ChildNodes)
{
    sw.WriteLine("{0}Child Node: {1} Value: {2} InnerText: {3}\n",
        spaces,
        childNode.Name,
        childNode.FirstChild.Value,
        childNode.InnerText);

    // Does this Child Node have any Child Nodes


This gives me an error saying that:

Object reference not set to an instance of an object.

But as soon as I changed it to check if there were any child nodes, then it
worked fine.

if (childNode.ChildNodes.Count > 0)
    sw.WriteLine("{0}Child Node: {1} Value: {2} InnerText: {3}\n",
        spaces,
        childNode.Name,
        childNode.FirstChild.Value,
        childNode.InnerText);
else
    sw.WriteLine("{0}Child Node: {1} Value: {2} InnerText: {3}\n",
        spaces,
        childNode.Name,
        "",
        childNode.InnerText);

Also, I found that I needed to access the attributes of a parent 2 levels
away. Is that possible?

For example, using the same xml page:

If my childNode is now at "OTHERFILENUMBER", childNode.ParentNode is FIELDS, but
I need to get to "FORM" as that has all the attributes I need.

Is there a way to directly access that node from the childNode?

Thanks,

Tom
 
tshad said:
I assume that you are saying that the innerText is actually a child of the
node?

No, I am saying that .InnerText gets all the text from the element and
its subelements, while .FirstChild.Value gets just the text directly
within this element.

tshad said:
This gives me an error saying that:

Object reference not set to an instance of an object.

But as soon as I changed it to check if there were any child nodes, then it
worked fine.

It is not surprising that .FirstChild returns null if there are
no child nodes.
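
One way to avoid the null check at every call site is a small helper that collects only the text nodes directly under an element. This is just a sketch: GetOwnText and OwnTextDemo are invented names, and EMPTY is a hypothetical element added to show the no-children case.

```csharp
using System;
using System.Text;
using System.Xml;

public static class OwnTextDemo
{
    // Hypothetical helper: concatenates only the text and CDATA nodes that
    // are direct children, so nested elements contribute nothing and an
    // element without children yields "".
    public static string GetOwnText(XmlNode node)
    {
        StringBuilder sb = new StringBuilder();
        foreach (XmlNode child in node.ChildNodes)
        {
            if (child.NodeType == XmlNodeType.Text ||
                child.NodeType == XmlNodeType.CDATA)
            {
                sb.Append(child.Value);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<FIELDS><OTHERFILENUMBER>692</OTHERFILENUMBER>" +
            "<EMPTY></EMPTY></FIELDS>");

        foreach (XmlNode child in doc.DocumentElement.ChildNodes)
        {
            // Prints OTHERFILENUMBER: '692' and then EMPTY: ''
            Console.WriteLine("{0}: '{1}'", child.Name, GetOwnText(child));
        }
        // FIELDS has only element children, so its own text is empty.
        Console.WriteLine("FIELDS: '{0}'", GetOwnText(doc.DocumentElement));
    }
}
```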

tshad said:
If my childNode is now at "OTHERFILENUMBER", childNode.ParentNode is FIELDS, but
I need to get to "FORM" as that has all the attributes I need.

Is there a way to directly access that node from the childNode?

You can use .ParentNode multiple times!

Arne
 
Arne Vajhøj said:
You can use .ParentNode multiple times!

So I can do childNode.ParentNode.ParentNode.ParentNode.Name or
childNode.ParentNode.ParentNode.ParentNode.FirstChild.Value.

That would solve my problem.

Thanks,

Tom
 
tshad said:
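So I can do childNode.ParentNode.ParentNode.ParentNode.Name or
childNode.ParentNode.ParentNode.ParentNode.FirstChild.Value.

Yes, chaining works. Count the levels, though: from OTHERFILENUMBER,
.ParentNode is FIELDS and .ParentNode.ParentNode is already FORM.

(The property on XmlNode is .ParentNode; there is no .Parent, so the
friendly compiler will reject that spelling.)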
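
For what it is worth, a sketch of reaching FORM from a field node, assuming a trimmed version of Tom's document. FindAncestor and AncestorDemo are invented names; the XPath ancestor axis shown at the end is an alternative that avoids counting levels by hand.

```csharp
using System;
using System.Xml;

public static class AncestorDemo
{
    // Hypothetical helper: walks up with ParentNode until an element with
    // the given name is found; returns null when there is none.
    public static XmlElement FindAncestor(XmlNode node, string name)
    {
        for (XmlNode p = node.ParentNode; p != null; p = p.ParentNode)
        {
            if (p.NodeType == XmlNodeType.Element && p.Name == name)
                return (XmlElement)p;
        }
        return null;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<REPORT><FORMS>" +
            "<FORM NUM=\"1\" FORMCODE=\"1004\" SECCODE=\"1\"><FIELDS>" +
            "<OTHERFILENUMBER>692</OTHERFILENUMBER>" +
            "</FIELDS></FORM></FORMS></REPORT>");
        XmlNode field = doc.SelectSingleNode("//OTHERFILENUMBER");

        Console.WriteLine(field.ParentNode.Name);             // FIELDS
        Console.WriteLine(field.ParentNode.ParentNode.Name);  // FORM

        XmlElement form = FindAncestor(field, "FORM");
        Console.WriteLine(form.GetAttribute("FORMCODE"));     // 1004

        // XPath alternative: select the enclosing FORM via the ancestor axis.
        XmlNode viaXPath = field.SelectSingleNode("ancestor::FORM");
        Console.WriteLine(viaXPath.Attributes["NUM"].Value);  // 1
    }
}
```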

Arne
 