experts: structure of dataset xml

  • Thread starter Thread starter suzy
  • Start date Start date
S

suzy

hello,

i have a problem, i wonder if anyone can help...

i am retrieving records from an access table using a dataset and want the
data to be returned in xml, so i am using the dataset.GetXml() method.

lets say, for example, that the xml is in the following format:

<Users>
<User>
<UserId>1</UserId>
<Username>robert</Username>
<Password>friend</Password>
</User>
<User>
<UserId>2</UserId>
<Username>natasha</Username>
<Password>sunny</Password>
</User>
</Users>

with this xml, i can easily read values from the xml by selecting the node i
want.

however, my problem occurs if i do an inner join in my user table (lets say
from the users table, to the address table). the xml then gets returned
something like this:

<Users>
<User>
<User.UserId>1</User.Id>
<Username>robert</Username>
<Password>friend</Password>
<AddressId>11</AddressId>
<Address.UserId>1</Address.UserId>
<AddressLine1>11 main road</AddressLine1>
<AddressLine2>london</AddressLine2>
</User>
<User>
<User.UserId>2</User.UserId>
<Username>natasha</Username>
<Password>sunny</Password>
<AddressId>12</AddressId>
<Address.UserId>2</Address.UserId>
<AddressLine1>12 main road</AddressLine1>
<AddressLine2>london</AddressLine2>
</User>
</Users>


as you can see, when there are fields with the same name across the tables
in the inner join, the node gets prefixed with the table name. this causes
a problem when it comes to reading values out of the xml because:

sometimes i might be doing an inner join, sometimes i might not be doing an
inner join.
the node names change depending on whether i join or not, so how does my
code that reads the values from the xml know what to look for?

i dont want to write separate code for each join scenario... is there a way
around this? maybe there is a way to force the .getxml method to ALWAYS
prefix fields with the table name?

many thanks.
 
Hi,

I guess you are using a 'select *' I would actually caution against this and
rather explicidly list the columns to select that way you would not be
getting the redundant UserID from the address and the naming problem would
go away. However if you require it for some reason the following should
help.

The solution here would be to alias the column names in the query. For
example:

select U.UserId as UserID,
U.Username as UserName,
U.Password as Password,
A.AddressID as AddressID,
A.UserID as AddressUserID,
...
from Users U
inner join Addresses A on U.UserID = A.UserID ...

This way when ADO.NET process the resultset there are no columns that have
conflicting names.

Hope this helps

Chris Taylor
 
Hi Chris,

Thanks for your help. You are right, I am using a "SELECT *".

Also, I thought about the alias way, but unfortunately, I don't think it's
an option for me, because I am trying to write a generic data access tool,
which means I wouldn't know which fields I was querying beforehand. :-(

What would be ideal is if the fields were returned as attributes rather than
nodes. Is there such a thing in .net?

Any other ideas?
 
Hi,

Sorry the last post was not of much help then. However to write the data out
as attributes is quite doable, by setting the ColumnMapping property of each
column to MappingType.Attribute.

foreach ( DataTable dt in ds.Tables )
{
foreach ( DataColumn col in dt.Columns )
{
col.ColumnMapping = MappingType.Attribute;
}
}

Now you can use the DataSet WriteXml method to serialize the DataSet to a
string or a stream etc.

Hope this helps

Chris Taylor
 
Chris, thanks a lot for that - I never knew you could do that :)

Unfortunately, the attributes are still prefixed with the table name (silly
me, I should have realised it would still do this). eg:

<Users>

<User User.UserId="1" Username="robert" Password="friend" AddressId="12"
Address.UserId="1"/>

</Users>

All I need to do now is figure out how to separate the inner join tables
into sub-nodes of the main node... eg:

<Users>

<User UserId="1" Username="robert" Password="friend">

<Address AddressId="12" UserId="1"/>

</User>

</Users>

I have no idea how do this. :( Any ideas?

I may have to resort to storing a DB schema as a xml file, which my data
access tool reads before each query so it knows which fields exist.
 
Hi,

I am not sure of this would fit for your situation. Instead of using a join,
you can read the Users and Addresses table into the DataSet, create a
DataRelation between the two DataTables on the UserID columns. Then you can
set the DataRelation's Nested property to true. The resulting XML will be
the User Records with the related Address info nested in each User Element.
This should also solve your problem with the naming of the columns.

Hope this helps

Chris Taylor
 
Good idea, Chris.

I haven't tried it yet, but I can foresee a problem that it would return ALL
records from User and Address tables which could be quite a big overhead in
some cases.

I think you are right though - it would solve the naming of columns issue
though.
 
Hi,

You could reduce the overhead by selecting only the relevant records from
each table. For example if you only want the users with addresses then the
selection of the users can include a EXISTS check.

select * from users as u
where exists ( select 1 from addresses as a where a.userid = u.userid )

Hope this helps

Chris Taylor
 
Chris,

I guess you are right, but the query code would get more and more
complicated the more "inner joins" I do. For a generic data access tool,
this could become quite a nightmare!

Currently, I am trying to retrieve the database scheme. I have managed to
get all the information I need (tables and fields). Unfortunately, the
fields are being returned in alphabetical order in the columns collection.
I'd rather it was returned in the same order as they appear in the table.
It's not a big deal (ie: it wouldn't affect the functionality of my tool),
but it would be nice to have them in a nice order so the xml is easier to
read.

:)
 
Hi Suzy,

How are you retrieving the schema information? From the system tables,
schema views or GetOleDbSchemaTable?

Regards

Chris Taylor
 
Hi Chris,

Yes, I'm using:
dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[]{null,
null, null, "TABLE"});
to get a list of tables.

And then I loop through these tables and use:
dtFields = conn.GetOleDbSchemaTable (OleDbSchemaGuid.Columns, new object[]
{null, null, dt.Rows["TABLE_NAME"], null});
to get a list of columns in each table.
 
You can create a DataView on the DataTable that is sorted by
'ORDINAL_POSITION', that will give you the results in column sequence.

DataView dv = new DataView(dtFields);
dv.Sort = "ORDINAL_POSITION";

// Now dv has columns in database sequence

Hope this helps

Chris Taylor

suzy said:
Hi Chris,

Yes, I'm using:
dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[]{null,
null, null, "TABLE"});
to get a list of tables.

And then I loop through these tables and use:
dtFields = conn.GetOleDbSchemaTable (OleDbSchemaGuid.Columns, new object[]
{null, null, dt.Rows["TABLE_NAME"], null});
to get a list of columns in each table.





Chris Taylor said:
Hi Suzy,

How are you retrieving the schema information? From the system tables,
schema views or GetOleDbSchemaTable?

Regards

Chris Taylor
managed
to then
the using create XML
will table
name which
my not,
 
Great! Thanks for that. I'll give that a try. Hopefully that will be the
last of my problems! :)

Chris Taylor said:
You can create a DataView on the DataTable that is sorted by
'ORDINAL_POSITION', that will give you the results in column sequence.

DataView dv = new DataView(dtFields);
dv.Sort = "ORDINAL_POSITION";

// Now dv has columns in database sequence

Hope this helps

Chris Taylor

suzy said:
Hi Chris,

Yes, I'm using:
dt = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[]{null,
null, null, "TABLE"});
to get a list of tables.

And then I loop through these tables and use:
dtFields = conn.GetOleDbSchemaTable (OleDbSchemaGuid.Columns, new object[]
{null, null, dt.Rows["TABLE_NAME"], null});
to get a list of columns in each table.





Chris Taylor said:
Hi Suzy,

How are you retrieving the schema information? From the system tables,
schema views or GetOleDbSchemaTable?

Regards

Chris Taylor


Chris,

I guess you are right, but the query code would get more and more
complicated the more "inner joins" I do. For a generic data access tool,
this could become quite a nightmare!

Currently, I am trying to retrieve the database scheme. I have
managed
to
get all the information I need (tables and fields). Unfortunately, the
fields are being returned in alphabetical order in the columns collection.
I'd rather it was returned in the same order as they appear in the table.
It's not a big deal (ie: it wouldn't affect the functionality of my tool),
but it would be nice to have them in a nice order so the xml is
easier
to
read.

:)

Hi,

You could reduce the overhead by selecting only the relevant records
from
each table. For example if you only want the users with addresses then
the
selection of the users can include a EXISTS check.

select * from users as u
where exists ( select 1 from addresses as a where a.userid = u.userid )

Hope this helps

Chris Taylor
Good idea, Chris.

I haven't tried it yet, but I can foresee a problem that it would
return
ALL
records from User and Address tables which could be quite a big
overhead
in
some cases.

I think you are right though - it would solve the naming of columns
issue
though.



Hi,

I am not sure of this would fit for your situation. Instead of using
a
join,
you can read the Users and Addresses table into the DataSet, create
a
DataRelation between the two DataTables on the UserID columns. Then
you
can
set the DataRelation's Nested property to true. The resulting XML
will
be
the User Records with the related Address info nested in each User
Element.
This should also solve your problem with the naming of the columns.

Hope this helps

Chris Taylor

Chris, thanks a lot for that - I never knew you could do
that
:) "SELECT
*". in
the
 
Back
Top