Performance better with DataSet or DataReader or ...???

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi,
I have a business-entity-classes which are constructed of few primitive
types and one DataSet type. For Example:

Public class MyClass{
private int id;
private string name;
private DataSet childrenCollection;

…

In my data access layer I want to create an instance based on this class and
fill it with data from the DB (MSDE). I have two options in mind and I want
to know which would be the best in term of performance. In both options I’m
using a single stored procedure with two select statements – one for the
single row (primitive types) and one for the children collection. For example:

SELECT ID, Name from T_Parents WHERE ID= id;
SELECT ID, Name from T_Children WHERE ParentID = id;

The options are:

1. Fill all the data into a DataSet. As there are two select statements two
tables will be created in the DataSet I would be able to copy the values from
the first table to my class primitive and the other table to my class
DataSet.

2. Use a DataReader and, by looping over the second ResultsSet, build a
DataTable row after row. So instead of using the DataSet.Fill method I’ll be
“manually†creating the rows and inserting the values based on the results
coming from the DataReader.

I don’t think it matters, but I’m using the data access application bloc v2.

If there is another (better) option I haven’t thought about I’ll be happy to
know. Otherwise I would like to know which one should perform better.

Thank you,
Gwenda
 
Gwenda:
You haven't really been specific enough for anyone to provide a specific
answer. In this case, knowing more about your schema could easily impact
the answer. Knowing more about your retrieval code would also be
beneficial.

Is the DataSet you've named childrenCollection the same as the DataSet that
you describe populating in the paragraphs after the SQL Statements? If so,
since you're already storing the data about the parent in the other member
variables, would it not make more sense to store just the DataTable
containing the children?

Are id and name the only columns you're retrieving?

Assuming that "T_" means "table" (correct me if I'm wrong), do T_Parents and
T_Children store the same data just partitioned for creating a relationship?
If not, does T_Children have data that could/should be normalized in order
to make the data the same?

The reason I ask these questions is because there may be a way to retrieve a
single DataTable and DataView and do away with the overhead associated with
the DataSet. If you still need to create both tables though, the
performance difference between DataAdapter.Fill and iterating through a
DataReader to build a DataTable will be negligable and you'll also have more
code to maintain. Keep in mind that if your data structure ever changes,
you'll need to update the code for building your table in addition to any
other required changes...

I'm sorry I couldn't be more help at this time but if you answer the above
questions and include your schema (at least as much as relates to this
question), I should be able to provide more assistance.
 
Gwenda,

When I understand your question well, than is for me the answer simple,

When I can use the dataadapter to build a dataset, I do it.

I hope this helps you to find your decissions.

Cor
 
Hi Dave,
Sorry if I took a few issued for granted. I'll try now to make my questions
clearer:

1. My actual schema is much more complex than the one I’ve used just for
example purposes. But the main issue here is that a lot of my business
entities (and my DB schema as well) are constructed from primitive types, my
own types (classes) and DataSets. I use the DatSets whenever an entity
contains a collection within it. As some of the collections are quite big and
I want to keep track of what was change in them by the user.

2. I hope you understood from my answer in 1 that I do store in a DataSet
only the children.

3. “T_†does means “table†but the data in the parent table and the
children table is completely different. Following my example, if T_Parents
contain ID and Name (again, in my actual scenario it contains much more) so
T_children will contain something like: ID, Favorite_Color, Average_Speed and
Parent_ID which is a foreign key to T_Parents.ID with many to one relation.

I hope this is helpful and I'll be happy to answer more questions.
Thank you,
Gwenda
 
Hi Gwenda,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to know the best way to fill
the childrenCollection DataSet. If there is any misunderstanding, please
feel free to let me know.

In my opinion, I will try to fill the two tables to a single DataSet and
create a DataRelation between two tables. We can use GetChildRows to get
the children table rows according to the parent ID and put them to the
objects. This will help to lower the workload of database server, since a
DataReader will require an open connection to the database. If many clients
will connection to the server simultaneously, this would be a good
practice. It does all the object filling at client side.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Sorry for not getting this back yesterday. It sounds like you may still
have an opportunity to rework your schema a bit but without the schema I
can't really give you any guidance on that (and depending on your progress
on the project it might not be an option).

I'd like to extend on what Kevin has said below. If you populate a DataSet
with the two result sets returned from your stored procedure you will
definately reduce the load on your DB Server (but, since you're using MSDE,
I doubt it would make much difference overall). Also consider this though:
It doesn't sound like you really need to manage the relationship with the
DataSet because (from my understanding of you're previous posts), you're
going to populate the private fields of your class with the parent's
information, then reference the children through the dataset.

If this is the case, If you only have the one "Collection" you can just
replace the DataSet field with a DataTable field and call the copy() method
of the DataTable in your DataSet to copy the DataTable into your class. You
could then dispose of your data set and get rid of the overhead it causes.
Furthermore, you stated that you were using the DataSet to "track what was
changed in them by the user," The individual data tables are responsible
for managing the changes, not the DataSet.

If you do, indeed have multiple "Collections" you should probably follow
Kevin's advice and use the DataSet with DataRelations but rather than
populate member fields in your class, just create properties that expose
only what needs to be exposed. For example:

public class MyClass
{
private DataSet peopleInfo;

public int ID
{
get { return (int)(peopleInfo.Tables["Parent"].Rows[0]["ID"]); }
}

public string Name
{
get { /* code to get the name here */ }
set { /* code to set the name here */ }
}

public DataTable Children
{
get { return peopleInfo.Tables["Children"]; }
}
}

I hope this helps
 
Hi Kevin and Dave,
First, let me thank you for your answers which were very helpful. I
especially liked the idea of exposing parts of the DS through my properties.
Second, let me clear one issue just to make sure we understand each other. I
mentioned in my previous message that my objects are filled in the data layer
and then being sent to the client (through a business layer and a service
layer). So in any case I wasn’t planning to leave an open connection to the
DB from the client. The connection is opened; the object is filled; and the
connection is closed. Since dataReaders usually perform better than DS, I
thought it would be better to use it – for example:

public MyClass GetMyClass(int myClassID)
{
const string procName = "P_Get_MyClass_By_ID";
MyClass myClass = null;
SqlParameter[] sqlParameters = new SqlParameter[]{new
SqlParameter("@myClass_ID", myClassID)};
using (SqlDataReader myClassReader =
SqlHelper.ExecuteReader(connectionString,CommandType.StoredProcedure,
procName, sqlParameters))
{
while (myClassReader.Read())
{

if(myClass == null)
{

myClass = new MyClass (myClassReader.GetInt32(0),
myClassReader.GetString(1), new DataSet("contactsList"));
}
}
myClass.childrenCollection.Tables.Add("Children");

myClass.childrenCollection.Tables["Children"].Columns.Add("Favorite_color",
System.Type.GetType("System.String"));

myClass.childrenCollection.Tables["Children"].Columns.Add("Speed"),
System.Type.GetType("System.String"));
if(myClass.NextResult())
{
while (myClassReader.Read())
{

myClass.childrenCollection.Tables["Children"].Rows.Add( new
object[]{myClassReader.GetString(0), clientReader.GetString(1)});
}
}
}
}

Following your answers, I understand this code is harder to maintain but I
want to be sure that performance-wise its better too – taking into account
that the connection is not kept open through the whole client session.

Thank you both again,
Gwenda



Dave Fancher said:
Sorry for not getting this back yesterday. It sounds like you may still
have an opportunity to rework your schema a bit but without the schema I
can't really give you any guidance on that (and depending on your progress
on the project it might not be an option).

I'd like to extend on what Kevin has said below. If you populate a DataSet
with the two result sets returned from your stored procedure you will
definately reduce the load on your DB Server (but, since you're using MSDE,
I doubt it would make much difference overall). Also consider this though:
It doesn't sound like you really need to manage the relationship with the
DataSet because (from my understanding of you're previous posts), you're
going to populate the private fields of your class with the parent's
information, then reference the children through the dataset.

If this is the case, If you only have the one "Collection" you can just
replace the DataSet field with a DataTable field and call the copy() method
of the DataTable in your DataSet to copy the DataTable into your class. You
could then dispose of your data set and get rid of the overhead it causes.
Furthermore, you stated that you were using the DataSet to "track what was
changed in them by the user," The individual data tables are responsible
for managing the changes, not the DataSet.

If you do, indeed have multiple "Collections" you should probably follow
Kevin's advice and use the DataSet with DataRelations but rather than
populate member fields in your class, just create properties that expose
only what needs to be exposed. For example:

public class MyClass
{
private DataSet peopleInfo;

public int ID
{
get { return (int)(peopleInfo.Tables["Parent"].Rows[0]["ID"]); }
}

public string Name
{
get { /* code to get the name here */ }
set { /* code to set the name here */ }
}

public DataTable Children
{
get { return peopleInfo.Tables["Children"]; }
}
}

I hope this helps

Kevin Yu said:
Hi Gwenda,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to know the best way to fill
the childrenCollection DataSet. If there is any misunderstanding, please
feel free to let me know.

In my opinion, I will try to fill the two tables to a single DataSet and
create a DataRelation between two tables. We can use GetChildRows to get
the children table rows according to the parent ID and put them to the
objects. This will help to lower the workload of database server, since a
DataReader will require an open connection to the database. If many
clients
will connection to the server simultaneously, this would be a good
practice. It does all the object filling at client side.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
One last thing as a follow up. The performance of a data reader is better
if you're going to access the contents directly but in this case, it sounds
like there would be no performance gain by using one method or the other
since you were going to just going to do manually what the DataAdapter does
automatically through the Fill() method.

The Fill() method would probably be a better choice here since you're
essentially going to have to maintain a copy of the database structure in
your application. If you were to use a DataReader to populate a DataTable
and if you were to ever change the structure of the database (highly
probable) then you'll have to not only update the code, you'll also have to
recompile and redistribute the application.

One may argue that you'd have to do that anyway since if the data structure
changed then the application probably would too so you'd have to
recompile/redistribute anyway but I say that it not only depends on the
change. Additionally, adding the complexity of maintaining a copy of the
data structure simply introduces another place for your code to break. I'm
a strong believer in keeping code at its simplest form.

Good luck!

gwenda said:
Hi Kevin and Dave,
First, let me thank you for your answers which were very helpful. I
especially liked the idea of exposing parts of the DS through my
properties.
Second, let me clear one issue just to make sure we understand each other.
I
mentioned in my previous message that my objects are filled in the data
layer
and then being sent to the client (through a business layer and a service
layer). So in any case I wasn't planning to leave an open connection to
the
DB from the client. The connection is opened; the object is filled; and
the
connection is closed. Since dataReaders usually perform better than DS, I
thought it would be better to use it - for example:

public MyClass GetMyClass(int myClassID)
{
const string procName = "P_Get_MyClass_By_ID";
MyClass myClass = null;
SqlParameter[] sqlParameters = new SqlParameter[]{new
SqlParameter("@myClass_ID", myClassID)};
using (SqlDataReader myClassReader =
SqlHelper.ExecuteReader(connectionString,CommandType.StoredProcedure,
procName, sqlParameters))
{
while (myClassReader.Read())
{

if(myClass == null)
{

myClass = new MyClass (myClassReader.GetInt32(0),
myClassReader.GetString(1), new DataSet("contactsList"));
}
}
myClass.childrenCollection.Tables.Add("Children");

myClass.childrenCollection.Tables["Children"].Columns.Add("Favorite_color",
System.Type.GetType("System.String"));

myClass.childrenCollection.Tables["Children"].Columns.Add("Speed"),
System.Type.GetType("System.String"));
if(myClass.NextResult())
{
while
(myClassReader.Read())
{

myClass.childrenCollection.Tables["Children"].Rows.Add( new
object[]{myClassReader.GetString(0), clientReader.GetString(1)});
}
}
}
}

Following your answers, I understand this code is harder to maintain but I
want to be sure that performance-wise its better too - taking into account
that the connection is not kept open through the whole client session.

Thank you both again,
Gwenda



Dave Fancher said:
Sorry for not getting this back yesterday. It sounds like you may still
have an opportunity to rework your schema a bit but without the schema I
can't really give you any guidance on that (and depending on your
progress
on the project it might not be an option).

I'd like to extend on what Kevin has said below. If you populate a
DataSet
with the two result sets returned from your stored procedure you will
definately reduce the load on your DB Server (but, since you're using
MSDE,
I doubt it would make much difference overall). Also consider this
though:
It doesn't sound like you really need to manage the relationship with the
DataSet because (from my understanding of you're previous posts), you're
going to populate the private fields of your class with the parent's
information, then reference the children through the dataset.

If this is the case, If you only have the one "Collection" you can just
replace the DataSet field with a DataTable field and call the copy()
method
of the DataTable in your DataSet to copy the DataTable into your class.
You
could then dispose of your data set and get rid of the overhead it
causes.
Furthermore, you stated that you were using the DataSet to "track what
was
changed in them by the user," The individual data tables are responsible
for managing the changes, not the DataSet.

If you do, indeed have multiple "Collections" you should probably follow
Kevin's advice and use the DataSet with DataRelations but rather than
populate member fields in your class, just create properties that expose
only what needs to be exposed. For example:

public class MyClass
{
private DataSet peopleInfo;

public int ID
{
get { return (int)(peopleInfo.Tables["Parent"].Rows[0]["ID"]); }
}

public string Name
{
get { /* code to get the name here */ }
set { /* code to set the name here */ }
}

public DataTable Children
{
get { return peopleInfo.Tables["Children"]; }
}
}

I hope this helps

Kevin Yu said:
Hi Gwenda,

First of all, I would like to confirm my understanding of your issue.
From
your description, I understand that you need to know the best way to
fill
the childrenCollection DataSet. If there is any misunderstanding,
please
feel free to let me know.

In my opinion, I will try to fill the two tables to a single DataSet
and
create a DataRelation between two tables. We can use GetChildRows to
get
the children table rows according to the parent ID and put them to the
objects. This will help to lower the workload of database server, since
a
DataReader will require an open connection to the database. If many
clients
will connection to the server simultaneously, this would be a good
practice. It does all the object filling at client side.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Hi Gwenda,

Actually, the DataAdapter internally uses DataReader to read data and fill
the DataSet. However, if we fill a DataSet and copy data to your object,
that will be slower than your code to put records directly to the DataSet
in myClass. So I think your code is fine. Good luck!

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Back
Top