Datasets vs. OOP

  • Thread starter: cody
Alfredo said:
N-tier programming is about physical tiers. You still have a logical
client and a logical server.

There is no n-tier 'law', so saying it is about physical tiers is IMHO your
interpretation, not 'the' interpretation, because there is none.

FB
 
Alfredo said:
The resultsets returned from an SQL database, to manage them and to
submit the changes to the DBMS. The same as I said but with other
words.

I know they can be used with non SQL based data, but it is clear that
it is the main intention.

However a resultset has no relation with a physical set of entities.

SELECT * from A, B has no relation with A nor B, it forms a new entity, built
from A and B. So it's IMHO not used to map an SQL database, but simply there
to work with resultsets on one side (read) and to update rows in tables on
the other side (write).
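
A minimal sketch of that point, using Python's built-in sqlite3 module. The tables A and B match the query above, but their columns and rows are made up for illustration. The result of SELECT * FROM A, B carries the columns of both tables and one row per combination, so it corresponds to neither base table:

```python
import sqlite3


def cross_join_demo():
    """Run SELECT * FROM A, B on two tiny hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE A (a_id INTEGER PRIMARY KEY, a_name TEXT);
        CREATE TABLE B (b_id INTEGER PRIMARY KEY, b_name TEXT);
        INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
        INSERT INTO B VALUES (10, 'b1'), (20, 'b2'), (30, 'b3');
    """)
    cur = conn.execute("SELECT * FROM A, B")
    # Column names of the resultset: a mix of A's and B's columns.
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return columns, rows


columns, rows = cross_join_demo()
print(columns)    # columns of the combined resultset
print(len(rows))  # 2 rows in A times 3 rows in B
```

Nothing in that resultset says which table an update should go to, which is why the read side and the write side are better seen as two separate concerns.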

I comment on your comment because I read the other day an article which
claimed there is no 'order' in a datatable because it has to mimic an Sql
Table and tables don't have an 'order'.
I said that they can be as well optimized as joins, not that SQL
Server optimize them as well as joins ;-)

heh :) ok point taken. However, you can formulate subqueries which are hard
to optimize (especially when they're deep) due to the runtime-statistics
limitation.
Query 2 could be transformed in Query 1 by the optimizer.

... in theory yes, however in practice it will take too much time in a lot
of cases. Which leaves out this optimization I think. But we're drifting off
topic ;)

FB
 
There are a lot of other misapplications of OOP that are far more complex
than just this issue.

But this is an important issue IMO. For instance O/R mappers are tools
intended to apply the Network approach misusing the DBMSs as mere
storage mechanisms.
I'm not sure I understand what you mean here, but if your point is that
tightly coupling Data with the objects they represent is a problem, then yes
I agree.


No, I mean that performance is independent of the logical model
(normalized tables), and only dependent on the physical model (DBMS
internal structures). And with the current DBMSs the logical model is
tightly coupled to the physical model, which is a violation of the
principle of data independence.

That's why sometimes we need to corrupt (denormalize) the logical
designs in order to achieve better performance.

Regards
Alfredo
 
William said:
They can be cheap if done correctly, they can also be 'expensive' although
that's still a relative term. The point is that the implementation of a
model doesn't mean the model is weak. For years while I was too small to
read and before I was born, there was a fair amount of literature and
stereotypes that the relational model was mud just b/c the vendor
implementation didn't do X well for instance. Since I was sucking my thumb
at the time instead of using Oracle 1.0.1, I have to just look to what was
written and many themes and misconceptions seemed to be fairly commonplace.
But this didn't have anything to do with the model's failure and that was
ostensibly the analogy I was trying to make

Agreed. The relational model is 'weak' when you look at it from an OO POV:
you can't model inheritance for example without check constraints or even
values in tables. I wrote something about this a year ago or so, and realized
later that I was wrong: the relational model isn't weak because of that, it's
just 'different'.

FB
 
Hi Frans,
I know, but when you subclass the DataTable class and add a property, it
will not be serialized into the data.

That makes your message clear to me.

Thanks, maybe I will have a look at it someday.

Cor
 
There is no n-tier 'law', so saying it is about physical tiers is IMHO your
interpretation, not 'the' interpretation, because there is none.

You could say the same about almost everything in the IT world.

But n-tier systems have two logical tiers: Client and server.

Regards
Alfredo
 
However a resultset has no relation with a physical set of entities.

Neither does a SQL resultset. Select * from A should not have a relationship
with a physical set of entities.
SELECT * from A, B has no relation with A nor B, it forms a new entity, built
from A and B.

Select * from A, B is a virtual table, instead of a base table. But
this is a very common issue.
I comment on your comment because I read the other day an article which
claimed there is no 'order' in a datatable because it has to mimic an Sql
Table and tables don't have an 'order'.

There is order in a datatable because datatables are not perfect
"mappings" of SQL tables.
heh :) ok point taken. However, you can formulate subqueries which are hard
to optimize (especially when they're deep) due to the runtime-statistics
limitation.

Of course, but theoretically there are many possible ways to organize
the runtime-statistics. Each vendor has its own.
... in theory yes, however in practice it will take too much time in a lot
of cases. Which leaves out this optimization I think. But we're drifting off
topic ;)

It depends on how good the optimizer is. IMO the technology is still
in its infancy.


Regards
Alfredo
 
Agreed. The relational model is 'weak' when you look at it from an OO POV:

Striking statement!

The Relational Model is the direct application of thousands of years
of accumulated knowledge in math and logic. It is the direct
application of Predicate Logic and Set Theory. Disciplines developed
by people like Aristotle, Boole, Cantor, Frege, Gödel, Russell, etc.
The Relational Model is a very solid mathematical framework.

Compare this with OO which has as many different interpretations as
practitioners.
you can't model inheritance for example without check constraints or even
values in tables.

I am afraid that what you mean is not inheritance at all.
I wrote something about this a year ago or so, and realized
later that I was wrong: the relational model isn't weak because of that, it's
just 'different'.

Indeed. The Relational Model is a data management approach. OO is a
set of coding guidelines. It is like mixing apples with oranges or
bacon with velocity as we say here :-)


Regards
Alfredo
 
Are we trolling Alfredo ;=) I have never read anything anywhere that a
reputable source would claim that the concept of "n-tier" had much if
anything to do with physical tiers. Most folks would consider the term
"n-tier" in the context of separating the business layer from the user and
data layers. Since those might be physically implemented all on the same
machine, "n-tier" is clearly a term to describe a logical implementation.
But a client-server relationship is "n-tier" also, distinctly not only a
logical implementation, but most often a physical implementation also. I'll
leave it to others more experienced than I to expound.
 
Let's back way up here b/c this discussion is turning into something totally
different than the original question. Pretend there is magically no such
thing as RDBMS system...they just poof disappeared. This wouldn't affect
dataset one bit. You might counter that Rowstate would be unnecessary, but
that isn't really the case, who's to know what someone might need state
information for.

As such, DataSets have NOTHING to do with Relational theory. They may
appear to model it, they may appear to copy its functionality, they may
even implement many of the rules, but they aren't necessarily RDBMS
structures or anything of the sort. Not using a DataRelation and having
redundant data is perfectly legal usage of DataSets. It may or may not be
recommended depending on what you want to do with the data, but that has
nothing to do with the dataset, it has to do with the end goal.

I could create a simple 2d object (person with 10 properties) that I can
model perfectly with a DataRow. All a DataSet is is a collection of
datatables, each of which is a collection of datarows. I could do this because
I don't want to use new collection classes since these ones exist. I'm 'reusing'
existing objects to achieve my end. The fact that I may shove this data
into a database is ancillary because I also may not.
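
To make the "collection of collections" point concrete, here is a rough Python analogy. It is not the ADO.NET API: every class and method name below (DataSetSketch, Table, Row, accept_changes and so on) is invented for this sketch. It only shows that rows, tables and row state hang together without any database in the picture:

```python
from dataclasses import dataclass, field
from enum import Enum


class RowState(Enum):
    UNCHANGED = "unchanged"
    ADDED = "added"
    MODIFIED = "modified"


@dataclass
class Row:
    values: dict
    state: RowState = RowState.ADDED

    def set(self, column, value):
        # Track the change; no database is involved in keeping this state.
        self.values[column] = value
        if self.state is RowState.UNCHANGED:
            self.state = RowState.MODIFIED


@dataclass
class Table:
    name: str
    rows: list = field(default_factory=list)

    def add(self, **values):
        row = Row(dict(values))
        self.rows.append(row)
        return row

    def accept_changes(self):
        # Mark every row as clean, whatever "clean" means to the caller.
        for row in self.rows:
            row.state = RowState.UNCHANGED


@dataclass
class DataSetSketch:
    tables: dict = field(default_factory=dict)

    def table(self, name):
        # Return the named table, creating it on first use.
        return self.tables.setdefault(name, Table(name))


ds = DataSetSketch()
people = ds.table("Person")
alice = people.add(name="Alice", age=30)
people.accept_changes()
alice.set("age", 31)
print(alice.state)
```

Whether the modified row later goes to a database, a file, or nowhere at all is a separate decision.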

To this end, I could easily employ datasets and say to hell with Relational
theory, Network model etc and still have a valid design. In addition, if I
only persisted say two of those fields and decided that I did want to send
them to the db, and it did match the structure of the back end, OR/M is not
only a natural fit, it's a very good one.

To that end I fail to see the point here?
 
Alfredo said:
You could say the same about almost everything in the IT world.

But n-tier systems have two logical tiers: Client and server.

but the BL is a client of the DAL and a server to the GUI client tier :),
that's why I was saying, the differences in tiers are semantical, not
necessarily client/server
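
A small Python sketch of that double role. The class names, the order data and the discount rule are all hypothetical and only there to show the shape. The business layer consumes the DAL and at the same time serves the GUI:

```python
class DataAccessLayer:
    """Serves the business layer; knows only about storage."""

    def __init__(self):
        # Hypothetical stored data: order id -> total in cents.
        self._order_totals_cents = {1: 12000, 2: 8000}

    def fetch_total_cents(self, order_id):
        return self._order_totals_cents[order_id]


class BusinessLayer:
    """Client of the DAL and, at the same time, server for the GUI."""

    DISCOUNT_PERCENT = 10  # made-up business rule

    def __init__(self, dal):
        self._dal = dal

    def discounted_total_cents(self, order_id):
        total = self._dal.fetch_total_cents(order_id)  # acting as a client
        return total * (100 - self.DISCOUNT_PERCENT) // 100


class Gui:
    """Client of the business layer; never sees the DAL."""

    def __init__(self, business_layer):
        self._bl = business_layer

    def render(self, order_id):
        cents = self._bl.discounted_total_cents(order_id)
        return f"Order {order_id}: {cents} cents"


gui = Gui(BusinessLayer(DataAccessLayer()))
print(gui.render(1))  # prints "Order 1: 10800 cents"
```

Which layer counts as 'client' or 'server' depends on which call you look at, so the labels describe roles rather than fixed tiers.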

FB
 
Alfredo said:
But this is an important issue IMO. For instance O/R mappers are tools
intended to apply the Network approach misusing the DBMSs as mere
storage mechanisms.

I don't think every O/R mapper does that, only the ones following
Fowler/Evans' Domain model. (note: I don't follow that approach)

FB
 
Alfredo said:
Striking statement!

The Relational Model is the direct application of thousands of years
of accumulated knowledge in math and logic. It is the direct
application of Predicate Logic and Set Theory. Disciplines developed
by people like Aristotle, Boole, Cantor, Frege, Gödel, Russell, etc.
The Relational Model is a very solid mathematical framework.

I think that is a bit too much. Codd did most of the work, and before that,
there was mathematical theory, but not applicable to databases directly.
Agreed, most of the theoretical basis was founded way before Codd walked the
earth, but I find it a bit too much to give ancient mathematicians the credit
for the relational model ;)
I am afraid that what you mean is not inheritance at all.

( <- == is-a )

Person <- Employee <- Manager <- CEO
Person <- Employee <- Clerk
Person <- Employee <- Accountant

Now, if you set this up in NIAM or ORM (Halpin) you will use
supertypes/subtypes, which allow you to define relations directly to
'Manager' for example.

When you want to project this onto a relational model, you can't define is-a
relationships or supertype/subtypes. You can 'try', but it is very hard.
Nijssen/Halpin mention 2 ways:
1) flatten the supertype/subtype hierarchy and store them all in 1 table,
with all attributes of all types in the hierarchy in that table, and
attributes which are defined with a lower type than the root supertype are
nullable. Furthermore, add a column which identifies the type of the entity
in a particular row.
2) define 1 table per subtype and add an FK constraint from the PK of the
subtype table to the PK of its supertype table (not THE supertype table!)

1) requires values in its type column to interpret which type the row really
has. This is thus not modelable in the relational model: without data in that
column, there is no hierarchy, and the hierarchy is in the data, not in the
model. In other words you can't model inheritance this way, even though it IS
inheritance.
2) is close, however it doesn't symbolize a hierarchy per se, because the FK
constraint just illustrates 'relationship between attributes' not an is-a
relationship. Retrieving a complete 'type' requires reads from 1 (person) or
more tables (rest), which is not modelable: I can read a row from 'Manager'
without the necessity to read its related row in Employee: i.e. the rows are
related but not seen as a unit, while we do see them as a unit in our
abstract supertype/subtype hierarchy.
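
Both mappings can be sketched with Python's built-in sqlite3 module. The table and column names are my own assumptions for illustration, and the hierarchy is cut down to Person <- Employee <- Manager:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Option 1: one flat table, subtype attributes nullable, type column.
    CREATE TABLE person_flat (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        salary      INTEGER,          -- only meaningful for Employee and below
        bonus       INTEGER,          -- only meaningful for Manager
        person_type TEXT NOT NULL
            CHECK (person_type IN ('Person', 'Employee', 'Manager'))
    );

    -- Option 2: one table per subtype, PK is also an FK to its supertype.
    CREATE TABLE person   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (id INTEGER PRIMARY KEY REFERENCES person(id),
                           salary INTEGER NOT NULL);
    CREATE TABLE manager  (id INTEGER PRIMARY KEY REFERENCES employee(id),
                           bonus INTEGER NOT NULL);
""")

conn.execute("INSERT INTO person_flat VALUES (1, 'Ann', 5000, 800, 'Manager')")
conn.execute("INSERT INTO person VALUES (1, 'Ann')")
conn.execute("INSERT INTO employee VALUES (1, 5000)")
conn.execute("INSERT INTO manager VALUES (1, 800)")

# Option 1: the type is only known by reading data from the row.
flat_type = conn.execute(
    "SELECT person_type FROM person_flat WHERE id = 1").fetchone()[0]

# Option 2: a Manager row can be read on its own, without its Employee row...
manager_only = conn.execute("SELECT * FROM manager WHERE id = 1").fetchone()

# ...and the complete 'Manager' only appears after joining up the hierarchy.
whole_manager = conn.execute("""
    SELECT p.name, e.salary, m.bonus
    FROM manager m
    JOIN employee e ON e.id = m.id
    JOIN person p ON p.id = e.id
    WHERE m.id = 1
""").fetchone()

print(flat_type, manager_only, whole_manager)
```

In option 1 the hierarchy lives in the values of person_type, and in option 2 the schema only states that the keys are related, so in neither case does the schema itself say 'a Manager is an Employee'.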


FB
 
Are we trolling Alfredo ;=)
No.

I have never read anything anywhere that a
reputable source would claim that the concept of "n-tier" had much if
anything to do with physical tiers.

I don't know any reputable source who writes about n-tiers. It is all
very weak.
Most folks would consider the term
"n-tier" in the context of separating the business layer from the user and
data layers.

Nonsense. It is impossible to separate the business layer from the
database layer because they are the same. What is possible is to
separate the logical database layer from the storage mechanism and
DBMSs are intended to do that.

An Application Server is nothing but a DBMS that uses another DBMS
behind the scenes as the storage mechanism. Then we have again a
client and a server. The backend DBMS used for storage is an
implementation detail. It is "encapsulated" and out of the logical
level.
Since those might be physically implemented all on the same
machine, "n-tier" is clearly a term to describe a logical implementation.

No, you may have many physical tiers in the same machine.
But a client-server relationship is "n-tier" also

Of course, a client-server system may have many physical tiers. If
they have more than 2 they are called "n-tier" systems.


Regards
Alfredo
 
Let's back way up here b/c this discussion is turning into something totally
different than the original question.

Something very frequent in the newsgroups.
As such, DataSets have NOTHING to do with Relational theory.

This is unfortunately true.
They may
appear to model it, they may appear to copy its functionality, they may
even implement many of the rules, but they aren't necessarily RDBMS
structures or anything of the sort.

Nobody said the contrary. It is clear that datasets are intended to
map SQL databases, but you can do all you want with them.
To this end, I could easily employ datasets and say to hell with Relational
theory, Network model etc and still have a valid design.

Of course, but to use a network design in the 21st century does not make
a lot of sense.


Regards
Alfredo
 
but the BL is a client of the DAL and a server to the GUI client tier :),

But the DAL does not exist for the GUI. It is hidden or encapsulated.
That's why I said that there are only two visible logical layers.

In the case of a distributed DBMS it is the same. A properly designed
distributed database should appear to the clients as a single
database. Client-Server again.


Regards
Alfredo
 
Alfredo said:
But the DAL does not exist for the GUI. It is hidden or encapsulated.
That's why I said that there are only two visible logical layers.

You can have multiple tiers at the same tier level, hidden by a facade,
however the facade calls into different tiers.
In the case of a distributed DBMS it is the same. A properly designed
distributed database should appear to the clients as a single
database. Client-Server again.

however these 'clients' are servers for other clients. So strictly thinking
in 'client' and 'server' is not that great. Better is: 'consumer-role' and
'producer-role' in a given 'situation'.

FB
 
One of my clients (who used to be a CFO at a big software house) often
states that "software is an opinion". Being conceptual, the entire "layer"
idea must be -- by definition -- a logical implementation.

I do agree with the thought that "you may have many physical tiers in
the same machine."
 