LINQ question

  • Thread starter Thread starter news.microsoft.com
  • Start date Start date
N

news.microsoft.com

I have a desktop aplication which use an embeded database engine (embeded in
the application that is) to store its data.
And I used LINQ 2 SQL as an ORMapper to synchronize the database with my
object representation in memory.

What I want to do is load most object at the start, do in memory work, and
occasionaly load additional big data or save the changes.

I have a problem with the DataContext.GetTable<T>() mehod. It seems
everytime I do a foreach on its retur it will query the database, is that
true?

So in my particular case I'm better off working with an List<T> which I will
initialize from the GetTable<T> result?

Just in case I wrote such a list (Code attached) and I wonder if I did it
the proper way, if anyone can take a look?



Also I have a table like that
Data
--------
ID int
summary nvarchar(256)
content image

which I loaded in
[Table("Data")]
class Data
{
[Column(...)]
public int ID {get; set; }
[Column(...)]
public sring Sumary { get; set; }
[Column(...)]
public byte[] Content { get; set; }
}

Now Content is potentially big and I want to use lazy loading, i.e. I DON'T
want to load the 'byte[] Content' with the rest of the data (I guess it's
easy, just remove the ColumnAttribute), but I DO want to save the Content,
if it's not null (i.e. if it has been explicitely loaded/set/modified) when
SubmitChanges() is called on the Context.
Any tip on how to do that?

Moreover, if I only change the content (which will have no ColumnAttribute)
how do I mark my entity as dirty in the context? (i.e. I would need an
'UpdateOnSubmit()' method in the table, but it doesn't exists..)
 
Before getting into this, I think a lot of people have a misunderstanding of
LINQ and its purpose. It is not a tool for data access, per se. Sure, LINQ
to SQL can query a database and pull your data in a lazy manner, but the
real strength is in iterating sets of objects that adhere to IEnumerable. In
some instances, you can use LINQ as an effective OR/M product. There are
other cases where it can bite you. Examine the limitations before making
decisions and do not be afraid to treat reporting as a separate function
from full CRUD.

Also, I have a limited view of the problem domain, so you may find some of
the things I throw out here will not work for you. Take what you can and see
where the ideas carry you.

Having said that, the rest is inline.

news.microsoft.com said:
I have a desktop aplication which use an embeded database engine (embeded
in
the application that is) to store its data.
And I used LINQ 2 SQL as an ORMapper to synchronize the database with my
object representation in memory.

This may or may not bite you.
What I want to do is load most object at the start, do in memory work, and
occasionaly load additional big data or save the changes.

You can do this, but consider only loading the "big data" when someone is
actively changing it. It takes another database hit, but it may be more
efficient in the long run.
I have a problem with the DataContext.GetTable<T>() mehod. It seems
everytime I do a foreach on its retur it will query the database, is that
true?

I am not completely sure. LINQ is designed to wait until you access data to
actually query the database. But I have not read, or tested, whether a
commonly used set of data (cache, for example) gets requeried each time you
iterate.

As it stands, however, you are better to cache the objects you want. You can
still use LINQ to run through objects after they are objects, and there
would be no requery.

If you are worried about requery every time you iterate, consider either a)
not using LINQ as your data access or b) using LINQ, but then storing the
objects off elsewhere. In either case, you can iterate the objects later, as
the objects will use IEnumerable.
So in my particular case I'm better off working with an List<T> which I
will
initialize from the GetTable<T> result?

This is one way to accomplish what I was talking about, but you can still
lazy load using the for each and wait to grab the objects. It really just
depends on your needs.
Just in case I wrote such a list (Code attached) and I wonder if I did it
the proper way, if anyone can take a look?

With a quick glance, everything appears fine. I am sure someone else will
offer a deeper analysis.
Also I have a table like that
Data
--------
ID int
summary nvarchar(256)
content image

which I loaded in
[Table("Data")]
class Data
{
[Column(...)]
public int ID {get; set; }
[Column(...)]
public sring Sumary { get; set; }
[Column(...)]
public byte[] Content { get; set; }
}

Now Content is potentially big and I want to use lazy loading, i.e. I
DON'T
want to load the 'byte[] Content' with the rest of the data (I guess it's
easy, just remove the ColumnAttribute), but I DO want to save the Content,
if it's not null (i.e. if it has been explicitely loaded/set/modified)
when
SubmitChanges() is called on the Context.
Any tip on how to do that?

So, you don't want to grab the data, but you do want to save it off if it is
not null (ie, a user added it?). Sounds to me like you need two sets of
objects. The first is the metadata objects, which contain the fields you
want people to see, sans the really big byte[] data. The second is the
object you want to save when a user changes the byte[] content.

if this sounds different, it is really nothing different from what you do
when you use a grid in conjunction with a details view when there is a field
you cannot display easily (or efficiently). There is nothing saying you
cannot switch the user to a "page" that shows all of the data, including a
field where they can upload binary data.
Moreover, if I only change the content (which will have no
ColumnAttribute)
how do I mark my entity as dirty in the context? (i.e. I would need an
'UpdateOnSubmit()' method in the table, but it doesn't exists..)

Once again, I would consider NOT using the same object to view and save in
your case, based on what I have read. There is no rule that states you must
have the same representation in view (when you do not want to see the binary
data) and CRUD mode (actually the CUD part of CRUD, as Read is handled
effectively by the initially grabbing of objects).

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

Subscribe to my blog
http://gregorybeamer.spaces.live.com/lists/feed.rss

*************************************************
| Think outside the box!
|
*************************************************
 
Inline ...
What I want to do is load most object at the start, do in memory work, and
occasionaly load additional big data or save the changes.

You may want to balance between loading on demand what you really need and
keepind frequently mostly read only data in cache. My personal preference is
to load just what is needed and add caching if needed rather than loading
too much to start with (laso be aware of concurrency issues loading too much
data could introduce)...
I have a problem with the DataContext.GetTable<T>() mehod. It seems
everytime I do a foreach on its retur it will query the database, is that
true?

AFAIK yes, actually I never use directly those methods. I use LINQ queries
(that ultimately will call those methods). They are queries i.e. just a
representation of the final request. When enumerating or using the ToList
method the query is sent to the dB and the list is produced in memory. The
list is then reusable withotu hitting the db anymore.
So in my particular case I'm better off working with an List<T> which I
will
initialize from the GetTable<T> result?

IMO yes if you don't want to always reload.
Now Content is potentially big and I want to use lazy loading, i.e. I
DON'T
want to load the 'byte[] Content' with the rest of the data (I guess it's
easy, just remove the ColumnAttribute), but I DO want to save the Content,
if it's not null (i.e. if it has been explicitely loaded/set/modified)
when
SubmitChanges() is called on the Context.
Any tip on how to do that?

AFAIK you can set attributes on the column in the designer to lazty load
your data if needed and if you change the data, the change is trakced and
the update should happen. Would try first before asking question if it
doesn"t work as expected...
Moreover, if I only change the content (which will have no
ColumnAttribute)
how do I mark my entity as dirty in the context? (i.e. I would need an
'UpdateOnSubmit()' method in the table, but it doesn't exists..)

AFAIK changed are track on a paoprty per peroperty basis ie. if a proprty is
updated Linqwill create the statement that updates just changed columns...

Also my personal preference is to add a timestamp column instead of letting
linq to check all columns for concurrency...
 
BTW, Hey, no worries, I know what I'm doing!
I do need it all!
This is not a server database, this is an embeeded database, just for this
person, and a summary on all data is presented in a treeview on the side of
the view at all time, hence I do need it at all time!
 
What I want to do is load most object at the start, do in memory work,
You may want to balance between loading on demand what you really need and
keepind frequently mostly read only data in cache. My personal preference
is to load just what is needed and add caching if needed rather than
loading too much to start with (laso be aware of concurrency issues
loading too much data could introduce)...
Hey, no worries, I know what I'm doing!
I do need it all!
This is not a server database, this is an embeeded database, just for this
person, and a summary on all data is presented in a treeview on the side of
the view at all time, hence I do need it at all time!
 
Hi,

Thanks for your answer. It kind of confirm many of my earlier thoughts.

Although I'm still investigate how to handle the byte[] array

Thinking about what you said, I could have 2 data context, sharing the same
connection (hence transactions! which I can explicitely create before
submitting changes).
In one I have a summary of all my document, in the other I have "heavy line"
(with the byte[]), seems possible, mmhh... I have to think about it more,
but thanks for the tip.
 
Back
Top