How can I access 30,000 records?

  • Thread starter: DEVELOPER
Loading 30k records, no matter what the engine, is going to suck. Querying
the data is easiest and most performant with SQL CE. You could get similar
or even better perf using indexed binary files, but that's a whole lot of
work, so unless you expect big payoffs, or it is super time critical (in
which case I'd question the decision to use managed code), you've got only
one option here.

-Chris
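
[For illustration, a minimal sketch of the SQL CE route Chris describes:
querying a window of rows from the Compact Framework. The .sdf path, table,
and column names are assumptions, not from the original post.]

using System;
using System.Data.SqlServerCe;

class QueryDemo
{
    static void Main()
    {
        // The path to the local database file is hypothetical.
        using (SqlCeConnection conn =
            new SqlCeConnection(@"Data Source=\My Documents\data.sdf"))
        {
            conn.Open();

            // A WHERE clause over an indexed column keeps the engine from
            // walking all 30,000 rows; only the visible window is fetched.
            SqlCeCommand cmd = new SqlCeCommand(
                "SELECT Id, Name FROM Records WHERE Id BETWEEN 1000 AND 1200",
                conn);

            using (SqlCeDataReader rdr = cmd.ExecuteReader())
            {
                while (rdr.Read())
                    Console.WriteLine(rdr.GetInt32(0) + " " + rdr.GetString(1));
            }
        }
    }
}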
 
Couldn't agree with you more.
Chris Tacke said:
Loading 30k records, no matter what the engine, is going to suck. <snip>
 
WR [Sun, 7 Sep 2003 18:54:26 -0400]:
IMHO, 30,000 records on a PDA won't ever be that quick or easy to see.

You don't get around much. 30k is only where things start to even be
measurable, if you know the right (good) software. If you only know the
wrong (bad) software, I can see where you might say that.

PXA255/400: Insert 16,000 items/sec (insert rate: 960,000 items/minute)
            Read 333,000 items/sec (read rate: 20+ million items/minute)
PXA250/400: Insert 7,000 items/sec (insert rate: 425,000 items/minute)
            Read 200,000 items/sec (read rate: 12+ million items/minute)
SA1110/206: Insert 3,900 items/sec (insert rate: 234,000 items/minute)
            Read 185,000 items/sec (read rate: 11+ million items/minute)

Notes
------------------------------------------------------------------------------

The index data structure remains balanced during inserts and deletes
(66% node utilization typical)
Deletes physically remove the key and data items
Deleted space is immediately available for reuse
Variable-length keys, variable-length data records
Unlimited sort orders (any and all types, compound, anything) per file
Optimize can produce node utilization > 95% (optimize 100,000 recs in 5 secs)


Full story:

http://40th.com/gt40/bench_gt40_arm.html
 
You're promoting a non-CF-compatible database (one that doesn't support SQL
statements, so far as I could tell) and comparing it to SQL Server CE? I'm
pretty impressed by SQL Server CE to date. That team has done a wonderful
job, and I'm still amazed it's a freebie.

I think a reasonable question regarding retrieving 30K records is whether
the relevant functional requirement has been correctly mapped to a handheld
device design constraint. I'm sure cases exist where one needs 30K records
at once, and this may be one of them, but there may also be a better way to
approach the problem.

<snip quoted benchmarks>
 
Navigating through 30K records on a QVGA screen is close to my idea of hell.
Think hundreds, not thousands. So I concur with Justin -- go back and look
at your solution design and see if you can trim those records down.


Justin Weinberg said:
You're promoting a non-CF-compatible database and comparing it to SQL
Server CE? <snip>
 
Again, your proposal is a non-CF solution, so your comment is irrelevant
here. The point is that for a CF, Windows CE solution, he currently only
has one choice for reasonable performance on this size data set.

There are several non-CF databases that perform substantially better than
SQL CE. I've used a few of them, one of which is *way* faster than 16,000
inserts per second, but it's an academic argument - there's no CF interface
for it, so the original poster couldn't use it for his solution.

--
Chris Tacke, eMVP
Advisory Board Member
www.OpenNETCF.org
---
Windows CE Product Manager
Applied Data Systems
www.applieddata.net


<snip quoted benchmarks>
 
There is a trick you can use...

Obtain a total count of all the records on your device (30,000). Place a
separate scrollbar on your form (not attached to the data grid) and set the
max value of that scrollbar to the high limit you just retrieved. Next, if
your display window can show say... 20 records at a time - then load 200-
300 records into an array. Use your scrollbar to navigate the record array,
and if the user ever goes over the cached amount - load the next array. For
extra speedy response times... have a previous, current and next array of
records.

Essentially - this is a programmatic method for paging a recordset... which
is exactly what the SQL query window in Enterprise Manager uses.
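
[A minimal sketch of that block-caching pager; the class, method, and
LoadBlock helper names are illustrative, not from any library.]

public class RecordPager
{
    private readonly int blockSize;  // rows fetched per query, e.g. 250
    private int blockStart = -1;     // absolute index of cache[0]; -1 = empty
    private object[] cache;          // the currently loaded block

    public RecordPager(int blockSize)
    {
        this.blockSize = blockSize;
    }

    // Call from the scrollbar's ValueChanged handler with the absolute
    // record index; the database is hit only on a cache miss.
    public object GetRecord(int index)
    {
        if (blockStart < 0 || index < blockStart || index >= blockStart + blockSize)
        {
            blockStart = (index / blockSize) * blockSize;  // align to block
            cache = LoadBlock(blockStart, blockSize);
        }
        return cache[index - blockStart];
    }

    // Placeholder for the real data access (e.g. a SQL CE query with an
    // indexed WHERE clause); returns dummy rows so the sketch runs.
    private object[] LoadBlock(int offset, int count)
    {
        object[] rows = new object[count];
        for (int i = 0; i < count; i++)
            rows[i] = "record " + (offset + i);
        return rows;
    }
}

[Rick's previous/current/next refinement would keep three such blocks and
swap them as the user scrolls, so the next block is usually already loaded.]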

I have used this method on the handheld navigating student enrollment
records (one had 70,000 entries) and the response time was quite good.

Rick Winscot
rickly@zyche dot com
 
I am curious, have you tried this without the caching?

We do a similar thing except without the cache, and speed is quite fast. I
figure the cache doesn't really achieve anything except add another overhead.
Access to a resultset (in our case via ADOCE or OLEDB) is all done in RAM,
and retrieval of records is generally very quick (for both direct table and
query access). I am not sure about an SDF file on a compact flash card in
direct table mode; maybe this is where you see improvements using the cache.
Not sure, but I would be keen to hear comparisons.

Regards,

David.
 
David,

You are right. On the desktop, the cache is unnecessary. However, the
handheld is a different story. I initially tried this process without
caching blocks of records, and the speed was bearable. Caching improved
handheld performance, but there was a balance point between caching and
not caching.

Application response times improved dramatically when pulling smaller chunks
of data from a properly indexed table (the primary key was numeric) but
started to slow down again as the proportion of database fetches increased.
Considering the per-query overhead, this wasn't too surprising. The effect
was even more pronounced with access from a flash card, as the access time
of the flash card adds significant overhead.

Rick Winscot
rickly@zyche dot com
 
Rick,

No, the point I was making is that caching on the handheld (in the scrolling
example) would probably have no beneficial outcome. You will likely see
improvement on the desktop but not on a handheld device. On the desktop you
have the benefit of retrieving multiple rows at once through OLEDB (unlike
on the device, which retrieves them one at a time), but you must also factor
in issues of network latency and even simple bottlenecks like accessing a
data page from a hard disk. Low-level hardware caching on the hard disk and
general caching in the OS would counteract most situations. Those are the
real delays that your described method of caching will definitely improve.

I figured those delays are not applicable on a handheld device (other than,
as I mentioned, with an SDF on a CF card, or RDA) because the SDF file is
typically stored entirely in RAM. The only overhead I foresee is access to
the actual data in the SDF pages and the fact that it has to come through
the engine and OLEDB abstraction layer to the application.



It is interesting to hear you did see significant benefit, though. If I get
a chance I may try caching similar to what you have and determine whether
it results in any real benefit.



Hypothetically, I get the feeling you'll see quite a significant delay if
you suddenly drag the scrollbar past the 200-300 record limit and it has to
pull in a whole 300 records (worst case) instead of just the 20 displayed,
as our method does. Then again, I suppose that is your balance point, and
it depends solely on how your users are viewing data. We never show the
user a list beyond, say, 200-500 records, and we often provide search
functionality which either highlights or filters appropriate records to
justify showing so many records in the first place. Browsing to find
anything over 100 records is ridiculous, and for practicality,
categorisation would be a must as a starting point. Whenever we need to
store something like 30,000 records, we tend to use our own custom data
structures like hashtables and binary files; being able to manipulate the
data at this level is a pain, but it achieves the absolute best (and
usually acceptable) performance.
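
[As a rough sketch of that last idea: a fixed-width binary record file with
an in-memory hashtable index. The layout, sizes, and names are invented for
illustration; a real version would also persist the index.]

using System.Collections;
using System.IO;
using System.Text;

public class BinaryRecordStore
{
    private const int RecordSize = 64;                  // assumed fixed width
    private readonly string path;
    private readonly Hashtable index = new Hashtable(); // key -> file offset

    public BinaryRecordStore(string path)
    {
        this.path = path;
    }

    public void Append(int key, string data)
    {
        using (FileStream fs = new FileStream(path, FileMode.Append, FileAccess.Write))
        {
            index[key] = fs.Position;  // remember where this record starts
            string padded = data.Length > RecordSize
                ? data.Substring(0, RecordSize)
                : data.PadRight(RecordSize);
            byte[] buf = Encoding.ASCII.GetBytes(padded);
            fs.Write(buf, 0, buf.Length);
        }
    }

    public string Read(int key)
    {
        object offset = index[key];    // hashtable lookup, no scan
        if (offset == null)
            return null;
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.Seek((long)offset, SeekOrigin.Begin);
            byte[] buf = new byte[RecordSize];
            fs.Read(buf, 0, RecordSize);
            return Encoding.ASCII.GetString(buf, 0, RecordSize).TrimEnd();
        }
    }
}

[Keeping the whole index in a hashtable costs a few bytes per record but
turns every lookup into a single seek, which is the point of the approach.]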



On another note, have you noticed a dramatic drop in performance once record
size reaches a certain threshold, say 25+ average fields with a couple of
large nvarchars? I imagine we have crossed a page limit in SQL Server CE;
however, the performance is not consistent across platforms and hardware
devices (particularly MIPS vs. ARM). We have noticed certain odd behaviour
(inconsistent performance) and often come to the conclusion that we are
just pushing the engine too hard (possibly related to the amount of RAM
available). I only wish we had some better tools to diagnose these types of
issues. Soon I am going to try out Sybase Anywhere 9.0, as they seem to have
improved their OLEDB interface a little, and see if there's much difference
between the two engines.



I'd be interested to hear any comments, and I will surely let you know if I
find any results to the contrary with regard to the caching.


Thanks for the info.

Regards,

David.
 