SQL db or Flat File storage option

  • Thread starter Thread starter James
  • Start date Start date
J

James

Let's say I have a stand alone app that was conceived/created decades
ago and has been updated through the years. It's a basic CRUD
application with the data stored in a positional flat file. After
data is entered, calculations are performed and are displayed to the
user. If all is well, the user submits the data for batch processing
and awaits results. During this batch processing, the data takes many
forms before again being sent away for even more processing. I'll
call this the legacy app.

Over the last few years, this legacy app has been accompanied by a web
based product providing most of the same features. Naturally, the
developers fired up their SQL Server and built an ASP based web site
with about 60 data collection forms. After data entry, the data in
SQL is transformed into the same positional flat file as in the legacy
app. Then processing is the same. The site's performance has
deteriorated under heavy load in recent years.

It's been suggested, as a way to improve performance, that we
eliminate the SQL storage and perform the CRUD operations directly on
the positional flat file by way of CGI. I guess the idea is to cut
out the middle man. My suspicion is that this was suggested on a whim
and someone ran with it.

This sounds like a large amount of file I/O and I would prefer to use
ado.net and SQL server but am meeting resistance. I don't mean to
start a flat file –vs- db flame war and if I'm in the wrong group
please let me know.

My question is this…
With about 15k users/day and about 400 pieces of data each, would
anyone really use a flat file as their data storage option, why or why
not?

-Cliff
 
James said:
Let's say I have a stand alone app that was conceived/created decades
ago and has been updated through the years. It's a basic CRUD
application with the data stored in a positional flat file. After
data is entered, calculations are performed and are displayed to the
user. If all is well, the user submits the data for batch processing
and awaits results. During this batch processing, the data takes many
forms before again being sent away for even more processing. I'll
call this the legacy app. Yuck

Over the last few years, this legacy app has been accompanied by a web
based product providing most of the same features. Naturally, the
developers fired up their SQL Server and built an ASP based web site
with about 60 data collection forms. After data entry, the data in
SQL is transformed into the same positional flat file as in the legacy
app. Then processing is the same. The site's performance has
deteriorated under heavy load in recent years.
Ok, but why is that? If you have properly constructed tables and targeted
indexes, join correctly etc you can blast select queries on 20 million
records in no time flat (of course the server has a bearing on this but my
point is that SQL DB's are VERY VERY Fast if used correctly). You can then
use Bulk Load for instance or similar techniques at prescribed periods so
you aren' doign twice the IO all the time.

The data has to write to sql and then write to the flat file, if you do this
each time, sure it's going to be slower than just having one IO Write. But
that's not SQL's fault and cutting SQL out of the equation may fix this one
issue, but open up a bunch of new ones.
It's been suggested, as a way to improve performance, that we
eliminate the SQL storage and perform the CRUD operations directly on
the positional flat file by way of CGI. I guess the idea is to cut
out the middle man. My suspicion is that this was suggested on a whim
and someone ran with it.
So how woudl you do a Point in Time restore with your flat file if the drive
controller went bad on the hard drive? How would you implement a standby
server if you needed it? What if you had 5 times your current load...how
would you upscale it with a flat file?
This sounds like a large amount of file I/O and I would prefer to use
ado.net and SQL server but am meeting resistance. I don't mean to
start a flat file -vs- db flame war and if I'm in the wrong group
please let me know.
IMHO, those people should be unemployed and/or working in another industry.
They haven't done their homework from the sounds of things and taking a
shortcut got them into this situation, so sure, let's make another mistake
to fix it.

With ADO.NET you can use both disconnected and connected objects giving you
a great deal of power and flexibilty. Plus you get all of the other
benefits associated with ADO.NET, all of the premade objects , plus all the
future enhancements. Don't figure Stored Procedures will run real well on
your flat file. Dishing out permissions will be really fun etc.

I'll agree that you can use a Flat file as a Datasource and still get most
of the benefits of ADO.NET, but SQL Server/Oracle etc were ADVANCES over
text files, they were put in place to solve many of these problems, and they
do a great job of it.
My question is this.
With about 15k users/day and about 400 pieces of data each, would
anyone really use a flat file as their data storage option, why or why
not?

How many users did the app have back when it worked only with the flat file?
I"ve stated why I wouldn't use it above. Log shipping, backup and restore
functionality, security administration and just about everything else I can
think of. What if some hacker got your flat file? Boy that would be hard
to read wouldn't it. You're drive controller will probably go bad at some
point in the server's life, that won't be pretty. Oh well, you can always
go back to the full file and copy it over, the data you needed since the
last backup isn't all that important anyway, right?

If this is the whole story then they really need to rethink this...it's
almost hard to think someone is seriously suggesting this.



--

W.G. Ryan, eMVP

http://forums.devbuzz.com/
http://www.knowdotnet.com/williamryan.html
http://www.msmvps.com/WilliamRyan/
 
Thanks for the insite William. It is comical but someone was
seriously suggesting this. With your comments, plus (like you said)
everything else I could think of, I beleive we're convinced that
ado.net is not just cool and hip but can actually do something.
 
Hi James:
Sounds like you may be outnumbered in a flat file versus database discussion. Yes - you are in the wrong group!
To name just a few benefits of SQL that flat files won't give you: indexes, compiled stored procedures with optimized query plans, and data integrity.
Here is a pointer you can use to further your case: http://www.microsoft.com/sql/evaluation/casestudies/scalability.asp#throughput

Based on your description the precise interaction between the new web application and the legacy application is unclear. A couple questions that may help you decide how much SQL can help/hinder the scenario:

What is the bottleneck of the system? Is it disk I/O, CPU, memory, network, other? What application takes up most of your system resources?

Can the legacy application be replaced or removed from the response of the web server? How much of the data-processing can be ported from the legacy applicaton to SQL? If you're just using SQL as an intermediate holder for data and letting the legacy app do all of the work then you may not be gaining much. If the legacy application *must* do all of the processing of flat files - can it be done offline or on another [set of] server?


This posting is provided "AS IS" with no warranties, and confers no rights.
 
I'll agree that you can use a Flat file as a Datasource and still get most
of the benefits of ADO.NET, but SQL Server/Oracle etc were ADVANCES over
text files, they were put in place to solve many of these problems, and they
do a great job of it.

While I agree with your overall theme, SqlServer and Oracle solve a specific
problem - the relational problem. I have several customers that have ported
to Oracle/SqlServer on their own, saw performance drop through the bottom,
and I am there now un-porting and rewriting.

Oracle and SqlServer do a great job of managing relational objects and
multiuser data access (e.g. reads and writes), but both are abysmal failures
are replacing flat file non-relational data stores - mostly because one
experiences as 10x performance hit in I/O. Ultimately, one has to migrate
away from the relational store or implement the processing closer to the
store (e.g. stored procedures)

regards
roy fine
 
Back
Top