File system or Database

  • Thread starter Thread starter Jon
  • Start date Start date
J

Jon

Hello all,

Just wanted to get some other options to what the best approach is for
an application we're looking to develop.

A 3rd party application will be dumping some files (PDF's) onto a
share drive, that we then need to take and display to an end user via
the internet.

There will be approximately 20 documents dumped per customer per week,
and there's about 100 customers, so 2000 documents per week, 8000 per
month.

There are three options that we're looking at:

1) Read these files direct from the file system and display to the
relevant user;
2) Have an application insert these documents as BLOBS into a database
and read from the database;
3) Read the meta data about the file (name, date created etc) into the
database, but not the file, so effectivly just insert a pointer to the
file location.

Out of the above, what are peoples thoughts?

Any help would be much appreciated,

Jon
 
3) Read the meta data about the file (name, date created etc) into the
database, but not the file, so effectivly just insert a pointer to the
file location.

This gives you the best of both worlds. You can search/list/sort etc the
files but the actually viewing of it is done via the filesystem and not the
database.
 
You're talking about using a database, but what for? You have not mentioned
storing any information in the database that is not included in the files
themselves already, so what would the extra database component be used for?
Unless you are going to store some data about the files that is not
contained in the file system entries for the files, there is no reason to do
anything but store the files in the file system and serve them to the
client.

--
HTH,

Kevin Spencer
Microsoft MVP

DSI PrintManager, Miradyne Component Libraries:
http://www.miradyne.net
 
I'd tend to lean toward storing the files in SQL Server unless performance
testing with realistic data demonstrated that it would not work out well for
this app.
If you store files in the file system then you usually must write a whole
lot of mundane file management code, such as dealing with filename
conflicts, cleaning up old files, protecting them from unauthorized access,
logging events relating to the files, etc. If you use SQL Server, it
essentially does all that for you so you end up writing less code and
instead leverage the strengths of existing technologies..
Here's more info:
http://SteveOrr.net/articles/EasyUploads.aspx
 
From developer's perspective storing such data as BLOBs is a lot
cleaner solution, mainly because of referential integrity is difficult
to enforce (and maintain) between header records in database and
actual data on file system (i.e. possiblity of orphan header records
without files or orphan files without header records).

However DBAs do not like BLOBs. They perceive BLOBs as performance
hog, riskier (reportedly,,, one may loose access to all files if one
byte goes corrupt) and trickier to deal in replication/log-shipping/
recovery scenarios (more complex transaction logging).
 
From developer's perspective storing such data as BLOBs is a lot
cleaner solution, mainly because of referential integrity is difficult
to enforce (and maintain) between header records in database and
actual data on file system (i.e. possiblity of orphan header records
without files or orphan files without header records).

However DBAs do not like BLOBs. They perceive BLOBs as performance
hog, riskier (reportedly,,, one may loose access to all files if one
byte goes corrupt) and trickier to deal in replication/log-shipping/
recovery scenarios (more complex transaction logging).
 
From developers perspective storing such data as BLOBs is a lot
cleaner solution, mainly because of referential integrity is difficult
to enforce (and maintain) between header records in database and
actual data on file system (i.e. possiblity of orphan header records
without files or orphan files without header records).

However DBAs do not like BLOBs. They perceive BLOBs as performance
hog, riskier (one may loose access to all files if one critical byte
goes corrupt) and trickier to deal in replication/log-shipping/
recovery scenarios (more complex transaction logging).
 
In developers' perspective storing such data as BLOBs is a lot cleaner
solution, mainly because of referential integrity is difficult to
enforce (and maintain) between header records in database and actual
data on file system (i.e. possiblity of orphan header records without
files or orphan files without header records).

However DBAs do not like BLOBs. They perceive BLOBs as performance
hog, riskier (reportedly,,, one may loose access to all files if one
byte goes corrupt) and trickier to deal in replication/log-shipping/
recovery scenarios (more complex transaction logging).
 
A few snooty DBA's don't deter me much. There is no perfect solution.
Plenty of things can go wrong in a file system too.
Just because BLOBS can cause performance issues in some situations doesn't
mean they always do. It has much to do with the size of the blobs and how
frequently they are accessed & modified. I do recommend doing some
performance especially testing if performance is a critical issue.
 
I read ( never actually tested) that string with MS SQL 2000 there is no
performance issues with having BLOBs in SQL.
If your BLOB is less than 2K then MS SQL keeps it in separate space from DB
and have only reference to it in a field.
If less then BLOB is part of the record as any varchar for example would be.

thus pretty much implementing same thing as you would with custom solution
and having BLOBs as separate files.
The benefits are obvious though. Uniform backup for example


George.
 
Back
Top