Article Storage: Files vs. Database

  • Thread starter: Jonathan Wood

Jonathan Wood

I'd like to build a Website that contains many articles. Two basic
approaches are to either store the articles in aspx files, possibly indexed
by the database, or to store the article text in the database.

Some advantages of storing them in files are simplicity, and efficiency.

Some advantages of storing them in the database are ease of some operations,
and the option of using SQL Server 2005 text index to implement search.

Can anyone else offer some considerations for choosing between these two
approaches?

Thanks.

Jonathan
 
You said "Some advantages of storing them in files are simplicity, and
efficiency"
I think you are mistaken here.

Simplicity: I do not see any difference between pulling article content from
a database and from a file. Actually, it would be easier to use a DB, since
besides the content you might have a bunch of additional properties assigned
to the article, like "Topic", thumbnail image, and header. When you need to
show a list of today's articles, what are you going to do if you keep them as
files?

Efficiency: It's much more efficient to pull data from a DB than from the file
system, simply because a DB is designed for that. It offers indexes and such.
The file system does not have them. It always scans the folder in order to
find the file.




George.
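
To make the "today's articles" point concrete, here is a minimal sketch of article metadata stored next to the text. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and the table and column names (articles, topic, published) are hypothetical, not taken from anyone's actual site.

```python
import sqlite3
from datetime import date

def make_db():
    """Create an in-memory database with a hypothetical articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE articles (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            topic     TEXT NOT NULL,
            published TEXT NOT NULL,  -- ISO date string, e.g. 2007-05-01
            body      TEXT NOT NULL
        )
        """
    )
    # An index on the publication date lets the date filter avoid a full scan.
    conn.execute("CREATE INDEX idx_articles_published ON articles (published)")
    return conn

def add_article(conn, title, topic, published, body):
    """Insert one article; `published` is a datetime.date."""
    conn.execute(
        "INSERT INTO articles (title, topic, published, body) VALUES (?, ?, ?, ?)",
        (title, topic, published.isoformat(), body),
    )

def articles_published_on(conn, day):
    """Return (title, topic) pairs for articles published on the given day."""
    rows = conn.execute(
        "SELECT title, topic FROM articles WHERE published = ? ORDER BY title",
        (day.isoformat(),),
    )
    return rows.fetchall()
```

With file-only storage, the same list would need the date and topic to be encoded in file names or kept in a separate index, which is the "indexed by the database" variant from the original question.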
 
We use databases almost exclusively for this kind of "stuff". It's easier to
search, index, etc., and the content can be populated into a "templatized"
article page. It can also be cached via partial page caching.
Peter
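
The "templatized page plus caching" idea can be sketched as follows. This is not ASP.NET partial page caching; it swaps in Python's string.Template and functools.lru_cache to show the same shape, and the ARTICLES dictionary is a hypothetical stand-in for a database table.

```python
from functools import lru_cache
from string import Template

# One shared page template; each article only supplies the values.
PAGE = Template('<h1>$title</h1>\n<div class="article">$body</div>')

# Hypothetical stand-in for a database table keyed by article id.
ARTICLES = {
    1: {"title": "Files vs. Database", "body": "Article text goes here."},
}

def fetch_article(article_id):
    """Pretend database read; a real site would run a query here."""
    return ARTICLES[article_id]

@lru_cache(maxsize=128)
def render_article(article_id):
    """Render an article into the template; repeat calls reuse the cached HTML."""
    article = fetch_article(article_id)
    return PAGE.substitute(title=article["title"], body=article["body"])
```

After the first request for an article, later requests are served from the cache and never reach fetch_article, which is the effect output caching aims for.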
 
George,
Simplicity: I do not see any difference between pulling article content
from a database and from a file. Actually, it would be easier to use a DB,
since besides the content you might have a bunch of additional properties
assigned to the article, like "Topic", thumbnail image, and header. When you
need to show a list of today's articles, what are you going to do if you keep
them as files?

Efficiency: It's much more efficient to pull data from a DB than from the
file system, simply because a DB is designed for that. It offers indexes and
such. The file system does not have them. It always scans the folder in order
to find the file.

Well, I'm looking for input. But I personally think a link to an existing
file is simpler than loading data from a database. And I hear all the time
how loading a straight file is more efficient than one that is loaded from
the database.

Jonathan
 
Yeah, these are definitely some of the advantages. Have you made use of SQL
Server 2005's full-text indexing yet? With file-based articles, implementing
search is a pain.

Also, would love to see some samples of the sites you are referring to if
any of them are public.

Jonathan
 
That was my point. The efficiency and ease of working with static files are
only an illusion.
You will end up with a nightmare if you need to change the logo or something
like that in thousands of static pages.


George.
 
Implementing search functionality is a pain. You might do it yourself or get
a third-party solution like dtSearch, or an open-source one (just google
"search engine open source").

But in my experience, SQL Server's full-text search does not do a good job.

George.
 
I'm not sure what you meant by "that was my point" as you seemed to be
making a different point.

As far as changing the appearance of the pages, stylesheets and master pages
should prevent the need to mess with the pages once created.

I'm not saying I would not choose to use databases. But I'm afraid I don't
see simplicity and efficiency as the reasons to do so.

Thanks.

Jonathan
 
Can you elaborate on this? The full-text indexing was created for exactly
this purpose. Why does it not do a good job, or is a pain to use?

Thanks.

Jonathan
 
Yes, I agree that a database will be needed for one purpose or another.

But then how good are my options for implementing search functionality?

Thanks.

Jonathan
 
In reality it never works well.
I have a website, http://www.mspiercing.com; if you look at it you are going
to see a search box there.

The first problem I had was misspellings. People on the internet do not know
how to spell things, and SQL's full-text search does not handle misspellings.

Another problem was plural vs. singular forms.

Another problem was weights. If a person is looking for "belly ring", I want
the search to find and rank items with the word "belly" first, simply because
"ring" exists in pretty much every item.


George.
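
The three problems listed above (misspellings, plural vs. singular, and term weights) can be illustrated with a small hand-rolled search. This is only a sketch under loud assumptions: it is not how SQL Server full-text search works, the plural rule is deliberately naive, difflib stands in for a real spelling corrector, and the item names are made up.

```python
import math
import re
from difflib import get_close_matches

def normalize(word):
    """Lowercase and apply a deliberately naive plural rule (rings -> ring)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word

def tokenize(text):
    """Split text into normalized alphabetic words."""
    return [normalize(w) for w in re.findall(r"[A-Za-z]+", text)]

class TinyIndex:
    def __init__(self, items):
        self.items = list(items)
        self.tokens = [set(tokenize(item)) for item in self.items]
        self.vocab = sorted(set().union(*self.tokens))
        n = len(self.items)
        # Words found in nearly every item ("ring") weigh less than rare ones.
        self.weight = {
            term: math.log(1 + n / sum(term in t for t in self.tokens))
            for term in self.vocab
        }

    def correct(self, term):
        """Map an unknown term to the closest known word, if one is close enough."""
        if term in self.weight:
            return term
        close = get_close_matches(term, self.vocab, n=1, cutoff=0.8)
        return close[0] if close else term

    def search(self, query):
        """Return matching items, highest total term weight first."""
        terms = [self.correct(t) for t in tokenize(query)]
        scored = []
        for item, tokens in zip(self.items, self.tokens):
            score = sum(self.weight[t] for t in terms if t in tokens)
            if score > 0:
                scored.append((score, item))
        # sorted() is stable, so ties keep their original order.
        return [item for score, item in sorted(scored, key=lambda p: -p[0])]
```

For example, with the made-up items "Gold belly ring", "Silver nose ring", "Titanium belly rings", and "Steel lip ring", the query "bely rings" is corrected and normalized to "belly ring", and the two belly items rank ahead of the others because "ring" appears in every item and so carries little weight.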
 
You said "I hear all the time how loading a straight file is more efficient".
My point was that this is an illusion. That is why many people still
believe it, and why you hear it all the time.
The file system is designed as a sequential list. If you need to find the
file a.txt in it, there is no other way but to scan the whole list.

By contrast, a DB is designed to point to the correct record instantaneously
if you have an index.

Also, MS was planning a long time ago to move the file system to the MS SQL
engine. Not sure if it's still in the works or not.

George.
 
Of those you listed, plural vs singular would be my biggest worry.

Yeah, I understand people don't know how to spell. But if their search
doesn't turn up anything, then I'm not going to feel too guilty when they
misspelled something. Although I understand it's a bit different if your
site is selling stuff, in which case, the site loses when they don't find
what they're looking for.

Thanks.
 
George,
You said "I hear all the time how loading a straight file is more
efficient".
My point was that this is an illusion. That is why many people still
believe it, and why you hear it all the time.
The file system is designed as a sequential list. If you need to find the
file a.txt in it, there is no other way but to scan the whole list.

I understand that. My understanding was that transferring the content was
faster from a file than transferring it over a database connection, and that
it was not so much about look-up time. My understanding was also that this
has been well tested and documented. But I will certainly admit I haven't run
any tests myself.

Also, MS was planning a long time ago to move the file system to the MS SQL
engine. Not sure if it's still in the works or not.

Not sure how great that sounds. Some sort of index to allow a binary lookup
makes sense though.
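
The difference between scanning a list and a binary lookup over a sorted index can be sketched like this. It illustrates the idea only and makes no claim about how any particular file system or database stores its directory or index; the file names are hypothetical.

```python
from bisect import bisect_left

def linear_find(names, target):
    """Scan front to back; return (position, number of names examined)."""
    for examined, name in enumerate(names, start=1):
        if name == target:
            return examined - 1, examined
    return -1, len(names)

def binary_find(sorted_names, target):
    """Binary search over a sorted list; return the position or -1."""
    pos = bisect_left(sorted_names, target)
    if pos < len(sorted_names) and sorted_names[pos] == target:
        return pos
    return -1
```

On a sorted list of 1000 names, the linear scan may examine every entry, while the binary search needs about ten comparisons, since each step halves the remaining range.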
 
Yes, that's exactly the type of thing I'm considering (I'm already using SQL
Server 2005). But I have seen some complaints about this approach. For
example, loading articles from a database does not appear to be as efficient
as simply displaying files. And I think it was this thread where someone
complained about SQL Server's full-text indexing's inability to match similar
or plural forms of a word.

Thanks for the link.
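
Since nobody in the thread has actually measured file reads against database reads, here is a minimal sketch of how such a test could be set up. It assumes a local SQLite file as the database, which avoids the network connection a real SQL Server would involve, so the numbers it produces say nothing definitive about the ASP.NET case; the file and table names are hypothetical.

```python
import os
import sqlite3
import tempfile
import time

def time_reads(read_fn, repeats=200):
    """Call read_fn repeatedly; return (last result, average seconds per call)."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = read_fn()
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats

def compare(article_text):
    """Store the same text in a file and in a table, then time reading each."""
    with tempfile.TemporaryDirectory() as tmp:
        file_path = os.path.join(tmp, "article.txt")
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(article_text)

        conn = sqlite3.connect(os.path.join(tmp, "articles.db"))
        conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("INSERT INTO articles (id, body) VALUES (1, ?)", (article_text,))
        conn.commit()

        def read_file():
            with open(file_path, encoding="utf-8") as f:
                return f.read()

        def read_db():
            row = conn.execute("SELECT body FROM articles WHERE id = 1").fetchone()
            return row[0]

        from_file, file_secs = time_reads(read_file)
        from_db, db_secs = time_reads(read_db)
        conn.close()  # close before the temporary directory is removed
        return {"file": (from_file, file_secs), "db": (from_db, db_secs)}
```

Both timings are heavily affected by operating system and database caching, so a fair comparison would also need realistic article sizes, concurrent requests, and the real database server.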
 