"Google Like" with weighted searches project


Guest

I'm currently working on an ASP.NET / C# / SQL 2000 project in which a web
user enters keywords, and MS Word documents are then searched for those
words. This information will then be used to perform weighted searches on
the keywords and text of multiple MS Word documents. How might this best be
accomplished? Should I perform Full-Text Searches on the Word files or store
the data in a database (by copying and pasting the document into a Web app
page)? If I store it in a database, how would I store more than 255
characters and then be able to do searches on specific words? Thanks in
advance for your reply!
 
Why can't you use Index Server?

--
Regards

John Timney
ASP.NET MVP
Microsoft Regional Director
 
Do you mean Full-Text indexing and Full-Text searches? Do you think this is
the best way to go? Thanks for your reply!
--
DQdontquit


 
Take a look at it and see what it can give you. It's going to save you a lot
of work over reinventing the wheel, and it's programmable from ASP.NET.

--
Regards

John Timney
ASP.NET MVP
Microsoft Regional Director

 
Thanks John, I'll investigate Full-Text indexing and searches and let
everyone know how it works out. I'm not sure what you meant by "it's
programmable from ASP.NET". Could you be more specific? Thanks.
 
The search results (catalogue queries) can be intercepted and searches
triggered, so you can manipulate what's presented or searched for. Worth a
look, given that it comes free on Windows servers (if they are your servers,
that is).

--
Regards

John Timney
ASP.NET MVP
Microsoft Regional Director
 
DQ said:
I'm currently working on an ASP.NET / C# / SQL 2000 project in which a web
user enters keywords, and MS Word documents are then searched for those
words.

Good project.

This information will then be used to perform
weighted searches on the keywords and text of multiple MS Word documents. How
might this best be accomplished? Should I perform Full-Text Searches on the
Word files or store the data in a database (by copying and pasting the
document into a Web app page)?

What would be cool is to copy each word, word by word, into a database,
assigning it an integer for its place in the document, plus some other
information that would be useful in searching (page number, paragraph
number) and so on.

So, you would create a text-search object model based on the atomic unit
in the database: a row for each word (not many words are longer than 255
characters that I know of!) with a clustered index on the place in the
document where the word occurs.

Then you would make a table of noise words and a table of similar words
(singular mapped to plural).

OK, so then you would do a fast search by finding all the words, then
grabbing their places and offering the list to the user. Then you would
make an interface to move the user to that place inside the document.
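
The word-by-word scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a tested design for SQL 2000: the in-memory dictionary stands in for the word table, and the noise-word list, the singular/plural map and the names `build_index` and `search` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical noise-word list and similar-word map, for illustration only.
NOISE_WORDS = {"the", "a", "an", "of", "and", "to", "in"}
SIMILAR_WORDS = {"documents": "document", "searches": "search"}


def normalize(word):
    """Lower-case a word and map known variants onto one canonical form."""
    word = word.lower()
    return SIMILAR_WORDS.get(word, word)


def build_index(doc_id, text, index=None):
    """Record (doc_id, paragraph number, word position) for every non-noise word."""
    if index is None:
        index = defaultdict(list)
    position = 0
    # Paragraphs are assumed to be separated by a blank line.
    for para_no, paragraph in enumerate(text.split("\n\n"), start=1):
        for raw in re.findall(r"[A-Za-z0-9']+", paragraph):
            position += 1  # positions count every word, noise words included
            word = normalize(raw)
            if word not in NOISE_WORDS:
                index[word].append((doc_id, para_no, position))
    return index


def search(index, query):
    """Return the places where each non-noise query word occurs."""
    hits = {}
    for raw in query.split():
        word = normalize(raw)
        if word not in NOISE_WORDS:
            hits[word] = index.get(word, [])
    return hits
```

Each hit carries the document, paragraph and word position, which is what the "move the user to the place" interface would need. In a database the same shape would be one row per word occurrence.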


If I store it in a database, how would I
 
Full-Text Searches are awesome. I created an ASP.NET page that would stream
the uploaded MS Word document into a MSSQL database table. After installing
the Full-Text component on the Windows Server, and using the following stored
procedures:
1. sp_fulltext_database
2. sp_fulltext_catalog
3. sp_fulltext_table
4. sp_fulltext_column
5. sp_fulltext_table
6. sp_fulltext_catalog
to set up the catalog on the MSSQL Server (I could not use the Enterprise
Manager front end because my PC is remote), I was able to perform
searches on the binary data (the Word doc was stored as binary).
However, I have no clue as to how to allow the user to open the Word
document so that they could modify the text and save it again to the
database. John, could you point me in the right direction? Also, how do I
"rate" you? Thanks again for your reply!!!
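
For readers following along, the six stored-procedure calls listed above might look roughly like the batch this Python helper generates. The post does not give the arguments, so the actions assumed here (`enable`, `create`, `add`, `activate`, `start_full`) are a common SQL Server 2000 sequence rather than the poster's exact script, and the table, column, index and catalog names are hypothetical.

```python
def fulltext_setup_statements(table, column, type_column, key_index, catalog):
    """Return T-SQL EXEC statements, in order, for a full-text setup.

    The names are interpolated directly, so pass trusted constants only.
    type_column is the column holding the file extension (e.g. '.doc'),
    which tells the indexer which filter to apply to the binary data.
    """
    return [
        # 1. Enable full-text indexing for the current database.
        "EXEC sp_fulltext_database @action = 'enable'",
        # 2. Create the catalog.
        f"EXEC sp_fulltext_catalog @ftcat = '{catalog}', @action = 'create'",
        # 3. Register the table, naming its unique key index.
        f"EXEC sp_fulltext_table @tabname = '{table}', @action = 'create', "
        f"@ftcat = '{catalog}', @keyname = '{key_index}'",
        # 4. Add the binary column, with its document-type column.
        f"EXEC sp_fulltext_column @tabname = '{table}', @colname = '{column}', "
        f"@action = 'add', @type_colname = '{type_column}'",
        # 5. Activate the full-text index on the table.
        f"EXEC sp_fulltext_table @tabname = '{table}', @action = 'activate'",
        # 6. Start a full population of the catalog.
        f"EXEC sp_fulltext_catalog @ftcat = '{catalog}', @action = 'start_full'",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for statement in fulltext_setup_statements(
        "Documents", "Content", "FileExt", "PK_Documents", "DocCatalog"
    ):
        print(statement)
```

The printed statements could then be run from Query Analyzer or any client connection, which fits the case where Enterprise Manager is not available.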
 
You've done most of the work already by working out how to insert the
object into the DB. You need to work out how to get binary objects out
again; the users then need to save them locally to work on them, and then
upload them again to have them reinserted. It's the only way, I'm afraid:
they can't work on them on the server, so they have to extract them,
do their work and then reinsert them.
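
As a rough illustration of that extract, edit and reinsert cycle, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in for the SQL Server table with its image column, and the table layout and the function names `save_document` and `load_document` are hypothetical.

```python
import sqlite3


def save_document(conn, name, data):
    """Insert the document bytes, replacing any earlier version with that name."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, content) VALUES (?, ?)",
        (name, data),
    )
    conn.commit()


def load_document(conn, name):
    """Return the stored bytes for a document, or None if it is not there."""
    row = conn.execute(
        "SELECT content FROM documents WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])


# SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (name TEXT PRIMARY KEY, content BLOB)")
```

In the web application, `load_document` would correspond to streaming the bytes to the browser as a download, and `save_document` to handling the upload of the edited file.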

Good example here:
http://www.developer.com/net/asp/article.php/3097661

Regards

John Timney
Microsoft MVP
 
Hi,

DQdontquit: Have you finished your project? I'm doing exactly the same
thing: a "Google-like" search using MS full-text search.
Everything is OK, but now I want to show the user some text from the
document, so he can have an idea of what kind of document it refers to.
For this, I have an image field in SQL Server. I can retrieve that particular
row using FTS, but I can't decode that data into a string so that I can show
and highlight the search words.

Also, I don't know if it can be done (decoding the text) directly in T-SQL
(using IFilter) or if I have to do it manually on the .NET side.
I can't find any helpful info, and I can't believe it, because the dirty job
is done by IFilter and the excellent MS FTS index service.
 