open source .NET search engine?

  • Thread starter: susiedba

susiedba

Does anyone know of a framework, tools, or anything else that
describes an open source VB.NET search engine / spider?

Anyone want to trade notes?

I want to build something a lot more focused than Google. For example,
I want to spider Home Depot's websites and sell the data to Lowe's.

Does anyone want to help?

-Susie
 
Susie,

As for your subject, an open source .NET search engine: try
http://www.dotlucene.net/

As for spidering, there are many website copiers out there. Try HTTrack:
http://www.httrack.com/
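
For anyone curious what the core of a spider boils down to, here is a minimal sketch of the link-extraction step, written in Python with only the standard library. It is not how HTTrack works internally; the names `LinkExtractor` and `same_site_links` are made up for illustration, and a real crawler would also need fetching, politeness delays, and robots.txt handling.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def same_site_links(html, base_url):
    """Return de-duplicated links that stay on base_url's host, in page order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    host = urlparse(base_url).netloc
    seen = []
    for link in parser.links:
        if urlparse(link).netloc == host and link not in seen:
            seen.append(link)
    return seen
```

A crawler would call `same_site_links` on each downloaded page and push the results onto a queue of URLs still to visit.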

I am not condoning unethical copyright infringement. However, I'm sure
enough that you will be unable to sell a competitor's website to a large
company like Lowe's that I'm giving you this information anyway. Besides,
I'm sure they have many talented people in their IT department who could
get them this info in no time at all.

As far as building something more focused than Google, good luck. Google,
just like Microsoft, has top computer scientists working on some crazy stuff,
e.g. natural language processing, query analysis, best bets, controlled
vocabularies, and a little artificial intelligence, most of which involves a
fair amount of pretty advanced mathematics. I'm not trying to discourage
you from building something better; there are lots of brilliant
people in this world. Just use your brilliance for something good!

As far as making money, why not use the Amazon E-commerce Web Service? It
gives you access to all of Amazon's products, images, reviews, pricing, and
a remote shopping cart system. You can just mark up the prices a bit and make
money without doing much of anything besides coding a website.
http://aws.amazon.com
http://www.google.com/search?hl=en&q=amazon+web+service

Chris
 
Try Lucene.NET, Microsoft Index Server, or SQL Server Full-Text Search.
 
Thanks, guys. Does anyone else have any ideas?

 
I really do think that there is room for a new service.
I am just going to put some sort of scope on my project, instead of
pulling a Napoleon like Google does and trying to enter EVERY MARKET at
the same time.

Moving from search engines to spreadsheets, IM, email, Usenet, books,
eCommerce... I just don't think that Google is 'big enough' to be
successful in any of these new markets.

Which means that there is room for innovation.

I've had many customers ask me to pull XYZ off of site ABC.

I personally think that a simple search engine should consist of a
couple of OLAP servers, a couple of relational boxes, and a couple
of crawlers. Not too complex at all.

-Susie
 
And it goes without saying that I think that Microsoft is completely
and utterly incompetent.

They're pulling a Napoleon also. They need to sell their Xbox and MSN
divisions and get back to their core competencies.

Supposedly they have 5,000 developers and testers working on Vista AND
Office 2007.

What the hell are the other 60,000 employees doing?
 
Microsoft does have more than two products: .NET, VS.NET, SQL Server :-)
 
Good luck if you don't think it's complex - there's a reason why Google
hires a lot of PhDs!
 
Not really; .NET, VS.NET, and SQL Server sure don't take more people than
Office and Windows.

And what about revenue?

Doesn't 80% of their revenue come from Office and Windows?
 
The reason they hire a lot of PhDs is that they reinvent the
wheel.

I don't see a need to write your own operating system, web browser, or
database engine.
I don't see a need to write everything in crazy-ass-complex AJAX.

I just think that they're too trendy and not functional enough.

I've also had a half dozen clients ask me if I can crawl website X and
give them data XYZ.

And I don't think that there is an enterprise-level search engine. I
mean, store it in a database and allow simple queries against the
database.
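
To make the "store it in a database and run simple queries" idea concrete, here is a rough sketch in Python using the built-in `sqlite3` module as a stand-in for MySQL or SQL Server. The table name `postings` and the functions `build_index` and `search` are hypothetical, chosen only for this example. It shows a bare inverted index with term-count ranking, and nothing like the relevance work a production engine does.

```python
import re
import sqlite3


def build_index(pages):
    """pages: dict of url -> text. Returns an in-memory SQLite connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE postings (term TEXT, url TEXT, freq INTEGER)")
    conn.execute("CREATE INDEX idx_term ON postings (term)")
    for url, text in pages.items():
        # Count how often each lowercase word appears on this page.
        counts = {}
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?, ?)",
            [(term, url, freq) for term, freq in counts.items()],
        )
    return conn


def search(conn, query):
    """Return URLs containing every query term, highest total term count first."""
    terms = sorted(set(re.findall(r"[a-z0-9]+", query.lower())))
    if not terms:
        return []
    placeholders = ",".join("?" for _ in terms)
    rows = conn.execute(
        f"SELECT url, SUM(freq) AS score FROM postings "
        f"WHERE term IN ({placeholders}) "
        f"GROUP BY url HAVING COUNT(DISTINCT term) = ? "
        f"ORDER BY score DESC, url",
        (*terms, len(terms)),
    ).fetchall()
    return [url for url, _ in rows]
```

The query side really is one `SELECT`. What this leaves out is stemming, phrase matching, link analysis, and scaling past one machine, which is where most of the difficulty lives.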

I just think that if I had a good datamart and a couple of OLAP servers,
I could run circles around Google.
 
I mean, seriously.

If you can use OLAP to look for similar words, etc.,
I just don't think that it would be that complex.

And you people who sit around and think that Google is worth more than
IBM?

LAUGHABLE.

I just think that it's a shame. I would love to help you, Susie,
because I think that there IS a better solution.

Instead of using 100 different languages from Perl to PHP to BigTable to
GoogleOS, I just call hogwash on their ass.

Maybe we should build a project on SourceForge. Does anyone have an
idea?

Literally: a database-driven search engine, where most everything lives in
MySQL with a couple of SQL Server boxes at the top of the equation.
 