Comments & Opinions



The quest for the lost page

Updated!

To search or not to search? That is the question. No, it ain't a question at all. You search, that's it.

No, I'm not talking about women (sic), I'm talking about the search function of my site, which has been disabled since the move to the new CMS. I've been looking since the beginning of the week for a way to bring it back to life.

The previous search function used the Ht://Dig engine, which is a good product. Or rather, it was. The project seems dead: the last version (a beta) is from 2004 and nothing has been released since. Moreover, that last version (like all the older ones) does not compile with the latest versions of GCC, so you have to struggle to get it to work.

Sure, a pre-compiled one is distributed with Slackware, but... what I'd like to do is get this thing a little more integrated with my system: being able to put a search box wherever I want and get the results directly into my code, without having to call an external - static - web page as a wrapper. That's why I'm trying to recompile the thing.

But after a full day spent struggling with this thing (who said that C++ is self-documenting? Liar!) I'm still here with a bunch of files that refuse to turn into an executable.

Long ago, while I worked for $wesavetheworlddotcom, I used a nice thing called Asp Seek, but that one too seems to have been dead since 2003 and, as usual, it doesn't compile.

So I began to think about various alternatives:

  1. Get another engine.

    Yeah, right. Easy to say. But finding one that ain't written in PiHateyouverymuchP, doesn't use Java/Python/Ruby/whattheheck and doesn't require jumping through hoops to integrate with my system is not so easy.

  2. Just use bloody Google

    That is the easy way out. But... I don't like the fact that they put ads on the result page. Hey, I get it: that's how they make money, unless you pay them. But they make money and I don't... What did you say? Adding ads to the site? Dude, don't mention it... And there is the other problem: big G works fine if your site is on the internet, but it doesn't if you want to use it on an intranet.

  3. Do-it-yourself

    That should be the perfect solution. However, I don't have the time or the will to do it. So I spent the whole day yesterday trying to find a ready-made search engine "base" to integrate. And I couldn't find anything.

Bah, maybe I'm going to remove the search function altogether.

And the search functionality is back on line!

Yep, were you scared? Well, thanks to everyone that spamm.. hemmm... that sent me lots of pointers to search engine projects that weren't dead. In fact I had already begun to look at a few, and in the end I've picked swish-e as the new "search engine".

I've updated the source code in SVN with the new 'search' function; now I have to update the documentation and then rebuild the package.

Thanks

Davide Bianchi
22/07/2009 18:25

Comments are added when and, more importantly, if I have the time to review them, and after removing Spam, Crap, Phishing and the like. So don't hold your breath. And if your comment doesn't appear, it's probably because it wasn't worth it.

19 messages. This document does not accept new posts.
Search engines By Anonymous coward - posted 22/07/2009 10:50
I don't know if they can be useful to you, but I found the following links:

mnoGoSearch web search engine software
http://www.mnogosearch.org/

Xapian
http://xapian.org/

Apologies if, instead, I've only wasted your time.

--
Anonymous coward


Better nothing than Java? By maxxfi - posted 22/07/2009 10:59

Just to understand where you stand: better no search engine at all than Java/Python/Ruby/whattheheck?
(Java gets on my nerves quite a bit too, so I understand you ;) )

--
maxxfi


@ maxxfi By Davide Bianchi - posted 22/07/2009 18:13

> Just to understand where you stand: better no search engine at all than Java/Python/Ruby/whattheheck?

No, better nothing than "a huge pain in the neck, a waste of time, days and days of tireless work for a feature that, basically, I'm not going to die without".

As I've said ad nauseam: I maintain the site in my spare time. Instead of sitting here debugging C++ code written 5 years ago, I could be out riding my motorbike.

--
Davide Bianchi


@ Davide Bianchi By Sottilette - posted 23/07/2009 13:08

> As I've said ad nauseam: I maintain the site in my spare time. Instead of sitting here debugging C++ code written 5 years ago, I could be out riding my motorbike.

Couldn't agree more!
I'm itching to get my Cbr back on the road.... and Google does a really decent job; the Ibm.com site uses Inktomi as its search engine, yet I find what I'm looking for faster with big G. Maybe I'm just more used to the latter.


--
bye bye my Honda CBR bike.... looking for some Bayern Horses?


@ Davide Bianchi By Anonymous coward - posted 23/07/2009 17:08

> As I've said ad nauseam: I maintain the site in my spare time.
> Instead of sitting here debugging C++ code written 5 years ago, I could be out
> riding my motorbike.

Only because it's summer, and the parks are full of Dutch girls. As soon as the rainy season returns (in two weeks) the site will have its search engine.

--
G

--
Anonymous coward


Search... By Anonymous coward - posted 22/07/2009 11:05

I read on the CMS FdT presentation page that you don't rule out using Java for CMS FdT Next Generation.
Why not start right away with Lucene as the search engine?

On our intranet it does its "dirty work" respectably.

--
Anonymous coward


@ Anonymous coward By Davide Bianchi - posted 22/07/2009 18:14

> ...you don't rule out using Java.
> Why not start right away with Lucene as the search engine?

"I don't rule out" != "definitely"

--
Davide Bianchi


perl search engines By Anonymous coward - posted 22/07/2009 11:09

Have you perhaps already seen this page full of Perl stuff?
http://www.thefreecountry.com/perlscripts/searchengines.shtml

--
Anonymous coward


CLucene? By Anonymous coward - posted 22/07/2009 11:09

http://sourceforge.net/projects/clucene/

--
Anonymous coward


search engine By Anonymous coward - posted 22/07/2009 11:20

I don't know how hard it is, but the fact that you have all the articles in the database should make it easy to have a dedicated search engine that looks in the right tables...

--
Anonymous coward


@ Anonymous coward By Davide Bianchi - posted 22/07/2009 18:17

> I don't know how hard it is, but the fact that you have all the articles in the database should make it easy to have a dedicated search engine that looks in the right tables...

Yes, in fact I was thinking the same. BUT if you want full-text search you need to wave bye-bye to InnoDB tables and referential integrity. Or you need to do a simple "partial" match that can take a lot of time.

Moreover, this way you don't have a clue about which document "matches" best what you're looking for. Unless you parse the doc, get a list of the words, then count the individual matches for each word and...

It looks easy... on the surface... it's only when you start thinking about all the nitty-gritty bits that it turns out to be not so easy.
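
To give an idea of the "count the matches" approach described above, here is a minimal sketch in Python. It is an illustration only, not the code the site uses: the `tokenize` and `rank` functions and the sample `ARTICLES` are made up for the example, and real articles would be rows read from the database.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(documents, query):
    """Return (title, score) pairs, best match first.

    The score is simply how many times the query words occur in the
    document body; documents that match nothing are left out.
    """
    words = set(tokenize(query))
    results = []
    for title, body in documents.items():
        counts = Counter(tokenize(body))
        score = sum(counts[w] for w in words)
        if score > 0:
            results.append((title, score))
    # Highest score first; ties are broken alphabetically by title.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


# Hypothetical articles standing in for rows read from the database.
ARTICLES = {
    "The quest for the lost page": "search engine search box search page",
    "Tales from the Machine Room": "the server died and nobody could search the backup",
    "Faq": "frequently asked questions",
}

if __name__ == "__main__":
    for title, score in rank(ARTICLES, "search engine"):
        print(score, title)
```

Even this toy version shows where the nitty-gritty bits hide: it rescans every document on each query, and it knows nothing about stop words, word variants or document length. A real indexer does that work once, ahead of time, and stores the result in an index.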

--
Davide Bianchi


Have you tried... By Anonymous coward - posted 22/07/2009 11:41

http://swish-e.org/

--
Anonymous coward


Take a look at this one .... By Luigi - posted 22/07/2009 11:56

If you have the time, the will and the patience
http://sourceforge.net/projects/opensearchserve/

Ciao, luigi

--
Luigi


This one? By Anonymous coward - posted 22/07/2009 12:18

swish-e; I don't know if it suits you, but it's a start

--
Anonymous coward


...oh no, come on....... By Oriano - posted 22/07/2009 13:39

....it was so handy.
If I wanted to reread some story, I'd find it in a moment.
Come on, focus....are you focused? More focused....

--
Oriano


@ Oriano By Davide Bianchi - posted 22/07/2009 18:18

> ....it was so handy.

And why else do you think I'm here cracking my skull over it?

--
Davide Bianchi


Indexers and search engines By Anonymous coward - posted 22/07/2009 15:06

I had a look at the ht://Dig and Asp seek projects that you used previously, and it seems to me that Swish-e http://swish-e.org/ and The Lemur Toolkit http://www.lemurproject.org/ are both written in C / C++ and can be used through CGI scripts.
I have never used them, but from what I've read in their documentation they seem suitable to replace Ht://Dig respectably and save the search function.

--
Anonymous coward


search engine By Anonymous coward - posted 22/07/2009 16:21

With the caveat that I haven't had a proper look at it yet, there is this one:
http://mnogosearch.org/
Seeing it at work on their own site, it looks efficient.
Ciao
G.

--
Anonymous coward


One has to wonder... By dpantaleo - posted 24/07/2009 18:51

who actually uses that page...
the true fans of D know the titles by heart by now :D

--
dpantaleo
"Nemo reverte ab nos..."




This site is made by me with blood, sweat and gunpowder. If you want to republish or redistribute any part of it, please drop me (or the author of the article, if it's not me) a mail.


This site was composed with VIM; now it is composed with VIM and the (in)famous CMS FdT.

This site isn't optimized for viewing with any specific browser, nor does it require special fonts or resolution.
You're free to see it as you wish.
