Several people have asked the same question, so I'm going to include a
HOWTO doc in the next release explaining some common usage. Until then,
the best thing is to read bin/odeum_mgr, which shows a typical set of
indexing and searching functions. The odeum_mgr file is pretty small, so
hopefully it's easy to follow.
In your case I'd say that you have a few options depending on how you
store/use your data:
1. Write a stand-alone process that periodically goes through the
database, compares modified times of articles, and updates the odeum
index on disk. It should also periodically cull (remove) articles which
don't exist anymore.
2. Add to your "store/edit/delete article" procedure a small side
operation that also indexes all of the text in the article with
Ruby/Odeum after it is put in the database. This makes the search
reflect changes right away.
3. A kind of hybrid where you have a thread in your program that
listens to a queue. When an article is stored/updated/deleted you put
a little message on the queue saying "article 333444 deleted". The
thread then just reads stuff off the queue and does the index updating
based on what it's told.
The advantage of #1 is that it's easier to control when the index is
being updated and you don't need to worry as much about read/write
locking since Odeum will do it for you (in theory). This also has a
nice separation since your search feature only opens the database in
read mode, and the indexer is the only thing opening it in write mode.
The disadvantage of #1 is that your articles aren't immediately available
to search.
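Here's a rough sketch of what one pass of the option #1 indexer could look like. Everything in it is made up for illustration: MemoryIndex is an in-memory stand-in for the real index (you'd swap in the actual Ruby/Odeum calls from bin/odeum_mgr), and the articles hash stands in for rows read from your database.

```ruby
# Hypothetical in-memory stand-in for the on-disk index.
# Replace the bodies of these methods with real Ruby/Odeum calls.
class MemoryIndex
  def initialize
    @docs = {} # uri => { text:, indexed_at: }
  end

  def put(uri, text, mtime)
    @docs[uri] = { text: text, indexed_at: mtime }
  end

  def delete(uri)
    @docs.delete(uri)
  end

  # Modified time recorded when the article was last indexed, or nil.
  def indexed_at(uri)
    entry = @docs[uri]
    entry && entry[:indexed_at]
  end

  def uris
    @docs.keys
  end
end

# One pass of the stand-alone indexer. `articles` maps a URI to
# { text:, modified_at: } and stands in for a database query.
def sync_index(index, articles)
  updated = []
  articles.each do |uri, row|
    seen = index.indexed_at(uri)
    next if seen && seen >= row[:modified_at] # index is already current
    index.put(uri, row[:text], row[:modified_at])
    updated << uri
  end

  # Cull anything in the index that no longer exists in the database.
  culled = index.uris - articles.keys
  culled.each { |uri| index.delete(uri) }

  { updated: updated, culled: culled }
end
```

You'd run sync_index from cron or in a loop with a sleep. How often you run it decides how stale the search results can get.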
Option #2 fixes the immediacy problem, but you'll get into some delays
if you have more than one attempt to update the index at the same time.
Odeum does a good job of read/write locking the index using OS level
thread locking (if it's supported), so your biggest risk is your program
crashing in the middle of the index update (which hoses the index
usually). Basically, #2 will drive you insane trying to manage the
writers fighting over the index.
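If you do go with #2, one way to keep the writers in a single process from stepping on each other is to funnel every index update through one Mutex. This is only a sketch under assumptions: db and index are plain Hashes standing in for your database and the Odeum index, and store_article/delete_article are hypothetical names for your own save hooks. It does nothing for separate processes, and it doesn't protect you from a crash in the middle of an update.

```ruby
# One lock shared by every code path that writes to the index.
INDEX_LOCK = Mutex.new

# Hypothetical save hook: store the article, then index its words.
def store_article(db, index, id, text)
  db[id] = text
  INDEX_LOCK.synchronize do
    # Stand-in for the real indexing call: record the unique words.
    index[id] = text.downcase.scan(/\w+/).uniq
  end
end

# Hypothetical delete hook: remove the article from both places.
def delete_article(db, index, id)
  db.delete(id)
  INDEX_LOCK.synchronize { index.delete(id) }
end
```

Writers wait their turn at the lock, which is where the delays I mentioned come from.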
Option #3 is kind of in-between the other two: it's a little easier to
implement and control than #2, not quite as easy as #1, but gives you
nearly the immediate results of #2.
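A minimal sketch of the queue idea in #3, using Ruby's built-in Queue and a single worker thread. The message format ([action, id, text]) and the start_indexer name are invented for this example, and index is just a Hash standing in for the real Odeum index.

```ruby
# Start the one thread that is allowed to touch the index.
# Messages are [action, article_id, text]; a nil message stops the worker.
def start_indexer(queue, index)
  Thread.new do
    while (message = queue.pop) # blocks until something arrives
      action, id, text = message
      case action
      when :store  then index[id] = text  # covers both new and updated
      when :delete then index.delete(id)
      end
    end
  end
end

queue = Queue.new
index = {}
worker = start_indexer(queue, index)

# The rest of the program only ever puts messages on the queue.
queue << [:store, 333444, "first draft"]
queue << [:store, 333444, "second draft"]
queue << [:delete, 333444, nil]
queue << [:store, 333445, "another article"]

queue << nil # tell the worker to finish
worker.join
```

Since only the worker writes to the index, the updates happen one at a time in the order they were queued, so there are no writers fighting each other.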
Now, searching is easy. Once you have the articles indexed and put into
the odeum storage, you simply need to open it and do a search. The
results are a series of Document objects with URIs for names, meta-data
attached, and words you can use to summarize the document. Pretty much
everything you need to find the article in the database and show the
user a summary. In the search results, just show the summary from
odeum, and wait to show them the full article until they click on the
link. That cuts down on database traffic since all of the relevant
words are stored right in the odeum index.
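To show what I mean by building the results page from the index alone, here's a small sketch. Hit is a made-up Struct standing in for a search result, and its field names (uri, title, words) are assumptions for the example, not the actual Ruby/Odeum document API. It also assumes the URI ends in the article's database id.

```ruby
# Hypothetical stand-in for one search result.
Hit = Struct.new(:uri, :title, :words)

# Build a one-line summary from data already held in the index,
# so the results page needs no database query.
def summary_line(hit, max_words = 10)
  snippet = hit.words.first(max_words).join(" ")
  snippet += " ..." if hit.words.length > max_words
  "#{hit.title} <#{hit.uri}>: #{snippet}"
end

# The URI is the key back into the database. Only when the user
# clicks through do you parse it and fetch the full article.
def article_id(hit)
  hit.uri[/\d+\z/].to_i
end
```

You'd call summary_line once per result when rendering the page, and article_id only in the handler for the click.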
Feel free to contact me offline if you want more advice. I'm going to
be making some changes to QDBM Odeum for Mikio, and also including some
more features into Ruby/Odeum in the next release.
On Sat, 2005-04-23 at 06:54 +0900, Oliver Cromm wrote:
* Zed A. Shaw wrote:
> Hello Everyone,
> Just another announcement for the Ruby/Odeum project:
As a long-time user of QDBM with Ruby, I'm very happy about that.
My problem: in the project that I would want to use Odeum for, a News
Archive, the texts are in a database, not in files. Would it be easy to
use the library with that? I don't get a clear idea where to start. The
only way I could think of is to make my own server application that taps
into the database and serves the texts, but I would be happy if there is
a solution with less overhead.