Hi everybody,
I've just released the newest dev release of Lafcadio, 0.7.0, and the bugfix release 0.6.1 for the stable branch.
== What's Lafcadio? ==
An object-relational mapping library for use with MySQL. It supports a lot of advanced features, including in-Ruby field value checking, extensive aid in mapping to legacy databases, and an advanced query engine that lets you form queries in Ruby and run them against either the live database or an in-memory mock store for testing purposes.
Lafcadio is more than a year old and is currently in use on production websites, most notably http://rhizome.org/, an online community that has a 6-year-old legacy database and gets more than 3 million hits a month.
== What's new in 0.7.0? ==
Excessively Clever Query Caching goes like this: Every time you run a select against the DB, Lafcadio caches the results in memory. Then, if you later run a second select whose results are a subset of the first's, Lafcadio detects it, figures out which cached query it's a subset of, filters the cached results in memory, and returns the matches to you. This all happens transparently.
What does this mean? It means a significantly faster app, because if you run these three queries:
select * from users where lname = 'Smith'
select * from users where lname = 'Smith' and fname like '%john%'
select * from users where lname = 'Smith' and email like '%hotmail%'
Lafcadio will only ask MySQL for the results for the first select statement, and do the rest for you without using the DB connection.
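To illustrate the idea, here is a simplified sketch. It is not Lafcadio's actual implementation or API: the `Query` and `CachingStore` classes are hypothetical, a query is modeled as a set of named conditions, and the sketch assumes that two conditions with the same name are the same condition. Under those assumptions, a query that carries every condition of a cached query can be answered by filtering the cached rows in memory.

```ruby
# Hypothetical sketch, not Lafcadio's implementation or API.
# A query is a table name plus a Hash of named conditions; each
# condition is a lambda that tests one row (a Hash).
Query = Struct.new(:table, :conditions) do
  # True when every condition of +other+ is also one of ours, which
  # means our result set must be a subset of +other+'s result set.
  # Assumption: conditions with the same name are identical.
  def subset_of?(other)
    table == other.table &&
      other.conditions.keys.all? { |name| conditions.key?(name) }
  end

  def matches?(row)
    conditions.values.all? { |test| test.call(row) }
  end
end

class CachingStore
  attr_reader :db_hits

  def initialize(rows_by_table)
    @rows_by_table = rows_by_table # stands in for the real database
    @cache = {}                    # Query => Array of result rows
    @db_hits = 0
  end

  def run(query)
    cached = @cache.find { |earlier, _rows| query.subset_of?(earlier) }
    rows =
      if cached
        # Answer from memory by filtering the earlier, broader result.
        cached.last.select { |row| query.matches?(row) }
      else
        @db_hits += 1
        @rows_by_table.fetch(query.table).select { |row| query.matches?(row) }
      end
    @cache[query] = rows
    rows
  end
end

users = [
  { lname: "Smith", fname: "John",   email: "john@hotmail.com" },
  { lname: "Smith", fname: "Ann",    email: "ann@example.com" },
  { lname: "Jones", fname: "Johnny", email: "jj@hotmail.com" }
]
store = CachingStore.new("users" => users)

smith   = ->(row) { row[:lname] == "Smith" }
john    = ->(row) { row[:fname].downcase.include?("john") }
hotmail = ->(row) { row[:email].include?("hotmail") }

store.run(Query.new("users", { "lname = 'Smith'" => smith }))
store.run(Query.new("users", { "lname = 'Smith'" => smith, "fname like '%john%'" => john }))
store.run(Query.new("users", { "lname = 'Smith'" => smith, "email like '%hotmail%'" => hotmail }))
puts store.db_hits # 1: only the first query reached the "database"
```

A real implementation would have to compare the conditions themselves rather than their names, and would need to invalidate cached results when rows are written.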
Francis Hwang
http://fhwang.net/
Hey Francis,
> I've just released the newest dev release of Lafcadio, 0.7.0, and the bugfix release 0.6.1 for the stable branch.
Link?
http://rubyforge.org/projects/lafcadio
> == What's Lafcadio? ==
> <snip>
> == What's new in 0.7.0? ==
> Excessively Clever Query Caching goes like this:
> <snip>
Awesome! This is of great interest to me.
Have you thought about parallel query dispatch over
horizontally partitioned data? I have done something
like this for MS SQL 2000. Interested?
-- shanko
--- Francis Hwang <sera@fhwang.net> wrote:
Quite. But seeing as I'm pretty unschooled in DB theory in general, I've never heard of "parallel query dispatch". Care to explain, or offer a link?
Also, if you're interested in seeing this feature ported over to a DB you use (such as MS SQL 2000) I'm open to extending Lafcadio to work with any other DB as long as I've got people actively testing them on other DBs. (I always use MySQL, hence Lafcadio's MySQL focus up 'til now.)
Francis Hwang
On Jan 20, 2005, at 10:24 AM, Shashank Date wrote:
> Quite. But seeing as I'm pretty unschooled in DB theory in general, I've never heard of "parallel query dispatch".
Well, of course! My bad: I am using our internal
terminology while talking to the outside world.
The correct term is "Federated Databases". And even
that term is context-dependent. Google it in the
context of SQL Server 2K and you will get what I mean.
> Care to explain, or offer a link?
http://www.sql-server-performance.com/federated_databases.asp
> Also, if you're interested in seeing this feature ported over to a DB you use (such as MS SQL 2000) I'm open to extending Lafcadio to work with any other DB as long as I've got people actively testing them on other DBs.
I can surely help with testing, especially if it involves
running test cases in the background. I won't be able
to devote too much time in the foreground, though.
> (I always use MySQL, hence Lafcadio's MySQL focus up 'til now.)
No problem. Let me know how I can get started.
-- shanko
Intriguing stuff. Once you've set this up in MS SQL 2k, what requirements are there for a client to manage them? I mean, besides what the database takes care of for you automatically.
And by the way, if you're working with federated databases, how big are these tables you're dealing with? I'm just wondering how much bigger the tables at my work can get before I need to look into something like this.
> > Also, if you're interested in seeing this feature ported over to a DB you use (such as MS SQL 2000) I'm open to extending Lafcadio to work with any other DB as long as I've got people actively testing them on other DBs.
> I can surely help with testing, especially if it involves running test cases in the background. I won't be able to devote too much time in the foreground, though.
Well, I'll put "port to MS SQL" on my to-do list and let you know when a new beta release has MS SQL support ... then I just need a steady supply of specific bug reports to chase down, after that.
Francis Hwang
On Jan 20, 2005, at 11:00 AM, Shashank Date wrote:
Hi Francis,
> Intriguing stuff. Once you've set this up in MS SQL 2k, what requirements are there for a client to manage them? I mean, besides what the database takes care of for you automatically.
Umm .... maintaining the indexes comes to mind. I don't
know the details since we never actually used it as it
comes out of the box. We found out that the queries
were not being executed in parallel. Hence we wrote
our own version (in Ruby of course) and called it
"parallel query dispatcher".
> And by the way, if you're working with federated databases, how big are these tables you're dealing with? I'm just wondering how much bigger the tables at my work can get before I need to look into something like this.
It is not only the size that matters (in this case
;-)) but the nature of the application. Our data is
being collected at various data centers and then
coalesced at the central server. So it comes naturally
partitioned. Further, our queries are rarely (almost
never) across the partitions. This is a very important
aspect which lends itself to federation.
Add to that the fact that our combined database is
about 100 GB and tables are typically over 5 million
rows. So when we did not have the budget to scale up
we decided to scale out and were reasonably
successful. We were in production for almost a year on
four 3-server clusters throwing hundreds of queries
every day. We did dynamic load balancing and were
working on query caching (like the one you have
provided in Lafcadio) when the project got the
attention of higher-ups and a more generous budget to
scale up ... which almost always is a better
alternative.
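To make the scale-out idea concrete, here is a rough Ruby sketch of dispatching a query over horizontally partitioned data. It is not the dispatcher described in this thread: the `PartitionedDispatcher` class, the partition keys, and the row layout are all invented for illustration, and plain in-memory arrays stand in for the database servers. In a real setup each thread would presumably run its query over its own database connection.

```ruby
# Hypothetical sketch: class name, partition keys, and row layout are
# invented for illustration. Each partition is an in-memory Array that
# stands in for one database server.
class PartitionedDispatcher
  def initialize(partitions)
    @partitions = partitions # { partition key => Array of rows }
  end

  # Most queries stay inside one partition, so route them straight there.
  def query_partition(key, &filter)
    @partitions.fetch(key).select(&filter)
  end

  # For the rare cross-partition query, start one thread per partition
  # and merge the partial results once every thread has finished.
  def query_all(&filter)
    threads = @partitions.values.map do |rows|
      Thread.new { rows.select(&filter) }
    end
    # Thread#value waits for the thread and returns its block's result.
    threads.flat_map(&:value)
  end
end

readings = {
  "east" => [{ site: "east", temp: 71 }, { site: "east", temp: 64 }],
  "west" => [{ site: "west", temp: 80 }]
}
dispatcher = PartitionedDispatcher.new(readings)

east_only = dispatcher.query_partition("east") { |row| row[:temp] > 70 }
everywhere = dispatcher.query_all { |row| row[:temp] > 70 }
puts east_only.size  # 1
puts everywhere.size # 2
```

Threads pay off here only because each one would spend most of its time waiting on a remote server; filtering local arrays, as the sketch does, gains nothing from them.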
> Well, I'll put "port to MS SQL" on my to-do list and let you know when a new beta release has MS SQL support ... then I just need a steady supply of specific bug reports to chase down, after that.
Great! Let me know ...
-- shanko
So are you saying that the data came naturally partitioned, and you left it partitioned, and then used Ruby to analyze queries and dispatch them to the right database transparently? I suppose I could use a concrete example to help me grok this.
Francis Hwang
On Jan 22, 2005, at 5:11 PM, Shashank Date wrote: