The "perfect" ORM?

Jim Weirich wrote:

>
> I want a really good ORM that is highly non-intrusive
> (e.g., I don't have to inherit and I don't have to
> clutter my classes and objects with metadata).
>
> Someone told me that this was much the way Hibernate
> works in Java.

Hahahaha ... oh, that's a good one :slight_smile:

Actually, Hibernate is very flexible, but you do end up specifying a lot of
metadata either through annotations, comments (as another poster
demonstrated), or via XML files. Much more intrusive than, say,
ActiveRecord.

Very discouraging, Jim. :slight_smile: Read my post from five mins ago and
see if it seems on-target...

Thanks,
Hal

Kev Jackson wrote:

>
I've spent a fair bit of time with Hibernate and I can safely say that
it is not the "ruby way" (even from the little experience I have with ruby)

Thanks for this lengthy and informative post.

There's no question that a "clone" of Hibernate is wrong for Ruby.
But my understanding is that it at least keeps its hands off the
user classes (to some extent).

Now you have a simple JavaBean style of class, the only thing that
Hibernate imposes here is that it seems to be easier to use id values
that are auto-generated (long maps to number(19) in Oracle etc). You
can use Strings (auto generated hex values) or assign the id/primary key
yourself. Best practice in the Hibernate community is to let the
database auto-generate where possible and to always use surrogate keys.

Logan Capaldo and I talked about this. I'm still uncomfortable with it,
but will probably come around. I'd still like to code it the "other" way
first -- manually assigned primary keys. The process of doing that will
clarify it in my mind, and I will be able to see the pros/cons better.

So yes in the raw Java code for the model, Hibernate does not
interfere. However at this point, you still need to map the JavaBean to
the database table. This is done with a (verbose) xml mapping file. As
these are such a pain to write, most people use XDoclet to generate the
mapping automatically. For XDoclet to do this, you have to sprinkle
attributes into your Java code like fairy dust. So the code would
really look like...

I can tell you right now that whatever solution I end up using will not
employ XML or any XDoclet equivalent, and I hope to avoid fairy dust as
much as possible, as I am allergic to it. I'd like to centralize the
typing/mapping info as much as I can, and likewise use reflection as
much as I can.

In Java5, the introduction of annotations allows these special
@whatevers to be placed outside of comments. Hibernate3 supports both
styles (in comment and true annotations). If you can say that these
Attribute/Annotations don't couple themselves to your model code, then
yes, the assertion that Hibernate is unobtrusive is true. On the other
hand, actually keeping the metadata in a separate file (xml mapping in
Hibernate's case) means that the turn-around on a change is fairly
significant. Trust me, coding up a Hibernate app without using Ant +
XDoclet is an exercise in pain; even with Ant + XDoclet, the
code->deploy cycle is still a drag.

The idea of rampant annotation is repugnant to me. So is the concept
of storing such info in a separate file.

Basically, I want the mapping info to be part of my code, but not part
of the objects I want to serialize. And I want it to be fairly
centralized rather than scattered here and there.

There are some cool parts of Hibernate; being completely flexible on how
you configure every aspect is probably the most 'enterprise' feature of
it. It allows it to be used so much more easily with legacy data
(composite business keys, weird table structures etc).

I guess I'm more concerned with legacy objects than legacy tables. I'm
one of those who would like to build database tables from object defs,
rather than the other way around.

Hibernate is very good at allowing you to specify everything, but you
pay the price with overly complex and verbose configuration files that
*must* be in sync with your model code for the application to work -
this synchronization issue is the Achilles heel of Hibernate in my
experience - I've wasted too much time when the server has cached an old
mapping file instead of deploying the new one.

As I said, I'd want this mapping info stored *in* my code, but not
scattered through it, and not in my stored classes. Hopefully that is
a viable world view.
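
A minimal sketch of what "mapping info in my code, but not in my stored classes" could look like. Everything here is an assumption for illustration: the `Mapping` module, its `map` and `row_for` methods, and the `people` table are hypothetical names, not part of any library mentioned in this thread.

```ruby
# Plain domain class: no base class, no mixin, no persistence metadata.
class Person
  attr_accessor :name, :age

  def initialize(name, age)
    @name = name
    @age = age
  end
end

# Hypothetical central registry; all mapping info lives here, in one place.
module Mapping
  TABLES = {}

  # Register a class with its table name, key field, and field => type pairs.
  def self.map(klass, table:, key:, fields:)
    TABLES[klass] = { table: table, key: key, fields: fields }
  end

  # Build [table, key value, row] for an object using only the registry
  # and the object's public accessors.
  def self.row_for(obj)
    meta = TABLES.fetch(obj.class)
    row = meta[:fields].keys.to_h { |f| [f, obj.public_send(f)] }
    [meta[:table], obj.public_send(meta[:key]), row]
  end
end

# The only place where Person and the storage layout meet.
Mapping.map(Person, table: "people", key: :name,
                    fields: { name: String, age: Integer })

table, key, row = Mapping.row_for(Person.new("Hal", 42))
puts "#{table} / #{key}"
p row
```

Reflection on `Person` itself shows nothing extra, since the registry holds the metadata. The price is that the registry and the class can drift apart, which is the same synchronization problem described above for Hibernate mapping files, only within one codebase.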

Thanks much,
Hal

Duane Johnson wrote:

I've been looking in to the various options available to us Ruby
developers as well. A coworker (Troy Heninger) and I are looking at
implementing a "knowledge base" or ODBMS, but I haven't worked out
the details yet. Troy is much more knowledgeable in this area.

The classic ODBMS has some implementation and usage problems. It's
a problem always worth revisiting, though.

Let's stay in touch on this.

If you're interested, the 3 interesting packages I've found so far are:

Purple - http://purple.rubyforge.org/
DyBase - DyBASE - Object Oriented Database for Languages with Dynamic Type Checking
Madeleine - http://madeleine.sourceforge.net/

I looked at Madeleine and it seemed very restrictive to me. The others
I've never heard of. (Gosh, more reading...)

Thanks,
Hal

Kirk Haines wrote:

[snip snip snip]

The above examples come directly from my current plan of how I want
the library to work, based on my needs and the input that I have gotten
from others. It's completely subject to change from internal or
external influence at this point, as I'm still working on the
modularization of the query generation/db interface code. The
motivation for this, quite honestly, is so that I can have an adaptor
to KirbyBase or even directly to a directory of CSV files which can be
treated as a database of tables, or to other non-db data sources.

Hmm. I think the idea of multiple adaptors is great, and everybody should
consider doing something similar. Your code as shown above, though, seems
a tiny bit too low-level to me.

Of course, there's a good chance that what I'm trying to implement will
simply blow up in my face (figuratively speaking). If so, I may be very
happy with a lower-level solution.
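
To make the adaptor idea concrete, here is a rough sketch, assuming a two-method interface (`insert` and `all`) that is invented for this example and is not the API of Kansas or KirbyBase. The directory-backed adaptor writes tab-separated text rather than real CSV, to avoid quoting rules in a short example.

```ruby
require "tmpdir"

# Hypothetical adaptor interface: insert(table, row) and all(table).
class MemoryAdaptor
  def initialize
    @tables = Hash.new { |h, k| h[k] = [] }
  end

  def insert(table, row)
    @tables[table] << row.dup
  end

  def all(table)
    @tables[table].map(&:dup)
  end
end

# Treats a directory as a database: one delimited text file per table.
# Values are joined with tabs and read back as strings.
class DirectoryAdaptor
  def initialize(dir, columns)
    @dir = dir
    @columns = columns # { "table_name" => [:col, ...] }
  end

  def insert(table, row)
    line = @columns.fetch(table).map { |c| row[c].to_s }.join("\t")
    File.open(path(table), "a") { |f| f.puts(line) }
  end

  def all(table)
    return [] unless File.exist?(path(table))

    File.readlines(path(table), chomp: true).map do |line|
      @columns.fetch(table).zip(line.split("\t", -1)).to_h
    end
  end

  private

  def path(table)
    File.join(@dir, "#{table}.tsv")
  end
end

# Calling code only sees the shared interface.
def store_school(adaptor)
  adaptor.insert("schools", { name: "Central High", city: "Cheyenne" })
  adaptor.all("schools")
end

p store_school(MemoryAdaptor.new)
Dir.mktmpdir do |dir|
  p store_school(DirectoryAdaptor.new(dir, { "schools" => [:name, :city] }))
end
```

Both calls return the same rows here only because every value is a string; a file-backed adaptor would need the type information from the mapping to turn text back into numbers or dates.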

Thanks,
Hal

Alexandru Popescu wrote:

I've been working with Hibernate for quite a while and imo it is correctly
approaching the so-called object-relational mismatch.

The really good thing about this approach is that it is not obtrusive in any way with your domain
model objects and it lets you focus and work only in the object world.

That sounds good so far.

On the dark side of the problem: you should provide in some way the mapping between the object world
and the relational world.

Yes, that is a necessary evil. Naturally I want the toolkit to be as smart
as possible, and its usage to be as painless as possible.

While there are a few things that could be simplified a little (like
automatic type conversions), the big problem is that this simplified form cannot be used on
relationships. If parametrized types had been implemented without the erasure mechanism,
then this simplification could be taken further, but for the moment we have to use some other way
to describe relations: and here metadata comes into play.

That sounds interesting, but I did not understand any of it. :slight_smile: I am not
sure what you mean by parametrized types, or what an erasure mechanism is.

There are a few different approaches
used: metadata through external XML, metadata through javadoc comments and lately metadata through
annotations.

All of these seem wrong to me. My approach will be: Metadata through Ruby code
external to the stored objects.

I will try to give an example in a week or so.

Thanks,
Hal

Bob Hutchison wrote:

Do you want an ORM? or do you want a way to persist classes in a non-
intrusive way? Be careful what you ask for :slight_smile:

Hshs... I really want the latter, but I am considering implementing
it using the former. :wink:

I am putting the final touches on a project (<http://rubyforge.org/projects/xampl/>)
that I've been working on for a while now. It has
its roots in a Java tool that I've been working on since 1998 or so
and that has been used in eight or nine quite large commercial
products (500k to 2000k lines of code). There is a Common Lisp
version as well. I am working through a small but relatively complex
example (in Ruby) just to make sure I've not missed anything that
Ruby needs (and a good thing I did too).

Great, I will look that over as soon as I find time.

It is unobtrusive as long as you play along. Persistence is only one
of the goals of the tool, it is also trying to provide a useful
framework for projects that use it.

"Playing along" seems reasonable, for appropriate values of "playing
along." (Another fine tautology from Hal.)

Hmm, what would the framework do besides provide persistence?

There is more information on my weblog in the ruby category
<http://recursive.ca/hutch/index.php?cat=16>, a few additional articles in
the xampl category talking about the Java or CL version, if you are
curious. The articles mostly talk about xampl as an XML binding tool
-- which it also does do.

XML! Back, thou fiend, back! /me makes sign of cross

Right now, xampl is targeted at new code. Fitting it into existing
code can be done but requires some familiarity with the tool, and
there is no guarantee that it would be all that useful in the end.

A lot of things are like that. I hope to avoid a little of that if
I can.

Thanks,
Hal

Adam Van Den Hoven wrote:

I'm not sure if this is relevant but in my opinion the perfect ORM is no ORM.

Why do we think we need ORM? Why do we build wonderful things like Rails?

Because our applications have objects and we need to persist those objects in a useful way. We want to be able to find those objects and change them and save them.

Personally, I'm willing to sacrifice a LOT to get really simple object persistence. Then again I'm not writing applications that need to handle tens of thousands of object finds every second.

I would love to be able to do something like:

class Person < ActiveObject::Base
    field :last_name, String
    field :first_name, String
    has_many :aliases, String
    transient :some_transient_value
    belongs_to :team
    has_many_ordered :roles

    #methods and the like
end

Looks like Og. Except you can leave out the inheritance part.

James

--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

George Moschovitis wrote:

> So anyway, this is one of my highest priorities -- to
> make an ORM (that works the way I like) to wrap
> KirbyBase. (With additional code, it should/could wrap
> any other db, of course.)

FYI, the development release of Og includes a KirbyBase wrapper.

That is interesting.

> Og is cool, but is even more intrusive.

Why is Og intrusive? can you elaborate?

This is only my opinion.

I dislike putting the metadata for my objects into the objects
themselves.

As someone pointed out, if I "reopen" the class, it is a little
better, but I am still uncomfortable this way.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

What I want is:

1. To think in object (and persistence) terms, not database terms.
2. To specify the minimum information necessary in order to marshal
   each of my types.
3. To store the metadata separately from my classes/objects so as to
   minimize impact on them. (But probably not in a separate file.)

Does that make any sense?

Hal

Why didn't anyone mention Cayenne? As a comparison it seems closer to Ruby than Hibernate. It is a sort of advanced clone of what was EOF: Apache Cayenne

--
Alexander Lamb
Service d'Informatique Médicale
Hôpitaux Universitaires de Genève
Alexander.J.Lamb@sim.hcuge.ch
+41 22 372 88 62
+41 79 420 79 73

On Oct 26, 2005, at 5:06 PM, Bob Hutchison wrote:

On Oct 25, 2005, at 11:18 PM, Duane Johnson wrote:

DyBase - DyBASE - Object Oriented Database for Languages with Dynamic Type Checking

This guy's stuff is very good. I've not used dybase, but I have used several of his other tools (have a look around his site, it is amazing what this one guy has done).

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>

Kev Jackson wrote:

Actually re-thinking this again,

If you don't want to inherit from a base class (a la ActiveRecord),
could you build some kind of Dependency Injection to use ActiveRecord
without having to inherit?

Welll... I don't really understand DI (sorry, Jamis) and I am not
totally thrilled with AR.

I'm going to write my own solution. It may be harder than I think,
and my goals may not be fully reachable. But even if I fail, it will
be a learning experience.

Hal

#: rubyhacker@gmail.com changed the world a bit at a time by saying on 10/26/2005 9:52 PM :#

Alexandru Popescu wrote:

I've been working with Hibernate for quite a while and imo it is correctly
approaching the so-called object-relational mismatch.

The really good thing about this approach is that it is not obtrusive in any way with your domain
model objects and it lets you focus and work only in the object world.

That sounds good so far.

On the dark side of the problem: you should provide in some way the mapping between the object world
and the relational world.

Yes, that is a necessary evil. Naturally I want the toolkit to be as smart
as possible, and its usage to be as painless as possible.

While there are a few things that could be simplified a little (like
automatic type conversions), the big problem is that this simplified form cannot be used on
relationships. If parametrized types had been implemented without the erasure mechanism,
then this simplification could be taken further, but for the moment we have to use some other way
to describe relations: and here metadata comes into play.

That sounds interesting, but I did not understand any of it. :slight_smile: I am not
sure what you mean by parametrized types, or what an erasure mechanism is.

He he... no problem. Probably I can help here.

Consider a small example: Foo has a 1-N relation with Bar. In your object world, assuming you would like/need navigation in both directions, you would have

class Foo {
  List<Bar> myBars;
}

class Bar {
  Foo myParentFoo;
}

List<Bar> is a parametrized type; List<Bar> is a collection whose elements are Bar-s. With this strongly typed, you could do some magic and not have to describe the relation between Foo and Bar through metadata. Unfortunately the Java compiler removes the Bar part, so you are left with an untyped collection => there is no way to know that Foo has some relation to Bar.

There are a few different approaches
used: metadata through external XML, metadata through javadoc comments and lately metadata through
annotations.

All of these seem wrong to me. My approach will be: Metadata through Ruby code
external to the stored objects.

Whether you accept it or not, this is still metadata, and the format is more or less equally verbose (at least for me). I would probably agree that maybe in Ruby this makes sense, but to do this in Java would be plain wrong ;-).

./alex

--
.w( the_mindstorm )p.

I will try to give an example in a week or so.

Thanks,
Hal

rubyhacker@gmail.com wrote:

As I said, I'd want this mapping info stored *in* my code, but not
scattered through it, and not in my stored classes. Hopefully that is
a viable world view.

What would be wrong with using re-opening classes for the mapping e.g.

#file A.rb
class A
  def foo ...
  def bar ...
end

#file A_store.rb
class A
  prop :foo, String
  has_many :bars, B
end

Is it the number of additional instance methods added to A?
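
One way to check what reopening actually adds is to try it. The sketch below assumes `prop` and `has_many` are simple class macros that only record metadata; the `StoreMacros` module and `store_meta` method are hypothetical stand-ins and not Og's real implementation. Both "files" are shown in one listing.

```ruby
# Hypothetical class macros that record mapping metadata on the class.
module StoreMacros
  def store_meta
    @store_meta ||= { props: {}, has_many: {} }
  end

  def prop(name, type)
    store_meta[:props][name] = type
  end

  def has_many(name, klass)
    store_meta[:has_many][name] = klass
  end
end

class B; end

# Imagine this part lives in A.rb: the plain class.
class A
  def foo
    "foo"
  end
end

# Imagine this part lives in A_store.rb: the class is reopened
# only to attach mapping info.
class A
  extend StoreMacros
  prop :foo, String
  has_many :bars, B
end

p A.store_meta
p A.instance_methods(false)      # instance methods defined in A itself
p A.respond_to?(:store_meta)     # the class object did gain methods
```

With macros written this way, no instance methods are added to A; the additions are class-level methods and one class-level instance variable. A real library may also generate accessors or finders, in which case reflection on A would show more.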

rubyhacker@gmail.com wrote:

George Moschovitis wrote:

Why is Og intrusive? can you elaborate?

This is only my opinion.

I dislike putting the metadata for my objects into the objects
themselves.

As someone pointed out, if I "reopen" the class, it is a little
better, but I am still uncomfortable this way.

Here's a Devil's Advocate argument. It may have actual merit; I'm not entirely convinced.

There was a time when people believed you could create distributed objects that would let you code as if all code was local, and move objects to different machines at will. You, the coder, did not have to do anything special when dealing with such objects. Just create an instance and invoke methods.

But the reality is that sending a message over the wire has a cost, and one really does need to keep this in mind when designing and working with distributed objects.

Likewise for autopersisted objects. It might be nice if one could just use objects and have them magically saved/loaded with no special consideration from the coder, but since it has a real cost, the coder benefits from having at least some indication that this is what is happening. So, putting the metadata in the class definition is Good and Helpful because it alerts the coder to special conditions. It also makes more clear when some attributes are to be saved and others are transient.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

Interesting. I don't see "has_many" as being database-centric, just a means for referring to some form of a relationship that can occur with or without any persistence mechanism. But maybe I've just become immune to the effects of certain words and phrases.

What I want is:

1. To think in object (and persistence) terms, not database terms.
2. To specify the minimum information necessary in order to marshal
   each of my types.
3. To store the metadata separately from my classes/objects so as to
   minimize impact on them. (But probably not in a separate file.)

Does that make any sense?

It does, and this is one of the reasons I prefer Og to ActiveRecord. I can just code my objects without thinking in terms of a database, and migrate to a persistence mechanism, if and when I need one, with a few minor class-code annotations. That my class has explicit indicators of storage metadata is less of an issue for me, and is arguably a feature.

James Britt

This is only my opinion.

I dislike putting the metadata for my objects into the objects
themselves.

As someone pointed out, if I "reopen" the class, it is a little
better, but I am still uncomfortable this way.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

This is a hard thing to deal with, though. A relational database has to use
keys to implement relationships between the tables. Some databases make it
very clear what the relationship is.

If the database has a foreign key constraint on a table, this means that the
database is saying that field X in the table references field Y in another
table. If a database supports this sort of thing, then the ORM can
automatically tell from the database structure that one table, and thus, one
object, has a relationship with another. It can create that relationship for
you.
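
As an illustration of that inference, here is a small sketch. It assumes the ORM has already read the table structure into a hash; the `SCHEMA` layout and the `infer_relationships` helper are made up for this example, and the table names simply echo the ones used later in this message. A foreign key on its own cannot distinguish one-to-one from one-to-many (that needs a uniqueness constraint), so the sketch always reports the reverse side as "has many".

```ruby
# Hypothetical schema description, as an ORM might read it from the
# database catalog.
SCHEMA = {
  "schools"     => { columns: %w[idx name], foreign_keys: {} },
  "chemicals"   => { columns: %w[idx name], foreign_keys: {} },
  "inventories" => {
    columns: %w[idx school_idx chemical_idx],
    foreign_keys: { "school_idx" => "schools", "chemical_idx" => "chemicals" }
  }
}.freeze

# Walk the foreign keys once and derive both directions of each relation:
# the referencing table belongs to the target, the target has many back.
def infer_relationships(schema)
  rels = Hash.new { |h, k| h[k] = [] }
  schema.each do |table, info|
    info[:foreign_keys].each do |column, target|
      rels[table]  << [:belongs_to, target, column]
      rels[target] << [:has_many, table, column]
    end
  end
  rels
end

infer_relationships(SCHEMA).each do |table, rels|
  rels.each { |kind, other, via| puts "#{table} #{kind} #{other} (via #{via})" }
end
```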

But if you start from the other end, with the objects, you are not starting
with any of that meta information, so you have to do something to identify
classes which should map to tables, and if you intend for one field to store
only objects or arrays of objects of another class which is also represented
in the db by a table, you have to do something on the ruby side to declare
that. The has_one, has_many, many_to_many, and similar terms are commonly
accepted terms for describing these relationships. In thinking a bit about
it, though, I do suppose that one need not actually use those terms.
I could pretty easily make the following work in Kansas today.

class Schools
  # The line below would indicate that there is a one to many relationship
  # between a school and the inventories. So one school can be associated
  # with many inventory records.
  relationship Inventories.school_idx
end

class Inventories
  # This line indicates that there is a one to one relationship between
  # an Inventories object (and thus, db record) and a Chemicals object.
  relationship chemical_idx => Chemicals
end

3. To store the metadata separately from my classes/objects so as to
   minimize impact on them. (But probably not in a separate file.)

I am not seeing why it would be beneficial to keep this annotation separate
from the classes. The information has to be looked up somewhere, and if the
annotation doesn't interfere with the class otherwise, what is the downside
to having that information attached? If it is separate, you still have to,
somehow, associate the two, and the information still needs to be looked up.
What is the benefit?

Kirk Haines

On Thursday 27 October 2005 1:57 pm, rubyhacker@gmail.com wrote:

I dislike putting the metadata for my objects into the objects
themselves.

The metadata is stored in the object class not the actual instances.

As someone pointed out, if I "reopen" the class, it is a little
better.

Og 0.24.0 allows reopening.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

I don't think that has_many is a database term; it just describes object
relations and allows Og to automagically generate some useful methods.
This is an abstraction.

1. To think in object (and persistence) terms, not database terms.

I think that, using Og, you almost forget that you are using a database. In
fact you don't need an RDBMS store.

2. To specify the minimum information necessary in order to marshal
   each of my types.

Og supports this.

3. To store the metadata separately from my classes/objects so as to
   minimize impact on them. (But probably not in a separate file.)

You can do this in the latest version.

Og is constantly evolving; stay tuned for even better abstractions.
Even better you can help us with suggestions and/or patches. Join the
mailing list :wink:

regards,
George.

--
http://www.gmosx.com
http://www.navel.gr

Alexander Lamb wrote:

Why didn't anyone mention Cayenne? As a comparison it seems closer to
Ruby than Hibernate. It is a sort of advanced clone of what was EOF:
Apache Cayenne

I've never heard of it, but I will add it to my mountain^H^H^H^H^H^H^H^H list
of things to evaluate.

Thanks,
Hal

What would be wrong with using re-opening classes for the mapping e.g.

FYI, the development version of Og also supports this :slight_smile:

-g.

The failure modes are radically different also. If you don't design/code
for those failure modes, the illusion of transparency will dissolve when
you are least prepared.

Gary Wright

On Oct 27, 2005, at 5:08 PM, James Britt wrote:

There was a time when people believed you could create distributed objects that would let you code as if all code was local, and move objects to different machines at will. You, the coder, did not have to do anything special when dealing with such objects. Just create an instance and invoke methods.

Likewise for autopersisted objects. It might be nice if one could just use objects and have them magically saved/loaded with no special consideration from the coder, but since it has a real cost, the coder benefits from having at least some indication that this is what is happening. So, putting the metadata in the class definition is Good and Helpful because it alerts the coder to special conditions. It also makes more clear when some attributes are to be saved and others are transient.

You hit the nail on the head here, James. Persisting objects to a relational database has repeatedly and raucously denied pixie dust treatment. If giving an inch means a simple table <-> class ORM can take us a mile, we should meditate on the nature of pragmatism.

In addition, my memory of Og is that it encourages thinking in
database terms (like "has_many") -- true or not?

Interesting. I don't see "has_many" as being database-centric, just a means for referring to some form of a relationship that can occur with or without any persistence mechanism. But maybe I've just become immune to the effects of certain words and phrases.

While terminology like has_many is not restricted to db-think, it is most commonly found there. Fowleresque ActiveRecord ORMs encourage a similar pattern of thought: their metadata are clearly relational hints sitting in your class, so it feels like you're mapping database -> objects in your head as you develop your app.

This is very different from composing a domain model, coding it up, then devising a mapper to persist your object graph. As far as I know there are no Ruby ORMs that attempt this.

Having done it both ways, I prefer those little hints. The apparent cost of a generic domain mapper is deceptively low due to the "it seems nice" discount, but its true cost is far higher: high conceptual overhead, difficult mapping bugs, and carpal tunnel.

On Oct 27, 2005, at 2:08 PM, James Britt wrote:

rubyhacker@gmail.com wrote:

What I want is:
1. To think in object (and persistence) terms, not database terms.
2. To specify the minimum information necessary in order to marshal
   each of my types.
3. To store the metadata separately from my classes/objects so as to
   minimize impact on them. (But probably not in a separate file.)
Does that make any sense?

For deeper satisfaction, look at how Smalltalkers have done it. Why introduce the R and M to O at all?

jeremy

Kirk Haines wrote:

[snip]

But if you start from the other end, with the objects, you are not starting
with any of that meta information, so you have to do something to identify
classes which should map to tables, and if you intend for one field to store
only objects or arrays of objects of another class which is also represented
in the db by a table, you have to do something on the ruby side to declare
that. The has_one, has_many, many_to_many, and similar terms are commonly
accepted terms for describing these relationships. In thinking a bit about
it, though, I do suppose that one need not actually use those terms.
I could pretty easily make the following work in Kansas today.

class Schools
  # The line below would indicate that there is a one to many relationship
  # between a school and the inventories. So one school can be associated
  # with many inventory records.
  relationship Inventories.school_idx
end

class Inventories
  # This line indicates that there is a one to one relationship between
  # an Inventories object (and thus, db record) and a Chemicals object.
  relationship chemical_idx => Chemicals
end

Yes, but that's just giving different names to the same things, isn't it?
If I see "relationship" in someone's code, I don't really know what it
means. I assume it's some database stuff.

As far as I can see, all that's really needed is:
1. Let each class map to a table
2. Let each table have a known unique primary key

Then we don't need all this "relationship" stuff, do we?
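
A rough sketch of that convention-only approach, under stated assumptions: the table name is the downcased class name, columns are the instance variables, every row gets a generated integer key, and a reference to another object is stored as that object's key in a column with an `_id` suffix. The `Flattener` class and these naming rules are invented for illustration; collections are deliberately left out.

```ruby
# Convention-only mapping: no per-class declarations at all.
class Flattener
  SCALARS = [String, Numeric, Symbol, NilClass, TrueClass, FalseClass].freeze

  attr_reader :tables

  def initialize
    @tables = Hash.new { |h, k| h[k] = [] }
    @ids = {}.compare_by_identity # already-stored object => its key
    @next_id = 0
  end

  # Store obj (and anything it references) and return its primary key.
  def store(obj)
    return @ids[obj] if @ids.key?(obj)

    id = @ids[obj] = (@next_id += 1)
    row = { id: id }
    obj.instance_variables.each do |ivar|
      name = ivar.to_s.delete_prefix("@")
      value = obj.instance_variable_get(ivar)
      if SCALARS.any? { |t| value.is_a?(t) }
        row[name.to_sym] = value
      elsif value.is_a?(Enumerable)
        raise ArgumentError, "collections are out of scope for this sketch"
      else
        row[:"#{name}_id"] = store(value) # a reference becomes a key
      end
    end
    @tables[obj.class.name.downcase] << row
    id
  end
end

class Bar
  def initialize(label)
    @label = label
  end
end

class Foo
  def initialize(name, bar)
    @name = name
    @bar = bar
  end
end

flattener = Flattener.new
flattener.store(Foo.new("example", Bar.new("inner")))
p flattener.tables
```

Writing works from conventions alone. Reading the rows back is where the extra information is needed: a `bar_id` column says nothing about which table it points to unless the naming rule, or some declaration, supplies it.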

When I think of an object containing a sub-object (and yes, I certainly
know these are only *references* internally), I don't think "a Foo
object has a one-to-one relationship with a Bar object"; I just think
"Foo has a field bar, which will typically be a Bar." (And no, I'm not
a fan of static typing, either.)

As for a "has-many" relationship (in objects, not in DBs) -- isn't that
just what we call an "array"? The difference being that Ruby arrays are
heterogeneous whereas rows of a table all represent the same type?

Reflection could tell us that a field is an array. It could also tell us
the type of each element in the array.
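
A small sketch of that kind of reflection, using only core Ruby (`instance_variables`, `instance_variable_get`). The `Team` class and the `array_fields` helper are illustrative names.

```ruby
class Team
  def initialize
    @name = "Rubyists"
    @members = ["Hal", "Kirk"]
    @scores = [1, 2.5, "n/a"] # heterogeneous on purpose
  end
end

# For each instance variable holding an Array, report the classes of the
# elements actually present. An empty array yields an empty list, which
# is the case where reflection alone cannot tell us the element type.
def array_fields(obj)
  obj.instance_variables.each_with_object({}) do |ivar, found|
    value = obj.instance_variable_get(ivar)
    found[ivar] = value.map(&:class).uniq if value.is_a?(Array)
  end
end

p array_fields(Team.new)
```

This works on a live object whose arrays are populated. It says nothing about an empty array, or about a class before any instance exists, so creating tables up front from object definitions would still need a hint from somewhere.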

Over the years I've stuck thousands of arrays into thousands of objects,
all without ever thinking about the "relationship" of the container to
the containee; the former contains the latter, that's about it. And I've
never dwelled long on the fact that an array indeed "has many" items in it,
or felt the need to annotate that fact explicitly.

I want to do as little specification as possible to store my objects.
That's where I'm coming from. I want the persistence framework to be as
smart as possible and make as many reasonable assumptions as possible.

I want to spend as little time coding the metadata portion as I can, and
I want it all stuck in the same place in my code, in as few lines as
possible.

Again, I'm not criticizing your opinions or anyone else's. This is just
me. This sort of thing is as personal as the choice of variable and
method names.

I am not seeing why it would be beneficial to keep this annotation separate
from the classes. The information has to be looked up somewhere, and if the
annotation doesn't interfere with the class otherwise, what is the downside
to having that information attached? If it is separate, you still have to,
somehow, associate the two, and the information still needs to be looked up.
What is the benefit?

It's highly subjective. If I do reflection and look at the stuff in my
classes, I don't want to see the extraneous stuff.

Again I stress, it's just me. I'm not arguing it's *wrong* to do it the other
way. If a class inherits from another or includes a module, then those are more
tightly coupled than if I simply pass an object into a method of another class
(which is the ultimate decoupling other than not interacting at all).

Hal