Enterprise Ruby

I am thinking of doing a 'side by side' distro of Ruby that includes the
latest SVN updates, as well as some 'fringe' best practices, like a tweaked
GC.
It would have the ability to force_recycle objects arbitrarily (at
your own risk), and getters and setters for the GC variables (like how
often to collect, how close you are to the next collection, how big the
heap blocks are, etc.),
and it would also have a GC that is copy-on-write friendly (it takes barely
longer, but doesn't dirty memory).
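
A rough sketch of what that tuning interface might look like (every
method name here is hypothetical -- none of these accessors exist in
MRI today):

  # Hypothetical API sketch, not an existing interface.
  GC.heap_slots_increment = 50_000     # size of newly allocated heap blocks
  GC.malloc_limit         = 8_000_000  # bytes malloc'd before a forced collection
  puts GC.bytes_until_next_gc          # how close we are to the next collection

  tmp = "x" * 10_000_000
  # ... use tmp ...
  GC.force_recycle(tmp)                # reclaim immediately, at your own risk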

And any other personal tweaks that people contribute. Kind of a
bleeding edge Ruby.

Would that be useful to anyone? Would anyone use it?
Thanks and take care.
-Roger


Personally, if I had the resources to invest into this I'd rather spend them on JRuby. You get a GC with many tweaking options etc. plus native threads.

Kind regards

  robert

On 09.11.2007 21:28, Roger Pack wrote:

I am thinking of doing a 'side by side' distro of Ruby that includes the
latest SVN updates, as well as some 'fringe' best practices, like a tweaked
GC. [...]
Would that be useful to anyone? Would anyone use it?

Roger Pack wrote the following on 09.11.2007 21:28:

Would that be useful to anyone? Would anyone use it?
Useful? Yes, it would: I have some Rails actions that eat memory quite
happily (associations with hundreds of thousands of objects, which
themselves have associations to work on...). It would help if the
ruby processes would let go of the memory, or at least let it live in
swap undisturbed, once the action is done.

Would I use it? Probably, though I'll have to find time to both review your
patch myself and stress-test it (I prefer to test it and understand it
first-hand because I suppose it won't be used by many). Would the patch
be easy to understand for someone familiar with Ruby, GC techniques
and C? Or is preliminary knowledge of Ruby's internals a must?

Regards,

Lionel

Roger Pack wrote:

I am thinking of doing a 'side by side' distro of Ruby that includes the
latest SVN updates, as well as some 'fringe' best practices, like a tweaked
GC. [...]
Would that be useful to anyone? Would anyone use it?

What would be more useful to me, and in fact where I'm headed, is a Ruby that's tunable to your *hardware*. Just make a *source* distribution and force people to recompile it. Right now, my tweaks are all at the GCC level, and that's the way it's going to be for a while. I don't believe I've exhausted all of the goodies that GCC has to offer, especially GCC 4.2.

Another thing that would be more useful is a comprehensive enough test and benchmark suite that a user could tell what the payoffs were from the tweaks and whether the language syntax and semantics remained intact after the tweaks.
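
For the 'payoff' half of that, the stdlib Benchmark module already gets
you a long way; run the same script against a stock build and a tweaked
build and compare. A minimal sketch (the workloads below are just
placeholders, not a real suite):

  require 'benchmark'

  Benchmark.bm(14) do |bm|
    bm.report("alloc-heavy:")  { 1_000_000.times { Array.new(8) } }
    bm.report("full GC x100:") { 100.times { GC.start } }
  end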

I'm in the process of refactoring the Rakefile from my profiling efforts. I'd be happy to profile your source as part of that. By the way, are you starting with 1.9 or 1.8? I'm still profiling 1.8 only, but I expect to have 1.9 profiling working within a week or so.

Robert Klemme wrote the following on 09.11.2007 22:05:

Personally, if I had the resources to invest into this I'd rather
spend them on JRuby. You get a GC with many tweaking options etc.
plus native threads.

Please don't forget that many gems still don't work or don't have a
replacement in JRuby. JRuby is *the* solution for people needing easy
Ruby <-> Java integration, but a Ruby with strong Unix ties has its
benefits too.

I think I'd have to spend quite some time migrating applications from
MRI to JRuby: I make heavy use of ruby-gettext, hpricot, memcache and
ruby-opengl, and I believe most of these use C for library interfaces or
performance... some utils like rcov probably don't work with JRuby
either, because they rely on the same C interface.

So as much as I'd like JRuby to succeed even if I don't use it myself
(currently), people willing to work on MRI (or YARV and Rubinius for
that matter) are most welcome to do so too.

But maybe there is an efficient way to use JNI to trivially port most of
these to JRuby. That could motivate my toying with JRuby...

Regards,

Lionel

Lionel Bouton wrote:

Useful? Yes, it would: I have some Rails actions that eat memory quite
happily [...]. It would help if the ruby processes would let go of the
memory, or at least let it live in swap undisturbed, once the action is done.

Sounds to me like you're building a data structure in RAM to avoid making your RDBMS earn its keep. ;-) But seriously, unless you can restructure your application so it doesn't keep a lot of stuff in RAM, you're probably doomed to throw hardware at it. In other words, hard drives are where data that must live for extended periods (or forever) belong, *explicitly* filed there by your application code, not *implicitly* filed there by the swapper. RAM is for volatile information that is being used and re-used frequently.

M. Edward (Ed) Borasky wrote:

I'm in the process of refactoring the Rakefile from my profiling efforts. I'd be happy to profile your source as part of that. By the way, are you starting with 1.9 or 1.8? I'm still profiling 1.8 only, but I expect to have 1.9 profiling working within a week or so.

What profiler are you using, Ed? I ask because I wrote a profiler
a decade ago that used gcc's -pg (gprof profiling) with a custom prof
library that used tail-patching and the TSC register to get real-time
nanosecond resolution on profiled functions, including parent/child
rollups.

The use of *real time* in profiling is fantastic, because it tells you
where, for example, I/O or excessive page-faulting is hurting some
function. I also hooked the memory allocation functions to gather
memory info, including both growth and flow (in/out). Carefully
selecting the functions to profile, rather than profiling all of them,
also helps a lot. Finally, the TSC register counts processor cycles, so
you get nanosecond resolution.

A lot of Unix-based systems are designed using flawed performance data
from the kernel's (statistical) profiling, which simply doesn't see I/O
or VM times.

I've no idea whether the hacks I used to do the tail-patching still
work with a current gcc, but it would be good to reactivate tprof if
possible. It'd be great to have a Ruby that could do such profiling.
I still have the code somewhere...

Clifford Heath.

M. Edward (Ed) Borasky wrote the following on 10.11.2007 05:58:

Sounds to me like you're building a data structure in RAM to avoid
making your RDBMS earn its keep. ;-)

In some cases, yes, because the code is easier to maintain that way. I
usually take the time to switch to SQL when it becomes a problem and
pure SQL is powerful enough for the task (I've done that several times
in the last month).

But my current problem is that simply iterating over large associations
(to create a new object for each and every object on the other end of a
has_many association, with complex business rules SQL can't handle, for
example) is enough to use 100-300MB with hundreds of thousands of
objects. Usually I can split the task by paginating through the whole set,
but in some cases it isn't possible: if inserts or deletes happen
concurrently you can miss some objects or try to process some twice (I'm
actually considering fetching all the primary keys in a first pass and
then paginating using windows in this set, which comes with other problems,
though manageable ones in my case)...
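
That first-pass-then-window idea would look something like this (the
model name and the process method are hypothetical; Rails 1.x-era API):

  require 'enumerator'  # for each_slice on Ruby 1.8

  # Pass 1: snapshot the primary keys so concurrent inserts/deletes
  # can't make us skip a row or process one twice.
  ids = Item.connection.select_values("SELECT id FROM items").map { |i| i.to_i }

  # Pass 2: walk the snapshot in fixed windows so only ~1000
  # ActiveRecord objects are live at any one time.
  ids.each_slice(1000) do |window|
    Item.find(:all, :conditions => ["id IN (?)", window]).each do |item|
      process(item)  # hypothetical per-object business logic
    end
  end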

A temporary 100-300MB spike isn't a problem; what is a problem is that:
1/ the memory isn't freed after completion of the task,
2/ it's kept dirty by the GC,
-> there's no way the OS can reuse this memory for another spike
happening in another process; only the original process can reuse it.

This is not a major problem: I can always move all this heavy
processing into short-lived dedicated processes, but it's kind of a downer
when the language keeps out of your way most of the time and then shows
one limitation.
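
The short-lived-process workaround is simple enough on Unix (the task
name here is hypothetical):

  # Run the memory-hungry task in a forked child: when it exits, the OS
  # reclaims every page it dirtied, which the parent's GC never could.
  pid = fork do
    run_huge_report  # hypothetical 100-300MB task
    exit!            # skip at_exit handlers; just hand memory back to the OS
  end
  Process.wait(pid)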

But seriously, unless you can restructure your application so it
doesn't keep a lot of stuff in RAM, you're probably doomed to throw
hardware at it.

Yes, I've done some simple code tuning that helps memory usage, but it
only helps while waiting for a bigger server.

In other words, hard drives are where data that must live for extended
periods (or forever) belong, *explicitly* filed there by your
application code, not *implicitly* filed there by the swapper. RAM is
for volatile information that is being used and re-used frequently.

As I understand it, the problem is that MRI keeps some unused memory
allocated and then the GC marks it dirty... So technically there's
information being used and re-used frequently, but only by the GC :(

Lionel.

Lionel Bouton wrote:

Please don't forget that many gems still don't work or don't have a
replacement in JRuby. JRuby is *the* solution for people needing easy
Ruby <-> Java integration, but a Ruby with strong Unix ties has its
benefits too.

The lack of native gems in JRuby has so far not amounted to much of an obstacle. Since in normal Ruby code you can go after the equivalent Java libraries with ease, there's been very little demand to port over native extensions.

I think I'd have to spend quite some time migrating applications from
MRI to JRuby: I make heavy use of ruby-gettext, hpricot, memcache and
ruby-opengl, and I believe most of these use C for library interfaces or
performance... some utils like rcov probably don't work with JRuby
either, because they rely on the same C interface.

I don't think memcache uses C. Hpricot does, but there's a JRuby port since it's a Ragel-generated state machine. I don't know about gettext. GL could easily be replaced with one of several Java 3D/GL binding libraries.

So as much as I'd like JRuby to succeed even if I don't use it myself
(currently), people willing to work on MRI (or YARV and Rubinius for
that matter) are most welcome to do so too.

I think MRI is mostly a dead end at this point, unlikely to see any major perf/scaling improvements anymore. If you're going to focus a lot of time on tweaking and improving an implementation, I'd recommend helping out one of the really active 1.8-compatible implementations (JRuby being the most complete and furthest along) or a 1.9 implementation (YARV being most complete and furthest along...but we have some 1.9 features in JRuby too).

But maybe there is an efficient way to use JNI to trivially port most of
these to JRuby. This could motivate my toying with JRuby...

JNI to native extensions is a band-aid at best. The better option is to rewrite the libraries in terms of what's readily available on the JVM.

- Charlie

Lionel Bouton wrote:

But my current problem is that simply iterating over large associations
(to create a new object for each and every object on the other end of a
has_many association, with complex business rules SQL can't handle, for
example) is enough to use 100-300MB with hundreds of thousands of
objects.

Ah ... complex business rules. That's the big problem with programming languages -- they make it possible to *have* complex business rules. Before computers were invented, we had to make do with "buy low, sell high, collect early, pay late" and double-entry bookkeeping. :-)

A temporary 100-300MB spike isn't a problem; what is a problem is that:
1/ the memory isn't freed after completion of the task,
2/ it's kept dirty by the GC [...]

As I understand it, the problem is that MRI keeps some unused memory
allocated and then the GC marks it dirty... So technically there's
information being used and re-used frequently, but only by the GC :(

Well ... that sounds like an actual bug rather than a design issue in MRI. Is it that the GC can't tell it's unused?

Charles Oliver Nutter wrote:

I think MRI is mostly a dead end at this point, unlikely to see any major
perf/scaling improvements anymore. [...]

1. As far as I know, *only* MRI is "100 percent MRI compatible". :-) The other implementations are "extended subsets". JRuby is for the moment the most complete subset and has more extensions, i.e., Java libraries, an AOT compiler and all of the performance tuning that the JRuby team has done. I haven't heard much from the Parrot/Cardinal project recently, but I'm guessing we'll see IronRuby at close to the level of JRuby by early next year, and Rubinius some time in the spring.

2. I don't think MRI is a dead end at all, considering the discussions I've seen on this list just since I got back from RubyConf. I see people seriously proposing re-doing the garbage collector, for example, and I see other people investing a lot of effort in tweaking Rails to use Ruby and the underlying OS more efficiently.

3. As far as I know, YARV/KRI is the only serious 1.9 implementation.

I do think that there is probably more excitement and interesting work on YARV/KRI/1.9 than there is on MRI, or for that matter any of the MRI extended subsets. But MRI is hardly a dead end IMHO.

Charles Oliver Nutter wrote:

Lionel Bouton wrote:

Robert Klemme wrote the following on 09.11.2007 22:05:

Personally, if I had the resources to invest into this I'd rather
spend them on JRuby. You get a GC with many tweaking options etc.
plus native threads.

Does the JRuby GC need much help? I was under the assumption it was
'good enough' or something.

I think MRI is mostly a dead end at this point, unlikely to see any
major perf/scaling improvements anymore. [...]

I assume by your comments you mean 'work on the 1.9 MRI or on JRuby'.

True. Matz has said that he isn't looking to integrate drastic
changes (like changes to the GC) into the 1.8.6 trunk anytime soon, to keep
it stable. Bug fixes, sure, but other things, no.
So why, then, you ask, would people waste time trying to optimize it?
I guess I figured that the 1.9 code (like that for the GC) was about the same,
so that patches to 1.8.6 that were useful would be good fodder for 1.9.
I think this is the case, too, as Matz mentioned being interested in
benchmarks for any tweaked GCs in 1.9 (i.e. 'go ahead and tweak
away--it will get implemented then').
I could be wrong about the usefulness of working on 1.8.6, though. Hmm.
I think I'm just afraid of working on 1.9, since bugs seem to still be
rolling in. I like stability in others' code, so that if a bug exists it's
my own fault and I know where to fix it :-)

Now to find the time to write a real GC ... :-) (i.e. rewrite every
useful extension that has a gc_mark function... sigh).

My latest thought would be to rewrite the object allocator to use *just*
malloc/free and see if it is faster. In my heart of hearts I almost
think it could be :-) Sometimes we optimize ourselves to death.

void *new_obj(void) { return malloc(sizeof(RANY)); }  /* one slot per object */
void recycle(void *obj) { free(obj); }                /* no sweep needed */

:-)

Oh wait, traversing the stack and looking for heap pointers is
problematic with my solution. Maybe I could overcome it by making the
size of a ruby object += 4, and putting the chars 'rbrx' at its front so
I know if it's a ruby heap object--and hoping few people use that for
strings or something. :-) Oh wait, that would introduce more bugs. One
more shot down.

Have a good night!
-Roger


A temporary 100-300MB spike isn't a problem; what is a problem is that:
1/ the memory isn't freed after completion of the task,
2/ it's kept dirty by the GC [...]

Ahh, so you want the process to actually free the memory after it
is done with it (and the GC doesn't do it, because if heap 'chunks' have at
least one live ruby object still in them, they are re-used). Yeah, you
could maybe fix this by using fixed-size heap 'chunks' (Ruby's currently
grow exponentially).

Maybe an optimized Ruby would be useful :-)


I don't know why this gives me a sense of deja vu.

I don't know how many here remember VisualAge/Java. We (IBM/OTI)
built the first IBM Java IDE on top of what we called the UVM (or
universal virtual machine). This was the IBM Smalltalk VM extended to
execute java bytecodes as well as Smalltalk bytecodes.

In this implementation, Java primitives were written in Smalltalk.
IIRC this was either before the JNI existed, or the JNI evolved to
make this impractical, and IBM moved to a Java-only VM.

I know that it's difficult, and probably premature, to define a
standard extension interface which would work across the various
emerging Ruby implementations. But without that, I'm afraid the
promise of having multiple implementations is somewhat muted.

On Nov 10, 2007 8:02 PM, Charles Oliver Nutter <charles.nutter@sun.com> wrote:

JNI to native extensions is a band-aid at best. The better option is to
rewrite the libraries in terms of what's readily available on the JVM.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

As I understand it, the problem is that MRI keeps some unused memory
allocated and then the GC marks it dirty... So technically there's
information being used and re-used frequently, but only by the GC :(

Well ... that sounds like an actual bug rather than a design issue in
MRI. Is it that the GC can't tell it's unused?

The GC's mark and sweep 'recreates' its freelist every time it runs a
GC, so if you have a lot of free objects (believe it or not), it will
relink them all--possibly in about the same order as the previous run. A
design thing.

So this interesting point of yours may have two implications: a ruby
process that retains lots of 'free' memory will have a longer sweep time
(and having lots of free slots is quite common with the standard MRI--it
allocates exponentially larger and larger heap sizes, so with a large
process you're almost guaranteed that the last heap will be half
used), and, as you noted, the entire thing is touched on every GC
(all used slots marked as 'valid', all free slots relinked onto the
freelist).
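
A minimal Ruby model of that sweep behaviour (purely illustrative, not
MRI's actual C code):

  Slot = Struct.new(:marked)

  def sweep(heap)
    freelist = []
    heap.each do |slot|        # every slot is touched on every sweep...
      if slot.marked
        slot.marked = false    # live: clear the mark for the next cycle
      else
        freelist << slot       # ...and every dead/free slot is relinked,
      end                      # dirtying its page even if the program
    end                        # never used it
    freelist                   # the freelist is rebuilt from scratch
  end

  heap = Array.new(6) { |i| Slot.new(i.even?) }  # toy heap, half live
  p sweep(heap).size  # => 3 -- the unmarked half is relinked (and dirtied)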

The way to avoid this would be to only add to the freelist as you
deallocate objects. Then you'd avoid touching the free objects. You
could still free whole heaps the same way. If you did that, you'd still be
traversing them on every GC (to look for allocated objects that are no
longer accessible--unmarked objects), but you wouldn't be marking them
dirty.
A drawback might be a freelist that isn't 'optimized in order' or
something (probably not much of a drawback).

Another way to combat this somewhat is to use a smaller 'heap chunk' size
(instead of Ruby's exponentially growing one), as this allows chunks
to be freed more often, which means they aren't traversed
(basically you don't have as much free memory kicking around, so you
don't traverse it as much). It still leaves all the free memory to
traverse, however.
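
To see why the growth policy matters, compare the two schemes (the
numbers are made up for illustration, though MRI 1.8's heaps really do
grow by roughly 1.8x each time):

  exponential = [10_000]
  5.times { exponential << (exponential.last * 1.8).to_i }
  fixed = Array.new(6, 10_000)

  p exponential  # [10000, 18000, 32400, 58320, 104976, 188956]
                 # one live object in the newest heap pins ~190k slots
                 # that must be swept on every GC
  p fixed        # [10000, 10000, ...] -- small chunks empty out entirely
                 # far more often, so they can actually be freed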

If you wanted to avoid ever touching freed objects at all, you'd need
to create an 'allocated' list as well, so you could traverse just the
allocated list and then add to the freelist those that were freed. So
roughly a 20% increase in per-object size. Maybe a good trade-off? Tough to
tell. If I had to guess, I'd say the trade-off is worth it for large,
long-standing processes: it would use more RAM and be faster.

Maybe an optimized GC might not be such a bad idea after all :-)

-Roger


1. As far as I know, *only* MRI is "100 percent MRI compatible". :-) [...]

It's only like 99% compatible anyway; there are changes from time to time ;-)

2. I don't think MRI is a dead end at all, considering the discussions
I've seen on this list just since I got back from RubyConf. [...]

No, it's not a dead end. However, I would expect its remaining lifetime
to be something like 1-2 years. So small tweaks that bring immediate
benefit are worth it. Rewriting the GC probably is not. Even if you
managed to do it before 1.8 became obsolete, it would get intensive use
for a few months at best.
If 2.0 succeeds (and I believe it will) there will be little incentive
to use 1.8 anymore. 2.0 will be the current actively developed
interpreter, and implementing the GC there makes more sense.

Thanks

Michal

On 11/11/2007, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:

1. As far as I know, *only* MRI is "100 percent MRI compatible". :-) The other implementations are "extended subsets". JRuby is for the moment the most complete subset and has more extensions, i.e., Java libraries, an AOT compiler and all of the performance tuning that the JRuby team has done. I haven't heard much from the Parrot/Cardinal project recently, but I'm guessing we'll see IronRuby at close to the level of JRuby by early next year, and Rubinius some time in the spring.

I didn't say MRI, I said 1.8...and I said JRuby was "the most complete", not "complete", so I think we basically said the same thing as far as JRuby goes.

When you say "close to the level" do you mean performance-wise or completion-wise?

Performance-wise, I wouldn't be surprised to see IronRuby close to current JRuby in the next six months; but then we'll be another six months on from here too. Rubinius may take a bit longer, since performance is going to be a tough issue for them.

Completion-wise, Rubinius is way ahead of IronRuby, and may be a rough tie with Ruby.NET. I would expect Rubinius to stay ahead as far as API/language support for some time.

In our experience on JRuby, the last 5 or 10% of compatibility has been by far the hardest, at times requiring rewrites of key subsystems. Getting 90% complete is great, but it won't run e.g. Rails. And we still get occasional bug reports for things not working in Rails.

On another note... talking about performance before you can run apps is mostly worthless, so it seems like it would be better for alternative implementations to hold off reporting performance numbers until they can run real apps.

2. I don't think MRI is a dead end at all, considering the discussions I've seen on this list just since I got back from RubyConf. [...]

We shall see.

3. As far as I know, YARV/KRI is the only serious 1.9 implementation.

We haven't started implementing 1.9 semantics yet, but I don't expect it will take more than a couple of months when we do.

I do think that there is probably more excitement and interesting work on YARV/KRI/1.9 than there is on MRI, or for that matter any of the MRI extended subsets. But MRI is hardly a dead end IMHO.

You are entitled to your opinion. But perhaps "dead end" was a bit too strong. How about "done"? I see little more than maintenance happening on 1.8 in the future.

- Charlie

Rick DeNatale wrote:

In this implementation, Java primitives were written in Smalltalk.
IIRC this was either before the JNI existed, or the JNI evolved to
make this impractical, and IBM moved to a Java only VM.

I know that it's difficult, and probably premature to define a
standard extension interface which would work across the various
emerging ruby implementations. But without that I'm afraid that the
promise of having multiple implementations is somewhat muted.

This implies that it's only valid to call something an "extension" if it's written in C. That's a bit narrow. JRuby has far better support than MRI for writing extensions in Java, for example. Does it mean that JRuby is somehow less capable than MRI if it can't use C extensions? Of course it doesn't.

The only thing it means is that existing extensions written in C for MRI won't work in JRuby. That may limit you, if you depend on those specific extensions. But in most cases the same functionality is provided by Java libraries just as well. And even better, you don't need to compile anything. You can just call the library directly.

include Java
import java.util.concurrent.ConcurrentHashMap

chm = ConcurrentHashMap.new
chm[:bar] = 'foo'   # JRuby lets you use Ruby hash syntax on a java.util.Map
# etc

This applies to Java's GUI libraries (around which several frameworks have been written, all in Ruby), graphics libraries, network libraries, and so on. C Ruby has a much more difficult time using any of these libraries.

So I'd say it's a matter of perspective.

Can you write extensions for JRuby? Yes. Can you write them in C? Not easily.

Can you write extensions for MRI? Yes. Can you write them in Java? Not easily.

- Charlie

Charles Oliver Nutter wrote:

I think MRI is mostly a dead end at this point [...]

M. Edward (Ed) Borasky wrote:

2. I don't think MRI is a dead end at all, considering the discussions
I've seen on this list just since I got back from RubyConf. I see people
seriously proposing re-doing the garbage collector, for example, and I
see other people investing a lot of effort in tweaking Rails to use Ruby
and the underlying OS more efficiently.

In retrospect, maybe it would be worth more long-term investment for
1.9, but I still favor 'tweaks' for 1.8.6 as being useful.

So that would mean applying existing 'useful'
patches/observations/tweaks to 1.8.6 (a small amount of work), and then
doing the big jobs on JRuby or 1.9. The reason I say this is that beyond
the already existing 1.8.6 patches (there's a copy-on-write (COW)
friendly GC, and you can resize the heap chunks to keep memory low), I
don't think we'll get much more speed savings unless we drastically
alter the GC (i.e. go generational or, my favorite, reference counting).
Therefore, if we did completely rewrite the GC to get those 'real speed
boosts', the usefulness would be short-lived, as it would be
phased out because of slower overall speed compared to 1.9.

So if one were to rewrite the GC, since it's a large task, doing it for
1.9 makes sense. Small tweaks, though, don't cost much to make
(tweaking parameters, not rewriting code)--like GCC flag optimization or
smaller memory use.

I hate to say it, but with this post I'm operating under the assumption
that 1.9 is going to become the 'most popular ruby interpreter' and
hence worth investing time into. It's possible that the time
investments should be made into JRuby, should it somehow swamp the
market. For now it seems MRI 1.9 will be fastest, so most useful to
optimize. Any thoughts? Oh wait, it's like a self-fulfilling decision...
:-)
Also, I can't think of a cool JRuby project off the top of my head that
would be fun to 'fix'--I'm unfamiliar with its bottlenecks.

Have a good one!
-Roger


Roger Pack wrote:

[snip]

Maybe an optimized GC might not be such a bad idea after all :-)

I haven't been following 1.9 closely enough to know what it does about garbage collection. But yes, it does look like the MRI GC could stand some optimization. Given the trade-offs and use cases, I'd optimize for Rails. And I'm guessing that on Linux/GCC, a nice tight stop-and-copy GC might well outperform what's there, and a generational GC would be better than what's there but not worth the coding effort. I can't help you on Windows or MacOS ... the memory management there is a black box to me.

Which brings up an interesting question. While it seems more Ruby developers work with Macs than with Windows or Linux, where are most of the Ruby server applications (Rails and otherwise) deployed? I want to guess Linux, but I don't actually know for a fact that is the case.

As a previous email suggested, there are a couple of use cases for a garbage collector, only one of them being long-running server applications. But if the overwhelming majority of Ruby server applications are Rails on Linux, it would surely be worthwhile tuning the GC to that. Stop-and-copy integrated with the Linux memory manager (assume RHEL 5/CentOS 5 64-bit) sounds like a winner off the top of my head.
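
A toy sketch of the stop-and-copy idea (illustrative only; Node and the
object graph are invented for the example). Live objects reachable from
the roots are traced and copied into a fresh space, then the old space
is dropped wholesale, so collection work is proportional to live data
rather than heap size:

  Node = Struct.new(:payload, :children)

  def copy(obj, forwarded, to_space)
    return forwarded[obj.object_id] if forwarded[obj.object_id]
    twin = Node.new(obj.payload, [])
    forwarded[obj.object_id] = twin   # forwarding entry first (handles cycles)
    to_space << twin
    obj.children.each { |c| twin.children << copy(c, forwarded, to_space) }
    twin
  end

  def stop_and_copy(roots)
    to_space = []
    forwarded = {}
    roots.each { |r| copy(r, forwarded, to_space) }
    to_space                          # anything not copied is garbage
  end

  live = Node.new("root", [Node.new("child", [])])
  dead = Node.new("garbage", [])
  p stop_and_copy([live]).size        # => 2 -- "garbage" was never touched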

Hmmm ... maybe I should dual-boot my workstation with CentOS 5 and fool around with this. ;-)