Hash Surprises with Fixnum, #hash, and #eql?

Good Morning,

Please keep in mind that in a multithreaded environment there is
synchronization overhead. A solution could use an AtomicBoolean

Oh get real. This is a single variable which may, during the course of
a single execution, *once* change from false to true. In so doing, it
enables a slightly more conservative approach to compatibility for one
small side-effect of a shortcut, which probably doesn't even matter to
the application, and which is almost certainly set by the one thread
that cares about that side effect. It *so* doesn't need to be synchronised.

I'm sorry, but the old saying "people in glass houses shouldn't throw stones"
really comes to mind here. This entire thread has been a rant against
implementation semantics being "inconsistent" in 0.001% of applications (if
that), and then you come back with "oh, we can cheat here" because the flag
is "almost certainly set by the one thread that cares about that side
effect".

You don't get to cut a corner in providing a solution to a problem you
believe is cutting a corner. You are asking everyone else to take a
performance hit (however small) to fix a problem that you won't even fix
properly? That doesn't seem appropriate at all.

John

···

On Thu, Apr 21, 2011 at 5:36 AM, Clifford Heath <no@spam.please.net> wrote:

On 04/21/11 21:28, Robert Klemme wrote:

Please keep in mind that in a multithreaded environment there is
synchronization overhead. A solution could use an AtomicBoolean
stored somewhere as static final. Now all threads that need to make
the decision need to go through this. Even if it is "only" volatile
semantics (and not synchronized) and allows for concurrent reads there
is a price to pay. Using a ThreadLocal which is initialized during
thread construction or lazily would reduce synchronization overhead at
the risk of the flag value becoming outdated - an issue which becomes
worse with thread lifetime. Applications which use a thread pool
could suffer.
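
The ThreadLocal staleness risk described above can be sketched in plain Ruby. This is only an illustration under assumptions: `$redefined` is a hypothetical stand-in for a runtime-wide flag, `Thread.current[:redefined]` plays the role of a ThreadLocal copy, and nothing here reflects JRuby's actual internals.

```ruby
# Hypothetical stand-in for a runtime-wide "core method was redefined" flag.
$redefined = false

ready  = Queue.new
resume = Queue.new

worker = Thread.new do
  # Copy the flag into a thread-local once, when the thread starts.
  Thread.current[:redefined] = $redefined
  ready << :cached
  resume.pop   # wait until the main thread has flipped the flag
  # Report the cached copy next to the shared value.
  [Thread.current[:redefined], $redefined]
end

ready.pop            # the worker has cached the old value by now
$redefined = true    # the one-time false -> true transition
resume << :go

cached, shared = worker.value
puts "cached copy: #{cached}, shared flag: #{shared}"
```

The worker keeps answering from its cached copy until something refreshes it, which is why long-lived pool threads are the worrying case.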

In this case, I'm not using a synchronized, atomic, *or* boolean
field. Because of the rarity of Fixnum and Float modification and the
potential for heavy perf impact, I'm considering redefinition of
methods in one thread while another thread is calling those methods as
somewhat undefined, at least for Fixnum and Float. That's not perfect
(JVM could optimize such that one thread's modifications never are
seen by another thread), but it's closer.

Is redefinition per thread a standard JRuby feature or is this
something you would add? If it was something that would need adding I
wouldn't bother to do it. That sounds like a major change.

It's also worth pointing out that usually modifications to Fixnum or
Float are done for DSL purposes, where there's less likelihood of
heavy threading effects.

You're right, though...if I made that field volatile (it doesn't need
to be Atomic, since I only ever read *or* write, never both), the perf
impact would be higher.

Right you are. That occurred to me after posting as well but I didn't
bother to correct myself as the memory effects are identical. The
only difference is one more dereferencing (which can make a difference
as you pointed out).

But I agree, the effect vastly depends on the frequency of Hash
accesses with a Fixnum key. Unfortunately I guess nobody has figures
about this - and even if somebody did, they would probably vary widely
with the type of application.

I operate at too low a level to see the 10000-foot view of application
performance. In other words, I spend my time optimizing individual
core methods, individual Ruby language features, and runtime-level
operations like calls and constant lookup...rather than really looking
at full app performance. Once you get to the scale of a real app, the
performance bottlenecks from badly-written code, slow IO, excessive
object creation, slow libraries and other userland issues almost
always trump runtime-level speed. As an example, I point at the fact
that Ruby 1.9 is almost always much faster than Ruby 1.8, but Rails
under Ruby 1.9 is only marginally faster than Rails on Ruby 1.8 (or so
I've seen when people try to measure it).

That exactly demonstrates your point: Rails will spend most of its
time doing IO (to and from the database, to and from network clients).
OTOH the small difference shows that Rails code cannot be awfully
written because otherwise you would likely notice a bigger difference.
:)

The benefit of a faster and faster runtime is often outweighed by
writing better Ruby code in the first place. But I don't live in the
application world...I work on JRuby low-level performance. You have to
do the rest :)

Will do. :) I'd also love to lend JRuby a hand but unfortunately I
can't find the time right now.

Cheers

robert

···

On Wed, Apr 27, 2011 at 7:33 AM, Charles Oliver Nutter <headius@headius.com> wrote:

On Thu, Apr 21, 2011 at 6:28 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Ouch. No wonder it hurts. I hadn't looked into the internals of JRuby,
but I assumed that you had some native C (JNI or whatever) in there,
which could make this feasible.

I can totally understand you not wanting any native extensions though!

None of the JRuby runtime is native code, and obviously that's how
we'd like to keep it. Even if we were to do it in C, we would still
need to do some object-walking, since JRuby supports having multiple
JRuby instances in the same process (that's how a single JRuby process
can serve many different applications at the same time).

It seems that a collection of such flags at known offsets inside a
singleton instance could make this a lot quicker. There's a finite
need for such things, so it's not as though it would pervade all of
the interpreter.

Such a singleton would still need to be rooted to a specific JRuby
instance. The logic as it stands now is pretty much a
JRuby-instance-global flag.

Your argument (from a previous response) that cross-thread effects of
monkey-patching were "somewhat undefined" was my thinking also, but
contrary to John's accusation, I thought it could be (mostly?) hidden
under the existing synchronisation around method definition.

The JVM memory model allows for the JVM to optimize non-volatile
memory accesses away if it can prove the memory location is never
modified by the same thread. Only when specifying volatility will it
guarantee the memory access is always performed with full CPU cache
semantics.

I think the argument that "folk don't do it, therefore they wouldn't"
isn't a strong one. Look for example at Rails' HashWithIndifferentAccess,
which makes string and symbol keys interchangeable. I merely want to
do the same thing with Fixnums and Floats.
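
For readers who want to see the behaviour in question, here is a minimal sketch of one way to get Fixnum/Float-indifferent keys without touching Hash or the core numeric classes. `NumericIndifferentHash` and its `normalize` helper are hypothetical names made up for this illustration; this is not the library mentioned in the thread, and it sidesteps rather than answers the question of what Hash itself should do.

```ruby
# Hypothetical wrapper, not part of Ruby or Rails.
class NumericIndifferentHash
  def initialize
    @store = {}
  end

  def []=(key, value)
    @store[normalize(key)] = value
  end

  def [](key)
    @store[normalize(key)]
  end

  def key?(key)
    @store.key?(normalize(key))
  end

  private

  # Map a Float with an integral value (1.0, 42.0) to the matching
  # integer, so both spellings land on the same hash key.
  def normalize(key)
    if key.is_a?(Float) && key.finite? && key == key.floor
      key.to_i
    else
      key
    end
  end
end

# A plain Hash compares keys with #hash and #eql?, and 1.eql?(1.0) is false.
plain = { 1 => :one }
puts plain[1.0].inspect   # nil

h = NumericIndifferentHash.new
h[1] = :one
puts h[1.0].inspect       # :one
```

Normalizing on the way in keeps the cost inside the wrapper, so ordinary Hash lookups with numeric keys pay nothing extra.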

Certainly...but at the moment no implementations agree on what is
correct. Your vision might be correct, but until it's "standard" we
would probably not implement it.

If you can't make it quicker (than you outlined), best to drop it I
guess. I've mostly worked around the need for my current library.

It's probably possible to reduce the cost under Java 7, which includes
capabilities to have nearly guard-free dynamic invocation. But there's
no way I know of to make this completely cost-free under Java 6, which
we still support.

- Charlie

···

On Sat, Apr 30, 2011 at 2:45 AM, Clifford Heath <no@spam.please.net> wrote:

Is redefinition per thread a standard JRuby feature or is this
something you would add? If it was something that would need adding I
wouldn't bother to do it. That sounds like a major change.

It's something I've thought about, a la Groovy's "categories"
(thread-downstream selector namespacing, basically), but no, there's
no support for this currently in JRuby (and even in Groovy it is an
oft-maligned feature, since it imposes a perf penalty even if you
aren't using it since you have to constantly check a thread-local for
an installed category.)
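
The penalty described above can be sketched roughly as follows. `CategoryDispatch` is a hypothetical helper written for this illustration, and it assumes a simple thread-local table of overrides; neither Groovy nor JRuby is claimed to dispatch exactly this way.

```ruby
# Hypothetical dispatcher, for illustration only.
module CategoryDispatch
  # Run the block with `overrides` (method name => lambda) active
  # for the current thread only, restoring the previous state afterwards.
  def self.with_category(overrides)
    previous = Thread.current[:category]
    Thread.current[:category] = overrides
    yield
  ensure
    Thread.current[:category] = previous
  end

  # Every call pays for the thread-local lookup, even when no
  # category is installed; that lookup is the overhead in question.
  def self.call(receiver, name, *args)
    category = Thread.current[:category]
    if category && category.key?(name)
      category[name].call(receiver, *args)
    else
      receiver.public_send(name, *args)
    end
  end
end

multiply = lambda { |a, b| a * b }

normal = CategoryDispatch.call(2, :+, 3)
inside = CategoryDispatch.with_category(:+ => multiply) do
  CategoryDispatch.call(2, :+, 3)
end
other_thread = CategoryDispatch.with_category(:+ => multiply) do
  Thread.new { CategoryDispatch.call(2, :+, 3) }.value
end

puts [normal, inside, other_thread].inspect
```

The override is visible only on the thread that installed it, but the check in `call` runs on every thread for every call.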

Will do. :) I'd also love to lend JRuby a hand but unfortunately I
can't find the time right now.

Well, you can always help by submitting docs, editing wiki, answering
JRuby ML questions, answering StackOverflow questions, submitting
JRuby talks to conferences, presenting at user groups and
clubs...perhaps there's something that fits your schedule better :)

- Charlie

···

On Wed, Apr 27, 2011 at 1:44 AM, Robert Klemme <shortcutter@googlemail.com> wrote: