Hi,
> There's a set of background (Sidekiq) jobs that fail at a rate of about one
> in a million.
This sounds to me like a race condition. Check whether any data is
shared between multiple threads and, if so, whether it is properly
guarded. (I don't know Sidekiq, though, and for the rest of this answer
I am going to assume it spawns threads and not processes.)
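To illustrate what "properly guarded" means here, a minimal sketch: the
counter and the thread count below are made up for the example and have
nothing to do with your actual job code. Every access to the shared value
goes through one Mutex.

```ruby
# Hypothetical shared state: a counter incremented by several threads.
counter = 0
lock = Mutex.new

threads = Array.new(4) do
  Thread.new do
    1_000.times do
      # Only one thread at a time may run the read-modify-write below.
      lock.synchronize { counter += 1 }
    end
  end
end

# Wait for all threads before reading the result.
threads.each(&:join)
puts counter
```

Anything that several jobs read and modify without such a lock is a
candidate for the kind of rare failure you describe.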
> ArgumentError: wrong number of arguments (-2 for 2)
Now that looks interesting. It is possible on the Ruby C side to define
a method that takes -2 arguments (arity -2) using this call:
rb_define_method(klass, "methodname", funcptr, -2);
-2 is treated as a special indicator rather than the expected number of
arguments; it is a way on the C side to define a splat (rest)
argument. As per the docs[1], -2 causes Ruby to call the C function with
the arguments passed as a Ruby Array instance, whereas -1 would pass
them as a C array of VALUE objects.
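Negative arities also show up on the Ruby side, which may help to see why
a negative number belongs to the method definition and not to the argument
list. A small sketch with made-up method names; note this is the Ruby-level
convention of Method#arity, which is related to, but not the same thing as,
the -1/-2 markers of rb_define_method:

```ruby
# Hypothetical methods, defined only so their arity can be inspected.
def fixed(a, b); end
def one_plus_rest(a, *rest); end
def rest_only(*args); end

fixed_arity = method(:fixed).arity                 # two required arguments
one_plus_rest_arity = method(:one_plus_rest).arity # one required, then a splat
rest_only_arity = method(:rest_only).arity         # splat only

puts fixed_arity          # 2
puts one_plus_rest_arity  # -2
puts rest_only_arity      # -1

# Calling a fixed-arity method with too few arguments raises ArgumentError.
# The exact message wording differs between Ruby versions.
begin
  fixed("only one")
  raised = false
rescue ArgumentError => e
  raised = true
  puts e.message
end
```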
However, what you have is the inverse: the method has an arity of 2,
but -2 arguments were passed to it. That would mean rb_funcall() was
called with its `argc' parameter set to -2, which to my knowledge is
not possible.
It follows, at least, that your mail subject is misleading: your method
has a positive arity of 2 (which is correct); it is the argument list
passed to that method that has a length of -2 (which is unexpected).
> * Secondly, the line that calls the method that throws said exception is of
> the form:
>   method_that_raises not_nil_string, {
>     :inline_hash => 'something' }
> So it's particularly confusing because this is not a dynamic call.
The exception you are getting implies that something went wrong on the
C side of Ruby. The two arguments (the String and Hash instances) were
passed to the C side, but before they could be counted to determine the
value of `argc', something happened that caused `argc' to be set to
this weird value of -2. So it is likely that after the Ruby side was
left, but before the C code had constructed the argument list for
rb_funcall(), the thread got descheduled and some other thread got
scheduled that interfered with the argument list. Double-check whether
your `not_nil_string' is a shared resource.
If I am right and we are looking at a race condition, you can make the
failure more likely by spawning more threads. That forces the scheduler
to switch tasks more often, which raises the chance of a switch
happening in exactly the moment before `argc' is assigned.
Still, even if there is a race condition, it should not be possible to
deschedule a thread that is in the middle of constructing the
rb_funcall() call. The exception you get looks like a Ruby bug to me.
> inline with only the curly on the calling line and it being some sort of
> parsing weirdness?
I am pretty sure this is not the case. Your Ruby code gets parsed to an
Abstract Syntax Tree and then compiled to YARV bytecode before it is
executed. Any newlines in the original source file will have vanished by
the time the code is actually executed. If you want to verify, put the
brace on the same line and retry.
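A quick way to check this, as a sketch: the method body below is a
stand-in I made up that only returns what it received, since I don't know
what your real method_that_raises does. Both spellings of the call should
hand over the same two arguments.

```ruby
# Stand-in for the real method; it only returns what it was given.
def method_that_raises(string, options)
  [string, options]
end

not_nil_string = "some string"

# Call written as in the original report, brace on the calling line.
split_call = method_that_raises not_nil_string, {
  :inline_hash => 'something' }

# Same call with the whole hash on one line.
same_line_call = method_that_raises not_nil_string, { :inline_hash => 'something' }

puts split_call == same_line_call
```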
> Has anyone even seen something nonsensical like this
> before? Really what I'm looking for is any kind of clue on what the cause
> of this error might be and how to address it.
You'll need to try to create a minimal demonstration example we can
fiddle with. Given how rarely the incident occurs, that might be
difficult, but as I said, try throwing more threads at the code and see
whether that increases the failure rate.
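A possible starting point for such a demonstration, as a rough sketch
only: method_under_test, the shared string and the thread and iteration
counts are all assumptions of mine, not taken from your code. It hammers
one two-argument method from several threads and counts how often
ArgumentError comes back.

```ruby
# Hypothetical stand-in for the method called by the failing job.
def method_under_test(string, options)
  string.length + options.size
end

# Calls the method from thread_count threads, iterations times each,
# and returns the total number of ArgumentErrors that were rescued.
def stress(thread_count:, iterations:)
  shared_string = "shared" # deliberately shared between all threads

  threads = Array.new(thread_count) do
    Thread.new do
      errors = 0
      iterations.times do
        begin
          method_under_test shared_string, {
            :inline_hash => 'something' }
        rescue ArgumentError
          errors += 1
        end
      end
      errors # becomes the thread's value
    end
  end

  # Thread#value waits for the thread and returns its last expression.
  threads.sum(&:value)
end

puts stress(thread_count: 8, iterations: 10_000)
```

On an interpreter without the problem I would expect this to report no
errors at all; raise both numbers and swap in your real method to see
whether the failure can be provoked.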
Greetings
Marvin
On Fri, Sep 09, 2016 at 03:22:50PM -0400, Jonathan Calvert wrote:
--
Blog: http://www.guelkerdev.de
PGP/GPG ID: F1D8799FBCC8BC4F