WARNING: rather long.
The problem is simple:
Q: Why isn’t “x++” just syntax sugar for “x += 1”?
A: Because in C, for example, “x++” doesn’t mean “x += 1”!
int x, y, z;
x = 5;
y = x++; // y == 5
x = 5;
z = x += 1; // z == 6
It would confuse and upset C programmers, so best not to put it in.
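For comparison, here is a minimal sketch of the same two cases in today's Ruby. The method names are made up for illustration; the point is only that `x += 1` used as an expression yields the new value, so a C-style postfix increment has to save the old value by hand.

```ruby
# Emulates C's `y = x++`: keep the old value, then rebind x.
def post_increment_demo
  x = 5
  y = x      # y keeps the value x had before the increment
  x += 1
  [x, y]
end

# Mirrors C's `z = x += 1`: the assignment expression yields the new value.
def compound_assignment_demo
  x = 5
  z = (x += 1)
  [x, z]
end

p post_increment_demo        # prints [6, 5]
p compound_assignment_demo   # prints [6, 6]
```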
Remember, the following code could only be intended to mean two things:
x = 5
x++
Either you want “++” to be a method, so you really meant to call the “++”
method of the object 5; or you want “++” to be syntax sugar, which we have
already seen is a bad idea (confusing and counter-intuitive).
I believe there’s yet another (insane) answer:
trying to apply ++ to immediate objects yields an error, because the
associated Variable object is nil.
I think this leads to hell, but it’s seductive enough to toy with it };^>
WARNING: here follows a rather long Gedankenexperiment, and the
reasoning path is shown almost complete (although I had to edit it a
couple times, in an iterative approach).
If we went along the line of “everything is an object”, variables
themselves could be objects. In an expression matching
RE1 = /[_a-z][_a-zA-Z0-9]*#{op}/ (where
op is some kind of operator such as ‘++’), the operator would
correspond to the call of a method of the variable object. The list of
such methods (seen as operators from the point of view of the value being
referenced to, but as methods from that of the variable in use) would be
hard-coded into the interpreter.
This is admittedly confusing, so perhaps the following will render it
clearer:
a = 1      # two objects involved here: the immediate ‘1’
           # and a Variable whose value can be represented as :a (it’s
           # related to symbols in a way, but a Variable object also has
           # some information on the scope of the var, not only the name)
p a == 1   # instance method ‘==’ called on immediate value object ‘1’
a++        # instance method ‘++’ applied to an anonymous object of class
           # Variable, holding a reference to the immediate value object
           # of class Fixnum ‘1’
I will call such ‘instance method of objects of class Variable’
vmethods (stands for variable methods).
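To make the "name plus scope" idea a bit more concrete, here is a rough sketch that can be run in current Ruby. It assumes Ruby 2.1 or later for `Binding#local_variable_get` and `Binding#local_variable_set`; the `Variable` class, its `referee` accessors and `variable_demo` are my own stand-ins, not anything the interpreter provides.

```ruby
# Hypothetical Variable object: a name plus the scope the name lives in.
class Variable
  def initialize(scope, name)
    @scope = scope  # a Binding, i.e. the scope information
    @name = name    # a Symbol such as :a
  end

  # The object the variable currently refers to.
  def referee
    @scope.local_variable_get(@name)
  end

  # Rebind the variable to another object.
  def referee=(value)
    @scope.local_variable_set(@name, value)
  end
end

def variable_demo
  a = 1
  var = Variable.new(binding, :a)
  same = (a == 1)                 # regular method: `==` is sent to the value 1
  var.referee = var.referee + 1   # variable-level operation: rebinds `a`
  [same, a]
end

p variable_demo  # prints [true, 2]
```

The regular method never touches the binding, while the variable-level operation changes what `a` points to, which is the distinction vmethods are meant to capture.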
When finding some expression matching RE1, the interpreter would have to
decide whether to call a regular method on the object being referred to
by the variable or the vmethod of the latter. A number of operator
look-alike tokens could be chosen to always mean ‘this is a vmethod’.
An obvious candidate would be ‘++’.
Such vmethods could be defined internally (that is, implemented in C in
Ruby’s AST walker), but another possibility would be to make the code
more fully reflective:
class Variable # all vars will derive from this one
  def ++(*args)
    old = @referee
    @referee = @referee.next # (or += 1 or something)
    old
  end
  # @referee is a magical attribute defined by Ruby
  # this is the new ‘reflective’ thing needed for this code to be
  # possible (besides the new vmethod semantics)
end
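Since `++` is not a legal method name in real Ruby, the closest runnable approximation I can think of is an explicit box object. This is only a sketch: `post_incr` stands in for the `++` vmethod, and `@referee` is an ordinary instance variable here, not the magical attribute described above.

```ruby
# Hypothetical stand-in for the proposed Variable class.
class Variable
  attr_reader :referee

  def initialize(referee)
    @referee = referee
  end

  # Postfix increment: rebind to the successor, hand back the old object.
  def post_incr
    old = @referee
    @referee = @referee.succ
    old
  end
end

a = Variable.new(1)
p a.post_incr  # prints 1 (the old value)
p a.referee    # prints 2 (the variable now refers to the successor)
```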
Notice that, given some kind of Number class with Number#puts
the following would happen
a = Number.new(1)
(a++).puts # writes ‘1’
this seems to be what we want for the postfix ‘++’.
Then we might want to overload the ‘++’ vmethod (no longer an operator
if you consider from the Variable point of view), and a natural
extension in Ruby would be
a = getSomeIterator()
class << a
  # we want to add (or overload) instance methods of the singleton class of
  # the Variable object!!!
  # that is, create a singleton vmethod
  def ++(*args)
    # this is meant to be an iterator
    @referee.pointToNextElement
  end
end
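The per-variable overload can be approximated today with an ordinary singleton method on a box object. In this sketch the `Variable` class and `post_incr` are hypothetical stand-ins as before, and a plain `Enumerator` plays the part of the iterator (`Enumerator#next` replaces the made-up `pointToNextElement`).

```ruby
# Hypothetical box standing in for the proposed Variable class.
class Variable
  def initialize(referee)
    @referee = referee
  end

  # Default behaviour: rebind to the successor, return the old object.
  def post_incr
    old = @referee
    @referee = @referee.succ
    old
  end
end

def iterator_demo
  a = Variable.new([10, 20, 30].each)  # the referee is an Enumerator
  class << a
    # Singleton "vmethod": only this one variable advances an iterator.
    def post_incr
      @referee.next
    end
  end
  [a.post_incr, a.post_incr]
end

p iterator_demo  # prints [10, 20]
```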
So far so good, we have
- a clean separation between regular instance methods and vmethods
(normally distinguished at the lexical level)
- a way to overload these vmethods per variable (singleton vmethods)
I can already hear you cry about prefix (no-longer-)operators,
and we’re all wondering about the possible use of *args…
The answer is of course… redefining the ‘++’ vmethod of the variable
referring to the top-level object!!!
self # -> top-level object, variable of type Object
class Variable
  def pre++
    @referee = @referee.next # or += 1, default implementation
  end
end
class << self
  def ++(arg) # arg is the ref to the rhs Variable object
    arg.send :pre++
  end
end
a = 1 # a refers to object '1'
++a # now points to object '2'
puts(++a) # puts '3'
Why did I write the ()'s? Well, we need some way to have Ruby know that
it is to do
puts(++a)
and not
(puts++) a
for the second one would try to call the ‘++’ vmethod of nil (because no
variable named ‘puts’ exists). We could say that the ‘++’ “in the
middle” always binds to the variable on its right, but I’ll later give one
reason why we might not want this rule.
So, looking for another solution, the following comes to mind:
I think matz’s last thought on style (and we all trust him on this) is
that parentheses are a good thing on many occasions (I think Ruby was
beginning to warn about parentheses being needed in future versions in
some contexts), and this one seems at first sight to be one of them.
But then we realize that this is in fact the very same problem we have
with methods vs local variables, and the trick there was to draw the
difference during parsing. Quoting from the Pickaxe:
\begin{quote}
When Ruby sees a name such as ``a’’ in an expression, it needs to
determine if it is a local variable reference or a call to a method with
no parameters. To decide which is the case, Ruby uses a heuristic. As
Ruby reads a source file, it keeps track of symbols that have been
assigned to. It assumes that these symbols are variables. When it
subsequently comes across a symbol that might be either a variable or a
method call, it checks to see if it has seen a prior assignment to that
symbol. If so, it treats the symbol as a variable; otherwise it treats
it as a method call. As a somewhat pathological case of this, consider
the following code fragment, submitted by Clemens Hintze.
def a
  print "Function 'a' called\n"
  99
end
for i in 1..2
  if i == 2
    print "a=", a, "\n"
  else
    a = 1
    print "a=", a, "\n"
  end
end
Produces:
a=1
Function 'a' called
a=99
\end{quote}
This is the very same kind of problem we’re facing here!!!
In ‘puts ++ a’ we don’t know what to bind ‘++’ to. A heuristic like the
former would mean that ‘puts ++ a’ means
- puts(++a) if there’s been no assignment of the form ‘puts=…’ before
- puts++(a) if there has been one. Then fail horribly (or not, more on
  that later…) because functions (or rather methods) are not first
  class values in Ruby (so we cannot meaningfully ‘increment one’)
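Current Ruby already applies this kind of heuristic to the unary/binary minus, which makes for a handy runnable analogy. In the sketch below (`b` and `ambiguity_demo` are made-up names) the same text `b -1` is parsed in two different ways depending on whether an assignment to `b` has been seen earlier; Ruby may print an "ambiguous first argument" warning when run with `-w`.

```ruby
def b(arg = 0)
  "method b called with #{arg}"
end

def ambiguity_demo
  before = b -1   # no assignment to b seen yet: parsed as b(-1)
  b = 10
  after = b -1    # b is now a local variable: parsed as b - 1
  [before, after]
end

p ambiguity_demo  # prints ["method b called with -1", 9]
```

A ‘puts ++ a’ rule along the lines described above would work the same way: what the operator attaches to would depend on what the parser already knows about the name on its left.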
Yet another easier way (which enforces a particular spacing) is having
to attach ‘++’ to the variable it’s applied to:
puts++a   # illegal, syntax error
puts ++ a # illegal, what is it attached to?
puts++ a  # do (puts++)(a)
The last statement would normally fail because we cannot ‘increment’ a
method. But what if methods were first class values in Ruby?
class << Kernel.puts
  def ++
    # do some magic with @referee
  end
end
Unless Ruby becomes another FP language, the latter is useless (or rather
impossible to use).
But there’s even another new possibility if we have Variable objects:
implementing the ‘()’ “around-fix” vmethod:
class Function
  def initialize(val)
    @val = val
  end
end
b = Function.new(0)
class << b
  def (*args) # maybe we need something to name this ‘operator()’
    # do something, such as…
    ret = @val + args.size
    puts "#{@val} + number of args => #{ret}"
    ret
  end
end
b("one", "two", "three") # => 3
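The nearest thing real Ruby offers to an ‘operator()’ is a `call` method, which can be invoked with the `b.(...)` shorthand (note the dot). A minimal sketch, with `call_demo` as a made-up wrapper name:

```ruby
class Function
  def initialize(val)
    @val = val
  end
end

def call_demo
  b = Function.new(0)
  class << b
    # Ruby's closest analogue to an "around-fix" (): b.(...) invokes call.
    def call(*args)
      @val + args.size
    end
  end
  b.("one", "two", "three")
end

p call_demo  # prints 3
```

The dot is exactly what the proposal would do away with: the plain `b(...)` form is always treated as a call to a method named `b`, never as something applied to the local variable.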
We then need the quite complex heuristic introduced before, instead of
taking “‘puts++a’ means puts(++a) always” because we might want to be
able to omit the parentheses, mirroring what is happening now in Ruby
(continues the last example)
class << b
  def ++
    @val += 1
  end
end
a = 1
b ++ a # => 1
b a # => 2
++b a # => 3
b a # => 3
There’s only one problem left: what happens in
class A
a = 1
++a # calling the ++ vmethod of the ‘Variable’ A
end
Ruby will try to run (check the example about redefining ‘++’ for the top-level
object) the ‘++’ vmethod of the ‘A’ constant. There are two problems there:
- the vmethod is called on an object of class Variable which happens to
  be referring to a constant. We solve it by changing the name of that
  class to something more sensible: NameBinding or similar
- that vmethod is not defined!!! Solved if we define
    class Object
      def ++(arg) # arg is the ref to the rhs NameBinding object
        arg.send :pre++
      end
    end
This will work as it’s the only way that vmethod can be used, for in
every other (postfix) context ‘++’ is taken as the vmethod related to
the pertinent NameBinding.
You might wonder where this ‘pre++’ came from… it was in fact already
present in the third example, since I cheated and modified it while
writing these lines. It is a vmethod used to do the prefix ++ thing.
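The prefix dispatch can be emulated with ordinary names. In this sketch everything is a stand-in: `NameBinding` is a plain box, `pre_incr` and `post_incr` replace ‘pre++’ and ‘++’, and the top-level `prefix_incr` plays the role of the proposed Object#++(arg).

```ruby
# Hypothetical box standing in for the proposed NameBinding class.
class NameBinding
  attr_reader :referee

  def initialize(referee)
    @referee = referee
  end

  # Prefix form: rebind first, return the new object.
  def pre_incr
    @referee = @referee.succ
  end

  # Postfix form: rebind, but return the old object.
  def post_incr
    old = @referee
    @referee = @referee.succ
    old
  end
end

# Stand-in for Object#++(arg): forwards to the binding on its right.
def prefix_incr(arg)
  arg.send(:pre_incr)
end

a = NameBinding.new(1)
p prefix_incr(a)  # prints 2
p prefix_incr(a)  # prints 3
p a.post_incr     # prints 3 (old value)
p a.referee       # prints 4
```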
Finally, there’s the issue of other “NameBindings” such as
- attributes
- locals inside blocks
The latter are handled exactly the same way; for the former, the parser
must be wise enough to see that @a++ means (@a)++ and not @(a++).
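For attributes, a binding object can be sketched in current Ruby with `instance_variable_get` and `instance_variable_set`. `IvarBinding`, `Counter` and `bump` are hypothetical names chosen for the example; `bump` does by hand what (@a)++ would do under the proposal.

```ruby
# Hypothetical name binding for an instance variable: owner plus name.
class IvarBinding
  def initialize(owner, name)
    @owner = owner
    @name = name  # e.g. :@a
  end

  # Postfix increment on the attribute: rebind it, return the old object.
  def post_incr
    old = @owner.instance_variable_get(@name)
    @owner.instance_variable_set(@name, old.succ)
    old
  end
end

class Counter
  attr_reader :a

  def initialize
    @a = 1
  end

  # What (@a)++ would amount to inside the object.
  def bump
    IvarBinding.new(self, :@a).post_incr
  end
end

c = Counter.new
p c.bump  # prints 1
p c.a     # prints 2
```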
One last concern: what is the meaning of
a = 1
(a++)++ # a++++ ?
Surprisingly enough, the proposed implementation of NameBinding#++
does the right thing!
(a++)++ # => 1 a == 3
We just have to remember that when given a receiver, the #++ vmethod
is always called on the related variable, not on the “value”. This
relates the #++ distinction issue to the way access restrictions are done
(ie no receiver allowed => use self).
BTW, this will fail:
1++++ # ‘++’ called on NameBinding nil
and it’s what we had always wanted, for changing such an immutable
object is meaningless.
This thought experiment shows what would happen if things were
evaluated in two different contexts: “value” and “name binding”
(== variable, but not always). So you can relate it to Perl’s contexts,
and dislike it for that reason.
You might also find it to be some kind of “call by name”-like semantics
extended to the whole language, and detest it for that reason.
Maybe you can associate it with the complexity of Common Lisp’s macros,
and detest it for that reason.
Or you could think that this is nothing but C++ in disguise, with
references à la Java working like pointers, and operator(), and abhor
it for that reason.
If you actually like it (or thinking about it), you’re really into
nasty things. I am too.
My head hurts slightly now.
I’d anyway appreciate comments on why this is foolish.
Were it not, and had nobody else thought of it before (which I doubt),
I’d retain all the credit for myself }}
On Sun, Dec 01, 2002 at 06:22:27AM +0900, Chris wrote:
–
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com
Not only Guinness - Linux is good for you, too.
– Banzai on IRC