Two oddities in ruby 1.8.0p2

OK, recently a cool feature was added to ruby that lets you assign
directly into a hash of hashes, hash of arrays, etc. as follows:

x = Hash.new{|a,b| a[b] = Hash.new}
y = Hash.new{|a,b| a[b] = Array.new}

x['a']['b'] = 'hello' # -> {"a"=>{"b"=>"hello"}}
y['a'][3] = 100 # -> {"a"=>[nil, nil, nil, 100]}

However, this is not possible for arrays, so we cannot do the
following for arrays of hashes or arrays of arrays:

x = Array.new{|a,b| a[b] = Hash.new}
y = Array.new{|a,b| a[b] = Array.new}

I’m hoping this is just an oversight, because it would be useful.
Comments?
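
For reference, a minimal sketch of two ways to get a similar effect for arrays, assuming the block form Array.new(size) { ... } or on-demand assignment with ||= is an acceptable substitute. The method names build_rows and build_grid are made up for illustration only.

```ruby
# Block form of Array.new: the block runs once per index when the array
# is created, so every slot gets its own Hash.
def build_rows
  rows = Array.new(3) { Hash.new }
  rows[0]['b'] = 'hello'
  rows
end

# For indexes that do not exist yet, create the inner array on demand.
def build_grid
  grid = []
  (grid[2] ||= [])[3] = 100
  grid
end

p build_rows
p build_grid
```

Unlike the Hash default block, neither form fires automatically on a missing index; the size (or the ||=) has to be written out by hand.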

Another oddity is one that I think has been around for a while, but I
haven’t seen it discussed before (correct me if I’m wrong – I’m new
to ruby). It’s the infamous local variable scope issue. In the
discussions I’ve read, methods are supposed to define a scope for
local variables. Blocks within methods also have their own scope, if
the variables have not been used already. BUT control structures like
if-else, for, while, etc. are NOT supposed to define their own scope.
How, then, to explain this example?

i = 0
while i < 10
  puts x if defined? x
  x = 10
  i += 1
end

Nothing is outputted. The value of x is indeed local to the while
loop; it becomes undefined again each time the loop restarts. On the
other hand, if x is first defined outside the loop, this does not
occur. I thought a while loop was supposed to act differently from a
block?

Please enlighten me,
Dave

Another oddity is one that I think has been around for a while, but I
haven’t seen it discussed before (correct me if I’m wrong – I’m new
to ruby).

Oh, it has been discussed many a time :-)

It’s the infamous local variable scope issue. In the
discussions I’ve read, methods are supposed to define a scope for
local variables. Blocks within methods also have their own scope, if
the variables have not been used already. BUT control structures like
if-else, for, while, etc. are NOT supposed to define their own scope.
How, then, to explain this example?

i = 0
while i < 10
  puts x if defined? x
  x = 10
  i += 1
end

Nothing is outputted.

It’s nothing to do with blocks. It’s because of the method/local variable
ambiguity.

The decision “is x a local variable or is it a method?” is made
statically, at compile time, as the code is being read in and turned into
a syntax tree. The rule is that if any statement of the form ‘x = …’ has
been seen earlier, then a standalone ‘x’ is treated as a local variable, not
a method call.

In your example, at the line ‘puts x if defined? x’, no local variable
assignment to x has been seen previously. Therefore this is compiled as a
method call:

puts self.x if defined? self.x

which never prints anything: you never define a method ‘x’, so the
‘defined?’ test is false on every iteration.

You can see it in this example:

def x
  33
end
i = 0
while i < 10
  puts x # compiled as puts self.x
  x = 99
  puts x # compiled as puts x (the local variable)
  i += 1
end

which prints 33 99 33 99 33 99 …

The assignment does not even have to be executed:

def y
  33
end
y = 99 if false
p y # prints nil

The last reference to ‘y’ is interpreted as a local variable because an
assignment to y was seen earlier during the parsing of this method, even
though at run-time it never gets executed.
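
A small sketch that makes the parse-time decision visible, using defined? before and after an assignment that never runs. The method name probe is made up for illustration, and the strings returned by defined? are as I understand current Ruby to report them.

```ruby
def y
  33
end

def probe
  before = defined?(y)   # no assignment to y seen yet: y is a method call
  y = 99 if false        # never executed, but the parser has now seen "y = ..."
  after = defined?(y)    # y is now a local variable, whatever its value
  [before, after, y]
end

p probe
```

The last element is nil: the local variable exists from the parser's point of view, but nothing was ever assigned to it.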

Some other discussions are archived here:

Cheers,

Brian.


On Thu, Apr 24, 2003 at 08:04:28AM +0900, Dave wrote:

Brian Candler B.Candler@pobox.com wrote in message news:20030424075455.GB32891@uk.tiscali.com

It’s nothing to do with blocks. It’s because of the method/local variable
ambiguity.

The decision “is x a local variable or is it a method?” is made
statically, at compile time, as the code is being read in and turned into
a syntax tree. The rule is that if any statement of the form ‘x = …’ has
been seen earlier, then a standalone ‘x’ is treated as a local variable, not
a method call.

Oh, right… yeah, if I knew it had anything to do with that, I could
have found the discussions.

I’m still a bit confused, though. I’ve read messages from matz here
saying that he is against declarations that are for the sole purpose
of helping the compiler figure out what’s going on. And yet that is
effectively what must be done in this case – some declaration like x
= nil. It seems that the compiler has no lookahead – when it sees
“puts x if defined? x” it must decide then and there, based on what it
has seen until that point, if x is a variable or method. It doesn’t
matter that even the very next line uses x as a variable.

Is there a particular reason that such decisions can’t be postponed
until the rest of the current scope region has been looked at? The
semantics would make a lot more sense then, I think, even though it
might not take care of everything (like eval “puts x” for example,
before x is defined).

Anyway, I don’t quite understand the rationale for having it behave as
it does now. I’ll keep reading old threads. Thanks for the
explanation.

Dave

I’m still a bit confused, though. I’ve read messages from matz here
saying that he is against declarations that are for the sole purpose
of helping the compiler figure out what’s going on. And yet that is
effectively what must be done in this case – some declaration like x
= nil.

Well, I agree that it’s a declaration in all but name :-)

You still need some disambiguating rule as to whether “x” means a local
variable or method self.x

Ruby’s way is surprising to people new to Ruby, but it’s easy to get used
to. The alternatives?

  • explicit declarations
    or
  • prefixing all local variables with some symbol, e.g. %x
    or
  • forcing all method calls to have an explicit receiver

They would all make Ruby look more like Perl, and we wouldn’t want that :-)

The current behaviour works in most cases, and if you really do need to
write a loop which reads a variable before it’s been assigned, then a quick
‘x = nil’ above it sorts it out. In fact under the current scoping rules you
need that outside a block iterator anyway, otherwise x is local to the block
itself and gets a fresh instance for each iteration.
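
A minimal sketch of that block-scoping point. The method names block_local_counts and shared_counts are made up for illustration.

```ruby
# counter is first assigned inside the block, so it is local to the block
# and every iteration starts with a fresh (nil) variable.
def block_local_counts
  seen = []
  3.times do
    counter ||= 0
    counter += 1
    seen << counter
  end
  seen
end

# counter is assigned above the block, so all iterations share it.
def shared_counts
  seen = []
  counter = nil
  3.times do
    counter ||= 0
    counter += 1
    seen << counter
  end
  seen
end

p block_local_counts  # [1, 1, 1]
p shared_counts       # [1, 2, 3]
```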

It seems that the compiler has no lookahead – when it sees
“puts x if defined? x” it must decide then and there, based on what it
has seen until that point, if x is a variable or method. It doesn’t
matter that even the very next line uses x as a variable.

Is there a particular reason that such decisions can’t be postponed
until the rest of the current scope region has been looked at?

Ruby’s parser works in a single pass. This does have some very useful benefits - for
example you can pass in a program on stdin via a pipe, and it can compile
cleanly without having to buffer up potentially the whole source.

The
semantics would make a lot more sense then, I think, even though it
might not take care of everything (like eval “puts x” for example,
before x is defined).

Well it would still make sense - x would be a local variable containing
‘nil’ at that point. I’m not sure the rule would be much simpler to
understand than the current behaviour though, or worth the complexity to
implement.

It would also mean that you couldn’t test your script using irb:

def x
  99
end
p x
x = 4

Under your rules this should print ‘nil’, which requires looking into the
future :-) It’s effectively the same as the one-pass issue: your proposal
would mean that you can’t compile anything until the end of the current
scope. Being able to compile and run a line at a time makes interactive
programming much more feasible.

Regards,

Brian.


On Fri, Apr 25, 2003 at 02:51:41AM +0900, Dave wrote:

On Thu, 24 April 2003 20:13, Brian Candler wrote:

The alternatives?

  • explicit declarations
    or
  • prefixing all local variables with some symbol, e.g. %x
    or
  • forcing all method calls to have an explicit receiver

Or requiring x() to be written for nullary methods called without a receiver.
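
Ruby already accepts that spelling as an explicit way to disambiguate. A short sketch (the method name demo is made up for illustration):

```ruby
def x
  99
end

def demo
  x = 4
  # Bare x is the local variable; x() and send(:x) always call the method.
  [x, x(), send(:x)]
end

p demo  # [4, 99, 99]
```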



__("< Marcin Kowalczyk
__/ qrczak@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/