CORE - Inconsistent Handling of Uninitialized Variables

puts "\n== Testin in MAIN Context =="

local = 'local'
@instance = 'instance'
@@class = 'class'
$global = 'global'

puts "#@instance, #@@class, #$global, #{local}"

begin puts $empty_global == nil rescue puts "undefined" end
begin puts @empty_instance == nil rescue puts "undefined" end
begin puts empty_local == nil rescue puts "undefined" end
begin puts @@empty_class == nil rescue puts "undefined" end

class VarTest
  puts "\n== Testin in Class Context =="

  local = 'local'
  @instance = 'instance'
  @@class = 'class'
  $global = 'global'

  puts "#@instance, #@@class, #$global, #{local}"

  begin puts $empty_global == nil rescue puts "undefined" end
  begin puts @empty_instance == nil rescue puts "undefined" end
  begin puts empty_local == nil rescue puts "undefined" end
  begin puts @@empty_class == nil rescue puts "undefined" end

end
#OUTPUT

== Testin in MAIN Context ==
instance, class, global, local
true
true
undefined
undefined

== Testin in Class Context ==
instance, class, global, local
true
true
undefined
undefined

···

-

The inconsistency:

become nil, do not raise error:
$empty_global
@empty_instance

are undefined, raise an error:
empty_local
@@empty_class

-

Is this a defect or is there an explanation for this behaviour?

.

--
http://lazaridis.com

I can speak to local variables; class variables still break my brain a little.

Local variables are created during parsing and initialized to nil when they
are first encountered. They are available for use at any point in the same
scope lexically after their initialization.

This is done because if you attempt to read a local variable lexically before it has
been introduced in a local context, we can be sure you have made an error.

For global variables and instance variables, we cannot have such lexical
guarantees. While it may be obvious within the scope of a simple program
that a global or ivar has not yet been introduced, it is not a local property.
Thus, access is permitted; if the variable has not been initialized, then it
is initialized to nil.

The same occurs when Ruby sees an introduced, but uninitialized, local
variable:

def foo(y)
  if false
    x = y
  end
  p x
end
foo(10)

#=> nil

The local is seen at "x = y", created in the local variable table, and initialized
to nil. The reference to "x" later succeeds because the local has been created,
though never initialized.
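
The parse-time behaviour described above can also be probed with `defined?`. This is only a minimal sketch; the helper names `probe_local` and `probe_unintroduced` and the name `never_assigned_name` are made up for illustration and assume no method called `x` or `never_assigned_name` exists.

```ruby
# Hypothetical helper: reports what Ruby knows about `x` before and after
# the parser has seen an assignment to it.
def probe_local
  before = defined?(x)   # no assignment to x seen yet, and no method x: nil
  x = 1 if false         # never executed, but the parser now registers x
  after = defined?(x)    # x is now in the local variable table
  [before, after, x]     # x itself reads as nil
end

# Hypothetical helper: a name the parser never saw assigned is treated as a
# method call, so reading it raises NameError.
def probe_unintroduced
  never_assigned_name
rescue NameError => e
  e.class
end

p probe_local          # [nil, "local-variable", nil]
p probe_unintroduced   # NameError
```

The point of the sketch is that the "undefined" result for a local depends on what the parser has seen earlier in the same scope, not on whether an assignment actually ran.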

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Jun 15, 2011, at 1:50 PM, Ilias Lazaridis wrote:

Is this a defect or is there an explanation for this behaviour?

[...]

A simplified version, using only locals and globals

def set_local(y)
  if false
    x = y
  end
  p x #=> nil
  #p xx #=> undefined
end
set_local(10)

def set_global(y)
  if false
    $x = y
  end
  p $x #=> nil
  p global_variables.include?(:$x) #=> true
  p $xx #=> nil
  p global_variables.include?(:$xx) #=> true
end
set_global(10)

···

On 15 Jun, 20:49, Ilias Lazaridis <il...@lazaridis.com> wrote:

-

I understand the "set_local", and it works as expected.

I don't understand why "p $xx" does not fail with a "not defined"
error.

Technically, the existence of the variable is observable all over the
program.

I don't see the reason why "$xx" is created and set to nil, instead of
throwing a "not defined" error (which I would expect when accessing
an undefined var, which could be e.g. a typo of mine).

.

--
http://lazaridis.com

Well, I haven't checked the draft ISO standard or RubySpec, but the
behavior is certainly widely known and documented in many Ruby books,
and lots of code relies on the current behavior, and would break if it
were changed. So, I'm going to say it's a feature rather than a defect,
whether or not it was originally an intended feature.

And I suspect that it's pretty well thought out, with the possible
exception of class variable behavior. Globals and instance variables are
often first assigned remotely from the site where they are used, so reading
them when they haven't been set isn't really a sign of a logic error
-- defaulting to nil makes sense. Local variables are generally (there
are some possible exceptions, I think) set directly in the local
context where they are used, so using one without it being assigned
first is a sign of an error, so not defaulting to nil makes sense.

Class variables probably should behave like instance variables, but
then, class variables are almost never the right tool to use anyway.

···

On Wed, Jun 15, 2011 at 10:50 AM, Ilias Lazaridis <ilias@lazaridis.com> wrote:

Is this a defect or is there an explanation for this behaviour?

(slightly corrected, order of variable types)

puts "\n== Testin in MAIN Context =="

$global = 'global'
@instance = 'instance'
@@class = 'class'
local = 'local'

puts "#$global, #@instance, #@@class, #{local}"

begin puts $empty_global == nil rescue puts "undefined" end
begin puts @empty_instance == nil rescue puts "undefined" end
begin puts @@empty_class == nil rescue puts "undefined" end
begin puts empty_local == nil rescue puts "undefined" end

class VarTest
  puts "\n== Testin in Class Context =="

  $global = 'global'
  @instance = 'instance'
  @@class = 'class'
  local = 'local'

  puts "#$global, #@instance, #@@class, #{local}"

  begin puts $empty_global == nil rescue puts "undefined" end
  begin puts @empty_instance == nil rescue puts "undefined" end
  begin puts @@empty_class == nil rescue puts "undefined" end
  begin puts empty_local == nil rescue puts "undefined" end

end

#OUTPUT

== Testin in MAIN Context ==
global, instance, class, local
true
true
undefined
undefined

== Testin in Class Context ==
global, instance, class, local
true
true
undefined
undefined

···

The inconsistency:

become nil, do not raise error:
$empty_global
@empty_instance

are undefined, raise an error:
empty_local
@@empty_class

-

Is this a defect or is there an explanation for this behaviour?

--
http://lazaridis.com

> Is this a defect or is there an explanation for this behaviour?

I can speak to local variables; class variables still break my brain a little.

[...] - (explanation)

You have company now, cause all this "breaks my brain", too.

I usually understand better by example, thus I focus on what I've
understood from the code (and your explanation):

The same occurs when Ruby sees an introduced, but uninitialized, local
variable:

The code

def foo(y)
  if false
    x = y   # x is introduced (exists), but not yet assigned (value: nil)
  end
  p x
end
foo(10)

#=> nil

The local is seen at "x = y", created in the local variable table, and initialized
to nil. The reference to "x" later succeeds because the local has been created,
though never initialized.

I understand this.

To simplify (and thus protect our brains), we discuss only locals/
globals

···

On 15 Jun, 21:05, Michael Edgar <ad...@carboni.ca> wrote:

On Jun 15, 2011, at 1:50 PM, Ilias Lazaridis wrote:

-

I understand the "set_local", and it works as expected.

def set_local(y)
  if false
    x = y
  end
  p x #=> nil
  #p x2 #=> undefined
end
set_local(10)

def set_global(y)
  if false
    $x = y
  end
  p $x #=> nil
  p global_variables.include?(:$x) #=> true
  p $xx #=> nil
  p global_variables.include?(:$xx) #=> true

end
set_global(10)

-

I don't understand why "p $xx" does not fail with a "not defined"
error.

Technically, the existence of the variable is observable all over the
program.

I don't see the reason why "$xx" is created and set to nil, instead of
throwing a "not defined" error (which I would expect when accessing
an undefined var, which could be e.g. a typo of mine).

Can this be demonstrated with code?

.

--
http://lazaridis.com

I tend to think class variables are a defect in the first place. If you need
one for some reason, it's usually much better to define it as an instance
variable on the class, rather than as a class variable. That is, instead of:

class Foo
  @@bar = true

  def hello
    if @@bar
      puts 'Hello, world!'
    else
      puts 'Goodbye, world!'
    end
  end

  def depressed!
    @@bar = false
  end
end

Do something like this instead:

class Foo
  class << self
    attr_accessor :bar
  end

  self.bar = true

  def hello
    if self.class.bar
      puts 'Hello, world!'
    else
      puts 'Goodbye, world!'
    end
  end

  def depressed!
    self.class.bar = false
  end
end

Of course, that's a terrible example of why you'd ever want to do such a thing
-- there's rarely a reason to have anything approaching class variables -- but
if you need them, I think it makes much more sense to do them that way. This
also keeps them somewhat saner across inheritance, in my opinion. That's the
part that breaks your brain, right? This way, while subclasses seem to inherit
class methods from the superclass, they won't automatically access the same
values, though it's trivial to override them to do that:

class Other < Foo
  class << self
    def bar
      superclass.bar
    end
    def bar= value
      superclass.bar = value
    end
  end
end

I'd probably use the superclass value as a default until someone overrides it
on this class, so more like:

class Other < Foo
  class << self
    def bar
      @bar || superclass.bar
    end
  end
end

It's still not as clean as I'd like it to be, but at least this generally
obeys basic concepts I've learned elsewhere which I actually understand. I
have never actually understood class variables.
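
A minimal sketch of the difference across inheritance, for comparison. The classes `Base` and `Child` and the names `shared` and `own` are hypothetical and only there to contrast the two approaches.

```ruby
# Hypothetical classes contrasting a class variable with a class-level
# instance variable.
class Base
  @@shared = :base   # class variable: one slot for Base and all subclasses
  @own = :base       # class-level instance variable: belongs to Base alone

  class << self
    attr_accessor :own
  end

  def self.shared
    @@shared
  end

  def self.shared=(value)
    @@shared = value
  end
end

class Child < Base
end

Child.shared = :child   # overwrites the slot that Base reads too
Child.own = :child      # sets @own on Child only

p Base.shared   # :child
p Base.own      # :base
p Child.own     # :child
```

Writing through the subclass changes what the superclass sees for the class variable, while the class-level instance variable stays separate per class.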

···

On Wednesday, June 15, 2011 01:05:39 PM Michael Edgar wrote:

On Jun 15, 2011, at 1:50 PM, Ilias Lazaridis wrote:
> Is this a defect or is there an explanation for this behaviour?

I can speak to local variables; class variables still break my brain a
little.

[...]

There is a simplified version, see message from 2011-06-18

.

···

On 19 Jun, 18:54, Christopher Dicely <cmdic...@gmail.com> wrote:

On Wed, Jun 15, 2011 at 10:50 AM, Ilias Lazaridis <il...@lazaridis.com> wrote:

> Is this a defect or is there an explanation for this behaviour?

Well, I haven't checked the draft ISO standard or RubySpec, but the

--
http://lazaridis.com

This is the crux of where Ilias does have a bit of a point: class variables
always require initialization, instance variables and globals do not. The
same rationale for ivars and globals not requiring initialization likely
applies to class variables. I honestly always assumed class variables
behaved like ivars and globals until this thread started.

I imagine the reason behind class variables requiring initialization is
because they're hard enough to use correctly to begin with. However,
changing them to be auto-initialized to `nil` would be more consistent.

I don't care either way - as Chris pointed out, class variables are almost
always wrong. But for long-term internal consistency, it might actually
be worth discussing.

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Jun 19, 2011, at 11:54 AM, Christopher Dicely wrote:

Class variables probably should behave like instance variables, but
then, class variables are almost never the right tool to use anyway.

Because the parser 'sees' the variable $xx and defines it before "p $xx" gets executed.

This behavior is different than the local variable case because global variables can be discovered by their name alone (i.e. they start with '$'). This is a syntactic property that the parser can take advantage of. That property doesn't exist for local variables because, in some contexts, they are indistinguishable syntactically from a zero-argument method call. Consider these examples separately and not as part of a single snippet of code:

  a = b # 'a' is clearly a local variable while 'b' could be a local variable or a zero-argument method call

  c = d() # 'd()' is clearly a zero-argument method call and *not* a local variable, 'c' is a local variable

  do_something_with(x,y,z) # x,y, and z could be method calls or variables
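
A minimal sketch of that ambiguity, in case it helps. The class `Ambiguity` and the name `reading` are hypothetical; the same bare identifier is first a method call and then a local variable.

```ruby
# Hypothetical class: `reading` is a method until the parser sees an
# assignment to the same name inside `demo`.
class Ambiguity
  def reading
    :from_method
  end

  def demo
    first = reading        # no local `reading` yet: method call
    reading = :from_local  # from here on, `reading` is a local variable
    second = reading       # local variable read
    [first, second]
  end
end

p Ambiguity.new.demo   # [:from_method, :from_local]
```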

Gary Wright

···

On Jun 15, 2011, at 5:00 PM, Ilias Lazaridis wrote:

I don't understand why "p $xx" does not fail with a "not defined"
error.

Gary,

As I summarized in an e-mail I sent on this thread earlier today, the distinction
being drawn is between global variables and class variables. Your same
argument for global variables applies equally well to class variables, yet
class variables require initialization.

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Jun 19, 2011, at 6:01 PM, Gary Wright wrote:

Because the parser 'sees' the variable $xx and defines it before "p $xx" gets executed.

[...] - (explanations, referring to why locals cannot behave this
way)

I'll reduce this further down, dealing *only* with global variables.

(Please, if possible, avoid comparisons to the other variable types.)

···

On 20 Jun, 01:01, Gary Wright <gwtm...@mac.com> wrote:

On Jun 15, 2011, at 5:00 PM, Ilias Lazaridis wrote:

> I don't understand why "p $xx" does not fail with a "not defined"
> error.

Because the parser 'sees' the variable $xx and defines it before "p $xx" gets executed.

-

$x_void

def make_nil_global(val)
  $x_undefined = val
end

p global_variables

-

Behaviour (ruby 1.9.2p180):

$x_void is added to the global_variables (undefined, nil)
$x_undefined is added to the global_variables (undefined, nil)

Expected Behaviour:

$x_void is ignored (*not* added to global_variables, access would
raise error)
$x_undefined is added to the global_variables (undefined, nil)

-

I cannot see a use case, where placing $x_void into the
global_variables is *necessary*.

I am possibly overlooking something very fundamental, but the main rules for
my expectations are:

* variables come into existence when a value is assigned
* if the value cannot be determined, "nil" is assigned

.

--
http://lazaridis.com

Global variables are just that... global. There's no two ways about it.

Class variables are shared amongst a tree... but WHERE in the tree is defined by where it is initialized.

And for the record, I think that your assertion that class variables "are almost always wrong" is false. Like all things in software design, they can be used poorly or they can be used well. When they're used well, they're perfect for the job. When they're not, they're horrible. I use class variables all the time to good effect (some of these are from Eric):

% p4 grep -le @@ //src/*/dev/lib/...
//src/IMAPCleanse/dev/lib/imap_client.rb#8
//src/Inliner/dev/lib/inliner.rb#3
//src/ParseTree/dev/lib/parse_tree_extensions.rb#3
//src/RubyInline/dev/lib/inline.rb#37
//src/Sphincter/dev/lib/sphincter/search.rb#5
//src/Sphincter/dev/lib/sphincter/tasks.rb#3
//src/ZenHacks/dev/lib/r2c_hacks.rb#6
//src/ZenHacks/dev/lib/zenoptimize.rb#6
//src/ZenTest/dev/lib/autotest.rb#123
//src/ZenTest/dev/lib/autotest/autoupdate.rb#2
//src/ZenTest/dev/lib/autotest/isolate.rb#2
//src/ZenTest/dev/lib/autotest/rcov.rb#5
//src/ZenTest/dev/lib/functional_test_matrix.rb#3
//src/ZenTest/dev/lib/zentest_mapping.rb#4
//src/ZenWeb/dev/lib/ZenWeb.rb#6
//src/ZenWeb/dev/lib/ZenWeb/MetadataRenderer.rb#2
//src/ar_mailer/dev/lib/action_mailer/ar_mailer.rb#10
//src/flay/dev/lib/flay.rb#26
//src/flog/dev/lib/flog.rb#51
//src/heckle/dev/lib/autotest/heckle.rb#1
//src/heckle/dev/lib/heckle.rb#47
//src/heckle/dev/lib/test_unit_heckler.rb#20
//src/hoe/dev/lib/hoe.rb#148
//src/hoe/dev/lib/hoe/deps.rb#5
//src/imap_processor/dev/lib/imap_processor.rb#19
//src/imap_processor/dev/lib/imap_processor/archive.rb#7
//src/minitest/dev/lib/minitest/spec.rb#23
//src/minitest/dev/lib/minitest/unit.rb#74
//src/newri/dev/lib/ri_display.rb#1
//src/png/dev/lib/png.rb#15
//src/png/dev/lib/png/font.rb#2
//src/rake-remote_task/dev/lib/rake/remote_task.rb#12
//src/rake-remote_task/dev/lib/rake/test_case.rb#1
//src/ruby_parser/dev/lib/ruby_lexer.rb#84
//src/ruby_parser/dev/lib/ruby_parser_extras.rb#52
//src/ruby_to_c/dev/lib/rewriter.rb#13
//src/ruby_to_c/dev/lib/typed_sexp.rb#3
//src/sexp_processor/dev/lib/pt_testcase.rb#1
//src/sexp_processor/dev/lib/sexp.rb#6
//src/sexp_processor/dev/lib/unique.rb#1
//src/wilson/dev/lib/wilson.rb#9
//src/zenprofile/dev/lib/memory_profiler.rb#1
//src/zenprofile/dev/lib/spy_on.rb#3
//src/zenprofile/dev/lib/zenprofiler.rb#9

···

On Jun 19, 2011, at 15:26 , Michael Edgar wrote:

On Jun 19, 2011, at 6:01 PM, Gary Wright wrote:

Because the parser 'sees' the variable $xx and defines it before "p $xx" gets executed.

Gary,

As I summarized in an e-mail I sent on this thread earlier today, the distinction
being drawn is between global variables and class variables. Your same
argument for global variables applies equally well to class variables, yet
class variables require initialization.

(Please, if possible, avoid comparisons to the other variable types.)

First, you specifically started this thread discussing the "inconsistency"
in the way different variables types are handled. Now you sit here and
complain when people are comparing the variable types. Which would you like?
Inconsistencies most times come from trade offs between the different types
- a discussion of those differences can be important to figuring out why
they are "inconsistent"

I cannot see a use case, where placing $x_void into the
global_variables is *necessary*.

I am possibly overlooking something very fundamental, but the main rules for
my expectations are:

* variables come into existence when a value is assigned
* if the value cannot be determined, "nil" is assigned

You technically might have a point here - but as with most things in life
there is always a trade off. Obviously needing to do a lookup to find a var
is always the slowest option so there appears to be a speed enhancement here
in that the variable is put into the global vars table immediately so we
don't need to do a lookup each time we hit it. This can be accomplished
because the $ notation makes it automatic what we are looking at. An
argument could be made that the global vars table might be enhanced to
denote something along the lines of "not initialized" for variables in it
that have not had a value assigned yet.
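
For what it's worth, `defined?` already seems to distinguish a global that was never assigned from one that was assigned nil. A minimal sketch, using a hypothetical global `$sketch_flag` and a hypothetical helper `global_states`:

```ruby
# Hypothetical global $sketch_flag, used only for this illustration.
def global_states
  states = []
  states << defined?($sketch_flag)   # nil: nothing assigned yet
  states << $sketch_flag             # reading yields nil, no error
  states << defined?($sketch_flag)   # still nil: reading is not assigning
  $sketch_flag = nil
  states << defined?($sketch_flag)   # "global-variable", though the value is nil
  states
end

STATES = global_states
p STATES   # [nil, nil, nil, "global-variable"]
```

So a typo guard is possible today by checking `defined?` before reading, without any change to how the globals table works.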

As always however, a patch that implements something is the easiest way to
get there.

John

···

On Sun, Jun 19, 2011 at 4:35 PM, Ilias Lazaridis <ilias@lazaridis.com> wrote:

Global variables are just that... global. There's no two ways about it.

Class variables are shared amongst a tree... but WHERE in the tree is defined by where it is initialized.

This does not explain why they do not auto-initialize to nil like all
other shared variable types.

And for the record, I think that your assertion that class variables "are almost always wrong" is false. Like all things in software design, they can be used poorly or they can be used well. When they're used well, they're perfect for the job. When they're not, they're horrible. I use class variables all the time to good effect (some of these are from Eric):

I think you support my point - for an entire category of variable, I see
about 20 projects there, some of which you've admittedly not written
yourself. They certainly have uses; I use them in Laser for dynamically
loaded warning passes, and I see many of the projects you link use them
for plugins. "Almost always wrong" isn't countered by fewer than 2 dozen
projects that use them once or twice.

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Jun 19, 2011, at 7:20 PM, Ryan Davis wrote:

> (Please, if possible, avoid comparisons to the other variable types.)

First, you specifically started this thread discussing the "inconsistency"
in the way different variables types are handled. Now you sit here and
complain when people are comparing the variable types. Which would you like?
Inconsistencies most times come from trade offs between the different types
- a discussion of those differences can be important to figuring out why
they are "inconsistent"

First of all, my plea was optional ("If possible").

And then it's just meant for this sub-thread (where I wanted to focus
just on the globals).

> I cannot see a use case, where placing $x_void into the
> global_variables is *necessary*.

> I am possibly overlooking something very fundamental, but the main rules for
> my expectations are:

> * variables come into existence when a value is assigned
> * if the value cannot be determined, "nil" is assigned

You technically might have a point here - but as with most things in life

[...] - (assuming it's for speed reasons)

The question is:

*Concretely*, is there any reason, by design/specification or by
implementation, that the above rules are not kept?

.

···

On 20 Jun, 04:49, John W Higgins <wish...@gmail.com> wrote:

On Sun, Jun 19, 2011 at 4:35 PM, Ilias Lazaridis <il...@lazaridis.com> wrote:

--
http://lazaridis.com

Just speculation on my part, but if you had implicit initialization, then the scope of the class variable would be very dependent on the order in which various classes were parsed. An errant reference to the class variable in a subclass would effectively hide the same class variable in a superclass, which is probably not what is intended. By requiring an explicit initialization, the actual scope of the class variable is explicitly established by the programmer.

Gary

···

On Jun 19, 2011, at 9:36 PM, Michael Edgar wrote:

On Jun 19, 2011, at 7:20 PM, Ryan Davis wrote:

Global variables are just that... global. There's no two ways about it.

Class variables are shared amongst a tree... but WHERE in the tree is defined by where it is initialized.

This does not explain why they do not auto-initialize to nil like all
other shared variable types.

Given my most common use of class variables (shared array/hash/object which
is mutated by the inheritance hierarchy), this makes sense. The cvar is
written once at the top of the class tree, and read and mutated multiple times by
subclasses. Interestingly, @@cvar ||= ... doesn't raise (in 1.9.2 at least) if it has not
been initialized.
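
A minimal sketch of that usage pattern. The classes `Plugin`, `Linter` and `Formatter` and the names `registry` and `cache` are hypothetical and not taken from any of the projects mentioned in this thread.

```ruby
# Hypothetical plugin registry: the class variable is written once at the
# top of the tree and mutated as subclasses are defined.
class Plugin
  @@registry = []

  def self.inherited(subclass)
    super
    @@registry << subclass
  end

  def self.registry
    @@registry
  end

  # `||=` on a class variable that was never initialized does not raise.
  def self.cache
    @@cache ||= {}
  end
end

class Linter < Plugin; end
class Formatter < Plugin; end

p Plugin.registry                 # [Linter, Formatter]
p Linter.class_variables(false)   # []
p Plugin.cache                    # {}
```

Passing `false` to `class_variables` leaves out inherited names, which shows that the variable lives on `Plugin`, where it was initialized, and not on the subclasses that mutate it.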

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Jun 20, 2011, at 12:55 AM, Gary Wright wrote:

Just speculation on my part, but if you had implicit initialization, then the scope of the class variable would be very dependent on the order in which various classes were parsed. An errant reference to the class variable in a subclass would effectively hide the same class variable in a superclass, which is probably not what is intended.