Fun about inspect

Hi all.

For some rather involved reasons, I've reached an interesting conclusion: for
"serializable" types it's always good to have eval(obj.inspect) == obj

This was a useful insight for me, because previously I was never sure what
#inspect should do, and typically ended up making #inspect an alias for
#to_s

Here is a dumb test for some core classes:

# dumb testing function
# tests whether eval(obj.inspect) == obj, and whether
# evaluating the string raises any parsing errors

#
def tst(obj)
  begin
    print "testing %-10.10s: " % obj.class.name
    res = eval(obj.inspect)
    if res == obj
      puts 'OK'
    else
      print 'WRONG : '
      puts "\t%-7s => %-7s" % [obj.inspect, res.inspect]
    end
  rescue Exception
    puts "ERROR"
  end
end

# the tests themselves

tst(5) #=> OK
tst("str") #=> OK
tst(RuntimeError.new) #=> WRONG : #<RuntimeError: RuntimeError> => nil

tst([1,2,3]) #=> OK
tst(:a => 1) #=> OK
tst(Time.new) #=> ERROR

There are several interesting things to note:
* basic types (Numeric, String, Array, Hash) all behave well here.
* Exception descendants evaluate to nil (their #inspect output starts with
"#", so eval sees only a comment), which can be considered OK, as we
typically don't plan to serialize and store exceptions. Even in this case,
the results of #inspect are evaluated without any error!
* But Time is a bad guy! Its #inspect result is something like "Wed Jun 27
00:36:12 +0300 2007", which is equal to its #to_s result and can't be
evaluated in any way. I don't think that's the most reasonable choice.
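
As an illustration of what an evaluable representation of a Time could look like, here is a minimal sketch. The helper name evaluable_time is hypothetical, and the code assumes Ruby 2.5 or later, where Time.at accepts a unit argument. It does not change Time#inspect itself.

```ruby
# Hypothetical helper: builds a string that evaluates back to an equal Time.
# Assumes Ruby 2.5+ (Time.at with a unit argument such as :nsec).
def evaluable_time(t)
  "Time.at(#{t.to_i}, #{t.nsec}, :nsec)"
end

t = Time.now
restored = eval(evaluable_time(t))

# Time#== compares the instant only, so this is expected to print true.
puts restored == t
```

The restored object represents the same instant, but it is created in the local time zone, so the original UTC offset is not preserved by this sketch.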

What do you think?

V.

I've given up on #inspect.

I use my #dump! method (VERY useful, search the list for it, the best
four lines I've ever written) and use #to_yaml within it.

Aur

···

On 6/27/07, Victor Zverok Shepelev <vshepelev@imho.com.ua> wrote:

[...]

[...]
There are several interesting things to note:
* basic types (Numeric, String, Array, Hash) are all behave good here.

No, they don't. See below.

[...]

What do you think?

I think nobody should rely on #inspect creating something that will be able to resurrect the original. Note that it does not even work properly for Hashes:

irb(main):001:0> h1 = Hash.new 666
=> {}
irb(main):002:0> h1[:foo]
=> 666
irb(main):003:0> s1 = h1.inspect
=> "{}"
irb(main):004:0> h2 = eval s1
=> {}
irb(main):005:0> h2[:foo]
=> nil

The sole purpose of #inspect is to return human-readable information about an object's state that can be used for logging and debugging. By no means is #inspect meant for serialization; that's what Marshal, YAML and the like are for.
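
To make the contrast concrete, here is a small sketch (variable names are illustrative only) that round-trips a Hash with a default value through eval(inspect) and through Marshal. Marshal keeps the default value, although it raises a TypeError for a Hash that has a default proc.

```ruby
h1 = Hash.new(666)
h1[:bar] = 1

# Round trip through #inspect and eval: the contents survive, the default does not.
via_inspect = eval(h1.inspect)

# Round trip through Marshal: contents and default value both survive.
via_marshal = Marshal.load(Marshal.dump(h1))

p via_inspect[:bar]   # 1
p via_inspect[:foo]   # nil
p via_marshal[:bar]   # 1
p via_marshal[:foo]   # 666
```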

Kind regards

  robert

···

On 26.06.2007 23:42, Victor "Zverok" Shepelev wrote:

What do you think?

In one specific situation, I use a journal to emulate a small
DB. The reason to use a journal is the requirement to be able
to go back to _any_ moment in time.

Whenever an object is added to, changed in, or deleted from the
hash, the corresponding line is written to the journal. For speed,
the hash and the length and MD5 checksum of the journal are dumped
(using Marshal) to a file at the end of the program. On
startup, this file is loaded and the length and MD5 checksum of
the journal are checked. If the length and MD5 checksum don't
match or Marshal.load fails for whatever reason, the database
is rebuilt from the journal (which might take a while... :}).
This gives us the possibility to go back in time. And it's a
nice way to ensure that all data gets effectively written when
the application ends.
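
The snapshot check described above could be sketched roughly as follows. This is not the original code: the helper names save_snapshot and load_snapshot and the on-disk layout (one Marshal-dumped array of hash, journal length and MD5) are assumptions made for illustration.

```ruby
require "digest/md5"

# Hypothetical helper: dump the hash together with the journal's length and MD5.
def save_snapshot(snapshot_path, journal_path, items)
  journal = File.binread(journal_path)
  payload = [items, journal.bytesize, Digest::MD5.hexdigest(journal)]
  File.binwrite(snapshot_path, Marshal.dump(payload))
end

# Hypothetical helper: return the cached hash, or nil when the snapshot is
# missing, unreadable, or does not match the current journal. On nil the
# caller is expected to rebuild the hash by replaying the journal.
def load_snapshot(snapshot_path, journal_path)
  items, length, md5 = Marshal.load(File.binread(snapshot_path))
  journal = File.binread(journal_path)
  return nil unless journal.bytesize == length &&
                    Digest::MD5.hexdigest(journal) == md5
  items
rescue StandardError
  nil
end
```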

I even added commits to the journal. When loading the journal,
everything after the last commit is disregarded.

Loading the journal is done with the code below (simplified).

As long as the objects inspect to Ruby syntax, this approach
works fine.

gegroet,
Erik V. - http://www.erikveen.dds.nl/

···

----------------------------------------------------------------

# journal

[{:action=>:added, :timestamp=>"20070627140031"}, {:id=>"aaa", :something=>17}]
[{:action=>:added, :timestamp=>"20070627140031"}, {:id=>"bbb", :something=>18}]
[{:action=>:changed, :timestamp=>"20070627140031"}, {:id=>"bbb", :something=>19}]
[{:action=>:deleted, :timestamp=>"20070627140033"}, {:id=>"bbb"}]

----------------------------------------------------------------

# read_journal.rb

items = {}

File.open("journal") do |f|
   f.readlines.each do |line|
     raise "line doesn't end with \\n, probably caused by a previous IO error [#{line}]" unless line[-1..-1] == "\n"

     action, item = Thread.new{$SAFE=4; eval(line, Module.new.module_eval{binding})}.value

     case action[:action]
     when :added then items[item[:id]] = item
     when :changed then items[item[:id]] = item
     when :deleted then items.delete(item[:id])
     else
       raise "unknown action [#{action.inspect}]"
     end
   end
end

p items

----------------------------------------------------------------
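
The simplified loader does not show the commit handling mentioned in the message (everything after the last commit is disregarded). A minimal sketch of that rule might look like this. The :commit action name, the helper name committed_entries, and the choice to trust nothing when no commit exists are all assumptions, not part of the journal format shown in the thread.

```ruby
# Sketch only: entries are [action, item] pairs as in the journal format.
# The :commit action name is an assumption made for this example.
def committed_entries(entries)
  last_commit = entries.rindex { |action, _item| action[:action] == :commit }
  return [] if last_commit.nil? # assumption: without a commit, nothing is trusted

  # Keep everything before the last commit marker, dropping earlier markers.
  entries[0...last_commit].reject { |action, _item| action[:action] == :commit }
end

entries = [
  [{:action=>:added,  :timestamp=>"20070627140031"}, {:id=>"aaa", :something=>17}],
  [{:action=>:commit, :timestamp=>"20070627140032"}, {}],
  [{:action=>:added,  :timestamp=>"20070627140033"}, {:id=>"bbb", :something=>18}]
]

# Only "aaa" was written before the last commit, so "bbb" is dropped.
p committed_entries(entries).map { |_action, item| item[:id] }
```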

By some complex reasons, I've made an interesting conclusion: for
"serializable" types it's always good to have eval(obj.inspect) == obj

For me it was a good thought, because previously, I've always doubt what
#inspect should do, and typically have ended with #inspect as alias for
#to_s

[...]

There are several interesting things to note:
* basic types (Numeric, String, Array, Hash) are all behave good here.

No, they don't. See below.

[...]

What do you think?

I think nobody should rely on #inspect creating something that will be
able to resurrect the original. Note, that it does not even work
properly for Hashes:

[...]

There is a bit of a misunderstanding. I'm not talking about "strict
rules". I've just suggested a "rule of thumb" for designing #inspect. Of
course, there can be different cases and so on, but typically, and especially
for "data" objects (those whose data is more important than their behavior),
it's good for an object to be "restorable" from its #inspect output at some
level (even if not in full, as in your clever Hash example).

V.
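
As an illustration of this rule of thumb for a "data" object, here is a minimal sketch. The Point class is hypothetical and exists only for this example.

```ruby
# Hypothetical data class whose #inspect output is valid Ruby that rebuilds it.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Value equality, so the restored object compares equal to the original.
  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end

  def inspect
    "Point.new(#{x.inspect}, #{y.inspect})"
  end
end

pt = Point.new(1, 2)
restored = eval(pt.inspect)
puts pt.inspect      # Point.new(1, 2)
puts restored == pt  # true
```

This only works as long as every attribute also inspects to evaluable Ruby, which is the same limitation discussed earlier in the thread.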

···

From: Robert Klemme [mailto:shortcutter@googlemail.com]
Sent: Wednesday, June 27, 2007 1:40 PM

On 26.06.2007 23:42, Victor "Zverok" Shepelev wrote: