The kernel is written in C, which _is_ a language for girls.
SCNR
Michael (Ruby and C++ hacker)
--
Michael Ulm
R&D Team
ISIS Information Systems Austria
tel: +43 2236 27551-219, fax: +43 2236 21081
e-mail: michael.ulm@isis-papyrus.com
Visit our Website: www.isis-papyrus.com
"Robert Klemme" <bob.news@gmx.net> wrote in the newsgroup post
def self.list
  FOOS.each do |oid, x|
    begin
      p ObjectSpace._id2ref(oid)
    rescue RangeError => e
      puts "#{oid} collected but not finalized: x=#{x.inspect}"
    end
  end
end
I assume you meant FOOS above. I fixed it.
Correct.
Just catching a RangeError is not all you need. You'd better
make sure the object is finalized and the finalizer removes its
oid from FOOS before the space is reclaimed. Otherwise oid
could refer to a completely new object and you wouldn't detect
it.
Probably. My assumption was that oids are not reused. But I may be
wrong here.
On top of that, these finalizers can't be called while this
FOOS.each loop is going on. Otherwise you'd get an error about
the hash being modified while you're iterating over it. It is
like the GC and finalizers are in another thread, but
unfortunately you can't control it like a thread (i.e.
Thread.critical=). This is my primary dilemma.
Hm... Did you try Mutex or Monitor?
Problems of this kind can be very difficult to detect, and
you may need thousands of test runs that exercise the GC in
different ways to trigger one. You must be prepared for this
when using _id2ref.
This approach works ok - you'll have to imagine that x contains
information needed for proper cleanup of a Foo instance, for
example, an open IO instance (although I'm sure that will do proper
cleanup on finalization):
class Foo
  FOOS = {}
  def initialize(x)
    self.x=x
    ObjectSpace.define_finalizer(self) do |oid|
      puts "Cleanup of #{oid} with #{FOOS[oid]}"
    end
  end
Won't work. This is the exact mistake I made when first trying
to use a finalizer. In the above, you gave define_finalizer a
Proc that has a Binding with direct access to self (a Foo).
This creates an unintended reference to the object in
ObjectSpace/GC. It will never be GCed because of this.
Darn, yes you're right!
Two alternatives would be
class Foo
  class<<self
    alias :_new :new
    def new(*a,&b)
      obj = _new(*a,&b)
      ObjectSpace.define_finalizer(obj) do |oid|
        puts "Cleanup of #{oid} with #{FOOS[oid]}"
      end
      obj
    end
  end
end

class Foo
  def initialize(x)
    self.x=x
    self.class.instance_eval do
      ObjectSpace.define_finalizer(obj) do |oid|
        puts "Cleanup of #{oid} with #{FOOS[oid]}"
      end
    end
  end
end
It took me a while to figure this out. The Proc/Method needs to
be defined in a context that doesn't have access to the object
you are trying to put a finalizer on. For this reason, I don't
think define_finalizer should even allow the block form. Also,
a Proc#unbind would be nice to have in this situation.
Also, I assume you'd want to delete the oid entry from FOOS in
this finalizer, right?
Yes. Sorry for the errors and omissions - it was late already...
  def x=(y)
    @x = y
    FOOS[object_id] = y
  end
  def x() @x end
end
What is the real world problem you are trying to solve?
> Just catching a RangeError is not all you need. You'd better
> make sure the object is finalized and the finalizer removes its
> oid from FOOS before the space is reclaimed. Otherwise oid
> could refer to a completely new object and you wouldn't detect
> it.
Probably. My assumption was that oids are not reused. But I may be
wrong here.
Yep. Other than immediates, I believe an object id is just a
memory location. After an object is GCed, it will want to
reuse the space at some point. The new object may start at
that same location (same object id) or that location may
correspond to the middle of an object.
> On top of that, these finalizers can't be called while this
> FOOS.each loop is going on. Otherwise you'd get an error about
> the hash being modified while you're iterating over it. It is
> like the GC and finalizers are in another thread, but
> unfortunately you can't control it like a thread (i.e.
> Thread.critical=). This is my primary dilemma.
Hm... Did you try Mutex or Monitor?
I have tried Thread.critical=, which is what Mutex and probably
the other mutual-exclusion primitives are based on. The problem
is that the GC and finalizers are not treated as running in a
separate thread.
> These issues may be very difficult to detect problems with and
> you may need 1000's of tests to excite the GC differently to
> detect the problem. You must be prepared for this using
> _id2ref.
>
>>> This approach works ok - you'll have to imagine that x contains
>>> information needed for proper cleanup of a Foo instance, for
>>> example, an open IO instance (although I'm sure that will do proper
>>> cleanup on finalization):
>>>
>>> class Foo
>>>   FOOS = {}
>>>
>>>   def initialize(x)
>>>     self.x=x
>>>
>>>     ObjectSpace.define_finalizer(self) do |oid|
>>>       puts "Cleanup of #{oid} with #{FOOS[oid]}"
>>>     end
>>>   end
>
> Won't work. This is the exact mistake I made when first trying
> to use a finalizer. In the above, you gave define_finalizer a
> Proc that has a Binding with direct access to self (a Foo).
> This creates an unintended reference to the object in
> ObjectSpace/GC. It will never be GCed because of this.
Darn, yes you're right!
Two alternatives would be
class Foo
  class<<self
    alias :_new :new
    def new(*a,&b)
      obj = _new(*a,&b)
      ObjectSpace.define_finalizer(obj) do |oid|
        puts "Cleanup of #{oid} with #{FOOS[oid]}"
      end
      obj
    end
  end
end

class Foo
  def initialize(x)
    self.x=x
    self.class.instance_eval do
      ObjectSpace.define_finalizer(obj) do |oid|
        puts "Cleanup of #{oid} with #{FOOS[oid]}"
      end
    end
  end
end
Nope. Still doesn't work. Try this:
100000.times { obj=Foo.new("hi") }
The memory size just keeps growing and the objs are not GCed.
In your second solution above, obj isn't defined. I'm not sure
what you intended.
The problem is that the Procs you give to define_finalizer
still have access to the object you are putting the finalizer
on, this time through a local variable (obj) instead of self.
You see why I say that the block form of define_finalizer isn't
useful and should be removed?
If you've been creating finalizers this way, you've probably
never had issues with _id2ref because the objects are never
GCed!
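To make the point above concrete, here is a minimal sketch of a finalizer whose Proc is built somewhere that cannot see the instance. The helper name finalizer_for is hypothetical (it is not from the posts above), and the sketch assumes a reasonably current CRuby. The lambda is created inside a class method, so its binding only holds the registry hash, never the Foo being finalized.

```ruby
class Foo
  FOOS = {} # object_id => data needed for cleanup

  # Hypothetical helper: the lambda is created in a class method, so its
  # binding contains only `registry`; `self` here is the class Foo, not
  # the instance that will be finalized.
  def self.finalizer_for(registry)
    lambda do |oid|
      payload = registry.delete(oid) # also drops the stale oid entry
      puts "Cleanup of #{oid} with #{payload.inspect}"
    end
  end

  def initialize(x)
    FOOS[object_id] = x
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(FOOS))
  end
end

foo = Foo.new("hi")
puts Foo::FOOS.key?(foo.object_id)
```

When the finalizer actually fires is up to the GC, so the sketch only shows the wiring. Calling the lambda by hand with an oid is a convenient way to check the cleanup logic without waiting for a collection.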
What is the real world problem you are trying to solve?
In my cursor package, I create child cursors and need to
keep track of them. But I don't want to keep a normal ref on
them, because that would prevent them from being garbage
collected. For example:
child = parent.position   # child holds the current position
parent.position?          # any positions/children outstanding?
parent.position?(child)   # is this a valid position?
parent.position!          # kill positions/children
parent.delete1Next        # update all children after this point
child.succ                # next position - use with Range
I don't want the user of this package to have to worry about
closing every single child. It would be a pain to have to do
this especially when intermediate expressions can yield a
child. I need to keep track of any outstanding, but I want GC
to get rid of any that aren't used anymore.
Here is some more code that I did some testing on:
#!/bin/env ruby
require 'set'
require 'weakref'

class WeakRefList
  def finalizer(id)
    __old_status = Thread.critical if @critical
    Thread.critical = true if @critical
    begin
      print("f")
      @ids.delete?(id) or raise
    ensure
      Thread.critical = __old_status if @critical
    end
  end

  def initialize(flags)
    @ids = Set.new
    @useWeakRef = flags[0].nonzero?
    @disable = flags[1].nonzero?
    @start = flags[2].nonzero?
    @critical = flags[3].nonzero?
    @dummy = flags[4].nonzero?
  end

  def <<(obj)
    if @useWeakRef
      @ids << WeakRef.new(obj)
    else
      @ids << obj.object_id
      ObjectSpace.define_finalizer(obj, method(:finalizer))
    end
    dummy = WeakRef.new("") if @dummy
    self
  end

  def each(&block)
    GC.start if @start
    GC.disable if @disable
    __old_status = Thread.critical if @critical
    Thread.critical = true if @critical
    begin
      @ids.to_a.each { |id|
        begin
          block.call(@useWeakRef ?
                     id.__getobj__ :
                     ObjectSpace._id2ref(id))
        rescue RangeError,WeakRef::RefError
          print("e")
          @ids.delete(id)
        end
      }
    ensure
      Thread.critical = __old_status if @critical
    end
    GC.enable if @disable
  end
end

if __FILE__==$0
  class MyString < String; end
  weakrefs = WeakRefList.new((ARGV[0]||0).to_i)
  $stdout.sync=true
  at_exit {puts}
  1000.times { |i|
    print(".")
    weakrefs << MyString.new("X"*i)
    weakrefs.each { |o|
      MyString==o.class or
        raise("not a MyString: #{o.inspect}")
    }
  }
end
I tried out a bunch of ways to make this WeakRefList. You can
pass in a flags number ORing the options I provided for this
class:
1: use WeakRef instead of simply an object id
2: use GC.disable/GC.enable around code using _id2ref
4: use GC.start before trying _id2ref
8: use Thread.critical= to try to stop the finalizer
16: add a dummy allocation of a WeakRef
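As a small aside on the flags number: the class above reads each option with Integer#[] (bit n of the number), so ORing the values from the list selects several options at once. A minimal sketch of the same decoding follows; the OPTIONS table and decode_flags are hypothetical names used only for illustration.

```ruby
# Hypothetical lookup table mirroring the option list above.
OPTIONS = {
  use_weakref: 1,   # bit 0
  gc_disable:  2,   # bit 1
  gc_start:    4,   # bit 2
  critical:    8,   # bit 3
  dummy:      16    # bit 4
}

# Returns the names of the options whose bit is set in `flags`.
def decode_flags(flags)
  OPTIONS.select { |_name, bit| (flags & bit).nonzero? }.keys
end

p decode_flags(1 | 4)   # => [:use_weakref, :gc_start]
p((1 | 4)[2])           # Integer#[] reads bit 2 directly => 1
```

So running the script with an argument of 5 would, under this reading, combine the WeakRef and GC.start variants.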
I found that only using WeakRefs, GC.start, and dummy
WeakRefs worked for me. In other cases it looks like the
object_id got reclaimed by another object before the finalizer
was run on the original object. I don't understand why using
WeakRef worked. Looking at the code it looks like an object id
could still get reused before the finalizer is called and it
would still look OK. I think it is just luck because
allocating dummy WeakRefs also worked. Currently, I trust
using GC.start the most. But I've still seen cases where I
have to call GC.start multiple times back-to-back. Anybody
have a better solution for this WeakRefList? I'll want more
methods eventually, but << and each seem sufficient for
testing.
--- Robert Klemme <bob.news@gmx.net> wrote:
The kernel is written in C, which _is_ a language for girls.
Michael (Ruby and C++ hacker)
Ruby is written in C too.
If Ruby isn't a language for girls, I must deduce that you are polymorphic, at least like cpp code?
1) Ruby is just implemented in a language for girls. That
doesn't make Ruby a language for girls. I think we may
agree that it is the language for supermen and superwomen.
2) Real men get the job done, even if it involves using
girlie tools (I hope that brings me back on Matz' good side).
Michael (wondering how long he can keep arguing for this silly notion)
Just catching a RangeError is not all you need. You'd better
make sure the object is finalized and the finalizer removes its
oid from FOOS before the space is reclaimed. Otherwise oid
could refer to a completely new object and you wouldn't detect
it.
Probably. My assumption was that oids are not reused. But I may be
wrong here.
Yep. Other than immediates, I believe an object id is just a
memory location. After an object is GCed, it will want to
reuse the space at some point. The new object may start at
that same location (same object id) or that location may
correspond to the middle of an object.
So we have to look at the sources to get a definitive answer...
On top of that, these finalizers can't be called while this
FOOS.each loop is going on. Otherwise you'd get an error about
the hash being modified while you're iterating over it. It is
like the GC and finalizers are in another thread, but
unfortunately you can't control it like a thread (i.e.
Thread.critical=). This is my primary dilemma.
Hm... Did you try Mutex or Monitor?
I have tried Thread.critical= which is what Mutex and probably
other mutual exclusivity stuff are based on. The problem is
that GC/finalizers is not considered to be in separate threads.
But in that case one of the two might help - the one that isn't reentrant, wouldn't it?
Nope. Still doesn't work. Try this:
100000.times { obj=Foo.new("hi") }
The memory size just keeps growing and the obj's are not GCed.
In your second solution above obj isn't defined. I'm not sure
what you intended.
obj = self - but then again, as you said it doesn't matter whether it's self or obj that keeps the instance alive.
The problem is that the Proc's you give to define_finalizer
still have access to the object you are putting the finalizer
on. This time through a local variable (obj) instead of self.
You see why I say that the block form of define_finalizer isn't
useful? And should be removed?
It dawns on me. But wait:
Robert@Babelfish2 /c/TEMP
$ ruby gc3.rb > xx
Robert@Babelfish2 /c/TEMP
$ wc -l xx
10001 xx
Robert@Babelfish2 /c/TEMP
$ sort xx|wc -l
10001
Robert@Babelfish2 /c/TEMP
$ fgrep -n end xx
10001:end
10.times do
  1000.times { testit }
  GC.start
  sleep 1
end
puts "end"
If you comment out the "obj = nil", the binding is not modified and the instances are kept. The way it is here, the instances are collected, which IMHO you can see from the "end" line in the output. The trick is to use the local variable and modify the binding afterwards.
Also, the oids seem not reused (see the sort output).
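The definition of testit is not shown in the post, so what follows is only a hypothetical reconstruction of the trick described above, not the actual gc3.rb. The block still closes over the local variable obj, but obj is set to nil before the method returns, so the binding no longer leads to the instance. FINALIZED is an assumed name used here to count finalizer calls.

```ruby
FINALIZED = [] # oids reported by finalizers so far

def testit
  obj = Object.new
  # The block captures the local variable `obj`, not the object itself...
  ObjectSpace.define_finalizer(obj) { |oid| FINALIZED << oid }
  # ...so clearing the variable removes the reference held via the binding.
  obj = nil
end

1000.times { testit }
GC.start
puts "finalized so far: #{FINALIZED.size} of 1000"
```

How many finalizers have run after a single GC.start depends on the interpreter and its GC, so the printed count is not something to rely on. With the "obj = nil" line commented out, the count would be expected to stay at zero until the process exits.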
What is the real world problem you are trying to solve?
In my cursor package, I create child cursors and need to
keep track of them. But I don't want to keep a normal ref on
them, because that would prevent them from being garbage
collected. For example:
child = parent.position   # child holds the current position
parent.position?          # any positions/children outstanding?
parent.position?(child)   # is this a valid position?
parent.position!          # kill positions/children
parent.delete1Next        # update all children after this point
child.succ                # next position - use with Range
I don't want the user of this package to have to worry about
closing every single child. It would be a pain to have to do
this especially when intermediate expressions can yield a
child. I need to keep track of any outstanding, but I want GC
to get rid of any that aren't used anymore.
A solution to tackle the oid reuse issue (if it's an issue) would be to store something that can verify that the oid belongs to the data stored for that oid. Unfortunately I can't think of a way ATM...
If you are a Ruby hacker you need to know and understand C.
You write Ruby extensions in C.
it is the language for supermen and superwomen.
it's a special language
wondering how long he can keep arguing for this silly notion
I think if I were forced to associate a language with a sex, according to my
experiences with developers of that language, I would have to say:
---
language sex:
asm x86: geek
asm sparc: macho
v.b.: nerd
c: geek
cpp: man
c#: man
java: gay*
ruby: man
html: children, girl
sql: girl
php: bisex
python: bho!
prolog: geek
lisp: geek
* I just know many developers who, after learning Java, started to dress
with the same gadgets, like yellow rubber bracelets.
I don't know if this can be considered 100% a gay attitude.
> I have tried Thread.critical= which is what Mutex and probably
> other mutual exclusivity stuff are based on. The problem is
> that GC/finalizers is not considered to be in separate threads.
But in that case one of the two might help - the one that isn't
reentrant, would it?
I'll go look at this again. I thought all of this had to do with
exclusivity between threads. If I can do it within one thread,
that may work.
10.times do
1000.times { testit }
GC.start
sleep 1
end
puts "end"
...
Also, the oids seem not reused (see the sort output).
I find them reused. I modified your code a little to track
what's been used:
@used = {}
def testit
  obj = Object.new
  id = obj.object_id
  @used[id]==false and puts("reused #{id} after finalizing")
  @used[id] = true
  ObjectSpace.define_finalizer(obj) {|oid|
    #puts "cleanup #{oid}"
    begin
      ObjectSpace._id2ref(oid)
      puts("reused #{id} before finalizing!!!!!!!!!!")
    rescue RangeError
    end
    @used[oid] = false
  }
  obj = nil
end
10.times do
  1000.times { testit }
  GC.start
  sleep 1
end
Fortunately, I never see it reuse id's before the object for
that id is finalized.
A solution to tackle the oid reuse issue (if it's an issue) would be to
store something that can verify that the oid belongs to the data stored for
that oid. Unfortunately I can't think of a way ATM...
That's kind of what WeakRef does. Unfortunately it is about
20 times slower than using simple object ids. Using GC.start with
object ids is much faster.
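For readers who haven't used it, here is a minimal sketch of the WeakRef side of that comparison, using only the standard weakref library. The deref helper is a hypothetical name. It returns the target while it is alive and nil once it has been collected, which is the check that a bare object id cannot give you.

```ruby
require 'weakref'

# Hypothetical helper: returns the referenced object, or nil if it is gone.
def deref(ref)
  ref.weakref_alive? ? ref.__getobj__ : nil
rescue WeakRef::RefError
  # The target was collected between the check and the dereference.
  nil
end

obj = String.new("payload")
ref = WeakRef.new(obj)

puts ref.weakref_alive?      # true while `obj` is still strongly referenced
puts deref(ref).equal?(obj)  # the very same object, not a copy
```

The speed difference quoted above is the poster's own measurement and is not reproduced here.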
--- Robert Klemme <bob.news@gmx.net> wrote:
If one assumed that languages could procreate and evolve through sexual reproduction, what might happen if you took a "male" and a "female" language and left them alone in a room together with a cheap bottle of chardonnay?
I think I finally figured out a solution to making a set/list
of "WeakRef"s. The code is below. It mimics much of the Set
interface. It is an order of magnitude faster than an
implementation built on WeakRef, and it uses less memory.
The main thing I was missing was a check, after a successful
_id2ref, of whether the object had already been finalized. If it
had, that means I got an object that reclaimed the space my old
object had.
I wasn't sure that you could delete elements of a hash while
iterating over it, but it looks to work fine. Should this be
OK? Anybody see any race conditions or other problems with
this?
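On the hash question, a quick experiment may help. This is only a sketch of the behaviour on current CRuby (1.9 and later), which may differ from the interpreter used in this thread: deleting keys while iterating is accepted, inserting a new key during iteration raises, and iterating over a snapshot of the keys sidesteps both cases.

```ruby
ids = { 1 => true, 2 => true, 3 => true }

# Deleting entries while iterating is accepted.
ids.each_key { |id| ids.delete(id) if id.odd? }

# Adding a new key while iterating is rejected with a RuntimeError.
rejected = false
begin
  ids.each_key { |id| ids[id + 100] = true }
rescue RuntimeError => e
  rejected = true
  puts "insert during iteration rejected: #{e.message}"
end

# Iterating over a snapshot of the keys allows any modification.
ids.keys.each { |id| ids[id + 100] = true }
p ids.keys.sort
```

A finalizer that only deletes its own id from the hash therefore falls in the accepted case, at least as far as the hash itself is concerned.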
#!/bin/env ruby
class WeakRefList
  include Enumerable

  def finalizer(id);@ids.delete(id >> 1);end
  private :finalizer

  def initialize(enum=[],&block)
    replace(enum.collect(&block))
  end

  def add(o)
    @ids[o.__id__ >> 1] = true
    ObjectSpace.define_finalizer(o,method(:finalizer))
    self
  end
  alias << add

  def each(&block)
    @ids.each_key { |id|
      begin
        o = ObjectSpace._id2ref(id << 1)
        # double-check in case it was finalized
        block.call(o) if @ids.include?(id)
      rescue RangeError
      end
    }
    nil
  end

  def clear;@ids = {};self;end
  def merge(enum);enum.each{|o|add(o)};self;end
  def replace(enum);clear;merge(enum);self;end
  def delete(o);@ids.delete(o.__id__ >> 1);self;end
  def delete?(o);@ids.delete(o.__id__ >> 1)&&self;end
  def empty?;each{return(false)};true;end
  def include?(o);@ids.include?(o.__id__ >> 1);end
  alias member? include?
  def inspect;"#<#{self.class}: #{to_a.inspect}>";end
  def subtract(enum);enum.each{|o|delete(o)};self;end
  def size;n=0;each{n+=1};n;end
  alias length size
end

if __FILE__==$0
  require 'benchmark'
  class MyString < String; end
  weakrefs = WeakRefList.new
  $stdout.sync=true
  times = Benchmark.measure {
    10000.times { |i|
      print(".")
      obj = MyString.new("X"*rand(i+1))
      weakrefs << obj
      weakrefs.each { |o|
        MyString==o.class or
          raise("not a MyString: #{o.object_id} #{o.inspect}")
      }
      weakrefs.include?(obj) or
        raise("#{obj.inspect} disappeared")
      if rand(10).zero?
        weakrefs.delete(obj)
        !weakrefs.include?(obj) or
          raise("#{obj.inspect} didn't delete")
      end
    }
  }
  p weakrefs
  p weakrefs.size
  p weakrefs.empty?
  p weakrefs.clear
  p weakrefs.size
  p weakrefs.empty?
  puts(times)
end
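For comparison only, and not as part of the solution above: on current CRuby the same "set of weak references" idea can probably be sketched on top of ObjectSpace::WeakMap, which keys entries by object identity and drops them when the key is collected. The class name WeakSet is hypothetical, and ObjectSpace::WeakMap did not exist in the Ruby version this thread was written against.

```ruby
# Sketch under the assumption of a current CRuby with ObjectSpace::WeakMap.
class WeakSet
  include Enumerable

  def initialize
    @map = ObjectSpace::WeakMap.new # identity-keyed, entries vanish with the key
  end

  def add(obj)
    @map[obj] = true
    self
  end
  alias << add

  def include?(obj)
    @map.key?(obj)
  end

  # Iterates over a snapshot of the live keys.
  def each(&block)
    @map.keys.each(&block)
    nil
  end
end

a = String.new("alpha")
b = String.new("beta")
set = WeakSet.new
set << a << b

puts set.include?(a)                    # same object that was added
puts set.include?(String.new("alpha"))  # equal contents, different object
```

This avoids both the finalizer and _id2ref, at the cost of depending on an interpreter feature rather than plain object ids.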