Very strange GC behaviour

(Brian Schröder) #1

Hello Geert,

The next version of ruby will contain another GC. IIRC it will be a
two byte reference counting GC. I'm no expert in this domain, so I
can't tell what will be the implications of the change.

Regarding your proposal to replace the hash with an avl tree. If you
look at the hash sources, you will see that it already uses a tree for
storage. Already we use the name hash only for the abstract data type,
the same thing that is called a map in c++.

I'd suppose that the two times speedup you observed comes from storing
the data in c-structs instead of ruby objects, not from the hash
implementation. The algorithms and datastructures of the ruby c-code
are already pretty good.

But I think that a number of well developed basic datastructures, such
as a real hash, a b-tree a designated queue, deque, priority queue
etc. would be a nice addition to the standard library. Especially
together with duck typing one could start with a simple array that
supports all operations and exchange it against the optimal
datastructure at will.

regards,

Brian

···

On 24/08/05, Geert Fannes <Geert.Fannes@ikanconsulting.com> wrote:

The c avl tree indeed keeps its data in c-format, not by using ruby objects since that would indeed not give any gain (and I'd have to write some mark and sweep functions). I will check my code and define all the variables in the most outer loop. Together with more regular checking, hopefully I can keep the memory under control. Thank you very much for this advice.

On a more general level, I have the feeling the GC is not exactly doing what it should be doing. Perhaps some extensive investigation can lead to improvement of this. Do you know who to contact to find out IF GC is behaving differently from what is expected and how things can be improved? I would like to contribute, but ofcourse some help from people with experience and GC knowledge is crucial. Should I post the essence of my problem on the ruby-core list? Do you have any idea if things are different in the 1.9 version?

Also my recent positive experience with the performance of AVLTrees could be interesting: why not exchange the Hash class with an AVLTree class. I these trees can be made to behave the same way as hashes do, all hash related things would go faster, but I can imagine the reluctance of the ruby developers to exchange the hash with a tree and keep calling it a hash for compatibility reasons :-). Anyway, it would be nice to make a tree by default available for the next version of ruby.

Greetings,
Geert.

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/

(Brian Schröder) #2

One gotcha I've only seen after sending the other mail is, that moving
the variables to the outer loop won't buy you anything. While the GC
is disabled, this should eat infinite amounts of memory

GC.disable
a = "empty-"
loop do a += "" end
GC.enable

regards,

Brian

···

On 24/08/05, Brian Schröder <ruby.brian@gmail.com> wrote:

On 24/08/05, Geert Fannes <Geert.Fannes@ikanconsulting.com> wrote:
> The c avl tree indeed keeps its data in c-format, not by using ruby objects since that would indeed not give any gain (and I'd have to write some mark and sweep functions). I will check my code and define all the variables in the most outer loop. Together with more regular checking, hopefully I can keep the memory under control. Thank you very much for this advice.
>
> [snip]

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/

(Mauricio Fernandez) #3

The next version of ruby will contain another GC. IIRC it will be a
two byte reference counting GC. I'm no expert in this domain, so I

  ^^^^^^^^
  two *bit* refcounting

I'm not sure that was the final decision. At any rate, it seems mark &
sweep is not going away.

Regarding your proposal to replace the hash with an avl tree. If you
look at the hash sources, you will see that it already uses a tree for
storage. Already we use the name hash only for the abstract data type,
the same thing that is called a map in c++.

Have you patched ruby?
In both HEAD and the ruby_1_8 branch Hash is implemented using the hash
tables defined in st.c (no trees underneath).

···

On Wed, Aug 24, 2005 at 10:48:48PM +0900, Brian Schröder wrote:

--
Mauricio Fernandez

(Brian Schröder) #4

> The next version of ruby will contain another GC. IIRC it will be a
> two byte reference counting GC. I'm no expert in this domain, so I
  ^^^^^^^^
  two *bit* refcounting

I'm not sure that was the final decision. At any rate, it seems mark &
sweep is not going away.

> Regarding your proposal to replace the hash with an avl tree. If you
> look at the hash sources, you will see that it already uses a tree for
> storage. Already we use the name hash only for the abstract data type,
> the same thing that is called a map in c++.

Have you patched ruby?
In both HEAD and the ruby_1_8 branch Hash is implemented using the hash
tables defined in st.c (no trees underneath).

Ooops, that was a bunch of dingo kidneys I wrote. I remembered a post
on this mailing list that claimed hashes where trees. But I never
bothered to look myself. I need to add the necessary disclaimers to my
posts :-/

best regards,

Brian

···

On 24/08/05, Mauricio Fernández <mfp@acm.org> wrote:

On Wed, Aug 24, 2005 at 10:48:48PM +0900, Brian Schröder wrote:

--
Mauricio Fernandez

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/