Where Is Method Call Precedence?

Hi,

Isn’t that in the current Ruby (1.6.7) the memory area is already managed
in the ‘VALUE *ptr’ member of struct RArray? Are you also suggesting to
remove both ‘len’ and ‘capa’ and using some C “sizeof” trick? But then it
seems it is still only a saving of 8 bytes per tuple as compared to an
array (I don’t think we can remove the ‘struct RBasic basic’ member).

Regards,

Bill

···

============================================================================
Mauricio Fern?ndez batsman.geo@yahoo.com wrote:

On Mon, Sep 30, 2002 at 10:36:41PM +0900, William Djaja Tjokroaminata wrote:

Do you mean just because we need to story only “len” instead of “len” and
“capa” for each array/tuple? I think it is only a saving of 4 bytes per
array.

If size is fixed (de)allocation can be made much faster and is
easily managed with a memory arena.

Bulat Ziganshin bulatz@integ.ru wrote:

tuple is an array with fixed size. when you have very large number of
small arrays, you will save many memory bytes

“Mauricio Fernández” batsman.geo@yahoo.com wrote in message
news:20020930165416.GB1790@kodos…

Hi Bulat,

Do you mean just because we need to story only “len” instead of “len”
and
“capa” for each array/tuple? I think it is only a saving of 4 bytes per
array.

If size is fixed (de)allocation can be made much faster and is
easily managed with a memory arena.

Depends - on a copying garbage collector it doesn’t matter. Allocation
happens by moving a pointer and deallocating happens by copying objects that
are still in use. Of course Ruby’s garbage collector isn’t copying. OCaml
uses a mixture where the young generation is copying and the old generation
is mark sweep. (Young survivors are copied into the mark/sweep heap).

Mikkel

···

On Mon, Sep 30, 2002 at 10:36:41PM +0900, William Djaja Tjokroaminata wrote:

Hello Mauricio,

Monday, September 30, 2002, 11:04:12 PM, you wrote:

If size is fixed (de)allocation can be made much faster and is
easily managed with a memory arena.

it is another question. memory allocator is common to all classes, so
it’s his problem - fast allocation of small memory areas. your idea is
only for case when class has ITS OWN memory sub-allocator

···


Best regards,
Bulat mailto:bulatz@integ.ru

Hi Bulat,

Everything below is “obvious” and natural, isn’t it? That’s why I don’t
understand why Python does not allow mutable object to be used as a
hash/dictionary key in the first place.

Regards,

Bill

···

==========================================================================
Bulat Ziganshin bulatz@integ.ru wrote:

Hello Yukihiro,

Monday, September 30, 2002, 10:25:01 AM, you wrote:

s = “mutable”
hash = { s =>> “object” }
s.upcase!
p hash[“mutable”] #=> “object”

p hash[s] #=> nil

:slight_smile:


Best regards,
Bulat mailto:bulatz@integ.ru

Mon, 30 Sep 2002 22:16:37 +0900, William Djaja Tjokroaminata billtj@z.glue.umd.edu pisze:

Everything below is “obvious” and natural, isn’t it? That’s why
I don’t understand why Python does not allow mutable object to be
used as a hash/dictionary key in the first place.

It’s not obvious that a Ruby hash doesn’t put given objects as keys
but their duplicates. In Python it puts the keys as given (strings
and tuples are immutable so it would be a waste of time and memory
to make duplicates).

I think the difference between Python and Ruby is more philosophical
than technical, i.e. there mutable and immutable objects are
technically about the same.

Technically “immutable” means that there is no operation provided which
changes the state of the object. Philosophically it means that the
object’s identity is an artifact of the implementation and shouldn’t
be taken into account, only the value matters. It implies that passing
them by reference and by value is the same. All immutable objects of
some kind with the given value are equivalent. The implementation is
free to merge or disjoin representations of immutable objects when
it wants.

In Ruby any object can be frozen. In Python immutable and mutable
types are usually separate: mutability is not an attribute of an
object but a property of its type. (User-defined classes are not
formally classified, only intentionally.)

Python doesn’t enforce the consistency of treatment of these kinds
of objects: it allows to compare identity of immutable objects like
numbers (with unspecified results when objects are equal) and it
allows to use user-defined objects as keys of hashes, hoping that
their hash and equality relations won’t change over time.

Python tuples are different from lists (i.e. really arrays but
python calls them lists) also in this respects: tuples are usually
heterogeneous and of a fixed length, i.e. the length is usually
statically known in places they are used; lists are usually homogeneous
and of variable length.

Tuples are represented more efficiently than lists:

#define PyObject_HEAD
int ob_refcnt;
struct _typeobject *ob_type;

#define PyObject_VAR_HEAD
PyObject_HEAD
int ob_size; /* Number of items in variable part */

typedef struct {
PyObject_VAR_HEAD
PyObject *ob_item[1];
} PyTupleObject;
// Allocated with the additional space for elements.

typedef struct {
PyObject_VAR_HEAD
PyObject **ob_item;
} PyListObject;
// Capacity is implied by rounded size.

···


__("< Marcin Kowalczyk
__/ qrczak@knm.org.pl
^^ Blog człowieka poczciwego.

Agreed, we get to save a bit of mem per tuple, not per element. But it
could be significant if there’s a lot of small tuples…

I believe you’re thinking about the allocation of space for an object pointer
(which is simply taken from the area pointed to by ptr if capa > len),
whereas I was talking about the (de)allocation of the “ptr area” itself…

As for len & capa… capa would certainly go away, but this is not the
main enhancement. IMHO the main gain is never having to realloc() and
less memory fragmentation (especially internal). Right now, when an
array grows, its capa is increased by 0.5 * capa.

···

On Tue, Oct 01, 2002 at 04:37:38AM +0900, William Djaja Tjokroaminata wrote:

Hi,

Isn’t that in the current Ruby (1.6.7) the memory area is already managed
in the ‘VALUE *ptr’ member of struct RArray? Are you also suggesting to
remove both ‘len’ and ‘capa’ and using some C “sizeof” trick? But then it
seems it is still only a saving of 8 bytes per tuple as compared to an
array (I don’t think we can remove the ‘struct RBasic basic’ member).


_ _

__ __ | | ___ _ __ ___ __ _ _ __
'_ \ / | __/ __| '_ _ \ / ` | ’ \
) | (| | |
__ \ | | | | | (| | | | |
.__/ _,
|_|/| || ||_,|| |_|
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Footnotes are for things you believe don’t really belong in LDP manuals,
but want to include anyway.
– Joel N. Weber II discussing the ‘make’ chapter of LPG

Hello William,

Monday, September 30, 2002, 8:19:31 PM, you wrote:

Hi Bulat,

I am using Ruby v 1.6.7 (which is called the “stable version” at
www.ruby-lang.org).

First, I think in array.c “len” is really of type long (4 bytes in my
machine).

Second, if you use “Array.new(n)” (as for tuple also we need to know the
exact size when we create it), then although initially Ruby allocs 16
elements, it will be resized immediately to exactly n elements. So I
guess using tuple will only save memory of 4 bytes per array (assuming the
underlying pointer handle is the same for array and tuple).

  1. sizeof(len)+sizeof(capa) == 8 :slight_smile:
  2. Array don’t change capa after each operation. and there is times
    when we need to use another method to create initial array. tuple
    guarantees minimal size, when Array implementation is INTERNAL thing,
    which can be changed
···


Best regards,
Bulat mailto:bulatz@integ.ru

I was rather thinking of the common memory allocator using a separate
area (and algorithm) if it is told the memory for a n-sized tuple.

···

On Tue, Oct 01, 2002 at 02:01:57PM +0900, Bulat Ziganshin wrote:

Hello Mauricio,

Monday, September 30, 2002, 11:04:12 PM, you wrote:

If size is fixed (de)allocation can be made much faster and is
easily managed with a memory arena.

it is another question. memory allocator is common to all classes, so
it’s his problem - fast allocation of small memory areas. your idea is
only for case when class has ITS OWN memory sub-allocator


_ _

__ __ | | ___ _ __ ___ __ _ _ __
'_ \ / | __/ __| '_ _ \ / ` | ’ \
) | (| | |
__ \ | | | | | (| | | | |
.__/ _,
|_|/| || ||_,|| |_|
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

People disagree with me. I just ignore them.
– Linus Torvalds, regarding the use of C++ for the Linux kernel

Hello William,

Monday, September 30, 2002, 5:16:37 PM, you wrote:

Everything below is “obvious” and natural, isn’t it? That’s why I don’t
understand why Python does not allow mutable object to be used as a
hash/dictionary key in the first place.

eklmn! add one level of indirection:

s = “mutable”
hash = { [s] => “object” }
s.upcase!
p hash[[“mutable”]] #=> nil

it’s a well-known theoretical problem. call/asign by reference or by
value, and mutable objects versus non-modified

···


Best regards,
Bulat mailto:bulatz@integ.ru

Hi,

Thanks for the explanation; now it is more clear to me regarding the
difference between “pass by value” and “pass by reference” with respect to
hash keys. But still, the following code still does not give what I
expect:

ruby 1.6.7 (2002-03-01) [i686-linux]

s = “mutable”
arr = [s]
arr.freeze # make the array immutable ???
hash = { arr => “object” }
s.upcase!
p hash[arr] #=> nil

(I really thought that in Ruby the hash keys had all been “pass by
reference.”)

Regards,

Bill

···

============================================================================
Marcin ‘Qrczak’ Kowalczyk qrczak@knm.org.pl wrote:

In Ruby any object can be frozen. In Python immutable and mutable
types are usually separate: mutability is not an attribute of an
object but a property of its type. (User-defined classes are not
formally classified, only intentionally.)

“Marcin ‘Qrczak’ Kowalczyk” qrczak@knm.org.pl writes:

Mon, 30 Sep 2002 22:16:37 +0900, William Djaja Tjokroaminata billtj@z.glue.umd.edu pisze:

Everything below is “obvious” and natural, isn’t it? That’s why
I don’t understand why Python does not allow mutable object to be
used as a hash/dictionary key in the first place.

It’s not obvious that a Ruby hash doesn’t put given objects as keys
but their duplicates. In Python it puts the keys as given (strings
and tuples are immutable so it would be a waste of time and memory
to make duplicates).

I think the difference between Python and Ruby is more philosophical
than technical, i.e. there mutable and immutable objects are
technically about the same.

Technically “immutable” means that there is no operation provided which
changes the state of the object. Philosophically it means that the
object’s identity is an artifact of the implementation and shouldn’t
be taken into account, only the value matters. It implies that passing
them by reference and by value is the same.

on this subject, i’ve collected the behaviour of important types
(numbers, strings, lists, objects) in various language.

http://merd.net/pixel/language-study/various/mutability-and-sharing

as for me, the most surprising is the behaviour of strings in perl:
the deep copying is done via the "my ($s) = @", so that you can still
have the pass by ref by accessing $
[0] directly.

i thought ruby had somehow the same behaviour, but no, the “=” doesn’t
deep copy strings. Fortunately, Ruby has many purely functional
functions (eg: “s.sub(…)” vs perl’s =~ s///) allowing to ignore this
problem.

Tue, 1 Oct 2002 05:25:04 +0900, Pixel pixel@mandrakesoft.com pisze:

http://merd.net/pixel/language-study/various/mutability-and-sharing

You are wrong that OCaml string constructor allocates a copy.
OCaml Strings aren’t copy-on-write so this would be expensive.

let f () = “aaa”;;

val f : unit → string =

(f ()).[0] ← ‘A’;;

  • : unit = ()

f ();;

  • : string = “Aaa”

OCaml strings are technically mutable but they can’t change the length.
The mutability is generally used only to construct a new string by
allocation and filling.

A typo: not ‘strcmp(“hello”)’ but perhaps strcpy or strdup (non-std).
In Ruby it’s faster than strcpy under the hood because of COW.

···


__("< Marcin Kowalczyk
__/ qrczak@knm.org.pl
^^ Blog człowieka poczciwego.

Hello Mauricio,

Wednesday, October 02, 2002, 11:24:55 PM, you wrote:

it is another question. memory allocator is common to all classes, so
it’s his problem - fast allocation of small memory areas. your idea is
only for case when class has ITS OWN memory sub-allocator

I was rather thinking of the common memory allocator using a separate
area (and algorithm) if it is told the memory for a n-sized tuple.

better approach - using special blocks for allocating 1-20 byte areas,
and one common block for all areas larger. tuple is not only one
structure with fixed size which ryby VM has to allocate

···


Best regards,
Bulat mailto:bulatz@integ.ru

Hi Bulat,

When I first read your code below, it was also obvious to me, since the
original reference to the array [s] in the second line was lost
forever. In the last line, [“mutable”] is really a newly created object,
which is obviously different from the original array object.

But then, toying with your idea, I found this:

ruby 1.6.7 (2002-03-01) [i686-linux]

s = “mutable”
arr = [s]
hash = { arr => “object” }
s.upcase!
p hash[arr] #=> nil

This is really surprising to me! Is this the expected behavior in
Ruby? Because in Python, using an array as a hash key is forbidden in the
first place (syntax error, I guess), while Ruby simply allows us to do so.

Any idea, anyone? (I really thought before that the hash key were
actually just the object ID in Ruby.)

Regards,

Bill

···

============================================================================
Bulat Ziganshin bulatz@integ.ru wrote:

eklmn! add one level of indirection:

s = “mutable”
hash = { [s] => “object” }
s.upcase!
p hash[[“mutable”]] #=> nil

it’s a well-known theoretical problem. call/asign by reference or by
value, and mutable objects versus non-modified

“Marcin ‘Qrczak’ Kowalczyk” qrczak@knm.org.pl writes:

Tue, 1 Oct 2002 05:25:04 +0900, Pixel pixel@mandrakesoft.com pisze:

http://merd.net/pixel/language-study/various/mutability-and-sharing

You are wrong that OCaml string constructor allocates a copy.

fixing

[…]

A typo: not ‘strcmp(“hello”)’ but perhaps strcpy or strdup (non-std).

fixing (well removed :slight_smile:

In Ruby it’s faster than strcpy under the hood because of COW.

fixing.

thanks!

This is quite easy to implement in Ruby right now, but I don’t know how
big a speed increase we’d see. malloc and friends have been subject to
serious optimization so it won’t be easy to beat them even with a more
specific allocator.

···

On Thu, Oct 03, 2002 at 02:06:30PM +0900, Bulat Ziganshin wrote:

Hello Mauricio,

Wednesday, October 02, 2002, 11:24:55 PM, you wrote:

it is another question. memory allocator is common to all classes, so
it’s his problem - fast allocation of small memory areas. your idea is
only for case when class has ITS OWN memory sub-allocator

I was rather thinking of the common memory allocator using a separate
area (and algorithm) if it is told the memory for a n-sized tuple.

better approach - using special blocks for allocating 1-20 byte areas,
and one common block for all areas larger. tuple is not only one
structure with fixed size which ryby VM has to allocate


_ _

__ __ | | ___ _ __ ___ __ _ _ __
'_ \ / | __/ __| '_ _ \ / ` | ’ \
) | (| | |
__ \ | | | | | (| | | | |
.__/ _,
|_|/| || ||_,|| |_|
Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

We come to bury DOS, not to praise it.
– Paul Vojta, vojta@math.berkeley.edu

Hi –

···

On Tue, 1 Oct 2002, William Djaja Tjokroaminata wrote:

ruby 1.6.7 (2002-03-01) [i686-linux]

s = “mutable”
arr = [s]
hash = { arr => “object” }
s.upcase!
p hash[arr] #=> nil

This is really surprising to me! Is this the expected behavior in
Ruby? Because in Python, using an array as a hash key is forbidden in the
first place (syntax error, I guess), while Ruby simply allows us to do so.

Any idea, anyone? (I really thought before that the hash key were
actually just the object ID in Ruby.)

ri Hash#rehash:

 hsh.rehash -> hsh

 Rebuilds the hash based on the current hash values for each
 key. If values of key objects have changed since they were
 inserted, this method will reindex hsh. If Hash#rehash is called
 while an iterator is traversing the hash, an IndexError will be
 raised in the iterator.

    a = [ "a", "b" ]
    c = [ "c", "d" ]
    h = { a => 100, c => 300 }
    h[a]       #=> 100
    a[0] = "z"
    h[a]       #=> nil
    h.rehash   #=> {["c", "d"]=>300, ["z", "b"]=>100}
    h[a]       #=> 100

David


David Alan Black | Register for RubyConf 2002!
home: dblack@candle.superlink.net | November 1-3
work: blackdav@shu.edu | Seattle, WA, USA
Web: http://pirate.shu.edu/~blackdav | http://www.rubyconf.com

Hello William,

Monday, September 30, 2002, 8:02:32 PM, you wrote:

s = “mutable”
hash = { [s] => “object” }
s.upcase!
p hash[[“mutable”]] #=> nil

When I first read your code below, it was also obvious to me, since the
original reference to the array [s] in the second line was lost
forever. In the last line, [“mutable”] is really a newly created object,
which is obviously different from the original array object.

:)))) you are really thinking ruby way if you tell us about OBJECTS,
not VALUES! it is OBVIOUS that hash keys must not change just because
we changed some variable used to COMPUTE hash key at some place

(to everyone: this is not critique of ruby or reference-oriented
semantic :wink:

But then, toying with your idea, I found this:

ruby 1.6.7 (2002-03-01) [i686-linux]

s = “mutable”
arr = [s]
hash = { arr =>> “object” }
s.upcase!
p hash[arr] #=>> nil

This is really surprising to me! Is this the expected behavior in
Ruby? Because in Python, using an array as a hash key is forbidden in the
first place (syntax error, I guess), while Ruby simply allows us to do so.

Any idea, anyone? (I really thought before that the hash key were
actually just the object ID in Ruby.)

all ruby variables just hold REFERENCES to objects in heap (except for
fixnum/true/false/nil). when you assign variable or call the method,
just refernce (4 bytes) is copied. so, from this moment you have 2
references to the same object, and when you change object thorough one
reference, you will see change thorough another references

a=“xy”
b=a
a.upcase!
p a,b
def f(x); x[0] = ‘-’; x end
c=f(b)
p a,b,c

it’s a well-known theoretical problem. call/asign by reference or by
value, and mutable objects versus non-modified

and ruby has the best solution i know of this problem

···


Best regards,
Bulat mailto:bulatz@integ.ru

“Mauricio Fernández” batsman.geo@yahoo.com wrote in message
news:20021003104503.GC1129@kodos…

Hello Mauricio,

better approach - using special blocks for allocating 1-20 byte areas,
and one common block for all areas larger. tuple is not only one
structure with fixed size which ryby VM has to allocate

This is quite easy to implement in Ruby right now, but I don’t know how
big a speed increase we’d see. malloc and friends have been subject to
serious optimization so it won’t be easy to beat them even with a more
specific allocator.

malloc is constrained by synchronization issues - for that reason alone it
pays off to customize allocation when you know your threads.

I have an unsynchronized (you can of course add sync.) block allocation that
is pretty fast. It was originally posted in Dr.Dobbs Journal as smart alloc
and I made a few changes. It creates a chained list of pools for each size
and by default create blocks that are powers of two - and uses standard
malloc above a certain size. There is virtually no space overhead in the
allocation thanks to aligned pointer arithmetic.
Also, by default it uses logarithm from requested size to locate the right
pool. However, if you always allocate a known size that you have a dedicated
pool for - you can skip the pool selection step and get faster allocation.
For small strings, you don’t know the size smallest larger pool size and you
do want the pool selection logic. I’ve seen fairly decent speedups on a copy
on write string class using this allocation.

If interested let me know - drop the “-anti-spam” from my mail address.
mikkelfj-anti-spam@bigfoot.com

Mikkel

···

On Thu, Oct 03, 2002 at 02:06:30PM +0900, Bulat Ziganshin wrote:

Wow, quite a revelation to me, David! Thanks a lot for pointing the very
importance
of “rehash” when dealing with “mutable” keys. (I guess
then there is definitely no “rehash” in Python.)

Regards,

Bill

···

===========================================================================
dblack@candle.superlink.net wrote:

    a = [ "a", "b" ]
    c = [ "c", "d" ]
    h = { a => 100, c => 300 }
    h[a]       #=> 100
    a[0] = "z"
    h[a]       #=> nil
    h.rehash   #=> {["c", "d"]=>300, ["z", "b"]=>100}
    h[a]       #=> 100