Is Ruby Array#shift/unshift Efficient?

Hi,

I just scanned the Ruby array implementation in array.c, and to my
surprise, every time Array#shift/unshift is called, memmove() is
invoked. So far I never saw any advice for the preferred use of
push/pop to shift/unshift. The last time I used C++, vector was good only
for push/pop while if we wanted to use push/pop/shift/unshift it was
better to use a deque.

Therefore, is memmove() really that efficient (O(1) operation), or should
Ruby array be re-implemented using a deque, or is there something else
that I am not aware of? (For example, in Python array, only pop() and
append() are given, to signify that insert/delete at the beginning of
array is not an efficient (O(1)) operation. I think C++ STL follows the
same philosophy.)

Regards,

Bill

Hi,

···

In message “Is Ruby Array#shift/unshift Efficient?” on 02/09/17, William Djaja Tjokroaminata billtj@y.glue.umd.edu writes:

I just scanned the Ruby array implementation in array.c, and to my
surprise, every time Array#shift/unshift is called, memmove() is
invoked.

“shift” is fixed in 1.7.

						matz.

Hi Matz,

For the next stable release of Ruby (with shift() fixed), will these two C
calls still be valid:

len = (long) RARRAY (someArray)->len;
obj = rb_ary_entry (someArray, i);

Regards,

Bill

···

============================================================================
Yukihiro Matsumoto matz@ruby-lang.org wrote:

Hi,

In message “Is Ruby Array#shift/unshift Efficient?” > on 02/09/17, William Djaja Tjokroaminata billtj@y.glue.umd.edu writes:

I just scanned the Ruby array implementation in array.c, and to my
surprise, every time Array#shift/unshift is called, memmove() is
invoked.

“shift” is fixed in 1.7.

  					matz.

Hi,

···

In message “Re: Is Ruby Array#shift/unshift Efficient?” on 02/09/17, William Djaja Tjokroaminata billtj@y.glue.umd.edu writes:

For the next stable release of Ruby (with shift() fixed), will these two C
calls still be valid:

len = (long) RARRAY (someArray)->len;
obj = rb_ary_entry (someArray, i);

Yes.

						matz.

Hi Matz,

Thanks a lot for the response.

Now this is a general question to anyone. In the field of data
structures, there are so many of them, and each one is efficient for
certain types of operation. However, it seems that most scripting
languages provide only array and hash, like Ruby, Python, and Perl. The
original Tcl still provided array, list, and hash. Some other languages
like Lua even provides only hash.

In using a scripting language, aren’t we concerned with the underlying
data structure itself? Is the answer:

  1. Beside array and hash, other data structures are not “practical”, in
    the sense that they are very rarely used.

  2. In a scripting language, the efficiency of array vs. list is immaterial
    as compared with the overhead of the scripting language itself.

  3. Or something else?

I think this is interesting, as in C++ I always have to decide at least to
use a vector or a list (or even a deque) when I need an ordered
collection. But in Ruby, there is no such choice (at least using the
built-in classes).

Regards,

Bill

···

=============================================================================
Yukihiro Matsumoto matz@ruby-lang.org wrote:

For the next stable release of Ruby (with shift() fixed), will these two C
calls still be valid:

len = (long) RARRAY (someArray)->len;
obj = rb_ary_entry (someArray, i);

Yes.

  					matz.

Hi,

···

In message “Re: Is Ruby Array#shift/unshift Efficient?” on 02/09/17, William Djaja Tjokroaminata billtj@y.glue.umd.edu writes:

In using a scripting language, aren’t we concerned with the underlying
data structure itself?

Interesting point. I wish users do not need to worry about efficiency
as less as possible. I’m trying to make everything efficient enough
for scripting language. The problem is that the target domain of Ruby
is expanding every day. No one even imagine embedding Ruby in a soft
real time system like commercial video games 5 years ago.

						matz.

“William Djaja Tjokroaminata” billtj@y.glue.umd.edu wrote in message
news:am593i$l3s$1@grapevine.wam.umd.edu…

Hi Matz,
In using a scripting language, aren’t we concerned with the underlying
data structure itself? Is the answer:

  1. Beside array and hash, other data structures are not “practical”, in
    the sense that they are very rarely used.

I have been looking into a deque-like datastructure I call an I-Tree or
index tree. It is a linked list of buffers where all buffers are leaf nodes
of a tree indexed but the size of the subtree. It is a slight variation of
the B-Tree.
This datastructure uses memory move operations within a single buffer, but
does not need to move the entire vector for insertion at the front, back or
in the middle.

At the same time I’m using B-Trees instead of hashes because hashes tend to
be most efficient for large collections while many collections are not that
large.

However, the implementation of in-memory B-Tree and I-Tree is not trivial
compared to an array and a hash table.

  1. In a scripting language, the efficiency of array vs. list is immaterial
    as compared with the overhead of the scripting language itself.

I disagree. Some scripting languages are really fast exactly because they
have efficient datastructures well integrated with memory management, where
it can be quite tedious to achieve the same result in C++, although STL has
helped a lot.

In fact I believe this is the reason that Ruby has a quite good performance
in praxis.

  1. Or something else?

I think this is interesting, as in C++ I always have to decide at least to
use a vector or a list (or even a deque) when I need an ordered
collection. But in Ruby, there is no such choice (at least using the
built-in classes).

I seldom uses deque mostly by habit because vector and list does the job,
but in reality all I need is the deque datastructure. You only need a few
good datastructures.

Mikkel

Hi Matz,

I also noticed that the pickaxe book does not discuss execution
efficiency as much as, say, the book “Python Essential Reference”. There
is also “Performance tips for Ruby” at
http://www.bagley.org/~doug/shootout/lang/ruby/. Do you think you can
bless them?

If the target domain of Ruby is expanding every day (mine is for network
simulator), it is because more and more people find that the language that
you created is easy, consistent, practical, and powerful; these terms do
not come together so often.

Regards,

Bill

···

=============================================================================
Yukihiro Matsumoto matz@ruby-lang.org wrote:

Hi,

In message “Re: Is Ruby Array#shift/unshift Efficient?” > on 02/09/17, William Djaja Tjokroaminata billtj@y.glue.umd.edu writes:

In using a scripting language, aren’t we concerned with the underlying
data structure itself?

Interesting point. I wish users do not need to worry about efficiency
as less as possible. I’m trying to make everything efficient enough
for scripting language. The problem is that the target domain of Ruby
is expanding every day. No one even imagine embedding Ruby in a soft
real time system like commercial video games 5 years ago.

  					matz.