Real arrays in Ruby [AAA]

Hi everybody

How can I create a real array (like in C), where all items have the same type?
If I use Array.new for 1 million entries, the pointers need more space than the contents.
- Or can this be optimized automatically by Ruby?
What is the best solution (for performance and memory usage)?

thank you
Andrew

···

How can I create a real array (like in C), where all items have the same type?

Ruby arrays aren't "fake" arrays any more than C arrays are "real" arrays.

The elements of a Ruby array do all have the same type: Object.

If I use Array.new for 1 million entries, the pointers need more space than the contents.
- Or can this be optimized automatically by Ruby?

I don't understand this question.

What is the best solution (for performance and memory usage)?

Depending on what you really want to do (you haven't really described it), NArray might be what you want:

% time ruby -rnarray -e 'n = 10_000_000; a = NArray.new(NArray::SFLOAT, n); n.times { |i| a[i] = i * i }; p a[-1]'
99999983599616.0

real 0m2.040s
user 0m1.998s
sys 0m0.035s

···

On Dec 7, 2015, at 15:59, Die Optimisten <inform@die-optimisten.net> wrote:

Hi
thanks for your answer. I've read this already.
I'm using Ruby 1.9 - are there any changes?
Do you mean Array#pack? Then I could use string operations directly, but they are very slow.
I would like to access the content directly (at any time) - how is it done without 1Mio pointers?
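One way to store a million same-typed values without a million object pointers, using only the standard library, is to keep them packed in a single binary String and convert on access. The following is a minimal sketch, not an established API: the class name PackedInt32Array and its methods are made up for illustration, and it assumes 32-bit signed integers are enough. Every access goes through pack/unpack, so it trades speed for memory.

```ruby
# Hypothetical helper for illustration; not part of Ruby or of any gem.
# Stores fixed-width 32-bit signed integers back to back in one binary String,
# so the memory cost is 4 bytes per item plus one String object.
class PackedInt32Array
  BYTES_PER_ITEM = 4
  FORMAT = 'l<' # 32-bit signed integer, little-endian

  attr_reader :size

  def initialize(size)
    @size = size
    # Zero-filled buffer; binary encoding makes character index == byte index.
    @buf = ("\0" * (size * BYTES_PER_ITEM)).force_encoding(Encoding::BINARY)
  end

  # Read item i by decoding its 4 bytes.
  def [](i)
    check_index(i)
    @buf.byteslice(i * BYTES_PER_ITEM, BYTES_PER_ITEM).unpack(FORMAT).first
  end

  # Write item i by overwriting its 4 bytes in place.
  def []=(i, value)
    check_index(i)
    @buf[i * BYTES_PER_ITEM, BYTES_PER_ITEM] = [value].pack(FORMAT)
  end

  # Size of the underlying buffer in bytes.
  def bytesize
    @buf.bytesize
  end

  private

  def check_index(i)
    raise IndexError, "index #{i} out of range" unless i.between?(0, @size - 1)
  end
end

numbers = PackedInt32Array.new(1_000_000)
numbers[0] = 7
numbers[999_999] = -42
puts numbers[999_999]  # => -42
puts numbers.bytesize  # => 4000000
```

Values outside the 32-bit range are not checked in this sketch; a different pack format (for example 'q<' for 64-bit integers or 'e' for single-precision floats) would need a matching item width.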

thank you
Andrew

···

On 2015-12-08 01:42, thomas Perkins wrote:

Class: Array (Ruby 2.2.0)

Hi Andrew,

Could you tell us a bit more about the thing you're trying to accomplish?
What makes you concerned about the fact that pointers need more space than
data? Is memory a limiting factor?

If you want to get a feel for memory usage you can fire up irb and do: a =
(1..1_000_000).to_a; then take a look at RSS of the process. In my case
it's around 32 MiB.
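As a complement to looking at the RSS of the whole process, the size of the array's own storage can be estimated from inside Ruby. This is a rough sketch that assumes MRI, where the objspace extension is available; ObjectSpace.memsize_of is documented as a hint rather than an exact figure, and the variable names are only illustrative.

```ruby
require 'objspace'

# Same array as in the irb suggestion above.
a = (1..1_000_000).to_a

# memsize_of reports the memory held by the Array object itself,
# i.e. mainly its slots, not the whole interpreter process.
slot_bytes = ObjectSpace.memsize_of(a)

puts "array storage: #{slot_bytes} bytes"
puts "bytes per element: #{(slot_bytes.to_f / a.size).round(2)}"
```

On a 64-bit build this typically comes out at roughly 8 bytes per element, so most of the RSS seen for the process is the interpreter itself rather than the array.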

Best regards

···

--
Greg Navis

Sorry for my delay in responding - I've only just seen this.

Apart from the other suggestions, including NArray: in the past I've
used JRuby and found that it integrated quite well with Java.

···

Use a hash, not an array; hashes are better for bigger amounts of objects. You could also use a YAML or JSON file if you really wanted to get fancy:

Hash:
x = {
  1 => "example",
  2 => "another_example"
}

YAML:
user:
  - lost_bam
  - another_user
password:
  - 123456789
  - 987654321

JSON:
{
  "user": {
    "username": [ "lost_bam", "another_user" ],
    "password": [ "123456788", "987654321" ]
  }
}

Any of these is a better option than an array for the amount you want to store.

···

Sent from my iPhone

On Dec 7, 2015, at 8:13 PM, Bee.Lists <bee.lists@gmail.com> wrote:

1 million items in an array? Build a temporary table if need be in your database. That’s a big array. Doesn’t sound practical. I’ve never approached something that large in memory.

On Dec 7, 2015, at 8:38 PM, Die Optimisten <inform@die-optimisten.net> wrote:

I would like to access the content directly (at any time) - how is it done without 1Mio pointers?

Cheers, Bee

Hi Andrew,

Ruby gives you practically zero control over memory usage, and uses heap
allocation all the time. So there's no way to avoid the internal pointer
allocation with an array.

I'd suggest that if you need fast and memory efficient array access with an
array of ~1 million records, you're probably using the wrong language! You
probably want a language where you have control over how memory is
allocated, such as C or perhaps Rust.

If you really want to use Ruby you could look into integrating with an
external storage solution that gives fast array-like operations, such as
Redis.

Best of luck!

Jon

···

--
https://twitter.com/joonty

Use a hash, not an array, hashes are better for bigger amounts of objects,

That is not true. Whether to use Hash or Array is determined not by
the number of objects to be put inside but by the key type and usage
patterns. Hashes are only better if you need non-integer indexes or
have holes in the sequence of integer keys. Because of the way Hash works,
it will likely take more memory for one million entries indexed by 0
... 1_000_000 than the corresponding Array instance.

irb(main):005:0> Benchmark.bmbm do |x|
irb(main):006:1* x.report("array 1") { Array.new(1_000_000) }
irb(main):007:1> x.report("array 2") { a=[]; for i in 0 ... 1_000_000; a[i]=a; end }
irb(main):008:1> x.report("hash") { h={}; for i in 0 ... 1_000_000; h[i]=h; end }
irb(main):009:1> end
Rehearsal -------------------------------------------
array 1   0.000000   0.000000   0.000000 (  0.004964)
array 2   0.078000   0.000000   0.078000 (  0.086736)
hash      0.609000   0.015000   0.624000 (  0.623278)
---------------------------------- total: 0.702000sec

              user     system      total        real
array 1   0.000000   0.000000   0.000000 (  0.000741)
array 2   0.062000   0.000000   0.062000 (  0.066444)
hash      0.421000   0.000000   0.421000 (  0.419497)
=> [#<Benchmark::Tms:0x0000060004f0f0 @label="array 1",
@real=0.0007412240083795041, @cstime=0.0, @cutime=0.0, @stime=0.0,
@utime=0.0, @total=0.0>, #<Benchmark::Tms:0x0000060004feb0
@label="array 2", @real=0.06644390399742406, @cstime=0.0, @cutime=0.0,
@stime=0.0, @utime=0.06200000000000028, @total=0.06200000000000028>,
#<Benchmark::Tms:0x0000060047cda8 @label="hash",
@real=0.41949673400085885, @cstime=0.0, @cutime=0.0, @stime=0.0,
@utime=0.4209999999999998, @total=0.4209999999999998>]
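The benchmark above measures time only. The memory side of the claim can be checked in a similar way; the following is a rough sketch that assumes MRI with the objspace extension, and ObjectSpace.memsize_of is only an estimate of each container's own storage. The variable names and the element count are chosen for illustration.

```ruby
require 'objspace'

n = 100_000

# Array and Hash holding the same integer keys/values 0 ... n.
array = Array.new(n) { |i| i }
hash = {}
n.times { |i| hash[i] = i }

array_bytes = ObjectSpace.memsize_of(array)
hash_bytes  = ObjectSpace.memsize_of(hash)

puts "Array: #{array_bytes} bytes"
puts "Hash:  #{hash_bytes} bytes"
puts "Hash/Array ratio: #{(hash_bytes.to_f / array_bytes).round(1)}"
```

The Hash has to store a key, a value and hashing bookkeeping per entry, while the Array stores one slot per entry, so the Hash is expected to come out larger for densely packed integer keys.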

you could also use a YAML or JSON file if you really wanted to get fancy:

Those are used for external storage, not for manipulating objects in memory.

Any of these is a better option than an array for the amount you want to store

I beg to differ. Even Marshaling an Array with 1_000_000 elements is
not that big a deal IMHO.

Cheers

robert

···

On Tue, Dec 8, 2015 at 3:24 AM, thomas Perkins <thomas.perkins23@icloud.com> wrote:

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Hmm, well I learned something new today, and I just woke up! I stand corrected. Thank you, sir.

···

Quoting Jonathan Cairns (jon@joncairns.com):

   If you really want to use ruby you could look into integrating with an
   external storage solution that gives fast array like operations, such as
   redis.

Or you can learn how to mix C and Ruby. It is not beginner stuff,
though.

Carlo

···

Subject: Re: real arrays in Ruby [AAA]
  Date: mar 08 dic 15 07:23:45 +0000

--
  * Se la Strada e la sua Virtu' non fossero state messe da parte,
* K * Carlo E. Prelz - fluido@fluido.as che bisogno ci sarebbe
  * di parlare tanto di amore e di rettitudine? (Chuang-Tzu)

Hi Andrew,

Ruby gives you practically zero control over memory usage, and uses heap
allocation all the time. So there's no way to avoid the internal pointer
allocation with an array.

Integers are referred to (almost) directly. I'm not sure how much
overhead there is on arrays just of integers.

I'd suggest that if you need fast and memory efficient array access with an
array of ~1 million records, you're probably using the wrong language! You
probably want a language where you have control over how memory is
allocated, such as C or perhaps Rust.

If you really want to use ruby you could look into integrating with an
external storage solution that gives fast array like operations, such as
redis.

Or NArray.

···

On Tue, Dec 08, 2015, Jonathan Cairns wrote:

--
        Eric Christopherson

What you say is true for MRI - I do not know about other Ruby
implementations. And it applies only to Fixnum, not to integers in general. The
technique used is the tagged pointer (see the Wikipedia article "Tagged pointer").
Basically the pointer *is* the value; there is nothing referred to. So
yes, there is no memory overhead beyond the array's own slots when an Array
consists only of values which are either Fixnum or any of the tagged values
(true, false, nil IIRC).
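The effect of tagged values can be observed from Ruby itself. This is a small sketch assuming MRI with the objspace extension; deep_size is a made-up helper name, and ObjectSpace.memsize_of is an estimate, not an exact accounting.

```ruby
require 'objspace'

# Hypothetical helper: the array's own storage plus the storage of each
# element it points to.
def deep_size(ary)
  ObjectSpace.memsize_of(ary) +
    ary.inject(0) { |sum, e| sum + ObjectSpace.memsize_of(e) }
end

ints = Array.new(1000) { |i| i }          # small integers: tagged values
strs = Array.new(1000) { |i| "item #{i}" } # strings: separate heap objects

# Tagged values occupy no heap object of their own.
puts ObjectSpace.memsize_of(42)   # => 0
puts ObjectSpace.memsize_of(nil)  # => 0

puts "ints: #{deep_size(ints)} bytes"
puts "strs: #{deep_size(strs)} bytes"
```

For the integer array the total equals the size of the array's slots alone, while the string array additionally pays for one heap object per element.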

Cheers

robert