Ruby implements copy-on-write for strings, so you can do stuff like
this very cheaply:
str = 0.chr * (2**24) # 16MiB allocated
str[100..-1] # this costs only a small amount of memory
Why does this optimization not apply in this case?
str[100..-2] # this costs around 16MiB of memory
As a side effect, if using regexps on a large string, the pre-match
and post-match variables behave differently:
s = 0.chr * (2**23) + "Hello" + 0.chr * (2**23) # About 16MiB allocated (after GC)
s.scan(/Hello/) { |m| p m } # This is free
p $'.size # This is free
p $`.size # This costs another 8MiB.
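A rough way to observe this is to build the same string and ask the objspace extension what each part of the match reports. This is only a sketch and assumes CRuby: ObjectSpace.memsize_of is an implementation-specific hint, and the numbers vary between versions, so none are shown here. MatchData#pre_match and MatchData#post_match return the same text as the pre-match and post-match variables.

```ruby
require 'objspace'

# Same layout as above: 8MiB of NUL bytes, "Hello", 8MiB of NUL bytes.
s = 0.chr * (2**23) + "Hello" + 0.chr * (2**23)

m = s.match(/Hello/)
pre  = m.pre_match   # text before the match; stops short of the end of s
post = m.post_match  # text after the match; runs to the end of s

# memsize_of is a hint, not an exact measurement of allocated memory.
puts "pre_match:  #{pre.bytesize} bytes of text, memsize #{ObjectSpace.memsize_of(pre)}"
puts "post_match: #{post.bytesize} bytes of text, memsize #{ObjectSpace.memsize_of(post)}"
```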
Interesting. Do you also happen to know why an additional field that stores the length is not used? Is the reason perhaps the use of C library string functions that work on zero-terminated strings?
Cheers
robert
On 05.05.2008 18:07, ts wrote:
Lars Christensen wrote:
Well, it's best if you look at rb_str_substr() in string.c
str[100..-1] # this costs only a small amount of memory
Ruby just needs to adjust the pointer and the length in the new
object
str[100..-2] # this costs around 16MiB of memory
One character is missing from the end of the previous string. If Ruby
did the same thing as before, it would have to
* adjust the pointer
* adjust the length
* add \0 at the end (inside the buffer shared with the original)
This means it would inevitably have modified the original string,
which is why it duplicates the data instead.
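The difference between the two slices can be checked with a small sketch like the one below. It assumes CRuby and the objspace extension; ObjectSpace.memsize_of is only a hint, and the exact figures depend on the Ruby version, so no expected numbers are given. On an interpreter that behaves as described above, the slice that reaches the end of the string would be expected to report far less memory than the one that stops a byte early.

```ruby
require 'objspace'

str = 0.chr * (2**24)   # 16MiB of NUL bytes

tail   = str[100..-1]   # ends where str ends, so the existing \0 can be reused
middle = str[100..-2]   # ends one byte earlier, so it needs a \0 of its own

# memsize_of is an implementation-specific hint, not an exact measurement.
tail_mem   = ObjectSpace.memsize_of(tail)
middle_mem = ObjectSpace.memsize_of(middle)

puts "tail:   #{tail.bytesize} bytes of text, memsize #{tail_mem}"
puts "middle: #{middle.bytesize} bytes of text, memsize #{middle_mem}"
```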
p $'.size # This is free
p $`.size # This costs another 8MiB.
Interesting. Do you also happen to know why an additional field that stores the length is not used?
I don't understand: it does have a field that gives the length of
the string. For example, with
Ah, ok. This happens when one is too lazy to look into the source. Somehow I had assumed that the length was not stored because you made the point that the \0 could not be inserted without altering the original. I concluded that there is no length field.
str = '0' * 200
str[100 .. -1]
the first object (in str) will have a length of 200
the length field in the new object will have the value 100
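From the Ruby side, the stored length shows up as follows. This is a small illustration, not a look at the C structure itself; the variable names sub and with_nul are made up for the example. Because the length is kept explicitly, a NUL byte inside a string is just another byte of data.

```ruby
str = '0' * 200
sub = str[100..-1]

puts str.length   # 200
puts sub.length   # 100

# The length is stored explicitly, so Ruby does not scan for a terminator.
with_nul = "abc\0def"
puts with_nul.length        # 7
puts with_nul.index("\0")   # 3
```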
Is the reason perhaps the use of C library string functions that work on zero-terminated strings?