Some of the wiki pitfall pages are hosted on a Japanese server. The HTML
source declares a Japanese encoding, while the text is in fact UTF-8,
causing backslashes to be displayed incorrectly.
I told Seki-san, who maintains rwiki.jin.gr.jp, and tried
to avoid this problem. The server returned
Content-type: text/html; charset=EUC-JP
because pages may contain Japanese. Now it should return
Content-type: text/html; charset=us-ascii
when the page does not contain any 8-bit chars.
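A minimal sketch of that rule, assuming a hypothetical helper named content_type_for (this is not RWiki's actual code, which I have not seen): label the page us-ascii only when the body has no 8-bit characters, and fall back to EUC-JP otherwise.

```ruby
# Hypothetical helper, not taken from RWiki: choose the charset label
# for a page body following the rule described in the post.
def content_type_for(body)
  # String#ascii_only? is true when every character is 7-bit ASCII.
  charset = body.ascii_only? ? "us-ascii" : "EUC-JP"
  "Content-type: text/html; charset=#{charset}"
end

puts content_type_for("plain ASCII page")
puts content_type_for("caf\u00e9")   # contains one non-ASCII character
```

Note that the sketch only picks the label; it does not convert the body, so a page that is really UTF-8 would still be mislabeled.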
In us-ascii I can’t write my name: Mikkel Fahnoe Jorgensen (In case it
doesn’t display, the letter after ‘hn’ and after ‘J’ is similar to lower
case of the empty set symbol, and is sometimes written ‘oe’ in ASCII).
Likewise I can’t write my name: なひ (In case it doesn’t display,
the letter, ah hmmm, forget it.)
Thank you for checking our wiki server. We had
another problem last night: posted Japanese characters
were lost because of charset=us-ascii.
As you said, UTF-8 is a better choice for the server,
even though RWiki cannot run with UTF-8…
I added it to the RWiki ToDo.
Very interesting observation, Gergely. If it is true, as you said, then
Ruby’s internal implementation in C can probably be optimized further. (We
deal with this kind of issue all the time in C++, don’t we? I mean,
isn’t it the case that the “+” operator and the “+=” operator invoke two
different functions in C++?)
Yes, they are.
But while we are on the subject of operators:
Has anybody suggested allowing users to define operators?
I mean NEW operators, so one could define ++ if one wants, or define ** for
power…
Gergo
+-[Kontra, Gergely @ Budapest University of Technology and Economics]-+
Before a long discussion starts, I would suggest that you
read the thread at Ruby Talk 15197 etc…
I wrote several stupid things there (I had 4 days of
Ruby experience at that time), but the answers by others
are really interesting and insightful.
... which I presume is cleverly exploiting the property that in the
underlying processor, all pointers need to point to objects which are
word-aligned; a word is 16 bits (2 bytes) or a multiple thereof; and
therefore all valid pointers are even.
So if an odd pointer cannot be valid, we might as well use it to represent a
Fixnum, with the value in the top 31 bits.
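To make the guess above concrete, here is a small sketch in plain Ruby integer arithmetic. It only imitates the scheme as I presume it works; it is not the interpreter's C code, and the names tag_fixnum, tagged_fixnum? and untag_fixnum are made up for the illustration.

```ruby
# Illustration only: model a machine word as an ordinary Ruby integer.
# All three method names are hypothetical.

# Store the value in the upper bits and set the low bit to 1.
def tag_fixnum(n)
  (n << 1) | 1
end

# An aligned pointer is always even, so an odd word must be a tagged number.
def tagged_fixnum?(word)
  word.odd?
end

# Shift the tag bit back out to recover the value.
def untag_fixnum(word)
  word >> 1
end

word = tag_fixnum(2)
puts word                  # 5
puts tagged_fixnum?(word)  # true
puts untag_fixnum(word)    # 2
```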
Brian.
On Sat, Oct 12, 2002 at 11:11:47AM +0100, Brian Candler wrote:
However, in practice this is efficient, since there appears to be a very
compact internal representation of references to Fixnums:
So I don’t think that an object representing “the number 2” is actually
allocated on the heap, thankfully
Hence “a+=1” does not increment the object pointed to by ‘a’ (you can’t
increment the number 1, it’s meaningless); rather, ‘a’ is changed to
reference a new object which is the result of applying method +(1) to the
original object.
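A short sketch of the rebinding described above. The variable names are arbitrary, and the String lines are added only as a contrast with an operation that really does modify an object in place.

```ruby
a = 1
b = a          # b refers to the same object as a
a += 1         # same as a = a + 1: a now refers to the result of 1.+(1)
puts a         # 2
puts b         # 1, the object b refers to was never changed

s = "ab".dup   # dup so the string is certainly mutable
t = s
s += "c"       # builds a new String and rebinds s
t << "!"       # << mutates the object t refers to, in place
puts s         # abc
puts t         # ab!
```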
When I discovered this some time ago, I wondered deeply what might be occupying
id #s 2, 4, 6, …
Interesting that “id” is documented in ‘ri’ to return a Fixnum, but:
1000000001.id => 2000000003
1000000001.id.class => Bignum
I’m surprised that numbers that high have a predictable ID, as if they’re
predefined. But then maybe the numbers are brought into being only as needed
(except for the first 100, like Python), with a special case formulaic approach
to assigning them an ID, and then all non-Integers are given an even ID so they
don’t clash. When I ran
a = []
ObjectSpace.each_object { |obj| a << obj }
a.find { |obj| obj.id % 2 == 1 }
it returned nil, so that must be the case.
What I also found interesting was that successive runs of this testlet seemed
to produce many more objects.
What do you mean? If you want to increment in place, use another
operation, like “succ!”. BTW, integers are stored without references, so
for Fixnum this question makes no sense.
It is also senseless for Bignum, Float and just about any number class,
due to immutability semantics of numbers.
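A small sketch of the difference, assuming a reasonably recent Ruby: String has a mutating succ!, while integers only offer succ, which returns a new value and leaves the receiver alone.

```ruby
s = "az".dup     # dup so the string is certainly mutable
same = s         # second reference to the same String
s.succ!          # changes the String itself
puts same        # ba, the change is visible through both references

n = 41
m = n.succ       # returns a new value; n is untouched
puts n           # 41
puts m           # 42

puts 1.respond_to?(:succ!)   # false, there is no in-place version for numbers
```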
I suggested the bad idea of “add!” and similar for numbers that are real
objects a year ago, to reduce object creation, but luckily Matz set me
straight. [ruby-talk:18999]
([ Kent Dahl ]/)_ ~[ http://www.stud.ntnu.no/~kentda/ ]/~
))_student/(( _d L b_/ NTNU - graduate engineering - 5. year )
( __õ|õ// ) )Industrial economics and technological management(
_/ö____/ (_engineering.discipline=Computer::Technology)
One question, if I just post the list to comp.lang.ruby, is it also
echoed to -talk, or is it the other way around? (I don’t subscribe to
-talk as I don’t have big e-mail box.)
Probably not everyone will agree, but I intend to keep the list as
minimal as possible, so that a person can finish reading it in
probably less than 15 minutes. With that, as the title implies also,
the list is more about preventing errors than about optimization.
Therefore, I have put some “extra” things under separate sub-heading
“Things That Are Good to Know :-)”.
Regards,
Bill
Small == Good.
“Things that are good to know” can be one-liners with no explanation, just so
nubies know what to look out for in the documentation, i.e. to whet their
appetite.
It is very interesting reading. Now I realize more and more that all
the problems that are explicitly exposed when someone learns C++ do not
simply disappear in Ruby. For example, C++ explicitly differentiates
“+” from “+=”, C++ explicitly discusses the assignment operator and the copy
constructor, etc.
I guess part of the “magic” of Ruby is a conscious design decision on what
problems are to be suppressed (such as making “a += b” as simply an alias
of “a = a + b” at the cost of some performance and not providing a deep
copy method) in its OO design.
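A minimal sketch of the “a += b” point, using a made-up Money class (nothing here comes from the thread): only “+” is defined, yet “+=” works, calls that same method, and leaves the original object untouched.

```ruby
# Hypothetical class for illustration; it defines + but no separate +=.
class Money
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  # Returns a new Money; the receiver is not modified.
  def +(other)
    puts "Money#+ called"
    Money.new(cents + other.cents)
  end
end

a = Money.new(100)
original = a
a += Money.new(50)     # expands to a = a + Money.new(50)

puts a.cents           # 150
puts original.cents    # 100
```

The price is one extra object per “+=”, which is the performance cost mentioned above.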
I think it will be very interesting to compare the design of C++ and
Ruby from the OO point of view.
Regards,
Bill
============================================================================
Christian Szegedy szegedy@nospam.or.uni-bonn.de wrote:
Before a long discussion starts, I would suggest you
to read the thread at Ruby Talk 15197 etc…
That’s because 1000000001 can still fit in a Fixnum, so Ruby calls INT2FIX,
which simply left-shifts by one and sets the least significant bit to 1.
When you call “id”, unless the object is a special constant, Ruby
simply sets the least significant bit to 1 and returns the “VALUE” of the
object (which for a Fixnum is the object itself). Because the least
significant bit is always set to 1 (except perhaps for special
constants), you will always see odd numbers for ids.
So for every object of type Fixnum, its id is simply given by a left shift by
one with the least significant bit set to 1, which is equivalent to
multiplying by two and adding 1.
When you call “1000000001.id.class”, the “class” method of
“1000000001.id” is called; because in this case 2000000003 does not
fit into a Fixnum, Ruby creates a Bignum, and returns its class.
So the id’s are not predefined (not stored in memory), but created
(computed) as necessary based on the simple formula above.
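The arithmetic can be checked with a small sketch. It assumes a 32-bit build where a Fixnum holds 31 bits including the sign (values from -2**30 to 2**30 - 1); the constants and method names are made up for the illustration, and the code models the formula rather than calling into the interpreter.

```ruby
# Assumed Fixnum range on a 32-bit build: 31 bits including the sign.
FIXNUM_MIN_32 = -2**30
FIXNUM_MAX_32 = 2**30 - 1

def fits_fixnum_32?(n)
  n.between?(FIXNUM_MIN_32, FIXNUM_MAX_32)
end

# The formula from the post: shift left by one, set the low bit.
def predicted_id(n)
  2 * n + 1
end

n  = 1_000_000_001
id = predicted_id(n)

puts id                    # 2000000003
puts fits_fixnum_32?(n)    # true: n itself is still a Fixnum
puts fits_fixnum_32?(id)   # false: the id needs a Bignum
```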
Finally, when you ran the successive testlets and you found many more
objects, I guess it is just because the testlet itself creates many more
Ruby (temporary) objects, which exist until the next gc cycle.
and others dynamically allocated, where the id is (I presume) a pointer to
the data structure. I haven’t looked at the source, but I posted a guessed
explanation of why fixnums have odd ids at ruby-talk:53040.
Bignums are normal heap-allocated objects and have normal (even) ids. This
means that the same value can end up being held in two different Bignum
objects:
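A minimal sketch of this, not the poster's original example, assuming a current CRuby in which Bignum has been folded into Integer but large values are still heap-allocated. The helper name big is made up; it just computes the same large value twice.

```ruby
# Hypothetical helper: compute the same large value on each call.
def big
  10**30
end

a = big
b = big

puts a == b        # compares the values
puts a.equal?(b)   # compares object identity
```

Here == compares the numbers, while equal? asks whether both variables refer to the very same object.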