rb_str_new with custom subclass of rb_cString

Is it possible to construct a string with a custom subclass or replace the
klass pointer of a VALUE in an existing one?

I see rb_str_new_with_class, but it actually takes an object and uses its
class rather than taking a class directly, which might work but is a bit of
a hack. Ideally I would be able to change the class of the object... but
perhaps that's asking too much?

Thanks,
Samuel

> Is it possible to construct a string with a custom subclass or replace the
> klass pointer of a VALUE in an existing one?

You can always use:

  rb_funcall(your_subclass, rb_intern("new"), argc, ...)

> I see rb_str_new_with_class, but it actually takes an object and uses its
> class rather than taking a class directly, which might work but is a bit of
> a hack.

I guess it works, but yeah, hacky...

> Ideally I would be able to change the class of the object... but perhaps
> that's asking too much?

That might make future (and maybe today's) optimizations more
difficult.

The only place where an object's class is malleable is IO,
because that maps to the OS behavior (at least on *nix).

Keep in mind that subclassing highly-used core classes (such as
String) is a bad idea, performance-wise. The VM special-cases
some core classes, but those optimizations are disabled for
your subclass. You can see most of these VM optimizations in
insns.def in the C Ruby sources.

···

Samuel Williams <space.ship.traveller@gmail.com> wrote:

Hello!
Excuse me, maybe I don't quite understand you.
Why do you want to use native code to create a String subclass instance?

Doesn't this work for you?

class MyStr < String; end
MyStr.new('Hello, World!')

On 18 March 2017 at 6:13, Samuel Williams <space.ship.traveller@gmail.com> wrote:

> Is it possible to construct a string with a custom subclass or replace the
> klass pointer of a VALUE in an existing one?
>
> I see rb_str_new_with_class, but it actually takes an object and uses its
> class rather than taking a class directly, which might work but is a bit of
> a hack. Ideally I would be able to change the class of the object... but
> perhaps that's asking too much?
>
> Thanks,
> Samuel

------------------------------------------------------------------------

Unsubscribe:
<mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk>

Doing

class MyStr < String; end
MyStr.new('Hello, World!')

makes this hot code path too inefficient.
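
To get a rough feel for the overhead being discussed, the following is a minimal timing sketch in plain Ruby. It is only an approximation: the hot path in question lives in a C extension, whereas this compares copying a String with String#dup against calling MyStr.new from Ruby. The elapsed helper and the iteration count are assumptions made for illustration, and the numbers will vary by Ruby version and machine.

```ruby
# Subclass matching the MyStr example in this thread.
class MyStr < String; end

# Hypothetical helper: time a block with the monotonic clock and
# return the elapsed time in seconds.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

iterations = 200_000
source = 'Hello, World!'

# Baseline: copy a plain String.
plain_time = elapsed { iterations.times { source.dup } }

# Candidate: go through the subclass constructor.
subclass_time = elapsed { iterations.times { MyStr.new(source) } }

puts format('String#dup: %.4fs', plain_time)
puts format('MyStr.new:  %.4fs', subclass_time)
```

No expected timings are given here, since the ratio between the two depends heavily on the interpreter in use.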

I found I can use rb_obj_reveal to change the class. It worked perfectly
and performance was maintained. I'm not sure this is a good idea:
according to the docs it's okay, but the source code says not to use
it.

Thanks

···

On 19 March 2017 at 18:17, Dmitriy Non <non.dmitriy@gmail.com> wrote:

> Hello!
> Excuse me, maybe I don't quite understand you.
> Why do you want to use native code to create a String subclass instance?
>
> Doesn't this work for you?
>
> class MyStr < String; end
> MyStr.new('Hello, World!')


rb_obj_reveal is intended for hidden objects (klass == 0);
I don't think ruby-core can guarantee it will remain working
for changing non-hidden objects.

Anyways, why do you need to subclass String? You seem to care
about performance, and you may lose a lot of it when subclassing.

···

Samuel Williams <space.ship.traveller@gmail.com> wrote:

> Doing
>
> class MyStr < String; end
> MyStr.new('Hello, World!')
>
> makes this hot code path too inefficient.
>
> I found I can use rb_obj_reveal to change the class. It worked perfectly
> and performance was maintained. I'm not sure this is a good idea:
> according to the docs it's okay, but the source code says not to use
> it.

> Anyways, why do you need to subclass String? You seem to care
> about performance, and you may lose a lot of it when subclassing.

I've got quite a few benchmarks that show that significant performance
is gained by my code. But, it's still a work in progress. I don't
assert that it's good or the right way, but I just see how I can
improve real world performance for our use case.

I'm tracking whether strings need to be escaped on output (e.g.
something similar to CGI.escape_html). The tracking is almost zero-cost
during parsing of the source markup, so I can avoid escaping entirely
where it isn't needed. It's a big performance win: page render time
improves by 50%. The marking needs to be low-cost; instantiating a
custom subclass of String blows away any performance gains.

You can see a bit more how it works here:

So, when parsing Markup (approximately HTML), we track if any entities
were seen:

If entities are not seen, we don't need to escape the string on
output, we mark it with Trenni_markup_safe:

On output, we use this information, e.g. when generating tags:

Additionally, I found that CGI.escape_html is a bit slow; it could be
about 20% faster. I haven't finished my implementation since I'm just
exploring the various available optimisations. But the tangible
result is that page render times for a complex page on my laptop went
from about 15ms down to 6-7ms with native C code and avoiding
escape_html where possible.
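
The idea described in this message (note during parsing whether a string contains anything that needs escaping, then skip the escape pass on output) could be sketched in plain Ruby roughly as follows. This only illustrates the logic, not the performance: SafeString, parse_text, emit and the two constants are hypothetical names, the gsub-based escape merely stands in for something like CGI.escape_html, and the real implementation does the marking in C precisely because allocating a subclass instance from Ruby is too slow.

```ruby
# Hypothetical marker subclass: instances are assumed to need no escaping.
class SafeString < String; end

# Characters that HTML escaping would rewrite, and their replacements.
HTML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&#39;'
}.freeze
NEEDS_ESCAPE = /[&<>"']/

# Parsing step (hypothetical): record whether anything escapable was seen
# and mark the string as safe when nothing was.
def parse_text(raw)
  raw.match?(NEEDS_ESCAPE) ? String.new(raw) : SafeString.new(raw)
end

# Output step: skip the escape pass for strings that were marked safe.
def emit(text)
  return text if text.is_a?(SafeString)
  text.gsub(NEEDS_ESCAPE, HTML_ESCAPES)
end

puts emit(parse_text('Hello, World!')) # prints "Hello, World!"
puts emit(parse_text('1 < 2 & 3'))     # prints "1 &lt; 2 &amp; 3"
```

The design point is that the check happens once, at parse time, so repeated output of the same string never pays for escaping it again.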

Thanks for your interest.

···

On 20 March 2017 at 13:27, Eric Wong <e@80x24.org> wrote:

> rb_obj_reveal is intended for hidden objects (klass == 0);
> I don't think ruby-core can guarantee it will remain working
> for changing non-hidden objects.
>
> Anyways, why do you need to subclass String? You seem to care
> about performance, and you may lose a lot of it when subclassing.
