Wow, YAML / Psych in 1.9.3 is *slow*!

I just started trying Ruby 1.9.3, coming from Ruby 1.8.7, and was
surprised to discover YAML's sluggishness. The chief problem seems to be
the Psych library.

In my main use case, a certain routine takes about 14 seconds.

Fortunately, I can switch to the "syck" library:

require 'yaml'
begin
  YAML::ENGINE.yamler = 'syck'
rescue
  # syck is not available in this build; keep the default engine
end

Using syck, the same routine takes about 7 seconds.

That's actually a bit faster than the same routine under Ruby 1.8.7 with
the old YAML, where the time for the same routine is 8 or 9 seconds. So
to get the juicy goodness of improved speed in Ruby 1.9.3, I definitely
need to use "syck".
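A minimal sketch of how a comparison like this might be timed, for anyone who wants to try it on their own data. The helper name `time_with_engine` and the sample document are made up for illustration, and `YAML::ENGINE` is assumed to exist only on Ruby 1.9.x builds; on other Rubies the sketch just times the default parser.

```ruby
require 'yaml'
require 'benchmark'

# Hypothetical helper: time a block while the named YAML engine is active,
# then restore whichever engine was active before.
# YAML::ENGINE is assumed to exist only on Ruby 1.9.x; elsewhere the
# requested name is ignored and the default engine is timed.
def time_with_engine(name)
  previous = nil
  if defined?(YAML::ENGINE)
    previous = YAML::ENGINE.yamler
    begin
      YAML::ENGINE.yamler = name
    rescue StandardError, LoadError
      previous = nil # requested engine unavailable; default stays active
    end
  end
  Benchmark.realtime { yield }
ensure
  YAML::ENGINE.yamler = previous if previous
end

# Made-up sample document standing in for the real workload.
sample = (1..1000).map { |i| "- id: #{i}\n  name: item #{i}\n" }.join

engines = defined?(YAML::ENGINE) ? %w[psych syck] : ['default']
engines.each do |engine|
  seconds = time_with_engine(engine) { YAML.load(sample) }
  puts format('%-8s %.4f s', engine, seconds)
end
```

Restoring the previous engine in `ensure` keeps the switch scoped to the measured block, so the rest of the program is not affected by the experiment.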

A little googling suggests I'm not the only person to make this sort of
observation.

m.

···

--
matt neuburg, phd = matt@tidbits.com <http://www.tidbits.com/matt/>
A fool + a tool + an autorelease pool = cool!
AppleScript: the Definitive Guide - Second Edition!
http://www.tidbits.com/matt/default.html#applescriptthings

File. A. Bug.

Matt Neuburg wrote in post #1076014:

I just started trying Ruby 1.9.3, coming from Ruby 1.8.7, and was
surprised to discover YAML's sluggishness. The chief problem seems to be
the Psych library.

[...]

Fortunately, I can switch to the "syck" library:

begin
  YAML::ENGINE.yamler = 'syck'
rescue

Using syck, the same routine takes about 7 seconds.

Wow indeed! With a test app using some of my data (it has to load 820
YAML documents), switching to syck took execution time from 16.7 to 1.3
seconds!
Are there any known side effects of doing this?

Patrick

···

--
Posted via http://www.ruby-forum.com/.

File. A. Bug.

With? Whom?

m.

Patrick B. wrote on 19.09.2012 19:14:

Wow indeed! With a test app with some of my data (having to load 820
yaml documents) switching to syck took execution from 16.7 to 1.3
seconds!
Are there any known side-effects with doing this?

Yes; namely, not being compatible with the YAML specification.
Psych was written for a reason, and that reason was Syck's brokenness.
For years Syck generated invalid YAML (or refused to consume valid YAML;
I have seen different opinions on this) in common cases. For example,
see:
http://blog.rubygems.org/2011/08/31/shaving-the-yaml-yak.html
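
One simple way to probe for this kind of incompatibility is a round-trip check: dump a value, load it back, and compare. The sketch below is only an illustration; the helper name `round_trips?` and the sample values are assumptions, and it exercises whichever YAML engine is currently active rather than reproducing the specific Syck failures described in the linked post.

```ruby
require 'yaml'

# Hypothetical helper: true when a value survives dump followed by load.
def round_trips?(value)
  YAML.load(YAML.dump(value)) == value
rescue StandardError, ScriptError
  false # an error while parsing the engine's own output counts as a failure
end

# Made-up sample values, chosen to include characters YAML treats specially.
samples = [
  'plain string',
  'colon: inside',
  "multi\nline",
  '',
  { 'key' => ['nested', 1, 2.5, true, nil] }
]

samples.each do |value|
  puts "#{value.inspect}: #{round_trips?(value) ? 'ok' : 'MISMATCH'}"
end
```

A check like this only shows that an engine agrees with itself. Documents written by one engine and read by another would need to be compared separately.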

--
   WBR, Peter Zotov.

The psych devs.

Are you asking for someone to look up their email address for you?

Given that you've now blogged about psych and cited its source... I'm guessing you know.

···

On Sep 14, 2012, at 16:20 , Matt Neuburg <matt@tidbits.com> wrote:

Ryan Davis <ryand-ruby@zenspider.com> wrote:

File. A. Bug.

With? Whom?

Peter Zotov wrote in post #1076749:

Yes; namely, not being compatible with YAML specification.
Psych was written for a reason, and that reason was Syck's brokenness.
For years it has generated invalid YAML (or was refusing to consume
valid YAML; I have seen different opinions on this) in common cases. For
example, this:
http://blog.rubygems.org/2011/08/31/shaving-the-yaml-yak.html

So then what can be done to improve the speed of the 'correct' YAML
library? The difference is, to put it mildly, significant. :(

I think the question is a reasonable one. Who is to "blame" in a case
like this? The psych people? The YAML people, who have so forcibly
replaced syck with psych and made this horrible warning notice appear in
1.9.3 if you didn't build psych into Ruby? The Ruby people, who have
accepted that situation and allowed it to be built into the core? I
don't think the answer is obvious. m.

PS Don't be a ninny. Assume that your interlocutor has a brain. Give me
the same benefit of the doubt that I give you.

···

Jam Bees <jam@jamandbees.net> wrote:

The psych devs.

Are you asking for someone to look up their email address for you?

Where is the performance problem? File the issue there.

In the end I filed in two places: Psych on GitHub, and Ruby itself.
Here's why. There are really two problems. First, Psych is demonstrably
slow. I was able to write a simple real-world case showing that it takes
Psych more than three times as long as Syck to do a load_file:
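
The filed test case is not reproduced in this thread. Purely as an illustration of the kind of measurement described, and not the case that was actually filed, a load_file timing might be sketched as follows. The helper name `best_load_time`, the generated data, and the run count are all assumptions.

```ruby
require 'yaml'
require 'benchmark'
require 'tempfile'

# Hypothetical helper: best-of-N wall-clock time for YAML.load_file.
def best_load_time(path, runs = 5)
  (1..runs).map { Benchmark.realtime { YAML.load_file(path) } }.min
end

# Made-up data: 2000 small mappings written out as plain YAML text.
yaml_text = (1..2000).map { |i| "- id: #{i}\n  title: record #{i}\n" }.join

best = nil
Tempfile.open(['sample', '.yml']) do |file|
  file.write(yaml_text)
  file.flush
  best = best_load_time(file.path)
end

puts format('best of 5 runs: %.4f s', best)
```

Taking the minimum over several runs reduces noise from disk caching and other processes. Running the same script under each engine would give numbers that can be compared directly.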

However, there's also a deeply disturbing philosophical problem. Psych
has been crammed down the throats of users before it's ready for prime
time. If you build Ruby 1.9.3 without libyaml being present, Psych won't
be installed and every time YAML is used, including every time you touch
"gem" in any way, you get a nasty warning. So I also filed at rubybugs
asking that this force-feeding of Psych be backed out. It won't be, of
course, but the point is made. m.

···

James Harrison <jam@jamandbees.net> wrote:

Where is the performance problem? File the issue there.

You do realize Syck is completely broken and unmaintained, right?

···

On Sat, Sep 15, 2012 at 4:50 PM, Matt Neuburg <matt@tidbits.com> wrote:

However, there's also a deeply disturbing philosophical problem. Psych
has been crammed down the throats of users before it's ready for prime
time.

--
Tony Arcieri