I have a small (200-line) program that processes XML files and creates Ruby object trees. It uses REXML, the native Ruby XML library, a great deal.
On my 2GHz Intel Mac running Mac OS X, the latest YARV sped up my code by roughly a factor of 2.
ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
53 seconds
ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
28 seconds
My program processes BlackBoard XML course archives and produces numerous statistics about the discussion threads. The original program then wrote an Excel output file; that function was removed for these tests. In my test I processed a course with 26 separate XML files containing discussion threads and created Ruby objects representing the information I am interested in.
--
- Stephen Bannasch
Concord Consortium, http://www.concord.org
> I have a small (200-line) program that processes XML files and
> creates Ruby object trees. It uses REXML, the native Ruby XML library,
> a great deal.
>
> On my 2GHz Intel Mac running Mac OS X, the latest YARV sped up my
> code by roughly a factor of 2.
>
> ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
> 53 seconds
>
> ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> 28 seconds
I've never looked at YARV's internals, but I'm guessing that it creates
Ruby objects directly as it processes the stream.
Unless you're using the stream API in REXML, you're comparing apples and
oranges. It stands to reason that parsing a stream directly is much
faster than parsing a stream into a document tree and then processing the tree.
I'd be surprised if it still wasn't faster to use YARV to build a Ruby
object tree rather than building from an XML stream, but the difference
probably won't be as great.
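To make the stream-versus-tree distinction concrete, here is a minimal sketch of both REXML styles run over the same input. The sample XML, the `PostCounter` class, and the element names are hypothetical, made up for illustration; they are not taken from the program being discussed.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample data; the real course archives are not shown in this thread.
SAMPLE_FORUM_XML = <<~DOC
  <forum>
    <thread title="Week 1"><post author="ann"/><post author="bob"/></thread>
    <thread title="Week 2"><post author="ann"/></thread>
  </forum>
DOC

# Stream style: REXML calls back into the listener for each tag it reads,
# and no document tree is kept in memory.
class PostCounter
  include REXML::StreamListener
  attr_reader :count

  def initialize
    @count = 0
  end

  # Called once for every opening (or self-closing) tag.
  def tag_start(name, _attrs)
    @count += 1 if name == 'post'
  end
end

listener = PostCounter.new
REXML::Document.parse_stream(SAMPLE_FORUM_XML, listener)
stream_count = listener.count

# Tree style: build the whole document first, then query it with XPath.
doc = REXML::Document.new(SAMPLE_FORUM_XML)
tree_count = REXML::XPath.match(doc, '//post').size

puts "stream: #{stream_count}, tree: #{tree_count}"
```

Both approaches see the same posts. The difference is only whether a full tree is built before any processing happens.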
Hi Daniel,
> I've never looked at YARV's internals, but I'm guessing that it creates
> Ruby objects directly as it processes the stream.
> Unless you're using the stream API in REXML, you're comparing apples and
> oranges. It stands to reason that parsing a stream directly is much
> faster than parsing a stream into a document tree and then processing the tree.
> I'd be surprised if it still wasn't faster to use YARV to build a Ruby
> object tree rather than building from an XML stream, but the difference
> probably won't be as great.
I'm not using the stream API. I'm creating a DOM tree and using REXML's version of XPath to process the tree. I then process the results and create a set of Ruby objects. Unless I'm missing something important, I assume that YARV is just executing my algorithms faster. I can't see how it would know to use the stream API.
I hadn't used YARV before and wanted to see how it works, so I picked a script that runs pretty slowly and spends most of its time in native Ruby to see how much faster it would be. I think a 2x speedup is nice.
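As a rough illustration of the approach described above (build a DOM tree, query it with REXML's XPath, turn the results into Ruby objects), here is a minimal sketch. The XML layout, the `DiscussionThread` struct, and the `load_threads` helper are assumptions made up for this example, not the actual script or the BlackBoard archive format.

```ruby
require 'rexml/document'

# Hypothetical sample data standing in for one discussion-thread file.
SAMPLE_COURSE_XML = <<~DOC
  <forum>
    <thread title="Week 1"><post author="ann"/><post author="bob"/></thread>
    <thread title="Week 2"><post author="ann"/></thread>
  </forum>
DOC

# Plain Ruby object holding only the information of interest.
DiscussionThread = Struct.new(:title, :authors)

# Parse the XML into a DOM tree, then walk it with XPath queries.
def load_threads(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//thread').map do |node|
    # Relative XPath: direct <post> children of this <thread>.
    authors = REXML::XPath.match(node, 'post').map { |post| post.attributes['author'] }
    DiscussionThread.new(node.attributes['title'], authors)
  end
end

threads = load_threads(SAMPLE_COURSE_XML)
threads.each do |t|
  puts "#{t.title}: #{t.authors.size} posts by #{t.authors.uniq.size} authors"
end
```

Everything here is ordinary Ruby code running inside REXML and the script itself, which is why a faster interpreter can speed it up without any change to the parsing technique.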
--
- Stephen Bannasch
Concord Consortium, http://www.concord.org
Quoting daniels@pronto.com.au, on Fri, May 26, 2006 at 08:59:38AM +0900:
> I have a small (200-line) program that processes XML files and
> creates Ruby object trees. It uses REXML, the native Ruby XML library,
> a great deal.
>
> On my 2GHz Intel Mac running Mac OS X, the latest YARV sped up my
> code by roughly a factor of 2.
>
> ruby 1.8.4 (2005-12-24) [i686-darwin8.6.1]
> 53 seconds
>
> ruby 2.0.0 (Base: Ruby 1.9.0 2006-04-08) [i686-darwin8.6.1]
> YARVCore 0.4.0 Rev: 502 (2006-05-18) [opts: ]
> 28 seconds
> I've never looked at YARV's internals, but I'm guessing that it creates
> Ruby objects directly as it processes the stream.
I think that you are mistaking YARV for an XML parser. It's not. It's a
virtual machine for Ruby code. Do a quick google.
The benchmarks are on the same code, same REXML, same XML processing
technique, same number of objects created.
Cheers,
Sam
Quoting daniels@pronto.com.au, on Fri, May 26, 2006 at 08:59:38AM +0900:
> I think that you are mistaking YARV for an XML parser. It's not. It's a
> virtual machine for Ruby code. Do a quick google.
I know it's a Ruby VM. I checked out the latest YARV from the Subversion repository, compiled it, and then compared the speed of ruby 1.8.4 and the newer YARV-based ruby by running my program with both. The XML parser is REXML, a Ruby library that can be used by either ruby or YARV.
I think you have misinterpreted my original post. I was comparing the time of execution like this:
ruby test.rb
and then ...
/usr/local/yarv/bin/ruby-yarv test.rb
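One way to get comparable numbers from both interpreters is to have the script time itself and print which interpreter ran it. This is only a sketch: `elapsed_seconds` and `workload` are hypothetical names, the workload is a stand-in for the real XML processing, and `Process.clock_gettime` assumes a reasonably modern Ruby (it is not available in 1.8.4 or the 2006 YARV builds).

```ruby
# Returns the wall-clock seconds taken by the given block.
def elapsed_seconds
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Stand-in workload (hypothetical); the real script would parse the
# XML archives and build its object trees here.
def workload
  (1..200_000).map { |i| i.to_s }.uniq.size
end

seconds = elapsed_seconds { workload }

# RUBY_DESCRIPTION identifies the interpreter, so runs under different
# builds can be told apart in the output.
puts format('%s: %.2f seconds', RUBY_DESCRIPTION, seconds)
```

Running the same file under each interpreter then gives one labelled timing line per run.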
--
- Stephen Bannasch
Concord Consortium, http://www.concord.org
He knows that you know. He doesn't think Daniel Sheppard knows. That's who he was replying to.
On May 26, 2006, at 3:23 AM, Stephen Bannasch wrote:
> Quoting daniels@pronto.com.au, on Fri, May 26, 2006 at 08:59:38AM +0900:
> > I think that you are mistaking YARV for an XML parser. It's not. It's a
> > virtual machine for Ruby code. Do a quick google.
> I know it's a Ruby VM. I checked out the latest YARV from the Subversion repository, compiled it, and then compared the speed of ruby 1.8.4 and the newer YARV-based ruby by running my program with both. The XML parser is REXML, a Ruby library that can be used by either ruby or YARV.
> I think you have misinterpreted my original post. I was comparing the time of execution like this:
> ruby test.rb
> and then ...
> /usr/local/yarv/bin/ruby-yarv test.rb
> --
> - Stephen Bannasch
> Concord Consortium, http://www.concord.org