XmlConfigFile usage

Yep, I'm listening :-) I've re-read the SOAP spec, which I actually
understand a little better now after playing with YAML. I've also spent some
time today getting soap4r working and have been testing with a small
client-server database program.

In this program, the client sends a request (a text key string), and the
server reads a database row, builds a little tree of objects representing
the values it contains, and returns it.

Unfortunately, performance is a major problem. Using soap4r, an exchange of
just 10 RPC calls takes 12.5 seconds on a Sun E250. It makes little
difference whether the client and server are on different machines or not;
one end is waiting for the other to marshal, and then vice versa, so CPU
load is around 50% at both ends when split this way.

As far as I can tell, the majority of this time is spent in marshalling and
unmarshalling the objects. I did a few separate tests:

- just taking the final tree and marshalling/unmarshalling it ten times
   takes 9.3 seconds. The XML generated is 7,137 bytes.

[This is with REXML-stable; am I likely to see much improvement with a
different XML parser?]

- using yaml4r the same ten operations take 10.6 seconds, although the
   generated data is smaller at 3,396 bytes, and much easier to read

- using Ruby's built-in Marshal.dump/load is staggeringly fast, taking
   just 0.04 seconds for the same test, and generating 1,651 bytes to
   represent the same object tree.
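
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch. It assumes a stand-in tree of nested hashes and arrays (`build_tree` is hypothetical, not the object tree from this post) and uses the YAML library bundled with current Ruby rather than yaml4r. It times ten dump/load round trips with each serializer and reports the serialized sizes; absolute numbers will be nothing like the 2003 figures.

```ruby
require 'benchmark'
require 'yaml'

# Hypothetical stand-in for the database row tree: nested hashes and arrays
# holding only strings and integers (not the original poster's objects).
def build_tree(depth)
  return { 'value' => "leaf-#{depth}", 'size' => depth } if depth.zero?

  { 'name' => "node-#{depth}",
    'children' => [build_tree(depth - 1), build_tree(depth - 1)] }
end

tree = build_tree(5)

# Serialize once to compare output sizes.
marshal_bytes = Marshal.dump(tree)
yaml_text     = YAML.dump(tree)

# Time ten complete dump/load round trips for each serializer.
marshal_secs = Benchmark.realtime { 10.times { Marshal.load(Marshal.dump(tree)) } }
yaml_secs    = Benchmark.realtime { 10.times { YAML.load(YAML.dump(tree)) } }

puts "Marshal: #{marshal_bytes.bytesize} bytes, #{marshal_secs} s for 10 round trips"
puts "YAML:    #{yaml_text.bytesize} bytes, #{yaml_secs} s for 10 round trips"
```

Only plain data types are used so that `YAML.load` accepts the result on Ruby versions where it restricts which classes may be loaded.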

In fact, using dRuby instead of SOAP, it takes only 0.3 seconds to do ten
complete RPC exchanges, including the actual server work of a DBI query and
building the return object tree each time!

I can hardly ignore a performance factor of 250 times... unfortunately if I
go with drb then I am tied to using Ruby for both front-end and back-end
systems. Hence I'll lose the flexibility to get (say) a PHP programmer to
write something which talks to it.

Perhaps the solution is to forget SOAP, and to recode Ruby's native Marshal
module for Perl, PHP, Java etc :-))

Regards,

Brian.

···

On Sat, Mar 01, 2003 at 10:54:26AM +0900, NAKAMURA, Hiroshi wrote:

Hi,

Here's SOAP! Is there somebody listening?!

Hey Brian.

I’ve been busting my hump on a YAML parser written in native C. The parser is
called Syck and is located in the yaml4r CVS repository. I’ve also been
documenting my progress on my weblog.

I’m already a fair ways along. The tokenizer just needs to be completed. And
I already have basic extensions for Ruby, Python and PHP. Syck is nearly as
fast as Marshal. I use Marshal’s technique for importing symbols directly
into Ruby’s symbol table. So the only speed difference is that my tokens are
slightly more complicated than Marshal’s.

Anyways, I’m very excited about this forthcoming YAML parser and don’t want
anyone to be too discouraged by the current parse times of YAML.rb. Sometime
next week I’ll post a comprehensive benchmark to the list.

_why

···

On Thursday 06 March 2003 05:23 pm, Brian Candler wrote:

  • using yaml4r the same ten operations take 10.6 seconds, although the
    generated data is smaller at 3,396 bytes, and much easier to read

  • using Ruby’s built-in Marshal.dump/load is staggeringly fast, taking
    just 0.04 seconds for the same test, and generating 1,651 bytes to
    represent the same object tree.

I can hardly ignore a performance factor of 250 times… unfortunately if I
go with drb then I am tied to using Ruby for both front-end and back-end
systems. Hence I’ll lose the flexibility to get (say) a PHP programmer to
write something which talks to it.

Perhaps the solution is to forget SOAP, and to recode Ruby’s native Marshal
module for Perl, PHP, Java etc :-))

Hi, Brian,

From: “Brian Candler” B.Candler@pobox.com
Sent: Friday, March 07, 2003 9:22 AM

> In fact, using dRuby instead of SOAP, it takes only 0.3 seconds to do ten
> complete RPC exchanges, including the actual server work of a DBI query and
> building the return object tree each time!
>
> I can hardly ignore a performance factor of 250 times...

250 times! Definitely RPC via SOAP is slower than DRb,
but 250 times is too much. I think there should be another
bottleneck: TCP socket creation, or loading the WSDL every time, for example.

I tried to benchmark it myself. The test suite is attached at the bottom
of this mail. It calls a method 500 times.

$ ruby18 tst.rb

                  user     system      total        real
direct_stub   0.200000   0.000000   0.200000 (  0.202000)
drb_stub      0.721000   0.771000   1.492000 (  4.855000)
soap_stub    18.116000   4.627000  22.743000 ( 61.047000)

In this test suite, soap_stub is 13 times as slow as drb_stub, and
300 times as slow as a direct method call (comparing real time).
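
The ratios can be checked directly from the "real" column. The snippet below is a small sketch that copies the three wall-clock figures from the output above and derives the slowdown factors and the per-call cost; the variable names are just for illustration.

```ruby
# "real" (wall-clock) seconds for 500 calls, copied from the benchmark output.
real = { 'direct_stub' => 0.202, 'drb_stub' => 4.855, 'soap_stub' => 61.047 }
calls = 500

# Slowdown of the SOAP stub relative to the other two.
soap_vs_drb    = real['soap_stub'] / real['drb_stub']
soap_vs_direct = real['soap_stub'] / real['direct_stub']

# Average cost of one call in milliseconds.
per_call_ms = real.transform_values { |secs| secs / calls * 1000 }

puts format('soap vs drb:    %.1f times slower', soap_vs_drb)
puts format('soap vs direct: %.1f times slower', soap_vs_direct)
per_call_ms.each { |name, ms| puts format('%-12s %8.3f ms per call', name, ms) }
```

This gives roughly 12.6 and 302, which round to the "13 times" and "300 times" quoted here, and works out to about 122 ms per SOAP call against under 10 ms per DRb call.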

Hmm. DRb may run faster I think… seki-san?

ruby: ruby 1.8.0 (2003-03-05) [i386-cygwin]
drb: 2.0.2
webrick: 1.3.0
soap4r: 1.4.8.1
http-access2: j

Regards,
// NaHi

== tst.rb

require 'client'
require 'soap/driver'
require 'drb/drb'

def direct_stub
  require 'servant'
  Servant.new
end

def soap_stub
  drv = SOAP::Driver.new(nil, nil, 'Bing',
                         'http://localhost:2000/soapsrv')
  drv.addMethod('traverse_node', 'node')
  drv
end

def drb_stub
  DRb.start_service
  DRbObject.new(nil, 'druby://localhost:2001')
end

Client.new(direct_stub).run( 'direct_stub ', 500)
Client.new(drb_stub).run( 'drb_stub ', 500)
Client.new(soap_stub).run( 'soap_stub ', 500)

== client.rb

require 'node'
require 'benchmark'

class Client
  def initialize(srv)
    @srv = srv
  end

  def run(title, n)
    node = setup_node
    Benchmark.benchmark do |bm|
      count = 0
      bm.report(title) do
        n.times do
          count = @srv.traverse_node(node)
        end
      end
    end
  end

  def setup_node
    n9 = Node.new
    n81 = Node.new(n9)
    n82 = Node.new(n9)
    n7 = Node.new(n81, n82)
    n61 = Node.new(n7)
    n62 = Node.new(n7)
    n5 = Node.new(n61, n62)
    n41 = Node.new(n5)
    n42 = Node.new(n5)
    n3 = Node.new(n41, n42)
    n21 = Node.new(n3)
    n22 = Node.new(n3)
    n1 = Node.new(n21, n22)
    n1
  end
end

== node.rb

class Node
  attr_reader :first, :second

  def initialize(*initNext)
    @first = initNext[0]
    @second = initNext[1]
  end
end

== servant.rb

require 'node'

class Servant
  def traverse_node(node)
    Thread.current[:count] = 0
    do_traverse_node(node)
    Thread.current[:count]
  end

  def do_traverse_node(node)
    if node
      Thread.current[:count] += 1
      do_traverse_node(node.first)
      do_traverse_node(node.second)
    end
  end
end

== drb_server.rb

require 'drb/drb'
require 'servant'

druby = 'druby://localhost:2001'
DRb.start_service(druby, Servant.new)
DRb.thread.join

== soap_server.rb

require 'soaplet'
require 'servant'
require 'webrick'

soapsrv = WEBrick::SOAPlet.new
soapsrv.addServant('Bing', Servant.new)

svr = WEBrick::HTTPServer.new(:Port => 2000, :AccessLog => {})
svr.mount('/soapsrv', soapsrv)
svr.start
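
The harness above needs soap4r, a DRb server and a WEBrick server running. As a rough, self-contained approximation, the sketch below keeps only the direct-call path and adds a `Marshal` round trip of the argument on every call, which is a stand-in for the serialization work an RPC layer must do (it is not what DRb or soap4r actually execute). `Node` and `Servant` are simplified re-implementations of the classes in the post; `build_dag` is a hypothetical helper that builds the same 13-node graph as `setup_node`.

```ruby
require 'benchmark'

# Simplified version of node.rb: a node with up to two children.
class Node
  attr_reader :first, :second

  def initialize(first = nil, second = nil)
    @first = first
    @second = second
  end
end

# Simplified version of servant.rb: counts every node visit recursively
# (shared nodes are counted once per path that reaches them).
class Servant
  def traverse_node(node)
    return 0 unless node

    1 + traverse_node(node.first) + traverse_node(node.second)
  end
end

# Hypothetical helper: builds the same shape as setup_node in the post,
# i.e. four levels where two siblings share a single child.
def build_dag
  node = Node.new
  4.times { node = Node.new(Node.new(node), Node.new(node)) }
  node
end

root    = build_dag
servant = Servant.new
n       = 500

Benchmark.bm(16) do |bm|
  # Plain in-process call.
  bm.report('direct') { n.times { servant.traverse_node(root) } }
  # Same call, but the argument is dumped and reloaded first.
  bm.report('marshal + direct') do
    n.times { servant.traverse_node(Marshal.load(Marshal.dump(root))) }
  end
end
```

Each traversal visits 61 nodes even though the graph holds only 13 distinct objects, because the shared children are reached along several paths. `Marshal` preserves that sharing, so the reloaded copy gives the same count.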

Anyways, I’m very excited about this forthcoming YAML parser and don’t want
anyone to be too discouraged by the current parse times of YAML.rb. Sometime
next week I’ll post a comprehensive benchmark to the list.

That sounds very interesting. I’ve not tried using ‘okay’ RPC yet but I
imagine it’s no harder to set up than SOAP.

Parsing is not the only issue with yaml4r though; the marshalling is also
slow, accounting for about 1/3rd of the total time. With my test object:

  • 10.times to_yaml: 3.6 secs
  • 10.times YAML::load: 6.7 secs

But I imagine that marshalling shouldn’t be too hard to write natively.

(I did wonder if there was any mileage in calling Marshal.dump and then
converting its output to YAML…)

Regards,

Brian.

···

On Fri, Mar 07, 2003 at 10:09:40AM +0900, why the lucky stiff wrote:

_why


I've done some more tests too, and SOAP still seems to be the major
bottleneck. Maybe it's to do with the way I'm using it?

What I've done is taken my app and cut it down to the bare minimum. It's
attached to this mail as a tgz because I've used a long constant string to
import the result object (lazy!). The server returns this object in response
to an RPC request, and the client makes 100 such requests.

Running both server and client on the same FreeBSD / PIII-400 laptop I get:

    DRb: 0.31 seconds
    soap4r: 48.6 seconds => 160 times slower

Modifying the client to point to a remote server (a Sun, probably an E250;
it's in a different building and I don't have the spec to hand, but it's not
in use by anyone else), with the client still on the same FreeBSD box, I get:

    DRb: 0.85 - 0.92 seconds
    soap4r: 54.8 - 56.3 seconds => 60 times slower

So introducing a TCP socket dilutes the SOAP overhead somewhat, but it is
still pretty huge. Half a second to marshal and unmarshal a relatively small
cluster of objects is something I'm not going to be able to live with.

I've included a separate program which compares just the marshal+unmarshal
times:

$ ruby justmarshal.rb
Ruby: 0.130195
SOAP: 28.341594
Ruby is 217.6857329 times faster
$ ruby justmarshal.rb
Ruby: 0.129693
SOAP: 28.590072
Ruby is 220.4442183 times faster
$ ruby justmarshal.rb
Ruby: 0.129829
SOAP: 28.891096
Ruby is 222.5319151 times faster

So I stand by my original "250 times faster", allowing for some experimental
error :-)

Which XML parser are you using? I have only REXML installed.

Regards,

Brian.

soapperf.tgz (1.36 KB)

···

On Fri, Mar 07, 2003 at 10:05:22PM +0900, NAKAMURA, Hiroshi wrote:

> > In fact, using dRuby instead of SOAP, it takes only 0.3 seconds to do ten
> > complete RPC exchanges, including the actual server work of a DBI query and
> > building the return object tree each time!
> >
> > I can hardly ignore a performance factor of 250 times...
>
> 250 times! Definitely rpc via soap is slower than drb,
> but 250 times is too much. I think there should be another
> bottleneck, TCP socket creation, loading WSDL every time for example.

Actually, I posted timings done under FreeBSD. On the Sun, where I was
originally testing:

$ ruby justmarshal.rb
Ruby: 0.169696
SOAP: 48.753113
Ruby is 287.2967719 times faster
$ ruby justmarshal.rb
Ruby: 0.168854
SOAP: 47.111901
Ruby is 279.0096829 times faster
$ ruby justmarshal.rb
Ruby: 0.167228
SOAP: 47.74334
Ruby is 285.4984811 times faster

Regards,

Brian.

···

On Fri, Mar 07, 2003 at 01:58:34PM +0000, Brian Candler wrote:

So I stand by my original "250 times faster", allowing for some experimental
error :-)

Hi,

From: “Brian Candler” B.Candler@pobox.com
Sent: Friday, March 07, 2003 10:58 PM

> > > In fact, using dRuby instead of SOAP, it takes only 0.3 seconds to do ten
> > > complete RPC exchanges, including the actual server work of a DBI query and
> > > building the return object tree each time!
> > >
> > > I can hardly ignore a performance factor of 250 times...
> >
> > 250 times! Definitely rpc via soap is slower than drb,
> > but 250 times is too much. I think there should be another
> > bottleneck, TCP socket creation, loading WSDL every time for example.
>
> I've done some more tests too, and SOAP still seems to be the major
> bottleneck. Maybe it's to do with the way I'm using it?

I misunderstood you. You said
  Ruby's marshal : SOAP's marshal = 1 : 250
but I thought you meant
  dRuby's call : SOAP's call = 1 : 250
My last test suite measures the latter.

Ruby's marshal is 300 times faster than SOAP's? It should be.
The larger the object, the slower SOAP's marshalling may get.

> So I stand by my original "250 times faster", allowing for some experimental
> error :-)

You are correct.

And more:
Marshal: binary string/stream ↔ Ruby’s object
SOAP4R: XML string/stream ↔ SOAP Data Model ↔ Ruby’s object

So SOAP4R will never be as fast as Marshal, even if I rewrite it in C.
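
To make the extra hop concrete, here is a toy sketch of a two-stage encoder next to a one-step `Marshal.dump`. Everything in it is hypothetical and for illustration only: `ModelNode`, `to_model`, `model_to_xml` and the `type` attribute are invented names, and the output is neither real SOAP encoding nor soap4r's actual classes.

```ruby
# Hypothetical intermediate node standing in for a "data model" layer.
ModelNode = Struct.new(:name, :type, :value, :children)

# Minimal XML text escaping for the sketch.
def escape_xml(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Stage 1: Ruby object -> intermediate model.
def to_model(name, obj)
  case obj
  when Integer then ModelNode.new(name, 'int', obj.to_s, [])
  when String  then ModelNode.new(name, 'string', obj, [])
  when Hash
    ModelNode.new(name, 'struct', nil, obj.map { |k, v| to_model(k.to_s, v) })
  when Array
    ModelNode.new(name, 'array', nil, obj.map { |v| to_model('item', v) })
  else
    raise ArgumentError, "unsupported type: #{obj.class}"
  end
end

# Stage 2: intermediate model -> XML text.
def model_to_xml(node)
  body = if node.children.empty?
           escape_xml(node.value.to_s)
         else
           node.children.map { |child| model_to_xml(child) }.join
         end
  "<#{node.name} type=\"#{node.type}\">#{body}</#{node.name}>"
end

record = { 'id' => 42, 'tags' => ['a', 'b<c'] }

xml    = model_to_xml(to_model('row', record))  # two conversions
binary = Marshal.dump(record)                   # one conversion

puts xml
puts "Marshal: #{binary.bytesize} bytes, XML: #{xml.bytesize} bytes"
```

The intermediate tree costs an extra allocation per value, which is the price of having a representation that other mappings can target independently of the Ruby object graph.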

> What I've done is taken my app and cut it down to the bare minimum. It's
> attached to this mail as a tgz because I've used a long constant string to
> import the result object (lazy!). The server returns this object in response
> to an RPC request, and the client makes 100 such requests.

Thanks. I wrote a server with webrick, which supports HTTP/1.1
persistent connections, to reduce TCP socket creation.

$ ruby18 drbclient.rb
0.110000 0.140000 0.250000 ( 0.752000)
$ ruby18 soapclient-xmlscan.rb
14.130000 0.822000 14.952000 ( 25.227000)
$ ruby18 soapclient-rexml.rb (with soapserver-rexml.rb)
12.018000 0.821000 12.839000 ( 23.880000)

30 times slower. Hmm. I should get a profile.

Regards,
// NaHi

···

On Fri, Mar 07, 2003 at 10:05:22PM +0900, NAKAMURA, Hiroshi wrote:

> And more:
> Marshal: binary string/stream ↔ Ruby's object
> SOAP4R: XML string/stream ↔ SOAP Data Model ↔ Ruby's object
>
> So SOAP4R will never be as fast as Marshal, even if I rewrite it in C.

Sure, although I guess you could go straight from Ruby object to XML
string/stream for some common cases like String and Fixnum.

> I wrote a server with webrick, which supports HTTP/1.1
> persistent connection to reduce TCP sockets.

Ah, that’s a nice optimisation, and I hadn’t realised that DRb had this
unfair advantage.

> $ ruby18 drbclient.rb
> 0.110000 0.140000 0.250000 ( 0.752000)
> $ ruby18 soapclient-xmlscan.rb
> 14.130000 0.822000 14.952000 ( 25.227000)
> $ ruby18 soapclient-rexml.rb (with soapserver-rexml.rb)
> 12.018000 0.821000 12.839000 ( 23.880000)
>
> 30 times slower. Hmm. I should get a profile.

Have you done any comparison with xmlparser (the expat-derived one)? I’ve
not installed it yet.

Regards,

Brian.

···

On Sat, Mar 08, 2003 at 12:20:13AM +0900, NAKAMURA, Hiroshi wrote:

Hi,

From: Brian Candler
Sent: Saturday, March 08, 2003 1:15 AM

> > And more:
> > Marshal: binary string/stream ↔ Ruby's object
> > SOAP4R: XML string/stream ↔ SOAP Data Model ↔ Ruby's object
> >
> > So SOAP4R will never be as fast as Marshal, even if I rewrite it in C.
>
> Sure, although I guess you could go straight from Ruby object to XML
> string/stream for some common cases like String and Fixnum.

I might be able to do it. But this SOAP Data Model layer is
very important for interoperability between SOAP implementations
and between SOAP and other data models. WSDL4R maps WXS (W3C XML Schema)
to the SOAP Data Model. There is also an RDF-SOAP mapping. NArray's n-dim
array is mapped to SOAP's n-dim array in the SOAP Data Model as well.

I'm going to write a YAML/SOAP bridge once the YAML spec is finalized.
A SOAP Data Model → YAML generator and a YAML → SOAP Data Model parser
are needed. Using yamlrb, it won't be too hard to do.

> Have you done any comparison with xmlparser (the expat-derived one)? I've
> not installed it yet.

$ ruby18 soapclient-xmlparser.rb
10.846000 0.971000 11.817000 ( 22.007000)
$ ruby18 soapclient-rexmlparser.rb
11.527000 1.011000 12.538000 ( 22.863000)
$ ruby18 soapclient-xmlscanner.rb
13.329000 0.841000 14.170000 ( 24.347000)

xmlparser/0.6.5
rexml-stable/2.4.7
xmlscanner/0.2.2

Under ruby/1.6, xmlscanner is faster in some situations (depending on the
marshalled data). Xmlscanner does not seem to get along with 1.8.

Regards,
// NaHi