I've got a local Rack app which is a simple disk/file-based wiki with
no caching whatsoever. I did a simple test:
puma with 8 worker processes (puma -w 8):
% wrk -c 32 -t 32 -d 10 http://localhost:9292/wiki/index
Running 10s test @ http://localhost:9292/wiki/index
32 threads and 32 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 40.90ms 25.46ms 228.79ms 68.09%
Req/Sec 25.60 17.39 131.00 96.08%
8146 requests in 10.10s, 27.17MB read
Requests/sec: 806.54
Transfer/sec: 2.69MB
async-http with 8 processes:
% wrk -c 32 -t 32 -d 10 http://localhost:9292/wiki/index
Running 10s test @ http://localhost:9292/wiki/index
32 threads and 32 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 9.96ms 6.08ms 119.85ms 98.30%
Req/Sec 99.50 28.28 130.00 90.18%
8626 requests in 10.10s, 28.89MB read
Socket errors: connect 0, read 0, write 0, timeout 15
Requests/sec: 854.35
Transfer/sec: 2.86MB
It's an interesting comparison for a number of reasons:
- async-http, despite being pure Ruby, performs pretty similarly to
puma, which uses C extensions for parsing requests.
- puma has an average latency of ~41ms which increases as contention
goes up, while async-http latency stays around 10ms with a much
tighter standard deviation (~6ms vs ~25ms).
- Both HTTP servers max out all 8 virtual cores.
I'm not proclaiming "puma is slow" or "async-http is fast" - they
have hugely different performance profiles depending on workload.
async-http is not optimised at all - it's pure Ruby.
Puma will handle traditional blocking IO much better than async-http,
as it runs each request in a separate thread. On the other hand, since
async-http runs each request in a Fiber and supports cooperative IO
scheduling, all it would take is one slow upstream request, e.g.
RestClient.get "otherserver.com/resource", to completely saturate
puma's thread pool, while async-http in theory would continue to
service requests.
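To make the thread-pool saturation point concrete, here's a minimal
simulation (not either server's actual code - the pool size, request
count, and sleep-based "upstream call" are all assumptions for
illustration): a fixed pool of 8 threads serving 32 queued requests,
where each request blocks on a simulated 100ms upstream call. Total
time is bounded by (requests / pool size) x upstream latency, no
matter how cheap the rest of the request is.

```ruby
POOL_SIZE      = 8    # hypothetical pool, like `puma -w 8`
REQUESTS       = 32   # like wrk's 32 connections
UPSTREAM_DELAY = 0.1  # simulated blocking upstream call (e.g. RestClient.get)

queue = Queue.new
REQUESTS.times { |i| queue << i }

start = Time.now
workers = POOL_SIZE.times.map do
  Thread.new do
    loop do
      begin
        queue.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break
      end
      sleep UPSTREAM_DELAY # the blocking call pins a whole thread
    end
  end
end
workers.each(&:join)
elapsed = Time.now - start

# 32 requests / 8 threads = 4 sequential batches of 0.1s each,
# so elapsed is ~0.4s even though each request does almost no CPU work.
puts format("elapsed: %.2fs", elapsed)
```

With a Fiber-based server, the same blocking point would instead yield
the Fiber back to the scheduler, so other requests could proceed on
the same thread while the upstream call is in flight.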