So, I've got a chunk of code which needs optimizing. Yes, it really needs
it, I'm not optimizing prematurely. It (mostly) works, it's just way too
slow. And I knew it would be too slow when I wrote it, I just wanted to come
up with a simple implementation first, and get it under test.
That said, I'm wondering if there are any suggestions out there for
tutorials on using ruby-prof. Not tutorials like "here are what the numbers
mean", but "here are some suggestions for looking at the numbers and
deciding where you'll get the most gain in optimizing."
That's the usual approach when tuning: take the slowest part, make it
faster. Repeat until performance is sufficient - or you run out of
money.
A general candidate for optimization is object allocation. It is
slow compared to other operations because of the GC work
involved. Also, if you do lookups in a huge Array you might want to
switch to indexed access via a Hash.
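For example (contrived - the names are made up just to illustrate):

  users = load_users          # made-up method returning an Array of records

  # Slow: scans the whole Array on every lookup
  user = users.find { |u| u.id == wanted_id }

  # Faster: build a Hash index once, then do keyed lookups
  users_by_id = users.each_with_object({}) { |u, h| h[u.id] = u }
  user = users_by_id[wanted_id]

Building the Hash costs one extra pass, but every lookup after that is
cheap.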
Kind regards
robert
···
On Fri, Oct 15, 2010 at 3:20 PM, Andrew Wagner <wagner.andrew@gmail.com> wrote:
So, I've got a chunk of code which needs optimizing. Yes, it really needs
it, I'm not optimizing prematurely. It (mostly) works, it's just way too
slow. And I knew it would be too slow when I wrote it, I just wanted to come
up with a simple implementation first, and get it under test.
That said, I'm wondering if there are any suggestions out there for
tutorials on using ruby-prof. Not tutorials like "here are what the numbers
mean", but "here are some suggestions for looking at the numbers and
deciding where you'll get the most gain in optimizing."
The biggie for me is to look at the values in the % column and see what the curve is like:
(contrived)
% time call
36% w
10% x
5% y
1% z
vs:
% time call
5% w
4% x
3% y
1% z
In the first example, you have something more akin to a power law curve. That says that a majority of the time is being spent in a minority of the calls and is a good candidate for optimizing. You then need to look and see if it is the number of calls that is causing it (which would mean it is a good candidate for memoization or some other form of caching) or if it is from accessing external resources that block or... something. It is also worth using something like zenprofile's spy_on method to figure out where the calls are coming from and determine the breakdown. If the time is evenly distributed then it might not actually be worthwhile to memoize. It really depends on the design of the code and how it is used.
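(A contrived sketch of what I mean by memoization - expensive_lookup is
just a stand-in for whatever hot call the profile points at:)

  class Catalog
    def price_for(sku)
      @price_cache ||= {}
      # hit the slow call only once per sku; later calls reuse the result
      @price_cache[sku] ||= expensive_lookup(sku)
    end
  end

The usual caveat applies: ||= will re-run the call if the cached value
can legitimately be nil or false.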
In the second example, while the times of the profile might be high, the % of time spent is not... graphing that is pretty flat... it is relatively linear with a fairly flat slope.
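For reference, the kind of flat listing that % column comes from can be
produced with something like this (a minimal sketch - check the
ruby-prof docs for the exact options):

  require 'rubygems'
  require 'ruby-prof'

  result = RubyProf.profile do
    run_the_slow_code           # stand-in for whatever you are measuring
  end

  # print the flat profile whose % column is what I'm eyeballing above
  printer = RubyProf::FlatPrinter.new(result)
  printer.print(STDOUT, :min_percent => 1)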
···
On Oct 15, 2010, at 06:20 , Andrew Wagner wrote:
That said, I'm wondering if there are any suggestions out there for
tutorials on using ruby-prof. Not tutorials like "here are what the numbers
mean", but "here are some suggestions for looking at the numbers and
deciding where you'll get the most gain in optimizing."
So, I've got a chunk of code which needs optimizing. […]
That said, I'm wondering if there are any suggestions
out there for tutorials on using ruby-prof.
My advice would be to check out perftools.rb¹ and the graphs it generates;
it has the additional benefits of very low profiling overhead and of
showing how much time garbage collection takes. You might also
want to take a look at my Profiling Ruby slides² for example graphs.
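Basic usage is roughly this (from memory - the perftools.rb README has
the authoritative version):

  require 'perftools'

  PerfTools::CpuProfiler.start("/tmp/my_profile") do
    run_the_slow_code   # stand-in for the code under test
  end

and then, from the shell, something like "pprof.rb --text /tmp/my_profile"
(or --gif for a call-graph image) to render the report.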
--
‘Could we have Math::INFINITY which would make code using it cleaner?’
‘Ruby assumes the environment has native threading and
US-ASCII compatible locale. Is there an environment which
satisfies the conditions but does not have infinity?’
‘NetBSD on VAX, I guess.’
‘VAX! OK. I’m against introducing Math::INFINITY.’
[Marc-Andre Lafortune, Yugui, Tanaka Akira, ruby-core]
Also check if you have: slow DB queries, calls to external services,
or system calls that might be slow - and loops around any of the
issues above.
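For example (contrived names), pulling a per-item query out of a loop is
often the biggest single win:

  # Slow: one DB/external round trip per id
  orders = ids.map { |id| find_order(id) }   # find_order is made up

  # Faster: one round trip for the whole batch
  orders = find_orders(ids)                  # made-up batch variant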
Jesus.
···
On Fri, Oct 15, 2010 at 4:54 PM, Robert Klemme <shortcutter@googlemail.com> wrote:
On Fri, Oct 15, 2010 at 3:20 PM, Andrew Wagner <wagner.andrew@gmail.com> wrote:
So, I've got a chunk of code which needs optimizing. Yes, it really needs
it, I'm not optimizing prematurely. It (mostly) works, it's just way too
slow. And I knew it would be too slow when I wrote it, I just wanted to come
up with a simple implementation first, and get it under test.
That said, I'm wondering if there are any suggestions out there for
tutorials on using ruby-prof. Not tutorials like "here are what the numbers
mean", but "here are some suggestions for looking at the numbers and
deciding where you'll get the most gain in optimizing."
That's the usual approach when tuning: take the slowest part, make it
faster. Repeat until performance is sufficient - or you run out of
money.
A general candidate for optimization is object allocation. It is
slow compared to other operations because of the GC work
involved. Also, if you do lookups in a huge Array you might want to
switch to indexed access via a Hash.
That said, I'm wondering if there are any suggestions out there
for tutorials on using ruby-prof. Not tutorials like "here are what
the numbers mean", but "here are some suggestions for looking at
the numbers and deciding where you'll get the most gain in
optimizing."
The biggie for me is to look at the values in the % column and see
what the curve is like:
(contrived)
% time call
36% w
10% x
5% y
1% z
vs:
% time call
5% w
4% x
3% y
1% z
In the first example, you have something more akin to a power law
curve. That says that a majority of the time is being spent in a
minority of the calls and is a good candidate for optimizing. You
then need to look and see if it is the number of calls that is
causing it (which would mean it is a good candidate for memoization
or some other form of caching) or if it is from accessing external
resources that block or... something.
Very good advice IMHO!
In the second example, while the times of the profile might be high,
the % of time spent is not... graphing that is pretty flat... it is
relatively linear with a fairly flat slope.
Something seems to be missing here. What would be your recommendation in this case? I'd say tuning is more difficult here, since you might have to consider architectural changes that affect multiple components. In any case, chances are that tuning is more costly than in the first case.