People,
The "how can we make a ruby compiler" thread has been very interesting - I really like hearing from serious and competent programmers about the theoretical problems involved with this issue.
William Rutiser asked me to expand on the details of my C/C++ population genetics simulation program, as a specific example of how one might proceed in a particular situation. I am happy to elaborate, not least because it would be good to get input from experienced Ruby programmers before I try to replicate the same program in Ruby. I'm sure there are more sensible and efficient ways of doing things than what I would attempt first, and comparing my C/C++ version against a dodgy Ruby program would be even more unfair.
I will summarise the C/C++ program as it exists now (it has gone through a number of versions and has a lot of code that does not need replicating for the present comparison) with a general overview (I can add more detail later if people are interested):
- a population is represented by a number of sub-populations, each of which occupies one cell of a two-dimensional array
- each cell has pointers to three ordered lists, representing the parental population, the offspring population and a temporary population; each element of a list represents an individual
- at each generation (of potentially hundreds), the whole array is iterated through: new offspring are produced, migrants move to adjacent cells, parents die off, etc.
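For anyone who wants something concrete to comment on, the structure above might be sketched in Ruby roughly as follows. This is only a minimal sketch under stated assumptions, not a translation of the C/C++ code: the names (Individual, Cell, Grid, step), the fixed number of offspring per parent, the migration rate, and the wrap-around at the grid edges are all made up for illustration, and plain Arrays stand in for the ordered lists.

```ruby
# Illustrative sketch only: every name and rule here is hypothetical.
Individual = Struct.new(:id)

# One grid cell holding the three lists described above.
class Cell
  attr_reader :parents, :offspring, :temp

  def initialize
    @parents = []
    @offspring = []
    @temp = []
  end
end

class Grid
  def initialize(rows, cols, founders_per_cell)
    @rows = rows
    @cols = cols
    @next_id = 0
    @cells = Array.new(rows) { Array.new(cols) { Cell.new } }
    each_cell { |cell, _r, _c| founders_per_cell.times { cell.parents << new_individual } }
  end

  # Visit every cell, yielding the cell and its row/column indices.
  def each_cell
    @cells.each_with_index do |row, r|
      row.each_with_index { |cell, c| yield cell, r, c }
    end
  end

  def population_size
    @cells.flatten.sum { |cell| cell.parents.size }
  end

  # One generation: reproduce, migrate, then let offspring replace parents.
  def step(offspring_per_parent: 1, migration_rate: 0.1, rng: Random.new)
    each_cell do |cell, _r, _c|
      cell.parents.each do |_parent|
        offspring_per_parent.times { cell.offspring << new_individual }
      end
    end
    # Migrants land in the neighbour's temp list so they are not moved twice.
    each_cell do |cell, r, c|
      stay, leave = cell.offspring.partition { rng.rand >= migration_rate }
      cell.offspring.replace(stay)
      leave.each { |ind| neighbour(r, c, rng).temp << ind }
    end
    # Parents die off; residents plus immigrants become the next parents.
    each_cell do |cell, _r, _c|
      cell.parents.replace(cell.offspring + cell.temp)
      cell.offspring.clear
      cell.temp.clear
    end
  end

  private

  def new_individual
    @next_id += 1
    Individual.new(@next_id)
  end

  # Assumption: edges wrap around (a torus), purely to keep the sketch short.
  def neighbour(r, c, rng)
    dr, dc = [[-1, 0], [1, 0], [0, -1], [0, 1]].sample(random: rng)
    @cells[(r + dr) % @rows][(c + dc) % @cols]
  end
end
```

A run of a few hundred generations would then be something like `grid = Grid.new(10, 10, 50); 300.times { grid.step }`. Even in this toy form, all the work sits in many small per-individual operations rather than in one heavy routine.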
Around this main simulation program, I have already replaced all the original shell scripts and some of the statistical processing with Ruby scripts, but the main simulation program itself is where I couldn't afford an increase of an order of magnitude or two in running times from rewriting in Ruby.
The main problem I see with some of the Ruby conversions that I have looked at (e.g. RubyInline) is that the performance cost comes from repeating the WHOLE simulation with different starting parameters, which is done thousands of times. So it is not as if there is a single recursive algorithm acting as a bottleneck that can be optimised or rewritten in C. There are lots of little steps that happen millions of times.
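To give a feel for the scale, here is a back-of-envelope count of how often the innermost per-individual step runs. The figures are placeholders chosen only to match the rough magnitudes mentioned above (thousands of runs, hundreds of generations); the grid size and individuals per cell are pure assumptions, not numbers from the real program.

```ruby
# All figures are made-up placeholders, not measurements.
def inner_steps(runs:, generations:, rows:, cols:, individuals_per_cell:)
  runs * generations * rows * cols * individuals_per_cell
end

total = inner_steps(runs: 1_000, generations: 100,
                    rows: 10, cols: 10, individuals_per_cell: 50)
puts total  # prints 500000000
```

With counts like that, a slowdown in every small step multiplies straight through to the total running time, which is why moving a single routine into C does not obviously help.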
I hope that is sort of clear?
Regards,
Phil.