Thanks for all the comments so far. They are really helping.
Let me elaborate a little more. The comments contain some interesting
ideas that I want to go into in more detail.
Robert Klemme wrote:
The problem with performance is that you normally have to try and
measure to see whether it’s ok or not. And of course it heavily
depends on the algorithms and kind of application at hand. I’d say
without more information about these it’s impossible to say offhand
whether Ruby will deliver or not.
I know that. You are only going to really know the moment your service
gets used. Still, you have to find a good scalable approach to begin with
to minimize the chances of rewriting everything. I also know that
premature optimization is a bad thing. But blindly doing anything that
works can be equally problematic in the future.
We are currently in the planning stages, writing the use cases. If I had
to compare what the system will do, I would say it will be a lot like
eBay, except that it will have nothing to do with auctions. We are not
that crazy. People will be able to put their profiles in, upload pictures
and descriptions of their objects, search the database for these objects,
and communicate over the system about these objects. Those are the
basics. So in terms of what you will be able to do, it is pretty similar
to the basic things you can do on eBay. I hope that clears things up a
little more.
Robert also states that typically (in web applications?) a queue based
approach is used. That sounds very interesting. Not only can you avoid
threads, which is always good if possible, but I can also imagine that
such a system would be fairly easy to implement. The thing I am
wondering about is the response times. While you can have progress bars
in a desktop app, you really can't afford to let the user wait in a web
app. It would probably take a while before that happens on newer
hardware, but this was exactly my point. In the event you are so lucky
that people get crazy about your service and you really get more
visitors than anticipated, even a queue based solution has to scale. I
cannot imagine how to do that yet, so I have to give it more thought.
Maybe such a system can be implemented in such a way that it can be
easily scaled with new (or upgraded) hardware? I could imagine a queue
based system that forwards its items to a distributed network of
computers via DRb…
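A minimal sketch of that queue idea, assuming Ruby's standard DRb
library. The JobQueue class, the run_worker method, the :stop sentinel
and the job format are all made up for illustration; none of them come
from Robert's mail or from an existing framework. The worker runs in a
thread here only to keep the example in one file; in a real setup it
would be a separate process, possibly on another machine.

```ruby
require 'drb/drb'

# Hypothetical job queue shared over DRb. The web front end pushes jobs,
# workers pop them and report results back.
class JobQueue
  def initialize
    @jobs    = Queue.new
    @results = Queue.new
  end

  def push(job)
    @jobs << job
    nil
  end

  # Blocks until a job is available.
  def pop
    @jobs.pop
  end

  def report(result)
    @results << result
    nil
  end

  # Blocks until a result is available.
  def next_result
    @results.pop
  end
end

# A worker connects by URI, takes one job at a time and reports the result.
# :stop is a sentinel this sketch uses to shut a worker down.
def run_worker(uri)
  queue = DRbObject.new_with_uri(uri)
  loop do
    job = queue.pop
    break if job == :stop
    queue.report({ id: job[:id], result: job[:text].upcase })
  end
end

if __FILE__ == $PROGRAM_NAME
  DRb.start_service('druby://127.0.0.1:0', JobQueue.new) # port 0: any free port
  uri = DRb.uri
  worker = Thread.new { run_worker(uri) } # stands in for a separate process
  queue = DRbObject.new_with_uri(uri)
  queue.push({ id: 1, text: 'search request' })
  p queue.next_result
  queue.push(:stop)
  worker.join
  DRb.stop_service
end
```

Adding capacity would then mean starting more workers against the same
URI. Whether the response times are acceptable for a web app is exactly
the open question and would have to be measured.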
Lothar Scholz wrote:
The best way to scale ruby web apps is to use numerous external
started FCGI servers and use the session id to bind to the same user
to the same FCGI server. If you don’t have extremely high interuser
interaction this would simplify your life. It can also reduce your
database load a lot and make HTML-GUI implementation easier. The most
complicated thing is to get this configuration up and running - and
it does not work good with load balancers.
You seem to speak from experience. Could you elaborate a little more
or maybe point me in a direction where I can read more about this
technique? I will read about FCGI because others have mentioned it too.
The reason I haven't already done so is that from a PHP perspective you
wouldn't bother to use CGI, because the process duplication slows
things down considerably. mod_php is a lot more stable and complete than
mod_ruby at the moment. I must investigate this option more. One other
thing I wonder about is why you say that it is complicated to set up. It
sounds to me like this technique distributes processes, in contrast to
distributing services across different machines, with the advantage of
being able to scale the hardware on one machine.
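A minimal sketch of the session binding Lothar describes, as far as it
can be read from his mail: hash the session id and use the result to
pick one of a fixed set of backends. The BACKENDS list, its addresses
and the backend_for method are assumptions for illustration, not part
of any FCGI library.

```ruby
require 'zlib'

# Hypothetical pool of externally started FCGI servers (placeholder addresses).
BACKENDS = %w[127.0.0.1:9001 127.0.0.1:9002 127.0.0.1:9003].freeze

# Map a session id to one backend. The same id always yields the same
# backend as long as the pool itself does not change.
def backend_for(session_id, backends = BACKENDS)
  backends[Zlib.crc32(session_id) % backends.size]
end

if __FILE__ == $PROGRAM_NAME
  %w[session-a session-b session-a].each do |id|
    puts "#{id} -> #{backend_for(id)}"
  end
end
```

Note that with this simple modulo scheme, adding or removing a backend
remaps most sessions to a different server.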
The first thing would be to check if ruby is really what you want. Do
you have the libraries you need for your project, are they stable
I don't know if there are any really stable and mature libraries for
Ruby in comparison to Java libraries. At first I thought it would be a
good idea to write everything myself so I could keep the system as small
as possible. But I have come to realize that this may not be the optimal
approach. At the moment I think it would be a better idea to use one of
the available frameworks and contribute to it rather than write my own.
That way I can give something back to the open source community, which
is long overdue for me (using open source stuff all the time). At the
moment my favourite framework is Cerise (http://cerise.rubyforge.org/)
as it is the most elegant solution I have encountered so far (not only
for Ruby). I have done some preliminary tests with Cerise, and the
functionality I want to implement will be a snap to build with this
framework. Will Glozer, the project maintainer, seems to be very
experienced and is still working alone on this wonderful piece of
software. I would like to join that project (or any other) if it turns
out to be the right one for my purposes. But I haven't made my final
decision yet. I still have to really test a couple of the other
frameworks that were mentioned on this list earlier, including the soon
(?) to be released rails. The project will span from June 2004 to March
2005, so there is still some time.
David Heinemeier Hansson wrote:
Stay on a single box as long as you can. Next, move the database to a
separate machine. Then start thinking about scaling the application
server. (Or you may start thinking beforehand as you do now, just no
need to commit the dollars)
That was the idea. First the database, then the images on another
machine (maybe with a fast webserver and logging turned off).
COMMERCIAL: If you happen to be in Chicago on June 25th, we’ll be
revealing all about Basecamp and Rails in a one-day workshop. There’s
more information on http://www.37signals.com/workshop-062504.php.
Yeah, I read all about that. Two weeks ago our first child was born, so I
am a little tied up at the moment. I would have probably come to Denmark
for your presentation, but because of these new "circumstances" I really
cannot. I wonder if anyone would be able to record your session on video
and offer it for download via torrent or so, maybe you?
Kirk Haines: I really like the way you described the scaling process.
That seems like a viable possibility I must investigate. I have just
started reading about IOWA and have yet to really check it out. I only
know roughly what it is, but not what it can do for me. Very
interesting.
Dan Janowski wrote:
Distributing objects is one thing, but using drb as a
request/response and control protocol is what I am considering at the
moment. I was thinking of FCGI, but it is all or nothing in the sense
that it forces the whole hit service out of Apache. But a
considerable amount of cgi processing is not session dependent and is
more appropriate in the Apache side (I use mod_ruby). Separating
non-web application logic from cgi/web processing is one way to
efficiently distribute the load.
It seems like I will have to weigh those two options against each other.
I think once you have a running DRb system on one machine it should be a
snap to just add other machines to the system. If you first go the FCGI
route you will have to leave it sooner or later. The problem is with
development time. A distributed system can be so much harder to
implement. Remember the HURD?
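A minimal sketch of DRb used as a request/response protocol in the way
Dan outlines, assuming Ruby's standard DRb library. The ObjectCatalog
class, its describe method and its data are invented for illustration.
The application object would live in its own process, and the CGI or
mod_ruby side would only hold a remote reference to it.

```ruby
require 'drb/drb'

# Hypothetical application object living outside the web server process.
class ObjectCatalog
  def initialize
    @objects = { 1 => 'red bicycle', 2 => 'oak table' } # made-up data
  end

  # Strings and integers are marshalled by value across the DRb connection.
  def describe(id)
    @objects.fetch(id, 'unknown object')
  end
end

if __FILE__ == $PROGRAM_NAME
  # Application side: publish the object. Port 0 picks any free port.
  DRb.start_service('druby://127.0.0.1:0', ObjectCatalog.new)

  # Web side: obtain a remote reference by URI and call it like a local object.
  catalog = DRbObject.new_with_uri(DRb.uri)
  puts catalog.describe(1)

  DRb.stop_service
end
```

Moving the application object to another machine would then only change
the URI the web side connects to.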
Lothar Scholz added:
By the way if you have your own webserver (so you don’t need all the
flexible configuration features of apache) then you should never use
apache if performance is important. Apache is not a very fast
webserver. A lot of other servers give you twice the responds then
apache (or even more if apache is not configured correctly).
That sounds good. I was hoping to be able to use a built-in server like
the one Cerise offers. What would really be nice is if someone with the
right skills implemented a webserver as a barebones C module, or maybe
somehow integrated one of the available webservers like
http://www.annexia.org/freeware/rws/ or any other. Ideally it would be
something that works with Webrick and makes it faster. Apache is not
really fast, but Webrick or any other Ruby implementation is slower by
far. Has anyone heard of such a project? My C skills are really not worth
mentioning. I never got around to actually writing anything in C,
although I have already read 3 or 4 books about the language (just in
case).
Thanks again for all your replies.
–
Sascha Ebach