Fsync on stdout for mod_rewrite

Eric_Anderson1 · 11 December 2004 06:32

I have a script that I want to ensure has flushed stdout after every line
of output. I have $stdout.sync=true, but when I try to call $stdout.fsync
I get an Invalid Argument error. Not what I expect according to the docs.

My script provides lookups for mod_rewrite in Apache. Apache hands it the HTTP_HOST header on $stdin and it returns, on $stdout, the path where Apache should look for a specific website. It seems to work well, but every now and then it returns the wrong answer. The only two possibilities I can see for this problem are:
1) Apache is sending the script the next request before the first request
is answered. But I am using the RewriteLock directive, so Apache should
have the requests synchronized.
2) The buffer is not getting flushed. Then another request comes in and
both answers are output together, with the first answer being the wrong
one.

The script seems to perform perfectly when run from the command line, just
not in Apache. Sometimes it returns the right answer and sometimes it returns
the wrong answer, even when I make the same request (i.e. refresh).

See http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html if you are
curious about the mod_rewrite semantics. If posting the script would help I
can post that also.

I appreciate any pointers.

Eric

Glenn_Parker1 · 11 December 2004 13:12

Eric Anderson wrote:

I have a script that I want to ensure has flushed stdout after every line
of output. I have $stdout.sync=true but when I tried to do $stdout.fsync
I get an Invalid Argument error. Not what I expect according to the docs.

fsync is only meaningful for files in a file system. You probably want to use $stdout.flush, but if you set $stdout.sync = true then $stdout.flush is already being done implicitly.
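
A minimal sketch of the difference, using a pipe as a stand-in for the stdout that Apache reads from (the pipe and the variable names are only for illustration). With sync enabled each puts is written through at once, while fsync asks the OS to commit a file to disk and may fail on pipes and terminals:

```ruby
# A pipe stands in for the connection between the script and Apache.
reader, writer = IO.pipe

# With sync = true Ruby does no internal buffering on this stream,
# so every write is handed to the OS immediately.
writer.sync = true
writer.puts "first answer"

# The line is already readable; no explicit flush was needed.
line = reader.gets

# flush is harmless here, but redundant while sync is true.
writer.flush

# fsync targets on-disk files. On a pipe many systems reject it
# (for example with Errno::EINVAL), so the outcome is recorded, not assumed.
fsync_outcome = begin
  writer.fsync
  :ok
rescue SystemCallError, NotImplementedError => e
  e.class
end
```

Whether fsync_outcome ends up as :ok or an error class depends on the platform, which is why the sketch rescues instead of assuming a result.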

If posting the script would help I
can post that also.

Maybe describing a little bit more about the connection between Apache and your script would help, too. Is a new script process created for every URL? If so, then this is likely not a buffering issue. If not, then you might be failing to reset some lingering state in the Ruby interpreter between "calls" from Apache.

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

Eric_Anderson1 · 11 December 2004 17:07

Glenn Parker wrote:

fsync is only meaningful for files in a file system. You probably want to use $stdout.flush, but if you set $stdout.sync = true then $stdout.flush is already being done implicitly.

I first had $stdout.flush. Then I switched to sync when I found out about it. But according to the docs the operating system might not flush it. I wasn't sure if that applied to $stdout, but I wanted to issue the command just in case.

Maybe describing a little bit more about the connection between Apache and your script would help, too. Is a new script process created for every URL? If so, then this is likely not a buffering issue. If not, then you might be failing to reset some lingering state in the Ruby interpreter between "calls" from Apache.

I have looked through the script 100 times and from what I can tell everything should be OK for each request. The protocol between Apache and the script is:

Apache sends the key, followed by a newline character, to the stdin of the script. In this case I am telling Apache to send the HTTP Host header. My script then does some logic to determine what to return, which is usually a path to the website being requested. The script is started when Apache starts up and continues to run as long as Apache runs. The path returned is followed by a newline character (the spec says you can also send it the characters NULL). The docs give an example of such a script in Perl.
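
For readers following along, here is a rough Ruby sketch of the protocol just described. It is not the Perl example from the mod_rewrite docs and not the attached script; the `serve` method and the `SITES` table are hypothetical names, and StringIO objects stand in for the $stdin and $stdout that Apache would attach to:

```ruby
require 'stringio'

# Hypothetical lookup table standing in for the real database query.
SITES = { "bar.com" => "/var/www/bar.com" }.freeze

# Reads one key per line and writes exactly one answer line per key.
# The block maps a key to a path, or returns nil when nothing matches.
def serve(input, output)
  output.sync = true              # write each answer through immediately
  while (line = input.gets)       # gets returns nil at end of input
    key = line.chomp
    path = yield(key)
    output.puts(path || "NULL")   # the literal NULL signals a failed lookup
  end
end

# In-memory streams so the sketch can run without Apache.
input  = StringIO.new("bar.com\nunknown.example\n")
output = StringIO.new
serve(input, output) { |host| SITES[host] }
```

Under Apache the same method would be called as `serve($stdin, $stdout) { |host| ... }`. The important property is that every request line produces exactly one response line, so answers cannot drift out of step with requests.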

My script is as follows. Any suggestions to improve it or fix the problem are greatly appreciated. You can probably guess the structure of the database from the SQL statement. The idea is that it is supposed to return the name that is the closest to what was passed in. So if you get a request for foo.bar.com and you have a site called bar.com, it returns bar.com. If you get a request for www.baz.bar.com and you have a site called baz.bar.com, it will return baz.bar.com instead of just bar.com. There are a couple of exceptions. For example, account.<any domain>.com should return account.realsimplehosting.com. Also, anything with svn in it should just return the domain name and not the entire path. There is also an effort to retry requests after failure. After 10 failures it will just return error.realsimplehosting.com. The script is attached to this email. I attached it so the lines won't get wrapped.

webroot_lookup (2.59 KB)
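
The attachment is not reproduced here. As an illustration only, the "closest name" rule described above could be sketched as follows, assuming an in-memory list of site names in place of the SQL query; `closest_site` is a hypothetical helper, and the account, svn, and retry exceptions are left out:

```ruby
# Returns the longest known site name that the host ends with,
# trying the full host first and then dropping leading labels.
def closest_site(host, sites)
  labels = host.downcase.split(".")
  labels.each_index do |i|
    candidate = labels[i..-1].join(".")   # e.g. "www.baz.bar.com", then "baz.bar.com"
    return candidate if sites.include?(candidate)
  end
  nil                                      # no registered site matches
end

sites = ["bar.com", "baz.bar.com"]
first  = closest_site("foo.bar.com", sites)
second = closest_site("www.baz.bar.com", sites)
third  = closest_site("nothing.example", sites)
```

Because candidates are tried from longest to shortest, baz.bar.com wins over bar.com whenever both are registered.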

Glenn_Parker1 · 11 December 2004 19:43

Eric Anderson wrote:

My script is as follows.

I'm not too familiar with embedding SQL in Ruby, so I may not be much help here, but my spidey-sense tingles when I look at the way the database connection is manipulated here.

I would try two debug modes: one where the database connection is closed and re-opened on every Apache request, and another where the script is restarted for every Apache request. Eliminating lingering state in the continuous connection as a culprit might help narrow down the problem.
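
A sketch of the first debug mode, assuming nothing about the real database library: `FakeConnection` and `with_fresh_connection` are hypothetical stand-ins, and the `connect` lambda would be replaced by whatever call actually opens the connection.

```ruby
# Stand-in for a real database handle (hypothetical, for illustration).
class FakeConnection
  attr_reader :closed

  def initialize
    @closed = false
  end

  def lookup(host)
    "/sites/#{host}"
  end

  def close
    @closed = true
  end
end

# Opens a new connection for one request and always closes it afterwards,
# so no state can carry over from one request to the next.
def with_fresh_connection(connect)
  conn = connect.call
  yield conn
ensure
  conn.close if conn
end

opened  = []
connect = -> { FakeConnection.new.tap { |c| opened << c } }

answers = %w[a.example b.example].map do |host|
  with_fresh_connection(connect) { |conn| conn.lookup(host) }
end
```

If the wrong answers disappear in this mode, the long-lived connection becomes the prime suspect; if they persist, the connection is probably not the cause.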

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

Dominik_Werder2 · 11 December 2004 21:17

Clear last_request after the puts so that an old value cannot accidentally be given to Apache. Maybe overcautious, anyway.
bye!
Dominik
