CGI serving XML from Apache problem

(This seems like more an Apache question than a Ruby question, but …)

I’m trying to write a simple Ruby script to serve up an XML file (RSS).
Apache seems to be doing something weird. Here’s the raw content output from
the script to the command line:

Content-type: text/xml

<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"

The hex for the gap between "text/xml" and "<!DOC" is just:

0D 0A 0D 0A

(The Ruby for this is simple:

def output_feed
  print "Content-type: text/xml\n\n"
  print rss_xml
end

)

But here's the header coming from Apache:

HTTP/1.1 200 OK
Server: Apache/2.0.44 (Win32)
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/xml

ff2
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"

Notice the 'ff2' inserted in there. Anyone know what the heck that is?

Hex between 'text/xml' and '<!DOC':

0D 0A 0D 0A 66 66 32 0D 0A

Chris
http://clabs.org
···

Date: Thu, 03 Apr 2003 15:07:46 GMT

(The Ruby for this is simple:

def output_feed
  print "Content-type: text/xml\n\n"
  print rss_xml
end

)

I use Apache to run CGI code to emit XML and haven’t
seen this problem. How is rss_xml created? Might that be
the source of the extra bytes?

James

I think it’s because you have “Transfer-Encoding: chunked” - would I be right
in saying your XML document is 4082 bytes long?

From RFC2616:

3.6.1 Chunked Transfer Coding

The chunked encoding modifies the body of a message in order to
transfer it as a series of chunks, each with its own size indicator,
followed by an OPTIONAL trailer containing entity-header fields. This
allows dynamically produced content to be transferred along with the
information necessary for the recipient to verify that it has
received the full message.

   Chunked-Body   = *chunk
                    last-chunk
                    trailer
                    CRLF

   chunk          = chunk-size [ chunk-extension ] CRLF
                    chunk-data CRLF
   chunk-size     = 1*HEX
   last-chunk     = 1*("0") [ chunk-extension ] CRLF
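
To make the grammar concrete, here is a minimal Ruby sketch of a decoder for a chunked body. The method name decode_chunked is made up for illustration; it discards chunk extensions and ignores any trailer, so treat it as a reading aid rather than a complete implementation. It also shows why 'ff2' appears ahead of the XML: it is the hexadecimal size of the first chunk.

```ruby
require "stringio"

# Decode a chunked message body following the RFC 2616 grammar quoted above.
# Simplifications (assumptions of this sketch): chunk extensions are dropped
# and the optional trailer is not parsed.
def decode_chunked(raw)
  io = StringIO.new(raw)
  body = String.new
  loop do
    size_line = io.gets("\r\n")
    raise ArgumentError, "missing chunk-size line" if size_line.nil?

    # chunk-size is hexadecimal; anything after ";" is a chunk-extension.
    size = size_line.split(";", 2).first.strip.hex
    break if size.zero? # last-chunk

    data = io.read(size)
    raise ArgumentError, "truncated chunk" if data.nil? || data.bytesize < size

    body << data
    io.read(2) # consume the CRLF that terminates chunk-data
  end
  body
end

# The size line from the original post, read as hex:
first_chunk_size = "ff2".hex
```

For example, decode_chunked("5\r\nhello\r\n0\r\n\r\n") gives back "hello", and "ff2".hex is 4082.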

Regards,

Brian.

···

On Fri, Apr 04, 2003 at 12:13:48AM +0900, Chris Morris wrote:

Transfer-Encoding: chunked
Content-Type: text/xml

ff2

<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"

Notice the 'ff2' inserted in there. Anyone know what the heck that is?

(The Ruby for this is simple:

def output_feed
  print "Content-type: text/xml\n\n"
  print rss_xml
end

)

Seems this change fixed it:

 def output_feed
   print "Content-type: text/xml\n\n"
   $stdout.flush
   print rss_xml
   $stdout.flush
 end

Without the flush, I guess Apache was aware that the output may not be done
and automatically went into ‘chunked’ mode.

With the flush, ‘Transfer-Encoding: chunked’ is not used and the XML comes
across clean.

Chris
http://clabs.org

I use Apache to run CGI code to emit XML and haven’t
seen this problem. How is rss_xml created? Might that be
the source of the extra bytes?

I don’t see how – I included in my original post the top of the raw output
when I run the script from the command line – there’s no extra bytes there.
It doesn’t appear Ruby is putting them there, I’m assuming Apache must be. I
guess I could try this on a non-Windows, non-2.0 Apache install and see what
happens there.

Chris

I think it’s because you have “Transfer-Encoding: chunked”

Ah, I’ll have to check that out.

would I be right
in saying your XML document is 4082 bytes long?

It’s longer than that (by 350-ish bytes), but close. I’d thought that maybe
that was a length value – I’ll poke around. Thanks and sorry for the
non-Ruby noise.

Chris
http://clabs.org

It shouldn’t cause a problem though. RFC2616 again:

All HTTP/1.1 applications MUST be able to receive and decode the
“chunked” transfer-coding

If the client was making an HTTP/1.0 request then possibly Apache should not
be using this transfer-coding, so you might want to dig around there a bit
more. Flushing stdout a few times is a rather dubious way to solve this if
it is indeed a problem; you may find it reappears again when your dataset
gets a bit larger!
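
One way to dig into the HTTP/1.0 question from inside the script is to look at the standard CGI variable SERVER_PROTOCOL, which carries the protocol and version of the request. A small sketch follows; the helper name http_1_0_request? is made up for illustration.

```ruby
# True when the CGI environment says the request came in as HTTP/1.0.
# SERVER_PROTOCOL is a standard CGI variable (e.g. "HTTP/1.0", "HTTP/1.1");
# the env parameter defaults to the real environment and exists only so
# the method can be exercised with a plain Hash.
def http_1_0_request?(env = ENV)
  env["SERVER_PROTOCOL"].to_s.strip.upcase == "HTTP/1.0"
end
```

Writing the value to $stderr from the CGI script (which Apache normally sends to its error log) would show which protocol version the client actually used.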

Chunked encoding saves the server having to buffer your entire CGI output
just so that it can create a correct Content-Length: header.
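
If a non-chunked response really is wanted, a less fragile route than flushing is for the script to do that buffering itself and send its own Content-Length. A sketch, assuming rss_xml is passed in as a string (in the original post it is a method or variable in scope) and assuming the server passes a script-supplied Content-Length through, in which case it should have no reason to chunk:

```ruby
# Build the whole CGI response in memory so the byte length is known
# before anything is written.
def feed_response(rss_xml)
  body = rss_xml.to_s
  headers = "Content-Type: text/xml\r\n" \
            "Content-Length: #{body.bytesize}\r\n" \
            "\r\n"
  headers + body
end

def output_feed(rss_xml, out = $stdout)
  # On Windows a text-mode stdout turns "\n" into "\r\n", which would make
  # the declared length wrong if the body contains newlines.
  out.binmode
  out.write(feed_response(rss_xml))
  out.flush
end
```

The length is counted in bytes, not characters, so a feed with non-ASCII text still gets a correct header.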

Regards,

Brian.

···

On Fri, Apr 04, 2003 at 02:45:16AM +0900, Chris Morris wrote:

Seems this change fixed it:

 def output_feed
   print "Content-type: text/xml\n\n"
   $stdout.flush
   print rss_xml
   $stdout.flush
 end

Without the flush, I guess Apache was aware that the output may not be done
and automatically went into ‘chunked’ mode.

With the flush, ‘Transfer-Encoding: chunked’ is not used and the XML comes
across clean.

Then that may be in the next chunk? 4082 is 14 bytes shy of 4k…

···

Chris Morris (chrismo@clabs.org) wrote:

I think it’s because you have “Transfer-Encoding: chunked”

Ah, I’ll have to check that out.

would I be right
in saying your XML document is 4082 bytes long?

It’s longer than that (by 350-ish bytes), but close. I’d thought that maybe
that was a length value – I’ll poke around. Thanks and sorry for the
non-Ruby noise.


Eric Hodel - drbrain@segment7.net - http://segment7.net