CGI uses file size to distinguish between regular values and files

I’m using the CGI module for file upload - I think it works,
it is just poorly documented.

At least it works for me - I use it mainly for uploading binary files (jpg images) - typically in the range 1k-100k bytes in size - but also larger.

Anyway, my code looks like this (upload.rbx)

require 'cgi'

cgi = CGI.new
if cgi.has_key?('image')
  image = cgi['image'][0].read
end

image is now a string representing the file that has been uploaded.
This only works for multipart forms, however; i.e. the HTML page
should look something like this:

<form method="post" action="upload.rbx" ENCTYPE="multipart/form-data">
<input type=hidden name=page value="upload">
<input type=file size=20 name=image>
...

Or did I miss something?

Cheers
Jesper

···

Austin Ziegler austin@halostatue.ca wrote:

On Mon, 3 Nov 2003 20:12:06 +0900, Dmitry Borodaenko wrote:

On Mon, Nov 03, 2003 at 02:29:08PM +0900, Simon Kitching wrote:

I suspect that the fact that tempfiles are used by this module is a
private implementation detail, ie something that users are not supposed
to be aware of, or depend upon.
Why, you can always check if the parameter you’ve got is a Tempfile,
and, if it is not, assume that it is small enough to be handled in core.

Adapted from Samizdat source:

if Tempfile === file then
  File.syscopy(file.path, upload)
else # StringIO
  File.open(upload, 'w') {|f| f.write(file.read) }
end

Did I miss something?
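
For readers on newer Rubies: File.syscopy came from the old ftools library, which later Ruby versions no longer ship. A minimal sketch of the same branch using FileUtils.cp instead is below. The name store_upload is hypothetical (it is not from Samizdat or cgi.rb), and the only assumption about the parameter is that it is either a Tempfile or something IO-like such as StringIO.

```ruby
require 'fileutils'
require 'stringio'
require 'tempfile'

# Hypothetical helper (not part of cgi.rb or Samizdat): store an uploaded
# value at dest_path, copying the file when one already exists on disk.
def store_upload(file, dest_path)
  if file.is_a?(Tempfile)
    file.flush                          # make sure buffered data is on disk
    FileUtils.cp(file.path, dest_path)  # let the OS copy the temp file
  else
    # Assumed to be StringIO or another in-memory, IO-like object.
    File.open(dest_path, 'wb') { |f| f.write(file.read) }
  end
  dest_path
end
```

The caller still has to know that two storage strategies exist, which is exactly the distinction being argued about in this thread.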

Why should I have to make that distinction in my code? I’ve got a Perl app
that I plan to port over to Ruby at some point, and it allows file uploads
(up to a couple megabytes in size). Indeed, the way that I deal with the
file uploads never touches the filesystem from the perspective of my
application (the file “string” goes directly into a MySQL database). If I
have to step into my own filehandling, then I may not port this app to Ruby
even though it’s a fine candidate for it otherwise.

The library should provide me the raw data. If I decide something is a file
and want to take some shortcuts with it, then the library can provide me
additional information (e.g., CGI.tempfile?(parameter_index)) and
functionality to access that implementation detail, but I should never
have to make the distinction in my application code. Ever.
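
To make the request concrete, here is a rough sketch of what such an interface could look like. Everything in it is hypothetical: ParamValue, tempfile? and path are illustrative names, not part of cgi.rb. It only assumes that a raw parameter is a String, a StringIO or a Tempfile.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical wrapper, not part of cgi.rb: presents any parameter value
# as a String, while still exposing the backing Tempfile on request.
class ParamValue
  def initialize(raw)
    @raw = raw
  end

  # Raw data as a String, whatever the underlying storage is.
  def to_s
    return @raw.to_s unless @raw.respond_to?(:read)
    @raw.rewind
    @raw.read
  end

  # True only when the data is backed by a Tempfile.
  def tempfile?
    @raw.is_a?(Tempfile)
  end

  # Path of the backing file, or nil if the value lives in memory.
  def path
    tempfile? ? @raw.path : nil
  end
end
```

With something like this, code that just wants the data calls to_s, and code that wants to hand the file to the OS checks tempfile? first.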

-austin

austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.11.03
* 09.35.55

This is what I don’t want. I shouldn’t have to handle it
differently (I think that the Perl mechanism is even worse), but I
should be able to handle it differently if I want.

Not only that, as the OP said at the beginning of this thread, if
the size of the file is large enough, then all CGI parameters are
turned into tempfiles, requiring a #read on all of them, depending
on the size of the file being uploaded. When I have a file uploaded,
I have the file and five other parameters. I shouldn’t have to do a
#read on all of the parameters.

Give me a way to access the temp files behind the data, but I’m not
at all interested in having to access the temp files unless I
explicitly want to.

-austin

···

On Tue, 4 Nov 2003 00:43:49 +0900, Jesper Olsen wrote:

I’m using the CGI module for file upload - I think it works, it is
just poorly documented.

At least it works for me - I use it mainly for uploading binary
files (jpg images) - typically in the range 1k-100k bytes in size
- but also larger.

Anyway, my code looks like this (upload.rbx)
cgi = CGI.new
if cgi.has_key?('image')
  image = cgi['image'][0].read
end


austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.11.03
* 11.15.01

Session#params in samizdat/session.rb in Samizdat:

def params(keys)
  result = []
  for key in keys
    value = self[key]
    # reject oversized input before reading it
    raise UserError, "Input size exceeds content size limit" if
      value.respond_to?(:size) and
      value.size > config['limit']['content']
    result <<
      case value
      when String then (value =~ /[^\s]/) ? value.to_s : nil
      when StringIO, Tempfile then value.read
      else nil
      end
  end
  return result
end

Is that what you want?

···

On Tue, Nov 04, 2003 at 01:25:31AM +0900, Austin Ziegler wrote:

On Tue, 4 Nov 2003 00:43:49 +0900, Jesper Olsen wrote:

Anyway, my code looks like this (upload.rbx)
cgi = CGI.new
if cgi.has_key?('image')
  image = cgi['image'][0].read
end
This is what I don’t want. I shouldn’t have to handle it
differently (I think that the Perl mechanism is even worse), but I
should be able to handle it differently if I want.


Dmitry Borodaenko

[snip]

Sort of. I want it to work that way in the default CGI library. There should
be no distinction for the user who just wants the data. I still want to be
able to access the resulting Tempfile, though, if I’m deciding to do
file-based storage instead of, say, database storage. That way, I don’t need
to write the data, but can tell the OS to copy the file.

-austin

···

On Tue, 4 Nov 2003 04:44:50 +0900, Dmitry Borodaenko wrote:

austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.11.03
* 15.56.30

The original response from Simon Kitching was the best on this topic.
You are trying to access a private implementation object instead of the
public object. It looks like all the CGI.rb API guarantees you is an IO-like
object from which you can read the uploads. Whether the writer of the class
wishes to use a regular file, or a Tempfile, or StringIO, or whatever
other IO-like object, really is up to them. You are
trying to “cheat” (couldn’t think of a better word) by grabbing the
tempfile instead of using the generic IO methods of read or write. Yes,
there is no question that a tempfile is easier to use if you want to
copy the data - however, I’m sure that from a performance standpoint - the
author has decided that the overhead of a tempfile shouldn’t be incurred
by the average user when the length of the data is below the 10240
character threshold. It’s like 99.9% of performance decisions made with
something like this - it will never please everyone all the time -
however, I think the author made a pretty good decision and I can’t
imagine it’s that difficult to do a

File.open(upload, 'w') {|f| f.write(file.read) }

which would work under both circumstances. If you find that your
performance suffers greatly by this then I would look to subclassing
CGI and making the trivial modification required to force the use of
Tempfiles.
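
A minimal sketch of the "generic IO methods only" approach, for illustration. The name save_upload is hypothetical, and IO.copy_stream (available since Ruby 1.9) is swapped in here for the one-line write so that a large upload is copied in chunks rather than read into memory at once. It assumes only that the parameter responds to read.

```ruby
require 'stringio'
require 'tempfile'

# Hypothetical helper: write an uploaded value to dest_path using only
# generic IO calls, so the caller never checks the concrete class.
def save_upload(io, dest_path)
  io.rewind if io.respond_to?(:rewind)  # start from the beginning
  File.open(dest_path, 'wb') do |out|
    IO.copy_stream(io, out)             # returns the number of bytes copied
  end
end
```

The same call works whether the library handed back a StringIO or a Tempfile, at the cost of always writing the bytes instead of asking the OS to copy an existing file.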

Ruby offers soooo many choices to do any particular task that we all
have to realize that the authors of the various modules have to make
decisions that they believe will meet our requirements. That said, it’s
so simple to make the changes that you need that I would spend my time
subclassing CGI and moving on to better things.

John W Higgins
john@wishdev.com

···

On Mon, 2003-11-03 at 12:58, Austin Ziegler wrote:

On Tue, 4 Nov 2003 04:44:50 +0900, Dmitry Borodaenko wrote:
[snip]

Sort of. I want it to work that way in the default CGI library. There should
be no distinction for the user who just wants the data. I still want to be
able to access the resulting Tempfile, though, if I’m deciding to do
file-based storage instead of, say, database storage. That way, I don’t need
to write the data, but can tell the OS to copy the file.

-austin


Actually, my complaint is subtly different than the OP’s complaint. cgi.rb
is incorrect in how it behaves because what is returned will vary based on
whether or not a multipart form is provided and on how large the submitted
data is. Also, it will extend Tempfile objects, but it will not extend
StringIO objects – so the “type” isn’t the same using duck typing, either.
The behaviour isn’t consistent, thus it’s wrong.

I suggest you reread my complaint a bit more thoroughly. There is no way
that CGI can be seen to be correct in this case.

-austin

···

On Tue, 4 Nov 2003 07:17:33 +0900, John W Higgins wrote:

The original response from Simon Kitching was the best on this topic.


austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.11.03
* 18.10.05