CGI and multipart data


(Martin Hart) #1

Hi all,

I have a single HTML form which contains simple text controls as well as file
upload controls. I had assumed that the CGI library would return me items of
type String for the text boxes and items of type Tempfile for the file upload
boxes…

However, it appears (after searching the archives) that StringIO is used
instead of Tempfile on ruby 1.8 (in some cases and not others?), and that as
soon as you specify an enctype of multipart you never? receive Strings but
always receive either Tempfile or StringIO.

This is what I really want from the CGI library:

  • CGI returns everything as a string except for file uploads - which are
    returned as Tempfiles

Assuming that I cannot have that :) can somebody in the know please tell me
(or point me to some documentation)…

  1. how can I tell whether or not the supplied parameter was a file upload or
    simple text that has been passed to me as a Tempfile/StringIO? At the moment
    I don’t know whether to copy the file or not - because I don’t know if it is
    a valid file or just another parameter.

  2. do hidden controls work with multi-part forms? there is some discussion of
    them not working on the mailing list dating back to last November but I don’t
    know if that has been fixed or not. My limited testing here indicates that
    hidden parameters are ignored on multipart forms?

  3. why are all values expressed as Tempfile/StringIO when using a multipart
    form? Why not just have the file uploads as tempfiles?

  4. how does the CGI library determine whether or not to switch between
    Tempfile/StringIO?

  5. do I have to specifically close the IO object (assuming that one is
    returned?) - or perhaps I do not have to close the StringIO but I do have to
    close the Tempfile?

  6. are all of the form parameters in different tempfiles or are they all in
    one? Say I close the file after reading one argument … does this mean that I
    cannot read any more?

Thanks for any help you can give me - I am pretty stuck on this. If I get
code working with StringIO objects it seems to break when I submit large
files, and if I get code working with Tempfile it seems to break when I
submit small files :)

Cheers,
Martin


Martin Hart
Arnclan Limited
53 Union Street
Dunstable, Beds
LU6 1EX
http://www.arnclanit.com


(Paul Vudmaska) #2

Martin Hart wrote:

Hi all,

I have a single HTML form which contains simple text controls as well as file
upload controls. I had assumed that the CGI library would return me items of
type String for the text boxes and items of type Tempfile for the file upload
boxes…

However, it appears (after searching the archives) that StringIO is used
instead of Tempfile on ruby 1.8 (in some cases and not others?), and that as
soon as you specify an enctype of multipart you never? receive Strings but
always receive either Tempfile or StringIO.

This is what i really want from the CGI library:

  • CGI returns everything as a string except for file uploads - which are
    returned as Tempfiles

Assuming that I cannot have that :) can somebody in the know please tell me
(or point me to some documentation)…

  1. how can I tell whether or not the supplied parameter was a file upload or
    simple text that has been passed to me as a Tempfile/StringIO? At the moment
    I don’t know whether to copy the file or not - because I don’t know if it is
    a valid file or just another parameter.

If it is just posted, it's text. If you use enctype=multipart/form-data
then all will be files.
If you use the second, use $cgi['your_form_item'][0].read to get the
text value.

For example:
oOverwrite = $cgi['overwrite'][0] # this is a checkbox
if oOverwrite
  overwrite = oOverwrite.read.chomp # I'm reading its value
  if overwrite == 'on' then bOverwrite = true end
end

Notice I never close the tmp file. Never thought about it ;o)
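A minimal sketch of a helper that reads a field the same way whether the library hands back a String, a StringIO or a Tempfile. The name `field_text` is hypothetical and not part of cgi.rb; it only relies on the value responding to `read`, as in the snippet above.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper (not part of cgi.rb): return the text of a form value,
# whatever class the CGI library handed back.
def field_text(value)
  return nil if value.nil?
  # A plain String has no #read, so return it unchanged.
  return value.to_s unless value.respond_to?(:read)
  # Go back to the start in case something already read from the object.
  value.rewind if value.respond_to?(:rewind)
  value.read.to_s.chomp
end
```

With the checkbox above this would be called as `field_text($cgi['overwrite'][0]) == 'on'`.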

  2. do hidden controls work with multi-part forms? there is some discussion of
    them not working on the mailing list dating back to last November but I don’t
    know if that has been fixed or not. My limited testing here indicates that
    hidden parameters are ignored on multipart forms?

Yes. You’ll need to access them as if they were a file - like above.

  3. why are all values expressed as Tempfile/StringIO when using a multipart
    form? Why not just have the file uploads as tempfiles?

I’ve not used StringIO

  4. how does the CGI library determine whether or not to switch between
    Tempfile/StringIO

Don't know.
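For what it's worth, here is a sketch of how such a switch might work. It assumes, from a reading of the cgi.rb shipped with Ruby 1.8, that the buffer class is chosen from the size of the whole request (a cutoff of roughly 10 KB) and not from the type of the field. `buffer_for` and its `threshold` default are hypothetical names for illustration; check your own copy of cgi.rb for the real rule.

```ruby
require "stringio"
require "tempfile"

# Hypothetical illustration, not the library's code: pick an in-memory buffer
# for small requests and a temporary file for large ones.
# The 10_240 default is an assumption, not a documented constant.
def buffer_for(content_length, threshold = 10_240)
  if content_length > threshold
    Tempfile.new("CGI")   # large request: spool to disk
  else
    StringIO.new          # small request: keep in memory
  end
end
```

If that assumption holds, it would explain why the same field arrives as a StringIO on a small submission and as a Tempfile once a large file is attached. Code that only calls `read` on the value works with both.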

  5. do I have to specifically close the IO object (assuming that one is
    returned?) - or perhaps I do not have to close the StringIO but I do have to
    close the Tempfile?

Hmm, don't know. I've never closed them.

  6. are all of the form parameters in different tempfiles or are they all in
    one? Say i close the file after reading 1 argument … does this mean that I
    cannot read any more?

I think they are individual files.

Something to note: IE sends the whole path for the file name. NS(1.6)
only sends the name.
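A small sketch of one way to cope with that difference: keep only the last path component, whichever separator the browser used. `client_basename` is a hypothetical helper, and the character whitelist is an assumption about what you want to allow on disk. It is not a complete sanitizer.

```ruby
# Hypothetical helper: reduce a client-supplied file name to a bare name.
# Handles both "C:\dir\file.gif" (as sent by IE) and "/dir/file.gif".
def client_basename(original_filename)
  # Take everything after the last backslash or forward slash.
  name = original_filename.to_s.split(/[\\\/]/).last.to_s
  # Drop leading dots so "." and ".." cannot be used as a name.
  name = name.sub(/\A\.+/, "")
  # Replace anything outside a conservative set of characters.
  name.gsub(/[^\w.\-]/, "_")
end
```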

CGI can be kind of counter intuitive - but it will work!

:P


(Paul Vudmaska) #3

Here is the code that I use. Probably not the best, but it has worked for me so far.

class Upload
#{{{ -----------------------Uploads file to server from html form--------------------------

#max size of file in bytes
MAX_SIZE     = 100000

#where the file goes / MUST HAVE WRITE PRIVS HERE
PATH          = "/home/paul/web/"

#how many file inputs - you can upload multiple files at once
FILE_COUNT     = 3

#what file types do we allow?
CONTENT_TYPES= ['image/jpg','image/jpeg','image/gif','image/png']




def initialize
   
    #how are things going?
    @status      = []
   
    if $form.isPost
       
        post
       
        print @status.join('<br/>')
       
        form
       
    else
   
   
        form
   
       
    end
       
end
def form
#{{{

    puts '<form method="post" enctype="multipart/form-data">'
   
    FILE_COUNT.times do
   
        puts '<p/><input type="file" name="myfile">'

    end
       
     puts '<p/><input type=hidden name=upl value="upload">'         
     puts '<br/><input type=checkbox name="overwrite"/>Overwrite?'
     puts '<br/><input type="submit">'         
     puts '</form>'   

end#}}}


def post       
#{{{
   
   
    oOverwrite = $cgi['overwrite'][0]
    bOverwrite = false
    overwrite = ''
   
    if oOverwrite
        overwrite = oOverwrite.read.chomp
        if overwrite == 'on' then bOverwrite = true end
    end

    $cgi['myfile'].each do |incoming|
       
        if incoming.size == 0
           
            @status<< "Ignoring empty field"
            next
       
        end
       
        if incoming.size > MAX_SIZE
           
            @status<< "Data too large for #{incoming.original_filename}(#{incoming.size} > #{MAX_SIZE})"
            next

        end
       
        #need to strip :)...trailing space...ouch
        if not CONTENT_TYPES.include? incoming.content_type.strip
           
            @status<< "Type not allowed(type = #{incoming.content_type}) allowed content = #{CONTENT_TYPES.join(' | ')}"
            next

        end           
       
        #puts incoming.filename
        # all should be ok to upload
       
        sfilename = incoming.original_filename.untaint
       
        #see if name has whacks...ie?
        rdash = sfilename.rindex('\\')           
        if rdash
            sfilename = sfilename[rdash+1,sfilename.length]
        end
       
        #physical path
        path      = PATH + $Domain + '/img/' + sfilename
   
       
        if File.exist? path           
            if bOverwrite
                File.delete path
            else
                @status<< "File already exists (#{incoming.original_filename})"
                next
            end
        end

        #write to file
        file = File.new(path.untaint,'w')               
        file << incoming.read
        file.close

        #path to link from web
        httpPath = "http://#{$Domain}/img/#{sfilename}"
       
        @status<< "Completed upload of <a target=_blank href='#{httpPath}'>#{httpPath}</a>"
    end

end#}}}

end#}}}
Upload.new
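The write step in the class above reads the whole upload into memory with `incoming.read`. A hedged alternative sketch copies in fixed-size chunks, enforces the size limit while copying, and closes the source afterwards. `save_upload`, its parameters and the 8192-byte chunk size are assumptions for illustration; it expects any object that responds to `read`, such as the StringIO or Tempfile values discussed in this thread.

```ruby
require "stringio"

# Hypothetical helper: copy an uploaded value to dest_path in chunks.
# Returns the number of bytes written; raises ArgumentError if the upload
# grows past max_bytes (a partial file may be left behind in that case).
def save_upload(io, dest_path, max_bytes)
  written = 0
  io.rewind if io.respond_to?(:rewind)
  File.open(dest_path, "wb") do |out|
    # read(n) returns nil at end of input, which ends the loop.
    while (chunk = io.read(8192))
      written += chunk.bytesize
      raise ArgumentError, "upload larger than #{max_bytes} bytes" if written > max_bytes
      out.write(chunk)
    end
  end
  written
ensure
  # Close the source whether or not the copy succeeded.
  io.close if io.respond_to?(:close) && !io.closed?
end
```

In the `post` method this would stand in for the `File.new` / `file << incoming.read` / `file.close` lines, called as `save_upload(incoming, path.untaint, MAX_SIZE)` on Ruby versions that still have `untaint`.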


(Martin Stannard) #4

Hi Martin,

Hi all,

I have a single HTML form which contains simple text controls as well as file
upload controls. I had assumed that the CGI library would return me items of
type String for the text boxes and items of type Tempfile for the file upload
boxes…

However, it appears (after searching the archives) that StringIO is used
instead of Tempfile on ruby 1.8 (in some cases and not others?), and that as
soon as you specify an enctype of multipart you never? receive Strings but
always receive either Tempfile or StringIO.

This is what i really want from the CGI library:

  • CGI returns everything as a string except for file uploads - which are
    returned as Tempfiles

Assuming that I cannot have that :) can somebody in the know please tell me
(or point me to some documentation)…

The source for cgi.rb is a good place to start, I found.

  1. how can I tell whether or not the supplied parameter was a file upload or
    simple text that has been passed to me as a Tempfile/StringIO? At the moment
    I don’t know whether to copy the file or not - because I don’t know if it is
    a valid file or just another parameter.

Can’t you tell by the name of the parameter? If you’re naming them then
you should know what to expect. Anyway it doesn’t matter cause you treat
them exactly the same. See pt 4 below.

  2. do hidden controls work with multi-part forms? there is some discussion of
    them not working on the mailing list dating back to last November but I don’t
    know if that has been fixed or not. My limited testing here indicates that
    hidden parameters are ignored on multipart forms?

You may want to look at
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/87163 if
you’re having trouble with multiple fields on multi-part forms. I use
hidden controls with no problems.

  3. why are all values expressed as Tempfile/StringIO when using a multipart
    form? Why not just have the file uploads as tempfiles?

  4. how does the CGI library determine whether or not to switch between
    Tempfile/StringIO

I use the following method to extract data when using multi-part forms:

def get_element_data(name)
  @t << "get_element_data [#{name}]" << "\n"
  value = @cgi[name]
  data = value.read
  @t << "value.read : [#{data}]\n"
  data
end

the passed item is a Tempfile or a StringIO so it shouldn’t matter to the
developer.

  5. do I have to specifically close the IO object (assuming that one is
    returned?) - or perhaps I do not have to close the StringIO but I do have to
    close the Tempfile?

I haven’t been doing this.

  6. are all of the form parameters in different tempfiles or are they all in
    one? Say i close the file after reading 1 argument … does this mean that I
    cannot read any more?

Haven’t done multiple file uploads from 1 form.

Thanks for any help you can give me - I am pretty stuck on this. If I get
code working with StringIO objects it seems to break when I submit large
files, and if I get code working with Tempfile it seems to break when I
submit small files :)

Cheers,
Martin


Hope that helps,

regards,

Martin

From my understanding the read method is mixed in regardless of whether


(Martin Hart) #5

Martin Hart wrote:

  1. how can I tell whether or not the supplied parameter was a file upload
    or simple text that has been passed to me as a Tempfile/StringIO?

If it is just posted, it's text. If you use enctype=multipart/form-data
then all will be files.
If you use the second, use $cgi['your_form_item'][0].read to get the
text value.

that’s not quite what I meant… On a multipart form, is there a way to
determine whether or not the value at $cgi["fred"] represents an uploaded
file or just some text that was entered into an input control?

  2. do hidden controls work with multi-part forms?

Yes. You’ll need to access them as if they were a file - like above.

ok I’ll have to try this again - my previous tests led me to believe that
$cgi["fish"] (where fish was a hidden control) returned me nil whatever the
value of the control was. I’ll retest this.

  3. why are all values expressed as Tempfile/StringIO when using a
    multipart form? Why not just have the file uploads as tempfiles?

I’ve not used StringIO

I didn’t think it was optional - is there a way to turn it off so that only
Tempfiles are used?

Ruby-talk #19664 contains a patch that seems to offer what I want (i.e.
representing only file uploads as objects of class Tempfile). However that
message is from Aug 2001 and things have moved on since then…

Can I ask if that patch was ever incorporated and if not why it was not
suitable?

It seems to me to be illogical and inefficient (assuming we have to create a
load of tempfiles) to present all multipart form values as IO objects -
surely it is better to just present the values that contain actual file
uploads as IO objects?

Cheers,
Martin

On Thursday 26 February 2004 21:05, Paul Vudmaska wrote:


(Martin Hart) #6

Hi Martin,
The source for cgi.rb is a good place to start I found.

fair point - I have looked at it - which is why I asked the questions about
why it returns Tempfile/StringIO for an ordinary text input. I still think
it is logical to return a string here - keep the Tempfile for an actual file.

Can’t you tell by the name of the parameter? If you’re naming them then
you should know what to expect. Anyway it doesn’t matter cause you treat
them exactly the same. See pt 4 below.

not really - I was planning on writing some generic code that handles files
separately from String values (at a level where it does not know the names of
the incoming parameters) …

each parameter
  if parameter is an uploaded file
    copy file to known location
    pass filename to cgi script
  else
    pass value to cgi script
  end
end
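The loop above can be sketched in Ruby roughly as follows. It rests on one assumption that should be checked against your own cgi.rb: that multipart values carry an `original_filename` method (the upload code earlier in this thread calls it) and that it is empty for fields that are not file uploads. `uploaded_file?` and `partition_params` are hypothetical names, not library methods.

```ruby
require "stringio"

# Assumption: a value is a file upload when it carries a non-empty
# client-side file name.
def uploaded_file?(value)
  value.respond_to?(:original_filename) && !value.original_filename.to_s.empty?
end

# Split a params hash (name => value or array of values) into plain text
# fields and uploads, without knowing the field names in advance.
def partition_params(params)
  texts = {}
  files = {}
  params.each do |name, values|
    # Avoid Array(values): it would turn an IO-like object into its lines.
    values = [values] unless values.is_a?(Array)
    values.each do |value|
      if uploaded_file?(value)
        (files[name] ||= []) << value
      else
        text = value.respond_to?(:read) ? value.read : value.to_s
        (texts[name] ||= []) << text
      end
    end
  end
  [texts, files]
end
```

The caller can then copy each entry in `files` to a known location and pass `texts` straight through, which keeps the rest of the script unaware of whether the form was multipart.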

my plan was to isolate the problems that have been reported on the list by
having my generic code not need to know that it is receiving a multipart or
normal form. Patrick May posted an RCR (ruby-talk #35858) which moves
towards this - although it doesn’t fully work with the new StringIO stuff -
it has provided some ideas.

I’ll keep playing - but as has been pointed out to me off-list, the CGI
library does have a vulnerability to a DoS attack that makes it not really
suitable for production (ruby-talk #83725). From reading cgi.rb I don’t
think that this has been patched yet.

I’ll move away from cgi.rb for the time being, thanks to everybody for the
help.

Cheers,
Martin

On Thursday 26 February 2004 23:13, Martin Stannard wrote:


(Paul Vudmaska) #7

Martin Hart wrote:

Martin Hart wrote:

  1. how can I tell whether or not the supplied parameter was a file upload
    or simple text that has been passed to me as a Tempfile/StringIO?

If it is just posted, it's text. If you use enctype=multipart/form-data
then all will be files.
If you use the second, use $cgi['your_form_item'][0].read to get the
text value.

that’s not quite what I meant… On a multipart form, is there a way to
determine whether or not the value at $cgi["fred"] represents an uploaded
file or just some text that was entered into an input control?

No, not as far as I know (which is not very far!!)

  2. do hidden controls work with multi-part forms?

Yes. You’ll need to access them as if they were a file - like above.

ok I’ll have to try this again - my previous tests led me to believe that
$cgi["fish"] (where fish was a hidden control) returned me nil whatever the
value of the control was. I’ll retest this.

They seem to work. I tested it.

  3. why are all values expressed as Tempfile/StringIO when using a
    multipart form? Why not just have the file uploads as tempfiles?

I’ve not used StringIO

Sorry, I don't know!

I didn’t think it was optional - is there a way to turn it off so that only
Tempfiles are used?

Ruby-talk #19664 contains a patch that seems to offer what I want (i.e.
representing only file uploads as objects of class Tempfile). However that
message is from Aug 2001 and things have moved on since then…

The version I use (on my host) is 1.6 or so

Can I ask if that patch was ever incorporated and if not why it was not
suitable?

It seems to me to be illogical and inefficient (assuming we have to create a
load of tempfiles) to present all multipart form values as IO objects -
surely it is better to just present the values that contain actual file
uploads as IO objects?

Cheers,
Martin

Seems like a good question, but I honestly don't know.

When things are sent with multipart/form-data the items are actually
sent as a sort of binary form (to support, say, gif files). And it is the
responsibility of the server-side code to dissect it.

So, the fields are either all binary or all text.
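To make the wire format a little more concrete, here is a rough sketch that builds a tiny multipart/form-data body by hand and splits it back into parts. In this format each field travels as its own part with its own headers, and a file part carries a `filename` in its Content-Disposition header, which is one thing server-side code can key on. The boundary string, field names and helper names are made up for the example, and `describe_parts` is not a complete parser: it ignores quoting edge cases and assumes the whole body fits in memory.

```ruby
# Build a sample body with one text field and one file field.
def sample_body(boundary)
  [
    "--#{boundary}",
    'Content-Disposition: form-data; name="overwrite"',
    "",
    "on",
    "--#{boundary}",
    'Content-Disposition: form-data; name="myfile"; filename="cat.gif"',
    "Content-Type: image/gif",
    "",
    "GIF89a...",
    "--#{boundary}--",
    ""
  ].join("\r\n")
end

# Return one hash per part: field name, client file name (nil for plain
# fields) and the raw data.
def describe_parts(body, boundary)
  body.split("--#{boundary}").map do |chunk|
    # Skip the empty preamble and the closing "--" marker.
    next if chunk.strip.empty? || chunk.strip == "--"
    head, data = chunk.sub(/\A\r\n/, "").split("\r\n\r\n", 2)
    next unless head && data
    {
      name: head[/\bname="([^"]*)"/, 1],
      filename: head[/\bfilename="([^"]*)"/, 1],
      data: data.sub(/\r\n\z/, "")
    }
  end.compact
end
```

Running `describe_parts(sample_body("XbOuNdArYX"), "XbOuNdArYX")` gives two parts, and only the second has a `filename`.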

But hey, I’m pretty new so keep that in mind :)
Paul

On Thursday 26 February 2004 21:05, Paul Vudmaska wrote: