Mechanize and charset issues

I'm not sure what is causing this error. I can log in successfully, but
I can't submit this form without the script bombing out.

require 'rubygems'
require 'mechanize'

formatstring = "testing submission"

agent = WWW::Mechanize.new
page = agent.get 'hidden'
form = page.forms.first

# If the first form is not the submit form, we still have to log in.
unless form.action.eql?('submit.php')
        p "logging in....."
        form['username'] = 'hidden'
        form['password'] = 'hidden'

        page = agent.submit form
        page = agent.click(page.link_with(:text => 'Add'))
end

page = agent.click(page.link_with(:text => '[Add Content]'))
uploadForm = page.forms[6]
uploadForm['format'] = formatstring
# This submit is the call that raises Iconv::IllegalSequence.
page = agent.submit uploadForm
#pp page

Gives me the error:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in `iconv': "\342\202\254\305\223a condition"... (Iconv::IllegalSequence)
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in `from_native_charset'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:151:in `from_native_charset'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:143:in `proc_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in `map'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in `proc_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:165:in `build_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in `each'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in `build_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:213:in `request_data'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:392:in `post_form'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:335:in `submit'

···

--
Posted via http://www.ruby-forum.com/.

It is caused by the HTML â€&oelig in one of the hidden form entries that
is being submitted. I'm not sure how to keep the script from bombing
while still submitting the form.
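
To see where text like that comes from, here is a rough sketch. It is not
what Mechanize 0.9.2 does internally: it uses String#encode (Ruby 1.9 or
later) in place of Iconv, and it assumes the page is served as ISO-8859-1,
which the trace does not confirm. It shows a curly quote being misread as
Windows-1252, and the euro sign in the result failing to convert.

```ruby
# Sketch only: String#encode (Ruby 1.9+) stands in for the Iconv call in the trace.
# Assumption: the target page charset is ISO-8859-1.

curly = "\u201C" # LEFT DOUBLE QUOTATION MARK, UTF-8 bytes E2 80 9C

# Misreading those three bytes as Windows-1252 gives three separate characters:
# a-circumflex, the euro sign, and the oe ligature.
mojibake = curly.dup.force_encoding('Windows-1252').encode('UTF-8')
puts mojibake

# The euro sign has no ISO-8859-1 equivalent, so the conversion raises.
conversion_failed = false
begin
  mojibake.encode('ISO-8859-1')
rescue Encoding::UndefinedConversionError => e
  conversion_failed = true
  puts "cannot convert: #{e.message}"
end

# Undoing the misread recovers the original curly quote.
repaired = mojibake.encode('Windows-1252').force_encoding('UTF-8')
puts repaired == curly
```

The UTF-8 bytes of the misread text are \303\242 \342\202\254 \305\223, and
the last five of those are the bytes quoted in the error message.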


John Schmitz wrote:

page = agent.click(page.link_with(:text => '[Add Content]'))
uploadForm = page.forms[6]
uploadForm['format'] = formatstring
page = agent.submit uploadForm
#pp page

Gives me the error:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in `iconv': "\342\202\254\305\223a condition"... (Iconv::IllegalSequence)

Running "ruby -KU ..." will probably fix it (at least it has worked for
me whenever I had errors from \nnn inside strings).


The Higgs bozo wrote:

Running "ruby -KU ..." will probably fix it (at least it has worked for
me whenever I had errors from \nnn inside strings).

Thank you for the response but it doesn't seem to solve the issue. I
think it's related to charsets and iconv, but I have no idea where to go
from there. I get a near duplicate error message with ruby -KU:

/var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in `iconv': "€œa condition"... (Iconv::IllegalSequence)
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/util.rb:40:in `from_native_charset'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:151:in `from_native_charset'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:143:in `proc_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in `map'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:142:in `proc_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:165:in `build_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in `each'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:164:in `build_query'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize/form.rb:213:in `request_data'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:392:in `post_form'
        from /var/lib/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:335:in `submit'


If anyone comes across this problem, this is how I fixed it. Found a
method online and made some minor changes and additions. I just pass the
problem strings through this and it gives me back strings that don't
have issues.

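# Replaces typographic punctuation, and its double-encoded forms, with
# plain ASCII. Modifies c in place and returns it.
def fix_quotes(c)
      # UTF-8 curly double quotes, curly single quotes, en dash, ellipsis
      c.gsub!(/\342\200(?:\234|\235)/,'"')
      c.gsub!(/\342\200(?:\230|\231)/,"'")
      c.gsub!(/\342\200\223/,"-")
      c.gsub!(/\342\200\246/,"...")
      # Double-encoded (mojibake) variants of the same characters
      c.gsub!(/\303\242\342\202\254\342\204\242/,"'")
      c.gsub!(/\303\242\342\202\254\302\235/,'"')
      c.gsub!(/\303\242\342\202\254\305\223/,'"')
      c.gsub!(/\303\242\342\202\254"/,'-')
      c.gsub!(/\342\202\254\313\234/,'"')
      # gsub! returns nil when nothing matched, so return the string explicitly
      c
end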
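
For anyone on Ruby 1.9 or later, here is a rough sketch of the same idea
written as a lookup table. It is not the method above: it only covers the
properly encoded characters (the first four substitutions), it returns a
new string rather than modifying its argument, and the names REPLACEMENTS,
PATTERN and fix_quotes_table are made up for this example.

```ruby
# Hypothetical table-driven variant for Ruby 1.9+ (names are illustrative).
REPLACEMENTS = {
  "\u201C" => '"',   # left double quotation mark
  "\u201D" => '"',   # right double quotation mark
  "\u2018" => "'",   # left single quotation mark
  "\u2019" => "'",   # right single quotation mark
  "\u2013" => '-',   # en dash
  "\u2026" => '...'  # horizontal ellipsis
}.freeze

# One regex that matches any key of the table.
PATTERN = Regexp.union(REPLACEMENTS.keys)

# Returns a copy of str with each matched character replaced by its
# ASCII stand-in; String#gsub looks the match up in the hash.
def fix_quotes_table(str)
  str.gsub(PATTERN, REPLACEMENTS)
end

puts fix_quotes_table("\u201Ca condition\u201D \u2013 it\u2019s\u2026")
```

Keeping the mapping in one hash means a new problem character is a
one-line addition instead of another gsub! call.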


Have you tried to set encoding for page something like this:
page.encoding = 'UTF-8'?

Jarmo
