Lots of web pages contain copyright characters (not &copy; but
something that displays in Mozilla's view source as the copyright symbol,
in Emacs as a square box, and probably says to the world “Hi! I’m an
HTML file that was created with Word.”). SOAP4r is unhappy with that
character, as you can see in this run of the googleSearch sample:
% ruby wsdlDriver.rb 'Mark Swanson'
/usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set': {http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by <b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b> <b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
    from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in `decode_textbuf'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in `decode_tag_end'
In contrast, the Java version that comes with the Google API download
prints the peculiar character.
This is easy for me to work around: just comment out the check (the
unless ... end block) in XSDString#_set:
    def _set(value)
      unless XSD::Charset.is_ces(value, XSD::Charset.encoding)
        raise ValueSpaceError.new("#{ type }: cannot accept '#{ value }'.")
      end
      @data = value
    end
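To illustrate the kind of check involved, here is a minimal sketch, assuming a modern Ruby (1.9 or later), where strings carry their own encoding. It is not soap4r's code: `checked_string` and the locally defined `ValueSpaceError` are hypothetical stand-ins, and the validity test uses `String#valid_encoding?` rather than `XSD::Charset.is_ces`. It shows why the same bytes can be accepted under one assumed encoding and rejected under another.

```ruby
# Hypothetical stand-in for the error raised in the trace above.
class ValueSpaceError < StandardError; end

# Accept the value only if its bytes are well-formed in the given encoding.
# This mimics the shape of the check, not soap4r's actual implementation.
def checked_string(value, encoding)
  candidate = value.dup.force_encoding(encoding)
  unless candidate.valid_encoding?
    raise ValueSpaceError, "string: cannot accept #{value.inspect}."
  end
  candidate
end

# The same raw bytes, checked under two different assumed encodings.
raw = "Copyright \xC2\xA9 2002".b

checked_string(raw, Encoding::UTF_8)        # accepted

begin
  checked_string(raw, Encoding::US_ASCII)   # rejected
rescue ValueSpaceError => e
  puts "rejected: #{e.message}"
end
```

Keeping the check and fixing the assumed encoding is generally safer than removing the check, since the check still catches input that really is malformed.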
My questions:
- Is there a better workaround? Or something I’m misunderstanding?
- Is this behavior something that should be changed in SOAP4r?
- Is Google in error in delivering that character in that type of
  string?
I am using ruby 1.8.1-preview2 and the code from soap4r-1_5_1.
Brian Marick
Consulting, training, contracting, and research
Focused on the intersection of testing, programming, and design
marick@testing.com, marick@visibleworkings.com
www.testing.com, www.visibleworkings.com
Hi, good morning from the Far East.
From: “Brian Marick” marick@testing.com
Sent: Sunday, November 09, 2003 5:57 AM
% ruby wsdlDriver.rb 'Mark Swanson'
/usr/local/lib/ruby/1.8/xsd/datatypes.rb:184:in `_set': {http://www.w3.org/2001/XMLSchema}string: cannot accept 'Artwork by <b>Mark</b> <b>Swanson</b> Copyright © 2002 <b>Mark</b> <b>Swanson</b>. All rights reserved. '. (XSD::ValueSpaceError)
    from /usr/local/lib/ruby/1.8/xsd/datatypes.rb:125:in `set'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:446:in `decode_textbuf'
    from /usr/local/lib/ruby/1.8/soap/encodingstyle/soapHandler.rb:223:in `decode_tag_end'
GoogleAPI returns the "\xc2\xa9" sequence in UTF-8 format. Can you try this?
$ ruby -Ku wsdlDriver.rb 'Mark Swanson Copyright Artwork'
There may be another cause (no iconv?), though. I cannot reproduce
the same error on my Linux/Cygwin boxes, even when I run
wsdlDriver.rb with "-Kn".
Besides this, I should add '$KCODE = "UTF8"' at the head of
wsdlDriver.rb.
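The byte sequence mentioned above can be inspected directly. This is a minimal sketch, assuming a modern Ruby (1.9 or later): there `$KCODE` no longer has any effect and each string carries its own encoding, so `force_encoding` stands in for the interpreter-wide setting that `-Ku` selects in Ruby 1.8. The variable names are illustrative only.

```ruby
# The two bytes quoted from the Google response, as a binary string.
raw = "\xC2\xA9".b

# Interpret the same bytes under two different encodings.
utf8  = raw.dup.force_encoding(Encoding::UTF_8)
ascii = raw.dup.force_encoding(Encoding::US_ASCII)

puts raw.bytesize              # how many bytes the sequence has
puts utf8.valid_encoding?      # well-formed when read as UTF-8?
puts ascii.valid_encoding?     # well-formed when read as 7-bit ASCII?
puts utf8.codepoints.inspect   # code points when read as UTF-8
```

Read as UTF-8, the two bytes form the single character U+00A9, the copyright sign. Read as 7-bit ASCII they are not valid text at all, which is consistent with the string being rejected until the interpreter is told to expect UTF-8.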
Regards,
// NaHi
Once again, you have saved me. Thank you.
On Saturday, November 8, 2003, at 06:49 PM, NAKAMURA, Hiroshi wrote:
GoogleAPI returns the "\xc2\xa9" sequence in UTF-8 format. Can you try this?
$ ruby -Ku wsdlDriver.rb 'Mark Swanson Copyright Artwork'
Brian Marick