Open-uri error

Elliot_Temple · 7 June 2006 06:47

I am writing a script to download Warcraft 3 replays for me. It got a few (which work), then hit an error:

URI::InvalidURIError: bad URI(is not URI?): http://ftp.replays.net/w3g/060607/060606_mYm]Lucifer(UD)_vs_mTw-LasH(Hum)_TwistedMeadows_RN.w3g

The URL works in Safari, so I'm not sure what's going on. My wild guess is that Safari accepts technically invalid URLs. Hopefully someone knowledgeable can tell me what the issue is. Here's my code:

require "open-uri"
path = "/Applications/Warcraft\ III/Replay/auto/"
files = Dir.glob "#{path}*"
count = 0

urls = `lynx -dump http://war3.replays.net/`.split "\n"
urls = urls.select { |url| url =~ %r-\d{1,3}\. http://ftp.replays.net/w3g- }
urls = urls.collect { |url| url.sub(%r-\s*\d{1,3}\.\s*-, "")}

puts "I got #{count} files!"

-- Elliot Temple
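
The guess about Safari accepting technically invalid URLs can be checked without touching the network. Under RFC 3986 a "]" may not appear unescaped in a URL path, so Ruby's URI parser rejects the string even though browsers tolerate it. The following is a minimal sketch; uri_parses? is a made-up helper name, not part of any library:

```ruby
require "uri"

# Returns true when Ruby's URI parser accepts the string, false otherwise.
# (uri_parses? is a hypothetical helper for this illustration.)
def uri_parses?(string)
  URI.parse(string)
  true
rescue URI::InvalidURIError
  false
end

bad = "http://ftp.replays.net/w3g/060607/060606_mYm]Lucifer(UD)_vs_mTw-LasH(Hum)_TwistedMeadows_RN.w3g"
# The same URL with the "]" percent-encoded as %5D.
good = bad.sub("]", "%5D")

puts uri_parses?(bad)
puts uri_parses?(good)
```

Only the bracket needs encoding here; the parentheses and hyphens in the file name are allowed in a path as they are.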

Elliot_Temple · 7 June 2006 18:23

I fixed my problem. The key change is:

url = URI.escape(url)

Here's the current version of the code:

require "open-uri"
path = "/Applications/Warcraft\ III/Replay/auto/"
Dir.chdir path
files = Dir.glob "*"
count = 0

urls = `lynx -dump http://war3.replays.net/`.split "\n"
urls = urls.select {|url| url =~ %r-\d{1,3}\. http://ftp.replays.net/w3g- }
urls = urls.collect {|url| url.sub(%r-\s*\d{1,3}\.\s*-, "")}.uniq

puts "I found #{urls.length} replays!"

urls.each do |url|
   filename = url.sub(%r-http://ftp.replays.net/w3g/\d*/-, "")
   url = URI.escape(url)
   if not files.include?(filename)
     puts "Count is #{count}. Getting #{url}"
     open(url) do |remote_file|
       File.open(path + filename, "w") do |local_file|
         local_file.write remote_file.read
         count += 1
       end
     end
   end
end

puts "I got #{count} files!"

-- Elliot Temple
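
The select/collect/uniq steps in the script can be tried without lynx or a network connection. This is a minimal sketch that assumes the dump lists links as numbered lines. The sample lines and the replay_urls helper name are made up for illustration, and the regexes are lightly tightened versions of the ones in the script (dots escaped, prefix anchored to the start of the line):

```ruby
# Hypothetical helper: keep only the numbered lines that point at replay
# files, strip the "  12. " style prefix, and drop duplicate URLs.
def replay_urls(lines)
  replay_line   = %r{\d{1,3}\. http://ftp\.replays\.net/w3g}
  number_prefix = %r{\A\s*\d{1,3}\.\s*}

  lines.select { |line| line =~ replay_line }
       .map { |line| line.sub(number_prefix, "") }
       .uniq
end

# Made-up sample input standing in for the lines of a lynx -dump.
sample = [
  "References",
  "   1. http://war3.replays.net/index.html",
  "   2. http://ftp.replays.net/w3g/060607/first_game.w3g",
  "  10. http://ftp.replays.net/w3g/060607/second_game.w3g",
  "  11. http://ftp.replays.net/w3g/060607/first_game.w3g",
]

puts replay_urls(sample)
```

The .uniq call matters because the same replay can be linked more than once on a page, and each duplicate would otherwise be considered for download again.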

Elliot_Temple · 8 June 2006 08:18

Oops. That didn't work for URLs with square brackets in them, so I've now added this code:

     begin
       get_replay url, filename
     rescue URI::InvalidURIError
       url = url.scan(%r-http://ftp.replays.net/w3g/\d*/-)[0] + CGI.escape(filename)
       begin
         get_replay url, filename
       rescue URI::InvalidURIError
         STDERR.puts $!
       end
     end

CGI.escape escapes the brackets, but it isn't safe to apply to the entire URL (it escapes the slashes as well). Observe:

irb(main):013:0> x = CGI.escape "http://www.google.com"
=> "http%3A%2F%2Fwww.google.com"
irb(main):014:0> open(x)
Errno::ENOENT: No such file or directory - http%3A%2F%2Fwww.google.com
         from /usr/local/lib/ruby/1.8/open-uri.rb:88:in `initialize'
         from /usr/local/lib/ruby/1.8/open-uri.rb:88:in `open'
         from (irb):14
irb(main):015:0> open "http://www.google.com"
=> #<StringIO:0x585b74>

I don't know if I'm doing this the correct way, but it's working so far (got about 60 files).

Elliot
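
The difference between escaping the whole URL and escaping only the file name can also be seen offline. This is a rough sketch, assuming everything after the last "/" is the file name; escape_filename is a made-up helper name. Once the scheme's colon and slashes are escaped, the string no longer parses as an HTTP URL, which is consistent with open treating it as a local file name in the irb session above:

```ruby
require "cgi"
require "uri"

# Hypothetical helper: escape only the text after the last "/" and
# leave the scheme, host and directories alone.
def escape_filename(url)
  directory, slash, filename = url.rpartition("/")
  directory + slash + CGI.escape(filename)
end

url = "http://ftp.replays.net/w3g/060607/060606_mYm]Lucifer(UD)_vs_mTw-LasH(Hum)_TwistedMeadows_RN.w3g"

whole   = CGI.escape(url)       # escapes ":" and "/" too
partial = escape_filename(url)  # escapes the file name only

puts URI.parse(whole).scheme.inspect  # no scheme left to recognise
puts URI.parse(partial).class
puts partial
```

One caveat: CGI.escape turns a space into "+", which is a literal plus sign in a URL path, so file names containing spaces would need a different encoder.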
