Checking that an URL exists

Hi all,

Well, I want to make a little script that checks whether the URLs in my bookmarks
are OK (in other words, whether they are still online).

I’ve been looking through the URI classes, but I haven’t clearly seen a function
that performs that check.

I’ve tried using open-uri.rb but, if the open method fails, the script fails.

Any ideas?

Thanks in advance.

···


EuropeSwPatentFree
(o_.’ Imobach González Sosa imobachgs@softhome.net
//\c{} imobachgs@step.es a2419@dis.ulpgc.es
V__)_ imodev@softhome.net osoh on jabber.at and jabber.org
Linux User #201634
Gentoo Linux with kernel 2.6.6 on Intel Pentium 4

Try this:

···

###################################################

#!/usr/bin/ruby

require 'net/http'

# Host comes from the first argument, path from the second.
h = Net::HTTP.new(ARGV[0] || 'www.nonexist-ruby-lang.org', 80)
url = ARGV[1] || '/'

begin
  # The empty block discards the body as it is read.
  resp, data = h.get(url, nil) { |a| }
  puts "Valid URL"
rescue Net::ProtoRetriableError => detail
  puts "Retriable URL"
rescue
  puts "Not a Valid URL"
end

####################################################
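For the open-uri route mentioned in the question, one option is to rescue the exception so that a failed open no longer stops the script. This is only a minimal sketch, assuming a Ruby recent enough to provide URI.open (2.5 or later, which the thread itself predates). The method name url_opens? and the 5-second timeouts are illustrative choices, not something from the thread.

```ruby
require 'open-uri'

# Hypothetical helper: returns true if the URL could be opened, false otherwise.
# Any StandardError (bad URI, DNS failure, refused connection, HTTP error
# status, timeout) is reported on stderr and treated as "not reachable".
def url_opens?(url)
  URI.open(url, open_timeout: 5, read_timeout: 5) { true }
rescue StandardError => e
  warn "#{url}: #{e.class}: #{e.message}"
  false
end

# Usage (needs network access):
#   puts url_opens?('http://www.ruby-lang.org/') ? 'Valid URL' : 'Not a Valid URL'
```

Note that open-uri downloads the whole page and follows redirects on its own, so it says less about what the server actually answered than the Net::HTTP version does.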


I am having trouble with my delete_if block.

prop_array.delete_if{ |p|
(p=~/#{attr}/).to_i > 0; }

and I have tried the longer version:

prop_array.delete_if{ |p|
m = (p=~/#{attr}/).to_i
m > 0; }

and I have tried:

prop_array.delete_if{ |p|
(p=~/#{attr}/) != nil }

It deletes every element, even when m == 0 or m == nil. Any ideas? Thanks,

Zach

···

Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.684 / Virus Database: 446 - Release Date: 5/13/2004

Thank you. Just before getting your mail I had written a piece of code
that could do the job, but your proposal looks better :wink:

Thank you.

···

On Wednesday, May 26, 2004 15:09, Mohammad Khan wrote:



This is fine if you aren’t checking very many URLs. If, however, you
are going to be doing this many times, you should use “head”
instead, to reduce the webserver’s load (and improve your application’s
response time):

resp = h.head(url, nil)

Also, many invalid URLs won’t raise errors; they just return error
status codes in the response. You may want to check for that, too:

h.head "/non-existant-page.html"
=> #<Net::HTTPNotFound 404 Not Found readbody=true>
h.head "/"
=> #<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>
h.head("/")['location']
=> “Ruby Programming Language
h.head "/en/"
=> #<Net::HTTPOK 200 OK readbody=true>

Check the class of the response: Net::HTTPOK is a good URL;
Net::HTTPMovedPermanently is what it sounds like, and you can treat it
like a hash to look up the redirected location. Net::HTTPNotFound is a
dead end.
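Putting the HEAD request and the class check together might look roughly like this. It is a sketch rather than a finished checker: the names classify_response and check_url, the status symbols, and the 5-second timeouts are made up for illustration, and it matches on the broader Net::HTTPSuccess and Net::HTTPRedirection classes so that other 2xx and 3xx answers are covered too.

```ruby
require 'net/http'

# Map a Net::HTTP response object to a simple status symbol.
def classify_response(resp)
  case resp
  when Net::HTTPSuccess     then :ok         # 2xx, e.g. Net::HTTPOK
  when Net::HTTPRedirection then :moved      # 3xx, resp['location'] has the target
  when Net::HTTPNotFound    then :not_found  # 404
  else                           :error      # any other status
  end
end

# Send a HEAD request and return [status, detail].
# detail is the Location header for redirects, or the exception class name
# when the request itself failed (DNS error, refused connection, timeout).
def check_url(host, path = '/', port = 80)
  http = Net::HTTP.new(host, port)
  http.open_timeout = 5
  http.read_timeout = 5
  resp = http.head(path)
  [classify_response(resp), resp['location']]
rescue StandardError => e
  [:unreachable, e.class.name]
end

# Usage (needs network access):
#   status, detail = check_url('www.ruby-lang.org', '/')
#   puts [status, detail].compact.join(' ')
```

Keeping the classification separate from the network call makes it possible to exercise the class check with hand-built response objects, without touching a server.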

···

On May 26, 2004, at 7:09 AM, Mohammad Khan wrote:


=~ matches a regex against a string and returns the offset of the start of
the match, or nil if there is no match.

The example below illustrates this:

···

##############################
#!/usr/bin/ruby

a = ['at', 'atx', 'xatx', 'xat', 'xyz']

puts "\nBefore delete_if array is : "
puts a.join("\n")

rgx = Regexp.new(/at/)

puts "\n"
a.each { |x| puts "offset of 'at' in #{x} is #{rgx =~ x}" }

# delete_if

a.delete_if { |x| rgx =~ x }

puts "\nAfter delete_if array is : "
puts a.join(":")

#############################
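One consequence of those return values is worth spelling out: a match at the very start of the string returns offset 0, and nil.to_i is also 0, so a test like `.to_i > 0` cannot tell "matched at the start" from "no match". The short sketch below shows the difference; the variable names are only for illustration.

```ruby
words = %w[at atx xatx xat xyz]
rgx = /at/

# Offset of the first match in each word, or nil when there is none.
offsets = words.map { |w| rgx =~ w }
p offsets                # [0, 0, 1, 1, nil]

# nil.to_i is 0, so this test skips matches that start at offset 0.
by_to_i = words.select { |w| (rgx =~ w).to_i > 0 }
p by_to_i                # ["xatx", "xat"]

# In Ruby 0 is truthy and only nil and false are falsy, so using the result
# of =~ directly picks up every match, including those at offset 0.
by_truthiness = words.select { |w| rgx =~ w }
p by_truthiness          # ["at", "atx", "xatx", "xat"]

# Non-destructive counterpart of delete_if with the same condition.
remaining = words.reject { |w| rgx =~ w }
p remaining              # ["xyz"]
```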


Hi –

···

On Wed, 26 May 2004, Zach Dennis wrote:


I’m not sure exactly what you’re trying to do. The examples do
different things. Can you give sample before and after arrays? And
what’s in attr?

David


David A. Black
dblack@wobblini.net

[Zach Dennis zdennis@mktec.com, 2004-05-26 16.14 CEST]

I am having trouble with my delete_if block.

prop_array.delete_if{ |p|
(p=~/#{attr}/).to_i > 0; }

[…]

It deletes every element, even when m == 0 or m == nil. Any ideas? Thanks,

Works here. Check the contents of prop_array and attr.

$ /usr/bin/ruby -ve '
prop_array= %w{ aax xaa }
attr="aa"
result = prop_array.delete_if{ |p|
(p=~/#{attr}/).to_i > 0; }
p result
'
ruby 1.8.1 (2004-02-03) [i386-linux]
["aax"]
$

I think I misunderstood you.

a = [ 'at', 'atx', 'xatx', 'xat', 'xyz' ]
rgx = Regexp.new(/at/)

a.delete_if { | x |
(rgx =~ x).to_i > 0
}

# after this, my 'a' is

["at", "atx", "xyz"]

which is logical, and I don’t see anything odd.

What is #{attr} on your end?

···


Thanks for everyone’s response. My problem was that I needed to switch the
‘p’ and the ‘attr’ so it was:

attr =~ /#{p}/

attr is something like: “Component.behavior”
p would be something like: “behavior”

Zach
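With the operands switched, the block asks whether the property name occurs inside attr. A small sketch of that, using the sample values given above plus two made-up property names ("color" and "name") for contrast. Wrapping p in Regexp.escape is an added precaution, not something from the original code: it keeps characters such as "." in a property name from being read as regex metacharacters.

```ruby
attr = 'Component.behavior'
prop_array = %w[behavior color name]   # "color" and "name" are hypothetical

# Delete every property whose name occurs somewhere inside attr.
remaining = prop_array.dup
remaining.delete_if { |p| attr =~ /#{Regexp.escape(p)}/ }
p remaining          # ["color", "name"]

# The same filter without a regex, for plain substring checks.
remaining_plain = prop_array.reject { |p| attr.include?(p) }
p remaining_plain    # ["color", "name"]
```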

···


