How do I parse a string to find a URL?

Is there a command in Ruby that will accept a string and spit out a
URL that is contained in the string? I think I remember reading about
something that would do this, but I can't recall.

URI::extract

http://ruby-doc.org/core/classes/URI.html#M004839
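For anyone finding this later, here is a minimal sketch of calling URI.extract. The sample text and the example.com / example.org hosts are made up for illustration. The optional second argument restricts the result to the listed schemes. Note that newer Ruby documentation marks URI.extract as obsolete, so treat this as a sketch rather than a recommendation.

```ruby
require "uri"

# Sample text invented for this example; the hosts are placeholders.
text = "Docs are at http://www.example.com/docs and files at ftp://ftp.example.org/pub/ today"

# With no scheme list, every absolute URI found in the string is returned.
all_urls = URI.extract(text)

# Passing a list of schemes keeps only URIs that start with one of them.
web_urls = URI.extract(text, ["http", "https"])

puts all_urls.inspect
puts web_urls.inspect
```

Passing the scheme list is usually the simplest way to avoid picking up stray "word:" fragments from ordinary prose.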

···

On 9/17/07, Jayson Williams <williams.jayson@gmail.com> wrote:

Outstanding!
Thanks

···

Jano Svitok wrote:

Is there a command in Ruby that will accept a string, and spit out a
URL that is contained in the string? I think I remember reading about
something that would do this, but I can't recall.

URI::extract

http://ruby-doc.org/core/classes/URI.html#M004839

Wow, I didn't know about that, very nice. But it has a few weaknesses:
   >> URI.extract("behold: www.abc.com and http://www.xyz.com.")
   => ["behold:", "http://www.xyz.com."]
(notice the period at the end of xyz.com)

Daniel

···

On 9/17/07, Jayson Williams <williams.jayson@gmail.com> wrote:

Not a weakness. In that string, 'behold:' is a valid URI: it has a
scheme followed by the scheme delimiter (":"). "www.abc.com" is not an
unambiguous URI, since no scheme is present.
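A quick way to see the distinction is to hand both strings to URI.parse and look at the scheme. This is only a sketch using the two fragments from the example above; the variable names are mine.

```ruby
require "uri"

# "behold:" parses as a URI whose only component is the scheme.
bare_scheme = URI.parse("behold:")

# "www.abc.com" parses as a relative reference: no scheme, just a path.
no_scheme = URI.parse("www.abc.com")

puts bare_scheme.scheme.inspect
puts no_scheme.scheme.inspect
puts no_scheme.path.inspect
```

So the parser treats "www.abc.com" as a relative path rather than a host name, which is why URI.extract does not report it.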

···

On Sep 17, 7:46 pm, Daniel DeLorme <dan...@dan42.com> wrote:

Jano Svitok wrote:
> On 9/17/07, Jayson Williams <williams.jay...@gmail.com> wrote:
>> Is there a command in Ruby that will accept a string, and spit out a
>> URL that is contained in the string? I think I remember reading about
>> something that would do this, but I can't recall.

> URI::extract

>http://ruby-doc.org/core/classes/URI.html#M004839

Wow, I didn't know about that, very nice. But it has a few weaknesses:
   >> URI.extract("behold: www.abc.com and http://www.xyz.com.")
   => ["behold:", "http://www.xyz.com."]
(notice the period at the end of xyz.com)

Daniel

Also, the trailing period is legal in a URI.

···

franco wrote:

Not a weakness. In that string, 'behold:' is a valid URI: it has a
scheme followed by the scheme delimiter (":"). "www.abc.com" is not an
unambiguous URI, since no scheme is present.

Is it a valid uri if nothing is present after the scheme? Anyway, I know that the results are technically valid but they are less than useful if you want, say, to extract and "linkify" urls that users might have written inside a message. (which is what I assumed the OP wanted but I might have been mistaken)

Daniel

You could just select the ones with a scheme-specific part. Or
screen-scrape for //a/@href to get all hyperlinked anchors (links).
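One way the first suggestion might look in practice, as a rough sketch: extract_links is a hypothetical helper, not part of the URI library. It drops results that are nothing but "scheme:" and trims trailing sentence punctuation. The trimming is a heuristic of my own and will clip a URL that legitimately ends in one of those characters.

```ruby
require "uri"

# Hypothetical helper (not part of the URI library).
def extract_links(text)
  URI.extract(text)
     # Drop candidates that consist of a scheme and a colon with nothing after it.
     .reject { |candidate| candidate.match?(/\A[A-Za-z][A-Za-z0-9+.-]*:\z/) }
     # Trim trailing punctuation that is legal in a URI but usually belongs to the sentence.
     .map { |candidate| candidate.sub(/[.,;:!?]+\z/, "") }
end

links = extract_links("behold: www.abc.com and http://www.xyz.com.")
puts links.inspect  # prints ["http://www.xyz.com"]
```

This still ignores "www.abc.com", since without a scheme it is never extracted in the first place.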
