Problem with a regular expression

I have the following code snippet:

    require 'net/http'

    begin
      hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

      re = /<TD>(.*)</TD>/
      if hdoc =~ re
        print "#{$&}\n"
      else
        print "Nothing\n"
      end
    end

The regular expression never matches when I run the code as shown
above (the expression in re is just a simple one for my testing).
However, if I replace the variable hdoc with a string like
"<TD>Test</TD>Test1", the regular expression matches. The type of
hdoc is String. What is wrong with the snippet above? I even tried
replacing hdoc with hdoc.to_s, and it still doesn't work.
Thanks for your help!

Charles

------
http://radio.weblogs.com/0111823/
http://charlesnadeau.blogspot.com/

It looks like there are no uppercase "TD" tags in the page that you are
fetching. Try this instead:

  require 'net/http'

  begin
    hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))

    re = /<TD>(.*)<\/TD>/i
    if hdoc =~ re
      print "#{$&}\n"
    else
      print "Nothing\n"
    end
  end

Your regular expression was case sensitive; I made it case
insensitive by adding the "i" switch. The slash in "</TD>" is also
escaped so that it does not end the regex literal early.

--
Aaron Patterson
http://tenderlovemaking.com/
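
For anyone who wants to see the effect of the "i" switch without hitting the network, here is a minimal sketch. The sample markup and the constant names are made up for illustration and are not the real Yahoo response. It also uses the %r{...} literal form, which is one way to avoid escaping the slash in the closing tag.

```ruby
# Stand-in for the fetched page; hypothetical markup, not the real response.
SAMPLE_HTML = '<tr><td>Name:</td><td>Type:</td></tr>'

# With %r{...} the slash in </TD> does not need a backslash.
CASE_SENSITIVE   = %r{<TD>(.*)</TD>}
CASE_INSENSITIVE = %r{<TD>(.*)</TD>}i

p(SAMPLE_HTML =~ CASE_SENSITIVE)    # nil: the sample has no uppercase <TD>
p(SAMPLE_HTML =~ CASE_INSENSITIVE)  # 4: index where the first <td> starts
puts $&                             # text matched by the last successful match
```

Note that =~ returns the index of the match (or nil), so it works directly as the condition of an if.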

When I substitute '\/TD' for '/TD' and make the regex case-insensitive, I get a match. See below:

<code>
#!/usr/bin/env ruby -w
require 'net/http'
hdoc = Net::HTTP.get(URI.parse('http://finance.yahoo.com/lookup?s=Dupont&t=S&m=US'))
re = /<TD>(.*)<\/TD>/i ### note changes
if hdoc =~ re
    puts "#{$&}\n"
else
    puts "Nothing\n"
end
</code>

<result>
<td><table border="0" cellpadding="6" width="100%" cellspacing="0"><tr><td bgcolor="#556f93"><big><b style="color:#ffffff">Symbol Lookup </b></big></td></tr></table></
</tr><tr><td></td></tr></table></td></tr><tr><td><table
cellpadding="0" border="0" cellspacing="0"><tr><td></td></tr></
</td></tr><tr><td valign="top"><form><table border="0"
cellpadding="4" bgcolor="a0b8c8" cellspacing="1"><tr><td bgcolor="eeeeee"><table cellpadding="1" width="100%" cellspacing="0"><tr><td>Name:</td><td>Type:</td><td>Market:</td><td></
</tr><tr><td><input size="30" name="s"></td><td><select
name="t"><option selected value="S"> Stocks </option><option value="E"> ETFs </option><option value="I"> Indices </option><option value="M"> Mutual Funds </option><option value="F"> Futures </
</select></td><td><select name="m"><option selected
value="US">U.S. & Canada</option><option value="ALL">World Market</
</select></td><td><input value="Look Up" type="submit"></td></<tr><td valign="bottom" colspan="4"><small><a href="http://finance.yahoo.com/exchanges">View supported exchanges</a></small></
</tr></table></td></tr></table></form><table><tr><td
align="left">2 results for <b>'Dupont'</b> (type=<b>Stocks</b>, market=<b>U.S. &amp; Canada</b>)</td>
</result>

Regards, Morton
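
The result above is so long because (.*) is greedy: it runs from the first <td> on a line to the last </td> on that line. A small sketch of the difference between the greedy and the lazy form follows; the ROW string and the constant names are made up for illustration.

```ruby
# Hypothetical one-line table row, loosely modelled on the output above.
ROW = '<tr><td>Name:</td><td>Type:</td><td>Market:</td></tr>'

GREEDY = /<td>(.*)<\/td>/i    # .* takes as much as it can
LAZY   = /<td>(.*?)<\/td>/i   # .*? stops at the first closing tag

puts ROW[GREEDY, 1]           # Name:</td><td>Type:</td><td>Market:
puts ROW[LAZY, 1]             # Name:
p ROW.scan(LAZY).flatten      # ["Name:", "Type:", "Market:"]
```

String#scan with the lazy pattern is a quick way to collect every cell, as long as the cells are not nested; for nested tables a real HTML parser is the safer choice.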

              re = /<TD>(.*)</TD>/

Why not use an HTML parser? Hpricot has been considered sexy recently.

David Vallner

Morton, Aaron,

You are both right, thanks a lot! I also added "m" at the end of the
regular expression so that the match can span more than one line.
Cheers!

Charles
------
http://radio.weblogs.com/0111823/
http://charlesnadeau.blogspot.com/
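
A short sketch of what the "m" switch changes, assuming a cell whose text contains a line break. In Ruby, "m" makes the dot match newlines as well; without it the dot stops at the end of the line. The CELLS string and the constant names are made up for illustration.

```ruby
# Hypothetical markup: the first cell spans two lines, the second does not.
CELLS = "<td>first line\nsecond line</td><td>DD</td>"

SINGLE_LINE = /<td>(.*?)<\/td>/i    # . does not match "\n"
MULTI_LINE  = /<td>(.*?)<\/td>/im   # . matches "\n" too

p CELLS[SINGLE_LINE, 1]   # "DD": the first cell is skipped because of the newline
p CELLS[MULTI_LINE, 1]    # "first line\nsecond line"
```

With "m" and a greedy (.*) the match can swallow the rest of the page, so the lazy (.*?) form is usually the better companion for this switch.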
