Copying $1.. A loop with two regular expressions

Hi, sorry if this is a stupid question. I've only been programming Ruby
for about six hours.

I'm trying to write a loop to parse through a webpage and get all the
links to other pages. This loop depends on a regular expression to find
all the <a href tags, but inside the loop there is another regular
expression which checks whether the link is relative or static. The
problem is that the inner regular expression changes the $1 variable, so
the loop just fails on the first iteration. I've tried making a copy of
the $1 variable, but the result just ends up containing nil.

Any help you could offer would be greatly appreciated.

Here's my code so far:

      loop do
        url = $1
        puts $1    # A url
        puts $url  # Always nil ?

        if $1 =~ /^http/  # Inner regular expression
          new_url = host + path
        else
          new_url = path
        end

        newPage = WebPage.new(new_url, link_depth + 1)

        break unless url =~ @@ahref_filter
      end

···

--
Posted via http://www.ruby-forum.com/.

Oky wrote:

Hi, sorry if this is a stupid question. I've only been programming Ruby for about six hours.

I'm trying to write a loop to parse through a webpage and get all the links to other pages. This loop depends on a regular expression to find all the <a href tags, but inside the loop there is another regular expression which checks whether the link is relative or static. The problem is that the inner regular expression changes the $1 variable, so the loop just fails on the first iteration. I've tried making a copy of the $1 variable, but the result just ends up containing nil.

Are you trying to use a command line argument? If so, try ARGV[0]
instead of $1 (in Ruby, $1 is a special variable storing the text of
the first subexpression in the most recent match).

Hal

Hi there,

Oky wrote:

Hi, sorry if this is a stupid question. I've only been programming Ruby for about six hours.

I'm trying to write a loop to parse through a webpage and get all the links to other pages. This loop depends on a regular expression to find all the <a href tags, but inside the loop there is another regular expression which checks whether the link is relative or static. The problem is that the inner regular expression changes the $1 variable, so the loop just fails on the first iteration. I've tried making a copy of the $1 variable, but the result just ends up containing nil.

Any help you could offer would be greatly appreciated.

Here's my code so far:

      loop do
         url = $1
        puts $1 #A url
        puts $url #Always nil ?
  

Here's _a_ problem for you to start with. You've assigned the value of 1 to the local variable url. When you puts $url, you are examining and printing the global variable $url. Since you haven't assigned anything to $url yet, it is nil.

In Ruby an unadorned name like "url" is either a local variable or method call. Putting a "$" on the front of something tells Ruby that you want to refer to a global.

So try changing the "$url" to "url" and see what happens.

I hope it's something good.

Matthew
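
To make the local-versus-global point concrete, here is a minimal sketch. The sample tag and the names local_value and global_value are made up for illustration and are not from the original code.

```ruby
# A local and a global with similar names are unrelated variables.
'<a href="page.html">' =~ /href="([^"]*)"/
url = $1               # local variable: copies the captured text

local_value  = url     # the copy made above
global_value = $url    # a different variable; it was never assigned, so it is nil

puts local_value.inspect
puts global_value.inspect
```

Running Ruby with warnings enabled (ruby -w) should also flag the read of the unassigned global, which can help catch this kind of slip.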

Hi Hal,

Thank you for your reply.

No, I'm not trying to use a command line argument, just a loop which
uses two regular expressions. After having a fresh look at the code I
spotted my mistake (url = $1 should have been url = $') and it seems to
work now. Although I still don't understand why I can't copy the $1
variable (doller_one = $1 equals nil?), it doesn't matter too much now,
as Ruby lets you copy the $' variable.

Thank you very much for taking the time to reply.

Oky
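
On the open question of copying $1: it can be copied like any other value, provided the copy is made right after the match that set it and before the next match runs. Every later =~ replaces the match data, so $1 becomes nil after a failed match or after a match against a pattern with no capture groups. The sketch below assumes a made-up sample page and a simplified pattern; ahref_filter here is a local stand-in for the @@ahref_filter in the post, and html, rest, and links are illustrative names.

```ruby
# Simplified stand-in for the link pattern; real HTML needs a more forgiving one.
ahref_filter = /<a\s+href="([^"]*)"/i

html = '<a href="http://example.com/a">A</a> <a href="b.html">B</a>'

links = []
rest = html
while rest =~ ahref_filter
  link = $1    # copy the capture immediately after the outer match
  rest = $'    # text after the match, also copied before any other match runs

  # The inner match replaces the match data: this pattern has no groups,
  # so $1 is nil from here on, but the local copy in link is unaffected.
  absolute = (link =~ /^http/) ? true : false

  links << [link, absolute]
end

# String#scan collects every capture without touching $1 or $' by hand.
scanned = html.scan(ahref_filter).flatten

puts links.inspect
puts scanned.inspect
```

If the inner test only needs a yes/no answer, link.start_with?("http") avoids running a second regular expression at all.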


Matthew Desmarais wrote:

Here's _a_ problem for you to start with. You've assigned the value of 1 to the local variable url. When you puts $url, you are examining and printing the global variable $url. Since you haven't assigned anything to $url yet, it is nil.

Huh.

Try "You've assigned the value of $1 to the local variable url." That'll make more sense (maybe).

Sheesh. Sorry. ;)

Thanks Matthew,

You're absolutely right. A schoolboy mistake from me :S I come from a
C++/Asm background and haven’t got the hang of all these undefined
variables yet.

Thanks again for your reply


Oky wrote:

Thanks Matthew,

You're absolutely right. A schoolboy mistake from me :S I come from a C++/Asm background and haven’t got the hang of all these undefined variables yet.

Thanks again for your reply
  

No problem! I'm glad that I could help.

Have fun!