Koans: About_Regexp Question

Hi,

As the title implies, I'm pretty new to Ruby (and programming) and I've
got a question regarding regular expressions in the above Koans
exercises, specifically with the "sub" and "gsub" methods.

What I would like to know is: is my thinking correct here:

def test_sub_is_like_find_and_replace
  assert_equal ["one t-three"], "one two-three".sub(/(t\w*)/) { $1[0, 1] }
end

So, as I see it, the "sub" method is looking for the first "t" in the
string. The hash says that it will replace [0, 1] elements of the string
following the "t" ("w" and "o" respectively) with "$1", which has not
been defined, meaning it will take these letters out of the string.

The gsub exercise is the same code, but with gsub instead of sub, so in
that instance it looks to find and replace the [0, 1] elements following
*every* "t" in the string.

I think I'm way off base here, but any advice you can lend me will be
greatly appreciated.

Cheers!

···

--
Posted via http://www.ruby-forum.com/.

So, as I see it, the "sub" method is looking for the first "t" in the
string.

Almost, it looks for the first t and all "word"-characters after the t.

The hash says that it will replace [0, 1] elements of the string
following the "t" ("w" and "o" respectively) with "$1", which has not
been defined, meaning it will take these letters out of the string.

The thing in curly braces is a block, not a hash.
The $1 is a variable which contains the first group of the last regexp
match (in this example $1="two").
So the block says: replace the word "two" with its first letter.
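
To tie this back to the gsub part of the question, here is a minimal sketch that passes the same block to both methods. The variable names are only for illustration and are not part of the Koans.

```ruby
text = "one two-three"

# sub replaces only the first match ("two") with the block's return value.
first_only = text.sub(/(t\w*)/) { $1[0, 1] }

# gsub runs the block once per match ("two", then "three").
every_match = text.gsub(/(t\w*)/) { $1[0, 1] }

puts first_only   # one t-three
puts every_match  # one t-t
```

So gsub does not touch "the letters after every t" one by one; it replaces each whole matched word with whatever the block returns for that match.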

Actually not: your interpretation leads to the correct result although
you are a bit off in some details:

- What you see at the end of the line is not a Hash but a block passed
to the method call #sub.

- The block comes only into play when matching has finished (see below).

- Matching does not look for the first "t" _only_, instead it will
match the first "t" followed by as many "word" characters as possible
(this is what the "\w*" is for).

- $1 *is* defined, actually the value in $1 is a side effect of the
matching which takes place in #sub.

- $1 is filled from everything which is matched by the first capturing
group in the regexp (there is also $2, $3 etc.). Capturing groups are
denoted by "(...)"; there are also non-capturing groups denoted by
"(?:...)" and a few other variants. In this case the capturing group
happens to span the whole expression, which is redundant
because usually you use capturing groups to extract _parts_ from a
match. For the whole expression there is $& already, and in the case
of #sub and #gsub (and #scan, when the regexp has no capturing groups)
it is also passed as the block argument.

- The expression $1[0,1] denotes the part of $1 which is used for the
replacement (literally the first character). You can as well use
$1[0] here.

- Basically using $1 in the block is superfluous here since you know
the first character is always "t", so you can use it literally.
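
The points about capturing groups, $& and the block argument can be sketched as a short script. This is only an illustration; the regexps with two groups and the variable names are assumptions made up for the example, not something from the Koans.

```ruby
text = "one two-three"

# Two capturing groups: $1 is the leading "t", $2 is the rest of the word.
result = text.sub(/(t)(\w*)/) { "#{$1.upcase}#{$2}" }
puts result  # one Two-three

# A non-capturing group (?:...) groups without filling a numbered variable,
# so here $1 refers to the only capturing group, (\w*).
rest = nil
text.sub(/(?:t)(\w*)/) { rest = $1; $& }
puts rest  # wo

# $& and the block argument both hold the whole match.
seen = []
text.gsub(/t\w*/) { |m| seen << [m, $&]; m }
p seen  # [["two", "two"], ["three", "three"]]
```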

In IRB

irb(main):006:0> "one two-three".sub(/(t\w*)/) { $1[0,1] }
=> "one t-three"
irb(main):007:0> "one two-three".sub(/(t\w*)/) { $1[0] }
=> "one t-three"
irb(main):008:0> "one two-three".sub(/(t\w*)/) { 't' }
=> "one t-three"
irb(main):009:0> "one two-three".sub(/t\w*/) { 't' }
=> "one t-three"

irb(main):017:0> "one two-three".sub(/t\w*/) {|m| m[0,1] }
=> "one t-three"
irb(main):018:0> "one two-three".sub(/t\w*/) {|m| m[0] }
=> "one t-three"

Other simplifications:

irb(main):010:0> "one two-three".sub(/(t)\w*/) { $1 }
=> "one t-three"
irb(main):011:0> "one two-three".sub(/(?<=t)\w*/) { '' }
=> "one t-three"
irb(main):012:0> "one two-three".sub(/(?<=t)\w*/, '')
=> "one t-three"

Lines 11 and 12 use lookbehind. Lookbehind and lookahead ensure
something is there but do not include it in the match. So in this
case we replace the run of word characters (of any length) that follows
a "t" with the empty string, effectively removing it.
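
For comparison, lookahead works the same way on the other side of the match. A minimal sketch, where the lookahead pattern and the 'X' replacement are made up for the example:

```ruby
text = "one two-three"

# Lookbehind (?<=t): match word characters only where a "t" comes right before.
after_t = text.sub(/(?<=t)\w*/, '')
puts after_t  # one t-three

# Lookahead (?=-): match word characters only where a "-" comes right after.
# The "-" is checked but not consumed, so it survives the replacement.
before_dash = text.sub(/\w+(?=-)/, 'X')
puts before_dash  # one X-three
```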

You can even use String#[]= to immediately assign.

irb(main):014:0> s = "one two-three"
=> "one two-three"
irb(main):015:0> s[/(?<=t)\w*/]=''
=> ""
irb(main):016:0> s
=> "one t-three"

etc.
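
One more variation along the same lines, as a rough sketch: String#[] and String#[]= also accept a capture group index after the regexp, so you can read or overwrite just one group. The two-group regexp and the variable names are assumptions for the example.

```ruby
s = "one two-three"

# Read: the whole match, then only the second capture group.
whole = s[/(t)(\w*)/]     # "two"
tail  = s[/(t)(\w*)/, 2]  # "wo"

# Write: overwrite only what group 2 matched, modifying s in place.
s[/(t)(\w*)/, 2] = ''
puts s  # one t-three
```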

Kind regards

robert

···

On Fri, Dec 23, 2011 at 9:39 AM, Russell Whittington <russell.whittington@gmail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Thanks for the excellent replies, gentlemen, and apologies for the
rookie error with regards to calling the block a hash.

Robert, those examples at the end of your reply are fascinating, and
show that I have a long way to go before I've mastered any of this -
though the Koans are helping me find my footing!

Thanks again!


-----Original Message-----
From: Russell Whittington [mailto:russell.whittington@gmail.com]
Sent: Tuesday, December 27, 2011 15:22
To: ruby-talk ML
Subject: Re: Koans: About_Regexp Question

