String.gsub with regex and block

Alexey_Petrushin · 7 April 2011 15:25

Probably a stupid question, but is there a way to use :gsub replacement
without $0 $1 $2 $3 (and without "\0\1\2\3")?

I would prefer something like:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end


--
Posted via http://www.ruby-forum.com/.

Reid_Thompson1 · 7 April 2011 15:36

is it a requirement that you use gsub?

irb(main):008:0> name, family = "John Smith".split
=> ["John", "Smith"]
irb(main):009:0> p [name, family]
["John", "Smith"]
=> nil


On Fri, 2011-04-08 at 00:25 +0900, Alexey Petrushin wrote:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end

Brian_Candler · 7 April 2011 15:50

Alexey Petrushin wrote in post #991484:

Probably a stupid question, but is there a way to use :gsub replacement
without $0 $1 $2 $3 (and without "\0\1\2\3")?

There is also $~ (Regexp.last_match); $1/$2/etc are just a facade.

I would prefer something like:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end

"John Smith".gsub /(.+)\s(.+)/ do
  name, family = $~.captures
  p [name, family]
end

Not pretty, but you can wrap it up in your own method:

class String
  def gsubcap(*arg)
    gsub(*arg) { yield $~.captures }
  end
end

"John Smith".gsubcap /(.+)\s(.+)/ do |name, family|
  p [name, family]
end
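
The "facade" relationship between $~ and $1/$2 can be checked directly. This is only a small sketch, assuming Ruby 1.9 or later; the variable names are made up for illustration:

```ruby
# $~ holds the MatchData of the most recent match in the current scope;
# $1, $2, ... are shorthand for reading groups out of it.
"John Smith" =~ /(.+)\s(.+)/

match_data     = $~                     # what Regexp.last_match returns
first_capture  = $1                     # shorthand for $~[1]
second_capture = Regexp.last_match(2)   # shorthand for $~[2]
all_captures   = match_data.captures    # every group as an array

p first_capture == match_data[1]
p all_captures
```

Because $~ is scoped to the method that ran the match, a wrapper such as gsubcap has to read it inside its own block, which is what the definition above does.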


Brian_Candler · 7 April 2011 15:57

Or if you are a ruby 1.9 user, you could use named capture groups
instead. I'm not sure they make the regexp itself any clearer in this
case though:

"John Smith".gsub /(?<name>.+)\s(?<family>.+)/ do
  p [$~[:name], $~[:family]]
end
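
Named groups read a little better once the block actually returns a replacement. A minimal sketch, assuming Ruby 1.9 or later; the pattern uses \S+ rather than .+ purely for illustration:

```ruby
pattern = /(?<name>\S+)\s(?<family>\S+)/

# Block form: look the groups up by name on the last match.
swapped = "John Smith".gsub(pattern) do
  m = Regexp.last_match
  "#{m[:family]}, #{m[:name]}"
end

# String form: \k<group> refers to a named group in the replacement.
swapped_again = "John Smith".gsub(pattern, '\k<family>, \k<name>')

p swapped        # "Smith, John"
p swapped_again  # "Smith, John"
```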


7stud · 7 April 2011 18:10

Alexey Petrushin wrote in post #991484:

Probably a stupid question, but is there a way to use :gsub replacement
without $0 $1 $2 $3 (and without "\0\1\2\3")?

Where are you replacing anything?

I would prefer something like:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end

"John Smith".scan(/\S+/) do |match|
  puts match
end

--output:--
John
Smith
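
If the goal is only to read the captures and not to replace anything, scan already passes them to the block as parameters when the pattern contains groups. A small sketch; the sample string is made up for illustration:

```ruby
pairs = []

# With groups in the pattern, scan yields one array of captures per match,
# and the block parameters destructure that array.
"John Smith, Jane Doe".scan(/(\w+)\s(\w+)/) do |name, family|
  pairs << [name, family]
end

p pairs  # [["John", "Smith"], ["Jane", "Doe"]]
```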


jake_kaiden · 7 April 2011 20:57

Alexey Petrushin wrote in post #991484:

I would prefer something like:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end

i also wonder if gsub is necessary... there's no replacement here as
far as i can tell. my oversimplified monkey brain comes up with this:

irb(main):001:0> str = "John Smith - Minister of Funny Walks."
=> "John Smith - Minister of Funny Walks."
irb(main):002:0> arr = str.split(" ")
=> ["John", "Smith", "-", "Minister", "of", "Funny", "Walks."]
irb(main):003:0> name, family = arr[0], arr[1]
=> ["John", "Smith"]
irb(main):004:0> puts name
John
=> nil
irb(main):005:0> puts family
Smith
=> nil

this also makes some assumptions about the data, of course...

-j


Alexey_Petrushin · 8 April 2011 12:31

Thanks for the advice. I didn't know that $~ exposes the captures as an
array; adding a new 'substitute' method is probably the best solution.

This solution is not very generalizable. It only works as presented for
cases where all the stuff you want to discard looks the same.

No, it's no less generalizable than the $X variables; use a splat if you
have a varying number of matches:
"John Smith".substitute{|*tokens| ...}

And yes, the sample I provided was unclear: there was actually no
replacement in it. It should have been something like this:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      "#{name[0..0]}. #{family}"
    end

Here's the complete solution:

    class String
      def substitute(*args)
        gsub(*args){yield Regexp.last_match.captures}
      end

      def substitute!(*args)
        gsub!(*args){yield Regexp.last_match.captures}
      end
    end
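
One detail worth checking in the methods above: yield passes the captures as a single array. A block written as |name, family| destructures it, but |*tokens| receives the array nested one level deep, and a one-parameter block receives a one-element array. A hedged variant that splats the captures before yielding is sketched below; substitute_splat is a hypothetical name, written as a plain method rather than a String extension:

```ruby
# Hypothetical helper, not part of Ruby's String API.
def substitute_splat(str, *args)
  # Regexp.last_match is read inside this method's own block, where gsub sets it.
  str.gsub(*args) { yield(*Regexp.last_match.captures) }
end

seen = nil
joined = substitute_splat("John Smith", /(.+)\s(.+)/) do |*tokens|
  seen = tokens            # flat list of captures
  tokens.join("-")
end

# With a single group, the block parameter is the captured string itself.
initials = substitute_splat("John Smith", /(\w)\w*/) { |initial| "#{initial}." }

p seen      # ["John", "Smith"]
p joined    # "John-Smith"
p initials  # "J. S."
```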

Thanks for help!


Chad_Perrin · 7 April 2011 17:08

This solution is not very generalizable. It only works as presented for
cases where all the stuff you want to discard looks the same. I wouldn't
want to have to deal with this kind of thing for a particularly complex
case:

str = 'John Smith: A Good Man -- A Good Husband. RIP (1976)'

first, remainder = str.split(' ', 2)

last, non_name = remainder.split(': ', 2)

desc1, the_rest = non_name.split(' -- ', 2)

desc2, deceased = the_rest.split('. ', 2)

. . . so, is there some way to use more descriptive variable names than
the default $1, $2, et cetera, for captures from within a regex? I'm not
aware of any, but I too would find that agreeable.
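
Ruby 1.9's named groups are one answer to that question. As a sketch, the string from the split chain above can be taken apart with a single pattern; the group names and the pattern itself are assumptions chosen for this one sample, not a general parser:

```ruby
str = 'John Smith: A Good Man -- A Good Husband. RIP (1976)'

# /x lets the pattern span lines; literal whitespace is matched with \s.
pattern = /\A(?<first>\S+)\s(?<last>[^:]+):\s
           (?<desc1>.+?)\s--\s(?<desc2>.+?)\.\s(?<deceased>.+)\z/x

m = pattern.match(str)
parts = Hash[m.names.zip(m.captures)]
p parts

# With a regexp literal on the left of =~, named groups also become
# local variables.
if /\A(?<given>\S+)\s(?<surname>[^:]+):/ =~ str
  full_name = "#{given} #{surname}"
end
p full_name
```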


On Fri, Apr 08, 2011 at 12:36:23AM +0900, Reid Thompson wrote:

On Fri, 2011-04-08 at 00:25 +0900, Alexey Petrushin wrote:
> "John Smith".gsub /(.+)\s(.+)/ do |name, family|
> p [name, family]
>
> # instead of this
> p [$1, $2]
> end
is it a requirement that you use gsub?

irb(main):008:0> name, family = "John Smith".split
=> ["John", "Smith"]
irb(main):009:0> p [name, family]
["John", "Smith"]
=> nil

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

7stud · 7 April 2011 18:22

Brian Candler wrote in post #991490:

Alexey Petrushin wrote in post #991484:

Probably a stupid question, but is there a way to use :gsub replacement
without $0 $1 $2 $3 (and without "\0\1\2\3")?

There is also $~ (Regexp.last_match); $1/$2/etc are just a facade.

I would prefer something like:

    "John Smith".gsub /(.+)\s(.+)/ do |name, family|
      p [name, family]

      # instead of this
      p [$1, $2]
    end

"John Smith".gsub /(.+)\s(.+)/ do
  name, family = $~.captures
  p [name, family]
end

And if you want to avoid writing Perl-style code:

str = "John Smith"
pattern = /(.+)\s(.+)/

result = str.gsub(pattern) do
  md_obj = Regexp.last_match
  first_name, last_name = md_obj[1], md_obj[2]

  p first_name, last_name
end

Or to avoid any indexing, you could do this:

str = "John Smith"
pattern = /(.+)\s(.+)/

result = str.gsub(pattern) do |match|
  first_name, last_name = match.split
  p first_name, last_name

  "some replacement"
end

puts result

--output:--
"John"
"Smith"
some replacement


Phil · 7 April 2011 18:39

irb(main):001:0> "John;Smith".scan /\S+/ do |match|
irb(main):002:1* puts match
irb(main):003:1> end
John;Smith
=> "John;Smith"

Oops.

Better:

irb(main):04:0> "John;Smith".scan /\w+/ do |match|
irb(main):05:1* puts match
irb(main):06:1> end
John
Smith

The code still makes assumptions about the data, though: it assumes the
input is uniform, with the name always occupying the first n parts and
never n+1 or n-1.


On Thu, Apr 7, 2011 at 8:10 PM, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

"John Smith".scan(/\S+/) do |match|
  puts match
end

--
Phillip Gawlowski


Sergey_Avseyev · 8 April 2011 05:10

name, family, = arr


On Apr 7, 11:57 pm, jake kaiden <jakekai...@yahoo.com> wrote:

irb(main):001:0> str = "John Smith - Minister of Funny Walks."
=> "John Smith - Minister of Funny Walks."
irb(main):002:0> arr = str.split(" ")
=> ["John", "Smith", "-", "Minister", "of", "Funny", "Walks."]
irb(main):003:0> name, family = arr[0], arr[1]

Brian_Candler · 8 April 2011 07:20

Chad Perrin wrote in post #991517:

. . . so, is there some way to use more descriptive variable names than
the default $1, $2, et cetera, for captures from within a regex? I'm not
aware of any, but I too would find that agreeable.

Yes, ruby 1.9 has named capture groups. I posted an example earlier in
this thread.

