I need a string#all_indices method--is there such a thing?

In Ruby you can use String#index as follows:
str = "some text"
str.index(/t/)
=> 5

But what if I want to get all the indices for a regex in the string?
Is there a String#all_indices method?

I wrote the following, which works, but there must be a more elegant
way:

class String
  def all_indices(regex)
    indices = []
    index = 0
    # index will be nil upon first match failure; otherwise quit the loop
    # when index is equal to the string length
    while index && index < self.length
      index = self.index(regex, index)
      if index.is_a? Numeric # avoids getting a nil into the indices array
        indices << index
        index += 1
      end
    end
    indices
  end
end
p "this is a test string for the ts in the worldt".all_indices(/t/)
p "what is up with all the twitter hype".all_indices(/w/)
# >> [0, 10, 13, 16, 26, 30, 36, 45]
# >> [0, 11, 25]

scan

What about
class String
  def indices rgx, idx=0
    [].tap{ |r|
      loop do
        idx = index rgx, idx
        break unless idx
        r << idx
        idx += 1
      end
    }
  end
end

p "baaababbabbbba".indices( /a/ )

--
If you tell the truth you don't have to remember anything.
--
Samuel Clemens (some call him Mark Twain)

Does this do what you want?

class String
  def all_indices(reg)
    tmp,idx = [],[]
    (0...self.length).each{|x| tmp << self[x..-1]}
    tmp.each_with_index{|y,i| idx << i if y =~ /\A#{reg}/}
    idx
  end
end

p "this is a test string for the ts in the worldt".all_indices(/th/)
#> [0, 26, 36]

It may not be very fast for very long strings (I didn't check).
But for strings like your example it seems OK.

Harry

--
A Look into Japanese Ruby List in English

Facets has:

  def index_all(s, reuse=false)
    s = Regexp.new(Regexp.escape(s)) unless Regexp===s
    ia = []; i = 0
    while (i = index(s,i))
      ia << i
      i += (reuse ? 1 : $~[0].size)
    end
    ia
  end
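
To make the effect of the `reuse` flag concrete, here is a small sketch. It assumes the method quoted above is reopened on String as-is (Facets may ship it differently), and the sample string is only an illustration.

```ruby
class String
  # Copy of the method quoted above, reopened on String for this demo only.
  def index_all(s, reuse = false)
    s = Regexp.new(Regexp.escape(s)) unless Regexp === s
    ia = []
    i = 0
    while (i = index(s, i))
      ia << i
      # reuse=true steps one character, so overlapping matches are found;
      # otherwise the search resumes after the end of the current match.
      i += (reuse ? 1 : $~[0].size)
    end
    ia
  end
end

p "bananana".index_all("ana")        # non-overlapping: [1, 5]
p "bananana".index_all("ana", true)  # overlapping:     [1, 3, 5]
```

One thing to watch: with `reuse` left false, a pattern that can match the empty string (for example /x*/) never advances `i`, so the loop as written would not terminate.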

timr:

But what if I want to get all the indices for a regex
in the string? Is there an string#all_indices method?

How about the below?

class String
  def all_indices needle
    all = []
    offset = 0
    loop do
      i = index needle, offset
      break if i.nil?
      all << i
      offset = i + 1
    end
    all
  end
end

— Shot

···

--
It may look like I’m just sitting here doing nothing. But
I’m really actively waiting for all my problems to go away.

This is a bit simpler:
    class String
      def all_indices(substring)
        idx = 0
        indices = []
        loop do
          idx = index(substring, idx)
          break if idx.nil?
          indices << idx
          idx += 1
        end
        indices
      end
    end

    require 'test/unit'
    class TestAllIndices < Test::Unit::TestCase
      def test_it
        assert_equal(
          [0, 10, 13, 16, 26, 30, 36, 45],
          "this is a test string for the ts in the worldt".all_indices(/t/)
        )
        assert_equal(
          [0, 11, 25],
          "what is up with all the twitter hype".all_indices(/w/)
        )
        assert_equal(
          [12, 17, 26, 41],
          "the quick brown fox jumps over the lazy dog".all_indices('o')
        )
        assert_equal(
          [1, 3, 5],
          "bananana".all_indices('ana')
        )
      end
    end

--
Glenn Jackman
    Write a wise saying and your name will live forever. -- Anonymous

A lot of solutions have been given here. It would be nice to see a
test/benchmark matrix to compare them, if anyone is up to it.
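
A rough starting point for such a comparison, using the standard Benchmark module. This is only a sketch: the two helper names are hypothetical, they are written as plain methods rather than String extensions so both can coexist, and the input size and repeat count are arbitrary.

```ruby
require 'benchmark'

# Hypothetical helper: repeated String#index calls, stepping one character.
def indices_by_index(str, re)
  found = []
  pos = 0
  while (pos = str.index(re, pos))
    found << pos
    pos += 1
  end
  found
end

# Hypothetical helper: test an anchored pattern against every suffix.
def indices_by_anchor(str, re)
  anchored = /\A#{re}/
  (0...str.size).select { |i| str[i..-1][anchored] }
end

text = "this is a test string for the ts in the worldt" * 50

# Sanity check before timing: both approaches should agree.
raise "results differ" unless indices_by_index(text, /th/) == indices_by_anchor(text, /th/)

Benchmark.bm(8) do |bm|
  bm.report("index:")  { 5.times { indices_by_index(text, /th/) } }
  bm.report("anchor:") { 5.times { indices_by_anchor(text, /th/) } }
end
```

Timings will vary by Ruby version and input, so no numbers are given here.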

Scan gives you the matches, not the indices (which is what I need).

"this is a test for scan".scan(/t/)

=> ["t", "t", "t"]

Does this do what you want?

class String
def all_indices(reg)
   tmp,idx = [],[]
   (0...self.length).each{|x| tmp << self[x..-1]}
   tmp.each_with_index{|y,i| idx << i if y =~ /\A#{reg}/}
   idx
end
end

p "this is a test string for the ts in the worldt".all_indices(/th/)
#> [0, 26, 36]

Harry

Sorry, it looks like I had an unnecessary line in there.

class String
  def all_indices(reg)
    idx = []
    (0...self.length).each{|x| idx << x if self[x..-1] =~ /\A#{reg}/}
    idx
  end
end

p "this is a test string for the ts in the worldt".all_indices(/th/)
#> [0, 26, 36]
p "banana".all_indices(/ana/) #> [1, 3]

Harry

···

--
A Look into Japanese Ruby List in English

What about
class String
def indices rgx, idx=0
[].tap{ |r|
loop do
idx = index rgx, idx
break unless idx
r << idx
idx += 1
end
}
end
end

.tap?
You must have defined a tap method for Array somewhere, but not in the
code you showed. I can't run the code without a definition for tap.
Thanks,
Tim

Oh, tap is new in 1.9. Sorry, I hadn't come across it before and was
in 1.8.6 so it wasn't running. Got it now.
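
For anyone on an interpreter without tap, here is a sketch of the same loop written without it. The method name `indices_without_tap` is made up for this example. Object#tap simply yields its receiver to the block and then returns the receiver, so an explicit local array does the same job.

```ruby
class String
  # Hypothetical name; same idea as the tap-based version, minus tap.
  def indices_without_tap(rgx, idx = 0)
    result = []
    # index returns nil once no further match exists, which ends the loop
    while (idx = index(rgx, idx))
      result << idx
      idx += 1
    end
    result
  end
end

p "baaababbabbbba".indices_without_tap(/a/)  # [1, 2, 3, 5, 8, 13]
```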

Hi,

···

On Saturday, 29 Aug 2009, 11:38:19 +0900, 7rans wrote:

On Aug 28, 4:25 am, timr <timra...@gmail.com> wrote:

A lot of solutions have been given here. It would be nice to see a
test/benchmark matrix to compare them, if anyone is up to it.

Sure, I agree. But my solution was only meant to show an aspect of
String#scan; it was not meant for practical use.

Bertram

--
Bertram Scharpf
Stuttgart, Deutschland/Germany
http://www.bertram-scharpf.de

Sorry, bad idea.

Hi,

Scan gives you the matches, not the indices (which is what I need).

>> "this is a test for scan".scan(/t/)
=> ["t", "t", "t"]

There's a trick to do it with String#scan:

  a = []
  "this is a test for scan".scan( /t/) { a.push $`.length }
  a

This does not work when the matches overlap.

  "banana".scan /ana/ #=> ["ana"]

Bertram

···

On Friday, 28 Aug 2009, 17:40:05 +0900, timr wrote:

This works and the code is more concise than what I had, but it is a
brute-force approach that tests for a match at every possible starting
position. That would be slow on long strings.

Sorry, I am an unconditional one-niner. I really should be more careful
to mark 1.9-only features with comments. At least for some more weeks
;)

Hi --

class String
def all_indices(reg)
   idx = []
   (0...self.length).each{|x| idx << x if self[x..-1] =~ /\A#{reg}/}
   idx
end
end

Might as well let #select do the choosing:

   def all_indices(re)
     (0...size).select {|i| self[i..-1][/\A#{re}/] }
   end

And maybe better to create the regex only once:

   def all_indices(re)
     re = /\A#{re}/
     (0...size).select {|i| self[i..-1][re] }
   end
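
A side note on the interpolation: if the argument is a String rather than a Regexp, `/\A#{re}/` treats its characters as pattern syntax. The sketch below is one possible way to accept both, assuming you want String arguments matched literally; it relies on Regexp.union, which escapes a String and returns a single Regexp unchanged.

```ruby
class String
  def all_indices(needle)
    # Regexp.union escapes a String argument and passes a Regexp through as-is.
    re = /\A#{Regexp.union(needle)}/
    (0...size).select { |i| self[i..-1][re] }
  end
end

p "a.b.c".all_indices(".")   # literal dots only: [1, 3]
p "a.b.c".all_indices(/./)   # any character:     [0, 1, 2, 3, 4]
```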

David

···

On Fri, 28 Aug 2009, Harry Kakueki wrote:

--
David A. Black / Ruby Power and Light, LLC / http://www.rubypal.com
Ruby/Rails training, mentoring, consulting, code-review
Latest book: The Well-Grounded Rubyist (http://www.manning.com/black2)

September Ruby training in NJ has been POSTPONED. Details to follow.

Bertram Scharpf wrote:

Hi,

Scan gives you the matches, not the indices (which is what I need).

"this is a test for scan".scan(/t/)

=> ["t", "t", "t"]

There's a trick to do it with String#scan:

  a = []
  "this is a test for scan".scan( /t/) { a.push $`.length }
  a

This does not work when the matches overlap.

  "banana".scan /ana/ #=> ["ana"]

Bertram

Same difficulty with overlap, but for variety:

class String
   def all_indexes re
     a=[];scan(re) {a<<$~.begin(0)};a
   end
end

p "foo bar baz".all_indexes(/.../)
p "banana".all_indexes(/ana/)

__END__

Output:

[0, 3, 6]
[1]
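
The overlap limitation shown in the output above can be sidestepped by wrapping the pattern in a zero-width lookahead, so that scan consumes no characters and moves on one position at a time. This is a sketch rather than anything posted in the thread, and `overlapping_indices` is a hypothetical name.

```ruby
class String
  # Hypothetical helper: scan with a lookahead so matches may overlap.
  # Note that a String argument is interpolated as pattern syntax here.
  def overlapping_indices(re)
    positions = []
    scan(/(?=#{re})/) { positions << $~.begin(0) }
    positions
  end
end

p "banana".overlapping_indices(/ana/)                 # [1, 3]
p "this is a test for scan".overlapping_indices(/t/)  # [0, 10, 13]
```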

···

On Friday, 28 Aug 2009, 17:40:05 +0900, timr wrote:

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

This is not fast enough?

class String
  def all_indices(reg)
    idx = []
    (0...self.length).each{|x| idx << x if self[x..-1] =~ /\A#{reg}/}
    idx
  end
end

p ("this is a test string for the ts in the worldt"*1000).all_indices(/th/)

I guess you are processing some big strings.
Speed is not what you asked for.
Well, until now :)

Harry
