String: iterate through regex matches with position

Hi,

First of all, sorry if this is a duplicate question (I have scanned through
the latest answers regarding regex and didn't get any ideas).

I am scanning a string in order to detect correctly formed "records" in it.
A correct record is an "SP" mark followed by zero or more "NL" marks and an
ending "EP" mark.
If we find two EPs without an SP in between, two SPs without an EP in
between, or a mark other than "NL" between the SP and EP marks, the
record is invalid.

"BS HD SP SP EP SP NL EP EP FT BS"

We have the following records:

- SP EP

- SP NL EP

I scan through them and I am able to retrieve them with:

string.scan(/(SP)\s((?:NL\s)*)(EP)/)

But I am not getting the start and end position of each match inside the
string (which I need to retrieve data from another place).

Is there any way to scan the string for matches and also get the index
position?

Maybe I should not even be using scan?

Thanks for your help and time.

Regards,
V.

Hi,

The MatchData object in $~ has an "offset" method that returns the start
and end offsets of a capture group. However, I don't understand why you
capture "SP" and "EP".

string.scan(/SP\s((?:NL\s)*)EP/) do
  # offset(1) is [start, end] of group 1 (the NL run); offset(0) is the whole match
  p $~.offset(1)
end
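
If the position of the whole record is wanted rather than that of the NL group, group 0 can be asked for instead. A minimal sketch, assuming the sample string from the question; `record_offsets` and `SAMPLE` are names made up for this illustration:

```ruby
# Sample input taken from the original question.
SAMPLE = 'BS HD SP SP EP SP NL EP EP FT BS'

# Hypothetical helper: returns [start, end] pairs (end is exclusive)
# for every record matched by the pattern used in this reply.
def record_offsets(string)
  offsets = []
  string.scan(/SP\s((?:NL\s)*)EP/) do
    # $~ is the MatchData of the current match; group 0 is the whole match.
    offsets << $~.offset(0)
  end
  offsets
end

p record_offsets(SAMPLE)   # => [[9, 14], [15, 23]]
```

Since the end offset is exclusive, `SAMPLE[9...14]` gives back the first record.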

--
Posted via http://www.ruby-forum.com/.

See Jan's reply for obtaining the position.

> Maybe I should not even be using scan?

If you need to process the content in between then you could also use #split:

m = string.split(/((?:SP)\s(?:NL\s)*EP)/)

(When the pattern given to #split contains capturing groups, the captured
text is included in the resulting array.)
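
To illustrate the #split behaviour, here is a small sketch on the sample string from the question. `records_with_positions`, `SPLIT_SAMPLE` and `RECORD_RE` are hypothetical names, and the position bookkeeping is only one possible way to recover the start index from the split pieces:

```ruby
# Sample input and pattern; the whole record is wrapped in one capture group.
SPLIT_SAMPLE = 'BS HD SP SP EP SP NL EP EP FT BS'
RECORD_RE = /(SP\s(?:NL\s)*EP)/

# Hypothetical helper: with exactly one capture group, split alternates
# between surrounding text (even indices) and records (odd indices).
# Summing the lengths of the earlier pieces gives each record's start.
def records_with_positions(string, pattern)
  position = 0
  found = []
  string.split(pattern).each_with_index do |part, index|
    found << [part, position] if index.odd?
    position += part.length
  end
  found
end

p SPLIT_SAMPLE.split(RECORD_RE)
# => ["BS HD SP ", "SP EP", " ", "SP NL EP", " EP FT BS"]
p records_with_positions(SPLIT_SAMPLE, RECORD_RE)
# => [["SP EP", 9], ["SP NL EP", 15]]
```

This keeps the text between records available, which scan alone does not return.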

Kind regards

robert

On Tue, Sep 11, 2012 at 4:44 PM, Vicente Bosch <vbosch@gmail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Vicente Bosch wrote in post #1075485:

> Is there any way to scan the string for matches where I get the index
> position?

str = 'BS HD SP SP EP SP NL EP EP FT BS'

str.scan(/
    SP
    \s*
    (?:NL\s*)*   # each NL may be followed by whitespace, so runs like "NL NL" match
    EP
/xms) do |match|
  md = Regexp.last_match
  puts "#{match.inspect} => #{md.offset(0)}"
end

--output:--
"SP EP" => [9, 14]
"SP NL EP" => [15, 23]

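As an alternative to reading Regexp.last_match inside the scan block, the MatchData objects can be collected with repeated calls to String#match and a start position. This is only a sketch: `record_matches`, `INPUT` and `STRICT_RECORD` are names invented here, and the pattern is a slightly stricter variant that requires whitespace between marks.

```ruby
INPUT = 'BS HD SP SP EP SP NL EP EP FT BS'
# Assumed variant of the pattern: marks must be separated by whitespace.
STRICT_RECORD = /SP\s+(?:NL\s+)*EP/

# Hypothetical helper: returns one MatchData per record.
# Assumes the pattern never matches the empty string (otherwise the
# loop would not advance).
def record_matches(string, pattern)
  matches = []
  pos = 0
  while (md = string.match(pattern, pos))
    matches << md
    pos = md.end(0)   # resume searching right after this match
  end
  matches
end

record_matches(INPUT, STRICT_RECORD).each do |md|
  start, finish = md.offset(0)
  puts "#{md[0].inspect} => #{start}...#{finish}"
end
# prints:
# "SP EP" => 9...14
# "SP NL EP" => 15...23
```

Each MatchData keeps its own offsets, so they can be used later to slice the same range out of another string.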

Thanks for the answers! Going to go with Regexp.last_match :)

On 11 September 2012 22:48, 7stud -- <lists@ruby-forum.com> wrote:
