Regex help please please!

Hi All

Please could someone assist, as I’m now bald with a pile of hair around my
chair!

How does one do greedy regexes in Ruby? In Perl / Python one has access to
the /s or re.DOTALL option, but in Ruby I cannot get a greedy effect.

For example, in Python, to analyse a (yes, I’m sorry) Visual Basic file and
tokenize it, I can do the following:

import re

prog = re.compile(r"""
        (Private|Public)?       # Optional Private or Public identifier
        \s*                     # Followed by any whitespace
        (sub|function)          # Followed by the word Sub or Function
        \s*                     # Followed by any whitespace
        (.*?)                   # Followed by characters (name of token)
        \s*                     # Followed by optional whitespace
        \(                      # Followed by an opening bracket - start of parameter enclosure
        (.*?)                   # Followed by a whole lot of text, i.e. parameters :-)
        \)                      # Followed by a closing bracket
        (.*?)                   # Followed by a whole lot of text (the code)
        end\s+(sub|function)    # Followed by the word End followed by Sub or Function
        """, re.DOTALL | re.IGNORECASE | re.VERBOSE)

result = prog.findall(data)

Thanks

Graeme

__


This e-mail and any attachments may be confidential or legally privileged.
If you received this message in error or are not the intended recipient, you
should destroy the e-mail message and any attachments or copies, and you are
prohibited from retaining, distributing, disclosing or using any information
contained herein. Please inform us of the erroneous delivery by return
e-mail.

Thank you for your cooperation.


ec03/04

[…]

How does one do greedy regexes in Ruby? In Perl / Python one has
access to the /s or re.DOTALL option, but in Ruby I cannot get a greedy
effect.

That’s not what ‘greedy’ means in regular expressions. You’re thinking
of the mode in which . also matches a newline.

In Perl, that’s /s (‘single-line mode’)
In Python, that’s re.DOTALL
In Ruby, that’s /m (‘multiline mode’)

(One day, someone will explain why Ruby has the behaviour of Perl’s /m on
by default and renamed /s to /m; feel free to make that day today.)
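
To make the naming difference concrete, here is a small sketch (the string text is made up for illustration). In Ruby, ^ and $ match at line boundaries without any flag, which is what Perl needs /m for, while Ruby’s /m lets . match a newline, which is what Perl’s /s and Python’s re.DOTALL do.

```ruby
# Hypothetical sample text, just for illustration.
text = "first line\nsecond line"

# ^ matches at the start of every line in Ruby without any flag
# (Perl needs /m for this).
line_starts = text.scan(/^\w+/)

# Without /m, . stops at a newline; with /m it does not
# (this is the Perl /s or Python re.DOTALL behaviour).
without_m = text[/first.*line/]
with_m    = text[/first.*line/m]

# \A anchors to the start of the whole string, regardless of flags.
string_start = text.scan(/\A\w+/)

p line_starts
p without_m
p with_m
p string_start
```

If you want Perl’s default ^ and $ behaviour in Ruby, \A and \z are the anchors to reach for.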

Greediness means that quantifiers grab as much as they can, e.g. /^.+/
matches all of “foo”, whereas the non-greedy /^.+?/ matches just “f”.

irb(main):001:0> "foo" =~ /^.+?/; puts $&
f
=> nil
irb(main):002:0> "foo" =~ /^.+/; puts $&
foo
=> nil
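
Greediness starts to matter much more once . can cross newlines. A rough sketch, assuming a made-up two-routine input (source is hypothetical, not taken from the original file), of why the pattern in the question uses .*? rather than .* for the body:

```ruby
# Hypothetical two-routine input, just for illustration.
source = "Sub a()\n x = 1\nEnd Sub\nSub b()\n y = 2\nEnd Sub\n"

# Greedy: .* runs on to the LAST "End Sub", so both routines
# end up inside a single match.
greedy = source.scan(/Sub \w+\(\).*End Sub/m)

# Non-greedy: .*? stops at the FIRST "End Sub", giving one match
# per routine.
lazy = source.scan(/Sub \w+\(\).*?End Sub/m)

puts greedy.length
puts lazy.length
```

So /m decides whether . may cross a newline, and the ? after the quantifier decides how far it goes once it can.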

cheers,

Iain.

How about:

prog = Regexp.compile('
            (Private|Public)?       # Optional Private or Public identifier
            \s*                     # Followed by any whitespace
            (sub|function)          # Followed by the word Sub or Function
            \s*                     # Followed by any whitespace
            (.*?)                   # Followed by characters (name of token)
            \s*                     # Followed by optional whitespace
            \(                      # Followed by an opening bracket - start of parameter enclosure
            (.*?)                   # Followed by a whole lot of text i.e. parameters :-)
            \)                      # Followed by a closing bracket
            (.*?)                   # Followed by a whole lot of text (the code)
            end\s+\2                # Followed by the word End and the same keyword as group 2
            ', Regexp::MULTILINE | Regexp::EXTENDED | Regexp::IGNORECASE)


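# The closing keyword is the same as the opening one, so this should match.
s = "PRIVATE function test() \n some code here \n End function"
m = prog.match(s)
if m
    p m[0]
else
    puts 'Did not match!'
end


# Sub is closed with End function, so the \2 backreference should reject it.
s = "PRIVATE Sub test() \n some code here \n End function"
m = prog.match(s)
if m
    p m[0]
else
    puts 'Did not match!'
end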

Let me know if that works for you. I have only tested it on ruby 1.7.3 (2002-11-17) [i386-mswin32]
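
For the findall part of the original question, String#scan looks like the closest Ruby counterpart. A minimal sketch, assuming a trimmed variant of the pattern written as a literal with the /mix flags; the sample data and the block variable names are made up for illustration:

```ruby
# Trimmed variant of the pattern, as a regexp literal (m = dot matches
# newline, i = ignore case, x = extended syntax with comments).
prog = /
  (private|public)?   # optional visibility keyword
  \s*
  (sub|function)      # routine kind
  \s+
  (\w+)               # routine name
  \s*
  \( (.*?) \)         # parameter list
  (.*?)               # body
  end\s+\2            # End followed by the same keyword as group 2
/mix

# Hypothetical input, just for illustration.
data = "Private Sub one(a, b)\n x = 1\nEnd Sub\n" \
       "Public Function two()\n two = 2\nEnd Function\n"

# With capture groups, scan returns one array of groups per match.
result = data.scan(prog)
result.each do |visibility, kind, name, params, _body|
  puts "#{visibility} #{kind} #{name}(#{params})"
end
```

Each element of result holds the five captured groups for one routine, much like the tuples Python’s findall returns.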

HTH,
– shanko

“Matthew, Graeme” Graeme.Matthew@mercer.com wrote in message news:AFB901DF1A6CD411A3AD00D0B781F8E5049CA1B0@wmelntms01.au.wmmercer.com
[…]

By “greedy” I assume you mean make the . metacharacter match anything
including a new line. (Normally greed in regexps is used in the context
of how much of the string is “consumed” by quantified atoms, but I’m
guessing from the context.)

Do you want the m modifier? e.g. (in irb)

irb(main):001:0> languages = "ruby\nperl\npython"
=> "ruby\nperl\npython"
irb(main):002:0> languages =~ /(r.*y)/ && $1
=> "ruby"
irb(main):003:0> languages =~ /(r.*y)/m && $1
=> "ruby\nperl\npy"

or if pythonesque verbosity is what you’re after

irb(main):004:0> Regexp.new('(r.*y)', Regexp::MULTILINE).match(languages)[1]
=> "ruby\nperl\npy"
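
If several options are needed at once, as with Python’s DOTALL, IGNORECASE and VERBOSE, they can be combined in either spelling. A small sketch (the variable names verbose and literal are just for illustration):

```ruby
languages = "ruby\nperl\npython"

# Option constants can be combined with | when using Regexp.new ...
verbose = Regexp.new('(r.*y)', Regexp::MULTILINE | Regexp::IGNORECASE)

# ... and the same flags can be written as suffix letters on a literal.
literal = /(r.*y)/mi

# Both forms let . cross newlines and ignore case.
upper_match   = verbose.match("RUBY\nperl")[1]
literal_match = literal.match(languages)[1]

p upper_match
p literal_match
```

The x suffix (Regexp::EXTENDED) is the counterpart of Python’s re.VERBOSE and can be added in the same way.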

Hope this helps,

Mike

mike@stok.co.uk | The “`Stok’ disclaimers” apply.
http://www.stok.co.uk/~mike/ | GPG PGP Key 1024D/059913DA
mike@exegenix.com | Fingerprint 0570 71CD 6790 7C28 3D60
http://www.exegenix.com/ | 75D2 9EC4 C1C0 0599 13DA