Regular Expression help

Tony_De1 · 29 March 2008 06:45

On to my next learning exercise. As I parse a file I need to pull an IP
address out of a line. I thought a regular expression would be the
ticket, but it's giving me a problem. The following line is an example
string I need to pull one of two IP addresses out of (they are not
always formed the same way):

Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO
?192.168.1.2?) (222.222.222.22)

I need that last IP address. The problem is that I can't always
count on it being enclosed in parens, although I can expect the right
paren to always be there.

So here's my regex:
sourceip = line.scan(/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/)
And as you might expect, it is pulling both IP addresses. Is there a
way I can adjust the expression to grab the second IP by testing for
the ")", or is there another method I can use, short of dissecting the
entire string backwards and testing whether I have a number or a char,
a decimal point and at most 3 chars from it, etc.?
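
To make the symptom concrete, here is a minimal Ruby sketch of what scan
returns for the sample line. Because the pattern contains a capture
group, scan yields an array of one-element arrays, so taking the last
entry is one simple way to reach the final address. The names
SAMPLE_LINE, IP_SCAN_RE and last_ip_by_scan are made up for
illustration, and the sample header is assumed to sit on one line.

```ruby
# Sample header from the post, joined onto a single line (an assumption).
SAMPLE_LINE = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
              "(HELO ?192.168.1.2?) (222.222.222.22)"

# The pattern from the post, unchanged.
IP_SCAN_RE = /\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/

# Hypothetical helper: return the last dotted quad on the line, or nil.
def last_ip_by_scan(line)
  # scan returns [["192.168.1.2"], ["222.222.222.22"]] for SAMPLE_LINE,
  # so flatten the nested arrays before taking the last element.
  line.scan(IP_SCAN_RE).flatten.last
end

p SAMPLE_LINE.scan(IP_SCAN_RE)
p last_ip_by_scan(SAMPLE_LINE)
```

This only helps when the wanted address really is the last dotted quad
on the line; it does not check for the closing paren at all.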

tonyd

--
Posted via http://www.ruby-forum.com/.

David_A_Black1 · 29 March 2008 07:05

Hi --

What you want is an IP address, possibly followed by ')' and
definitely coming at the end of the string (give or take a newline
character after it). That can be expressed like this:

/((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

I've got 3 occurrences of (\d{1,3}\.), followed by the same thing
without a dot. I've stipulated that this submatch be "looking at"
(i.e., positioned just before) an optional ')' followed by the end of
the string. (\Z gives you end of string, ignoring a possible terminal
newline.)

With your line, and with the regex above assigned to re, it gives you:

irb(main):039:0> line[re] # re.match(line)[0], or whatever
=> "222.222.222.22"
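
For anyone who wants to poke at this outside irb, here is a small
sketch of the same pattern run against a few line shapes. The helper
name trailing_ip and every line except the first are made up for
illustration; the first line is the sample from the question, assumed
to sit on a single line.

```ruby
# The pattern from the reply above: a dotted quad that is followed by an
# optional ')' and then the end of the string (\Z tolerates one final "\n").
TRAILING_IP_RE = /((\d{1,3}\.){3}\d{1,3})(?=\)?\Z)/

# Hypothetical helper: String#[] with a regexp returns the matched text,
# or nil when nothing matches.
def trailing_ip(line)
  line[TRAILING_IP_RE]
end

with_parens = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
              "(HELO ?192.168.1.2?) (222.222.222.22)"
right_paren = "Received: from mail.example.net 222.222.222.22)"   # made up
no_parens   = "Received: from mail.example.net 10.0.0.7\n"        # made up
not_at_end  = "Received: from 10.0.0.7 by mail.example.net"       # made up

p trailing_ip(with_parens)  # => "222.222.222.22"
p trailing_ip(right_paren)  # => "222.222.222.22"
p trailing_ip(no_parens)    # => "10.0.0.7"
p trailing_ip(not_at_end)   # => nil
```

Because the ')' lives inside the lookahead, it is checked but never
becomes part of the returned string.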

David


Tony_De1 · 29 March 2008 07:16

David, you rock. I'll give it a try. Those expressions make my head
hurt. But I've been taking in
http://www.regular-expressions.info/tutorial.html. It seems to cover a
lot of foundation and application. Thanks again!

tonyd


Jesus_Gabriel_y_Gala · 29 March 2008 12:21

Another possibility (if I understood correctly): check for the
mandatory ')' that closes the HELO part, followed by any characters
(this could be narrowed to the specific whitespace), followed by the
numbers and dots of the IP:

a = "Received: from mmds-111-19-22-30.twm.ca.internet.net (HELO ?192.168.1.2?) (222.222.222.22)"
a.match(/\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)[1]

gives: "222.222.222.22"
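
A slightly more defensive sketch of the same idea, in case some lines
do not match: calling [1] on the result of match raises NoMethodError
when match returns nil, whereas String#[] with a capture index simply
returns nil. The helper name ip_after_paren and the last two lines are
made up for illustration.

```ruby
# The pattern from the post above: a ')' first, then as little as possible,
# then a dotted quad captured in group 1.
AFTER_PAREN_RE = /\).*?(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/

# Hypothetical helper: str[regexp, 1] returns capture group 1, or nil when
# the pattern does not match at all.
def ip_after_paren(line)
  line[AFTER_PAREN_RE, 1]
end

sample   = "Received: from mmds-111-19-22-30.twm.ca.internet.net " \
           "(HELO ?192.168.1.2?) (222.222.222.22)"
bare_ip  = "Received: from mail.example.net (HELO ?192.168.1.2?) 222.222.222.22"
only_one = "Received: from mail.example.net (10.0.0.7)"

p ip_after_paren(sample)    # => "222.222.222.22"
p ip_after_paren(bare_ip)   # => "222.222.222.22"
p ip_after_paren(only_one)  # => nil, no ')' comes before the address
```

The last case shows the assumption this approach leans on: there has to
be a closing paren somewhere before the address you want.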

Jesus.


Andrew_Stewart · 31 March 2008 17:39

You may find Rubular helpful for concocting your regular expressions and decoding other people's.

Regards,
Andy Stewart


Tony_De1 · 30 March 2008 00:10

Thanks Jesus,

I appreciate your input as well. You guys have been a great deal of
help.

tonyd
