I am usually pretty good at regexes but this one has me stumped.
I want to basically match any line that has a period in it, but only if
that period is not part of a salutation. Ideally I want to do this
with a single regex.
Thus:
'end of line. And we continue' # should match
'The incredible Mrs. Robner' # should not match
'Sammy Davis Jr. is an okay guy.' # should match, due to last .
I tried doing something logical, like:
/(?!Jr\.|Sr\.|Miss\.|Mr\.|Mrs\.)\./
but, alas, this does not work. Any ideas?
Just by looking at it, this only seems to not-find 'Mrs..'.
I also wonder why you use a look-ahead; I would rather use a
look-behind. As it is, your regex would find any dot: the look-ahead is
tested at the position of the dot itself, and text that starts with a
dot can never match (Jr\.|Sr\.|Miss\.|Mr\.|Mrs\.). So in the regex
dialect I know best (NEdit):
(?<!(Jr|Sr|Miss|Mr|Mrs))\.
(i.e., find a dot not preceded by Jr, Sr, etc.)
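To make the difference concrete, here is a rough sketch in Ruby that runs both patterns over the three sample lines from the question. It assumes a Ruby whose regex engine supports look-behind (Oniguruma, i.e. Ruby 1.9 or later). The variable names are made up for illustration, and the capture group from the NEdit version is dropped because Ruby's engine does not accept capturing groups inside a negative look-behind.

```ruby
# Sample lines from the original question; variable names are hypothetical.
samples = [
  'end of line. And we continue',
  'The incredible Mrs. Robner',
  'Sammy Davis Jr. is an okay guy.'
]

# The look-ahead is checked at the dot itself, so it never rejects anything.
lookahead = /(?!Jr\.|Sr\.|Miss\.|Mr\.|Mrs\.)\./
# The look-behind checks the text just before the dot (needs Oniguruma / Ruby 1.9+).
lookbehind = /(?<!Jr|Sr|Miss|Mr|Mrs)\./

lookahead_hits  = samples.select { |s| s =~ lookahead }
lookbehind_hits = samples.select { |s| s =~ lookbehind }

puts "look-ahead matches:  #{lookahead_hits.length} of #{samples.length}"
puts "look-behind matches: #{lookbehind_hits.length} of #{samples.length}"
lookbehind_hits.each { |s| puts "  #{s}" }
```

The look-ahead version should accept all three lines, while the look-behind version should skip only the 'Mrs. Robner' line.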
Thorsten
--
It is precisely when we all want to play it completely safe that we
create a world of utmost insecurity.
- Dag Hammarskjöld
gga wrote:
> I am usually pretty good at regexes but this one has me stumped.
> I want to basically match any line that has a period in it, but only if
> that period is not part of a salutation. Ideally I want to do this
> with a single regex.
>
> Thus:
> 'end of line. And we continue' # should match
> 'The incredible Mrs. Robner' # should not match
> 'Sammy Davis Jr. is an okay guy.' # should match, due to last .
>
> I tried doing something logical, like:
>
> /(?!Jr\.|Sr\.|Miss\.|Mr\.|Mrs\.)\./
>
> but, alas, this does not work. Any ideas?
a = [
  'end of line. And we continue',
  'The incredible Mrs. Robner',
  'Sammy Davis Jr. is an okay guy.'
]
a.each {|s|
  # Strip the titles first, then look for any period that is left over.
  puts s if s.gsub(/(?:Jr\.|Sr\.|Mr\.|Mrs\.)/,"") =~ /\./
}
This would be a lot easier if Ruby had look-behind.
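[
  '.start',
  '-. HERE .-',
  'Jr. is rotten',
  'Mr. Smith is here',
  'Mr-. Smith is here',
  'Mr. Smith is here.',
  'Mrs. Jones left',
  'Meet Mr. Elihu Snark, Jr.',
  'A good line.',
  'A mystery guest, introduced by his father, Mr. Bob Eck, Sr.'
].each {|s|
  # Emulate a negative look-behind with look-aheads. The period must follow
  # either at most two characters from the start of the string that are not
  # Jr, Sr or Mr, or any three characters that do not end in Jr, Sr or Mr
  # and are not Mrs.
  if s =~ %r{ (?:
        (?!Jr|Sr|Mr) ^ .{0,2} |
        (?!.Jr|.Sr|.Mr|Mrs) ...
      )
      \.
    }x
    puts s
  end
}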
On Aug 21, 2005, at 1:49 AM, Thorsten Haude wrote:
> I also wonder why you use a look-ahead, I would rather use a
> look-behind.
A negative look-behind would be the perfect, simple approach to this regex problem. Unfortunately, Ruby's current regexp handler does not have such a feature. Fortunately, the regexp handler of the next version of Ruby does. Even more fortunately, this future handler (Oniguruma) is available now.
So, you can write a more complex regexp/logic to detect your current case, or you can get Oniguruma working and use a negative look-behind.
I'd probably use something like /(\w+)\./ and do a programmatic check (or use a second regex) that the word before the dot is not one of the words that should not count as a match.
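A minimal sketch of that idea, using the sample lines from the original question. The names `titles` and `real_period` are made up for illustration, and the title list is simply the one from the regexes earlier in the thread. One caveat of /(\w+)\./ is that it ignores a dot with no word character directly before it (such as a leading '.start').

```ruby
# Hypothetical helper names; the title list comes from the regexes in this thread.
titles = %w[Jr Sr Miss Mr Mrs]

# True if some word directly followed by a dot is not one of the titles.
# Note: /(\w+)\./ skips dots that have no word character right before them.
real_period = lambda do |line|
  line.scan(/(\w+)\./).flatten.any? { |word| !titles.include?(word) }
end

samples = [
  'end of line. And we continue',
  'The incredible Mrs. Robner',
  'Sammy Davis Jr. is an okay guy.'
]

results = samples.map { |s| real_period.call(s) }
samples.zip(results).each { |s, hit| puts s if hit }
```

This trades the single regex the original poster asked for against a check that is easier to read and to extend with more titles.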
Kind regards
robert
Gavin Kistner <gavin@refinery.com> wrote:
> A negative look-behind would be the perfect, simple approach to this
> regex problem. Unfortunately, Ruby's current regexp handler does not
> have such a feature.
Sorry if I added to the confusion. I'm pretty new to Ruby and wasn't
aware of that limitation.
Thorsten
--
A: Top posters
Q: What's the most annoying thing about email these days?