Regex to NOT match?

Sorry it seems like the smallest thing, but I’m stuck on this.

I’m trying to make a simple regex to strip away characters that are NOT certain characters I name.
I’m unclear on what Ruby uses as a “NOT” match in regex, like ! is for !=
(Or if it uses the ^, where does it put it?)

EXAMPLE:

myString = 'I want only the letters a and t and period. When done it should look like atttaatta.tatttaatta…'

this doesn’t work of course:

puts myString.gsub(!/at./, '')

Can someone show me where to find the rule on this?
I couldn’t find it in any documentation, or Ruby Way or Programming Ruby books.

Thanks!

puts myString.gsub(/[^at.]/,'')

Alex

···

On Saturday 10 January 2004 21:57, Ruby Baby wrote:

Sorry it seems like the smallest thing, but I’m stuck on this.

I’m trying to make a simple regex to strip away characters that are NOT
certain characters I name. I’m unclear on what Ruby uses as a “NOT” match
in regex, like ! is for != (Or if it uses the ^, where does it put it?)

EXAMPLE:

myString = 'I want only the letters a and t and period. When done it
should look like atttaatta.tatttaatta…'

this doesn’t work of course:

puts myString.gsub(!/at./, '')


I’ve got a COUSIN who works in the GARMENT DISTRICT …

puts myString.gsub(/[^at.]/, '')

-Martin

···

On Sun, Jan 11, 2004 at 05:57:17AM +0900, Ruby Baby wrote:

Sorry it seems like the smallest thing, but I’m stuck on this.

I’m trying to make a simple regex to strip away characters that are NOT certain characters I name.
I’m unclear on what Ruby uses as a “NOT” match in regex, like ! is for !=
(Or if it uses the ^, where does it put it?)

EXAMPLE:

myString = 'I want only the letters a and t and period. When done it should look like atttaatta.tatttaatta…'

this doesn’t work of course:

puts myString.gsub(!/at./, '')

Can someone show me where to find the rule on this?

    puts myString.gsub(/[^at.]/,'')

The ‘^’ inverts the character set and means “any character not in the
following set”. (It only has this meaning as the first character after
the [; [at^] matches only the characters ‘a’, ‘t’, and ‘^’.)
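
To make the point about the position of ^ concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not from the original question.

```ruby
# Sample input, invented for this illustration.
text = "cat^hat."

# ^ directly after [ negates the set: remove everything that is
# NOT an a, a t or a period.
kept = text.gsub(/[^at.]/, '')

# ^ anywhere else in the set is a literal caret: remove only the
# characters a, t and ^ themselves.
removed = text.gsub(/[at^]/, '')

puts kept
puts removed
```

The first call keeps only the named characters, which is what the original question asks for; the second strips them instead.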

There’s no general way to say “match everything that doesn’t match this
regex”, because any given string likely contains a huge number of such
matches (nearly every substring qualifies). But you can match something more
concrete, such as “any single character which is not one of these characters”
(the above complemented character set), or “something that matches this regex
and is not followed by something matching that regex over there” (negative
lookahead). There just always has to be something concrete that IS matched.
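
A small sketch of the negative lookahead case, assuming a made-up input string. The point is that "foo" is the concrete thing matched, and (?!bar) only vetoes some of the positions.

```ruby
# Sample input, invented for this illustration.
text = "foobar foobaz foo"

# Match "foo" only where it is NOT immediately followed by "bar".
# The lookahead consumes no characters; only "foo" is part of the match.
pattern = /foo(?!bar)/

hits   = text.scan(pattern)
marked = text.gsub(pattern, 'FOO')

puts hits.length
puts marked
```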

I couldn’t find it in any documentation, or Ruby Way or Programming
Ruby books.

The regex syntax isn’t Ruby-specific, but is a subset of the syntax used
in Perl, which is an extension of the syntax used in UNIX tools going
back decades. I recommend Jeffrey Friedl’s book
“Mastering Regular Expressions” for learning everything you ever wanted
to know and then some on the topic.

-Mark

···

On Sun, Jan 11, 2004 at 05:57:17AM +0900, Ruby Baby wrote:

this doesn’t work of course:

puts myString.gsub(!/at./, '')

Can someone show me where to find the rule on this?
I couldn’t find it in any documentation, or Ruby Way or Programming Ruby
books.

The best place to find docs on regex is the book Mastering Regular Expressions.
Barring that, the second best place is likely the Perl documentation (perldoc
perlre), as Ruby’s regex engine is about 99% Perl compatible.

What you want is this:

myString.gsub(/[^at\.]/, '')

The brackets create a character class, and the ^ is the negative of…
If you wanted to deal with full words, instead of single characters, you would
have to use the negative lookahead construct, (?!word|word).
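
As a rough sketch of the whole-word case: the pattern below is one possible way to write it, not necessarily what the poster had in mind, and the sample sentence is invented.

```ruby
# Sample input, invented for this illustration.
text = "the cat saw a dog and a bird"

# \b               start at a word boundary
# (?!(?:cat|dog)\b) give up here if the whole word is "cat" or "dog"
# \w+              otherwise match the word itself
others = text.scan(/\b(?!(?:cat|dog)\b)\w+/)

puts others.join(' ')
```

The trailing \b inside the lookahead matters: without it, a word such as "catalog" would be rejected too, because it merely starts with "cat".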

Ruby Baby wrote:

I couldn’t find it in any documentation, or Ruby Way or Programming Ruby books.

Yes, it would be nice if these two excellent Ruby books explain/tutor
more about regexp, especially for beginners. As a comparison, regexp
receives quite a thorough explanation in many Perl books. Not all Ruby
users come from Perl, mind you. :)

···


dave

Ruby Baby wrote:

EXAMPLE:
myString = 'I want only the letters a and t and period. When done it should look like atttaatta.tatttaatta…'

irb(main):002:0> myString.tr("^at.", "")
=> "attttaata.tatttaatta.tatttaatta…"

Please notice that in regexps there is no general and easy way to not
match a sub-pattern. (It is, however, possible to match stuff which isn’t
in a character class with [^abc], and there are more exotic constructs like
negative look-ahead, which are very complex and not worth the effort most of
the time.)
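
For comparison, a minimal sketch of the non-regex route next to the regex one. String#tr and String#delete both accept a leading ^ to negate the character set; the sample string here is made up.

```ruby
# Sample input, invented for this illustration.
text = "I want a tart."

# With an empty replacement string, tr deletes the selected characters.
via_tr = text.tr('^at.', '')

# delete does the same job and says so in its name.
via_delete = text.delete('^at.')

# The regex version from earlier in the thread, for comparison.
via_gsub = text.gsub(/[^at.]/, '')

puts via_tr
```

All three return a new string and leave the original untouched.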

Thanks!

Regards,
Florian Gross

What you want is this:

myString.gsub(/[^at\.]/, '')

\ is unneeded in char class afaik

The brackets create a character class, and the ^ is the negative of…
If you wanted to deal with full words, instead of single characters, you
would have to use the (?!word|word) construct.

just fyi
useful ruby specific docs are found by
googling for “ruby quickref”

:)

Alex

···

On Saturday 10 January 2004 23:46, GGarramuno wrote:


Every man is as God made him, ay, and often worse.
– Miguel de Cervantes

“Alexander Kellett” ruby-lists@lypanov.net wrote in message
news:200401102209.22341.ruby-lists@lypanov.net

Sorry it seems like the smallest thing, but I’m stuck on this.

I’m trying to make a simple regex to strip away characters that are NOT
certain characters I name. I’m unclear on what Ruby uses as a “NOT”
match
in regex, like ! is for != (Or if it uses the ^, where does it put it?)

EXAMPLE:

myString = 'I want only the letters a and t and period. When done it
should look like atttaatta.tatttaatta…'

this doesn’t work of course:

puts myString.gsub(!/at./, '')

puts myString.gsub(/[^at.]/,'')

This is likely a bit more efficient since it does fewer replacement
operations:

myString.gsub!( /[^at.]+/, '' )
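
A quick sketch of what the + changes, using an invented sample string. It does not measure speed; it only shows that the result is the same while the number of matches drops, and it notes one thing to watch for with the bang version.

```ruby
# Sample input, invented for this illustration.
text = "I want a tart."

single = /[^at.]/   # one unwanted character per match
run    = /[^at.]+/  # a whole run of unwanted characters per match

# Both patterns produce the same cleaned-up string.
same_result = text.gsub(single, '') == text.gsub(run, '')

# The + version needs fewer matches to get there.
single_count = text.scan(single).length
run_count    = text.scan(run).length

# gsub! changes the string in place and returns nil when nothing was
# replaced, so its return value should not be chained blindly.
clean  = "atta.".dup
result = clean.gsub!(run, '')

puts "#{single_count} matches versus #{run_count}"
```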

Regards

robert
···

On Saturday 10 January 2004 21:57, Ruby Baby wrote:

David Garamond wrote:

Ruby Baby wrote:

I couldn’t find it in any documentation, or Ruby Way or Programming
Ruby books.

Yes, it would be nice if these two excellent Ruby books explain/tutor
more about regexp, especially for beginners. As a comparison, regexp
receives quite a thorough explanation in many Perl books. Not all Ruby
users come from Perl, mind you. :)

If you’d like to read some really serious stuff about regexes, I
recommend Jeffrey Friedl’s “Mastering Regular Expressions” by O’Reilly.
The current edition covers Ruby too, if I remember correctly - I ‘only’
have the 1st edition, but even that one is quite a help in understanding
regexes in general.

Happy regexing

Stephan

···


“It’s POLYMORPHIC!!!”
A former colleague

For many examples that exercise many aspects of Ruby’s regexp engine, see
http://rubyforge.org/cgi-bin/viewcvs/cgi/viewcvs.cgi/projects/regexp_engine/test/match_mixins.rb?rev=1.37&cvsroot=aeditor&content-type=text/vnd.viewcvs-markup

···

On Sun, 11 Jan 2004 19:32:54 +0900, David Garamond wrote:

Ruby Baby wrote:

I couldn’t find it in any documentation, or Ruby Way or Programming Ruby books.

Yes, it would be nice if these two excellent Ruby books explain/tutor
more about regexp, especially for beginners. As a comparison, regexp
receives quite a thorough explanation in many Perl books. Not all Ruby
users come from Perl, mind you. :)


Simon Strandgaard

Not before ‘.’, but it is harmless and clarifies what’s going on. The
backslash is encouraged before some other symbols in a character class even
though it is not strictly necessary, such as ‘[’; an unescaped ‘[’ inside a
character class is deprecated in 1.8.
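
A minimal sketch of the escaping point, with made-up sample strings. Inside a character class the period is already literal, so the backslash changes nothing there; escaping [ is shown as the defensive spelling.

```ruby
# Sample input, invented for this illustration.
text = "a.b[c"

# With and without the backslash, the class means the same thing.
plain   = text.gsub(/[^at.]/, '')
escaped = text.gsub(/[^at\.]/, '')

# Inside brackets a period matches only a real period, not "any character".
dot_matches_x   = !("axb" =~ /a[.]b/).nil?
dot_matches_dot = !("a.b" =~ /a[.]b/).nil?

# Escaping [ inside a class avoids any ambiguity about nested brackets.
no_bracket = text.gsub(/[\[]/, '')

puts plain
puts no_bracket
```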

-Mark

···

On Sun, Jan 11, 2004 at 08:23:11AM +0900, Alexander Kellett wrote:

\ is unneeded in char class afaik

I didn’t come from Perl, but regular expression concepts, metacharacters,
character sets and options seem to be a pretty universal thing. The only
huge difference seems to be how the regex is implemented and that some
languages don’t implement some metacharacters, character sets and options,
but the most basic of these are the same in Perl, Java, Ruby, etc…

I think it is beneficial to anyone using regular expressions to read a book
or at least a few chapters of a regex book. I used “Mastering Regular
Expressions” from O’Reilly, and I feel much more comfortable using regular
expressions as a result.

Zach

···

-----Original Message-----
From: Simon Strandgaard [mailto:neoneye@adslhome.dk]
Sent: Sunday, January 11, 2004 1:07 PM
To: ruby-talk ML
Subject: Re: regex to NOT match?

On Sun, 11 Jan 2004 19:32:54 +0900, David Garamond wrote:

Ruby Baby wrote:

I couldn’t find it in any documentation, or Ruby Way or Programming Ruby
books.

Yes, it would be nice if these two excellent Ruby books explain/tutor
more about regexp, especially for beginners. As a comparison, regexp
receives quite a thorough explanation in many Perl books. Not all Ruby
users come from Perl, mind you. :)

For many examples that exercise many aspects of Ruby’s regexp engine, see
http://rubyforge.org/cgi-bin/viewcvs/cgi/viewcvs.cgi/projects/regexp_engine/test/match_mixins.rb?rev=1.37&cvsroot=aeditor&content-type=text/vnd.viewcvs-markup


Simon Strandgaard

Stephan Kämper wrote:

David Garamond wrote:

Ruby Baby wrote:

I couldn’t find it in any documentation, or Ruby Way or Programming
Ruby books.

Yes, it would be nice if these two excellent Ruby books explain/tutor
more about regexp, especially for beginners. As a comparison, regexp
receives quite a thorough explanation in many Perl books. Not all Ruby
users come from Perl, mind you. :)

If you’d like to read some really serious stuff about regexes, I
recommend Jeffrey Friedl’s “Mastering Regular Expressions” by O’Reilly.
The current edition covers Ruby too, if I remember correctly - I ‘only’
have the 1st edition, but even that one is quite a help in understanding
regexes in general.

Yeah, I’ve read Jeffrey’s book (1st ed too) and the two chapters of the 2nd.

I was not talking about myself specifically in the previous post, though. I
myself learnt pretty much everything about regex initially from Perl
tutorials and books (and after that, of course, from trying it out and
learning from mistakes). But I’ve seen here that many people do not come
from a Perl background, and the two Ruby books mentioned above didn’t cover
regexp in a way that is friendly to beginners.

···


dave

Zach Dennis wrote:

The only
huge difference seems to be how the regex is implemented and that some
languages don’t implement some metacharacters, character sets and options,
but the most basic of these are the same in Perl, Java, Ruby, etc…

Btw, I hear Oniguruma will be pretty kick-ass, no? :) From what I’ve
read it already supports many things that the current regexp engine
lacks, like look-behind, atomic group & possessive quantifiers, and
named captures.
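
For readers on a later Ruby: the sketch below assumes Ruby 1.9 or newer, where the Oniguruma-based engine is built in, and shows one small example of each feature named above. The sample strings and variable names are invented.

```ruby
# Assumes Ruby 1.9 or later; none of this runs on the 1.8 engine.

# Look-behind: digits preceded by "$", without matching the "$" itself.
price = "cost: $42"[/(?<=\$)\d+/]

# Named captures: groups can be read back by name.
m    = /(?<year>\d{4})-(?<month>\d{2})/.match("2004-01")
year = m[:year]

# Atomic group: once (?>a+) has taken every "a" it will not give one
# back, so the trailing "a" can never match.
atomic = /(?>a+)a/.match("aaa")

# Possessive quantifier: the same no-backtracking idea, written as a++.
possessive = /a++a/.match("aaa")

# The ordinary greedy version does backtrack and therefore matches.
greedy = /a+a/.match("aaa")

puts price
puts year
```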

···


dave

Right.

I am working on another regexp engine, which will support both
Perl 5 and Perl 6 syntax. Mark Sparshatt is working on Perl 6.
http://raa.ruby-lang.org/list.rhtml?name=regexp
I don’t know whether this is of interest to anyone?

···

On Mon, 12 Jan 2004 06:10:23 +0900, David Garamond wrote:

Zach Dennis wrote:

The only
huge difference seems to be how the regex is implemented and that some
languages don’t implement some metacharacters, character sets and options,
but the most basic of these are the same in Perl, Java, Ruby, etc…

Btw, I hear Oniguruma will be pretty kick-ass, no? :) From what I’ve
read it already supports many things that the current regexp engine
lacks, like look-behind, atomic group & possessive quantifiers, and
named captures.


Simon Strandgaard