Regexp exclusion search - find matches NOT ending with a string?

I have the following text in a file:

1 a1.html
2 b.doc
3 c.xml
4 d.tiff
5 e.jpeg
6 f.html
....

I need a regexp that matches every line except those ending in
".html"; in other words, I want lines 2-5 above. I believe this may
require a negative lookbehind. I tried the following, but Ruby (1.8)
gives an "undefined sequence" error:

$(?<!\.html) # <---- this seems to work with other engines

Before you jump on the Ruby version: I also tested this at
http://www.rubyxp.com/ and got "invalid expression" (FYI, that site
tests with Ruby 1.9). Any ideas or alternatives?

TIA,
BC

The easiest path is to negate the match, for instance:

   if filename !~ /\.html\z/
     # non-HTML here
   end
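
A minimal sketch of the same idea applied to the sample lines, assuming they have already been read into a string (the `text` and `non_html` names are made up for illustration). Note the `chomp`: with the newline still attached, `\z` would not sit directly after ".html".

```ruby
# Hypothetical sample data standing in for the file contents.
text = "1 a1.html\n2 b.doc\n3 c.xml\n4 d.tiff\n5 e.jpeg\n6 f.html\n"

non_html = []
text.each_line do |line|
  name = line.chomp                       # drop the trailing newline first
  non_html << name if name !~ /\.html\z/  # keep only lines that do NOT match
end

puts non_html
```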

-- fxn

Hi --

I would probably do:

   lines.reject {|line| line =~ /html$/ }
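
A small sketch of how that behaves, assuming `lines` is an array of already-chomped strings. The entry "7 notes_html" is a made-up extra line, added only to show that `/html$/` without the dot also rejects names that merely end in "html"; escaping the dot is one way to tighten it.

```ruby
# Sample lines from the original post, plus one hypothetical extra entry.
lines = ["1 a1.html", "2 b.doc", "3 c.xml", "4 d.tiff",
         "5 e.jpeg", "6 f.html", "7 notes_html"]

# Pattern as posted: rejects anything whose line ends in "html".
loose = lines.reject { |line| line =~ /html$/ }

# Tighter variant: only rejects a literal ".html" extension.
strict = lines.reject { |line| line =~ /\.html$/ }

puts loose
puts "--"
puts strict
```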

David

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
Training! Intro to Ruby, with Black & Kastner, September 14-17
(More info: http://rubyurl.com/vmzN)

Xavier and David gave good advice.
If, however, you really need a regex that does the matching itself,

  %r($(?<!\.html)\z) # is that what you meant above?

works fine. I believe that you can install Oniguruma on 1.8 as a gem
for that purpose.
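
A rough sketch of both routes on an engine that supports lookbehind (1.9 or later), assuming chomped lines. The second pattern uses a negative lookahead instead, which as far as I know the 1.8 engine already accepts, so it may be an option without installing anything. The variable names are made up for illustration.

```ruby
lines = ["1 a1.html", "2 b.doc", "3 c.xml", "4 d.tiff", "5 e.jpeg", "6 f.html"]

# Negative lookbehind: at the end of the string, the text just before
# the end must not be ".html".
lookbehind = /(?<!\.html)\z/

# Negative lookahead: from the start of the string, it must not be
# possible to reach ".html" right at the end.
lookahead = /\A(?!.*\.html\z)/

by_lookbehind = lines.grep(lookbehind)
by_lookahead  = lines.grep(lookahead)

puts by_lookbehind
```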
HTH
Robert

--
Toutes les grandes personnes ont d’abord été des enfants, mais peu
d’entre elles s’en souviennent.

All adults have been children first, but not many remember.

[Antoine de Saint-Exupéry]

Some alternate means to the same end:

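IO.foreach("data.txt") do |line|

  #1: split on "." and compare the last piece (newline removed first)
  if line.chomp.split(".")[-1] != "html"
    puts line
  end

  #2: the four characters just before the trailing newline
  if line[-5, 4] != "html"
    print line
  end

  #3: same span as a range; it must stop at -2, since -5..-1 would include the newline
  if line.slice(-5..-2) != "html"
    print line
  end

  puts
end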

--output:--
2 b.doc
2 b.doc
2 b.doc

3 c.xml
3 c.xml
3 c.xml

4 d.tiff
4 d.tiff
4 d.tiff

5 e.jpeg
5 e.jpeg
5 e.jpeg
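
Two more non-regexp variants, sketched under the assumption that each line ends in a file name: `File.extname` to pull out the extension, and `String#end_with?`, which I believe is only available from 1.8.7 on. The `text` sample and the variable names are made up for illustration.

```ruby
# Hypothetical sample data standing in for data.txt.
text = "1 a1.html\n2 b.doc\n3 c.xml\n4 d.tiff\n5 e.jpeg\n6 f.html\n"
names = text.split("\n")

# Variant A: compare the extension reported by File.extname.
by_extname = names.reject { |name| File.extname(name) == ".html" }

# Variant B: plain suffix test, no regexp involved.
by_suffix = names.reject { |name| name.end_with?(".html") }

puts by_extname
```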

···

--
Posted via http://www.ruby-forum.com/.

  %r($(?<!\.html)\z) # is that what you meant above?

Where does this $ come from?
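
For context, a small sketch of how the end anchors differ in Ruby, using made-up sample strings: `$` matches at the end of any line, `\z` only at the very end of the string, and `\Z` also allows one trailing newline. On that reading, the `$` in the pattern above looks redundant once `\z` follows it.

```ruby
multi = "a.html\nb.doc"

# $ is an end-of-LINE anchor, so it matches before the embedded newline.
end_of_line = (multi =~ /html$/)

# \z is an end-of-STRING anchor; the string ends in "doc", so no match.
end_of_string = (multi =~ /html\z/)

with_newline = "b.html\n"

# \z refuses to match because a newline follows "html".
strict_end = (with_newline =~ /html\z/)

# \Z tolerates a single trailing newline.
lenient_end = (with_newline =~ /html\Z/)

p [end_of_line, end_of_string, strict_end, lenient_end]
```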

Is the Ruby regular expression syntax documented anywhere?

I was attempting to use a look-behind, but it's not supported.

The syntax is not documented in the RegExp rdocs, and I haven't seen a
site that spells out all the nitty-gritty details and pokes into the
dark corners.

I'm looking for the Ruby equivalent of:
    Tcl's re_syntax manual page
    Python's re module documentation
    Perl's perlre manual page

Does it exist?

--
Glenn Jackman
    Write a wise saying and your name will live forever. -- Anonymous

You could try the Regular Expressions section of the Standard Types chapter of Programming Ruby. Be advised that this is the online version of the 1st edition, which is now 8 years old. Since you seem to be using a 1.8.x version of Ruby, the Regexp parts are going to be mostly the same.

http://www.ruby-doc.org/docs/ProgrammingRuby/

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

> Is the Ruby regular expression syntax documented anywhere?

For Oniguruma I found this most helpful
http://manual.macromates.com/en/regular_expressions#regular_expressions

Nice one

Cheers
Robert

Glenn Jackman wrote:

Is the Ruby regular expression syntax documented anywhere?

I was attempting to use a look-behind, but it's not supported.

The syntax is not documented in the RegExp rdocs

In my opinion, documentation is Ruby's weakest aspect by far, and the
deficiency has gotten substantially worse with Ruby 1.9.

The best available information is in third-party books, which
presumably have been reverse-engineered from the source code. I fairly
often resort to irb to check that the behaviour is what I want, and have
on occasion had to resort to reading the source.
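
As an illustration of that kind of check, here is a minimal sketch of probing whether the running interpreter accepts a given construct. `regexp_compiles?` is a made-up helper name; it relies only on `Regexp.new` raising `RegexpError` for a pattern the engine rejects, which is what the "undefined sequence" error from the original post amounts to.

```ruby
# Hypothetical helper: true if this Ruby's regexp engine compiles the source.
def regexp_compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Lookbehind is rejected by the 1.8 engine but accepted from 1.9 on.
lookbehind_ok = regexp_compiles?('(?<!\.html)\z')

# A deliberately malformed pattern, to show the failure path.
broken_ok = regexp_compiles?('(unclosed')

puts "lookbehind supported: #{lookbehind_ok}"
puts "malformed pattern accepted: #{broken_ok}"
```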

···

--
Posted via http://www.ruby-forum.com/.