Regex question

Folks,

Just a quick one.

md = /\..*?$/.match "dotslash.tar.gz"
md[0] # -> ".tar.gz"

I expect md[0] to be “.gz”, because the question-mark in the regex tells *
not to be greedy. Can anyone enlighten?

Thanks,
Gavin

Hello –

On Mon, 5 Aug 2002, Gavin Sinclair wrote:

I expect md[0] to be “.gz”, because the question-mark in the regex tells *
not to be greedy. Can anyone enlighten?

It will still find the first ‘.’ from the left, and then be
non-greedy. In other words, non-greediness affects how much gets
consumed to the right in a given match.

Try this:

/\.[^.]*$/

which forces the match to start at the right-most ‘.’ on the line.

David

David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav
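
The leftmost-match behaviour described above can be checked with a short sketch. It only uses the string from the original question; the variable names are made up for illustration.

```ruby
name = "dotslash.tar.gz"

# The engine tries start positions from left to right and stops at the
# first one where a match is possible; *? only shortens the match to the
# right of that start position.
lazy   = name[/\..*?$/]     # starts at the first "."
greedy = name[/\..*$/]      # same start, same end because of the $ anchor
last   = name[/\.[^.]*$/]   # [^.] cannot cross a ".", so only the last "." fits

puts lazy    # .tar.gz
puts greedy  # .tar.gz
puts last    # .gz

# Without the $ anchor the lazy quantifier does make a difference:
unanchored_lazy   = name[/\..*?/]
unanchored_greedy = name[/\..*/]
puts unanchored_lazy    # .
puts unanchored_greedy  # .tar.gz
```

So the question mark is honoured, but only after the start of the match has already been fixed at the first dot.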

Hi!

I think this happens because of leftmost matching.

“.tar.gz” is the least greedy match from the leftmost possible starting
position, even though it is not the shortest match overall.

Regards,
Bernhard

On Mon, Aug 05, 2002 at 11:37:12AM +0900, David Alan Black wrote:

Try this:

/\.[^.]*$/

which forces the match to start at the right-most ‘.’ on the line.

or this:

/\w+$/

Jim Freeze
If only I had something clever to say for my comment…
~

Hi,

At Mon, 5 Aug 2002 11:37:12 +0900, David Alan Black wrote:

Try this:

/\.[^.]*$/

Or

str = "dotslash.tar.gz"
if md = str.rindex(/\..*$/) #=> 12
  str[md..-1] #=> ".gz"
end


Nobu Nakada

Hi –

On Mon, 5 Aug 2002, Jim Freeze wrote:

or this:

/\w+$/

You forgot the . :-) \w will cover the example given, though I was
generalizing a bit: matching the part after the last dot of any line.

David


Jim Freeze wrote:

or this:

/\w+$/

Hmm… that’ll do! :-) Thanks.

On Mon, Aug 05, 2002 at 12:57:30PM +0900, Gavin Sinclair wrote:

Hmm… that’ll do! :-) Thanks.

Note that /\w+$/ will not match all extensions. E.g.,

fred.my-dashed-ext

will return ext.
David replied with my first (unposted) attempt.
To be sure you get everything past that last ‘.’, use:

/\.([^.]*)$/.match(file)[1]

Jim
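
A small sketch of the difference between the two patterns, using the file name from the note above. The nil guard is an addition for names without any dot; it is not part of the original one-liner.

```ruby
file = "fred.my-dashed-ext"

# \w does not match "-", so \w+$ only reaches back to the last hyphen.
word_tail = file[/\w+$/]

# Capture everything after the last literal "."; match returns nil when
# the name has no "." at all, so guard before indexing.
md  = /\.([^.]*)$/.match(file)
ext = md && md[1]

puts word_tail  # ext
puts ext        # my-dashed-ext
```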


Hi –

On Mon, 5 Aug 2002, Jim Freeze wrote:

To be sure you get everything past that last ‘.’, use:

/\.([^.]*)$/.match(file)[1]
I guess Gavin didn’t need to include the ‘.’ itself in the match (it
was there in the first one, but if /\w+$/ works on his example then it
must not be needed). If it’s not needed one could just do:

/[^.]*$/.match(file)[0]

David
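
One difference between the two forms is worth seeing side by side: the version without the dot always matches, even when the name has no extension. A minimal sketch follows; `tail_of` and `ext_of` are hypothetical names used only here.

```ruby
# Hypothetical helpers wrapping the two patterns from this thread.
tail_of = ->(file) { /[^.]*$/.match(file)[0] }                  # never nil
ext_of  = ->(file) { (md = /\.([^.]*)$/.match(file)) && md[1] } # nil without a "."

["dotslash.tar.gz", "README"].each do |file|
  puts "#{file}: #{tail_of.call(file).inspect} vs #{ext_of.call(file).inspect}"
end
# dotslash.tar.gz: "gz" vs "gz"
# README: "README" vs nil
```

Which behaviour is preferable depends on whether a name without a dot should count as having no extension.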


David Alan Black wrote:

/[^.]*$/.match(file)[0]

Now wouldn’t all this be much simpler if we had a method
along the lines of ‘basename’, which would pull off the
extension (or suffix) for us? We already have File::basename
and File::dirname, so why not a File::last-part-of-the-name-name?

Or what about a ‘File::parts’ which returns an array of:
directory (or nil), root name (or nil), suffix(es) (or nil).
Would that be more Ruby-esque?


Mike Hall
http://www.enteract.com/~mghall

Hi –

On Mon, 5 Aug 2002, Mike Hall wrote:

Or what about a ‘File::parts’ which returns an array of:
directory (or nil), root name (or nil), suffix(es) (or nil).
Would that be more Ruby-esque?

Heavens – object-oriented regular expressions are Ruby-esque enough
for you? :-)

Of course, writing add-on libraries is very Ruby-esque too, so have
at it!

David


[All sorts of people wrote all sorts of things, and then…]

/[^.]*$/.match(file)[0]

Now wouldn’t all this be much simpler if we had a method
along the lines of ‘basename’, which would pull off the
extension (or suffix) for us? We already have File::basename
and File::dirname, so why not a File::last-part-of-the-name-name?

Or what about a ‘File::parts’ which returns an array of:
directory (or nil), root name (or nil), suffix(es) (or nil).
Would that be more Ruby-esque?

I agree there’s room for some help from the File module here. We already
have

File.split("/home/fred/aliases.sh") # -> ["/home/fred", "aliases.sh"]

which is nice. I think the most sensible thing is for this method to take
an optional parameter, indicating whether the extension should be a separate
element. In fact, it could be an integer specifying how many dots to
include in the extension. So:

File.split("tuesday.txt") # -> [".", "tuesday.txt"]
File.split("/games/doom.tar.gz") # -> ["/games", "doom.tar.gz"]
File.split("/games/doom.tar.gz", 1) # -> ["/games", "doom.tar", "gz"]
File.split("/games/doom.tar.gz", 2) # -> ["/games", "doom", "tar.gz"]
File.split("/games/doom.tar.gz", 3) # -> ["/games", "doom", "tar.gz"]

Note that the first two lines represent current File.split behaviour, and
the final three are proposed extensions. The method “split” is very
appropriate here, as (and I should have thought of this earlier)
String.split can be used to chop up a filename into extensions in this
manner.
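
A rough sketch of the proposed behaviour, written as a standalone method rather than a change to File.split itself. The name `split_with_ext` is hypothetical, and how names such as ".bashrc" should be treated is left open here.

```ruby
# Hypothetical helper, not part of Ruby's File class. Returns
# [directory, name] when n is 0, otherwise [directory, root, extension]
# where the extension holds at most n dot-separated parts.
def split_with_ext(path, n = 0)
  dir, base = File.split(path)        # existing File.split behaviour
  parts = base.split(".")             # "doom.tar.gz" -> ["doom", "tar", "gz"]
  n = [n, parts.length - 1].min       # never swallow the root name
  return [dir, base] if n <= 0
  [dir, parts[0...-n].join("."), parts[-n..-1].join(".")]
end

p split_with_ext("tuesday.txt")            # [".", "tuesday.txt"]
p split_with_ext("/games/doom.tar.gz", 1)  # ["/games", "doom.tar", "gz"]
p split_with_ext("/games/doom.tar.gz", 2)  # ["/games", "doom", "tar.gz"]
p split_with_ext("/games/doom.tar.gz", 3)  # ["/games", "doom", "tar.gz"]
```

Clamping n to the number of dots available is what makes the last two calls return the same result, as in the examples above.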

Of course, we’d need File.extension(path, n=1) to match File.dirname and
File.basename, as well.
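
For comparison, current Ruby versions ship File.extname, which covers the single-extension case (n = 1 in the proposal above), with the dot included in the result. Whether it was available in the Ruby release people were using when this thread was written is not something the thread establishes. A brief sketch:

```ruby
path = "/games/doom.tar.gz"

ext  = File.extname(path)          # last extension, dot included
root = File.basename(path, ".*")   # basename with the last extension removed
dir  = File.dirname(path)

p [dir, root, ext]   # ["/games", "doom.tar", ".gz"]
```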

–Gavin

----- Original Message -----
From: “Mike Hall” mghall@enteract.com

On Tue, 6 Aug 2002, David Alan Black wrote:

Heavens – object-oriented regular expressions are Ruby-esque enough
for you? :-)

sub(/are/, "aren't") # :-)

David