How might "Apocalypse 5" affect Ruby?

Larry Wall is redesigning regular expressions, in his latest “Apocalypse 5”.

Will Ruby be making a similar change? I’m no expert at regexes, but
if it can make them easier to read/write, it sounds like a good idea.

But is it worth breaking quite a bit of previously written code?
Ruby may have a big advantage in that its code base is quite a bit
smaller.

todd

···


“This UI has been brought to you by the letters ‘S’ and ‘K’, and the runlevel 3.”
- Greg Andrews

Couldn’t the old regex code be loaded with a require? That way people could
convert at their leisure.

···

On 6/5/02 9:35 AM, “Todd Holloway” todd@duckland.org wrote:

Larry Wall is redesigning regular expressions, in his latest “Apocalypse 5”.

Will Ruby be making a similar change? I’m no expert at regexes, but
if it can make them easier to read/write, it sounds like a good idea.

But is it worth breaking quite a bit of previously written code?
Ruby may have a big advantage in that its code base is quite a bit
smaller.


Heaven is under our feet as well as over our heads. -Henry David Thoreau,
naturalist and author (1817-1862)

Larry Wall is redesigning regular expressions, in his latest
“Apocalypse 5”.

Read it, came out with a headache :)

Start at page 6 if you want to see his proposed changes.

Will Ruby be making a similar change? I’m no expert at regexes, but if
it can make them easier to read/write, it sounds like a good idea.

Unfortunately, Larry seems to be putting quite a bit of effort into
embedding Perl directly into the regexp system with expressions like:

/(\d+) { $1 < 582 or fail }/

/(.*) { print .pos }/

/ (\S*) { let $x = .pos } \s* foo /

/ @x := [ (\S+) (\s*) ]* /

/ <@rule_from_array> <%rule_from_hash> <&rule_from_sub()> /

These sorts of expressions litter A5, and I think they will make adopting
the new syntax in other languages quite painful.
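
For comparison, a check like the first A5 example can be written in present-day Ruby by matching first and testing afterwards in ordinary code. This is only a minimal sketch: `first_number_below` is a hypothetical helper, not something from A5 or from Ruby's standard library, and it does not reproduce the backtracking that `fail` would trigger inside a Perl 6 rule.

```ruby
# Hypothetical helper: return the first run of digits in +text+ whose
# numeric value is below +limit+, or nil if there is none.
# The regexp only finds digit runs; the comparison happens in plain Ruby.
def first_number_below(text, limit)
  text.scan(/\d+/)
      .map { |digits| Integer(digits, 10) } # base 10, so "007" is 7, not octal
      .find { |number| number < limit }
end

p first_number_below("id 900, then 42", 582)
p first_number_below("1000", 582)
```

The point of the sketch is that the condition lives outside the pattern, so the regexp itself stays portable between tools.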

I’m not convinced by his other changes; pushing the most common things into
the shortest possible set of characters is a very Perlish desire, and in
making such changes everyone’s going to have to re-learn their syntax
and how they write regexps while keeping the old stuff around for pretty
much every other tool in existence.

I can see lots of people looking at the new-style Perl expressions and
wasting time thinking “Uh?” before they remember […] is not a character
class anymore, and that . really is matching everything, and that
whitespace needs to be escaped, and…
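
Ruby already offers whitespace-insensitive patterns as an opt-in through the `/x` flag, which gives a rough feel for the A5 default. The sketch below assumes a current Ruby; the constant names are made up for illustration.

```ruby
# Without /x, every space in the pattern must match a literal space.
LITERAL_SPACES = / (\S*) \s* foo /

# With /x, unescaped whitespace and "#" comments in the pattern are ignored.
FREE_SPACING = /
  (\S*)   # capture a run of non-whitespace
  \s*     # optional whitespace
  foo
/x

# Under /x a literal space has to be escaped.
ESCAPED_SPACE = / a \  b /x

text = "bar   foo"
p LITERAL_SPACES.match(text)     # nil: the text has no trailing space
p FREE_SPACING.match(text)[1]    # "bar"
p ESCAPED_SPACE.match?("a b")    # true
```

Because the flag is explicit, old and new spellings can sit side by side in one program, which is the coexistence problem raised above.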

But is it worth breaking quite a bit of previously written code?

No. Larry wants to do this now because he’s breaking everything anyway;
unless Rite contains a similar amount of breakage I wouldn’t like to see
our regexp engine copy Perl 6. Even then, I’m not convinced it’s a good
idea to blindly follow them.

Pages 2-5 do contain some useful discussion regarding the shortcomings
of current regexp stuff, and I’d be happy to be able to c&p regexps from
Perl 6 that don’t embed Perl (as I’m sure most won’t, despite what Larry
seems to want), but I’d want to do it using UberRegexp.new or some
alternative foo// syntax or something.

Ideally I’d like to see something like this come from many different
places; if Ruby, Python, Java, etc can come up with a mutually inclusive
matching/parsing system like Perl 6 without falling into the “Gee Wizz,
I can put Perl in this and make it even more unreadable!” trap we’re
much more likely to get somewhere productive.

Ruby may have a big advantage in that its code base is quite a bit
smaller.

Well, there’s no code to worry about yet; Perl 6 is, um, ages away, so
there’s no reason to panic. We may even find Perl 6 flops as its users
discover it’s easier just to switch to Python/Ruby/Java/Cobol/Assm than
learn all the stuff that’s changing (grin).

···


Thomas ‘Freaky’ Hurst - freaky@aagh.net - http://www.aagh.net/

Critic, n.:
A person who boasts himself hard to please because nobody tries
to please him.
– Ambrose Bierce, “The Devil’s Dictionary”

Unfortunately, Larry seems to be putting quite a bit of effort into
embedding Perl directly into the regexp system with expressions like:

/(\d+) { $1 < 582 or fail }/

/(.*) { print .pos }/

/ (\S*) { let $x = .pos } \s* foo /

/ @x := [ (\S+) (\s*) ]* /

/ <@rule_from_array> <%rule_from_hash> <&rule_from_sub()> /

These sorts of expressions litter A5, and I think will make other
languages adopting them quite painful.

There’s nothing to say that Ruby couldn’t be embedded just as easily.
And remember, as A5 pointed out, regexes are a language of their own.

  /(\d+) { $1 < 582 or fail }/

  /(.*) { print .pos }/

  / (\S*) { let x = .pos } \s* foo /

  / #{x} := [ (\S+) (\s*) ]* /

  / <#{rule_from_array}> <#{rule_from_hash}> <#{rule_from_sub()}> /

I’m not convinced by his other changes; pushing the most common things into
the shortest possible set of characters is a very Perlish desire, and in
making such changes everyone’s going to have to re-learn their syntax
and how they write regexps while keeping the old stuff around for pretty
much every other tool in existence.

You say perlish, I say huffman coding.

I can see lots of people looking at the new-style Perl expressions and
wasting time thinking “Uh?” before they remember […] is not a character
class anymore, and that . really is matching everything, and that
whitespace needs to be escaped, and…

Ahh, but it’s so much clearer. I would assume that a ruby user would like
clearer, especially one who’s complaining about perl being unreadable.

But is it worth breaking quite a bit of previously written code?

No. Larry wants to do this now because he’s breaking everything anyway;
unless Rite contains a similar amount of breakage I wouldn’t like to see
our regexp engine copy Perl 6. Even then, I’m not convinced it’s a good
idea to blindly follow them.

Yes. Just think… the power of full grammars in ruby. What power that
is! It’d be a shame not to fix the shortcomings in regexes (again…
they’re a language, not just a string).

Well, there’s no code to worry about yet; Perl 6 is, um, ages away, so
there’s no reason to panic. We may even find Perl 6 flops as its users
discover it’s easier just to switch to Python/Ruby/Java/Cobol/Assm than
learn all the stuff that’s changing (grin).

Well a year to a year and a half may seem ages away to you, but not to
me. And it won’t flop. Perl is much more powerful than the languages you
listed, including ruby, and especially the joking suggestions.

  md |- m:att d:iephouse

Could you perhaps change your quote string to ‘> ’ rather than just
‘>’? Omitting the space makes reflowing more difficult.

There’s nothing to say that ruby couldn’t just as easily be
implemented. And remember, as A5 pointed out, regexes are a language
of their own.

 /(\d+) { $1 < 582 or fail }/

I was under the impression the closures were snippets of Perl code, not
parts of the regexp itself.

 / <#{rule_from_array}> <#{rule_from_hash}> <#{rule_from_sub()}> /

Whose syntax will differ between languages, and be completely useless in
static languages.

You say perlish, I say huffman coding.

Huffman coding is for compression algorithms, not programming languages.
By this logic, Ruby would be improved by changing:

class Foo → c Foo

module Foo → m Foo

etc.

The world is not going to end because I need to type two extra
characters to achieve non-capturing grouping, but it might end if I
constantly mistake […] for it.
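
To make the distinction concrete, here is how the three constructs behave in today's Ruby syntax, where `(?:...)` is the non-capturing group and `[...]` is a character class. The constant names are invented for this sketch.

```ruby
CAPTURING     = /(foo|bar)+(\d)/    # the group is remembered as capture 1
NON_CAPTURING = /(?:foo|bar)+(\d)/  # the group only groups; the digit is capture 1
CHAR_CLASS    = /[fobar]+(\d)/      # any mix of the letters f, o, b, a, r

p CAPTURING.match("foobar7").captures      # ["bar", "7"]
p NON_CAPTURING.match("foobar7").captures  # ["7"]

# The character class accepts input that the groups reject.
p CHAR_CLASS.match("oof7").captures        # ["7"]
p NON_CAPTURING.match("oof7")              # nil
```

Reading `[foo|bar]` as a group when it is a class, or the other way round, silently changes what matches, as the last two lines show.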

I can see lots of people looking at the new-style Perl expressions
and wasting time thinking “Uh?” before they remember […] is not a
character class anymore, and that . really is matching everything,
and that whitespace needs to be escaped, and…

Ahh, but it’s so much clearer.

I’ve yet to be convinced it’s “much” clearer, certainly not when the new
syntax is so similar to the old one, both of which are going to have to
co-exist for the foreseeable future.

[useless provocative statement removed]

Yes. Just think… the power of full grammars in ruby. What power that
is!

Yes, a bit less than the power of the parser frameworks we already have,
only less well defined.
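
As a small taste of grammar-like matching inside a regexp, later Ruby versions (1.9 onwards) allow a named group to call itself with `\g<name>`. The sketch below assumes such a Ruby; `BALANCED` is a name made up here, and this is nowhere near a full parser framework.

```ruby
# Matches a string that is exactly one balanced, possibly nested,
# parenthesised group. \g<group> re-enters the named group recursively.
BALANCED = /\A(?<group>\((?:[^()]|\g<group>)*\))\z/

p BALANCED.match?("(a(b)c)")  # true
p BALANCED.match?("(a(b c)")  # false: one parenthesis is never closed
p BALANCED.match?("()")       # true
```

Anything with more structure than this, such as error reporting or building a tree, is still easier to express with a dedicated parser library.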

It’d be a shame not to fix the shortcomings in regexes (again…
they’re a language, not just a string).

They’re not a language, they’re a collection of languages. We already
have POSIX regexps and various degrees between Perl regexp and POSIX,
now we’re getting a completely new language which looks almost identical
to the old ones but which does completely different things to all
previous implementations.

Well, there’s no code to worry about yet; Perl 6 is, um, ages
away, so there’s no reason to panic.

Well a year to a year and a half may seem ages away to you, but not to
me.

A year and a half is about 7% of my life, so yes, it seems a long time
to me :)

And it won’t flop. Perl is much more powerful than the languages you
listed, including ruby, and especially the joking suggestions.

I don’t think it’s yet been proved that Perl is anything beyond Turing
complete ;)

···


Thomas ‘Freaky’ Hurst - freaky@aagh.net - http://www.aagh.net/

The sunlights differ, but there is only one darkness.
– Ursula K. LeGuin, “The Dispossessed”