Alternate Regular Expressions?

Ari_Brown · 7 August 2007 01:12

Just randomly curious -

Is there an alternate RegExp "language" to the current one in Ruby and Perl?

Don't get me wrong, I love the current RegExp in Ruby, but I'm allowed to be curious...

Also, is Ruby going to jump on the PERL 6 RegExp ship?

^^^^^^^ That's a big one to some people I know.

Thanks,
~ Ari
English is like a pseudo-random number generator - there are a bajillion rules to it, but nobody cares.

Phlip1 · 7 August 2007 01:40

Ari Brown wrote:

Just randomly curious -

Is there an alternate RegExp "language" to the current one in Ruby and Perl?

I don't know. So here's a dissertation on where to start.

The good news is a RegExp is only two things at heart...

- a Domain-Specific Language to program
- a state machine.

The bad news is, back in the day, people used to invent DSLs as long strings of easily parsed characters. For example, a language called LSYSTEM might describe turtle graphics like this:

s=[::cc!!!!&&[FFcccZ]^^^^FFcccZ] # upper spikes

The really bad news is RegExp is one of these string-oriented DSLs that stuck. It will always be useful, so programmers forget how much room it has for improvement.

The good news is Ruby excels at generating light DSLs. The equivalent expression for a modern implementation of LSYSTEM might look like this:

upper_spikes = push.twist(2).thinner(2).increase_angle(4)....

etc. Because Ruby gives your programming interfaces extreme notational flexibility, you can declare the interfaces most convenient for your domain.

So start writing! and research other DSLs as you go. For example, here's a DSL written with C++ metaprogramming:

http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/index.html

Whenever you like, that language slips back to raw RegExp. Your effort should have a similar shunt.

English is like a pseudo-random number generator - there are a bajillion rules to it, but nobody cares.

Of all the world's languages, English is both the ugliest and the beautifulest.

···

--
  Phlip
  "Test Driven Ajax (on Rails)"
  assert_xpath, assert_javascript, & assert_ajax

Wolfgang_Nadasi-Donn · 7 August 2007 11:01

Ari Brown wrote:

Is there an alternate RegExp "language" to the current one in Ruby
and Perl?

Snobol4 patterns are now available as a Python library. It should be
possible to port it to Ruby. I don't think that the implementation is
complete, because I didn't see the possibility of recursive pattern
definitions, which give Snobol4 the extreme power.

Infos

http://permalink.gmane.org/gmane.comp.python.announce/7217 (Snobol4 in
Python)

SNOBOL - Wikipedia (has some links)

Wolfgang Nádasi-Donner

···

--
Posted via http://www.ruby-forum.com/.

Albert_Schlef · 24 December 2009 02:27

Ari Brown wrote:

Just randomly curious -

Is there an alternate RegExp "language" to the current one in Ruby
and Perl?

There's a RegExp library for Common Lisp that, besides a string, also
accepts a "parse tree". See CL-PPCRE - Portable Perl-compatible regular expressions for Common Lisp

The regexp string...

(?:abc){3,5}

...is equal to the following data structure:

(:GREEDY-REPETITION 3 5 (:GROUP "abc"))

which, in Ruby, looks like:

[:"GREEDY-REPETITION", 3, 5, [:GROUP, "abc"]]

This is not really a different language, just a way to express the
regexp string as a data-structure.
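To make the parse-tree idea concrete, here is a minimal sketch of walking such a nested array and turning it back into a regexp string. The function name `tree_to_source` and the lowercase symbol names are my own Ruby-friendly assumptions, not CL-PPCRE's API, and only the two node types from the example are handled.

```ruby
# Hypothetical sketch: converts a tiny CL-PPCRE-style parse tree into regexp source.
# Only :group and :greedy_repetition are supported; everything else raises.
def tree_to_source(node)
  case node
  when String
    Regexp.escape(node)                       # literal text, escaped
  when Array
    op, *args = node
    case op
    when :group
      "(?:#{args.map { |a| tree_to_source(a) }.join})"
    when :greedy_repetition
      min, max, child = args
      "#{tree_to_source(child)}{#{min},#{max}}"
    else
      raise ArgumentError, "unknown node type: #{op.inspect}"
    end
  else
    raise ArgumentError, "unexpected node: #{node.inspect}"
  end
end

tree   = [:greedy_repetition, 3, 5, [:group, "abc"]]
source = tree_to_source(tree)
regexp = Regexp.new(source)
```

The appeal of the tree form is that it can be built, inspected and transformed with ordinary array code before any regexp string exists.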

···

--
Posted via http://www.ruby-forum.com/.

Ari_Brown · 7 August 2007 01:58

Ugh. If I must (which I must). What would you suggest as syntax?

Also, should I completely try to reinvent the wheel, or create a wrapper for current RegExp?

Man. I need a mentor on this

aRi
--------------------------------------------|
IMO, Arabic has THE most beautiful script.
Poetically, English is extremely beautiful. It's like a language of RegExp - except there are no rules!
Spoken, the most beautiful language is either French (sorry) or Esperanto.

···

On Aug 6, 2007, at 9:40 PM, Phlip wrote:

So start writing! and research other DSLs as you go.

Tim_Hunter4 · 7 August 2007 02:08

Ari Brown wrote:

So start writing! and research other DSLs as you go.

Ugh. If I must (which I must). What would you suggest as syntax?

Also, should I completely try to reinvent the wheel, or create a wrapper for current RegExp?

Man. I need a mentor on this

This might give you a place to start: Parsing expression grammar - Wikipedia

···

On Aug 6, 2007, at 9:40 PM, Phlip wrote:

--
RMagick OS X Installer [http://rubyforge.org/projects/rmagick/]
RMagick Hints & Tips [http://rubyforge.org/forum/forum.php?forum_id=1618]
RMagick Installation FAQ [http://rmagick.rubyforge.org/install-faq.html]

Kenneth_McDonald · 7 August 2007 02:51

Ari,

How serious are you about this? Several years ago I wrote a Python library that treats Python regular
expressions as semantic, not syntactic, objects, and that has been incredibly useful to me. I've started
to port it to Ruby, but simply don't have the time. If you do (you're probably looking at a couple of
weeks of full-time-equivalent hours to do a good job, including decent documentation), I'm happy to pass
on the Python code, the Ruby code, and give advice and so on.

To help you evaluate this, and also as a potential source of ideas in case you do something else, I've
appended my (probably out of date) intro text to the library at the bottom of this reply.

Cheers,
Ken

Text from the _Python_ library (in retrospect, I would do quite a bit differently):

Overview
========
    'rex' provides regular expression and parsing facilities. It uses (and is intended to functionally
    replace) the Python 're' module.

    Regular expression functionality is provided through the '_Rexp' and 'MatchResult' classes,
    and the CHAR, REP0, REP1, OPT, PAT, and ALT constructs.
    These constructs can be used as or provide functions to create rexps, and also define
    attributes for commonly used rexps. (For example, PAT.float provides a rexp
    which matches basic floating-point (no exponent) numbers.)

    Pattern-Matching Example
    ------------------------

    If you are familiar with regular expressions, the following will probably make at
    least some sense. If you are not, skip this example for now. In either case, come
    back to it once you have read the formal definitions of functions and
    constructs provided by rex.

        COMPLEX = PAT.float['re'] + \
                  REP0.whitespace + \
                  ALT("+", "-")['op'] + \
                  REP0.whitespace + \
                  PAT.float['im'] + \
                  'i'

    The above example defines a pattern which will match complex
    numbers of the form "-2.718 + 3.14i", for example. It uses the predefined
    match expressions PAT.float and REP0.whitespace to
    ease the definition. Applied to the example complex number string, the result will contain three
    named substrings: 're' will map to "-2.718", 'op' will map to "+", and 'im' will map to "3.14".

    SEQ is an alternative form of joining rexps; the above is equivalent to:

        COMPLEX = SEQ(
            PAT.float['re'],
            REP0.whitespace,
            ALT("+", "-")['op'],
            REP0.whitespace,
            PAT.float['im'],
            'i'
        )

Regular Expressions
-------------------

    This is an introduction to using the pattern-matching (regular-expression-related)
    part of rex. See documentation associated
    with a specific method/function/name for details on that entity.

    In the following, we use the abbreviation RE to refer to standard regular
    expressions defined as strings, and the word 'rexp' to refer to rex objects
    which denote regular expressions.

    The starting point for building a rexp is either rex.PAT,
    which we'll just call PAT, or rex.CHAR, which we'll just call CHAR, or rex.LIT.
    CHAR provides rexps defining a set of characters, which
    will match a single-character string if that character is in the given
    set. In addition to providing attributes which provide prebuilt character
    sets, the CHAR function may be used to define your own character
    sets.

    LIT builds rexps which match strings of varying lengths.

    REP0 and REP1 denote zero or more and one or more repetitions, respectively.

    Also:

        - PAT._someattribute_ returns (for defined attributes) a corresponding rexp.
            For example, PAT.stringstart returns a rexp matching at the start of a string.
        - CHAR(a1, a2, ...) returns a rexp matching a single character from a set
            of characters defined by its arguments. For example, CHAR("-", ["0","9"], ".")
            matches the characters necessary to build basic floating point numbers.
            See CHAR docs for details.
        - CHAR._someattribute_ returns (for defined attributes) a corresponding rexp
            defining a set of characters.
            For example, CHAR.digit returns a rexp matching a single digit.

    Now assume that X, Y, Z, ... are rexps. The following Python expressions
    (_not_ strings) may be used to build more complex rexps:

        - X | Y | Z ... : returns a rexp which matches a string if any of the operands
            match that string. Similar to "X|Y|Z" in normal REs, except of course you can't
            use Python code to define a normal RE.
        - X + Y + Z ... : returns a rexp which matches a string if all of X, Y, Z match consecutive
            substrings of the string in succession. Like "XYZ" in normal REs.
        - X*n : returns a rexp which matches a number of times as defined by n.
            This replaces '?', '+', and '*' as used in normal REs. See docs for details.
            'rex' defines constants which allow you to say X*REP0, X*REP1, or X*MAYBE,
            indicating (0 or more matches), (1 or more matches), or (0 or 1 matches),
            respectively.
        - X**n : like X*n, but does nongreedy matching.
        - +X : positive lookahead assertion: matches if X matches, but doesn't
            consume any of the input.
        - ~+X : negative lookahead assertion: matches if X _doesn't_ match,
            but doesn't consume any of the input.
        - -X, ~-X : positive and negative lookback assertions. Like lookahead assertions,
            but in the other direction.
        - X[name] : name must be a string: anything matched by X can be referred
            to by the given name in the match result object. (This is the equivalent
            of named groups in the re module.)
        - X.group() : X will be in an unnamed group, referable by number.

    In addition, a few other operations may be performed:

        - Some of the attributes defined in PAT have "natural inverses"; for such
            attributes, the inverse may be taken. For example, ~PAT.digit is
            a pattern matching any character except a digit.
        - Character classes may be inverted: ~CHAR("aeiouAEIOU") returns a pattern
            matching any character except a vowel.
        - 'ALT' gives a different way to denote alternation: ALT(X, Y, Z,...) does
            the same thing as X | Y | Z | ..., except that none of the arguments
            to ALT need be rexps; any which are normal strings will be converted
            to a rexp using PAT.
        - 'SEQ' can take multiple arguments: PAT(X, Y, Z,...), which gives the same
            result as PAT(X) + PAT(Y) + PAT(Z) + ... .

    Finally, a very convenient shortcut is that only the first object in a sequence of
    operator/method calls needs to be a rexp; all others will be automatically
    converted as if LIT(...) had been called on them. For example, the
    sequence X | "hello" is the same as X | LIT("hello").

Phlip1 · 7 August 2007 02:51

Ari Brown wrote:

Ugh. If I must (which I must).

You missed where I said I didn't know the actual answer.

What would you suggest as syntax?

Ruby itself, as a DSL; that was the point.

rx = match('foo') or match('bar') # like /(foo|bar)/
assert_equal [['foo', 'bar']], rx('a foo b bar')

Make match() return an object that overloads the or operator, and away you go!
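A minimal sketch of that idea, under a couple of assumptions: `Pattern`, `match()` and `scan` are hypothetical names chosen for illustration, and the alternation operator is `|` rather than `or`, because Ruby's `or` keyword cannot be redefined while `|` is an ordinary method.

```ruby
# Hypothetical sketch: Pattern and match() are illustrative names, not an existing library.
class Pattern
  attr_reader :regexp

  def initialize(regexp)
    @regexp = regexp
  end

  # Alternation: the combined pattern matches wherever either operand matches.
  def |(other)
    Pattern.new(Regexp.union(regexp, other.regexp))
  end

  # Return every non-overlapping match found in the text.
  def scan(text)
    text.scan(regexp)
  end
end

# Build a pattern that matches the given string literally.
def match(literal)
  Pattern.new(Regexp.new(Regexp.escape(literal)))
end

rx    = match('foo') | match('bar')   # like /foo|bar/
found = rx.scan('a foo b bar')
```

Because `Regexp.union` produces no capture groups, `scan` returns the matched substrings directly.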

···

--
  Phlip
  "Test Driven Ajax (on Rails)"
  assert_xpath, assert_javascript, & assert_ajax

Marnen_Laibow-Koser · 20 December 2009 05:00

Ari Brown wrote:

So start writing! and research other DSLs as you go.

Ugh. If I must (which I must). What would you suggest as syntax?

Also, should I completely try to reinvent the wheel, or create a
wrapper for current RegExp?

Man. I need a mentor on this

I would suggest taking a look at Treetop, both as an easy-to-use parser
generator and as an inspiration for regexp extensions. But I mostly
like regexps the way they are.

aRi
--------------------------------------------|
IMO, Arabic has THE most beautiful script.

Ever looked at Mongolian (Uighur) script?

Poetically, English is extremely beautiful. It's like a language of
RegExp - except there are no rules!

Uh, what? (I know that was intended to be funny -- I just don't get
it.)

Spoken, the most beautiful language is either French (sorry) or
Esperanto.

Hmmm...

Best,

···

On Aug 6, 2007, at 9:40 PM, Phlip wrote:

--
Marnen Laibow-Koser
http://www.marnen.org
marnen@marnen.org
--
Posted via http://www.ruby-forum.com/.

Ari_Brown · 7 August 2007 03:10

I'm moderately serious. This is going to be one of those projects that won't see the light of day for maybe 6 months to a year.
This looks largely like what I was hoping to make, although in Ruby I had envisioned this:

matching email addresses (sample):
a = LeetExp.new(:letters => [[a-z], :insensitive],
        :string => "@",
        :letters => [[a-z], :insensitive],
        :string => ".",
        :string => ["com", "net", "org", "edu"]
)

case line
when a
# ...

My idea is to make it logical and human readable. Ruby is a language for humans and UberBeings, and I think this should reflect Ruby's ideas.

Also, was your library a wrapper for underlying PERL RegExp? or was it the whole RegExp engine?

Thanks,
Ari

--------------------------------------------|
If you're not living on the edge,
then you're just wasting space.

Brian_Candler · 21 December 2009 15:39

Phlip wrote:

rx = match('foo') or match('bar') # like /(foo|bar)/

Aside: you either need to use '||' instead of 'or', or you need extra
parentheses, i.e.

rx = (match('foo') or match('bar'))

otherwise it parses as

(rx = match('foo')) or match('bar')
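The precedence difference is easy to check with plain values. This is only an illustrative sketch; `first_choice` and `second_choice` are made-up stand-ins for the two `match` calls.

```ruby
# Stand-ins for the two calls; the first returns nil so the difference is visible.
def first_choice
  nil
end

def second_choice
  "bar"
end

# 'or' binds more loosely than '=', so this is (with_or = first_choice) or second_choice.
with_or = first_choice or second_choice

# '||' binds more tightly than '=', so the whole expression is assigned.
with_pipes = first_choice || second_choice

# Explicit parentheses give 'or' the same effect.
with_parens = (first_choice or second_choice)
```

Here `with_or` ends up `nil`, while the other two variables hold `"bar"`.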

I don't really have a problem with regexps as they are. Although I'd
like to have more limited, *true* regexps, which compile to a DFA and
never backtrack.

···

--
Posted via http://www.ruby-forum.com/.

Kenneth_McDonald · 7 August 2007 03:29

Ari Brown wrote:

I'm moderately serious. This is going to be one of those projects that won't see the light of day for maybe 6 months to a year.
This looks largely like what I was hoping to make, although in Ruby I had envisioned this:

matching email addresses (sample):
a = LeetExp.new(:letters => [[a-z], :insensitive],
                :string => "@",
                :letters => [[a-z], :insensitive],
                :string => ".",
                :string => ["com", "net", "org", "edu"]
)

case line
when a
    # ...

My idea is to make it logical and human readable. Ruby is a language for humans and UberBeings, and I think this should reflect Ruby's ideas.

Reflecting on my own experience, I'd suggest a less verbose notation, and one that uses Ruby idioms more. For example:

letters = CharClass.new('a'..'z').case_insensitive
a = letters + "@" + letters + "." + (Literal.new("com") | "net" | "org" | "edu")

It's not at all difficult to do this with Ruby. Strings can be used for literals and character classes, and
ranges are perfect for use as char ranges in character classes.

Also, the ability to safely combine regular expressions (as shown above, where "letters" is used in "a")
is _paramount_ in making this sort of wrapper really useful.
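For anyone who wants to experiment, here is one way such a wrapper could be sketched. None of this is the actual 'rex' port: `Rexp`, `Literal`, `CharClass`, `one_or_more` and `to_regexp` are assumed names, and `one_or_more` is added because a bare character class only matches a single character.

```ruby
# Hypothetical sketch of a composable regexp wrapper; not an existing library.
class Rexp
  attr_reader :source

  def initialize(source)
    @source = source
  end

  # Plain strings are treated as literals, so only the first operand must be a Rexp.
  def self.wrap(obj)
    obj.is_a?(Rexp) ? obj : Literal.new(obj.to_s)
  end

  # Non-capturing group, so a fragment keeps its meaning when combined with others.
  def grouped
    "(?:#{source})"
  end

  def +(other)
    Rexp.new(grouped + Rexp.wrap(other).grouped)
  end

  def |(other)
    Rexp.new("#{grouped}|#{Rexp.wrap(other).grouped}")
  end

  def one_or_more
    Rexp.new("#{grouped}+")
  end

  def case_insensitive
    Rexp.new("(?i:#{source})")
  end

  def to_regexp
    Regexp.new(source)
  end
end

class Literal < Rexp
  def initialize(text)
    super(Regexp.escape(text))
  end
end

class CharClass < Rexp
  # Accepts single-character strings and ranges such as 'a'..'z'.
  def initialize(*members)
    body = members.map do |m|
      m.is_a?(Range) ? "#{Regexp.escape(m.begin)}-#{Regexp.escape(m.end)}" : Regexp.escape(m)
    end.join
    super("[#{body}]")
  end
end

letters  = CharClass.new('a'..'z').case_insensitive.one_or_more
email    = letters + "@" + letters + "." + (Literal.new("com") | "net" | "org" | "edu")
email_re = email.to_regexp
```

Wrapping every fragment in a non-capturing group is what makes the combination safe: `letters` can be reused inside `email` without its alternations or quantifiers leaking into the neighbours.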

Also, was your library a wrapper for underlying PERL RegExp? or was it the whole RegExp engine?

It was in Python; instances of my 'rex' class simply construct and use Python patterns, and their associated
functions, internally and invisibly to the user.

Ken

Robert_K1 · 7 August 2007 05:34

I'm moderately serious. This is going to be one of those projects that won't see the light of day for maybe 6 months to a year.
This looks largely like what I was hoping to make, although in Ruby I had envisioned this:

matching email addresses (sample):
a = LeetExp.new(:letters => [[a-z], :insensitive],
                :string => "@",
                :letters => [[a-z], :insensitive],
                :string => ".",
                :string => ["com", "net", "org", "edu"]
)

You cannot do this because Hashes are unordered, so you lose the original order. Also [a-z] is only valid if you define local variables a and z.
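A small sketch of the problem and one possible way around it. Note that from Ruby 1.9 on, hashes do keep insertion order, but a hash still cannot hold the same key twice, so the repeated :letters and :string entries would collapse either way. The array-of-pairs layout and the variable names below are my own assumptions, not a proposed API.

```ruby
# An ordered list of [kind, value] pairs keeps both order and duplicates.
steps = [
  [:letters, "a-z"],
  [:string,  "@"],
  [:letters, "a-z"],
  [:string,  "."],
]

# Converting to a hash shows what the hash-literal form would lose:
# later pairs overwrite earlier ones with the same key.
as_hash        = steps.to_h
collapsed_size = as_hash.size

# Walking the array in order gives a usable regexp source.
ordered_source = steps.map do |kind, value|
  kind == :letters ? "[#{value}]+" : Regexp.escape(value)
end.join

pattern = Regexp.new(ordered_source, Regexp::IGNORECASE)
```

The four steps shrink to two hash entries, while the array version still describes all four parts of the pattern.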

Personally I find regular expressions pretty readable - at least if they are crafted properly. See also below.

case line
when a
# ...

My idea is to make it logical and human readable. Ruby is a language for humans and UberBeings, and I think this should reflect Ruby's ideas.

Do you know the /x modifier? That can go a long way to make a regular expression readable. For example:

input = <<TEXT
adjasdkajda dadkajd foo@bar.com adklskkdaldjskj
postmaster@root.edu adkjasdjk
blah@org akjsd askdl asd noname@foo.net hello
asdj
TEXT

input.scan %r{
   \b # word boundary
   (?i:[a-z]+) # user name
   @ # the famous "at" sign
   (?i:[a-z]+) # host name
   \. # a literal dot
   (?:com|net|org|edu) # only some of the TLDs
   \b # word boundary
}x do |match|
   puts "Found email address #{match}"
end

Kind regards

robert

···

On 07.08.2007 05:10, Ari Brown wrote:

David_A_Black1 · 7 August 2007 08:52

Hi --

I'm moderately serious. This is going to be one of those projects that won't see the light of day for maybe 6 months to a year.
This looks largely like what I was hoping to make, although in Ruby I had envisioned this:

matching email addresses (sample):
a = LeetExp.new(:letters => [[a-z], :insensitive],
        :string => "@",
        :letters => [[a-z], :insensitive],
        :string => ".",
        :string => ["com", "net", "org", "edu"]
)

case line
when a
  # ...

My idea is to make it logical and human readable. Ruby is a language for humans and UberBeings, and I think this should reflect Ruby's ideas.

Regular expressions are nothing if not logical. And readability,
as always, is largely in the eye of the beholder. I think the quest
for an alternative notation is fine, but there's nothing inherently
un-Ruby-like about what's there already. Then again, I'm in a small
minority who find /x with a lot of extra whitespace a serious
impediment to understanding a pattern.

Anyway -- somewhere out there, though I haven't been able to find it,
is a library called Regexp::English by Florian Gross, which provides a
kind of English-language wrapper for regexes. I don't know whether
it's still in development and/or at a point of usability.

David

···

On Tue, 7 Aug 2007, Ari Brown wrote:

--
* Books:
   RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
   RUBY FOR RAILS (http://www.manning.com/black)
* Ruby/Rails training
     & consulting: Ruby Power and Light, LLC (http://www.rubypal.com)

Yossef_Mendelssohn · 7 August 2007 13:11

Ari,

There have been other responses to this already, but I thought I'd
give you something else to look at:

I second (or third, or whatever) the contention that regular
expressions are pretty readable on their own (given some knowledge of
the syntax and good formatting). The thing to keep in mind is that
they're a language of their own. Once you learn the language, you
find you can use it in many a programming language (though there are
some dialectical problems here and there).

--
-yossef

Michael_W_Ryder1 · 8 August 2007 00:24

Ari Brown wrote:

I'm moderately serious. This is going to be one of those projects that won't see the light of day for maybe 6 months to a year.
This looks largely like what I was hoping to make, although in Ruby I had envisioned this:

matching email addresses (sample):
a = LeetExp.new(:letters => [[a-z], :insensitive],
                :string => "@",
                :letters => [[a-z], :insensitive],
                :string => ".",
                :string => ["com", "net", "org", "edu"]
)

Another way to do something like this is to use a "compound" regular expression where each part continues where the last one ended. Something like \@\.\* where $1 would be everything up to the @ sign, i.e. the name. $2 would be everything between the @ and the ., i.e. the ISP name. And $3 would be the remainder of the address. The only time something like this would fail would be an address like mine, where the ISP name is worldnet.att rather than just worldnet.
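One possible reading of this, as a sketch only: the exact pattern in the post did not survive, so the regexp below is my own assumption, built to capture the three pieces described (name, ISP name, remainder). The helper name `split_address` and the sample addresses are placeholders.

```ruby
# Hypothetical helper: splits an address into the three captures described above.
def split_address(address)
  # Group 1: everything up to the @ sign.
  # Group 2: everything between the @ and the first dot.
  # Group 3: the remainder of the address.
  m = /\A([^@]+)@([^.]+)\.(.+)\z/.match(address)
  m && { name: m[1], isp: m[2], rest: m[3] }
end

simple = split_address("foo@bar.com")
nested = split_address("someone@worldnet.att.net")
```

With this pattern the second address yields "worldnet" as the ISP part and "att.net" as the remainder, which is the kind of mis-split the post warns about for multi-part host names.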

Robert_K1 · 21 December 2009 17:29

Nowadays DFAs are rare because NFAs provide more features and you can
use them to your advantage (i.e. prioritizing by ordering
alternatives). You can "switch off" backtracking by using atomic
groups and possessive quantifiers:
http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt
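A quick illustration of both constructs in Ruby 1.9 or later, where the Oniguruma-family engine is built in. The variable names are just for the example.

```ruby
# Ordinary greedy quantifier: a+ grabs all the a's, then backtracks to leave one for the final 'a'.
backtracking = /a+a/

# Atomic group: whatever (?>a+) consumed is never given back.
atomic = /(?>a+)a/

# Possessive quantifier: a++ behaves like the atomic group above.
possessive = /a++a/

subject = "aaa"
backtracking_hit = subject =~ backtracking
atomic_hit       = subject =~ atomic
possessive_hit   = subject =~ possessive
```

Only the first pattern matches "aaa"; the other two fail because nothing is left for the trailing `a` once the repetition has consumed the whole string.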

Kind regards

robert

···

2009/12/21 Brian Candler <b.candler@pobox.com>:

Phlip wrote:
rx = match('foo') or match('bar') # like /(foo|bar)/
Aside: you either need to use '||' instead of 'or', or you need extra
parentheses, i.e.

rx = (match('foo') or match('bar'))

otherwise it parses as

(rx = match('foo')) or match('bar')

I don't really have a problem with regexps as they are. Although I'd
like to have more limited, *true* regexps, which compile to a DFA and
never backtrack.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Simon_Strandgaard2 · 7 August 2007 13:36

[snip]

Anyway -- somewhere out there, though I haven't been able to find it,
is a library called Regexp::English by Florian Gross, which provides a
kind of English-language wrapper for regexes. I don't know whether
it's still in development and/or at a point of usability.

A long time ago I wrote a regexp engine 96% compatible with Ruby's,
at that point in time. Maybe it's useful to somebody?
http://raa.ruby-lang.org/project/regexp/

···

On 8/7/07, dblack@rubypal.com <dblack@rubypal.com> wrote:

--
Simon Strandgaard
http://opcoders.com/

Kenneth_McDonald · 7 August 2007 16:55

Speaking as someone who has actually written and used (in Python) a more abstract regex library,
the biggest problem with regular expressions in most languages isn't the syntax, but rather
the inability to easily compose small REs into larger REs. Which is why so many programs end
up with huge, unreadable REs. As a small example, it's really nice (and obvious) to be able to say

re3 = re1 + re2

instead of

re3 = "(?:#{re1})(?:#{re2})"

And the advantages go well beyond the convenience illustrated in the above example...
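In today's Ruby a good part of this can be approximated without a wrapper, since interpolating a Regexp into another one embeds it as a self-contained group. A minimal sketch follows; the helper name `seq` is my own invention, not a standard method.

```ruby
# Hypothetical helper: joins regexps and plain strings into one sequence.
# Regexp#to_s yields a self-contained group such as (?i-mx:[a-z]+),
# so each part keeps its own flags; strings are escaped as literals.
def seq(*parts)
  Regexp.new(parts.map { |p| p.is_a?(Regexp) ? p.to_s : Regexp.escape(p) }.join)
end

re1 = /\d+/
re2 = /[a-z]+/i
re3 = seq(re1, "-", re2)

# Alternation has a built-in equivalent.
either = Regexp.union(re1, re2)
```

This does not give named, reusable building blocks with their own methods, but it does avoid hand-writing the `(?:...)` wrappers.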

Also, I think that people who are accustomed to regular expressions (or any other DSL) tend
to forget about the problems with that DSL; the need for newcomers to learn another syntax,
the inability to use standard language tools with the DSL, and so on.

So, though I've used REs for years, I certainly don't agree with the contention that "REs
are actually pretty good". RE syntax in RE languages is optimized for quickly entering one-time
REs on the command line, not for building robust REs that can be easily maintained by
other programmers. It's the difference between weird, Perl-style variables, and meaningful
variable names. A good abstract wrapper in Ruby would be very useful.

Ken

Yossef Mendelssohn wrote:

···

I second (or third, or whatever) the contention that regular
expressions are pretty readable on their own (given some knowledge of
the syntax and good formatting). The thing to keep in mind is that
they're a language of their own. Once you learn the language, you
find you can use it in many a programming language (though there are
some dialectical problems here and there).

--
-yossef

James_Britt · 7 August 2007 15:15

Ari,

Do it!
Excellent project, even if it fails in the long run, or if you pass it off to somebody else.

I like the Rails-like hash-looking idea. Of course you would need some ordering, so it would need to be some kind of array or struct, but it is an idea worth toying with.
