Alternate Regular Expressions?

Thanks! I was actually thinking about this myself.

Please, people, send an email if you want to see something in a Ruby RegExp wrapper. Don't be shy. If I can get drafted into making this, then you can tell me what you want to see.

Thanks,
Ari

On Aug 7, 2007, at 12:55 PM, Kenneth McDonald wrote:

Speaking as someone who has actually written and used (in Python) a more abstract regex library,
the biggest problem with regular expressions in most languages isn't the syntax, but rather
the inability to easily compose small REs into larger REs. Which is why so many programs end
up with huge, unreadable REs. As a small example, it's really nice (and obvious) to be able to say

   re3 = re1 + re2

instead of

   re3 = "(?:#{re1})(?:#{re2})"

And the advantages go well beyond the convenience illustrated in the above example...
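
To illustrate why the non-capturing groups in the second form matter, here is a minimal Ruby sketch. The helper names `concat_naive` and `concat_grouped` are made up for this example and do not come from any library; the point is only that joining raw sources lets an alternation in one piece swallow the other piece.

```ruby
# Hypothetical helpers (not part of Ruby or any gem): two ways to join sources.
def concat_naive(a, b)
  Regexp.new(a.source + b.source)
end

def concat_grouped(a, b)
  # Each piece is wrapped in (?:...) so its alternation stays contained.
  Regexp.new("(?:#{a.source})(?:#{b.source})")
end

re1 = /cat|dog/
re2 = /s/

naive   = concat_naive(re1, re2)    # source is "cat|dogs"
grouped = concat_grouped(re1, re2)  # source is "(?:cat|dog)(?:s)"

puts naive.match?("cat")     # true, even though no "s" follows
puts grouped.match?("cat")   # false
puts grouped.match?("cats")  # true
```

A composition operator that hides this grouping is what makes `re1 + re2` safe to write without thinking about precedence.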

Also, I think that people who are accustomed to regular expressions (or any other DSL) tend
to forget about the problems with that DSL; the need for newcomers to learn another syntax,
the inability to use standard language tools with the DSL, and so on.

So, though I've used REs for years, I certainly don't agree with the contention that "REs
are actually pretty good". RE syntax in RE languages is optimized for quickly entering one-time
REs on the command line, not for building robust REs that can be easily maintained by
other programmers. It's the difference between weird, Perl-style variables, and meaningful
variable names. A good abstract wrapper in Ruby would be very useful.

Ken

Yossef Mendelssohn wrote:

--------------------------------------------|
If you're not living on the edge,
then you're just wasting space.

Kenneth McDonald wrote:

Speaking as someone who has actually written and used (in Python) a more abstract regex library,
the biggest problem with regular expressions in most languages isn't the syntax, but rather
the inability to easily compose small REs into larger REs. Which is why so many programs end
up with huge, unreadable REs. As a small example, it's really nice (and obvious) to be able to say

   re3 = re1 + re2

I agree with this and that's why I have the following add-on in my standard lib:

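class Regexp
   # Concatenate this pattern with another Regexp, or with any other
   # object whose string form should be matched literally.
   def +(other)
     if other.is_a?(Regexp)
       if self.options == other.options
         # Same flags: the two sources can be joined directly.
         Regexp.new(source + other.source, options)
       else
         # Different flags: other.to_s embeds them as (?flags:...).
         Regexp.new(source + other.to_s, options)
       end
     else
       # Not a Regexp: escape it so it matches as literal text.
       Regexp.new(source + Regexp.escape(other.to_s), options)
     end
   end
end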

It could easily be improved so that, for example, a range would get appended as a character class, etc.
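
One possible shape for that improvement, as a sketch only: the variant below accepts a Range and appends it as a character class. It assumes single-character endpoints such as "a".."z" and does not handle exclusive ranges. Unlike the version above, it also wraps the left-hand side in a non-capturing group so that an alternation on the left cannot absorb the appended piece; that wrapping is a choice made for this example, not part of the original add-on.

```ruby
# Sketch only: a variation on the Regexp#+ idea that also accepts a Range.
class Regexp
  def +(other)
    piece =
      case other
      when Regexp
        other.to_s                    # "(?flags:...)" keeps other's own flags
      when Range
        # Assumes single-character endpoints, e.g. "a".."z".
        "[#{Regexp.escape(other.begin.to_s)}-#{Regexp.escape(other.end.to_s)}]"
      else
        Regexp.escape(other.to_s)     # anything else is matched literally
      end
    # Group the left side so its alternation stays contained.
    Regexp.new("(?:#{source})#{piece}", options)
  end
end

token = /\d+/ + ("a".."z") + "."

p token.match?("42x.")   # true
p token.match?("42x!")   # false
```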

Daniel

Kenneth McDonald wrote:

the biggest problem with regular expressions in most languages isn't the
syntax, but rather
the inability to easily compose small REs into larger REs. Which is why
so many programs end
up with huge, unreadable REs.

You can do that in Ruby rather simply:
# example taken from an example earlier in this thread
name = /[a-z]+/i
host = /[a-z]+/i
tld = /com|net|org|edu/
input.scan(%r{\b#{name}@#{host}\.#{tld}\b}) do |match|
   puts "Found email address #{match}"
end
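
The reason interpolation composes safely is that it goes through Regexp#to_s, which wraps each piece in a non-capturing group carrying its own flags. A small sketch, using a made-up sample string for `input`:

```ruby
name = /[a-z]+/i
tld  = /com|net|org|edu/

# Regexp#to_s is what #{...} uses inside a regex literal.
puts name.to_s   # (?i-mx:[a-z]+)
puts tld.to_s    # (?-mix:com|net|org|edu)

combined = %r{\b#{name}@#{name}\.#{tld}\b}

# Sample text invented for this illustration.
input = "Write to Alice@Example.com or bob@example.xyz"
p input.scan(combined)   # ["Alice@Example.com"]
```

Because each piece keeps its own flags, the name part stays case-insensitive while the tld part stays case-sensitive.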

Regards
Stefan

--
Posted via http://www.ruby-forum.com/.

I was actually two unread emails away from writing to the list to thank everyone for their help and to say that I would only write the wrapper if someone really wanted me to.

Looks like I'm writing it.

-Ari

On Aug 7, 2007, at 11:15 AM, John Joyce wrote:

Ari,

Do it!
Excellent project, even if it fails in the long run or you pass it off to somebody else.

I like the Rails-like, hash-looking idea. Of course you would need some ordering, so it would have to be some kind of array or struct, but it is an idea worth toying with.

Ari
-------------------------------------------|
Nietzsche is my copilot

Did you think about something like this (attached)? This is just a
raw hack to illustrate a possible way to do it.

Kind regards

robert

textual-rx.rb (3.39 KB)

2007/8/7, Ari Brown <ari@aribrown.com>:

Please, people, send an email if you want to see something in a Ruby
RegExp wrapper. Don't be shy. If I can get drafted into making this,
then you can tell me what you want to see.

rather
the inability to easily compose small REs into larger REs. Which is why
so many programs end
up with huge, unreadable REs. As a small example, it's really nice (and
obvious) to be able to say

   re3 = re1 + re2

I agree with this and that's why I have the following add-on in my
standard lib:

class Regexp
   def +(other_regex)
   ...
   end
end

Note also that you can do

re3 = /#{re1}#{re2}/ which fits my needs pretty well.

-r


Did you think about something like this (attached)? This is just a
raw hack to illustrate a possible way to do it.

Quote:

mail_addr = TextualRegexp.new do
  anchor :beginning

  group :capturing do
    at_least_once { any "a-z" }
  end

  literal "@"

  repeat 1..4 do
    at_least_once { any "a-z" }
    literal "."
  end

  any %w{com edu org}
end
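
The attached textual-rx.rb is not reproduced in the thread, so the following is only a guess at how a block-based builder like this could work, not Robert's implementation. The class name `TextualRegexpSketch`, the `to_regexp` method and the `nested` helper are all invented for this illustration. It uses instance_eval to run the block against a builder that collects pattern fragments.

```ruby
# Hypothetical sketch; the real textual-rx.rb is not shown in the thread.
class TextualRegexpSketch
  ANCHORS = { :beginning => "\\A", :end => "\\z" }

  def initialize(&block)
    @parts = []
    instance_eval(&block)   # run the DSL block with self as the builder
  end

  def to_regexp
    Regexp.new(@parts.join)
  end

  def anchor(where)
    @parts << ANCHORS.fetch(where)
  end

  def literal(text)
    @parts << Regexp.escape(text)
  end

  # A String is taken as character-class content, an Array as alternatives.
  def any(spec)
    piece =
      if spec.is_a?(Array)
        "(?:#{spec.map { |s| Regexp.escape(s) }.join('|')})"
      else
        "[#{spec}]"
      end
    @parts << piece
  end

  def group(kind = :capturing, &block)
    opener = kind == :capturing ? "(" : "(?:"
    @parts << opener + nested(&block) + ")"
  end

  def at_least_once(&block)
    @parts << "(?:#{nested(&block)})+"
  end

  def repeat(range, &block)
    @parts << "(?:#{nested(&block)}){#{range.min},#{range.max}}"
  end

  private

  # Run a nested block and return the fragments it emitted as one string.
  def nested(&block)
    outer, @parts = @parts, []
    instance_eval(&block)
    @parts.join
  ensure
    @parts = outer
  end
end

mail_addr = TextualRegexpSketch.new do
  anchor :beginning
  group :capturing do
    at_least_once { any "a-z" }
  end
  literal "@"
  repeat 1..4 do
    at_least_once { any "a-z" }
    literal "."
  end
  any %w{com edu org}
end

rx = mail_addr.to_regexp
puts rx.source
p rx.match("ari@lists.example.org")[1]   # "ari"
```

Collecting string fragments keeps the sketch short; a fuller design might build a tree of nodes instead so that the pattern can be inspected or rewritten before compiling.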

I like this! A readable DSL for regular expressions.

Regards,
Pit

2007/8/8, Robert Klemme <shortcutter@googlemail.com>: