Announcing Reg 0.4.0

caleb_clausen · 24 April 2005 02:34

I would like to announce the first version, 0.4.0, of Reg, the Ruby
Extended Grammar. Reg is a library for pattern matching in ruby data
structures. Reg provides Regexp-like match and match-and-replace for
all data structures (particularly Arrays, Objects, and Hashes), not
just Strings.

The Reg RubyForge project: http://rubyforge.org/projects/reg/

The Reg Tarball:
http://rubyforge.org/frs/download.php/4199/reg-0.4.0.tar.bz2

Reg is best thought of in analogy to regular expressions; Regexps are
special data structures for matching Strings; Regs are special data
structures for matching ANY type of ruby data (Strings included, using
Regexps).

This table compares syntax of reg and regexp for various constructs.
Keep in mind that all Regs are ordinary ruby expressions. The special
syntax is achieved by overriding ruby operators.

These abbreviations are used:
re,re1,re2 represent arbitrary regexp subexpressions,
r,r1,r2 represent arbitrary reg subexpressions,
s,t represent any single character (perhaps appropriately escaped, if
the char is magical)
reg                    regexp               #description

+[r1,r2,r3]            /re1re2re3/          #sequence
-[r1,r2]               (re1re2)             #subsequence
r.lit                  \re                  #escaping a magical
regproc{r}             #{re}                #dynamic inclusion
r1|r2 or :OR           (re1|re2) or [st]    #alternation
~r                     [^s]                 #negation (for scalar r and s)
r+0                    re*                  #zero or more matches
r+1                    re+                  #one or more matches
r-1                    re?                  #zero or one matches
r*n                    re{n}                #exactly n matches
r*(n..m)               re{n,m}              #at least n, at most m matches
r+n                    re{n,}               #at least n matches
r-m                    re{,m}               #at most m matches
OB                     .                    #a single item
OBS                    .*                   #zero or more items
BR[1,2]                \1,\2                #backreference ***
>>x or sub             sub,gsub             #search and replace ***

Here are features of reg that don't have an equivalent in regexp:

r.la                   #lookahead ***
~-                     #subsequence negation w/lookahead ***
& or :AND              #all alternatives match
^ or :XOR              #exactly one of alternatives matches
+{r1=>r2}              #hash matcher
-{name=>r}             #object matcher
obj.reg                #turn any ruby object into a reg that matches if
                       #obj.=== succeeds
/re/.sym               #a symbol regex
proceq(klass){rcode}   #a proc{} that responds to === by invoking the
                       #proc's call
OBS as un-anchor       #opposite of ^ and $ when placed at edges of a
                       #reg array (kinda cheesy)
name=r                 #named subexpressions
recursive matches via regvariables&regconstants ***

*** = not implemented yet.
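
To make the operator-overriding idea concrete, here is a minimal sketch in plain Ruby. It is not Reg's code: the class name TinySeq is made up for illustration, and it handles only fixed-length sequences with no quantifiers. It only shows how a unary + on an Array literal can hand back a matcher object.

```ruby
# Toy illustration only; TinySeq is a hypothetical name, not part of Reg.
class TinySeq
  def initialize(parts)
    @parts = parts
  end

  # Matches an Array of the same length whose items each satisfy
  # the corresponding part via ===.
  def ===(other)
    other.is_a?(Array) &&
      other.size == @parts.size &&
      @parts.zip(other).all? { |part, item| part === item }
  end
end

class Array
  # Unary plus: +[Array, Integer] builds a TinySeq from the literal.
  def +@
    TinySeq.new(self)
  end
end

pair = +[Array, Integer]
puts pair === [[1, 2], 3]    # true
puts pair === [[1, 2], "x"]  # false
puts pair === [[1, 2]]       # false
```

Because the matcher answers ===, it nests (a TinySeq can be an element of another TinySeq) and it can be used directly in a case/when.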

Reg is kind of hard to wrap your mind around, so here are some
examples:

Matches array containing exactly 2 elements; 1st is another array, 2nd
is integer:
+[Array,Integer]

Like above, but 1st is array of arrays of symbol
+[+[+[Symbol.reg+0]+0],Integer]

Matches array of at least 3 consecutive symbols and nothing else:
+[Symbol.reg+3]

Matches array with at least 3 symbols in it somewhere:
+[OBS, Symbol.reg+3, OBS]

Matches array of at most 6 strings starting with 'g'
+[/^g/-6] #no .reg necessary for regexp

Matches array of between 5 and 9 hashes containing a key :k pointing to
something non-nil:
+[ +{:k=>~nil.reg}*(5..9) ]

Matches an object with Integer instance variable @k and property (ie
method) foobar that returns a string with 'baz' somewhere in it:
-{:@k=>Integer, :foobar=>/baz/}

Matches array of 6 hashes with 6 as a value of at least one key,
followed by 18 objects with an attribute @s which is a String:
+[ +{OB=>6}*6, -{:@s=>String}*18 ]
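
For comparison, here is roughly what three of the patterns above would look like as hand-written checks in plain Ruby. This is only my reading of the descriptions, not output from Reg, and the lambda names are invented for the sketch.

```ruby
# +[Array, Integer]: exactly two items, an Array then an Integer.
two_items = lambda do |a|
  a.is_a?(Array) && a.size == 2 && a[0].is_a?(Array) && a[1].is_a?(Integer)
end

# +[OBS, Symbol.reg+3, OBS]: a run of 3 or more symbols somewhere.
symbol_run = lambda do |a|
  a.is_a?(Array) &&
    a.each_cons(3).any? { |run| run.all? { |x| x.is_a?(Symbol) } }
end

# +[/^g/-6]: at most 6 items, each a string starting with 'g'.
g_strings = lambda do |a|
  a.is_a?(Array) && a.size <= 6 && a.all? { |s| /^g/ === s }
end

puts two_items.call([[1], 2])              # true
puts symbol_run.call([1, :a, :b, :c, 2])   # true
puts symbol_run.call([:a, 1, :b, :c])      # false
puts g_strings.call(%w[go gem])            # true
```

The point of Reg is that the one-line pattern replaces this kind of hand-rolled predicate.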

Status:
Some highly nested vector reg constructions still don't work quite
right. (For examples, search on eat_unworking in regtest.rb.) A number
of features are unimplemented at this point, most notably
backreferences and substitutions.

Jon_Raphaelson · 24 April 2005 03:51

Ok, I'm going to go out on a limb here and say HOLY GOD THIS IS AWESOME.

Sorry for the shouting.

Ptkwt · 24 April 2005 04:39

Wow.

Just curious: what kind of needs led you to develop this?

Phil

Peter_Suk · 24 April 2005 04:39

This is like too good/weird to be true.

--Peter

On Apr 23, 2005, at 9:34 PM, vikkous wrote:

I would like to announce the first version, 0.4.0, of Reg, the Ruby
Extended Grammar.

--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.

Mathieu_Bouchard · 24 April 2005 05:34

I would like to announce the first version, 0.4.0, of Reg, the Ruby
Extended Grammar. Reg is a library for pattern matching in ruby data
structures. Reg provides Regexp-like match and match-and-replace for
all data structures (particularly Arrays, Objects, and Hashes), not
just Strings.

Can it also match on IO? I'm particularly thinking of a stream implementation that supports unlimited pushback of characters...

Because if it does, then you've got a lexer system that is also good for something other than just a damn lexer.

And by making regexps unified with the rest of the language, it brings Ruby closer to the Icon language, doesn't it?

Anyhow: Congratulations!

(this is really something I wish had existed in 2001 or so).

--
Mathieu Bouchard (Montréal QC)
http://artengine.ca/matju
The Diagram is the Program (TM)

Christian_Neukirche1 · 24 April 2005 09:17

"vikkous" <google@inforadical.net> writes:

I would like to announce the first version, 0.4.0, of Reg, the Ruby
Extended Grammar. Reg is a library for pattern matching in ruby data
structures. Reg provides Regexp-like match and match-and-replace for
all data structures (particularly Arrays, Objects, and Hashes), not
just Strings.

Nifty, nifty, nifty. I really need to have a look at that.

How does it compare to the ML style of argument matching, btw?

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

Lyndon_Samson · 24 April 2005 12:04

It seems similar in spirit to JXPath for Java which lets you use XPath
expressions to access objects, Hashes, Arrays, Maps etc which otherwise
is quite longwinded in Java ( no snickering please ).

http://jakarta.apache.org/commons/jxpath/

--
Into RFID? www.rfidnewsupdate.com Simple, fast, news.

Its_Me · 25 April 2005 17:04

This looks great!

I have not played with it yet, so hope these questions are not off base:

- can I bind variables to (parts of) matches
- have you thought about the connection to duck typing?
- any convenient way to match "all" ... like r*(size..size)

e.g.
http://groups-beta.google.com/group/comp.lang.ruby/browse_frm/thread/f2d02d53531408e/d1f3c7e641a53cdc?q=itsme213+pattern&rnum=1#d1f3c7e641a53cdc

caleb_clausen · 24 April 2005 07:49

That's a long story, and well worth telling.

A long time ago, I wanted a better regexp than regexp. My search ended
when I found an extremely obscure language called gema (the
general-purpose matcher). I'm guessing that I'm the only person to ever
take gema seriously. For a time, I became the world's foremost expert on
gema. Gema is designed around the idea that all computation can be
modeled as pattern and replacement. Everything in gema is pattern and
replacement... essentially everything is done with regexps. I was
fascinated with the idea. This seemed to me to be a much better model
for most programming problems, which typically involve reading input,
transforming it in some way, and writing it out again. Conventional
languages (starting with fortran, and including ruby) are based around
the idea of a program being a long string of formulas. This is great
for math-heavy stuff, but most programming is really about data
manipulation, not math.

But there was trouble in paradise. Gema was wonderful, but weird. The
syntax was cranky. The author had issued one version long ago then
disappeared. Gema code was hard to read, in part because
everythingwasalljammedtogether.
Ifyouinsertspacestomakeitmorereadable,itchangesthesemanticsofyourprogram.
There were strange problems that I never tracked down or fully
characterized. The only data-type was the string. You had to be an
expert at avoiding the invisible pitfalls of the language to get
anywhere. But I did get surprisingly far. I managed to coax gema into
becoming a true parser, and parsing a toy language.

I wanted to write a compiler in gema. Yes, the whole compiler. And
parsing the toy language was already straining its capabilities. It
wasn't the data model; I actually figured out how to model all other
data types using strings. A match-and-replace language is actually much
better suited to most compiler tasks than an algol-like formula
language.

Eventually, I abandoned gema, determined to recreate its glory in a
cleaner form. It was at about this time that I discovered ruby. The
successor to gema was ruma, the ruby matcher. Ruma would be basically
just like gema, but without the problems. Whitespace allowed between
tokens. Proper quotation mechanisms, including nested quotes. And the
language used in the actions (replacements) would be full ruby, instead
of gema's inadequate and crude action language.

Ruma got maybe halfway done... quite a ways, really. As part of ruma, I
needed a ruby lexer to make sense of the actions. This turned out to be
quite a lot harder than I had anticipated; I'm still working on that
lexer.

After grinding away at the lexer for a while, dreaming of ruma in the
meantime, I had a brainstorm. Ruma, like gema, was to be a string-based
language. It only operated on strings. In gema, that was just fine
because everything was strings and you just had to live with that. But
ruby has all these other types, a real type system. Wouldn't it be nice
to have those sophisticated search capabilities for other types too?
Well, since I proved to myself that all data types can be converted to
strings, why not convert the ruby data into strings and then match that
in ruma? Of course, it would be so much nicer to just do the matching
on the data in its original form....

The breakthrough came when I realized how malleable ruby really is. I
had become accustomed to c, which I still love, but in so many ways
it's so much more limited. I didn't really have to write my own parser
and lexer; ruby could do it all for me. I just had to override a bunch
of operators.

After that, it was simple. All I do is override the right operators,
and ruby does the parsing and hands me the match expressions in
already-parsed form. Reg is amazingly small in the end. Most of the
effort and code went into the array matcher, but at least as much
functionality is to be had from the hash and object matchers, which
were trivial.

caleb_clausen · 24 April 2005 08:04

Can it also match on IO? I'm particularly thinking of a stream
implementation that supports unlimited pushback of characters...

I would very much like to do this, but right now, no. I'm not sure
exactly what would be involved in having the array matcher match files
as well; it seems like you might have to rip out the guts of the
backtracking engine to support it... but maybe not. Anyway, stay tuned
for a future release.

Just having the ability to compare regexps directly against files would
be really helpful in the construction of lexers of all sorts. Java has
this; why doesn't ruby?

Because if it does, then you've got a lexer system that is also good
for something other than just a damn lexer.

Lexers, parsers, and pattern matching languages get too short a shrift
in my opinion. There's really a lot more they could be used for, if
only people would see... of course, it doesn't help that almost all
existing tools of this kind are string-oriented, and hard to use for
other data.

And by making regexps unified with the rest of the language, it brings
Ruby closer to the Icon language, doesn't it?

I wouldn't know... please let me know about regexp integration in Icon;
maybe there are some features I can steal.

caleb_clausen · 25 April 2005 22:39

Lyndon Samson wrote:

It seems similar in spirit to JXPath for Java which lets you use XPath
expressions to access objects, Hashes, Arrays, Maps etc which otherwise
is quite longwinded in Java ( no snickering please ).

http://jakarta.apache.org/commons/jxpath/

I took a skim thru this. It seems like XPath is all 'traversal of the
object graph' and thus very much like Reg's Object and Hash matchers.
The Array matcher does that and also matches regexp-like patterns
within arrays. Does XPath have that? I didn't see anything.

caleb_clausen · 26 April 2005 00:49

itsme213 wrote:

- can I bind variables to (parts of) matches

I guess what you're asking for is like the functionality of
backreferences, which aren't implemented yet. Binding a variable is
somewhat different, I guess but allows you to do the same sort of
thing. It could be implemented, not necessarily easily for global vars,
and maybe for others too using Binding.of_caller.

- any convenient way to match "all" ... like r*(size..size)

Uhh... well, OB matches any single object and OBS matches 0 or more of
any object. One great thing about reg is that you can name your own
subexpressions. If you happen to need a matcher that matches -[some long
reg that I don't want to type over and over], you can write:
foo=-[some long reg... etc]
and then use foo everywhere in your larger reg. The lack of this is a
major weakness of regexps, and having it was one of the great things
about gema in comparison.

For instance, modulo optimizations, the definitions of OB and OBS are:
OB=Object.reg
OBS=OB+0
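
To show how "any object, zero or more times" can sit in the middle of a sequence, here is a small backtracking sketch in plain Ruby. It is not Reg's engine: Rep, INF and seq_match? are names invented for this example, and OB/OBS below are toy stand-ins modelled on the two definitions above.

```ruby
# A pattern is a list of Rep triples: a matcher (anything answering ===)
# plus the minimum and maximum number of consecutive items it may consume.
Rep = Struct.new(:matcher, :min, :max)

INF = Float::INFINITY
OB  = Rep.new(Object, 1, 1)     # exactly one item of any kind
OBS = Rep.new(Object, 0, INF)   # zero or more items of any kind

def seq_match?(pattern, items, p = 0, i = 0)
  # Out of pattern parts: succeed only if every item was consumed.
  return i == items.size if p == pattern.size

  rep = pattern[p]
  # Count how many consecutive items this part could consume.
  limit = 0
  while i + limit < items.size && limit < rep.max &&
        rep.matcher === items[i + limit]
    limit += 1
  end
  # Greedy: try the longest run first, then back off one item at a time.
  limit.downto(rep.min).any? { |n| seq_match?(pattern, items, p + 1, i + n) }
end

three_symbols = [OBS, Rep.new(Symbol, 3, INF), OBS]
puts seq_match?(three_symbols, [1, :a, :b, :c, "x"])  # true
puts seq_match?(three_symbols, [1, :a, :b, 2, :c])    # false
```

The greedy-then-back-off loop is the same strategy most regexp engines use; a real implementation would also need captures and nested subsequences.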

caleb_clausen · 26 April 2005 01:14

itsme213 said:

- have you thought about the connection to duck typing?

Christian Neukirchen said:

How does it compare to the ML style of argument matching, btw?

I think you're both talking about the same thing here; the ability to
dispatch to different methods depending on the types of method
parameters other than just the receiver. I have thought a good deal
about this, in fact, but not exactly in the context of Reg.

The best way to do this means extending the syntax, but here's a quick
and dirty way reg might be used to do it:

#warning! won't work yet; no substitutions in reg yet
module Scoundrels
  def Scoundrels::bill(*args) #dispatcher for all the bills
    send +[
      -[Lockpick]>>:bill_the_picklock |
      -[FakePassport,Cash,ShoePhone]>>:bill_the_spy |
      -[Gun.reg+1, Knife]>>:bill_the_murderer |
      -[Laptop,Oscilloscope.reg|CellPhone|Password.reg*5]>>:bill_the_hacker
    ].match(args.dup).first, args
  end

  #definitions of the various bills omitted
end

Then you could do:
Scoundrels.bill(Lockpick.new) #invokes bill_the_picklock
Scoundrels.bill(Laptop.new, CellPhone.new) #invokes bill_the_hacker

Maybe this could be made easier. The syntax takes a little getting used
to, let us say.
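
Until substitutions exist, the same dispatch idea can be approximated in plain Ruby with a rule table and ===. This is only a sketch under assumptions: ToyScoundrels, RULES and the empty stand-in classes are invented here, and it matches fixed-length argument lists only (no Gun.reg+1 style repetition).

```ruby
# Hypothetical stand-ins for the classes named in the post.
Lockpick  = Class.new
Laptop    = Class.new
CellPhone = Class.new

module ToyScoundrels
  # Each rule pairs a list of per-argument matchers with a method name.
  RULES = [
    [[Lockpick],          :bill_the_picklock],
    [[Laptop, CellPhone], :bill_the_hacker]
  ].freeze

  # Dispatcher: pick the first rule whose matchers accept the arguments.
  def self.bill(*args)
    rule = RULES.find do |matchers, _name|
      matchers.size == args.size &&
        matchers.zip(args).all? { |matcher, arg| matcher === arg }
    end
    raise ArgumentError, "no bill accepts these arguments" unless rule
    send(rule.last, *args)
  end

  def self.bill_the_picklock(_pick)
    :picklock
  end

  def self.bill_the_hacker(_laptop, _phone)
    :hacker
  end
end

p ToyScoundrels.bill(Lockpick.new)                # :picklock
p ToyScoundrels.bill(Laptop.new, CellPhone.new)   # :hacker
```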

This is great; people are coming up with angles I never thought of.

Denis_Mertz1 · 24 April 2005 08:44

vikkous wrote:

Lexers, parsers, and pattern matching languages get too short a shrift
in my opinion. There's really a lot more they could be used for, if
only people would see... of course, it doesn't help that almost all
existing tools of this kind are string-oriented, and hard to use for
other data.

A small piece of example code could help to open the eyes of people who
don't see what could be done with Reg (like me).

Denis

David_A_Black3 · 26 April 2005 01:52

Hi --

itsme213 said:

- have you thought about the connection to duck typing?

Christian Neukirchen said:

How does it compare to the ML style of argument matching, btw?

I think you're both talking about the same thing here; the ability to
dispatch to different methods depending on the types of method
parameters other than just the receiver. I have thought a good deal
about this, in fact, but not exactly in the context of Reg.

I'm not sure about Christian, but I think itsme213 meant sort of the
opposite: how does your system relate to the duck-typing environment
of Ruby, where type != class (i.e., an object's capabilities and its
class's instance methods are not necessarily the same thing)? Are you
thinking of extending the system so that it could match, for example,
"an array of objects that respond to '[]'" (or something along those
lines)? One could imagine that being useful -- though on the other
hand, duck typing per se, as I understand it, really means just
requesting action from objects without a lot of preliminary querying
and measuring (whether it be is_a?, respond_to?, or whatever). So a
system like yours might be part of a fundamentally different way of
handling these things -- though respond_to?-awareness might be a nice
sort of middle ground.

At least I think that's what he meant, and even if not, it would be
interesting to hear your thoughts on it.

David

--
David A. Black
dblack@wobblini.net

Anonymous_Coward · 26 April 2005 03:08

itsme213 said:

- have you thought about the connection to duck typing?

Christian Neukirchen said:

How does it compare to the ML style of argument matching, btw?

I think you're both talking about the same thing here; the ability to
dispatch to different methods depending on the types of method
parameters other than just the receiver. I have thought a good deal
about this, in fact, but not exactly in the context of Reg.

I am not certain how flexible your framework is, but as a sidenote
the typical pattern matching (term used earlier) in functional
languages is well represented by things like this:

length [] = 0 -- Matches an empty list
length (x:xs) = 1 + length xs -- Matches&splits nonempty lists

length [1,2,3] -- returns 3
-- [1,2,3] -> [2,3] -> [3] -> []

Sorry. I just like writing Haskell
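
For readers who want the same shape in Ruby: later Ruby versions (2.7 and up, which postdate this thread) have a built-in case/in pattern match that can split a list into head and tail. A minimal sketch, with list_length as a made-up name; it has nothing to do with Reg itself:

```ruby
# Recursive length, mirroring the two Haskell clauses above.
def list_length(list)
  case list
  in [] then 0                                  # empty list
  in [_, *rest] then 1 + list_length(rest)      # head plus remaining tail
  end
end

puts list_length([1, 2, 3])  # 3
```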

E

--
template<typename duck>
void quack(duck& d) { d.quack(); }

Clifford_Heath4 · 29 April 2005 10:29

Good achievement!

vikkous wrote:

Lexers, parsers, and pattern matching languages get too short a shrift
in my opinion.

You're right. As I said in a recent discussion, I think it's because
the people of a sufficient theoretical bent to create the tools don't
seem to be able to make them usable:-).

Talking about substitutions made me wonder whether you were familiar
with Txl (Tree Transformation Language).

When you come to doing substitutions, read up on BURGs (Bottom Up
Rewriting Grammars) if you aren't already familiar with them. They're
how optimising compilers choose an optimal sequence of code to emit.
They basically match leaf portions of an expression tree, and for each
match, accumulate the cost of the instructions that need to be emitted
to allow that sub-tree to be simplified. Within reason, all possible
paths are explored that allow the tree to be rewritten to the empty
tree, by emitting the optimal instruction sequence. It'd be excellent
if reg could deal with the kind of ambiguity this entails, choosing a
minimum-cost resolution.

Clifford Heath.

caleb_clausen · 25 April 2005 22:29

I included some small examples at the end of my initial post to try to
whet your appetite. Perhaps you can see applications of this kind of
thing to what you use ruby for? Searching for complicated patterns
within an arbitrary object graph is what Reg is about. If you have
complicated data, Reg may be a good choice for searching in it.
(Eventually it'll have search-and-replace, but that's not implemented
yet.)

Traditionally, parsing and pattern matching languages stop after the
parser stage of the compiler pipeline, but it seems to me that many
later compiler tasks are particularly well suited for pattern-matchers.
(They're never used because by this point, compiler data is in the form
of a parse tree, and text-based pattern tools (most of them) can't deal
with that.)
Let's take the example of a simple optimization, like strength
reduction. This is where the compiler changes multiplication by a
constant power of two into a left shift.

The problem, in other words, is to search for nodes of the syntax tree
that look like this:
[<some expr>, :*, 4]

and turn them into this:
[<some expr>, :<<, 2]

In Reg, that would be:
+[expr, :*, -{:power_of_2?=>true}].sub{[BR[0], :<<, BR[2].log2]}
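
Since search-and-replace isn't implemented yet, here is the same rewrite done by hand in plain Ruby, to show what the one-liner is meant to replace. It is a sketch under assumptions: the tree is taken to be nested [lhs, op, rhs] arrays, and power_of_2? and reduce_strength are helper names invented for this example.

```ruby
# True for 1, 2, 4, 8, ... (a positive Integer with a single bit set).
def power_of_2?(n)
  n.is_a?(Integer) && n > 0 && (n & (n - 1)).zero?
end

# Walk the tree bottom-up and turn [expr, :*, 2**k] into [expr, :<<, k].
def reduce_strength(node)
  return node unless node.is_a?(Array)

  node = node.map { |child| reduce_strength(child) }
  lhs, op, rhs = node
  if node.size == 3 && op == :* && power_of_2?(rhs)
    [lhs, :<<, rhs.bit_length - 1]   # bit_length - 1 is log2 for powers of 2
  else
    node
  end
end

p reduce_strength([:x, :*, 4])
# => [:x, :<<, 2]
p reduce_strength([[:y, :*, 8], :+, [:x, :*, 3]])
# => [[:y, :<<, 3], :+, [:x, :*, 3]]
```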

My post "Lalr(n) parsing with reg" outlines how to twist Reg to
actually be a parser.

Florian_Gross2 · 26 April 2005 16:36

David A. Black wrote:

extending the system so that it could match, for example,
"an array of objects that respond to '[]'" (or something along those
lines)?

I wonder if he could just support objects that implement === which would give you Range and Module support and ruby-contract support for free.

But how is this related to your quote at all? ruby-contract offers a Check::Quack[:message] adaptor that implements === via respond_to?() -- I wonder if it would be a good idea to define Symbol#=== which would be used like this:

first = case obj
   when :first then
     obj.first
   when :fetch then
     obj.fetch(0)
   when :at then
     obj.at(0)
   when :[] then
     obj[0]
end

Or in combination with ruby-contract's signature():

class IO
   def pretty_output(*objs)
     objs.each do |obj|
       puts obj.pretty
     end
   end

   signature :pretty_output, :repeated => :pretty, :block => false
end
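
The respond_to?-via-=== idea can be tried without touching Symbol at all. In this sketch RespondsTo is a made-up class for illustration; it is not ruby-contract's Check::Quack and makes no claim about that library's API.

```ruby
# A tiny matcher object: === succeeds if the object answers the message.
class RespondsTo
  def initialize(message)
    @message = message
  end

  def ===(obj)
    obj.respond_to?(@message)
  end
end

# Same shape as the case statement above, with explicit matcher objects.
def first_item(obj)
  case obj
  when RespondsTo.new(:first) then obj.first
  when RespondsTo.new(:fetch) then obj.fetch(0)
  when RespondsTo.new(:[])    then obj[0]
  end
end

p first_item([10, 20])  # 10, because Array answers first
p first_item("abc")     # "a", because String only answers []
```

Anything that answers === this way could in principle be dropped into a reg as well, since obj.reg is described as matching whenever obj.=== succeeds.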

caleb_clausen · 26 April 2005 17:54

I'm not sure about Christian, but I think itsme213 meant
sort of the opposite: how does your system relate to the
duck-typing environment of Ruby, where type != class
(i.e., an object's capabilities and its class's instance
methods are not necessarily the same thing)? Are you
thinking of extending the system so that it could match,
for example, "an array of objects that respond to '[]'"

Oh! Ok, well that's easy, it would be something like:

+[ proceq{|x| x.respond_to? :[] }+1 ]

Being forced to use a proceq (gawd that's a terrible name) here is a
little ugly. Eventually, I want to be able to pass arguments to
property matchers, so the above could be:

+[ -{ [:respond_to?, :[]]=>true }+1 ]
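
As a side note, in Ruby 1.9 and later a plain Proc already answers === by calling itself, so the check can be written without any helper. A small sketch in plain Ruby, with made-up lambda names, of what "one or more objects that respond to []" means:

```ruby
# Proc#=== invokes the proc, so a lambda can act as a matcher directly.
indexable = ->(x) { x.respond_to?(:[]) }

# One or more items, every one of which answers [].
all_indexable = lambda do |items|
  items.is_a?(Array) && !items.empty? && items.all? { |x| indexable === x }
end

puts all_indexable.call(["a", [1], { k: 1 }])  # true
puts all_indexable.call(["a", nil])            # false
puts all_indexable.call([])                    # false
```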

duck typing per se, as I understand it, really means just
requesting action from objects without a lot of preliminary
querying and measuring (whether it be is_a?, respond_to?,
or whatever).

If you're doing something complicated, you occasionally have to
explicitly request the type (whether duck- or class-) of objects you're
working with, in order to do the right thing with it. I know this isn't
polymorphic, but sometimes it is the right way...
