The latest edition of “Mastering Regular Expressions, 2nd Edition” refers to
Ruby (Yippy!!), but not always in a positive light (Bummer!!).
Here is a sampling:
page 91, table indicates that Ruby version 1.6.7 was used when testing
regular expressions.
page 128, Table 3-11: Line Anchors for Some Scripting Languages. This table
lists “Concerns” and how they are handled. Under the Ruby column, the
following “Concerns” are noted:
Concern: “^ matches after any newline”.
– Note: “Ruby’s $ and ^ match at embedded newlines, but its \A and \Z do
not”
Concern: “$ matches before any newline”
– Note: “Ruby’s $ and ^ match at embedded newlines, but its \A and \Z do
not”
Under the title “Enhanced line-anchor mode. . .”
– Note: “N/A”, indicating that Ruby does not have this feature, while every
other language listed does (Java, Perl, PHP, Python, Tcl, .NET).
Concern: “\A always matches like normal ^”
– Note: “Ruby’s \A, unlike its ^, matches only at the start of the string”
Concern: “\Z always matches like normal $”
– Note: “Ruby’s \Z, unlike its $, matches at the end of the string, or
before a string-ending newline”
Concern: “\z always matches only at end of string”
– Note: “N/A”.
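For anyone who wants to check the anchor notes above on their own interpreter, here is a minimal sketch. It assumes a current Ruby rather than the 1.6.7 the book tested, and the sample string and variable names are made up for illustration.

```ruby
# Sample string (made up) with an embedded newline and a trailing newline.
text = "first\nsecond\n"

# ^ and $ match at embedded newlines without any mode flag.
caret_hits  = text.scan(/^\w+/)
dollar_hits = text.scan(/\w+$/)

# \A matches only at the very start of the string.
start_hits = text.scan(/\A\w+/)

# \Z matches at the end of the string or before a string-ending newline.
big_z_pos = (text =~ /second\Z/)

# \z matches only at the very end, so the trailing newline blocks it here.
small_z_pos = (text =~ /second\z/)

p caret_hits    # ["first", "second"]
p dollar_hits   # ["first", "second"]
p start_hits    # ["first"]
p big_z_pos     # 6
p small_z_pos   # nil
```

Note that \z is accepted here, so the “N/A” in the last row seems to apply only to the older version the book tested.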
page 131, “My testing has shown that java.util.regex and Ruby have \G match
at the start of the current match, while Perl and the .NET languages have it
match at the end of the previous match. (Sun tells me that the next release
of java.util.regex will have its \G behavior match the documentation.)”
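A small sketch of how \G can be poked at in Ruby, assuming a current interpreter. It only shows where \G anchors in String#scan and in String#match with an offset; it does not by itself settle the “start of current match” versus “end of previous match” question the book raises. The sample strings are made up.

```ruby
# scan: \G anchors each attempt where the previous match finished,
# so scanning stops at the first character that is not "a".
run = "aab aa".scan(/\Ga/)

# match with an offset: \G anchors at the position where the search begins.
anchored   = "hello, world".match(/\G,/, 5)  # offset 5 is the comma
unanchored = "hello, world".match(/\G,/, 3)  # offset 3 is an "l"

p run                      # ["a", "a"]
p anchored && anchored[0]  # ","
p unanchored               # nil
```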
page 132, Table 3-12: A Few Utilities and Their Word Boundary
Metacharacters. The table indicates that Ruby does not support
“Start-of-word” and “End-of-word” boundary characters [e.g. Perl: (?<!\w)
(?=\w) … (?<=\w) (?!\w) ].
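The Perl-style lookaround emulation quoted above can be tried in Ruby as well. This is only a sketch, and it assumes a Ruby whose regex engine supports lookbehind (1.9 or later, not the 1.6.7 the book tested). The constant names and the sample string are my own.

```ruby
# Start-of-word: no word character behind, a word character ahead.
START_OF_WORD = /(?<!\w)(?=\w)/
# End-of-word: a word character behind, no word character ahead.
END_OF_WORD   = /(?<=\w)(?!\w)/

sample = "one two"

# Mark each zero-width boundary position so it is visible.
starts_marked = sample.gsub(START_OF_WORD, "<")
ends_marked   = sample.gsub(END_OF_WORD, ">")

p starts_marked  # "<one <two"
p ends_marked    # "one> two>"
```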
page 133, "Ruby has a bug whereby sometimes (?i) doesn’t apply to
|-separated alternatives that are lowercase (but does if they’re
uppercase)."
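One way to check whether this still happens on a given interpreter is a quick comparison like the sketch below. The pattern and word list are made up, and the book does not say exactly which patterns trigger the bug, so treat this as one probe rather than a full test.

```ruby
words = %w[abc ABC def DEF]

# Inline (?i) at the start of the pattern, followed by |-separated alternatives.
inline_flag = /(?i)abc|def/
# The same alternatives with the /i option, for comparison.
slash_i = /abc|def/i

inline_hits = words.select { |w| w =~ inline_flag }
slash_hits  = words.select { |w| w =~ slash_i }

p inline_hits
p slash_hits
```

If the two lists differ, the inline flag is not being applied to every alternative.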
I am not a master of regular expressions, so I would like comments on whether
these things should change (or possibly have already changed) in Ruby.