>> [dead link, now titled "Notice of Service Termination"] seems to have a fairly complete listing of its features.
> Ah, yes. Thanks. I should have Googled first.
> But reading through that and the documentation on the same site, I am still looking for a rationale document. Why Oniguruma and not, say, PCRE? Why a new regexp parser?
1. Licensing. PCRE's licensing has been somewhat fluid. The current
release seems OK.
2. Control. In many ways, such a core feature to Ruby should be native to Ruby.
3. Native concepts. Ruby REs are a bit different because they end up
being objects.
-austin
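As a small illustration of point 3, here is a sketch of what "REs end up being objects" looks like in practice. The constant `PAGE_RANGE` and the helpers `exact_match_pattern` and `page_range` are made-up names for this example, and `match?` assumes Ruby 2.4 or later, which postdates this thread.

```ruby
# A regexp literal is an ordinary object of class Regexp.
PAGE_RANGE = /(\d+)-(\d+)/

# Hypothetical helper: build an anchored pattern from arbitrary text at run time.
def exact_match_pattern(text)
  Regexp.new("\\A#{Regexp.escape(text)}\\z")
end

# Hypothetical helper: a successful match is itself an object (MatchData).
def page_range(str)
  m = PAGE_RANGE.match(str)
  m && [m[1].to_i, m[2].to_i]
end

puts PAGE_RANGE.class                        # Regexp
puts PAGE_RANGE.source                       # (\d+)-(\d+)
p page_range("pages 10-20")                  # [10, 20]
p exact_match_pattern("a.b").match?("a.b")   # true
p exact_match_pattern("a.b").match?("axb")   # false
```

Because the pattern and the match result are both plain objects, they can be stored, passed to methods, and inspected like any other value.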
On Thu, 31 Mar 2005 00:29:03 +0900, B. K. Oxley (binkley) <binkley@alumni.rice.edu> wrote:
> Florian Gross wrote:
>> [dead link, now titled "Notice of Service Termination"] seems to have a
>> fairly complete listing of its features.
> Ah, yes. Thanks. I should have Googled first.
> But reading through that and the documentation on the same site, I am
> still looking for a rationale document. Why Oniguruma and not, say,
> PCRE? Why a new regexp parser?
In message "Re: look-behind regexp ?" on Thu, 31 Mar 2005 00:29:03 +0900, "B. K. Oxley (binkley)" <binkley@alumni.rice.edu> writes:
> But reading through that and the documentation on the same site, I am
> still looking for a rationale document. Why Oniguruma and not, say,
> PCRE? Why a new regexp parser?
PCRE only supports UTF-8 (as far as I know), not multiple
encodings as Ruby does. Oniguruma supports UTF-8, UTF-16,
ISO-8859-*, EUC-JP, Shift_JIS, and many more.
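To make the multi-encoding point concrete, here is a rough sketch using the per-string encoding API of modern Ruby (1.9 and later), which postdates this thread. The names `TEXT_UTF8`, `char_count`, and `index_in_encoding` are made up for the example. The same pattern is applied to one text stored as UTF-8, Shift_JIS, and EUC-JP bytes.

```ruby
# "日本語" written with \u escapes so this file stays ASCII-only.
TEXT_UTF8 = "\u65E5\u672C\u8A9E"

# Hypothetical helper: count characters by scanning with /./m.
# The ASCII-only pattern is applied in the string's own encoding.
def char_count(str)
  str.scan(/./m).length
end

# Hypothetical helper: build the pattern from a string in the target
# encoding and return the character index of the first match.
def index_in_encoding(needle_utf8, haystack_utf8, encoding)
  pattern = Regexp.new(needle_utf8.encode(encoding))
  haystack_utf8.encode(encoding) =~ pattern
end

%w[UTF-8 Shift_JIS EUC-JP].each do |name|
  encoded = TEXT_UTF8.encode(name)
  puts "#{name}: #{encoded.bytesize} bytes, #{char_count(encoded)} characters"
end

# "\u672C" is the second character of the text in every encoding.
p index_in_encoding("\u672C", TEXT_UTF8, "Shift_JIS")   # 1
```

The byte counts differ between encodings while the character count and match position stay the same, which is the behaviour a UTF-8-only engine could not provide without transcoding first.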
> 1. Licensing. PCRE's licensing has been somewhat fluid. The current
> release seems OK.
> 2. Control. In many ways, such a core feature to Ruby should be native to Ruby.
> 3. Native concepts. Ruby REs are a bit different because they end up
> being objects.
Hrm.
In all honesty, these objections seem weak to me.
If the licensing is not a problem right now, why would it necessarily become one in the future? (Although I don't know the history of licensing in PCRE, so perhaps it has a record of arbitrariness.)
Control is not so important when you have the source code. And Ruby can contribute to the development of PCRE.
I'm unsure what you mean in point three. I presume that a Ruby regexp implementation built on PCRE would wrap the library's details so that they are not visible, and only objects remain.
Not to be so nitpicky, I only used PCRE as an example. I have an inherent dislike of wheel-reinvention (my natural laziness at play), so my ears perk up when I see something like a rewrite of regexp parsers when so many fine ones are already around.
> PCRE only supports UTF-8 (as far as I know), not multiple
> encodings as Ruby does. Oniguruma supports UTF-8, UTF-16,
> ISO-8859-*, EUC-JP, Shift_JIS, and many more.
Ah. I inferred as much from the prominence given the list of encodings, but wanted to find out more.