>>
>> > Whilst it's certainly useless for a lot of tasks, I'm not sure that
>> > Ruby is any worse than other languages in this regard. As far as
>> > I'm aware, most languages that 'support' Unicode don't handle
>> > grapheme clusters without using additional libraries.
>>
>> AFAIK Python regexps do that properly, and ICU does for sure (both as
>> free iterators and regexps).
>
> That's what I mean: ICU is a separate library, not part of a language
> core.

PHP took the best of both - they are integrating ICU into the core.
Although I always hated their tendency to bloat the core, this is one
of the cases of bloat that I would want to applaud as a gesture of
sanity and common sense.

Last time I looked, ICU was in C++. Requiring a C++ compiler and
runtime is quite a bit of bloat.

> We can use ICU in Ruby too - it's still pre-alpha and not
> seamless, but the possibility exists.

Except for the fact that the maintainer has abandoned it and nobody
stepped in. I don't do C.

> From what I've read, Python
> doesn't do the heavyweight stuff natively, either. (Please tell me if
> I'm wrong - I don't use Python.)

It depends on what you call "heavyweight". For the purists out there,
I gather, even including a complete Unicode table with codepoint
properties might be "heavyweight".

I am not sure how large that might be. But if it is about the size of
the interpreter including the rest of the standard libraries, I would
consider it "heavyweight". It would be a reason to start "optional
standard libraries", I guess.

>>
>> To my knowledge you are intimately familiar with the subject so I
>> take it as sarcasm.
>
> I'm not being sarcastic at all, though perhaps I could have phrased it
> better. It's just that all Unicode discussions in Ruby end up going
> round and round in circles; if we as a community could identify some
> first-class examples of Doing It Right, I think we'd have some useful
> yardsticks.

The problem being, my "Right Examples" are nowhere near others'
"Right Examples", which in turn spurs flamewars.
My "right example" is simple - Unicode on no terms, no encoding
choice, characters only - but most are already dissatisfied with such
an attitude, and the issue has been discussed in detail, with no
solution satisfying all parties being devised. Too much compromise.

It has also been said that giving more options does not stop you from
using only Unicode. If your "right example" is only about restricting
choice then there is really not much to it.
The "right examples" people were interested in are probably more like
the libraries/languages that implement enough functionality to give
you full Unicode support for your definition of "full".
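
To make the codepoint vs. grapheme cluster distinction from earlier in
the thread concrete, here is a minimal sketch. It assumes a Ruby new
enough to have String#grapheme_clusters (2.5 or later), which did not
exist when this thread was written; the variable names are only for
illustration.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT: two codepoints that a
# reader perceives as one character (one grapheme cluster).
s = "e\u0301"

codepoint_count = s.codepoints.length          # counts individual codepoints
cluster_count   = s.grapheme_clusters.length   # assumes Ruby 2.5+
regex_count     = s.scan(/\X/).length          # \X matches one grapheme cluster

puts "codepoints: #{codepoint_count}"
puts "grapheme clusters: #{cluster_count}"
puts "clusters via \\X: #{regex_count}"
```

Anything that counts, reverses or truncates by codepoint would split
the accent from its base letter here, which is the kind of thing
"full" support has to get right.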
Thanks
Michal
···
On 7/31/06, Julian 'Julik' Tarkhanov <listbox@julik.nl> wrote:
On 31-jul-2006, at 18:51, Paul Battley wrote:
> On 31/07/06, Julian 'Julik' Tarkhanov <listbox@julik.nl> wrote:
>> On 31-jul-2006, at 17:48, Paul Battley wrote: