Get a portion of a utf8 encoded string

hi all.

ruby 1.8.5 (2006-08-25) [i386-linux]

>> puts 'hällo'
hällo
=> nil
>> puts 'hällo'[0, 2]
h▒
=> nil
>> puts 'hällo'[0, 3]

=> nil
>> require 'iconv'
=> true
>> puts 'hällo'[0, 3]

=> nil
>> puts 'hällo'[0, 2]
h▒
=> nil
>>

The ä character is probably 2 bytes long, so puts 'hällo'[1, 1] returns only a "piece" of the ä character. Is there a way to make this work?
'hällo'[0, 2] should return -> hä
and
'hällo'[0, 3] should return -> häl

thanks
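
For readers who want to see the byte/character mismatch described above, here is a minimal sketch. It assumes Ruby 1.9.3 or later (for `String#byteslice`) and writes 'hällo' with a `\u` escape so the snippet stays ASCII-only; the variable names are only illustrative.

```ruby
# "h\u00E4llo" is 'hällo'; the escape keeps this snippet ASCII-only.
word = "h\u00E4llo"

char_count     = word.chars.length  # characters: h, ä, l, l, o
byte_count     = word.bytesize      # one more than char_count: ä is two bytes in UTF-8
a_umlaut_bytes = "\u00E4".bytes     # the two bytes that encode ä

# Taking the first two *bytes* cuts ä in half, which is the effect
# seen in the irb session above with the byte-based String#[] of 1.8.
broken = word.byteslice(0, 2)

puts "chars: #{char_count}, bytes: #{byte_count}"
puts "bytes of a-umlaut: #{a_umlaut_bytes.inspect}"
puts "first two bytes form valid UTF-8? #{broken.valid_encoding?}"
```

The half character is what the terminal renders as ▒ (or as nothing at all).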

Hi,

2008/9/16 Marco <m@r.co>:

hi all.

ruby 1.8.5 (2006-08-25) [i386-linux]

>> puts 'hällo'
hällo
=> nil
>> puts 'hällo'[0, 2]
h▒
=> nil
>> puts 'hällo'[0, 3]

=> nil
>> require 'iconv'
=> true
>> puts 'hällo'[0, 3]

=> nil
>> puts 'hällo'[0, 2]
h▒
=> nil

The ä character is probably 2 bytes long, so puts 'hällo'[1, 1] returns only
a "piece" of the ä character. Is there a way to make this work?
'hällo'[0, 2] should return -> hä
and
'hällo'[0, 3] should return -> häl

thanks

You can do it using a regular expression:

'hällo'.split(//u)[0, 2].to_s -> hä

Regards,

Park Heesob
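
A small sketch of the split approach suggested above, wrapped in a helper. The name `utf8_slice` is hypothetical and not part of Ruby. It uses `join` instead of `to_s`, on the assumption that the code may also have to run on Ruby 1.9 or later, where `Array#to_s` returns the inspected form (`["h", "ä"]`) rather than the concatenated string it gave on 1.8.

```ruby
# Hypothetical helper: slice a UTF-8 string by characters rather than bytes.
# split(//u) breaks the string into an array of single characters;
# Array#[] then works on whole characters, and join glues them back together.
def utf8_slice(str, start, length)
  str.split(//u)[start, length].join
end

word = "h\u00E4llo"  # 'hällo', written with an escape to keep the snippet ASCII-only

first_two   = utf8_slice(word, 0, 2)
first_three = utf8_slice(word, 0, 3)

puts first_two    # the first two characters
puts first_three  # the first three characters
```

On Ruby 1.8 the `u` flag is what makes the empty pattern split between UTF-8 characters instead of between bytes.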

# thanks. even if 'hällo'[0, 2] looks better

upgrade?

[RUBY_VERSION, RUBY_PATCHLEVEL, RUBY_REVISION, RUBY_PLATFORM]

=> ["1.8.7", 5000, 0, "i686-linux"]

puts 'hällo'

hällo
=> nil

puts 'hällo'[0, 2]


=> nil

puts 'hällo'[0, 3]

häl
=> nil

From: S2 [mailto:sto.giocando@motor.storm]

Heesob Park wrote:

You can do it using a regular expression:

'hällo'.split(//u)[0, 2].to_s -> hä

Why make it convoluted when Ruby makes it so easy to use regexps:

>> puts 'hällo'[/.{2}/um]
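
The same regexp idea can be generalised a little. This is only a sketch: `first_chars` is a hypothetical helper name, and the `\A` anchor and `{0,n}` quantifier are additions to the one-liner above so that strings shorter than `n` characters return what they have instead of nil.

```ruby
# Hypothetical helper: return the first n characters of a UTF-8 string.
# /u treats the pattern as UTF-8, so "." matches a whole character;
# /m lets "." match newlines as well.
def first_chars(str, n)
  str[/\A.{0,#{n}}/um]
end

word = "h\u00E4llo"  # 'hällo', written with an escape to keep the snippet ASCII-only

first_two   = word[/.{2}/um]       # the one-liner from the post
first_three = first_chars(word, 3)
whole       = first_chars(word, 10) # asks for more characters than exist

puts first_two
puts first_three
puts whole
```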

It concerns me that people are suggesting using various
backwards-incompatible changes in Ruby 1.8.7.

Though 1.8.7 may be a good fallback for 1.9-only libraries that may
make things work as expected, I hate the idea of writing code that
works on 1.8.7 but not other versions of Ruby 1.8. At this point, I
feel like 1.8.6 is still the 'real' Ruby 1.8, anyway.

-greg

On Tue, Sep 16, 2008 at 4:50 AM, Peña, Botp <botp@delmonte-phil.com> wrote:

From: S2 [mailto:sto.giocando@motor.storm]
# thanks. even if 'hällo'[0, 2] looks better

upgrade?

--
Technical Blaag at: http://blog.majesticseacreature.com | Non-tech
stuff at: http://metametta.blogspot.com

# On Tue, Sep 16, 2008 at 4:50 AM, Peña, Botp

From: Gregory Brown [mailto:gregory.t.brown@gmail.com]
# <botp@delmonte-phil.com> wrote:
# > From: S2 [mailto:sto.giocando@motor.storm]
# > # thanks. even if 'hällo'[0, 2] looks better
# > upgrade?
# It concerns me that people are suggesting using various backwards
# incompatible changes in Ruby 1.8.7
# Though 1.8.7 may be a good fallback for 1.9 only libraries that may
# make things work as expected, I hate the idea of writing code that
# works on 1.8.7 but not other versions of Ruby 1.8. At this point, I
# feel like 1.8.6 is still the 'real' Ruby 1.8, anyway.

cmon, greg, what could be better than

  'hällo'[0, 2]

?

the change is good.

kind regards -botp

The change is absolutely good! I did a training session at Lone Star
Ruby Conference that sang its praises.

What isn't good is to say "I've not yet updated my code for Ruby 1.9",
so it works on Ruby 1.8.7 only.

My point is that if people really want to use Ruby 1.9 features for
anything but experimentation, they'd do more good by actually using
1.9, not an intermediate release that left most people confused.

If you are writing Ruby 1.8.7 specific code, your code may not run on
Ruby 1.9, for example, the code above will blow up on 1.9 unless the
encoding is properly set.

And your code *definitely* won't work on Ruby 1.8.x aside from 1.8.7.
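
To illustrate the point about encodings on 1.9, here is a minimal sketch, assuming Ruby 1.9 or later. The same six bytes are sliced twice, once tagged as binary and once tagged as UTF-8. In a real program the bytes would typically come from a file or socket; the literal here is only a stand-in.

```ruby
# The bytes of 'hällo' in UTF-8, tagged as raw binary data.
raw = "h\xC3\xA4llo".dup.force_encoding(Encoding::BINARY)

# With a binary encoding, String#[] counts bytes, so ä is cut in half.
byte_slice = raw[0, 2]

# Re-tag the same bytes as UTF-8 (no bytes are changed, only the label).
utf8 = raw.dup.force_encoding(Encoding::UTF_8)

# Now String#[] counts characters.
char_slice = utf8[0, 2]

puts byte_slice.bytes.inspect  # two bytes: "h" plus the first half of ä
puts char_slice                # two characters
puts utf8.length               # character count
puts raw.length                # byte count
```

So on 1.9 the result of 'hällo'[0, 2] depends on which encoding the string carries, not on the Ruby version alone.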

So if you're okay locking to a single point release, that's fine. But
I think it'd help ruby-core a lot more for you to use Ruby 1.9 and
help them iron out issues, and it'd help the Ruby community a lot more
for you to choose whether you are supporting 1.8.x, 1.9.x or both, but
not something that is neither (1.8.7).

-greg

On Wed, Sep 17, 2008 at 9:36 PM, Peña, Botp <botp@delmonte-phil.com>

cmon, greg, what could be better than

'hällo'[0, 2]

?

the change is good.
