Splitting a string into characters - not bytes

I realise that this is probably a known thing but my google-fu is not
working today.

I am learning Japanese and I wanted to write an application (probably
a web app) that I can paste some Japanese into and highlight the
characters that I should know. To do this I need to take a string of
Japanese, such as "目が覚めてしまった。眠いのに…。" and break it down into
characters; that is, this string contains 16 characters and a space
(despite being 49 bytes long).

Of course, some strings also contain Latin characters mixed in, such as
"俺の嫁と会話2 | あねこ".

My home-brew solutions always seem to screw up, but then I thought:
this is Ruby! Why the heck am I reinventing the wheel?

So can someone clue me in on how to take a string of Japanese and
break it up into characters?

Please

···

2010/11/4 Peter Hickman <peterhickman386@googlemail.com>:

> So can someone clue me in on how to take a string of Japanese and
> break it up into characters?

"俺の嫁と会話2 | あねこ".scan(/./u)

=> ["俺", "の", "嫁", "と", "会", "話", "2", " ", "|", " ", "あ", "ね", "こ"]

Regards,
Ammar

···

What version of Ruby are you using? In Ruby 1.9, you can simply use
String#each_char:

"目が覚めてしまった。眠いのに…。".each_char{|c| puts c}
=>
目
が
覚
め
て
し
ま
っ
た
。
眠
い
の
に
…
。
You may need to set the encoding appropriately, however.
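To illustrate the encoding remark, here is a minimal sketch for Ruby 1.9 or later. It assumes the bytes really are UTF-8 and only carry the wrong encoding tag; the sample string and variable names are made up for the example.

```ruby
# encoding: utf-8

# Simulate text that arrived tagged as raw bytes (for example, read from a socket).
raw = "目が覚めた".dup.force_encoding("ASCII-8BIT")
byte_chars = raw.each_char.to_a      # one element per byte

# Re-tag the same bytes as UTF-8; force_encoding does not change any bytes.
text = raw.dup.force_encoding("UTF-8")
utf8_chars = text.each_char.to_a     # one element per character
```

With the binary tag, each_char walks the string byte by byte; once the string is tagged as UTF-8, it yields whole characters.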

I don't know how you'd do it in Ruby 1.8.
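One approach that does not rely on String#each_char is to decode the string into code points with String#unpack and re-encode each one with Array#pack. This is only a sketch: it assumes the input is valid UTF-8, and the helper name split_utf8_chars is made up for the example.

```ruby
# encoding: utf-8

# Hypothetical helper; assumes str holds valid UTF-8 bytes.
def split_utf8_chars(str)
  # "U*" decodes the bytes as UTF-8 into an array of code points;
  # [cp].pack("U") turns each code point back into a one-character string.
  str.unpack("U*").map { |cp| [cp].pack("U") }
end

chars = split_utf8_chars("俺の嫁と会話2 | あねこ")
```

The regex form str.scan(/./u) is another option; note that "." skips newlines unless the m flag is added (/./mu).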

I hope this helps

Stefano

···


Thank you thank you thank you

Damn that was a simple fix, not to mention a very fast response

Thank you again

I was just reminded of that simple solution myself yesterday on this very list.

Cheers,
Ammar

···


If you want the array you can do str.chars.to_a.
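As a rough sketch of how that array could feed the highlighting idea from the original question: the list of known characters and the bracket markup below are invented for illustration only.

```ruby
# encoding: utf-8

str = "俺の嫁と会話2 | あねこ"

# to_a is harmless on Rubies where String#chars already returns an Array.
chars = str.chars.to_a

# Hypothetical list of characters the learner already knows.
known = ["俺", "嫁"]

# Wrap known characters in brackets; leave everything else untouched.
marked = chars.map { |c| known.include?(c) ? "[#{c}]" : c }.join
```

A web app would presumably emit HTML tags instead of brackets, but the per-character map stays the same.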

···


> If you want the array you can do str.chars.to_a.

It's worth noting that String#chars is not available in all versions of Ruby:

RUBY_VERSION

=> "1.9.2"

"str".chars

=> #<Enumerator: "str":chars>

RUBY_VERSION

=> "1.8.7"

"str".chars

=> #<Enumerable::Enumerator:0x3e3a0>

RUBY_VERSION

=> "1.8.6"

"str".chars

NoMethodError: undefined method `chars' for "str":String
        from (irb):2
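Given that difference, one way to stay portable is to check for the method first and fall back to a regex scan. This is a sketch under assumptions rather than a tested recipe for 1.8.x: the helper name to_chars is made up, and on 1.8.7 the result of chars may also depend on $KCODE.

```ruby
# encoding: utf-8

# Hypothetical helper: use String#chars where it exists,
# otherwise fall back to a UTF-8 aware regex scan.
def to_chars(str)
  if str.respond_to?(:chars)
    str.chars.to_a
  else
    # /m lets "." match newlines too; /u treats the pattern as UTF-8.
    str.scan(/./mu)
  end
end

result = to_chars("目が覚めた")
```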

Regards,
Ammar

···

On Thu, Nov 4, 2010 at 6:42 PM, Adam Prescott <mentionuse@gmail.com> wrote: