UTF-8 strings?

Julik_Tarkhanov · 25 October 2004 01:19

Hello gentlemen!

A complete newbie question.
I come from a PHP background and wanted to try Ruby (having watched all the beautiful presentations about Rails). I got started with the very basics, but have already found a problem that might lead me to abandon my research altogether and go back to PHP.

What is the situation in Ruby when it comes to UTF-8? I have to process (fetch, display, whatever) lots of Russian strings; I store them in a database, and I want to write them to and read them from XML, etc.

Some very simple one-liners with my name written in Russian yield exceptionally faulty results (ranging from nils to non-displayable characters). I get wrong results from downcase and index lookup, among others. If Ruby cannot do these things with strings, I cannot use it for any real-life projects. PHP was ugly, but the mbstring extension did the whole job for me there.

I am running the binary 1.9 build for Mac OS X; my terminal is set to UTF-8 and input escaping is disabled in bash. All other scripts process my strings correctly, so it is not an input problem.

I tried to find information on the subject, but the few pages mentioning it were in Japanese. Maybe someone can enlighten me? Maybe I just have to declare/import some module that will overload the string functions for me?

Nobuyoshi_Nakada · 25 October 2004 01:50

Hi,

At Mon, 25 Oct 2004 10:19:08 +0900,
Julik Tarkhanov wrote in [ruby-talk:117548]:

Some very simple one-liners with my name written in russian yield
exceptionally faulty results (ranging from nils to non-displayable
characters). I get wrong results from downcase and index lookup, among
others. If Ruby cannot do these things with strings I cannot use it for any
real-life projects. PHP was ugly but the mbstring extension was doing all
the job for me there.

Run ruby with the -Ku option if your scripts are written in UTF-8,
or set $KCODE to 'u'.
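
As a rough illustration of the second suggestion, here is a minimal sketch that sets $KCODE at the top of a script and then splits a Russian string into characters with a regexp. The sample string and the variable names are assumptions made up for the example, and the version guard is an addition so that the same file also runs on later interpreters, where strings carry their own encoding and $KCODE has no effect.

```ruby
# -*- coding: utf-8 -*-
# Sketch only: the string and variable names below are illustrative.
# Command-line alternative from the advice above: ruby -Ku script.rb

# Tell an old (pre-1.9) interpreter to treat text as UTF-8.
# Skipped on later versions, where $KCODE is no longer effective.
$KCODE = 'u' if RUBY_VERSION < '1.9'

name = "Юлик"

# With UTF-8 handling in effect, /./ matches one whole character
# rather than one byte, so scan yields one element per letter.
chars = name.scan(/./)

puts chars.length
puts chars.join(" ")
```

The name has four letters, so four elements are expected if UTF-8 handling is active. The sketch covers only splitting into characters; it does not show what happens with case conversion such as downcase.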

--
Nobu Nakada
