String#[]

Aemca · 26 September 2006 13:06

This must be a common newbie question, but I can't find the answer.

Why does String#[] return an ASCII code, rather than a character?

"abc"[1,2] #-> "bc"
"abc"[1..2] #-> "bc"
"abc"[1] #-> 98

Robert_K1 · 26 September 2006 13:10

Because there is no character class in Ruby. If you need a one-character string, you can do:

irb(main):001:0> "abc"[2,1]
=> "c"
irb(main):002:0> "abc"[2].chr
=> "c"

I guess the former is more performant because the internal buffer can be shared.
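A minimal sketch of the forms that yield a one-character string, assuming Ruby 1.9 or later for the plain-index line (on 1.8 that line returns the integer code instead). The variable names are made up for illustration.

```ruby
s = "abc"

by_length = s[2, 1]   # start at index 2, take 1 character
by_range  = s[2..2]   # range covering only index 2
by_index  = s[2]      # one-character string on 1.9+, an integer on 1.8

puts by_length
puts by_range
puts by_index
```

The start/length and range forms behave the same on 1.8 and on later versions, so they are the safer choice for code that has to run on both.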

Kind regards

robert


Thomas_Adam · 26 September 2006 13:16

It tells you why in "ri String#[]":

``
Element Reference---If passed a single +Fixnum+, returns the code
of the character at that position. If passed two +Fixnum+ objects,
returns a substring starting at the offset given by the first, and
a length given by the second. If given a range, a substring
containing characters at offsets given by the range is returned.
''

-- Thomas Adam


Aemca · 26 September 2006 16:15

That answers "what?", which I already knew. I'm asking "why?"


Aemca · 26 September 2006 16:15

Why not a 1-character string?

Robert Klemme wrote:

Because there is no character class in Ruby.

> <snip>

Paul_Lutus · 26 September 2006 16:40

Newbie wrote:

Why not a 1-character string?

A one-character string isn't a character, because a string has the
properties of an array. A character (if there were such a thing in Ruby)
cannot be expanded into two or more characters as a string can.

When you reference a string with a single index, you get back a character
code, but this is as close to a character or character class as exists in
Ruby.

···

--
Paul Lutus
http://www.arachnoid.com

James_Edward_Gray_II · 26 September 2006 17:03

Ruby will work this way in the future.

James Edward Gray II

···

On Sep 26, 2006, at 11:15 AM, Newbie wrote:

Why not a 1-character string?

Gary_Wright · 26 September 2006 18:29

Even if String#[index] returned a 1-character string, you would still
want a way to extract individual code-points/bytes. Right now you have:

s[i..i] # substring starting at position i of length 1
s[i] # code-point at position i

I think in future Ruby versions it is going to be something like:

  s[i..i] # substring starting at position i of length 1
  s[i] # same as s[i..i]
  s.byte(i) # code-point at position i

I'm guessing at String#byte. I know I read something about that but I
couldn't find a reference right away.

Anyway, as I understand it, the concept of 'character' or even 'position'
is pretty complicated in a fully i18n world (such as with Unicode).
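To make the character/byte distinction concrete, here is a rough sketch assuming Ruby 1.9 or later and a UTF-8 string. The byte accessor that eventually shipped is `String#getbyte`, not the `String#byte` guessed at above. The sample string is made up for illustration.

```ruby
s = "h\u00e9llo"           # "é" (U+00E9) occupies two bytes in UTF-8

char       = s[1]          # one-character string
code_point = s[1].ord      # Unicode code point of that character
byte       = s.getbyte(1)  # raw byte at byte offset 1

puts s.length              # number of characters
puts s.bytesize            # number of bytes
p s.codepoints.to_a        # one integer per character
p s.bytes.to_a             # one integer per byte
```

Here the character index and the byte offset stop lining up after the first multibyte character, which is one reason "position" is less simple than it looks.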

Gary Wright

···

On Sep 26, 2006, at 12:15 PM, Newbie wrote:

Why not a 1-character string?

Steven_Lumos1 · 26 September 2006 22:51

Newbie <none@none.com> writes:

That answers what?, which I already knew. I'm asking why?

Probably because a one-character String is 22 bytes and a Fixnum is
4 (on 32-bit archs).

As James said, a future version of Ruby will return a one-character
string in that case. In the meantime if you are just doing
comparisons you can use the Ruby character literal syntax:

next if line[0] == ?#
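A small sketch of why that comparison is convenient: on 1.8 both sides are integers, and on Ruby 1.9 or later both sides are one-character strings, so the same line works either way. The helper name `comment_line?` is hypothetical and not from the thread.

```ruby
# Hypothetical helper: true when the line starts with '#'.
def comment_line?(line)
  # ?# is the character literal for '#'; line[0] is nil for an empty string.
  line[0] == ?#
end

lines = ["# a comment", "puts 1", ""]
code_lines = lines.reject { |line| comment_line?(line) }
p code_lines
```

On Ruby 1.8.7 and later, `line.start_with?("#")` is an alternative that avoids the character literal.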

Steve

Jordan_Callicoat · 27 September 2006 01:45

gwtmp01@mac.com wrote:

I think in future Ruby versions it is going to be something like:

In 1.9 it's #ord:

1.8:
'a'[0] # => 97
'a'[0,1] # => "a"
'a'[0..0] # => "a"

1.9:
'a'[0] # => "a"
'a'.ord # => 97
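As a rough illustration of the 1.9 behaviour, assuming Ruby 1.9 or later, `String#ord` and `Integer#chr` convert in opposite directions. The variable names are made up for the example.

```ruby
code  = 'a'.ord      # character to integer code
char  = 97.chr       # integer code back to a one-character string
first = 'abc'[0]     # a one-character string on 1.9+

# Collect the code of every character in a string.
codes = 'abc'.each_char.map { |c| c.ord }

p code
p char
p first
p codes
```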

Regards,
Jordan
