String#each

Well, let’s come full circle and get back
to “each” (which started the discussion).

The rationale for the current behavior (and
someone else could explain this better) is
that the “elements” of which a string is
composed are subject to debate. In some
situations, a string could be considered a
sequence of lines.

There are each_byte and each_line iterators,
in fact. So what should ‘each’ do? One or
the other. Matz picked iteration by line.
It’s just the default.

If you specifically want bytes or lines,
you can use the specific iterator.
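For illustration, here is a minimal sketch of the two explicit iterators named above. It calls only each_line and each_byte and deliberately avoids a bare each, since later Ruby versions (1.9 and up) removed String#each. The sample strings and variable names are made up for the example.

```ruby
text = "first line\nsecond line\n"

# each_line yields one line at a time, keeping the trailing newline.
lines = []
text.each_line { |line| lines << line }

# each_byte yields each byte as an Integer.
bytes = []
"ab".each_byte { |byte| bytes << byte }

p lines  # ["first line\n", "second line\n"]
p bytes  # [97, 98]
```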

Yes, it was a surprise to me, too. But I
have gotten used to it and I accept it.

If Matz changes it, I will accept that
and get used to that, too. He is good at
what he does.

A related topic:

I don’t know much about m17n or i18n… but
I was wondering if an each_char iterator
might be a good idea for the future? That
way we could distinguish between bytes and
characters.
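A small sketch of the distinction being asked for, assuming a later Ruby (1.9 or newer) where each_char exists and strings carry an encoding. The sample word is an assumption chosen so that one character takes two bytes in UTF-8.

```ruby
word = "n\u00E9"   # "né"; the second character is two bytes in UTF-8

chars = []
word.each_char { |c| chars << c }   # one String per character

bytes = []
word.each_byte { |b| bytes << b }   # one Integer per byte

p chars.size  # 2
p bytes       # [110, 195, 169]
```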

I favor making software as language-flexible
as possible, but I’d hate to see it slow
things down or complicate the API too much.

But I’m wandering outside my realm of knowledge.

Hal Fulton

do you think it’s just over the top to change how .each works?
or is that just too much code breakage?

also, how is support for strings going to change in the future with the
adoption of m17n (i18n?) support?

~transami

On Sun, 2002-07-07 at 17:04, Hal E. Fulton wrote:

Well, let’s come full circle and get back
to “each” (which started the discussion).
[…]

~transami

“They that can give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety.”
– Benjamin Franklin

do you think it’s just over the top to change how .each works?
or is that just too much code breakage?

Whenever there’s some defensible reason for why something works as it
does, even if there are pretty good arguments that it would have been
better if it had been done some other way, we need to lean heavily
toward preserving the status quo for that behavior.

For my purposes anymore, basically any code breakage is too much. I sure
don’t want to worry about things going wrong with scripts that are in
production use whenever I upgrade the interpreter, and I’ve been relying
heavily on Ruby at work for quite a while.

Mark

Hi,

In message “Re: String#each” on 02/07/08, Tom Sawyer transami@transami.net writes:

also, how is support for strings going to change in the future with the
adoption of m17n (i18n?) support?

String index and length will be based on characters, not bytes. There
will be no other incompatible changes.
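As a rough sketch of what character-based index and length look like in practice, assuming a Ruby version where this change has landed (1.9 or newer) and a UTF-8 string. The sample word is an assumption; bytesize is shown alongside for contrast.

```ruby
s = "na\u00EFve"   # "naïve"; the third character is two bytes in UTF-8

p s.length    # 5  (counted in characters)
p s.bytesize  # 6  (counted in bytes)
p s[2]        # "ï" (indexing picks the third character, not the third byte)
```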

						matz.

Imagine this…

There is a compat16.rb file in the ruby1.8 lib directory. By requiring
it, all language features that have been changed (not added) in 1.8
assume 1.6 behaviour. You could put a require ‘compat16’ in your
script, and you would be sure that, if it ran under 1.6, it will run
for you. You could even set that system-wide to be sure (e.g. ‘ruby’
is really a script containing ‘ruby -rcompat16’) and require
‘compat18’ for new projects. Most important, by doing so on the new
project, whatever change will happen in ruby2.0 you already know your
script will still run because compat18.rb in ruby2.0 will take care of
it…
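To make the idea concrete, here is a hypothetical sketch of one entry such a compatibility file might hold. No such file ships with Ruby, and the choice of String#each as the shimmed method is only an assumption tied to this thread: on a Ruby where String#each has been removed, the shim restores it as line iteration.

```ruby
# Hypothetical contents of a compatibility file (not a real library).
class String
  # Only add the old method if this interpreter no longer provides it.
  unless method_defined?(:each)
    def each(*args, &block)
      each_line(*args, &block)   # old default: iterate by line
    end
  end
end

seen = []
"a\nb\n".each { |line| seen << line }
p seen  # ["a\n", "b\n"]
```

A real file would need one such guard for every changed behaviour, which is where most of the maintenance cost would sit.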

Just a thought.

Massimiliano

On Mon, Jul 08, 2002 at 11:05:18AM +0900, Mark Slagell wrote:

Whenever there’s some defensible reason for why something works as it
does, even if there are pretty good arguments that it would have been
better if it had been done some other way, we need to lean heavily
toward preserving the status quo for that behavior.

For my purposes anymore, basically any code breakage is too much. I sure
don’t want to worry about things going wrong with scripts that are in
production use whenever I upgrade the interpreter, and I’ve been relying
heavily on Ruby at work for quite a while.

Yukihiro Matsumoto wrote:

also, how is support for strings going to change in the future with the
adoption of m17n (i18n?) support?

String index and length will be based on characters, not bytes. There
will be no other incompatible changes.

So, even with m17n support, there will be no easy way to iterate over
characters? There will be no efficient way to break the string up into its
atomic parts?

– SER

Massimiliano, THAT’S GENIUS!!!

~transami

Has no one else read this post?

On Mon, 2002-07-08 at 05:24, Massimiliano Mirra wrote:

Imagine this…

There is a compat16.rb file in the ruby1.8 lib directory. By requiring
it, all language features that have been changed (not added) in 1.8
assume 1.6 behaviour. You could put a require ‘compat16’ in your
script, and you would be sure that, if it ran under 1.6, it will run
for you. You could even set that system-wide to be sure (e.g. ‘ruby’
is really a script containing ‘ruby -rcompat16’) and require
‘compat18’ for new projects. Most important, by doing so on the new
project, whatever change will happen in ruby2.0 you already know your
script will still run because compat18.rb in ruby2.0 will take care of
it…

Just a thought.

Massimiliano

i’m waiting to hear what others think. i am hopeful that this will be in
our ruby-futures.

perhaps this idea should be moved to a new thread?

~transami

Hi,

In message “Re: String#each” on 02/07/09, Sean Russell ser@germane-software.com writes:

So, even with m17n support, there will be no easy way to iterate over
characters? There will be no efficient way to break the string up into its
atomic parts?

Other than split(//)?
It will have each_char method too.
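A short sketch of the two options mentioned here, assuming a Ruby version where each_char is available (1.9 or newer). split(//) builds an array of one-character strings, while each_char visits the characters without needing that array; the collection into `seen` below is only there to show the result.

```ruby
s = "h\u00E9llo"   # "héllo"

# split with an empty pattern breaks the string into characters.
pieces = s.split(//)

# each_char yields the same characters one at a time.
seen = []
s.each_char { |ch| seen << ch }

p pieces.size     # 5
p pieces == seen  # true
```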

						matz.

Yukihiro Matsumoto wrote:

It will have each_char method too.

cool :)

That will be great for iterating over the characters. What will be a
good way to get an array of characters from multibyte strings? (probably
something that already exists)

Tobi

http://www.pinkjuice.com/

Yukihiro Matsumoto wrote:

Other than split(//)?
It will have each_char method too.

Bingo.

— SER

Hi,

In message “Re: String#each” on 02/07/09, Tobias Reif tobiasreif@pinkjuice.com writes:

That will be great for iterating over the characters. What will be a
good way to get an array of characters from multibyte strings? (probably
something that already exists)

Not yet. Anyone?

chars, explode_chars, … hmm

						matz.

Yukihiro Matsumoto wrote:

chars, explode_chars, … hmm

Intuitively, I’d try

irb(main):001:0> 'sue'.to_a
["sue"]

, but that does not return an array of characters…

chars might be a good name.
Or chary :) (char ary)

Tobi

http://www.pinkjuice.com/

Yukihiro Matsumoto wrote:

That will be great for iterating over the characters. What will be a
good way to get an array of characters from multibyte strings? (probably
something that already exists)

Not yet. Anyone?

chars, explode_chars, … hmm

Well, split() is fine with me in this case. I don’t think it is unintuitive
for breaking a String up into an array; I just didn’t think it was very
intuitive or efficient for iterating over characters. In this case,
you’re making a new array anyway, so this would be fine for me.

If you, Matz, know that there’s a more efficient way of creating an Array of
Characters (which, I assume, will be Strings) from a String, then yes. I
would like to have a chars() method. However, if everything is basically
going to boil down to an internal split(), can we just use split()?

— SER

Hi,

In message “Re: String#each” on 02/07/09, Sean Russell ser@germane-software.com writes:

If you, Matz, know that there’s a more efficient way of creating an Array of
Characters (which, I assume, will be Strings) from a String, then yes. I
would like to have a chars() method. However, if everything is basically
going to boil down to an internal split(), can we just use split()?

If you want to split a string into per character strings, you should
(and you will) use String#split. chars (and each_char) work on
"characters", which are integers in my current M17N implementation.
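The statement above describes the M17N implementation as it stood at the time of this thread. For comparison, here is a sketch of how the same string-versus-integer distinction can be observed in Ruby as later released (1.9 and newer), where chars gives one-character strings and the integer form is exposed through codepoints. The sample string is an assumption.

```ruby
s = "A\u00E9"   # "Aé"

as_strings  = s.chars.to_a        # one-character Strings
as_integers = s.codepoints.to_a   # Integer code points

p as_strings   # ["A", "é"]
p as_integers  # [65, 233]
p s.split(//) == as_strings  # true
```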

						matz.