Is there a better string.each?

Not to be a pedant, but this seems like a huge breakage of the POLS,
if it bites everyone, no? =)

I see your point, but I disagree.

If Ruby behaved identically to C, then the
surprise for C programmers would be precisely
zero.

Oh yes, I was playing devil’s advocate to a certain degree.

In the first place, POLS is a Matz-centric phenomenon
(as he has admitted). The more you think like Matz,
the better you will understand Ruby.

I’m not sure if I’m understanding you correctly, but I’ve heard of
the “POLA” (A = astonishment) since the mid/late 80’s. I had assumed
POLS came from that.

And I think Strings are not really Arrays. There are
some isomorphisms there, since they are both “ordered
collections of entities.”

Yes, I agree here wholeheartedly. Saying that Strings are ordered
sets and Arrays are ordered sets does not make them the same
thing.

···

=====

Use your computer to help find a cure for cancer: http://members.ud.com/projects/cancer/

Yahoo IM: michael_s_campbell


Do You Yahoo!?
Sign up for SBC Yahoo! Dial - First Month Free

No, I have to say that in this respect as in others,
Ruby corrected C’s mistake. And it was not really
and truly a mistake in C; C is close to assembly
language, and is not as high level as Ruby. There
was not really another way to think of strings
except as arrays. But we’re past that now.

like c was the only language before ruby came along?

strings have been dealt with in all sorts of ways by “higher-level”
languages.

but when you get down to the core of it, strings are ultimately stored
as ordered sets of data. the difference for ruby really amounts to the
fact that the ordered set of data for a ruby Array consists of reference
pointers to Objects, whereas the String consists of the actual
character bytecodes. if a String were to work just like the Array and
instead point to encoding-specific Character Objects, we’d get a much
more powerful and “internationalizable” model, but at a significant cost
to speed. so there is reason to have them as different types of objects.
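
The storage difference described above can be seen from inside Ruby. This is only a rough sketch, assuming a modern Ruby (1.9 or later, which postdates this thread); the variable names are made up for illustration.

```ruby
# An Array slot holds a reference to an object, so any kind of object fits
# and the same object can appear in several slots.
shared = "two"
mixed = [shared, :three, 4.0, nil, shared]
classes = mixed.map { |item| item.class }
same_object = mixed.first.equal?(mixed.last)   # both slots point at `shared`

# A String holds raw bytes; characters are materialized on request.
text = "abc"
bytes = text.bytes    # the byte values stored in the String
chars = text.chars    # each character returned as a one-character String

puts classes.inspect
puts same_object
puts bytes.inspect
puts chars.inspect
```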

but, to satisfy POLS, since they are the same type of data structure
(the String differing only in the types of objects it COULD point to)
the String should “inherit” essentially all the exact same methods as
Array and those should work in the same fashion. that is POLS!

~transami

···

On Fri, 2002-07-05 at 12:31, Hal E. Fulton wrote:

----- Original Message -----
From: “Michael Campbell” michael_s_campbell@yahoo.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Friday, July 05, 2002 11:07 AM
Subject: Re: is there a better string.each?

This is something which bites everybody, I think, at some point.

Not to be a pedant, but this seems like a huge breakage of the POLS,
if it bites everyone, no? =)

I see your point, but I disagree.

If Ruby behaved identically to C, then the
surprise for C programmers would be precisely
zero.

But when you use C for string processing (especially
when you’ve used something more powerful), you find
yourself saying: “There should be some easy way to
do THIS…”

In the first place, POLS is a Matz-centric phenomenon
(as he has admitted). The more you think like Matz,
the better you will understand Ruby.

Secondly, POLS should be a meta-linguistic issue IMO –
not “how does this work in my favorite language?” but
“how should this work in an ideal universe?”

And I think Strings are not really Arrays. There are
some isomorphisms there, since they are both “ordered
collections of entities.”

But a string is a highly specialized thing. For one
thing, each item has to be a character. No other
Ruby array is limited in the kind of data it can
contain; such an idea seems very unRubylike to me.

No, I have to say that in this respect as in others,
Ruby corrected C’s mistake. And it was not really
and truly a mistake in C; C is close to assembly
language, and is not as high level as Ruby. There
was not really another way to think of strings
except as arrays. But we’re past that now.

Hal Fulton


~transami

“They that can give up essential liberty to obtain a little
temporary safety deserve neither liberty nor safety.”
– Benjamin Franklin

Hal E. Fulton wrote:

Secondly, POLS should be a meta-linguistic issue IMO –
not “how does this work in my favorite language?” but
“how should this work in an ideal universe?”

yes

And I think Strings are not really Arrays. There are
some isomorphisms there, since they are both “ordered
collections of entities.”

But a string is a highly specialized thing. For one
thing, each item has to be a character. No other
Ruby array is limited in the kind of data it can
contain;

… so, as you say, it’s a specialized form of an array;
and could very polsy be modeled as a subclass of Array.

Tobi

···


http://www.pinkjuice.com/

Hal E. Fulton wrote:

it is by definition, not by coincidence, no?

No, not by definition… a string is certainly
not defined as an array (in the computer science
sense) in Ruby.

I was referring to the realm of common sense, which only shares some
intersection with CS :)

Tobi

···


http://www.pinkjuice.com/

Hal E. Fulton wrote:

And I think Strings are not really Arrays. There are
some isomorphisms there, since they are both “ordered
collections of entities.”

But Strings /are/ arrays, by the words of your own definition. Really, a String is
an array (read: sequence) of Characters (which are displayed with glyphs);
words, sentences, any syntactic construction is built up of a sequence of
Characters.

But a string is a highly specialized thing. For one
thing, each item has to be a character. No other
Ruby array is limited in the kind of data it can
contain; such an idea seems very unRubylike to me.

Good point. But why? Why shouldn’t that array be able to contain things
other than characters – say, markup? The current implementation of Ruby
wouldn’t support that, of course… but the current implementation of Ruby
doesn’t really handle the glyphs-vs-character issue very well, either, and
binds pretty tightly strings to byte arrays. Anyway, that’s something I
just pulled out of my … hat. As an example.

No, I have to say that in this respect as in others,
Ruby corrected C’s mistake. And it was not really
and truly a mistake in C; C is close to assembly
language, and is not as high level as Ruby. There
was not really another way to think of strings
except as arrays. But we’re past that now.

Are we? The problem in C wasn’t String = array; the problem was Character =
byte.

···

… “You need someone listening to you for it to be an actual
<|> conversation.”
/|\
/|

They’re easier to remember when you know that “m17n” means
“multilingualization”, or “m, 17 letters, n”. Same for “l10n” and
“localization” or “i18n” and “internationalization”. I’m sure the people
who work in those fields appreciate not having to type what could easily
amount to 100 characters a page for words that everyone reading such a
paper would know.

···

On Fri, 2002-07-05 at 16:17, Sean Russell wrote:

Yeah, but this is a known problem that is waiting for the m17n (i18n?)
solution. Man, those are some of the most un-intuitive namings of anything
I’ve ever seen.

POLS is a good thing when designing a language, but it should
be remembered that POLS is from Matz and is the ‘least surprise’
to Matz. Any ‘least surprise’ that applies to others is just
their good fortune.

···

On Sat, Jul 06, 2002 at 07:29:44AM +0900, Marko Schulz wrote:

So POLS is a nice thought, but I don’t base too many decisions on it.


Jim Freeze
If only I had something clever to say for my comment…
~

No. Not for Japanese, Chinese, Korean, Russian, (I think) Hebrew …

Well, maybe. I can’t think of a language where a string isn’t a
set of characters in that language. However, when you’re talking
about a computer representation, a string is a sequence of characters,
but a “character” in the alphabet can itself be a sequence of
bytes.

A String might appear at first to be derived from Array, but a String
is really an Array of Characters, where a Character might itself be
an Array of Bytes.

A String is: an Array of Characters + a CharacterSet, IMHO. Whether
the implementation should be based on that, is something else.
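
To make the "characters versus bytes" point concrete, here is a small sketch. It assumes a modern Ruby (1.9 or later), where every String carries an encoding, which is close in spirit to the "Array of Characters + a CharacterSet" picture; that is a later design and not something this thread's Ruby offered. The sample word is arbitrary.

```ruby
# Three Japanese characters, written with \u escapes so the literal is
# UTF-8 regardless of the source file's encoding.
word = "\u65E5\u672C\u8A9E"

char_count = word.chars.length                    # number of characters
byte_count = word.bytesize                        # number of bytes stored
bytes_per_char = word.chars.map { |c| c.bytesize }
charset = word.encoding.name                      # the "CharacterSet" part

puts char_count
puts byte_count
puts bytes_per_char.inspect
puts charset
```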

Dossy wrote:

This works fine for German and English Strings. Does this hold for
other languages too?

No. Not for Japanese, Chinese, Korean, Russian, (I think) Hebrew …

Well, maybe. I can’t think of a language where a string isn’t a
set of characters in that language. However, when you’re talking
about a computer representation, a string is a sequence of characters,
but a “character” in the alphabet can itself be a sequence of
bytes.

A String might appear at first to be derived from Array, but a String
is really an Array of Characters, where a Character might itself be
an Array of Bytes.

so it’s still

String < Array

A string is an array of characters which are arrays of bytes.

No?
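
As it happens, `String < Array` is itself a valid Ruby expression, so the proposal can be checked against the actual class hierarchy. A minimal sketch, assuming a modern Ruby (1.9 or later, where method names come back as symbols); it shows what the language does, not what it should do.

```ruby
# Module#< reports the inheritance relation between two classes or modules.
string_under_array = String < Array          # nil: the classes are unrelated
string_is_comparable = String < Comparable   # true: String mixes in Comparable

# Methods both classes define themselves, despite sharing no parent
# other than Object.
shared = String.instance_methods(false) & Array.instance_methods(false)

puts string_under_array.inspect
puts string_is_comparable.inspect
puts shared.include?(:reverse)
puts shared.include?(:flatten)
```

So the two classes overlap in vocabulary (`reverse`, `length`, `[]` and so on) without one inheriting from the other.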

Tobi

···


http://www.pinkjuice.com/

In the first place, POLS is a Matz-centric phenomenon
(as he has admitted). The more you think like Matz,
the better you will understand Ruby.

I’m not sure if I’m understanding you correctly, but I’ve heard of
the “POLA” (A = astonishment) since the mid/late 80’s. I had assumed
POLS came from that.

Oh, yes, that’s true. I didn’t mean that the
idea originated with him.

(BTW, I first heard it in The Tao of Programming,
around '86… do you know an earlier reference?)

I just meant that, according to Matz, when POLS is
applied to Ruby, the standard is HIS surprise, not
anyone else’s. He got a laugh at RubyConf2001 when
he said, “‘Least Surprise’ means ‘Least Surprise
for Me.’”

Hal Fulton

common sense is tightly coupled to the culture you live in or come from.

···

On Sat, Jul 06, 2002 at 06:09:57AM +0900, Tobias Reif wrote:

Hal E. Fulton wrote:

it is by definition, not by coincidence, no?

No, not by definition… a string is certainly
not defined as an array (in the computer science
sense) in Ruby.

I was referring to the realm of common sense, which only shares some
intersection with CS :)


marko schulz

Are ordered sets of characters in a sorted tree structure Arrays?

I don’t think that “ordered set of characters” implies “Array”.

···

— Sean Russell ser@germane-software.com wrote:

Hal E. Fulton wrote:

And I think Strings are not really Arrays. There are
some isomorphisms there, since they are both “ordered
collections of entities.”

But Strings /are/ arrays, by the words of your own definition. Really, a String is
an array (read: sequence) of Characters (which are displayed with glyphs);
words, sentences, any syntactic construction is built up of a sequence of
Characters.


Tobias Reif wrote:

I was referring to the realm of common sense, which only shares some
intersection with CS :)

Hah! That’s got to go in my .sig file!

···

… “When choosing between evils, I always like to take the one I’ve
<|> never tried before.”
/|\ – Mae West
/|

IMO this is the heart of the matter: the technical core of strings vs.
their meaning. Technically strings make sense as arrays of characters.
Semantically strings make sense as arrays of words, blocks, paragraphs,
sentences, lines, pages, etc.
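
The same String can be cut up at each of these levels. A rough sketch, assuming a modern Ruby (2.0 or later, where `String#lines` returns an Array); the sample text is invented.

```ruby
text = "one two\nthree four\n"

chars = text.chars   # the technical view: one element per character
words = text.split   # a semantic view: split on runs of whitespace
lines = text.lines   # another semantic view: one element per line,
                     # with the trailing newline kept

puts chars.length
puts words.inspect
puts lines.inspect
```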

– Nikodemus

···

On Sat, 6 Jul 2002, Tom Sawyer wrote:

but when you get down to the core of it, strings are ultimately stored
as ordered sets of data. the difference for ruby really amounts to the
fact that the ordered set of data for a ruby Array consists of reference

Hi –

so it’s still

String < Array

A string is an array of characters which are arrays of bytes.

No?

If you literally mean that String should be understood as inheriting
all the methods from Array, then I think this is too rigid a picture.
I’m thinking of some of the functionalities which are complementary as
between arrays and strings – pack/unpack, join/split – and also
#flatten comes to mind as an array method whose equivalent for strings
isn’t clear.
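
The complementary pairs mentioned above look like this in practice. A small sketch, assuming a modern Ruby (1.9 or later); the values are arbitrary.

```ruby
# split turns a String into an Array; join turns it back.
parts = "a,b,c".split(",")
joined = parts.join(",")

# unpack turns a String into an Array of numbers; pack turns it back.
codes = "abc".unpack("C*")    # "C*" reads every byte as an unsigned integer
packed = codes.pack("C*")

# flatten only makes sense for a container that can nest.
flat = [1, [2, [3]]].flatten
string_can_flatten = "abc".respond_to?(:flatten)

puts parts.inspect
puts joined
puts codes.inspect
puts packed
puts flat.inspect
puts string_can_flatten
```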

I’ve always thought that strings are basically arrays but merit
certain kinds of special treatment, extra methods, optimizations
(syntax as well as speed), etc., because they are so common that it’s
worth trading off some consistency (in their treatment as arrays) to
make them easier to handle.

David

···

On Sat, 6 Jul 2002, Tobias Reif wrote:


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

I’m not sure if I’m understanding you correctly, but I’ve heard of
the “POLA” (A = astonishment) since the mid/late 80’s. I had assumed
POLS came from that.

Oh, yes, that’s true. I didn’t mean that the
idea originated with him.

(BTW, I first heard it in The Tao of Programming,
around '86… do you know an earlier reference?)

Well, no. The earliest I remember was about that time, but came from
a manager of mine at the time. I have “sensed” an undercurrent in
this group though that it had originated with Matz, which I am
certain is not the case.

I just meant that, according to Matz, when POLS is
applied to Ruby, the standard is HIS surprise, not
anyone else’s. He got a laugh at RubyConf2001 when
he said, “‘Least Surprise’ means ‘Least Surprise
for Me.’”

Ah yes, of course.

···


Nikodemus Siivola wrote:

IMO this is the heart of the matter: the technical core of strings vs.
their meaning. Technically strings make sense as arrays of characters.
Semantically strings make sense as arrays of words, blocks, paragraphs,
sentences, lines, pages, etc.

… which are all arrays of characters.

Tobi

···


http://www.pinkjuice.com/

I would love to see a separation of the code for String when used as a
data construct vs. String when used as a linguistic construct. Both
uses are common in programming, and it would be a shame to compromise
one in favor of the other. I think we need a Phrase language string
class, or ByteString (RawString?) for byte level programming.
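
For comparison, later Rubies keep a single String class and separate the two uses by encoding rather than by class. A minimal sketch, assuming Ruby 2.0 or later (for `String#b`); it illustrates that later design and is not the ByteString class proposed here.

```ruby
# "cafe" with an accented final letter, written with a \u escape so the
# literal is UTF-8.
text = "caf\u00E9"

# String#b returns a copy tagged with the binary (ASCII-8BIT) encoding,
# so every byte counts as one element.
raw = text.b

puts text.length                       # length in characters
puts raw.length                        # length in bytes
puts raw.encoding == Encoding::BINARY
```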

- alan
···

On Sat, Jul 06, 2002 at 05:11:29PM +0900, Nikodemus Siivola wrote:

On Sat, 6 Jul 2002, Tom Sawyer wrote:

but when you get down to the core of it, strings are ultimately stored
as ordered sets of data. the difference for ruby really amounts to the
fact that the ordered set of data for a ruby Array consists of reference

IMO this is the heart of the matter: the technical core of strings vs.
their meaning. Technically strings make sense as arrays of characters.
Semantically strings make sense as arrays of words, blocks, paragraphs,
sentences, lines, pages, etc.

– Nikodemus


Alan Chen
Digikata LLC
http://digikata.com

Michael Campbell wrote:

I don’t think that “ordered set of characters” implies “Array”.

I don’t know of a better definition of “Array” than “an ordered set of
somethings”.

Linked lists are also ordered sets of somethings; arrays just have a
particular customary implementation in computers. Still, people tend to
deal with strings the same way they deal with arrays, as opposed to linked
lists.

···

… I love my country.
<|> It’s my government I don’t trust.
/|\
/|

David Alan Black wrote:

If you literally mean that String should be understood as inheriting
all the methods from Array, then I think this is too rigid a picture.

stuff that’s not shared could be undef’d, or not included

I’ve always thought that strings are basically arrays but merit
certain kinds of special treatment, extra methods, optimizations
(syntax as well as speed), etc., because they are so common that it’s
worth trading off some consistency (in their treatment as arrays) to
make them easier to handle.

extra methods, sure;
wouldn’t speed be more behind the scenes?

anyways, as Dave (?) said, I also think that String#each should yield
each character (modeled as String, Character, or something else)
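
For readers on a current Ruby: in the Ruby of this thread, String#each iterated over lines, which is the surprise under discussion. The sketch below assumes Ruby 1.9 or later, where String#each was removed and the unit of iteration has to be named explicitly.

```ruby
# Iterate by character, which is what is being asked for above.
collected = []
"hey".each_char { |c| collected << c }

# Iterate by line, which is what the old String#each did.
line_count = "a\nb\n".each_line.count

# Iterate by byte.
byte_values = "hey".each_byte.to_a

# There is no plain String#each any more.
has_each = "hey".respond_to?(:each)

puts collected.inspect
puts line_count
puts byte_values.inspect
puts has_each
```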

Tobi

···


http://www.pinkjuice.com/