Hi,
Yukihiro Matsumoto wrote:
Hi,
the only point i am trying to make is that in so far as both account for
order and list, i want their exposed methodologies to be the same.
that’s it. (which is why .each should work differently)
a string is not an array.
but still, a string can be seen as an array of characters
sometimes, hence str[0] returns the first character (a byte in the
current implementation) in the string.
The text processing model of Ruby is based on lines, not characters.
That’s why I made “each” line-oriented, not character-oriented.
I took usefulness over consistency here.
So, if you want String consistent with Array, you need to express
either:
- why it is important over usefulness
- or line-oriented “each” is not useful at all
good enough to bring incompatibility.
matz.
Well, “usefulness” depends on the application. And it seems that
sometimes, strings are better seen as arrays (not Arrays!) of characters,
and at other times, as arrays of lines (and at yet other times, as arrays
of words or paragraphs).
Since there doesn’t seem to be a universally happy medium, perhaps the
problem lies in assuming there is one. I think the method “each” is the
problem: it’s ambiguous (“each what?”).
Couldn’t we have separate methods (as suggested earlier) #bytes, #chars (or
#characters), #words (maybe), #lines, #pars (or #paragraphs) for String?
We could have identical methods for the IO class, so IOs and Strings can be
used interchangeably with these methods. These methods could be both used
to iterate over the object (when called with a block), or used to retrieve
an Array (not array!) of the elements we’re interested in (when called
without a block). E.g.:
    s = "abc\ndef\nghi"
    s.lines                                  # => ["abc", "def", "ghi"]
    a = []
    s.lines {|line| a << (line + 'xyz')}     # => nil (or maybe something else)
    a                                        # => ["abcxyz", "defxyz", "ghixyz"]
Thus, io.lines &proc has the same effect as io.lines.each &proc (though
would hopefully require less memory when say, reading in a large file,
since the former doesn’t need to read the whole thing into a gigantic Array
first).
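To make the proposal concrete, here is a minimal sketch in today's Ruby. The module name `Pieces` and its module-level methods are hypothetical (nothing here is part of Ruby itself); it only assumes that the source object responds to `each_line` and `each_char`, which both String and StringIO do. Each method yields to a block when one is given and returns an Array otherwise, and the line version strips the record separator as wished for in this post.

```ruby
require "stringio"

# Hypothetical helper, not part of Ruby: one unambiguous method per unit.
# With a block, each method iterates and returns nil; without a block,
# it returns an Array of the pieces.
module Pieces
  def self.chars(src, &blk)
    return enum_for(:chars, src).to_a unless blk
    src.each_char(&blk)
    nil
  end

  # Lines with the trailing record separator removed.
  def self.lines(src, &blk)
    return enum_for(:lines, src).to_a unless blk
    src.each_line { |line| blk.call(line.chomp) }
    nil
  end

  # Whitespace-separated words, read one line at a time.
  def self.words(src, &blk)
    return enum_for(:words, src).to_a unless blk
    src.each_line { |line| line.split.each(&blk) }
    nil
  end

  # Paragraphs: an empty separator puts each_line into paragraph mode.
  def self.pars(src, &blk)
    return enum_for(:pars, src).to_a unless blk
    src.each_line("") { |par| blk.call(par.strip) }
    nil
  end
end

s = "abc\ndef\nghi"
a = []
Pieces.lines(s) { |line| a << (line + "xyz") }   # block form, fills a
Pieces.lines(StringIO.new(s))                    # same call on an IO-like object
```

In the block form nothing is collected, so an IO is consumed one line at a time; the Array is only built when no block is passed.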
Now, what about #each? Well, we’ve made it essentially useless for calling
directly (since the others are more readable and unambiguous (agree?)).
But there’s still the interaction of String and IO with the Enumerable
mixin. What about this?:
We have methods corresponding to the iterator/collector methods to set the
one that #each points to: #useBytes, #useChars (or #useCharacters),
#useWords, #useLines, #usePars (or #useParagraphs).
These could be defined in their own mixin module:
    module StringProcessing
      def useBytes
        class << self
          alias each bytes
        end
      end
      def useChars
        class << self
          alias each chars
        end
      end
      def useWords
        class << self
          alias each words
        end
      end
      def useLines
        class << self
          alias each lines
        end
      end
      def usePars
        class << self
          alias each pars
        end
      end
    end
These would change the behaviour of #each for the String or IO instance
they’re called on. E.g.:
    s = "abc\ndef\nghi"
    s.useChars
    s.collect {|char| frob char}   # iterates over chars
    s.useLines
    s.collect {|line| frob line}   # now it iterates over lines
The default behaviour would be to iterate over lines, so as to be backward
compatible. Well, except for my little, implicit wish that the record
separators would be removed automatically. E.g., I’d rather
“abc\ndef”.lines would return [“abc”, “def”] than [“abc\n”, “def”]. But I
guess that’s another war… Still, even if the record separators stay,
this’d be a pretty flexible String/IO model, wouldn’t it?
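For anyone who wants to try the switching idea, here is a rough sketch that runs on a current Ruby. It keeps the method names from this post, but the implementation is an assumption: instead of aliasing, each `use*` method defines a per-object `each` that forwards to `each_char`, `each_byte` or `each_line`, and the module includes Enumerable itself because String no longer does. Note that extending a string this way lets Enumerable methods such as `count` and `include?` shadow the String versions on that object.

```ruby
# Hypothetical sketch: method names follow the post, the implementation is assumed.
module StringProcessing
  include Enumerable   # supplies collect, select, first, ... on top of #each

  # Default: iterate over lines, for backward compatibility.
  def each(&blk)
    each_line(&blk)
  end

  # Each use* method installs a singleton #each on the receiver only.
  def useChars
    define_singleton_method(:each) { |&blk| each_char(&blk) }
    self
  end

  def useBytes
    define_singleton_method(:each) { |&blk| each_byte(&blk) }
    self
  end

  def useLines
    define_singleton_method(:each) { |&blk| each_line(&blk) }
    self
  end
end

# String.new gives an unfrozen copy that can be extended.
s = String.new("abc\ndef\nghi").extend(StringProcessing)
s.collect { |line| line.chomp }          # default: iterates over lines
s.useChars
s.first(3)                               # now iterates over chars
s.useLines
s.collect { |line| line.chomp.upcase }   # back to lines
```

Only the object that called `useChars` changes; other strings, extended or not, keep their own `each`.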
Thoughts?
In message “Re: is there a better string.each?” on 02/07/08, Tom Sawyer <transami@transami.net> writes: