[BUG] string range membership

Matz,

For your information, member? used to iterate over the
items to check membership. But because of confusion
between include? and member?, they were merged. The
point is that Ranges are used both for ranges and for
intervals. Sometimes users want one to behave like a
range bounded by its begin/end values. Sometimes they
want it to behave like the set of values that #each
produces.

I'd like to take care of this issue, but I don't yet know the
right way to solve it. Perhaps we should
provide both membership methods, with the right names for
each. Any ideas?

    Ah, I see. So really, the root problem here is the assumption by
Range that (value < value.succ). And in String, this assumption does
not always hold true:

irb(main):001:0> s = 'z'
=> "z"
irb(main):002:0> s < s.succ
=> false

    Because of that, there is a huge distinction between
str_range.to_a.member?(x) (is x a member of the set of the range's
values) and (str_range.first <= x <= str_range.last) (is x in the
range's interval).

    So, given that (at least in the case of ranges of strings) there is a
clear distinction between a value being included in the interval and a
value being included in the set, it appears that we have a real need for
two different methods. The methods Range#include? (in interval) and
Range#member? (of set) seem to be perfect candidates for these two
different functionalities. Before these two methods were merged, did
they take on these two functionalities, or were they different in some
other way?
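
The set/interval distinction described above can be made concrete with a small sketch. The helper names in_set? and in_interval? are hypothetical (they are not part of Ruby); they only spell out the two readings side by side:

```ruby
# Hypothetical helpers (not part of Ruby) that keep the two questions apart.

# Set view: would Range#each ever produce x? Expands the whole range.
def in_set?(range, x)
  range.to_a.include?(x)
end

# Interval view: does x lie between the endpoints under String#<=> ?
def in_interval?(range, x)
  range.first <= x && x <= range.last
end

r = "a".."aa"              # each produces "a", "b", ..., "z", "aa"

p "z" < "z".succ           # false, because "z".succ is "aa"
p in_set?(r, "z")          # true:  each produces "z"
p in_interval?(r, "z")     # false: "z" sorts after "aa"

# The mismatch also runs the other way:
p in_interval?("a".."c", "bb")  # true:  "a" <= "bb" <= "c"
p in_set?("a".."c", "bb")       # false: each produces only "a", "b", "c"
```

Note that in_set? expands the whole range into an array, so it is only a sketch for small, finite ranges.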

    Are there other cases where "membership" changes depending on
whether the range is viewed as a set or an interval? If not, perhaps it
would be better to address the fact that str.succ violates the (str <
str.succ) assumption. Perhaps the functionality currently in
String#succ could be moved to another method (String#increment
perhaps?), and String#succ could take on a new functionality that does
not violate (str < str.succ).

    Anyway, please let me know if there is anything I can do to help
settle this issue.

   - Warren Brown

Hi,

So, given that (at least in the case of ranges of strings) there is a
clear distinction between a value being included in the interval and a
value being included in the set, it appears that we have a real need for
two different methods. The methods Range#include? (in interval) and
Range#member? (of set) seem to be perfect candidates for these two
different functionalities. Before these two methods were merged, did
they take on these two functionalities, or were they different in some
other way?

#include? was used for the range check; #member? was for set membership.
But since they have the same functionality in Enumerable, some claimed that
having different behaviors in Range was confusing. I agreed.

   Anyway, please let me know if there is anything I can do to help
settle this issue.

All we need is making up good names for each functionality.

              matz.


In message "Re: [BUG] string range membership" on Thu, 24 Nov 2005 01:03:19 +0900, "Warren Brown" <warrenbrown@aquire.com> writes:

Range#contains?

??

-a


On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:

Hi,

In message "Re: [BUG] string range membership" on Thu, 24 Nov 2005 01:03:19 +0900, "Warren Brown" <warrenbrown@aquire.com> writes:

>So, given that (at least in the case of ranges of strings) there is a
>clear distinction between a value being included in the interval and a
>value being included in the set, it appears that we have a real need for
>two different methods. The methods Range#include? (in interval) and
>Range#member? (of set) seem to be perfect candidates for these two
>different functionalities. Before these two methods were merged, did
>they take on these two functionalities, or were they different in some
>other way?

#include? used for range check, #member? was for set membership. But
since they have same functionality in Enumerable, some claimed having
different behaviors in Range was confusing. I agreed.

> Anyway, please let me know if there is anything I can do to help
>settle this issue.

All we need is making up good names for each functionality.

--

ara [dot] t [dot] howard [at] noaa [dot] gov
all happiness comes from the desire for others to be happy. all misery
comes from the desire for oneself to be happy.
-- bodhicaryavatara

===============================================================================

All we need is making up good names for each functionality.

That is NOT all you need! This does not solve the complete problem, but
only provides a little-bitty patch for querying a Range member, and a
very inefficient one at that --which I thought was part of the reason
you changed #include? and #member? to be the same in the first place.

The overarching issue is that sortable and comparable are using the
same method #<=>, but they do not necessarily want the same meaning.
You should provide a separate method for comparable --like I said, in
most cases they will be equivalent, but not so in String. And
dictionary order comparison is needed anyway. I studied this issue
exhaustively over a year ago when I wrote a true Interval class.

T.

Yukihiro Matsumoto wrote:

#include? used for range check, #member? was for set membership. But
since they have same functionality in Enumerable, some claimed having
different behaviors in Range was confusing. I agreed.

> Anyway, please let me know if there is anything I can do to help
>settle this issue.

All we need is making up good names for each functionality.

How about something like Enumerable#produces?, or Enumerable#yields?

Then perhaps start deprecating Enumerable#member?

So for a range, one could use r.include?(obj) to test for obj between the endpoints, and r.yields?(obj) to test whether r.succ ever yields obj.

Enumerable#=== becomes an issue (case statement), right?
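
A minimal sketch of the proposed method, assuming the name Enumerable#yields? from the post above (it does not exist in Ruby) and assuming "yields" means "produced by #each":

```ruby
module Enumerable
  # Hypothetical method, named after the proposal above; not part of Ruby.
  # Answers the set-membership question by walking #each and stopping at
  # the first match, so nothing is expanded into an array first.
  def yields?(obj)
    each { |x| return true if x == obj }
    false
  end
end

r = "a".."aa"
p r.yields?("z")    # true:  each produces "a".."z" and then "aa"
p r.yields?("zz")   # false: each stops at "aa"
p [1, 2, 3].yields?(2)  # true
```

On an enumerable that never ends, this loops forever when obj is never produced, which is the cost of the set reading.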

Hi,


In message "Re: [BUG] string range membership" on Thu, 24 Nov 2005 09:38:11 +0900, "Ara.T.Howard" <ara.t.howard@noaa.gov> writes:

All we need is making up good names for each functionality.

  Range#contains?

??

For which functionality?

              matz.

Hi,

All we need is making up good names for each functionality.

That is NOT all you need! This does not solve the complete problem, but
only provides a little-bitty patch for query on a Range member, and a
very inefficient one at that --which I thought was part of the reason
you changed #include and #member to be the same in the first place.

Depends on how you define problem.

The overarching issue is that sortable and comparable are using the
same method #<=>, but they do not necessarily want the same meaning.
You should provide a separate method for comparable --like I said, in
most cases they will be equivalent, but not so in String. And
dictionary order comparison is needed anyway. I studied this issue
exhaustively over a year ago when I wrote a true Interval class.

I'm not sure what you meant here. Range has no relation with
sorting. Can you elaborate?

              matz.


In message "Re: string range membership" on Thu, 24 Nov 2005 11:07:26 +0900, "Trans" <transfire@gmail.com> writes:

Ara.T.Howard wrote:

>
> All we need is making up good names for each functionality.

   Range#contains?

??

I'm sure there was an earlier post with an excellent
synopsis on ranges which stated that a Range /doesn't/
"contain"? Yeh, here it is ... from you ;))

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/167200

  "not quite - we have a string range that __produces__
   27 elements. it does not 'have' or 'contain' them.
   it merely suggests this set as it's current thought"

(SCNR ;)

Would something like Range#covers? be more apt?
(meaning within the bounds).
Oh, pooh, that's got an 's' on the end as well :(

daz


On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:

Bob Showalter wrote:

How about something like Enumerable#produces?, or Enumerable#yields?

Then perhaps start deprecating Enumerable#member?

So for a range, one could use r.include?(obj) to test for obj between
the endpoints, and r.yields?(obj) to test whether r.succ ever yields obj.

Enumerable#=== becomes an issue (case statement), right?

A nice alternative bit of thinking, Bob. At least you are making some
sense.

As for the rest of the gibberish being posted here, which btw has been
the same flap for years, forget it. It's hopeless. You all will be
right back to where you were two years ago, two years from now.

Adios,
T.

well, i would think of #member? as most natural for set membership - so
#contains? would/should be most like #include? - in my mind.

   harp:~ > cat a.rb
   module Enumerable
     # true if #each would ever produce value (set membership)
     def contains?(value)
       to_a.include?(value)
     end
   end

   r = "a".."aa"
   p r.contains?("z")

   harp:~ > ruby a.rb
   true

so, if each would 'hit' it - it's contained.

kind regards.

-a


On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:

Hi,

In message "Re: [BUG] string range membership" on Thu, 24 Nov 2005 09:38:11 +0900, "Ara.T.Howard" <ara.t.howard@noaa.gov> writes:

>> All we need is making up good names for each functionality.
>
> Range#contains?
>
>??

For which functionality?


===============================================================================

I'm not sure what you meant here. Range has no relation with
sorting. Can you elaborate?

#succ defines a sort order of sorts (pun intended ;-). But #<=> defines
a sort order too along with comparability. In most classes there's no
problem, but in String the two come into conflict --the orders are not
the same.

Then consider that Range is not a true interval because it uses #succ.
This is why I created a true Interval class that uses #+ instead.
Likewise Range shouldn't use #<=> either, but another method, let's call
it #cmp. This would fix the problem.

In general:

  module Comparable
    def cmp(o)
      self<=>o
    end
  end

That is to say, for anything comparable #cmp is the same as #<=>,
unless otherwise defined. (Alternately you could define #cmp as an
alias of #<=> directly in the classes it is needed --that would
probably be better.) Then in String define #cmp specially to conform to
the successive order as defined by #succ.

Thus having Range use #cmp instead of #<=> the issue is solved.

In summary, an object would then be "Rangeable" if it supports #succ,
but only fully so if it also supports #cmp (instead of #<=>).
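
As a rough illustration of what a #succ-compatible comparison for strings could look like: the sketch below uses a hypothetical free function, succ_order_cmp, rather than a real String#cmp, and it assumes non-empty strings made only of the lowercase letters a-z. Under that assumption String#succ visits shorter strings first and equal-length strings in dictionary order, so comparing [length, string] pairs reproduces the #succ order.

```ruby
# Hypothetical comparison (not part of Ruby), valid only under the stated
# assumption: non-empty strings of lowercase a-z.
# Compares length first, then dictionary order, which matches the order in
# which String#succ visits such strings ("y", "z", "aa", "ab", ...).
def succ_order_cmp(a, b)
  [a.length, a] <=> [b.length, b]
end

# Interval test built on the #succ-compatible order instead of #<=>.
def succ_interval_include?(range, x)
  succ_order_cmp(range.first, x) <= 0 && succ_order_cmp(x, range.last) <= 0
end

p "z" <=> "aa"                # 1:  dictionary order puts "z" after "aa"
p succ_order_cmp("z", "aa")   # -1: succ order puts "z" before "aa"

r = "a".."aa"
p succ_interval_include?(r, "z")    # true, agreeing with r.to_a.include?("z")
p succ_interval_include?(r, "ab")   # false, "ab" comes after "aa"
```

With such a comparison the interval test and the set test give the same answer for these strings, without expanding the range.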

Does it make sense now? (Sorry if I'm not explaining well, it's a tad
subtle and it's been awhile since I worked on it too, so I have been
trying to recall it all myself too).

T.

lol. i realized that actually - i thought that the confusion with "include?"
being used to test "containment" might be resolved by having a method actually
named "contains?". ;)

too confusing?

-a


On Fri, 25 Nov 2005, daz wrote:

Ara.T.Howard wrote:

On Thu, 24 Nov 2005, Yukihiro Matsumoto wrote:

All we need is making up good names for each functionality.

   Range#contains?

??

I'm sure there was an earlier post with an excellent
synopsis on ranges which stated that a Range /doesn't/
"contain"? Yeh, here it is ... from you ;))

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/167200

"not quite - we have a string range that __produces__
  27 elements. it does not 'have' or 'contain' them.
  it merely suggests this set as it's current thought"

(SCNR ;)

Would something like Range#covers? be more apt?
(meaning within the bounds).
Oh, pooh, that's got an 's' on the end as well :(


===============================================================================

ara.t.howard wrote:

lol. i realized that actually - i thought that the confusion with "include?"
being used to test "containment" might be resolved by having a method actually
named "contains?". ;)

How about "within?" for a value within a given range?

-- Jim Weirich


Jim Weirich wrote

How about "within?" for a value within a given range?

   (0..5).within?(3)

reads backwards, IMO compared to:

   (0..5).include?(3)

daz

daz wrote:

Jim Weirich wrote

How about "within?" for a value within a given range?

  (0..5).within?(3)

reads backwards, IMO compared to:

  (0..5).include?(3)

Yes, I agree. I think just finding a new word for a method that does something that used to have different names in the past won't help. This has been tried, but it didn't work too well.

Ruby's ranges have (at least) a dual nature:
1. as an interval (a, b) of values,
2. as a shortcut for a set of values { a, a.succ, a.succ.succ, ..., b }.

I think "include?" is a good name for 1. And 2 is very similar to 1, so people will easily confuse those names.

What about using a bit of double dispatch here, like that:

class Object
  def element?(r)
    r.find { |x| x == self } ? true : false
  end
end

"bb".element? "a".."zz" # => true

This doesn't read backwards, and the name conveys the meaning of set membership, as required by 2.

Perhaps using a method other than "find" for searching (one that only defaults to "find") would make it possible to provide an alternative implementation for data structures that can compute membership faster than O(n).
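
One way the pluggable search could look, as a sketch only: the hook name search_element is an assumption made up for this illustration, and Set stands in for "a data structure that can compute membership faster than O(n)".

```ruby
require "set"

module Enumerable
  # Hypothetical hook (not part of Ruby): the default is a linear scan
  # over #each, which is O(n).
  def search_element(obj)
    any? { |x| x == obj }
  end
end

class Set
  # A Set can answer with a hash lookup instead of scanning.
  def search_element(obj)
    include?(obj)
  end
end

class Object
  # Double dispatch as proposed above: the element asks the collection,
  # and the collection decides how to search.
  def element?(collection)
    collection.search_element(self)
  end
end

p "bb".element?("a".."zz")    # true, found by the linear default
p "zzz".element?("a".."zz")   # false
p 3.element?(Set[1, 2, 3])    # true, found by the Set override
```

Using any? with a block for the default also avoids the case where the searched-for object is itself nil or false, which a find-based version would report as not found.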


--
Florian Frank

I was going to suggest r.has_element?(x) for the equivalent of #member?,
and maybe r.surrounds?(x) for #include?. That one is not as good.


On 11/25/05, Florian Frank <flori@nixe.ping.de> wrote:

daz wrote:

>Jim Weirich wrote
>
>>How about "within?" for a value within a given range?
>
> (0..5).within?(3)
>
>reads backwards, IMO compared to:
>
> (0..5).include?(3)
>
Yes, I agree. I think, just finding a new word for a method that does
something, that used to have different names in the past, won't help.
This has been tried, but it didn't work too well.

Ruby's ranges have (at least) a dual nature:
1. as an interval (a, b) of values,
2. as a shortcut for a set of values { a, a.succ, a.succ.succ, ..., b }.

I think "include?" is a good name for the 1. And 2 is very similar to 1,
so people will easily confuse those names.

What about using a bit of double dispatch here, like that:

class Object
  def element?(r)
    r.find { |x| x == self } ? true : false
  end
end

"bb".element? "a".."zz" # => true

This doesn't read backwards, and the name conveys the meaning of set
membership, as required by 2.

Perhaps using another method than "find" for searching (that only
defaults to "find") would make it possible, to provide an alternative
implementation for datastructures, that can compute membership faster
than O(n).

--
Florian Frank

Florian, your double dispatch is interesting. Meanwhile, I still have no
idea if anyone has understood the #cmp solution I've proposed, since no
one has commented on it. A fully general solution for #cmp looks
something like this:

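  def cmp( other )
    return 0 if self == other
    before, after = other, self
    loop do
      # step both #succ chains forward until one reaches the other value
      before, after = before.succ, after.succ
      return 1 if before == self    # other's chain reached self: other comes first
      return -1 if after == other   # self's chain reached other: self comes first
    end
  end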

Of course no one would ever use this because 'other' may not be an
actual member and thus never hit on ==. So the only way to ensure
member comparison in a fully general way is to have the Range on hand
--hence your double dispatch. A generalized solution would then be:

  def cmp( other, range )
    return 0 if self == other
    arr = range.to_a
    arr.index( self ) <=> arr.index( other )
  end

But this is silly since Range can do this itself, no need to double
dispatch --if #cmp is not defined on the object, Range can always
expand into an array and compare indexes itself. But there's still the
risk of infinite expansions.
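
A sketch of that fallback, with the comparison done on the Range side. The function name index_cmp is hypothetical, and returning nil for non-members is an assumption made here (mirroring how #<=> returns nil for incomparable values), not something stated above:

```ruby
# Hypothetical helper (not part of Ruby): compare a and b by their position
# in the expanded range. Returns nil when either value is not a member.
# Only usable on finite ranges, since to_a expands everything.
def index_cmp(range, a, b)
  items = range.to_a
  ia = items.index(a)
  ib = items.index(b)
  return nil if ia.nil? || ib.nil?
  ia <=> ib
end

r = "a".."aa"
p index_cmp(r, "z", "aa")   # -1: "z" is produced before "aa"
p index_cmp(r, "z", "b")    # 1
p index_cmp(r, "z", "zz")   # nil: "zz" is never produced
```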

Likewise I think the double dispatching within an #element? method is
in the same league. If a #cmp can't be defined and used to determine
membership neither will an #element? method be able to, so Range then
must resort to 'to_a.include?'

Range is better off depending on a comparison method just for it to
ensure compatibility with #succ --which also ensures determination of
membership with the methods we already have, #member? and #include?
--Then they would do exactly what the documentation says they're
supposed to do, which they actually DO NOT do at the moment.

T.