Doing an AND in regexp char class

This question arises out of a couple of recent threads and may or may
not be a Ruby-specific question.

I can check with a character class if one of the characters in the
class exists or does not exist, but can I use a regexp to check if a
string absolutely contains all of the characters in the class?

Using a set perspective, I can do it like this in irb...

s1 = "hello there"
s2 = "ohi"
(s2.unpack('c*') & s1.unpack('c*')).size == s2.size

=> false

I use unpack to avoid creating a bunch of String objects, one for each
element in the array, which would happen if I used #split. What I'm
wondering is if there is a way to do this with a simple regexp.

Thanks,
Todd
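
One caveat with the set-style check above: Array#& returns unique elements, so comparing against s2.size gives a false negative when s2 repeats a character ("oo", for example). A minimal sketch that deduplicates first; the method name contains_all_bytes? is made up for illustration, and it compares bytes, so it assumes single-byte characters:

```ruby
# Hypothetical helper (not from the thread): true if every byte of
# +needles+ occurs somewhere in +haystack+.
def contains_all_bytes?(haystack, needles)
  wanted = needles.unpack('C*').uniq   # dedupe so "oo" is counted once
  (wanted & haystack.unpack('C*')).size == wanted.size
end

p contains_all_bytes?("hello there", "ohi")  # there is no "i" in the haystack
p contains_all_bytes?("hello there", "ohe")
p contains_all_bytes?("hello there", "oo")
```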

cfp:~ > cat a.rb
class String
   def all_chars? chars
     tr(chars, '').empty?
   end
end

p 'foobar'.all_chars?('rabof')
p 'foobar'.all_chars?('abc')
p 'foobar'.all_chars?('')

cfp:~ > ruby a.rb
true
false
false

a @ http://codeforpeople.com/

···

On May 8, 2008, at 3:40 PM, Todd Benson wrote:

--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama

REs can do this, but may not be the best way. The way that comes to
mind is to see if the string matches the characters in any order,
i.e. for "ohi" either ohi, oih, hio, hoi, iho, or ioh
so something like

/o([^h]*h[^i]*i|[^i]*i[^h]*h)|h([^i]*i[^o]*o|[^o]*o[^i]*i)|i([^h]*h[^o]*o|[^o]*o[^h]*h)/

meaning

   o followed by either
      zero or more non-h's followed by an h followed by zero or more
non-i's followed by an i
      or
      zero or more non-i's followed by an i followed by zero or more
non-h's followed by an h
   or
   h followed by either
      ...
  ...

It would be possible to generate such an RE from the string.

But maybe someone cleverer with REs has a better approach.

···

On Thu, May 8, 2008 at 5:40 PM, Todd Benson <caduceass@gmail.com> wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Cool :) #tr is one of those useful methods I somehow consistently forget about.

tkx fur realizashuns,
Todd

···

On Thu, May 8, 2008 at 6:07 PM, ara.t.howard <ara.t.howard@gmail.com> wrote:

On May 8, 2008, at 3:40 PM, Todd Benson wrote:

cfp:~ > cat a.rb
class String
def all_chars? chars
   tr(chars, '').empty?
end
end

p 'foobar'.all_chars?('rabof')
p 'foobar'.all_chars?('abc')
p 'foobar'.all_chars?('')

cfp:~ > ruby a.rb
true
false
false

Using String#tr is nice, but the result is not what Todd wants:

  s1 = "hello there"
  s2 = "ohe"

  (s2.unpack('c*') & s1.unpack('c*')).size == s2.size
  => true

  class String
   def all_chars? chars
     tr(chars, '').empty?
   end
  end

  s1.all_chars?(s2)
  => false

Like in the regexp examples, you have to switch self and chars:

  class String
   def all_chars? chars
     chars.tr(self, '').empty?
   end
  end

  s1.all_chars?(s2)
  => true

Regards,
Pit

···

2008/5/9 ara.t.howard <ara.t.howard@gmail.com>:

On May 8, 2008, at 3:40 PM, Todd Benson wrote:

Using a set perspective, I can do it like this in irb...

s1 = "hello there"
s2 = "ohi"
(s2.unpack('c*') & s1.unpack('c*')).size == s2.size

=> false

cfp:~ > cat a.rb
class String
def all_chars? chars
   tr(chars, '').empty?
end
end

me too. just got lucky this time ;)

a @ http://codeforpeople.com/

···

On May 8, 2008, at 5:30 PM, Todd Benson wrote:

tr is one of those useful methods I somehow consistently forget about

Todd Benson wrote:

Cool :) #tr is one of those useful methods I somehow consistently forget about.

But it can be done with regex, right? It's just more elegant with tr.

class String
   def all_chars? chars
     if chars.empty?
       empty?
     else
       /\A[#{chars}]*\z/ === self
     end
   end
end

p 'foobar'.all_chars?('rabof') # => true
p 'foobar'.all_chars?('abc') # => false
p 'foobar'.all_chars?('') # => false

···

On Thu, May 8, 2008 at 6:07 PM, ara.t.howard <ara.t.howard@gmail.com> wrote:

On May 8, 2008, at 3:40 PM, Todd Benson wrote:

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

I'm drawing a blank here with this one. Why doesn't this work then...

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

Todd

···

On Thu, May 8, 2008 at 7:00 PM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

Joel VanderWerf wrote:

But it can be done with regex, right? It's just more elegant with tr.

class String
   def all_chars? chars
     if chars.empty?
       empty?
     else
       /\A[#{chars}]*\z/ === self
     end
   end
end

p 'foobar'.all_chars?('rabof') # => true
p 'foobar'.all_chars?('abc') # => false
p 'foobar'.all_chars?('') # => false

Your method doesn't work, which can clearly be seen in these examples:

strs = ["aaa", "bbb", "ccc"]
chars = "abc"

strs.each do |str|

  if /\A[#{chars}]*/ =~ str
    print str, " - yes"
    puts
  else
    print str, " - no"
    puts
  end

end

--output:--
aaa - yes
bbb - yes
ccc - yes

It should be clear from the output that even though the string "aaa"
passes your test, it is not true that all the characters in the string
"abc" appear in the string "aaa".

···

--
Posted via http://www.ruby-forum.com/.

me too. just got lucky this time ;)

Knowledge --> the art of getting lucky very often, right Ara ;)

we can deny everything, except that we have the possibility of being better.
simply reflect on that.
h.h. the 14th dalai lama

BTW when I was referring to the quote I learnt most about I was
thinking about "Be kind whenever it is possible. It is always
possible".

Not that I dislike the others or apply any judgment; I just wanted to
be clear that I *personally* learnt the most from the above :)

Cheers
Robert

···

On Fri, May 9, 2008 at 1:50 AM, ara.t.howard <ara.t.howard@gmail.com> wrote:

--
http://ruby-smalltalk.blogspot.com/

---
Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein

Todd Benson wrote:

I'm drawing a blank here with this one. Why doesn't this work then...

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

Maybe I'm confused about what was wanted originally. The above tests the following condition:

   (set of chars occurring in given string)
      is_a_subset_of
   (given set of chars).

irb(main):007:0> /\A[oh]*\z/ === "hohoho"
=> true
irb(main):008:0> /\A[oh]*\z/ === "ho ho"
=> false

If you want superset instead of subset, this works:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Hi --

Todd Benson wrote:

I'm drawing a blank here with this one. Why doesn't this work then...

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

"hello, there" contains letters other than o and h, but your regex
calls for a string consisting of zero or more o's or h's and nothing
else.

I think there might be some confusion as between determining that a
string contains certain characters, and determining that a string
contains *only* certain characters. My understanding was that you
wanted the first, which you could do with tr but I think you'd
probably want the character cluster to be doing the tr'ing:

   "oh".tr("hello, there","").empty? # true; all letters in "oh"
                                         # are also in "hello, there"
   "hello, there".tr("ho","").empty? # false

They're both strings, of course, so you can do either with Ara's
or Joel's methods:

   "oh".all_chars?("hello, there") # true
   "hello, there".all_chars?("oh") # false

though if it's really the former you want you might want to name it
all_present_in? or something.

David
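
One thing to keep in mind with the tr-based versions is that tr gives "-", a leading "^", and "\" special meaning in its first argument, so a string such as "a-c" is read as a range. A minimal sketch of the all_present_in? naming suggested above; the body is an assumption that swaps tr for a literal per-character lookup, at the cost of allocating the one-character strings the original question tried to avoid:

```ruby
class String
  # Name taken from the suggestion in this thread; the implementation is
  # an illustrative assumption, not code posted by anyone here.
  def all_present_in?(other)
    each_char.all? { |c| other.include?(c) }
  end
end

p "oh".all_present_in?("hello, there")
p "ohi".all_present_in?("hello, there")
p "a-c".all_present_in?("abc")   # "-" is compared literally, not as a range
```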

···

On Fri, 9 May 2008, Todd Benson wrote:

On Thu, May 8, 2008 at 7:00 PM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

On Thu, May 8, 2008 at 6:07 PM, ara.t.howard <ara.t.howard@gmail.com> wrote:

On May 8, 2008, at 3:40 PM, Todd Benson wrote:

--
Rails training from David A. Black and Ruby Power and Light:
   INTRO TO RAILS June 9-12 Berlin
   ADVANCING WITH RAILS June 16-19 Berlin
   INTRO TO RAILS June 24-27 London (Skills Matter)
See http://www.rubypal.com for details and updates!

Hi --

7stud -- wrote:

Your method doesn't work, which can clearly be seen in these examples:

strs = ["aaa", "bbb", "ccc"]
chars = "abc"

strs.each do |str|

if /\A[#{chars}]*/ =~ str
   print str, " - yes"
   puts
else
   print str, " - no"
   puts
end

end

--output:--
aaa - yes
bbb - yes
ccc - yes

It should be clear from the output that even though the string "aaa"
passes your test, it is not true that all the characters in the string
"abc" appear in the string "aaa".

Do it the other way around (and don't forget the \z):

   if /\A[#{str}]*\z/ =~ chars

It's really the characters in str that you're testing, to make sure
that none of them fail to match the characters in chars. If the
variable names seem backwards, you can change them. It's the logic
that's important, and it works fine.

David

···

On Fri, 9 May 2008, 7stud -- wrote:

Hi --

Todd Benson wrote:

I'm drawing a blank here with this one. Why doesn't this work then...

irb(main):006:0> r = /\A[oh]*\z/
=> /\A[oh]*\z/
irb(main):007:0> s = "hello, there"
=> "hello, there"
irb(main):008:0> r === s
=> false

Maybe I'm confused about what was wanted originally. The above tests the following condition:

(set of chars occurring in given string)
    is_a_subset_of
(given set of chars).

irb(main):007:0> /\A[oh]*\z/ === "hohoho"
=> true
irb(main):008:0> /\A[oh]*\z/ === "ho ho"
=> false

If you want superset instead of subset, this works:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

That depends on the order, though. To do the superset test, you could
just do the subset, but in the other direction: check that the
character class, as a string, doesn't contain anything that isn't in
the main string:

   str = "h o"
   chars = "ho"

   /\A[#{str}]*\z/ === chars # true

(though probably best to uniquify the string first).

David
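
A sketch of the reversed character-class test with the string uniquified first, assuming each character is also passed through Regexp.escape so that "-", "^", "]" and "\" are taken literally inside the class. The name subset_via_char_class? is hypothetical:

```ruby
# Hypothetical helper: true if every character of +chars+ occurs in +str+.
def subset_via_char_class?(chars, str)
  return chars.empty? if str.empty?   # "[]" is not a valid character class
  char_class = str.chars.uniq.map { |c| Regexp.escape(c) }.join
  /\A[#{char_class}]*\z/ === chars
end

p subset_via_char_class?("ho", "h o")
p subset_via_char_class?("ohi", "hello there")
p subset_via_char_class?("b", "a-c")   # "a-c" is three literal characters here
```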

···

On Fri, 9 May 2008, Joel VanderWerf wrote:

Yep. The subject title is misleading, because the AND is already
there: [^ho] means not h _and_ also not o.

I was looking to find if given a string A, can I say whether or not
all of the characters in string A exist in string B (count doesn't
matter, just existence). All of you gave me some good answers that I
hadn't thought of. Good brain food :)

Todd

···

On Thu, May 8, 2008 at 7:26 PM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

David A. Black wrote:

Do it the other way around (and don't forget the \z):

Whoops.

   if /\A[#{str}]*\z/ =~ chars

It's really the characters in str that you're testing, to make sure
that none of them fail to match the characters in chars. If the
variable names seem backwards, you can change them. It's the logic
that's important, and it works fine.

Nice.

···

On Fri, 9 May 2008, 7stud -- wrote:

--
Posted via http://www.ruby-forum.com/.

David A. Black wrote:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

That depends on the order, though.

Yes, it's buggy. Should use //m:

irb(main):003:0> /(?=.*h)(?=.*o)/ === "o \nh"
=> false
irb(main):004:0> /(?=.*h)(?=.*o)/m === "o \nh"
=> true

Does that fix the order problem you were thinking of?
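
For an arbitrary set of characters, the lookahead pattern can be built from a string. This is a sketch only: the name all_lookaheads_regexp is hypothetical, each character is escaped with Regexp.escape, and the \A anchor is an addition so the lookaheads are tried once from the start rather than at every position:

```ruby
# Hypothetical builder: one zero-width lookahead per distinct character,
# compiled with the multiline flag so "." also matches newlines.
def all_lookaheads_regexp(chars)
  lookaheads = chars.chars.uniq.map { |c| "(?=.*#{Regexp.escape(c)})" }
  Regexp.new('\A' + lookaheads.join, Regexp::MULTILINE)
end

p all_lookaheads_regexp("ho") === "o \nh"
p all_lookaheads_regexp("ohi") === "hello there"   # there is no "i"
```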

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Hi --

···

On Fri, 9 May 2008, Joel VanderWerf wrote:

David A. Black wrote:

irb(main):013:0> /(?=.*h)(?=.*o)/ === "h o"
=> true

That depends on the order, though.

Yes, it's buggy. Should use //m:

irb(main):003:0> /(?=.*h)(?=.*o)/ === "o \nh"
=> false
irb(main):004:0> /(?=.*h)(?=.*o)/m === "o \nh"
=> true

Does that fix the order problem you were thinking of?

Actually I think I was wrong about the order mattering (since they're
zero-width). But /m helps anyway. I still think you could just change
the roles of the two strings and dissect "the string" as a character
class and "the characters" as a string, and use your original
technique.

David
