Regular Expression

I am trying to write a regular expression to match a word against my input
string.

The input string can be, for example, inpstr = ABCDD.

The expression should match all of the words below, i.e. words with 0 or 1 A,
0 or 1 B, 0 or 1 C and 0, 1 or 2 Ds:

A
AC
CA
DAD
BAC
ADD

but not words that use a letter too often, such as two As, two Cs or three Ds:
AAB
ABCC
ADDD

I am using an expression like ^[ABCDD]*$, but this also matches words like ABB
or ABCC, which repeat B or C too often.
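
For what it's worth, the likely reason is that square brackets define a set of allowed characters rather than a budget of letters: [ABCDD] is the same class as [ABCD], and * lets any member repeat without limit. A minimal Ruby sketch (the word list is only an illustration):

```ruby
# A character class is a set: the duplicate D inside the brackets is ignored.
pattern = /^[ABCDD]*$/

words = %w[A AC DAD ABB ABCC ADDD]
results = words.map { |word| [word, pattern.match?(word)] }.to_h

results.each { |word, matched| puts "#{word} => #{matched}" }
# Every word above matches, including the ones that should be rejected.
```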

Any suggestion on how I can fix this?

Thanks

···

--
Posted via http://www.ruby-forum.com/.

You are going to need some more advanced regexes to get a match in the way
you are hoping for.

I would get out a copy of a good regex cheat sheet, and work through some
examples in http://rubular.com
http://www.addedbytes.com/download/regular-expressions-cheat-sheet-v2/pdf/

The main reason for rubular is that you list all of the strings you want to
match, and which to reject, and keep tweaking the regex until you get what you want.

OTOH this regex might scratch your itch:
/^A?B?C?D{0,2}$/

···

On Tue, Nov 2, 2010 at 9:05 PM, Dv Dasari <dv.mymail@gmail.com> wrote:

--
http://richardconroy.blogspot.com | http://twitter.com/RichardConroy

Take a look at lookahead and lookbehind assertions. They are used to
match conditionally, depending on whether a given pattern follows or
precedes the current position.

> OTOH this regex might scratch your itch:
> /^A?B?C?D{0,2}$/

Note that the last quantifier here (the {0,2}) only applies to the
last character, the D, not the whole expression.

"ABBC" =~ /^A?B?C?D{0,2}$/

=> nil

HTH,
Ammar

···

On Tue, Nov 2, 2010 at 11:34 PM, Richard Conroy <richard.conroy@gmail.com> wrote:

The above works only if A, B, C and D must appear in that order,
when present. This goes against the OP's examples: CA, DAD, and BAC.

My suspicion is that your problem isn't solvable by a regular
expression alone, but that you'll need to do some parsing (still
possibly using regular expressions in the process).
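
A quick way to see the ordering issue, as a small sketch that runs the ordered pattern over the words the original post wants to accept:

```ruby
ordered = /^A?B?C?D{0,2}$/

wanted = %w[A AC CA DAD BAC ADD]   # words the original post wants to accept
missed = wanted.reject { |word| ordered.match?(word) }

puts "not matched: #{missed.join(', ')}"
```

Only the words whose letters happen to appear in A, B, C, D order get through.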

···

On Tue, Nov 2, 2010 at 3:34 PM, Richard Conroy <richard.conroy@gmail.com> wrote:

--
Kendall Gifford
zettabyte@gmail.com

Kendall Gifford wrote in post #958838:

> The above works so long as each of ABC or D must come in said order,
> if present. This goes against the OP's examples: CA, DAD, and BAC.
>
> My suspicion is that your problem isn't solvable by a regular
> expression alone, but that you'll need to do some parsing (still
> possibly using regular expressions in the process).

Yes, you are correct, this expression doesn't match words like CA, BAD
or CAD.

Just wondering if there is a way to allow the letters in any combination
or order.

--
Posted via http://www.ruby-forum.com/.

regex = /^(A(?!A)|B(?!B)|C(?!C)|D{1,2}(?!D))+$/

["ABCDD","CA","CAD","CADD","CADDD","DAD","BAC","BBAD","AABCCD"].each do |string|
   puts "'#{string}' => #{string.match(regex)}"
end

···

==============
'ABCDD' => ABCDD
'CA' => CA
'CAD' => CAD
'CADD' => CADD
'CADDD' =>
'DAD' => DAD
'BAC' => BAC
'BBAD' =>
'AABCCD' =>

On Nov 2, 2010, at 5:55 PM, Kendall Gifford wrote:

Mike Cargal

mike@cargal.net

In theory, any language/grammar/syntax construct of finite length can
be matched with a regular expression, it's just that the expression
would, for most things, get really huge fast as the length of said
construct grows.

So, for your example you CAN match it with just ONE regular
expression, but as you've noticed, it will be loooong:

/^(A|B|C|D|A[BCD]|B[ACD]|C[ABD]|D[ABC]|AB[CD]|AC[BD]|AD[BCD]|BA[CD]...etc.)$/

Better to just use code in such situations, utilizing simple regex
patterns in the process.
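
To get a feel for how long the exhaustive expression would be, it can be generated rather than typed. A rough sketch, assuming the letter budget A, B, C, D, D from the original post (the variable names are made up here):

```ruby
letters = %w[A B C D D]   # the "budget" of letters from the original post

# Every distinct arrangement of 1..5 letters drawn from that budget.
words = (1..letters.size).flat_map do |n|
  letters.permutation(n).map(&:join)
end.uniq

# Regexp.union builds one big alternation; \A and \z anchor the whole string.
exhaustive = /\A#{Regexp.union(words)}\z/

puts "#{words.size} alternatives"
puts %w[CA DAD ABCDD AAB ADDD].map { |w| "#{w} => #{exhaustive.match?(w)}" }
```

The alternation has well over a hundred branches even for this tiny alphabet, which supports the advice to use code instead.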

···

On Tue, Nov 2, 2010 at 4:04 PM, Dv Dasari <dv.mymail@gmail.com> wrote:

--
Kendall Gifford
zettabyte@gmail.com

Afternoon,

Unless for some reason you are doing homework or something and need to use a
regular expression, may I suggest the following instead.

You have a very small string and a very small set of options in terms of
possible letters. Your string can have no more than 5 characters,
so why not create a "binary" version of your string and then check to see if
anything appears more often than you want?

For example

chars = {}
# Why the powers jump by 5 explanation follows
chars['a'] = 2**0
chars['b'] = 2**5
chars['c'] = 2**10
chars['d'] = 2**15

test_string = 'aabcd'

bit_value = 0

test_string.downcase.each_char{ |c| bit_value = bit_value + chars[c] }

if (bit_value & 949214) != 0 #Magic number explanation to follow
  puts 'Bad string'
else
  puts 'Accepted string'
end

Really what this does is go character by character and put each character
into its "bucket"

So we start with a binary value of all zeros

00000 00000 00000 00000 - we use 5 slots (or increase our power of 2 by 5)
for each character because they could appear 5 times each

Going through the test_string - 'aabcd'

First an a - we add 2**0 which is 1

0+1 = 1

Binary
00000 00000 00000 00001

Next another a - add another 1

1+1 = 2
Binary
00000 00000 00000 00010

b is next - add 2**5 or 32

2+32 = 34
Binary
00000 00000 00001 00010

c - add 2**10 or 1024

34+1024 = 1058
Binary
00000 00001 00001 00010

d - add 2**15 or 32768

1058+32768 = 33826
Binary
00001 00001 00001 00010

It should be fairly apparent that the location of the 1 in each "bucket"
represents the number of times that character appears in our string.

If you had aaabb then your binary value would be 00000 00000 00010 00100
dddcd would look like 01000 00001 00000 00000

And so on - obviously if we have none of a character then we have no 1 in
that "bucket"

So let's now take a look at the "magic" number of 949214 which has a binary
rep of

11100 11110 11110 11110

This number represents all the possible locations of the binary digit one
that you would find unacceptable. Either 5,4, or 3 Ds and 5, 4, 3, or 2 Cs,
Bs, or As.

So now we have

00001 00001 00001 00010 - test value
11100 11110 11110 11110 - our mask of unacceptable options

And when we & (bit wise AND) the two values we get

00000 00000 00000 00010 - we have too many As in this case.

Therefore since this value is above 0 we have a problem. Any valid value
will return zero here.

Not a regular expression - but I believe this would be faster given your
limited set of options and string lengths.
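
The walkthrough describes a single marker bit per bucket whose position encodes the count. One way to get exactly that behaviour is to shift the marker inside its bucket instead of adding to it. A rough sketch, assuming only the letters a to d are allowed; OFFSETS, BUCKET_MASK, BAD_MASK and acceptable? are names invented for this example:

```ruby
BUCKET_BITS = 5
BUCKET_MASK = 0b11111
OFFSETS     = { 'a' => 0, 'b' => 5, 'c' => 10, 'd' => 15 }.freeze
BAD_MASK    = 0b11100_11110_11110_11110   # same value as the magic number 949214

def acceptable?(string)
  bits = 0
  string.downcase.each_char do |c|
    offset = OFFSETS[c]
    return false if offset.nil?                    # letter outside a-d
    bucket = (bits >> offset) & BUCKET_MASK
    return false if bucket[BUCKET_BITS - 1] == 1   # a sixth occurrence would overflow
    moved = bucket.zero? ? 1 : bucket << 1         # 00001 -> 00010 -> 00100 ...
    bits = (bits & ~(BUCKET_MASK << offset)) | (moved << offset)
  end
  (bits & BAD_MASK).zero?
end

%w[abcdd dad aabcd ddd].each { |s| puts "#{s} => #{acceptable?(s)}" }
```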

John

···

On Tue, Nov 2, 2010 at 3:04 PM, Dv Dasari <dv.mymail@gmail.com> wrote:

However, by my test Mike's regex also matches "DADD", "ABA", etc:

regex = /^(A(?!A)|B(?!B)|C(?!C)|D{1,2}(?!D))+$/

["ABA", "DADD", "CAC"].each do |string|
  puts "'#{string}' => #{string.match(regex)}"
end

'ABA' => ABA
'DADD' => DADD
'CAC' => CAC

It does get you closer though. I rarely remember to make use of
look-ahead (and look-behind and other "(?X" style patterns) since when
switching languages/regexp engines, I'm never sure what features will
be there (and will still work the same). I guess I'm too conservative,
sticking with core/basic features.

This does make me curious how short of a regex using these features
could be written for this one case...
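
One possible answer, not taken from the thread and offered only as a sketch: move the counting into negative lookaheads at the start of the string, one per letter, and let a plain character class consume the word. The name compact and the word lists are illustrative.

```ruby
# Each lookahead vetoes one kind of excess: a second A, B or C, or a third D.
compact = /\A(?!.*A.*A)(?!.*B.*B)(?!.*C.*C)(?!.*D.*D.*D)[A-D]+\z/

accept = %w[A AC CA DAD BAC ADD ABCDD]
reject = %w[AAB ABCC ADDD ABA DADD CAC MIKE]

(accept + reject).each { |w| puts "'#{w}' => #{compact.match?(w)}" }
```

Because the lookaheads scan the whole string from position 0, the order of the letters no longer matters.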

···

On Thu, Nov 4, 2010 at 5:46 AM, Mike Cargal <mike@cargal.net> wrote:

--
Kendall Gifford
zettabyte@gmail.com

> 00000 00000 00000 00000 - we use 5 slots (or increase our power of 2 by 5)
> for each character because they could appear 5 times each

Do they? In order to verify that you would need a separate regexp
which limits overall length of the sequence to 4.

But I think your math is wrong. Since you always add the same value
(e.g. 2**0 for "a") as you show below, you can store 2**5 = 32
different values (0 to 31) for each character. The 2**5 thing only
makes sense if you use binary OR and shift the mask for each
character. Am I missing something?
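
The point about adding can be checked directly. A small sketch that reuses the values from the snippet under discussion (bit_value and buckets are helper names invented here): with plain addition each 5-bit bucket ends up holding an ordinary binary count, so three Ds give 00011 in the D bucket, which the mask 11100 does not catch.

```ruby
CHARS = { 'a' => 2**0, 'b' => 2**5, 'c' => 2**10, 'd' => 2**15 }.freeze
MASK  = 949214

# Sum the per-letter values exactly as the posted snippet does.
def bit_value(string)
  string.downcase.each_char.sum { |c| CHARS.fetch(c) }
end

# Render a value as four 5-bit groups, most significant bucket (d) first.
def buckets(value)
  value.to_s(2).rjust(20, '0').scan(/.{5}/).join(' ')
end

%w[aabcd aaabb ddd].each do |s|
  value = bit_value(s)
  puts "#{s}: #{buckets(value)}  masked: #{value & MASK}"
end
```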

Interesting approach. I would simply have done

def check(input)
  unless /\A[A-D]{0,4}\z/ =~ input
    raise ArgumentError, "Illegal chars in sequence: %p" % input
  end
  cnt = Hash.new(0)
  input.scan(/./) do |m|
    raise ArgumentError, "Illegal sequence %p" % input if (cnt[m] += 1) > 1
    # alt: return false
  end
  # alt: true
end

Kind regards

robert

···

On Tue, Nov 2, 2010 at 11:31 PM, John W Higgins <wishdev@gmail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Okay, now this is even closer:

regex = /^(A(?=[^A]*$)|B(?=[^B]*$)|C(?=[^C]*$)|D{1,2}(?!D))+$/

However, it still has problems with "DADD", "DADBD" and such...

···

On Thu, Nov 4, 2010 at 11:27 AM, Kendall Gifford <zettabyte@gmail.com> wrote:

--
Kendall Gifford
zettabyte@gmail.com

There are different requirements for each letter. Adapting your solution:

def check(input)
  limits = {'A' => 1, 'B' => 1, 'C' => 1, 'D' => 2}
  # Up to 5 letters are legal (A, B, C, D, D), hence {0,5}.
  unless /\A[A-D]{0,5}\z/ =~ input
    raise ArgumentError, "Illegal chars in sequence: %p" % input
  end
  cnt = Hash.new(0)
  input.scan(/./) do |m|
    cnt[m] += 1
  end

  cnt.each do |letter, amount|
    if amount > limits[letter]
      raise ArgumentError, "Illegal chars in sequence: %p" % input
    end
  end
end

Jesus.
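
For comparison, the same per-letter limits can be expressed as a predicate that returns true or false instead of raising. This is only a sketch: it assumes Ruby 2.7 or later for Enumerable#tally, and LIMITS and valid_word? are names chosen for the example.

```ruby
LIMITS = { 'A' => 1, 'B' => 1, 'C' => 1, 'D' => 2 }.freeze

def valid_word?(input)
  # Reject anything that is empty or contains letters outside A-D.
  return false unless input.match?(/\A[A-D]+\z/)
  # Count each letter once, then compare against its limit.
  input.each_char.tally.all? { |letter, amount| amount <= LIMITS[letter] }
end

%w[A AC CA DAD BAC ADD AAB ABCC ADDD].each do |word|
  puts "#{word} => #{valid_word?(word)}"
end
```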

···

On Wed, Nov 3, 2010 at 11:10 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

Oh... missed that detail... I *think* this covers the bases...

regex = /^(A(?!.*A)|B(?!.*B)|C(?!.*C)|D{1,2}(?!.*D))+$/

["ABCDD","CA","CAD",
  "CADD","CADDD","DAD",
  "BAC","BBAD","AABCCD",
  "DADD","DADBD","ABCDDA",
  "MIKE"].each do |string|
   puts "'#{string}' => #{string.match(regex)}"
end

====================
'ABCDD' => ABCDD
'CA' => CA
'CAD' => CAD
'CADD' => CADD
'CADDD' =>
'DAD' =>
'BAC' => BAC
'BBAD' =>
'AABCCD' =>
'DADD' =>
'DADBD' =>
'ABCDDA' =>
'MIKE' =>

Mike Cargal

mike@cargal.net

> There are different requirements for each letter. Adapting your solution:

Good point! I overlooked that.

I'd rather do:

def check(input)
  # Up to 5 letters are legal (A, B, C, D, D), hence {0,5}.
  unless /\A[A-D]{0,5}\z/ =~ input
    raise ArgumentError, "Illegal chars in sequence: %p" % input
  end
  cnt = {'A' => 1, 'B' => 1, 'C' => 1, 'D' => 2}
  input.scan(/./) do |m|
    raise ArgumentError, "Illegal sequence %p" % input if (cnt[m] -= 1) < 0
    # alt: return false
  end
  # alt: true
end

:)

Cheers

robert

···

2010/11/3 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Except that "DAD" *is* legal according to OP's description. :)

···

On Thu, Nov 4, 2010 at 4:53 PM, Mike Cargal <mike@cargal.net> wrote:

On Nov 4, 2010, at 1:34 PM, Kendall Gifford wrote:

--
Kendall Gifford
zettabyte@gmail.com

"We don't need no stinkin' loops!"

def check input
  # Up to 5 letters are legal (A, B, C, D, D), hence {0,5}.
  return false unless /\A[A-D]{0,5}\z/ =~ input
  %w(A B C D).map{|s| input.count s}.zip( [1,1,1,2] ).
    map{|a,b| b-a}.all?{|n| n >= 0}
end

···

On Nov 3, 7:31 am, Robert Klemme <shortcut...@googlemail.com> wrote:

grrr......

regex = /^(A(?!.*A)|B(?!.*B)|C(?!.*C)|D(?!.*D.*D|DD(?!.*D)))+$/

["ABCDD","CA","CAD",
  "CADD","CADDD","DAD","DADD","DADAD","ADD",
  "BAC","BBAD","AABCCD",
  "DADD","DADBD","ABCDDA",
  "MIKE"].each do |string|
   puts "'#{string}' => #{string.match(regex)}"
end

···

======================
'ABCDD' => ABCDD
'CA' => CA
'CAD' => CAD
'CADD' => CADD
'CADDD' =>
'DAD' => DAD
'DADD' =>
'DADAD' =>
'ADD' => ADD
'BAC' => BAC
'BBAD' =>
'AABCCD' =>
'DADD' =>
'DADBD' =>
'ABCDDA' =>
'MIKE' =>

On Nov 4, 2010, at 7:06 PM, Kendall Gifford wrote:

Mike Cargal

mike@cargal.net

The question is: what qualifies as a loop? Explicit loops are only
done with "while", "until" and "for". Everything else is just a
method call with a block.

According to the strict loop definition my code does not have a loop
either. According to the wide loop definition your code has a lot
more loops than mine. I can spot at least five of them, plus a lot
of temporary Array instances.

Cheers

robert

···

On Wed, Nov 3, 2010 at 5:45 PM, w_a_x_man <w_a_x_man@yahoo.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

def check input
  # Up to 5 letters are legal (A, B, C, D, D), hence {0,5}.
  /\A[A-D]{0,5}\z/.match input and
  %w(A B C D).map{|s| input.count s}.zip( [1,1,1,2] ).
    map{|a,b| b-a}.min >= 0
end

···

On Nov 3, 10:42 am, w_a_x_man <w_a_x_...@yahoo.com> wrote:

"We don't need no stinkin' loops!"

double grrrrr..... (I messed up the parentheses on the last one)

regex = /^(A(?!.*A)|B(?!.*B)|C(?!.*C)|D(?!.*D.*D)|DD(?!.*D))+$/

["ABCDD","CA","CAD",
"CADD","CADDD","DAD","DADD","DADAD","ADD",
"BAC","BBAD","AABCCD",
"DADD","DADBD","ABCDDA",
"MIKE"].each do |string|
  puts "'#{string}' => #{string.match(regex)}"
end
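
As a sanity check rather than a proof, this last regex can be compared against plain letter counting for every non-empty string over A to D up to six characters long. The constant and method names below are made up for the sketch:

```ruby
MIKE_FINAL = /^(A(?!.*A)|B(?!.*B)|C(?!.*C)|D(?!.*D.*D)|DD(?!.*D))+$/
LIMITS = { 'A' => 1, 'B' => 1, 'C' => 1, 'D' => 2 }.freeze

# Reference check: count each letter and compare with its limit.
def within_limits?(word)
  LIMITS.all? { |letter, max| word.count(letter) <= max }
end

# All strings of length 1..6 over the alphabet A, B, C, D.
candidates = (1..6).flat_map do |n|
  %w[A B C D].repeated_permutation(n).map(&:join)
end

disagreements = candidates.reject { |w| MIKE_FINAL.match?(w) == within_limits?(w) }
accepted = candidates.count { |w| MIKE_FINAL.match?(w) }

puts "checked #{candidates.size} strings, #{disagreements.size} disagreements"
puts "accepted #{accepted} words"
```

Note that the pattern uses ^ and $, which are line anchors in Ruby, so multi-line input would need \A and \z instead.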

···

On Nov 4, 2010, at 7:14 PM, Mike Cargal wrote:

Mike Cargal

mike@cargal.net
http://blog.mikecargal.com
