How to Efficiently Calculate the Pattern of Zeros and Ones?

William_Djaja_Tjokr1 · 18 September 2002 18:49

Hi,

I am dealing with this algorithmic problem. I have an array of arbitrary
integers. I have to count the occurences of 1’s with the pattern

"0 0 ... 0 1 0 .. 0 0"

in the array. The minimum number of zeros on each side of the 1 is a
parameter, say m = 2. Also at the beginning and at the end of the array,
the boundary condition does not require the minimum number of zeros, as
long as they are zeros (or non-existent).

For example, with m = 2:

[1 0 0 1 0 0 5 1]   --> 2
[0 0 1 0 0 1 0 2]   --> 1
[1 0 1 0 1 0 1 0]   --> 0
[1 0 0 0 0 0 1 0]   --> 2

Typical array length will be around 24. The problem is I will have
thousands, if not hundreds of thousands, of such arrays. What is a good
way to do it in Ruby? (Even coverting the array first to, for example,
string and then use regexp, is also fine as long as it is efficient.)

Regards,

Bill

Joseph_McDonald · 18 September 2002 20:11

I am dealing with this algorithmic problem. I have an array of arbitrary
integers. I have to count the occurences of 1's with the pattern

"0 0 ... 0 1 0 .. 0 0"

in the array. The minimum number of zeros on each side of the 1 is a
parameter, say m = 2. Also at the beginning and at the end of the array,
the boundary condition does not require the minimum number of zeros, as
long as they are zeros (or non-existent).

For example, with m = 2:

    [1 0 0 1 0 0 5 1] --> 2
    [0 0 1 0 0 1 0 2] --> 1
    [1 0 1 0 1 0 1 0] --> 0
    [1 0 0 0 0 0 1 0] --> 2

Typical array length will be around 24. The problem is I will have
thousands, if not hundreds of thousands, of such arrays. What is a good
way to do it in Ruby? (Even coverting the array first to, for example,
string and then use regexp, is also fine as long as it is efficient.)

Well, I don't have a test case for it, so there could be a bug
lurking, but this does over 10,000 per second on my box:

I'm sure someone can make it faster.

#!/usr/local/bin/ruby

def count_pattern(num,arrays) # pass in a zillion arrays

  arrays.each do |arr|
    hits = 0
    onehit = false
    numzeros = num # cheater

    arr.each do |x|
      if x == 1 and numzeros >= num
        onehit = true
        numzeros = 0
        next
      end
      if x == 0
        numzeros += 1
        if onehit == true and numzeros >= num # HIT!
          hits += 1
          onehit = false
        end
      else
        numzeros = 0
        onehit = false
      end
    end
    hits += 1 if onehit == true # trailing one
    puts "[#{arr.join(' ')}] -> #{hits}"
  end
end

arrs = [
[1, 0, 0, 1, 0, 0, 5, 1],
[0, 0, 1, 0, 0, 1, 0, 2],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1, 0]
]

count_pattern(2,arrs)

regards,
-joe

Mrchameleon · 19 September 2002 01:18

William Djaja Tjokroaminata billtj@z.glue.umd.edu wrote in message news:amagpp$lkf$1@grapevine.wam.umd.edu…

Hi,

I am dealing with this algorithmic problem. I have an array of arbitrary
integers. I have to count the occurences of 1’s with the pattern
"0 0 ... 0 1 0 .. 0 0"
in the array. The minimum number of zeros on each side of the 1 is a
parameter, say m = 2. Also at the beginning and at the end of the array,
the boundary condition does not require the minimum number of zeros, as
long as they are zeros (or non-existent).

For example, with m = 2:
[1 0 0 1 0 0 5 1]   --> 2
[0 0 1 0 0 1 0 2]   --> 1
[1 0 1 0 1 0 1 0]   --> 0
[1 0 0 0 0 0 1 0]   --> 2
Typical array length will be around 24. The problem is I will have
thousands, if not hundreds of thousands, of such arrays. What is a good
way to do it in Ruby? (Even coverting the array first to, for example,
string and then use regexp, is also fine as long as it is efficient.)

Regards,

Bill

Is this too slow?

arr.select { |i| i ==1 }.size

Yours

Chris

Johan_Holmberg · 19 September 2002 08:20

William Djaja Tjokroaminata billtj@z.glue.umd.edu writes:

For example, with m = 2:

[1 0 0 1 0 0 5 1]   --> 2
[0 0 1 0 0 1 0 2]   --> 1
[1 0 1 0 1 0 1 0]   --> 0
[1 0 0 0 0 0 1 0]   --> 2

I don’t know if the following is efficient in terms of CPU-cylces
but it is at least rather short and easy to read (once you know
the cool enum-package).

/Johan Holmberg

···

#----------------------------------------------------------------------
require “enum/cluster”

lists = [
[1, 0, 0, 1, 0, 0, 5, 1],
[0, 0, 1, 0, 0, 1, 0, 2],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1, 0],
]

facit = [0, 0, 1, 0, 0]
m = 2

for list in lists
count = 0
list.each_with_neighbors(m, 0) do |part|
count += 1 if part == facit
end
puts “# in #{list.inspect} = #{count}”
end
#----------------------------------------------------------------------

William_Djaja_Tjokr1 · 18 September 2002 21:10

Hey, it seems it is working pretty well. Thanks a lot for writing the
code. Basically it is a type of “one-pass” algorithm, so it is
efficient. I am going to modify it, because based on the system
description, it seems that I will not need the trailing zero test. (It is
for something that is temporarily called RSMA (Reception Sense Multiple
Access); a terminal will not transmit if it hears that another terminal
already transmits on a particular channel.) Thanks a lot.

Regards,

Bill

···

===========================================================================
Joseph McDonald joe@vpop.net wrote:

Well, I don’t have a test case for it, so there could be a bug
lurking, but this does over 10,000 per second on my box:

I’m sure someone can make it faster.

(deleted)

David_Alan_Black1 · 19 September 2002 01:19

Hi –

···

On Thu, 19 Sep 2002, Chris Reay wrote:

William Djaja Tjokroaminata billtj@z.glue.umd.edu wrote in message news:amagpp$lkf$1@grapevine.wam.umd.edu…
Hi,

I am dealing with this algorithmic problem. I have an array of arbitrary
integers. I have to count the occurences of 1’s with the pattern
"0 0 ... 0 1 0 .. 0 0"
in the array. The minimum number of zeros on each side of the 1 is a
parameter, say m = 2. Also at the beginning and at the end of the array,
the boundary condition does not require the minimum number of zeros, as
long as they are zeros (or non-existent).

For example, with m = 2:
[1 0 0 1 0 0 5 1]   --> 2
[0 0 1 0 0 1 0 2]   --> 1
[1 0 1 0 1 0 1 0]   --> 0
[1 0 0 0 0 0 1 0]   --> 2
Typical array length will be around 24. The problem is I will have
thousands, if not hundreds of thousands, of such arrays. What is a good
way to do it in Ruby? (Even coverting the array first to, for example,
string and then use regexp, is also fine as long as it is efficient.)

Regards,

Bill
Is this too slow?

arr.select { |i| i ==1 }.size

That counts all the 1’s, without regard for whether or not they’re
surrounded by the requisite number of 0’s.

David

–
David Alan Black | Register for RubyConf 2002!
home: dblack@candle.superlink.net | November 1-3
work: blackdav@shu.edu | Seattle, WA, USA
Web: http://pirate.shu.edu/~blackdav | http://www.rubyconf.com

Joel_VanderWerf1 · 19 September 2002 20:43

Johan Holmberg wrote:

William Djaja Tjokroaminata billtj@z.glue.umd.edu writes:

For example, with m = 2:

[1 0 0 1 0 0 5 1] → 2
[0 0 1 0 0 1 0 2] → 1
[1 0 1 0 1 0 1 0] → 0
[1 0 0 0 0 0 1 0] → 2

I don’t know if the following is efficient in terms of CPU-cylces
but it is at least rather short and easy to read (once you know
the cool enum-package).

I had to go back and check whether each_with_neighbors generated lots of
arrays or not. It turns out that it doesn’t. Instead, it keeps a single
array, shifts data through it, and yields the array. So it should be
fairly efficient as pure ruby code goes.

However, this may surprise users who modify the array in the iterator
block, so I’ll slip a warning into the docs.

If your rows are short, each_with_neighbors is not very efficient
because of the “warm up” and “cool down” to deal with the cases near the
edge.

/Johan Holmberg

#----------------------------------------------------------------------
require “enum/cluster”

lists = [
[1, 0, 0, 1, 0, 0, 5, 1],
[0, 0, 1, 0, 0, 1, 0, 2],
[1, 0, 1, 0, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1, 0],
]

facit = [0, 0, 1, 0, 0]
m = 2

for list in lists
count = 0
list.each_with_neighbors(m, 0) do |part|
count += 1 if part == facit
end
puts “# in #{list.inspect} = #{count}”
end
#----------------------------------------------------------------------

Nice!

If the requrements were different, and 1’s near the edge were not to be
counted because there are not enough zeros, you could replace

list.each_with_neighbors(m, 0) do |part|

with
list.each_cluster(m*2+1) do |part|

William_Djaja_Tjokro · 19 September 2002 20:58

Hi,

If the “shifts data through it” is accomplished using Ruby
Array#push() and Array#shift(), then probably the code is indeed very easy
to write, but probably not very efficient in the background. I like the
answer given in the first response, as I think it mimics how regular
expression works (using some “states”).

Regards,

Bill

···

===========================================================================
Joel VanderWerf vjoel@path.berkeley.edu wrote:

I had to go back and check whether each_with_neighbors generated lots of
arrays or not. It turns out that it doesn’t. Instead, it keeps a single
array, shifts data through it, and yields the array. So it should be
fairly efficient as pure ruby code goes.

However, this may surprise users who modify the array in the iterator
block, so I’ll slip a warning into the docs.

If your rows are short, each_with_neighbors is not very efficient
because of the “warm up” and “cool down” to deal with the cases near the
edge.

Joel_VanderWerf1 · 20 September 2002 01:46

No doubt the state machine approach is faster, and you’re also accurate
about shifting. I’m just relieved that each_with_neighbors doesn’t spew
garbage, which would make it even slower

William Djaja Tjokroaminata wrote:

···

Hi,

If the “shifts data through it” is accomplished using Ruby
Array#push() and Array#shift(), then probably the code is indeed very easy
to write, but probably not very efficient in the background. I like the
answer given in the first response, as I think it mimics how regular
expression works (using some “states”).

Regards,

Bill

Joel VanderWerf vjoel@path.berkeley.edu wrote:

I had to go back and check whether each_with_neighbors generated lots of
arrays or not. It turns out that it doesn’t. Instead, it keeps a single
array, shifts data through it, and yields the array. So it should be
fairly efficient as pure ruby code goes.

However, this may surprise users who modify the array in the iterator
block, so I’ll slip a warning into the docs.

If your rows are short, each_with_neighbors is not very efficient
because of the “warm up” and “cool down” to deal with the cases near the
edge.

–
Joel VanderWerf California PATH, UC Berkeley
mailto:vjoel@path.berkeley.edu Ph. (510) 231-9446
http://www.path.berkeley.edu FAX (510) 231-9512

Maggie_Xiao · 20 September 2002 03:13

Hi,

My problem is generate a set of test cases automatically in Ruby, and use
it to test the c program get the statement coverage which should be pass to
ruby, the ruby program supposed to use a search method to generate the
better test cases to test the c program again untill find the best test
cases which can fullly cover all of the statements… That means the input of
c program should be generated from ruby and the result of statement coverage
should be passed to ruby.

Seems I should use rubyinline. Is there any one has suggestion about it? Or
any other way to do it?
Thanks.

Bill_Kelly · 20 September 2002 06:29

Hi,

My problem is generate a set of test cases automatically in Ruby, and use
it to test the c program get the statement coverage which should be pass to
ruby, the ruby program supposed to use a search method to generate the
better test cases to test the c program again untill find the best test
cases which can fullly cover all of the statements… That means the input of
c program should be generated from ruby and the result of statement coverage
should be passed to ruby.

Perhaps you could post some sample “test” data to be fed to the ‘C’
program, and an example of what the output data from the coverage
analyzer looks like. (?)

Do you already have an idea what kind of search method you plan to
use to cause the output from the coverage analysis to be useful
in adapting the generated input data toward increased coverage?

It sounds like a difficult problem in any language.

Seems I should use rubyinline. Is there any one has suggestion about it? Or
any other way to do it?
Thanks.

Regards,

Bill