Ruby unit testing

Hi,

I have a Ruby script that I would like to unit test, but I've never
written any tests before. The documentation is not bad, but I'm
struggling to apply it to my own code.

Could someone give a couple of examples of how unit testing might be
applied to the following code, just to get me started?

require 'net/http'
require 'uri'
require 'rexml/document'
require 'rubygems'
require_gem 'activerecord'

include REXML
def fetch(uri_str, limit=10)
  fail 'http redirect too deep' if limit.zero?
  puts "Scraping: #{uri_str}"
  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess
    response
  when Net::HTTPRedirection
    fetch(response['location'], limit-1)
  else
    response.error!
  end
end

#Connect to the database
ActiveRecord::Base.establish_connection(
  :adapter => "mysql",
  :username => "root",
  :host => "localhost",
  :password => "",
  :database => "build"
)

puts "connected to build database"

class Result < ActiveRecord::Base
end

records = Result.find(:all).each do |records|
  response = fetch(records.build_url)

  scraped_data = response.body

  table_start_pos = scraped_data.index('<table class="index" width="100%">')
  table_end_pos = scraped_data.index('</table>') + 9
  height = table_end_pos - table_start_pos

  #pick out the table
  gathered_data = response.body[table_start_pos,height]

  #convert Data to REXML
  converted_data = REXML::Document.new gathered_data
  module_name = XPath.first(converted_data, "//td[@class='data']/a")
  module_name = module_name.text

  build_status_since = XPath.first(converted_data, "//td[2]/em")
  build_status_since = build_status_since.text
  build_status_since = build_status_since.slice(/(\d+):(\d+)/)

  last_failure = XPath.first(converted_data, "//tbody/tr/td[3]")
  last_failure = last_failure.text

  last_success = XPath.first(converted_data, "//tbody/tr/td[4]")
  last_success = last_success.text

  build_number = XPath.first(converted_data, "//tbody/tr/td[5]")
  build_number = build_number.text

···

-----------------

I know that I need to include:

require 'test/unit'
require '../name_of_code_tested'

after that though it gets less clear :(

any ideas?

cheers.

--
Posted via http://www.ruby-forum.com/.

Chris Gallagher wrote:

> Hi,
>
> I have a Ruby script that I would like to unit test, but I've never
> written any tests before. The documentation is not bad, but I'm
> struggling to apply it to my own code.
>
> Could someone give a couple of examples of how unit testing might be
> applied to the following code, just to get me started?
>
> [ sample code removed from this reply - dj ]
>
> I know that I need to include:
>
> require 'test/unit'
> require '../name_of_code_tested'
>
> after that though it gets less clear :(
>
> any ideas?
>
> cheers.

Perhaps ZenTest might be of interest? It will scan your code and sketch
out the outline of tests for each of your methods.

Some other resources include -
http://onestepback.org/articles/tdddemo/
http://wiki.rubygarden.org/Ruby/page/show/UsingTestUnit


I don't have time to look at your code now, but maybe someone will
pitch in, or I'll look in the morning. I also had trouble getting into
unit testing so I eventually bought a book and made a good study of
it.

I can recommend the book. Even though it covers C#, it's mostly generic info:

I love the pragmatic programmer books. Every time I buy one I'm
impressed. My all-time favourite one is "The Pragmatic Programmer:
Journeyman to Master". I think all programmers should read that book.

Les


On 11/21/06, Chris Gallagher <cgallagher@gmail.com> wrote:

> Hi,
>
> I have a Ruby script that I would like to unit test, but I've never
> written any tests before. The documentation is not bad, but I'm
> struggling to apply it to my own code.
>
> Could someone give a couple of examples of how unit testing might be
> applied to the following code, just to get me started?

thanks for the reply.

ZenTest sounds pretty cool, but there doesn't seem to be anything telling
people how to use it at the moment. How does one get ZenTest to scan
your code, or however it does what it does?

cheers.


Hi,

This is how I'd do it. I don't know the details, so your actual code
will be different. This code was not run at all, so there are probably
many typos/errors.
You'd certainly choose different names than the ones I've chosen.

First of all, if you want to test, it's good to split the code into
small chunks so that each does a little bit of work. That way you can
test them separately. I tend to put the code inside some class as
well... you know, the OOP ;)

While adding tests, you want to change the code as little as possible,
so you don't break it. Once you have the tests in place, you can safely
change stuff (=refactor).
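
To make the "small chunks" idea concrete, here is a minimal sketch of one such chunk with its own tests. The helper name extract_table and the sample page are made up for illustration; they are not part of the original script. The point is only that a method that takes a string and returns a string can be tested without a database or a network connection.

```ruby
require 'test/unit'

# Hypothetical helper: cuts the first <table class="index" ...> element
# out of a page. It needs no database and no network, so it is easy to test.
def extract_table(html)
  start = html.index('<table class="index"')
  return nil if start.nil?
  closing = '</table>'
  stop = html.index(closing, start)
  return nil if stop.nil?
  html[start...(stop + closing.length)]
end

# Invented sample page, just enough markup to exercise the helper.
SAMPLE_PAGE = '<html><body><p>intro</p>' \
              '<table class="index" width="100%"><tr><td>ok</td></tr></table>' \
              '<p>footer</p></body></html>'

class TestExtractTable < Test::Unit::TestCase
  def test_returns_only_the_table
    expected = '<table class="index" width="100%"><tr><td>ok</td></tr></table>'
    assert_equal expected, extract_table(SAMPLE_PAGE)
  end

  def test_returns_nil_when_there_is_no_table
    assert_nil extract_table('<html></html>')
  end
end
```

Running the file directly runs both tests, since requiring test/unit registers the test classes to run when the script exits.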

The final code could be a bit different from what is written in the
mail, as the code was written before the text, and I was too lazy to
fix it ;)

Now the steps to the code below:

1. minor corrections:
  - at the end of the code 'end' is missing
  - change |records| to |result| as you are iterating over Result(s)
  - change each to collect as you seem to collect results (but I may be wrong)

2. move the whole each/collect block inside Result class as:
   class Result < ...
      def self.get_records
         Result.find(:all).each do |result|
            ...
         end
      end
   end
(you can omit the Result. before find)
We have a function that returns something for all Results

what remains at the bottom is
   records = Result.get_records

3. The block doesn't return anything meaningful so:
   - add just after class Result < ...:
       BuildStatus = Struct.new(:module_name, :build_status_since,
                                :last_failure, :last_success, :build_number)
   - add BuildStatus.new(module_name, build_status_since, last_failure,
     last_success, build_number) at the end of the block (inside it)

4. Now we can add the first unit test at the very bottom:
comment out the records = ... line
and add:

if __FILE__ == $0
   require 'test/unit'

   class TestResult < Test::Unit::TestCase
      def test_get_records
          assert_equal [], Result.get_records
      end
   end
end

This test will fail if there are any results, but we have SOMETHING now.

5. Let's make the test succeed:
change the inside of the test:

      def test_get_records
          results = Result.find(:all)
          records = Result.get_records
          assert_equal results.size, records.size
          records.each do |record|
             assert_equal Result::BuildStatus, record.class
          end
      end

6. get_build_statuses seems a more appropriate name than get_records,
so rename it.

7. Now let's go through the block content:
   move it to a separate method:
   def get_build_status
       ...
   end

   and change Result.get_build_statuses to:

   def self.get_build_statuses
      find(:all).collect do |result|
          result.get_build_status
      end
   end

   make sure the tests are still working
   add a test for get_build_status (it should return a BuildStatus) and
   check its contents, e.g.
   def test_get_build_status
      result = Result.new
      result.build_url = 'whatever you use'
      build_status = result.get_build_status
      assert_equal '...', build_status.module_name
      ...
   end

8. move fetch inside the class, giving uri_str the default value
   build_url, so we can call result.fetch instead of
   fetch(result.build_url)
   add a test for fetch, make sure the tests are still working

9. separate table extraction
   make sure the tests are still working
   add test for it

10. separate table parsing
   make sure the tests are still working
   add test for it

11. refactor more
   make sure the tests are still working

12. when you're done, you can drop some tests if they are no longer needed.
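
A small aside on step 3, as a sketch with invented sample values: the reason a Struct is convenient here is that two structs with the same member values compare equal, so assert_equal can compare a whole expected BuildStatus against the returned one in a single assertion.

```ruby
# Same fields as the BuildStatus struct in step 3.
BuildStatus = Struct.new(:module_name, :build_status_since,
                         :last_failure, :last_success, :build_number)

# The values below are invented, only there to compare against each other.
a = BuildStatus.new('core', '12:30', 'never', 'today', '42')
b = BuildStatus.new('core', '12:30', 'never', 'today', '42')
c = BuildStatus.new('core', '12:31', 'never', 'today', '42')

puts a == b        # two structs with equal members compare equal
puts a == c        # one differing member makes them unequal
puts a.equal?(b)   # they are still two distinct objects
```

With a plain Object subclass you would have to define == yourself before assert_equal could be used this way.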

Finally one really nice link:
http://macromates.com/screencast/ruby_quiz_screencast.mov
This is James Edward Gray doing RubyQuiz along with testing.

----------8<-------------------------------------

require 'net/http'
require 'uri'
require 'rexml/document'
require 'rubygems'
require_gem 'activerecord'

include REXML

#Connect to the database
ActiveRecord::Base.establish_connection(
        :adapter => "mysql",
        :username => "root",
        :host => "localhost",
        :password => "",
        :database => "build"
)

puts "connected to build database"

class Result < ActiveRecord::Base
        # default value for uri_str to allow calling result.fetch
        # instead of fetch(result.build_url)
        def fetch(uri_str = build_url, limit=10)
                fail 'http redirect too deep' if limit.zero?
                puts "Scraping: #{uri_str}"
                response = Net::HTTP.get_response(URI.parse(uri_str))
                case response
                when Net::HTTPSuccess
                        response
                when Net::HTTPRedirection
                        fetch(response['location'], limit-1)
                else
                        response.error!
                end
        end

        def get_table(data)
                table_start_pos = data.index('<table class="index" width="100%">')
                table_end_pos = data.index('</table>') + 9
                height = table_end_pos - table_start_pos

                #pick out the table
                # response.body = data
                # you can do without height:
                # return data[table_start_pos...table_end_pos]
                return data[table_start_pos,height]
        end

        BuildStatus = Struct.new(:module_name, :build_status_since,
                                 :last_failure, :last_success, :build_number)

        def parse_table(table)
                ret = BuildStatus.new

                #convert Data to REXML
                # - added parentheses
                converted_data = REXML::Document.new(table)
                module_name = XPath.first(converted_data, "//td[@class='data']/a")
                ret.module_name = module_name.text

                build_status_since = XPath.first(converted_data, "//td[2]/em")
                build_status_since = build_status_since.text
                ret.build_status_since = build_status_since.slice(/(\d+):(\d+)/)

                last_failure = XPath.first(converted_data, "//tbody/tr/td[3]")
                ret.last_failure = last_failure.text

                last_success = XPath.first(converted_data, "//tbody/tr/td[4]")
                ret.last_success = last_success.text

                build_number = XPath.first(converted_data, "//tbody/tr/td[5]")
                ret.build_number = build_number.text

                ret
        end

        def get_build_status
                # - move fetch inside Result class
                response = fetch

                # you can put .body into fetch, and combine the following two into one:
                # table = get_table(fetch())
                # or even in one line:
                # return parse_table(get_table(fetch()))
                scraped_data = response.body
                table = get_table(scraped_data)

                build_status = parse_table(table)
                return build_status
        end

        def self.get_build_statuses
                find(:all).collect do |result|
                        result.get_build_status
                end
        end
end

if __FILE__ == $0
        require 'test/unit'

        class TestResult < Test::Unit::TestCase

                EXAMPLE_PAGE = <<-EOF
...
                EOF

                EXAMPLE_TABLE = <<-EOF
...<table...>
...
</table>
...
                EOF

                # fill in the blanks (one value per BuildStatus field)
                EXAMPLE_RESULT = Result::BuildStatus.new('','','','','')

                def setup
                        @r = Result.new
                end

                def test_fetch
                        @r.build_url = 'whatever' # fill in your example
                        # zero is 'too deep'
                        assert_raise(RuntimeError) { @r.fetch('', 0) }
                        # check for refactoring; compare bodies, since two
                        # response objects are never equal to each other
                        assert_equal @r.fetch(@r.build_url).body, @r.fetch.body
                end

                def test_get_table
                        assert_equal EXAMPLE_TABLE, @r.get_table(EXAMPLE_PAGE)
                end

                def test_parse_table
                        assert_equal EXAMPLE_RESULT, @r.parse_table(EXAMPLE_TABLE)
                end

                def test_get_result
                        @r.build_url = 'whatever' # fill in your example
                        assert_equal Result::BuildStatus.new('',...), @r.get_build_status
                end
        end
end
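
One caveat about test_fetch above: it goes out to the real network. A possible way around that, sketched here under the assumption that you are willing to add a third parameter to fetch, is to pass the HTTP call in as a callable so a test can hand in canned responses. The http_get parameter, the fake getters and the example.com URLs are all hypothetical; the response classes themselves come from net/http.

```ruby
require 'net/http'
require 'uri'

# Variant of fetch with one assumption added: the HTTP call is passed in
# as a callable (http_get). By default it performs a real request.
def fetch(uri_str, limit = 10, http_get = Net::HTTP.method(:get_response))
  fail 'http redirect too deep' if limit.zero?
  response = http_get.call(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess
    response
  when Net::HTTPRedirection
    fetch(response['location'], limit - 1, http_get)
  else
    response.error!
  end
end

# Canned responses built by hand; nothing below touches the network.
ok = Net::HTTPOK.new('1.1', '200', 'OK')
moved = Net::HTTPFound.new('1.1', '302', 'Found')
moved['location'] = 'http://example.com/new'

# Fake getter: the old URL redirects, everything else succeeds.
fake_get = lambda { |uri| uri.path == '/old' ? moved : ok }

# Fake getter that redirects forever, to exercise the depth limit.
loop_get = lambda { |_uri| moved }

followed = fetch('http://example.com/old', 10, fake_get)

too_deep = begin
  fetch('http://example.com/old', 3, loop_get)
  false
rescue RuntimeError => e
  e.message == 'http redirect too deep'
end

puts followed.equal?(ok)  # the redirect was followed to the canned OK response
puts too_deep             # the endless redirect hit the limit
```

Inside a Test::Unit test case the last two checks would become assert_same and assert_raise(RuntimeError).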

Here's a good article:
http://www.linuxjournal.com/article/7776

Les


On 11/22/06, Chris Gallagher <cgallagher@gmail.com> wrote:

> thanks for the reply.
>
> ZenTest sounds pretty cool, but there doesn't seem to be anything telling
> people how to use it at the moment. How does one get ZenTest to scan
> your code, or however it does what it does?

--
Man's unfailing capacity to believe what he prefers to be true rather
than what the evidence shows to be likely and possible has always
astounded me. We long for a caring Universe which will save us from
our childish mistakes, and in the face of mountains of evidence to the
contrary we will pin all our hopes on the slimmest of doubts. God has
not been proven not to exist, therefore he must exist.

- Prokhor Zakharov

> Hi,
>
> This is how I'd do it. I don't know the details, so your actual code
> will be different. This code was not run at all, so there are probably
> many typos/errors.
> You'd certainly choose different names than the ones I've chosen.
>
> First of all, if you want to test, it's good to split the code into
> small chunks so that each does a little bit of work. That way you can
> test them separately. I tend to put the code inside some class as
> well... you know, the OOP ;)
>
> While adding tests, you want to change the code as little as possible,
> so you don't break it. Once you have the tests in place, you can safely
> change stuff (=refactor).

Two ideas to think about here:
* First, refactoring the code before you have tests is never a super
good idea. Writing unit tests that you know you're going to need to
refactor/replace/throw away isn't either. BDD and RSpec may be a
better way to go (any BDD folks want to pitch in with some ideas
here?).
* Second, once you've got your suite of tests written (either unit
tests or BDD style specifications), you should probably run your
tests under rcov. C0 coverage (like rcov measures) won't ensure
that you have all the tests you need, but it will help ensure that
you're not leaving chunks of code untested.

(Nice walk through below ... we can sure use more documents like
this.)


On 11/21/06, Jan Svitok <jan.svitok@gmail.com> wrote:

[ step-by-step walkthrough, quoted in full from the post above, removed from this reply ]

Finally one really nice link:
http://macromates.com/screencast/ruby_quiz_screencast.mov
This is James Edward Gray doing RubyQuiz along with testing.

[ refactored code and tests, quoted in full from the post above, removed from this reply ]

--
thanks,
-pate
-------------------------

Jan Svitok wrote:

[ walkthrough, quoted from the post above, removed from this reply ]

Thanks for the long post, Jan. You've given me a good insight into the
way that my code should be tested. I think it's hard to see exactly how
it works when you look at tests written for other people's code, but
when it's your own, it makes a bit more sense.

I was dreading the idea that I'd have to refactor it. It's not something
I've done before, as I'm very new to Ruby at the moment. In my original
post I actually omitted a lot of the code, which might have been vital
to the way that you refactored it, and now I'm a bit lost.

Here's the full code (only another few lines extra):

require 'net/http'
require 'uri'
require 'rexml/document'
require 'rubygems'
require_gem 'activerecord'

include REXML
def fetch(uri_str, limit=10)
  fail 'http redirect too deep' if limit.zero?
  puts "Scraping: #{uri_str}"
  response = Net::HTTP.get_response(URI.parse(uri_str))
  case response
  when Net::HTTPSuccess
    response
  when Net::HTTPRedirection
    fetch(response['location'], limit-1)
  else
    response.error!
  end
end

#Connect to the database
ActiveRecord::Base.establish_connection(
  :adapter => "mysql",
  :username => "root",
  :host => "localhost",
  :password => "",
  :database => "build"
)

puts "connected to build database"

class Result < ActiveRecord::Base
end

records = Result.find(:all).each do |records|
  response = fetch(records.build_url)

  scraped_data = response.body

  table_start_pos = scraped_data.index('<table class="index" width="100%">')
  table_end_pos = scraped_data.index('</table>') + 9
  height = table_end_pos - table_start_pos

  #pick out the table
  gathered_data = response.body[table_start_pos,height]

  #convert Data to REXML
  converted_data = REXML::Document.new gathered_data
  module_name = XPath.first(converted_data, "//td[@class='data']/a")
  module_name = module_name.text

  build_status_since = XPath.first(converted_data, "//td[2]/em")
  build_status_since = build_status_since.text
  build_status_since = build_status_since.slice(/(\d+):(\d+)/)

  last_failure = XPath.first(converted_data, "//tbody/tr/td[3]")
  last_failure = last_failure.text

  last_success = XPath.first(converted_data, "//tbody/tr/td[4]")
  last_success = last_success.text

  build_number = XPath.first(converted_data, "//tbody/tr/td[5]")
  build_number = build_number.text

  #modify current entry for the build.

  Result.find(:all, :conditions => ["build_url = ?", records.build_url]).each do |b|
    b.build_status_since = build_status_since
    b.last_failure = last_failure
    b.last_success = last_success
    b.build_number = build_number
    b.save
    puts "#{module_name} successfully scraped."
  end
end

I think I can nail the testing now but might need a hand with
refactoring it. It was silly of me to leave that out of the original
post, apologies.


> thanks for the reply.
>
> Zentest sounds pretty cool but there doesnt seem to be anything telling
> people how to use it at the moment. how does one get zentest to scan
> your code or how ever it does what it does?

> Here's a good article:
> http://www.linuxjournal.com/article/7776

Well, that one's a bit dated (it's also shipped with ZenTest itself). A
newer, but still dated article is at:

http://www-128.ibm.com/developerworks/edu/os-dw-os-ruby1-i.html

One of these days, maybe I'll write a newer version covering the
cool stuff like autotest and red-green that's been added lately.

The slides from my RubyConf*MI talk touch on Test::Unit and ZenTest,
but they don't really tell the whole story. I keep hearing rumblings that
video from the conference is going to be available, and then you can
hear the incoherent mutterings that accompany the slides.


On 11/21/06, Leslie Viljoen <leslieviljoen@gmail.com> wrote:

On 11/22/06, Chris Gallagher <cgallagher@gmail.com> wrote:

Les


--
thanks,
-pate
-------------------------

Never mind. Here's what you can do: I see you iterate over Results
twice. I suppose it's enough to reuse the original Result (result)
from the outer loop.
So instead of get_build_status you can name it update_build_status,
replace ret. with self., remove the last line, and remove the
BuildStatus struct altogether.

Then you can create a method update_build_status! that calls
update_build_status and save.
Or you can put the save into update_build_status or into the class
method (self.get_build_statuses, which might be renamed to
update_build_statuses; the each then stays as it is).

Whatever you're trying to achieve, state that in your tests. I mean,
if you want a particular Result to have particular values,
check them.
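
A rough sketch of the update_build_status / update_build_status! split, with tests that state the expected values. To keep it runnable without a database, FakeResult stands in for the ActiveRecord model and its save only sets a flag; the class, the parsed hash and the sample values are all hypothetical, and only the attribute names come from your script.

```ruby
require 'test/unit'

# Stand-in for the ActiveRecord model so this sketch runs without a database.
class FakeResult
  attr_accessor :build_status_since, :last_failure, :last_success, :build_number
  attr_reader :saved

  # Copies already-parsed values onto self instead of returning a struct.
  def update_build_status(parsed)
    self.build_status_since = parsed[:build_status_since]
    self.last_failure = parsed[:last_failure]
    self.last_success = parsed[:last_success]
    self.build_number = parsed[:build_number]
    self
  end

  # A real model would write to the database here; the fake records the call.
  def save
    @saved = true
  end

  # Bang variant: update, then persist.
  def update_build_status!(parsed)
    update_build_status(parsed)
    save
  end
end

# Invented sample values, only there to have something to assert against.
PARSED = {
  :build_status_since => '12:30',
  :last_failure => 'never',
  :last_success => 'today',
  :build_number => '42'
}

class TestFakeResult < Test::Unit::TestCase
  def test_update_sets_the_expected_values
    result = FakeResult.new.update_build_status(PARSED)
    assert_equal '12:30', result.build_status_since
    assert_equal '42', result.build_number
    assert_nil result.saved
  end

  def test_bang_variant_also_saves
    result = FakeResult.new
    result.update_build_status!(PARSED)
    assert_equal true, result.saved
  end
end
```

In the real class the parsed values would come from the table parsing step rather than from a hash passed in.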

Another problem with tests is how to live with external
dependencies: this time a) objects in the DB and b) the actual pages
that get scraped. I tend to separate that stuff so I can provide my
own fake values. You can see it in how I did get_table: it doesn't
call fetch at all. The same goes for parse_table. At the end, you wire
them together with all the dependencies.
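
A minimal sketch of that wiring, assuming the page source is handed in as anything that responds to call(url). The method names, the fake table and the example.invalid URL are invented for illustration; the XPath is the one your script uses for the build number.

```ruby
require 'rexml/document'

# Parses a table fragment; knows nothing about HTTP or the database.
def parse_build_number(table_xml)
  doc = REXML::Document.new(table_xml)
  cell = REXML::XPath.first(doc, '//tbody/tr/td[5]')
  cell && cell.text
end

# Wiring: page_source is any object that responds to call(url) and
# returns the table markup as a string.
def build_number_for(url, page_source)
  parse_build_number(page_source.call(url))
end

# Fake table with five cells; the fifth one holds the build number.
FAKE_TABLE = '<table class="index"><tbody><tr>' \
             '<td>a</td><td>b</td><td>c</td><td>d</td><td>17</td>' \
             '</tr></tbody></table>'

# Fake source used in place of the real fetch; it ignores the URL.
fake_source = lambda { |_url| FAKE_TABLE }

number = build_number_for('http://example.invalid/build', fake_source)
puts number
```

In production code the lambda would be replaced by something that calls fetch and get_table; the parsing code does not change.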

Somehow just massaging the code to make it more testable seems to
improve its quality in terms of readability, density, etc. Several
times when I changed code to be able to write tests for it, I was just
watching how the code fell away ;)


On 11/22/06, Chris Gallagher <cgallagher@gmail.com> wrote:

[ rest of the quoted post, including the full code listing, removed from this reply ]