Test::Unit -- multiple errors in test method?

Hi !

I have been writing some unit tests with Test::Unit.

I’ve noted that when an assertion fails in a test method, the
remaining assertions in the same test method aren’t even executed.
Here is an example:

  require 'test/unit'

  class TC_example < Test::Unit::TestCase

    def test_a
      assert_equal 2,  1 + 1
      assert_equal 5,  2 + 2         # fails
      assert_equal 7,  2 + 5         # never executed !!!
      assert_equal 8,  3 + 4         # never executed !!!
    end

  end

When running this I get:

  C:\> ruby TC_example.rb
  Loaded suite TC_example
  Started
  F
  Finished in 0.01 seconds.

1) Failure!!!
  test_a(TC_example) [TC_example1.rb:8]:
  <5> expected but was
  <4>

  1 tests, 2 assertions, 1 failures, 0 errors

The two last assertions aren’t even executed.
Is this a bug or an intentional “feature” ?

I’ve looked at the way JUnit in Java does the same thing,
and there I got all errors reported. That seems much more useful
to me.

With the current Test::Unit behaviour in Ruby, I can’t know if
an error “hides” a number of other errors until I’ve fixed the first
error (it feels like having a C-compiler that only reports the first
compilation error …).

I have tested this with Ruby 1.8, and also with a more recent
CVS-snapshot.

Am I missing something obvious,
or should this be considered a bug in Test::Unit ?

/Johan Holmberg

Hi !

I have been writing some unit tests with Test::Unit.
The two last assertions aren’t even executed.
Is this a bug or an intentional “feature” ?

“feature”

I’ve looked at the way JUnit in Java does the same thing,
and there I got all errors reported. That seems much more useful
to me.

You’d better look more closely. JUnit stops execution of the test at the
first failed assertion. Of course, it doesn’t report how many assertions
it’s making, so maybe it’s less obvious to you.

With the current Test::Unit behaviour in Ruby, I can’t know if
an error “hides” a number of other errors until I’ve fixed the first
error (it feels like having a C-compiler that only report the first
compilation error …).

If you’ve got a number of assertions in one test case, often (though not
always) they are “ascendent”. That is, the second assertion can only really
be checked if the first one is true. For example, some people might write

assert_not_nil(a)
assert_equal("foo", a.name)
···

On Monday 15 September 2003 08:08, Johan Holmberg wrote:

/Johan Holmberg


David Corbin dcorbin@machturtle.com

Hi –

Hi !

I have been writing some unit tests with Test::Unit.

I’ve noted that when an assertion fails in a test method, the
remaining assertions in the same test method aren’t even executed.

The two last assertions aren’t even executed.
Is this a bug or an intentional “feature” ?

Considering the state of the test/unit code, and the amount of use it
gets, I think it’s a pretty safe bet that something like this was
implemented consciously and didn’t just slip in unnoticed… :slight_smile:

A failed assertion raises Test::Unit::AssertionFailedError, so you can
wrap your tests in rescue clauses if you want to put them all in one
method and not stop execution. But see David Corbin’s response too;
this probably isn’t a good design, and it’s actually more work and
clutter than putting each test in a different method or grouping them
to cascade as David showed.
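As a rough illustration of that rescue-based idea, here is a minimal sketch
using a plain-Ruby stand-in for the assertion (the real raiser is
Test::Unit::AssertionFailedError; the AssertionFailed class and the
collect_failures helper below are invented names for this sketch, not part
of Test::Unit):

```ruby
# Stand-in for Test::Unit::AssertionFailedError.
class AssertionFailed < StandardError; end

# Stand-in for Test::Unit's assert_equal.
def assert_equal(expected, actual)
  unless expected == actual
    raise AssertionFailed, "<#{expected}> expected but was <#{actual}>"
  end
end

# Run every check, recording failures instead of stopping at the first one.
def collect_failures(pairs)
  failures = []
  pairs.each do |expected, actual|
    begin
      assert_equal(expected, actual)
    rescue AssertionFailed => e
      failures << e.message   # remember the failure and keep going
    end
  end
  failures
end

failures = collect_failures([[2, 1 + 1], [5, 2 + 2], [7, 2 + 5], [8, 3 + 4]])
failures.each { |f| puts f }  # both failing checks are reported, not just the first
```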

David

···

On Mon, 15 Sep 2003, Johan Holmberg wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

“Simon Strandgaard” qj5nd7l02@sneakemail.com wrote in the newsgroup message
news:pan.2003.09.15.13.05.42.438830@sneakemail.com

I have been writing some unit tests with Test::Unit.

I’ve noted that when an assertion fails in a test method, the
remaining assertions in the same test method aren’t even executed.
Here is an example:

It is considered good practice to have only one assertion per testcase;
this way you won’t have any assertions which never get executed.

Did you mean to say “one assertion per test” (i.e. test method)? One
assertion per test case (i.e. class) would be a bit lavish…

But even if so: JUnit encourages the convention of one test per method,
which typically leads to multiple assertions per test.

Even though I don’t always follow that convention :slight_smile:

Same here.

Regards

robert
···

On Mon, 15 Sep 2003 22:08:22 +0900, Johan Holmberg wrote:

I’ve looked at the way JUnit in Java does the same thing,
and there I got all errors reported. That seems much more useful
to me.

Not in any JUnit I’ve ever used.

public void testTest() throws Exception {
    assertEquals(2, 1 + 1);
    assertEquals(5, 2 + 2);
    assertEquals(7, 2 + 5);
    assertEquals(8, 3 + 4);
}

C:\j2sdk1.4.2\bin\javaw.exe -classpath …

.F
Time: 0
There was 1 failure:
1)
testTest(com…)
junit.framework.AssertionFailedError: expected:<5> but was:<4>
	at com. … .java:51)

···


I’ve looked at the way JUnit in Java does the same thing,
and there I got all errors reported. That seems much more useful
to me.

You’d better look more closely. JUnit stops execution of the test at the
first failed assertion. Of course, it doesn’t report how many assertions
it’s making, so maybe it’s less obvious to you.

Yes, I missed that.

If you’ve got a number of assertions in one test case, often (though
not always) they are “ascendent”. That is, the second assertion can
only really be checked if the first one is true.

Yes, in case this was a feature I suspected the thinking was like
that. But how about these cases ?

  • the Rubicon tests for Ruby. The test methods test a number of
    things in the same method. As soon as any test fails, the whole
    test suite quickly becomes less useful (maybe one could say
    that there should not be any errors, but that has always been
    the case when I have tried to run Rubicon …)

  • I have tried to write some table driven tests. How should I do
    this ? Currently I tried something in the following style:


    @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 4], # should fail
    ]

    def test_b
      for facit, aa, bb in @@data
        assert_equal facit, aa + bb
      end
    end

    Here I have several tests that are not “increasing”.
    (just before I post this I saw David Alan Black’s answer mentioning
    Test::Unit::AssertionFailedError. Should I catch that ? )

Thanks for the replies,
/Johan Holmberg

···

On Mon, 15 Sep 2003, David Corbin wrote:

Not in the one I used either.
I simply “saw what I wanted to see” when I tried it :wink:
(I’ve not used JUnit before)

/Johan Holmberg

···

On Mon, 15 Sep 2003, Michael Campbell wrote:

I’ve looked at the way JUnit in Java does the same thing,
and there I got all errors reported. That seems much more useful
to me.

Not in any JUnit I’ve ever used.

Hi –

  • I have tried to write some table driven tests. How should I do
    this ? Currently I tried something in the following style:


    @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 4], # should fail
    ]

    def test_b
      for facit, aa, bb in @@data
        assert_equal facit, aa + bb
      end
    end

    Here I have several tests that are not “increasing”.
    (just before I post this I saw David Alan Black’s answer mentioning
    Test::Unit::AssertionFailedError. Should I catch that ? )

You could turn the whole thing on its head, so to speak:

require 'test/unit'

class TestMe < Test::Unit::TestCase

  def setup
    @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 4], # should fail
    ]
  end

  def test_addition
    wrong = @@data.find_all {|d| d[0] != d[1] + d[2]}
    assert(wrong.empty?, "These failed: #{wrong.inspect}")
  end

end

(give or take some granularity in the error-reporting, etc.)

David

···

On Mon, 15 Sep 2003, Johan Holmberg wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

- I have tried to write some table driven tests. How should I do
  this ? Currently I tried something in the following style:

Something like this ?

svg% cat b.rb
#!/usr/bin/ruby
require 'test/unit'

class TC_example < Test::Unit::TestCase

   @@data = [
      [2, 1, 1, true],
      [5, 2, 2, false], # should fail
      [7, 2, 5, true],
      [8, 3, nil, TypeError], # should fail
   ]

   def test_a
      for facit, aa, bb, res in @@data
         case res
         when true
            assert_equal(facit, aa + bb)
         when false
            assert_not_same(facit, aa + bb)
         else
            assert_raises(res) { aa + bb }
         end
      end
   end
end

svg%

svg% b.rb
Loaded suite ./b
Started
.
Finished in 0.001909 seconds.

1 tests, 4 assertions, 0 failures, 0 errors
svg%

Guy Decoux

Yes, in case this was a feature I suspected the thinking was
like that.

As David and David pointed out, it is, indeed, a feature. Sorry it’s bugging you. :slight_smile:

But how about these cases ?

  • the Rubicon tests for Ruby. The test methods test a number of
    things in the same method. As soon as any test fails, the whole
    test suite quickly becomes less useful (maybe one could say
    that there should not be any errors, but that has always been
    the case when I have tried to run Rubicon …)

I don’t understand why the tests become useless to you. If we have a test
for strip:

def test_strip
  assert_equal("name", " name ".strip)
  assert_equal("name", "\tname\n".strip)
end

…and the first assertion fails, we immediately know that there’s a bug in
strip. Why is the test useless if the second assertion isn’t executed?

  • I have tried to write some table driven tests. How should I do
    this ? Currently I tried something in the following style:


    @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 4], # should fail
    ]

    def test_b
      for facit, aa, bb in @@data
        assert_equal facit, aa + bb
      end
    end

    Here I have several tests that are not “increasing”.

Perhaps a better word is “related”. All of those tests are related, aren’t
they? If the first one fails, there’s a high probability that the second
one, and the third one, etc., will fail, too. So the testing framework only
worries you with the first one.

Perhaps the key is to realize that it is the method that is the test, not
the assertions. Thus it’s just like a short-circuited boolean expression:
we don’t bother evaluating more assertions once we know the test has
failed.

Also, I think one of the primary things unit testing does for me is force me
to throw my assumptions out the window. If we let failures cascade it would
be tempting to go through and try to fix them all before running the test
again, but that would be a large assumption. Better to not assume anything
about the rest of the test and just run it again once we’ve fixed the
current problem.

There is one case where I do believe it is better to keep evaluating
assertions even if one fails: acceptance tests. Because of the long-running
nature of acceptance tests, and the high overhead they often incur, we need
to squeeze as much information out of a run as possible. I don’t think it’s
an ideal situation - if my acceptance tests ran as quickly as my unit tests
I’d want to stop a test as soon as an assertion failed, just as when unit
testing. But reality bites, just as it does in C compilers - I’d rather have
the compiler just tell me about the first error, but compiling can be an
expensive process, so I end up wading through a bunch of errors I don’t
really care about to find the one I do. C’est la vie.

So, both for acceptance testing and for those who feel that they need it, I
plan to add the ability to keep running when an assertion fails. I’ve
actually planned on it for a while, but haven’t gotten around to it yet. For
the time being, you can use some variation of what Guy posted in
[ruby-talk:82066].

HTH,

Nathaniel

<:((><

···

Johan Holmberg [mailto:holmberg@iar.se] wrote:

No, not really.
When I wrote “should fail” I just meant that I had written a line in
the table that as it stands gives an error.

A more accurate example would have been if I wrote a @@data table
that contained correct addition-results, and had written a faulty
addition-function that I then tested. My example maybe was a bit
contrived just to keep it short.

Ideally I would have liked to have some way to indicate to the
Test::Unit framework if the assertions in a certain test method were
“ascendent” (using the words from David Corbin’s previous answer)
or not.

If they were not “ascendent”, then all assertions should have
been executed even if some of them “failed”.
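A rough sketch of what such a “non-ascendent” mode could look like for a
table-driven test. The CheckFailed class and run_independently helper are
invented names for illustration only; Test::Unit provides no such switch:

```ruby
# Stand-in for an assertion failure.
class CheckFailed < StandardError; end

def check_equal(expected, actual)
  unless expected == actual
    raise CheckFailed, "<#{expected}> expected but was <#{actual}>"
  end
end

# Run each row of the table independently: a failing row is recorded
# and the remaining rows are still executed.
def run_independently(rows)
  failures = []
  rows.each_with_index do |row, i|
    begin
      yield(*row)
    rescue CheckFailed => e
      failures << "row #{i}: #{e.message}"
    end
  end
  failures
end

data = [
  [2, 1, 1],
  [5, 2, 2], # fails
  [7, 2, 5],
  [8, 3, 4], # fails
]
failures = run_independently(data) { |facit, aa, bb| check_equal(facit, aa + bb) }
failures.each { |f| puts f }  # every failing row is reported
```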

/Johan Holmberg

···

On Mon, 15 Sep 2003, ts wrote:

  • I have tried to write some table driven tests. How should I do
    this ? Currently I tried something in the following style:

Something like this ?

svg% cat b.rb
#!/usr/bin/ruby
require 'test/unit'

class TC_example < Test::Unit::TestCase

   @@data = [
      [2, 1, 1, true],
      [5, 2, 2, false], # should fail
      [7, 2, 5, true],
      [8, 3, nil, TypeError], # should fail
   ]
[…]

I don’t understand why the tests become useless to you. If we have a test
for strip:

def test_strip
  assert_equal("name", " name ".strip)
  assert_equal("name", "\tname\n".strip)
end

…and the first assertion fails, we immediately know that there’s a bug in
strip. Why is the test useless if the second assertion isn’t executed?

I didn’t mean completely useless.
(but see below for an example)

Perhaps a better word is “related”. All of those tests are related, aren’t
they? If the first one fails, there’s a high probability that the second
one, and the third one, etc., will fail, too. So the testing framework only
worries you with the first one.

I felt that I got a too narrow “window” of what was wrong.
A too “boolean” (yes/no) answer.

Let me give an example:

A while ago I began to think that Ruby’s File.dirname/File.basename was too UNIX-centric and misbehaved on Windows. One of the things I tried to do was to run the relevant tests in Rubicon. Since these tests also were too UNIX-centric, I started adding more tests in the same style as the ones already there.

So I added a number of more assertions about “basename”.
Before there were about 25 of these (in one “test method”) and I
added about 30 more Windows-specific assertions.

When I started to run the tests I got errors.
So yes, I know that there are errors in File.basename (at least
according to my assumptions as codified in my new assertions).

But it was hard to understand how much the current Ruby
implementation deviated from what I thought of as the “right”
behaviour.

After a while I gave up running the tests under Rubicon, and moved
my assertion code to a small “standalone-program” that just tested
basename and reported all errors it found.

Then when I saw all errors at once (a wider “window”), I began to
understand more about the nature of the problem and could start
thinking about how to fix it.

I understand that the current Test::Unit framework is intentionally
designed as it is, and that it normally is a good thing to abort a
test method when the first assertion fails.

I just don’t know yet how “unusual” my scenario above is, and if
there are other ways to cope with the situation by using features
already present in Test::Unit.

So, both for acceptance testing and for those who feel that they need it, I
plan to add the ability to keep running when an assertion fails. I’ve
actually planned on it for a while, but haven’t gotten around to it yet. For
the time being, you can use some variation of what Guy posted in
[ruby-talk:82066].

That sounds promising.
I’ll try Guy’s / David Alan Black’s approach until then.

/Johan Holmberg

···

On Mon, 15 Sep 2003, Nathaniel Talbott wrote:

No, not really.
When I wrote "should fail" I just meant that I had written a line in
the table that *as it stands* gives an error.

Then this ?

svg% cat b.rb
#!/usr/bin/ruby
require 'test/unit'

class TC_example < Test::Unit::TestCase

   @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 7], # should fail
   ]

   def test_a
      error = []
      for facit, aa, bb in @@data
         begin
            assert_equal(facit, aa + bb)
         rescue Exception
            error << $!
         end
      end
      error.each {|e| add_failure(e.message, e.backtrace) }
   end
end

svg%

svg% b.rb
Loaded suite ./b
Started
FF
Finished in 0.002194 seconds.

  1) Failure!!!
test_a(TC_example) [./b.rb:17]:
<5> expected but was
<4>

  2) Failure!!!
test_a(TC_example) [./b.rb:17]:
<8> expected but was
<10>

1 tests, 4 assertions, 2 failures, 0 errors
svg%

Guy Decoux

No, not really.
When I wrote “should fail” I just meant that I had written a line in
the table that as it stands gives an error.

Then this ?

Yes !
Thanks, for the example code.

It seems to solve my problems in the case where I want to use
table-driven tests, and where each test (i.e. line in the table) is
independent of the others.

/Johan Holmberg

···

On Mon, 15 Sep 2003, ts wrote:

svg% cat b.rb
#!/usr/bin/ruby
require 'test/unit'

class TC_example < Test::Unit::TestCase

   @@data = [
      [2, 1, 1],
      [5, 2, 2], # should fail
      [7, 2, 5],
      [8, 3, 7], # should fail
   ]

   def test_a
      error = []
      for facit, aa, bb in @@data
         begin
            assert_equal(facit, aa + bb)
         rescue Exception
            error << $!
         end
      end
      error.each {|e| add_failure(e.message, e.backtrace) }
   end
end

svg%

svg% b.rb
Loaded suite ./b
Started
FF
Finished in 0.002194 seconds.

  1) Failure!!!
test_a(TC_example) [./b.rb:17]:
<5> expected but was
<4>

  2) Failure!!!
test_a(TC_example) [./b.rb:17]:
<8> expected but was
<10>

1 tests, 4 assertions, 2 failures, 0 errors
svg%

Guy Decoux