Test / unit question: facility to mark some tests as "missing" or "incomplete"

I often find myself with some unit tests that run, and several more tests
that I know I really should write but simply can’t right now. Being
hopelessly forgetful and never reading my own comments, I would like some
way to enter such missing tests as stubs, so that test/unit will give a
“missing” or “incomplete” test message (but not count it as a test or
assertion failure).

Is this doable? Does it make sense to do something like this?

Thanks!

Its Me said:

I often find myself with some unit tests that run, and several more tests
that I know I really should write but simply can’t right now. Being
hopelessly forgetful and never reading my own comments, I would like some
way to enter such missing tests as stubs, so that test/unit will give a
“missing” or “incomplete” test message (but not count it as a test or
assertion failure).

Is this doable? Does it make sense to do something like this?

Sometimes when I wish to disable a test for a bit, I will change the name
of a test from “test_xyz” to “xtest_xyz”. A simple grep through the
project identifies all such tests for later fixup.
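To make that concrete, a minimal sketch (the test case below is made up, not
from any real project). Since test/unit only collects methods whose names
begin with “test”, the x-prefixed method simply drops out of the run:

require 'test/unit'

class ArithmeticTest < Test::Unit::TestCase
  def test_addition                            # still collected and run
    assert_equal 4, 2 + 2
  end

  def xtest_division_by_zero                   # renamed: no longer starts with "test"
    assert_raise(ZeroDivisionError) { 1 / 0 }  # `grep xtest_` finds it for later fixup
  end
end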

···


– Jim Weirich jim@weirichhouse.org http://onestepback.org

“Beware of bugs in the above code; I have only proved it correct,
not tried it.” – Donald Knuth (in a memo to Peter van Emde Boas)

I also use the ‘x’ prefix to identify disabled testcases :wink:

If there are many testcases I need to disable, then I use ‘undef test_xyz’,
so that the undefs are grouped and I can easily see what is disabled.
They can all be re-enabled together by wrapping the undefs in a
=begin/=end multiline comment:

=begin
undef test_zap2
undef test_zap3 # TODO: must fix
undef test_zap4
undef test_zap5
=end

Other times I move testcases into modules and then do an
‘include ExerciseSomething’… then it’s very easy to enable or disable
a big group of testcases:

class TestSomething < Common::TestCase
  module ExerciseAlpha
    def test_alpha1; end
    def test_alpha2; end
    def test_alpha3; end
  end
  module ExerciseBeta
    def test_beta1; end
    def test_beta2; end
    def test_beta3; end
  end
  include ExerciseAlpha
  include ExerciseBeta   # comment out an include to disable the whole group
end

···

On Tue, 13 Apr 2004 03:02:29 +0900, Jim Weirich wrote:

Its Me said:

I often find myself with some unit tests that run, and several more tests
that I know I really should write but simply can’t right now. Being
hopelessly forgetful and never reading my own comments, I would like some
way to enter such missing tests as stubs, so that test/unit will give a
“missing” or “incomplete” test message (but not count it as a test or
assertion failure).

Is this doable? Does it make sense to do something like this?

Sometimes when I wish to disable a test for a bit, I will change the name
of a test from “test_xyz” to “xtest_xyz”. A simple grep through the
project identifies all such tests for later fixup.


Simon Strandgaard

Jim Weirich wrote:

Sometimes when I wish to disable a test for a bit, I will change the name
of a test from “test_xyz” to “xtest_xyz”.

Small world: That is /exactly/ the same method we use.

Regards,

···


Bil Kleb, Hampton, Virginia

[snip]

Maybe test/unit could recognise xtest methods and say

124 tests, 300 assertions, 0 failures, 0 errors, 15 disabled tests

to help us when we forget a test has been x’ed.

(I would have said this proposed behaviour is “non-obvious, and people
wouldn’t know to use ‘x’ unless they’d read the documentation”, but given
the number of people who use ‘x’ anyway, they might just stumble across it.)

There are also times when tests cannot be completed for some reason.
For instance, in Rubicon there are some Windows tests which obviously
cannot be run on Unix.

def test_something
  skip('reason for skipping') if is_win98?

end

Besides that, there are those tests which are unimplemented.

def test_something
  todo('what needs to be done')
end

They all are in the family of tests which cannot be executed for
some reason.

124 tests, 300 assertions, 0 failures, 0 errors,
26 not run (15 disabled, 8 skipping, 3 unimplemented).
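None of this exists in test/unit today, but as a rough sketch of the idea
(TestSkipped, TestTodo and run_counted are invented names), skip and todo
could raise sentinel exceptions that the runner tallies separately from
failures and errors:

class TestSkipped < StandardError; end
class TestTodo    < StandardError; end

def skip(reason); raise TestSkipped, reason; end
def todo(note);   raise TestTodo,    note;   end

# Stand-in for the runner: execute each test body and count the outcome.
def run_counted(tests)
  tally = Hash.new(0)
  tests.each do |_name, body|
    begin
      body.call
      tally[:passed] += 1
    rescue TestSkipped then tally[:skipped] += 1
    rescue TestTodo    then tally[:unimplemented] += 1
    end
  end
  tally
end

tests = {
  'test_addition'  => lambda { raise 'broken' unless 1 + 1 == 2 },
  'test_registry'  => lambda { skip('only meaningful on win98') },
  'test_something' => lambda { todo('not written yet') }
}
p run_counted(tests)   # e.g. {:passed=>1, :skipped=>1, :unimplemented=>1}

The real thing would of course hang off TestCase and feed those counts into
the summary line.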

···

On Thu, 15 Apr 2004 12:10:54 +0000, Tim Sutherland wrote:


Simon Strandgaard

“Simon Strandgaard” neoneye@adslhome.dk wrote in message

def test_something
  skip('reason for skipping') if is_win98?

end

def test_something
  todo('what needs to be done')
end

They all are in the family of tests which cannot be executed for
some reason.

124 tests, 300 assertions, 0 failures, 0 errors,
26 not run (15 disabled, 8 skipping, 3 unimplemented).

Hmmm … different categories of test results reported. Better than a
convention such as prefix “x”.

test/unit has Failures (assert) and Errors (throw or raise), but not
Warnings.

Perhaps test/unit could add some hooks so (something like) the following
would work.

class MyTest < Test::Unit::TestCase
  warnings :todo, :skip    # :todo, :skip become warning categories
  def test_A
    warn(:todo, "excuse")  # reported in warn/todo category
  end
  def test_B
    warn("reason")         # warning in warn/other category
  end
end

test/unit could bundle some standard warning categories e.g. todo, skip,
disabled

Internally perhaps test/unit could define a new class to raise:

class TestWarning < StandardError; end

And rescue these separately from its current AssertionFailedError,
reporting them as warnings.

124 tests, 300 assertions, 0 failures, 0 errors,
26 warnings (15 disabled, 8 skip, 2 todo, 1 other)
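A rough sketch of how the warn side could hang together (none of this is
existing test/unit API; the helper is called warn_test here only to avoid
shadowing Kernel#warn):

class TestWarning < StandardError
  attr_reader :category
  def initialize(category, message)
    @category = category
    super(message)
  end
end

def warn_test(category, message = nil)
  # warn_test("reason") form: no explicit category, file it under :other
  category, message = :other, category if message.nil?
  raise TestWarning.new(category, message)
end

# The runner would rescue TestWarning separately from AssertionFailedError
# and group the results by category for the summary line:
warnings = Hash.new(0)
[ lambda { warn_test(:todo, 'excuse') },
  lambda { warn_test('reason') } ].each do |test_body|
  begin
    test_body.call
  rescue TestWarning => w
    warnings[w.category] += 1
  end
end
p warnings   # e.g. {:todo=>1, :other=>1}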

Re: the “xtest” method of disabling tests…

A simple solution might be for Test::Unit to ignore
private methods.
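Something like this, presumably (a made-up test case, and it assumes the
suite collector only gathers public methods whose names start with “test”):

require 'test/unit'

class WidgetTest < Test::Unit::TestCase
  def test_ready
    assert_equal 4, 2 + 2
  end

  private   # everything below is hidden from the test collector

  def test_not_written_yet
    flunk 'still needs a real implementation'
  end
end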

That wouldn’t address the need for distinguishing the
reason the test is ignored, of course.

Just a thought.

Hal

I argue against this.

It’s key to reduce visual clutter when you are running tests. They should
be red or green. You shouldn’t have to think about what you are seeing. The
example above is confusing.

And you shouldn’t be using your tests as your to-do list. Put them in
comments and use RDoc to extract them. Put them in your Eclipse task list.
Or build a separate system for tracking them, like Marick:

require 'ruby-trace/util/todo'
include Todo

class Misnamed
  todo "Rename class Misnamed."

  include Math

  todo "Change square_root to handle negative args."
  def square_root(arg)
    todo 'This todo call happens every time the method is called.'
    sqrt(arg)
  end
end

if __FILE__ == $0
  puts "The square root of 2 is #{Misnamed.new.square_root(2)}."
end

http://rubyforge.org/cgi-bin/viewcvs/cgi/viewcvs.cgi/wtr/scripting101/installs/timeclock/ruby-trace/doc/examples/todo-example.rb?rev=1.1.1.1&cvsroot=wtr&content-type=text/vnd.viewcvs-markup

In fact i even think the distinction between failures and errors is tedious
and unnecessary.

Bret

···

At 10:29 AM 4/15/2004, Its Me wrote:

124 tests, 300 assertions, 0 failures, 0 errors,
26 warnings (15 disabled, 8 skip, 2 todo, 1 other)


Bret Pettichord, Software Tester
Consultant - www.pettichord.com
Author - www.testinglessons.com
Blogger - www.io.com/~wazmo/blog

Homebrew Automation Seminar
April 30, Austin, Texas
www.pettichord.com/training.html

“Bret Pettichord” bret@pettichord.com wrote …

124 tests, 300 assertions, 0 failures, 0 errors,
26 warnings (15 disabled, 8 skip, 2 todo, 1 other)

I argue against this.

It’s key to reduce visual clutter when you are running tests. They should
be red or green.

True. A very succinct view of the results is useful, e.g. a binary
red/green.

You shouldn’t have to think about what you are seeing. The
example above is confusing.

But what about when you want to think about what you are seeing?

Like how many tests? How many assertions? Which tests passed or failed? Why
exclude information about ineffective tests (disabled, skipped, or
missing) from the results themselves? When I need to look deeper than the
red/green on a given test run, information about missing tests can be as
useful as any of the other existing bits of more detailed test results.

It would be nice if this was available (even with to-do lists and coverage
tools around). It could get used a lot more than to-do lists or RDoc
comments about needed tests.

···

At 10:29 AM 4/15/2004, Its Me wrote:

Bret Pettichord wrote:

In fact i even think the distinction between failures and errors is
tedious and unnecessary.

It depends on whether it allows you to distinguish broken test code from
broken production code. For example:

  • Test code violating preconditions (in the Design by Contract sense)
    should be “errors”, while violated postconditions or invariants should
    be “failures”. For example, a method is only valid when its receiver is
    in a particular state; if the test calls that method without verifying
    or ensuring the receiver’s state, the test is at fault. (Of course,
    sometimes it’s a regression bug, and designers have to decide whether to
    patch the test or redesign to remove the dependency.)

  • Failures in the fixture setup, particularly reading test input files
    or setting up Mock Objects, should be “errors”, while failures during
    the execution of production code should be “failures”.

  • Exceptions caused by external flaws, like an IOException while loading
    test data or the unanticipated death of a remote server, should be
    “errors”. Of course, the proper design of a unit test would inline test
    data and mock out servers, but I’ve seen a number of “real-world” tests
    that don’t; some test writers find the task too hard given the current
    design, and others are trying to prove that a third-party API works as
    expected underneath their own wrapper.

I agree that current tools don’t make the distinction very well, and
maybe it’s even theoretically impossible in the general case.
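For anyone not steeped in test/unit’s vocabulary, the mechanical distinction
behind the bullets above is simply this (two made-up tests, both deliberately
red): an unrescued exception is reported as an error, a failed assertion as a
failure.

require 'test/unit'

class DistinctionTest < Test::Unit::TestCase
  def test_counted_as_an_error
    nil.upcase               # NoMethodError escapes the test => error
  end

  def test_counted_as_a_failure
    assert_equal 4, 2 + 3    # the assertion itself fails => failure
  end
end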

124 tests, 300 assertions, 0 failures, 0 errors,
26 warnings (15 disabled, 8 skip, 2 todo, 1 other)

I argue against this.

It’s key to reduce visual clutter when you are running tests. They
should be red or green. You shouldn’t have to think about what you are
seeing. The example above is confusing.

And you shouldn’t be using your tests as your to-do list. Put them in
comments and use RDoc to extract them. Put them in your Eclipse task
list. Or build a separate system for tracking them,

Bret, you’ve pretty well summed up my reaction to this. I’ve just
recently been dealing with some pretty mucked up (Java) code, and the
last thing I’d want to provide those developers with is another way to
not complete functionality. So I won’t propagate such abilities into
Test::Unit.

By way of explanation to others, I view things like disabled tests,
todo tests, and even Eclipse TODO tags as excuses. Now, when I’m
cranking away at completing a story, I usually have an index card with
my short list of “excuses” on it. As I run into things that need to be
done but would distract me from my current train of thought, I write
them down. But that list is in such a temporal state that I have to
deal with those issues before the card disappears. It’s OK to make
excuses to myself for a while, but I’m not going to foist my excuses on
the next guy that comes along (or even on myself when I look at the
code next).

If something in the code is important enough to flag, it’s important
enough to deal with now.

I’m sorry if my position seems hard-nosed, but there are other ways of
flagging these things if that’s what you really want to do. They just
won’t show up in the default Test::Unit install.

In fact i even think the distinction between failures and errors is
tedious and unnecessary.

An interesting viewpoint, and one I can certainly understand. However,
since Test::Unit lets one explicitly assert that exceptions are not
thrown, I think there is some value in distinguishing between expected
test failures and unexpected ones. When I’m coding test-first, there’s
a big difference to me between running my tests and getting an error
(unexpected), and running my tests and getting a failure (expected). So
I think the distinction will stay. I appreciate you bringing it up,
though, as it’s given me an interesting idea to turn over.
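For instance (a contrived pair, both intentionally red): wrapping a call in
assert_nothing_raised turns the exception into a failure, an outcome I made an
explicit claim about, while leaving the call bare lets it surface as an error,
the unexpected kind of red.

require 'test/unit'

class ExpectationTest < Test::Unit::TestCase
  def test_guarded_call     # reported as a failure
    assert_nothing_raised { Integer('not a number') }
  end

  def test_unguarded_call   # reported as an error
    Integer('not a number')
  end
end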

TODO: Have fun coding Ruby…

Nathaniel
Terralien, Inc.

<:((><

···

On Apr 15, 2004, at 12:22, Bret Pettichord wrote:

At 10:29 AM 4/15/2004, Its Me wrote:

I guess i don’t see this. I usually just notice what the error/failure is
(oh, i have a missing method foo.bar; oh, the return value of foo.baz is
incorrect) and then do what i have to do. I pay attention to the actual
error message and stack trace, not to the error/failure distinction.

When i write test-first, i sometimes expect to see errors and sometimes
expect to see failures. For example, if i write a test for a class that
doesn’t exist yet, the first time i run it, i will see an error. Then if
i put in a stub of a class, i might see failures because its methods don’t
return the right thing. So at different times i expect both errors and
failures.

In other words, my expectation is that i will see red (errors/failures)
until i have a half-baked implementation and then green while i refactor it.

Bret

···

At 09:29 PM 4/15/2004, Nathaniel Talbott wrote:

When I’m coding test-first, there’s a big difference to me between running
my tests and getting an error (unexpected), and running my tests and
getting a failure (expected).


Bret Pettichord, Software Tester
Consultant - www.pettichord.com
Author - www.testinglessons.com
Blogger - www.io.com/~wazmo/blog

Homebrew Automation Seminar
April 30, Austin, Texas
www.pettichord.com/training.html

Nathaniel Talbott wrote:

Bret, you’ve pretty well summed up my reaction to this. I’ve just
recently been dealing with some pretty mucked up (Java) code, and the
last thing I’d want to provide those developers with is another way to
not complete functionality. So I won’t propagate such abilities into
Test::Unit.

I certainly agree that for unit tests, you are looking for a binary
outcome: either pass or fail. No argument there.

However, I’m looking into using Test::Unit in some different scenarios.
When coupled with something like Ara’s session module, Ruby makes a
good scripting language for acceptance testing command line programs.
Unlike unit tests, acceptance/functional tests can be (should be) written
early in an iteration before the code supporting them is ready.

What I would like to see (not in Test::Unit directly, but in some
acceptance testing framework built on Test::Unit) is the ability to declare
the expected outcome of a test, much like the DejaGNU testing framework.
The framework could catch expected failures and unexpected passes.
Once a feature is implemented, it moves from an expected failure to an
expected pass.
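A sketch of the flavor I have in mind, layered on top of Test::Unit
(expect_failure and the acceptance test below are made up, not anything that
exists today):

require 'test/unit'

module ExpectedOutcome
  # Declare that a test is expected to fail until its feature is built.
  # An assertion failure inside the block is swallowed as the expected
  # outcome; a pass is flagged loudly so the test can be promoted.
  def expect_failure(feature)
    begin
      yield
    rescue Test::Unit::AssertionFailedError
      return   # expected failure: feature not implemented yet
    end
    flunk "unexpected pass: '#{feature}' appears to work -- promote this test"
  end
end

class CommandLineAcceptanceTest < Test::Unit::TestCase
  include ExpectedOutcome

  def test_report_subcommand
    expect_failure('report subcommand') do
      output = nil   # stand-in for driving the real program and capturing output
      assert_equal "3 entries\n", output
    end
  end
end

An unexpected pass then shows up as an ordinary failure, which is exactly the
moment to move the test over to the expected-pass column.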

Just some thoughts.

···


– Jim Weirich jim@weirichhouse.org http://onestepback.org

“Beware of bugs in the above code; I have only proved it correct,
not tried it.” – Donald Knuth (in a memo to Peter van Emde Boas)