Sunday, March 29, 2015

Unit-Testing Tutorial II: Test-Driven Development

I. Introduction

"Test Driven Development" (TDD) is a discipline for writing code by writing the unit-tests first.  More precisely, it is a cycle of:
  1. Write a (failing!) test
  2. Write just enough code to pass the test.
  3. Refactor the code to keep it clean (but still passing)
where, at each iteration, you build up the "production" code by adding new tests that extend (and define) what the production code is supposed to do.  The process is remarkably similar to "throwing" (as it's called) a clay pot on a potter's wheel: many repetitions of working and examining the work, as you build it up.

I do not specifically recommend TDD as a daily practice.  Familiarity with TDD as a discipline, however, is a very useful skill.  It encourages the close bond between writing code and testing code, that is the hallmark of good, clean, coding.

(Personally I practice what I call TND: "Test Near Development".  I tend to write small chunks of code, and then immediately write tests.  Often the tests then point out edge cases that I might not have thought of in the original code.  This process repeats in many small cycles, similar to strict TDD.)

II. Purpose of the Exercises

The rest of this page is a guide to working through a pair of TDD "katas", or exercises.  Like a martial arts kata, the purpose is to train your (mental) muscles.  This does not mean that code must or should be written exactly this way, at this level of detail.  (Although some people in the industry do, and I'm not sure they're wrong.)

Rather, take the exercises as "fractal" representations of how TDD might apply in writing production code.  The cycle of test, code, refactor is still an extremely useful one to master, even if the granularity, or level of abstraction, is larger than appears in the exercises.

It's also a tiny example of "emergent design", where the code "emerges" from the iteration of adding more tests.  There are flame-wars pro and con emergent design, but it's still worth consideration.

Perhaps the most important point is this: even when you have "finished" your code (real code, not just this kata), remember to do the last "refactor" step.  I have seen entirely too much 'production' code in my career that "works" but was not truly finished, did not have the rough edges sanded down, as it were.
  1. The first exercise (prime factorization) is a "pure" kata.  It emphasizes the step-by-step TDD process at a very fine-grained level.  Most developers look at this exercise and complain "why don't I just write the code?"  This kata is not about the final answer, but about practicing the TDD cycle.

  2. The second exercise (chess pawn moves) is more like a real-world program.  It has several relatively independent parts, which can be developed one at a time via tests.

III. The Prime Factorization Kata

(Heavily borrowed from Uncle Bob Martin's example.)

The "Prime Factorization" kata is meant to be done -- not read.  If you're reading this, take each step at a time, and write your own implementation of "just enough code" to make the next test pass.  Don't skip ahead -- really go step-by-step.

I've written up descriptions of each step as a guide, but only use it if you're stuck.
  1. Get the skeleton in place.  Start by checking out the base project from svn://  Look at the com.proquest.kata package.  This provides a basically empty framework with the skeleton already created.  You'll see that the first test, and the initial implementation, define the prime factors of 1 as an empty list.

    I've also provided the listOf() method to easily generate lists of numbers to be used in the test asserts.

       public List<Integer> getFactors(int num) {
          List<Integer> results = new ArrayList<Integer>();
          return results;
       = = = = = = = = = = = = = = = = = = = = = = = = = = = =
       public void setup() {
          factorizer = new PrimeFactorizer();
       public void test1() {
          assertEquals (listOf(), factorizer.getFactors(1));
       private List<Integer> listOf(Integer ... numbers) {
          List<Integer> results = new ArrayList<Integer>();
          for (Integer number: numbers)
          return results;
  2. Add the first test(s).  The process we'll follow is very simple: we'll just keep adding tests for the next integer, until something breaks.  So we add:

       public void test2() {
          assertEquals (listOf(2), factorizer.getFactors(2));
    This test fails.  What's the simplest thing we can do to make it pass?  (That doesn't just hard-code the single answer we're looking for?)

       if (num > 1) {
    In fact, this change also passes the next test:

       assertEquals (listOf(3), factorizer.getFactors(3))

  3. Add just a little bit of logic.  Now it gets interesting.  When we add a test for:

       assertEquals (listOf(2,2), factorizer.getFactors(4));
    we have to actually do some computation.  But how much code is "just enough to pass"?  In this case, let's just handle a single factor of 2.  Now the code looks like:

       if (num % 2 == 0) {
          num = num / 2;
       if (num > 1) {
    You'll notice the last part is the same as before.  So all we're adding is "taking out" the first factor of 2 (if there is one).

    You should begin to see a pattern here.  At each new test, handle the new case.  If the new case looks a lot like the previous case, do the simplest possible generalization of both cases.

    This change actually handles the test cases for 4, 5, 6, and 7!

       assertEquals (listOf(2,2), factorizer.getFactors(4));
       assertEquals (listOf(5),   factorizer.getFactors(5));
       assertEquals (listOf(2,3), factorizer.getFactors(6));
       assertEquals (listOf(7),   factorizer.getFactors(7));

  4. Generalize a single piece of logic.  Adding a test for 8 raises a more general case.  How do we handle multiple instances of the same factor?  Again, take the "single case code" and make it more general, while changing as few other things as possible.
       while (num % 2 == 0) {
          num = num / 2;
       if (num > 1) {
    So really, all we've done is change the "if" to a "while"!

  5. Rinse and Repeat.  Adding a test for 9 is much like what we did for 8, only now we're making it more general than hard-coding a divisor of 2.  Essentially, we're wrapping a loop of candidate factors around the previous code, and substituting the candidate for the previously hard-coded 2:

       for (int factor=2; num > 1; ++factor) {
          while (num % factor == 0) {
             num = num / factor;
       if (num > 1) {
    We actually had to do two things here, which is a little bit of a cheat.  We added the loop, and we told it when to stop (when num got reduced, by division, to 1). This passes the test for 9, and 10, and ... well, as it turns out, for everything!
  6. Refactor.  But we're not done... remember, the cycle is failing test, passing test, and then refactor.  Looking closely at the above code, the "if (num > 1)" is apparently no longer needed.  Try taking it out, and see if the tests pass!

  7. More refactoring?  What other refactoring can be done?  The inner while loop could become a for loop, which removes one line of code.  It's not clear which form is more obvious to the reader.  Think about it.

  8. Other Tests?  Are there are any other tests we should consider?  Certainly take a stab at some larger numbers.  But also consider smaller numbers: are there any edge cases we need to test?  What about 0?  -1?  Write tests for those as well.

  9. Refactor Tests.  Think about refactoring the tests, too.  Is there code common to all of the tests?  This is very simple set of tests. 

    It would be tempting to just lump all the asserts together in one @Test method.  For this example, it might even make sense (and certainly be shorter).  In general, though, each test is making a declarative statement about the code.  Don't over-optimize that, at least not at first.  Once you've got truly working code, then think about optimizing your tests.

    (OTOH, don't blindly cut-and-paste large amounts of code for each test... if you're copying more than 3 lines at a time in your tests, then it's time to do some refactoring.  Fail, pass, refactor should apply to your tests just as well as to your production code.)

  10. Efficiency?  Now that the code is working, we can think about efficiency.  Avoid "premature optimization" -- get something working first so that you fully understand the problem (and have a good suite of tests!), and then think about optimization.  (Obviously there are cases where some thought about efficiency may have to come first.  But they're pretty rare.)

    Even the optimizations can be TDD'd.  Consider:
    • Candidate factors could be 2, and then only odd numbers.
    • Candidate factors could be just prime numbers.
    • Candidate factors could be just prime numbers, no higher than the square root of the original number.  (Or does it matter?)
    • Candidate factors could be cached for future calls to getFactors.
    • Is there a (presumably) faster way to get the dividend and the modulus in one operation?  (Most hardware does both in one operation.)
  11. Coda (aka the little bit of music at the end of a symphony, after you think the symphony is over):

    Do this whole exercise over again, from scratch -- in about two weeks.  Don't look at this exercise, don't look at the code you wrote the first time.  Do it literally like a martial arts kata.  Just follow the fail / pass / refactor pattern.  Afterwards, see what looks the same and what looks different.  If feasible, try it in a different language.

  12. Before I had studied Zen for thirty years, I saw mountains as mountains, and waters as waters. When I arrived at a more intimate knowledge, I came to the point where I saw that mountains are not mountains, and waters are not waters. But now that I have got its very substance I am at rest. For it's just that I see mountains once again as mountains, and waters once again as waters.
    Ch'uan Teng Lu, 22. (The Way of Zen 126)


IV.  Chess Pawn Moves

In the first module, I introduced a (very) minimal set of classes for supporting Chess problems.  They are in the com.proquest.chess package in the same project as the prime factorization exercise.
  1. Pawn moves.  For this second kata, we'll build on the chess package, and create a Pawn class that generates all of the legal moves that a Pawn can make.  Now the Pawn is probably the most illogical piece in chess: it has four different kinds of moves, some of which can be combined.  This makes for a very nice breakdown of TDD'ing writing the class.

    To refresh your chess knowledge, a pawn can:

    1. Advance one square forward, if nothing is blocking it.

    2. Advance two squares forward, if it is in its starting position (2nd rank, or y=1 for white, y=6 for black) and nothing is blocking it.

    3. Capture an opposing piece one square to the right and forward, or left and forward.

    4. Promote to a Queen, Rook, Knight, or Bishop if its move or capture leaves it on the last rank (y=7 for white, y=0 for black)

    5. "En passant" capture.  If a (e.g. black) pawn takes the 2-square move option, and lands next to a white pawn, the white pawn can capture the black pawn and lands on the square where the black pawn would have been if it had only moved one square.

      Just to make it worse, the white player must capture the black pawn immediately or forever forfeit the move.  (And my Board class currently has no way to record move numbers, which would be required to properly implement the "do it now or never" rule.)

  2. TDD strategy.  The TDD approach is to take the rules above, one at a time (or in even smaller pieces), write a failing test, then make just that test pass, and then refactor (and keep all the tests passing).  When I worked the exercise, my thinking (and the tests) went something like this:

    1. White pawn should move forward (+y) one square
    2. Black  pawn should move forward (-y) one square
    3. Hmm, I'm gonna have a lot of this white pawn, black pawn, stuff.  Refactor my tests to combine both in each test.  Remember that test code should be just as clean and as abstracted as production code!  So I wrote an "assertPawnCanMoveTo(pawn, Integer ... xyValues)" utility method that abstracts my tests and makes them shorter.
    4. Pawn should not move forward if blocked.
    5. Pawn on starting position can move two squares forward if not blocked.  The move-one-square and then move-two-squares code is probably going to look similar; after my test passes, I may want to refactor.  This is a good time to use Emma's "coverage as... Junit" to see if all paths have been exercised.
    6. Normal Pawn capture.   Hmm, I may need to go back to one test for White, and one test for Black.  Unless I can come up with a refactoring that is both clever and clear... which may be impossible.  When the board gets complicated, use board.print() just to see what's going on.
    7. Oops, don't allow captures of your own (same-colored) pieces!
    8. I ran coverage again, and noticed I never exercised prohibiting captures past the left or right edge of the board.  So I added a test for that, too.
    9. Promotion to Queen, Rook, Knight, or Bishop on moving a Pawn to the last rank.  Note that this counts as 4 different moves.
    10. Promotion during a capture onto the last rank.  A quick refactoring of the Pawn code, to re-use promotion code in both moves and captures, probably suggests itself here.
  3. Safe Refactoring.  At this point, I have everything working except "en passant" moves, so it's time to look back for refactoring.

    1. There's probably a lot of code that does something like "a white pawn at y==6 or a black pawn at y==1" that could be improved.  We could use data instead of code (perhaps a Map indexed by Color that gives us y values, or maybe Position's).

    2. Or perhaps a better approach would be two new sub-classes BlackPawn and WhitePawn that extend Pawn, but with some different constants.   E.g. lastRank() is 7 for WhitePawn but 0 for BlackPawn.

    3. The makeMoves() method probably has two different responsibilities: one for moves, and one for captures, and they probably deserve different (sub) methods, if only for readability.

    The point is, now that we have good test coverage -- and your whole Pawn class should be totally green in coverage right now -- we can confidently refactor all we want!  Don't just stop because it works, now make it good!
  4. Check against my tests.   Because we have a standard API for Piece's, you can cross-check your code by running my tests.  Just paste that text into a new test class and run it.  Take a look at the names I used for the tests -- I have zero comments, and (I think) no need to add any.  Look at the abstractions, like assertContainsPromotionTo().  How would you improve these tests?  Do they cover cases that you didn't?  Did you cover cases that my tests didn't?

No comments:

Post a Comment