Monday, September 21, 2009

Spy vs. Spy

A while ago Ravi Mohan wrote a blog entry having to do with TDD and Sudoku solving, and I wrote a response to it asserting that TDD is not an algorithm generator. I recently caught up with Ravi's blog and it turns out Peter Norvig indirectly referenced Ravi's original blog entry in an interview for the book Coders at Work. I've been thinking recently of writing a bit more on the subject of design and TDD to simplify and clarify what I said at the time, so I figure I may as well take this opportunity to do so.

What is Design?

It strikes me that the big problem lies in the ambiguity of the word 'design' as it applies to programming. No one can envision how an entire application of any significance will be written ahead of time, so we break applications up into smaller problems which we can solve separately. Fundamentally, that's what the word 'design' refers to: the process of cutting the thing up into smaller pieces. One wouldn't swallow an entire meal in one gulp. Instead there are separate courses and plates, and in turn we cut the food into bite-sized morsels.

Algorithms:

One important part of this design process is separating the algorithm from the implementation. The 'algorithm' refers to the aspect of the solution to a problem that is independent of the code you need to write. It's even independent of any specific programming language. It's the overall approach you're using to solve that problem. Let's say you want to write some code to fill a shape with a particular color, or sort a very large data-set, or encrypt a file, or calculate the odds of winning for a given poker hand. These tend to be the kinds of problems where you can't readily start writing code hoping to stumble toward the solution one step at a time. You have to start with some kind of plan. The algorithm is that plan. In some cases the algorithm has to be validated in a very formal mathematical kind of way. Cryptography is a good example. It doesn't matter that the average person can't decipher a scrambled message without the key. This kind of code generally has to stand up to a determined and sophisticated attacker, and you have to know the boundaries where the algorithm will stop being effective. In other cases you just need to get the general idea, and that's probably good enough to begin coding: a simple flood fill may be a good example of that. In fact, it's often the case that if you become familiar enough with a given domain, you don't need to explicitly define algorithms ahead of time at all. If you have enough experience, you know roughly what you'll need to do and you can be confident that you'll work out the details as you go along. It's important to realize that deep down you still usually have an algorithm in the back of your mind as you begin to write code, even though you may not be fully conscious of it.
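To make the algorithm/implementation split concrete, here's a minimal sketch of the flood fill just mentioned, in Java. The int[][] grid and the method names are my own illustrative choices, not part of the algorithm itself:

public class FloodFill {
    // The algorithm: if this cell is part of the target region,
    // recolor it, then spread to the four neighbouring cells.
    public static void fill(int[][] grid, int row, int col, int target, int replacement) {
        if (row < 0 || row >= grid.length || col < 0 || col >= grid[0].length) {
            return; // off the grid
        }
        if (grid[row][col] != target || target == replacement) {
            return; // not part of the region, or nothing to do
        }
        grid[row][col] = replacement;
        fill(grid, row + 1, col, target, replacement);
        fill(grid, row - 1, col, target, replacement);
        fill(grid, row, col + 1, target, replacement);
        fill(grid, row, col - 1, target, replacement);
    }
}

Everything about this code could change - the language, the grid representation, iteration instead of recursion - and the algorithm would remain the same.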

Hill Climbing:

You want to climb to the top of a hill but you've never done it before. Instead of worrying about the entire route, you pick a place up the hill you're pretty sure you can reach. Once you get there, you have a look around and decide how to get a bit farther up still, and so on. Instead of solving the entire problem, you come up with a simpler version or part of the problem that you have some confidence you can deal with. You write some code to solve that aspect of the problem. Then you see if you can take the next step toward a solution. You may have to backtrack several times as you go along, but this can be an effective strategy. Hill-climbing has the advantage that solving a part of a problem may make the rest much clearer, and actually working on real code can free you from the mental paralysis that can come from simply staring too long at a piece of paper and thinking really hard. However, a hill-climbing approach can easily lead to a dead end: Once you've exhausted all of the code needed to develop the fairly obvious aspects of what you are trying to do, there is no clear way to take the next step. That's where algorithmic thinking is required, and the solution, once you have it, may be conceptually different enough that you may have to throw away much of the code you wrote earlier.

Where does TDD Fit?

TDD is a technique to help you write better code. Since TDD is about the code, it's important to realize that it is generally not about helping you to develop algorithms. It's mostly about the implementation part of the algorithm-implementation pairing at the heart of programming. That is what I meant when I wrote in my earlier post that TDD is not an algorithm generator. There's a lot more to writing an application than knowing about algorithms. Your code has to work correctly for one thing! Donald Knuth had a famous comment for a piece of code he wrote: "Beware of bugs in the above code; I have only proved it correct, not tried it." Your code also has to be maintainable, which means it should be readable and as self-documenting as possible, and also as free as possible from duplication. Your implementation also has to be reasonably efficient. If the way you implement your algorithm is too slow, that's also a problem (sometimes the algorithm itself is at fault and needs to be replaced too).

TDD allows you to write code one step at a time. The general idea is that you set up some initial conditions for your test, then you call some function (which you haven't written yet), and finally you write some assertions to make sure that function did the right thing. Having done that much, you fill in the code for that function. This type of approach lets you think about the shape of your code without the distraction of worrying about the details of how to make it work. Shaping the code is also a part of the design process, and often that's where some confusion sets in. Some people are referring to the way the code is organized when they talk about design (the classes, methods, functions, data structures, etc.) and others are focusing on the algorithmic aspects. If TDD means Test Driven Design, then it's the code-shaping meaning of design that it mainly refers to.
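As a minimal illustration of that cycle (the Order class and its methods are invented for this example, not taken from any real application), a first TDD step might look like this, written before applyDiscount() exists:

public void testApplyDiscount_ReducesTotalByTenPercent() {
    // set up some initial conditions
    Order order = new Order(100.00);

    // call the function we haven't written yet
    order.applyDiscount(0.10);

    // make sure it did the right thing
    assertEquals(90.00, order.getTotal(), 0.001);
}

Only after this test exists (and fails) would we go and fill in the code for applyDiscount().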

Backstopping each piece of code you write with a test allows you to achieve the following objectives:

1) Make sure your code is doing what you want it to - particularly to confirm edge cases are being handled properly. This lets you have more confidence in the code you're writing and prevents you from having to wait 'til an entire chunk of functionality is finished before testing it to see if it actually works.
2) Assist you in thinking about how you want your code organized without concomitantly worrying about implementation.
3) Develop a regression suite that will flag changes you may make later on that break the tests.
4) Provide a form of executable documentation (one that won't go stale) that will help people (including yourself!) who read the code later on understand your intentions.

TDD and Hill Climbing:

If you don't understand a problem very well to begin with, hacking code naively one test at a time is not very likely to work, as can be seen in Ron Jeffries' attempts to solve Sudoku using that very approach.

Conclusion:

Hill climbing - writing the code so as to take one step toward the end goal at a time - works well enough when you have a pretty good idea of where you're going. However, for harder problems it's easy enough to reach a dead end where there is no obvious next step. Undertaking a hill-climbing approach blindly, without adequate preparation, means you may spend a lot of time writing code that will only lead you to an impasse. In the domain of business programming there is not a lot of algorithmic complexity. Most of the time the complexity in business programming arises from requirements that are often rather arbitrary and from working with large amounts of data. Developing such applications successfully isn't easy, but the difficulty tends not to be mathematical in nature: It's more like tax law, if you will, with all of its intricacies and exceptions, than like quantum field theory. Using TDD to backstop a mostly hill-climbing strategy tends to work reasonably well for such applications. When problems become more algorithmic/mathematical, algorithmic thinking becomes necessary. You need to consider such problems as a whole and come up with an overall strategy to solve them. A naive step-at-a-time approach usually doesn't work well in such cases, and it becomes necessary to work things out ahead of time. A combination of both approaches generally works best for most applications, though the balance between the two will obviously vary from one app to another. Success in developing good software lies in the ability to find this balance.

Tuesday, July 01, 2008

The Problem With Popplers

Today's entry is just a rant about the state of programming in general. That's why the subject line is completely irrelevant. I was going to call it The Problem With Programming, but where's the fun in that? My problem with programming is that we keep having the same endless discussions. These discussions, in my humble opinion, are not really of much consequence. Most of the programming discussions seem to have to do with a) programming language features, b) new and better programming languages, and c) methodology. Hey, this is an "agile" programming blog, right? So I'm already guilty of c. The truth is that none of these things really matters all that much in relation to the giant elephant in the room: Competence. With reasonable exceptions, if you're competent in one programming language, you're competent in them all. Sure, you'll have to take some time to learn the idioms and quirks of another language, but that's fine. Am I really going to be significantly more productive using C# than Java? What about Python? What about Perl or PHP? What about Ruby? Oh, those are boring, eh? Perhaps something like Scala or Lisp would do the trick. Honestly, does it really matter? Sure there are cases where such distinctions do make a difference, but frankly, assuming you've got some kind of reasonable OO language without pointers (I've excluded C and C++ from this discussion), really it comes down to this:

1) Is the inherent performance adequate for your foreseeable needs?
2) Is the platform stable?
3) Can you find suitable libraries for what you want to do without re-inventing the wheel?
4) Have you addressed relevant portability issues?
5) Are the available development environments suitable for your needs/preferences?

Really, that's about it. Everything else is just kind of annoying dithering. Sure, with real human languages people have a lot of vested interests, plus there are aesthetic concerns. But programming languages have only been around for decades. Do we really need to have a balkanized landscape in the programming language world to mimic that of regular languages in the real world? Apparently so! As for aesthetics, I suppose poetry written in French can never be perfectly translated into English, or Chinese, or Swahili, and so on. But we're talking about computer programs here. The aesthetics should, in my opinion, be limited to expressing things clearly and economically. Computer programming is much more like technical writing than poetry. If there are real aesthetics to it, then I would say any poetry in programming lives in the realm of algorithms, which are language-agnostic anyway. Then there are the endless debates about changes to programming languages. I had an email discussion recently with a fellow about checked exceptions in Java and how terrible they were. My goodness. If you don't like them, then don't use them. It's easy to dispense with them. It's not a problem. Frankly it's a waste of mental energy to discuss things like this. But that's what people are into these days. Endless onanizing with the latest and coolest language features. Should the language be more dynamic? Less dynamic? Can we write our own specialized Domain Specific Languages? Sigh...

Finally there's the methodology cesspool. I have my own views. The problem is that this issue tends to be at the forefront all the time. Is XP good? Are all XP practices good? Is agile a religion or a con? What's the difference between agile and lean? The simple reality is having competent and motivated programmers is good. Having unmotivated and incompetent programmers is bad. If you have competent and motivated programmers, then a really bad methodology may cripple them, but nothing short of the insanity of the CMM stuff is likely to do that.

CMM level 5 denotes complete paralysis. Levels 2 to 4 indicate various intermediate stages where paralysis is spreading, but you can still ship. The norm is simply chaos without any form at all: the hack-and-fix grind. Outside the realm of complete insanity, there are practices that most reasonable people would agree on: automated tests for regression and quality; using a revision control system; integrating the tests into the revision control system as part of an automated build; planning and designing in some kind of iterative fashion; communicating - you know, like, talking to one another in some way that doesn't involve exchanging 200-page documents; cooperation - people involved genuinely wanting to help one another reach the goal.

Ok, you get the idea. Beyond that, it's all about refinement. Don't like pair-programming? Fine. Prefer writing automated tests after you write the code for a particular feature? Well, I disagree, but ok, go for it. Find occasions in which some detailed planning and design is necessary? Hey, do it - just don't lose it and end up in in(s)anity land again.

What's my point? Well, it's that the real problem with programming is that developers aren't competent enough. There is not enough understanding of core programming ideas - either that, or there isn't enough self-discipline to apply them: Reduce duplication; encapsulate; reduce coupling; increase cohesion; emphasize clarity; resist the urge to obfuscate; no matter how tempting it might be, choose the plain way to do something rather than the really fancy one. Avoid the way that uses some newfangled language feature unless you can demonstrate the plain way is going to be a serious problem. Whatever you do, avoid falling in love with object-oriented design for its own sake (or functional programming for its own sake). You should be able to show how everything you do emphasizes clarity and reduces duplication and coupling. At the end of the day, a good program written in Python should look almost identical to the same thing written in Java. What are your objects? What are their functions? How have you removed duplication? It's not a sexy problem. Switching to a new programming language or a new methodology won't help. It's just the slow grind of educating people, educating ourselves, and keeping things as simple as possible.


Think about it.

Sunday, March 09, 2008

Hitting the Reset Button

All too often in the software world we convince ourselves to keep going with a design that just wasn't quite right from the start. We're not willing to say, let's scrap this and use what we learned to make it better next time. The result is that the technology becomes increasingly crufty and after a while, it's hard to know where to even start to improve things. I think it's important to have the courage to make tough decisions and rework designs that need it as soon as possible, even if it means rolling up our sleeves to do some hard work and dealing with some short term pain during the transition. I've quoted a piece of an interview that CNN did with Steve Jobs below, but I'd like to first show the key highlight, at least from my point of view:
But there always seems to come a moment where it's just not working, and it's so easy to fool yourself - to convince yourself that it is when you know in your heart that it isn't.

It really is so easy to fool yourself. I think it's true that the great companies and great teams don't just come out of the gate with winners; rather, they have the humility and courage to evaluate their work and start again when they have to. Anyway, here's the rest of the part of the interview I wanted to quote:
At Pixar when we were making Toy Story, there came a time when we were forced to admit that the story wasn't great. It just wasn't great. We stopped production for five months.... We paid them all to twiddle their thumbs while the team perfected the story into what became Toy Story. And if they hadn't had the courage to stop, there would have never been a Toy Story the way it is, and there probably would have never been a Pixar.

We called that the 'story crisis,' and we never expected to have another one. But you know what? There's been one on every film. We don't stop production for five months. We've gotten a little smarter about it. But there always seems to come a moment where it's just not working, and it's so easy to fool yourself - to convince yourself that it is when you know in your heart that it isn't.

Well, you know what? It's been that way with [almost] every major project at Apple, too.... Take the iPhone. We had a different enclosure design for this iPhone until way too close to the introduction to ever change it. And I came in one Monday morning, I said, 'I just don't love this. I can't convince myself to fall in love with this. And this is the most important product we've ever done.'

And we pushed the reset button. We went through all of the zillions of models we'd made and ideas we'd had. And we ended up creating what you see here as the iPhone, which is dramatically better. It was hell because we had to go to the team and say, 'All this work you've [done] for the last year, we're going to have to throw it away and start over, and we're going to have to work twice as hard now because we don't have enough time.' And you know what everybody said? 'Sign us up.'

That happens more than you think, because this is not just engineering and science. There is art, too. Sometimes when you're in the middle of one of these crises, you're not sure you're going to make it to the other end. But we've always made it, and so we have a certain degree of confidence, although sometimes you wonder. I think the key thing is that we're not all terrified at the same time. I mean, we do put our heart and soul into these things.

Saturday, November 24, 2007

Pair Programming Redux

I've been working for a few years in a mostly XP pair programming environment. Here's my list of pros and cons of pair programming based on this experience:

Pros:
  • On this project, I've seen an overall improvement in the code quality over the past few years - something I've never seen happen on any other project I've worked on. As people learn better ways of doing things, pairing helps to disseminate that knowledge in a way that meetings, presentations, and code reviews just don't.
  • In my experience, there hasn't been a problem with people who are working together arguing and consequently getting nothing done. On the contrary, pairing seems to have encouraged an atmosphere of cooperation and friendliness.
  • Sometimes under time pressure one can't resist the urge to copy and paste some code or hack in some functionality without developing tests first. The mutual supervision of pair programming really does seem to have a positive effect on these kinds of transgressions. It's a lot harder to take a nasty shortcut under the watchful eye of the person you're pairing with.
  • I would tend to agree with the principle that working in pairs doesn't really hurt overall productivity. The reason seems to be that the value of the team having a common understanding of the code and the business generally trumps the value of having two people typing code separately. Integration is the real difficulty in many software projects, so whatever gain in the amount of code written might come from separating pairs, that gain seems to be offset by the greater consistency of code written when pairing.
Cons:
  • Hygiene: Pair programming in our environment means sharing one keyboard and mouse, and sitting in close proximity to another person all day long as well as switching pairs frequently. We often have had problems with people spreading colds around the office. I think developers in an XP shop should have their own personal wireless mouse/keyboard combination as well as their own chair that they don't have to share with anyone else.
  • Ergonomics: The reality is that sitting at a computer is not a natural position for the body and you can do damage to your back, neck, shoulders and head, and of course your wrists over time. Pairing tends to encourage bad ergonomic habits, because it's inconvenient to change the position of the keyboard tray, monitor, chair, etc when it's time to take one's turn at the keyboard. I really believe that these things need to be considered if you want to do pair programming long term.
  • Personal space: Some people enjoy having some peace and quiet as well as a space where they can keep their things. As much as I am a fan of pair programming, I think developers should have their own desks away from the common bull pen they can retreat to from time to time.
My conclusion is that pair programming does work, but it requires some care. First, developers have to respect each other and be willing to compromise. Also, pair programming 100% of the time doesn't work. Sometimes when facing a design problem it helps to go off and work separately, then get back together and discuss later. Also, pair programming - constantly communicating, asking questions, explaining and justifying one's own ideas throughout the day - is very demanding. After a period of time, burnout can occur. When that happens, I think it's a good idea for people to be able to work on something alone for a while.

Thursday, November 15, 2007

Demystifying Stubs, Fakes, and Mock Objects

This blog entry is my attempt to explain the different kinds of test-only objects that can be substituted for the real thing in automated tests: Stubs, Fakes, and Mocks. It can be easy to get confused, so I hope this entry helps!

Stubs, fakes, and mocks are all objects that replace the objects that would normally be used in the actual application. What purpose do they serve? First, using such test-only objects promotes good design. Writing code so that classes have dependencies on interfaces and abstract classes, which can be implemented either by the full-fledged production code or by a test-only class of some kind, helps to reduce the overall coupling in an application. Test-only objects frequently come up when an application is accessing another system: Making sure that system is available and produces the same response to a given input all the time may be a problem (consider a service that provides the current price of stocks, for example). Also, simply configuring something like a relational database or a web server so that tests can run against it can be an issue. More generally, using a test-only object makes it easier to set up the initial conditions for tests. Such objects can make it simpler to provide the initial state that the function under test will respond to. The time it takes for tests to run can also be a factor; using simple test-only objects can speed up test runs substantially. Finally, in some cases one may want to develop some logic that depends on classes that haven't been written yet or that you don't have access to. Say, for example, another team is working on that functionality and you don't have access to their code. In such cases, you can write code against objects that, for the purposes of testing your own logic, implement the interfaces you expect to see in the yet-to-be-written API. The core idea behind the use of test-only objects is that we often want to write tests for the application logic we're currently working on without having to worry about the behaviour of some of the external code our logic is using.
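Here's a small sketch of that dependency-on-an-interface idea, using the stock-price example (the interface and class names are invented purely for illustration): the application logic depends only on a small interface, and a test can supply its own completely predictable implementation.

// Production code depends on this interface, not on the real service.
public interface StockPriceSource {
    double getPrice(String symbol);
}

// The real implementation would call out to the external system.
// For tests, we can substitute something completely predictable:
public class FixedStockPriceSource implements StockPriceSource {
    private final double price;

    public FixedStockPriceSource(double price) {
        this.price = price;
    }

    public double getPrice(String symbol) {
        return price; // always the same answer, regardless of symbol
    }
}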

Let's consider an example. I once wrote a program that allowed a user to schedule polling of different sensors - e.g. read sensor 'A' once an hour, sensor 'B' every minute, and sensor 'C' every second. I wanted to test my scheduling logic, but I wanted to make it independent from the actual sensors: I didn't want to have my application connected to real sensors just to make my unit tests run. Also, the scheduling system of course obtained the computer's system time to see whether it was time to read a given sensor. I wanted my tests to be independent of the actual system time too: What I wanted was to set the time in my test and make sure my program responded by reading the correct sensors - and avoided reading the wrong ones. I didn't want to have to set the actual computer's system time inside of my tests! So, for both of these cases, the business of actually connecting to the sensors and of setting and reading the time, I created special test-only objects to stand in for the ones that would be used in the actual application.

If you're interested in ways to instrument your code so that you can substitute these kinds of test-only objects, and the trade-offs involved, Jeff Langr's Don't Mock Me article is a great reference. I should note that his use of the word "mock" is more generic than the one that's often used: He means "mock" in the general sense of any kind of test object. As we'll see a bit further down, "mock object" often has a specific meaning that's different from stubs and fakes.

Now, on to Stubs, Fakes, and Mocks.

Stub: A stub is an object that always does the same thing. It's very simple and very dumb. In our example above of polling sensors, the system time seems like a useful entity to replace with a stub. Let's suppose our scheduling code was in a class called Scheduler. This class might have a method called getSystemTime(). For the purpose of testing, we might create a TestingScheduler class that extends Scheduler and overrides the getSystemTime() method. Now you can set the system time in the constructor of this test-specific class, e.g.:

public class TestingScheduler extends Scheduler {
    private int timeInMillisForTest; // the fixed time this stub will always report

    public TestingScheduler(int timeInMillisForTest) {
        this.timeInMillisForTest = timeInMillisForTest;
    }

    public int getSystemTime() {
        return timeInMillisForTest;
    }
}

When a TestingScheduler object is used as part of a test, the rest of the Scheduler logic works normally, but it's now getting the time that's been set in the test instead of the actual system time.

Fake: A fake is a more sophisticated kind of test object. The idea is that the object actually displays some real behaviour, yet in some essential ways it is not the real thing. Fakes can be used to speed up the time it takes tests to run and/or to simplify configuration. For example, a project I am currently working on is using Oracle's Toplink as an object-relational mapper (ORM). This allows data in Java objects to be transparently saved to and retrieved from a relational database. To make tests that use this framework run faster, a much simplified memory-only implementation of Toplink's interfaces was written. This version doesn't know about transactions and doesn't actually persist data, but it works well enough to allow many of our tests to run against it - and since the actual Oracle database isn't involved, the tests run over an order of magnitude faster. Going back to the scheduler example, we developed a piece of software that could behave as though it was a real sensor. That way we were able to run a variety of fairly complicated tests to make sure our application could communicate with sensors correctly without actually having to hook the tests up to a real sensor. Any time you write code that simulates an external service - some sensors, a Web server, or what have you - you're creating a fake.
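As a sketch of what a fake can look like (the CustomerRepository and Customer types here are invented for illustration), an in-memory fake can honour the same contract as a database-backed implementation while keeping everything in a map:

import java.util.HashMap;
import java.util.Map;

// Real behaviour - you get back whatever you stored - but backed by
// a HashMap instead of a database: no connection, no configuration.
public class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, Customer> store = new HashMap<Long, Customer>();

    public void save(Customer customer) {
        store.put(customer.getId(), customer);
    }

    public Customer findById(long id) {
        return store.get(id);
    }
}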

You can find a simple example of a fake in the TestNode class in my loop finder example. The TestNode implements the Node interface for the purposes of the unit tests. Classes that are actually part of the application have their own, more complex, implementations of this interface - but we're not interested in testing their implementation of the Node interface here. This allows us to write tests that can run in isolation from the rest of the application. From the perspective of the overall design, this approach helps us to reduce coupling between classes. The LoopFinder class only depends on the Node interface rather than on any specific implementation. That's an example of how making code easier to test concomitantly improves the design.

Mock: Mock objects can be the most confusing to understand. First of all, one can argue that the two types of test classes mentioned above are mocks. After all, they both "mock out" or "simulate" a real class. In fact, mocks are a certain kind of stub or fake. However, the additional feature mock objects offer on top of acting as simple stubs or fakes is that they provide a flexible way to specify more directly how your function under test should actually operate. In this sense they also act as a kind of recording device: They keep track of which of the mock object's methods are called, with what kind of parameters, and how many times. If your function under test fails to exercise the mock as specified in the test, the test fails. That's why developing using mock objects is often called "interaction testing." You're not only writing a test which confirms that the state after a given method call matches the expected values; you're also specifying how the objects in the function under test, which of course have been replaced with mocks, ought to be exercised within a given test.

To sum up: A mock object framework can make sure a) that the method under test, when executed, will in fact call certain functions on the mock object (or objects) it interacts with and b) that the method under test will react in an appropriate way to whatever the mock objects do - this second part is not any different from what stubs and fakes offer.

We've already seen how stubs and fakes can be used, so let's create a hand-rolled example of the kind of thing that mock object frameworks can help with. Let's go back to the scheduler we've already talked about. Let's say the scheduler processes a queue of ScheduledItem objects (ScheduledItem might be an interface). If it's time to run one of these items, the scheduler calls the item's execute method. In our test, we can create a queue of mock items such that only one of them is supposed to be executed. A simple way of implementing this mock item might look something like this:

public interface ScheduledItem {
    public void execute();
    public int getNextExecutionTime();
}

public class MockScheduledItem implements ScheduledItem {
    private boolean wasExecuted;
    private int nextRun;

    public MockScheduledItem(int nextRun) {
        this.nextRun = nextRun;
    }

    public void execute() {
        wasExecuted = true;
    }

    public int getNextExecutionTime() {
        return nextRun;
    }

    public boolean getWasExecuted() {
        return wasExecuted;
    }
}

Our test might look something like this:
public void testScheduler_MakeSureTheRightItemIsExecuted() {
    // setup
    MockScheduledItem shouldRun = new MockScheduledItem(1000);
    MockScheduledItem shouldNotRun = new MockScheduledItem(2000);
    Scheduler scheduler = new TestingScheduler(1100);
    scheduler.add(shouldNotRun);
    scheduler.add(shouldRun);

    // execute
    scheduler.processQueue();

    // verify
    assertTrue(shouldRun.getWasExecuted());
    assertFalse(shouldNotRun.getWasExecuted());
}
That's a really simple, hand-rolled example of a mock object. The test just makes sure that the processQueue method ran the execute method on the first item, but not on the second one. Of course this example is very simple. We could make it a little fancier by counting the number of times the execute method is called and making sure it's called only once during the test. Then we could start to implement functionality that makes sure functions belonging to a given mock object are called with particular arguments, in a particular order, etc. Mock object frameworks support this kind of functionality out of the box. You can take any class in your application and create a mock version of that class to be used as part of a test. There are a bunch of mocking frameworks for many different programming languages.
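To give a taste of what those frameworks buy you, here's roughly how the same test might read using Mockito - just one of the many such frameworks, and a sketch rather than a tutorial:

import static org.mockito.Mockito.*;

public void testScheduler_MakeSureTheRightItemIsExecuted() {
    // the framework generates the mock implementations for us
    ScheduledItem shouldRun = mock(ScheduledItem.class);
    ScheduledItem shouldNotRun = mock(ScheduledItem.class);
    when(shouldRun.getNextExecutionTime()).thenReturn(1000);
    when(shouldNotRun.getNextExecutionTime()).thenReturn(2000);

    Scheduler scheduler = new TestingScheduler(1100);
    scheduler.add(shouldNotRun);
    scheduler.add(shouldRun);

    scheduler.processQueue();

    // interaction checks replace the hand-rolled getWasExecuted() flag
    verify(shouldRun, times(1)).execute();
    verify(shouldNotRun, never()).execute();
}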

Before you dive in, consider my word of caution: In the great spectrum between pure black-box and pure white-box testing, using mock objects is about as "white-box" as it gets. You're saying things along the lines of "I want to make sure that when I call function X on object A (the function and object under test), functions Y on object B and Z on object C will be called in that order, with certain specific arguments." When the test you write is making sure that something *happened* as a result of your test, it tends to be easier to understand what the test is trying to do. On the other hand, if your test is just making sure that some functions were called, what does that really mean? Potentially very little. Also, because your mock objects are basically fakes or stubs, you have no guarantee that the behaviour of the actual objects being mocked out will be consistent with the mocks. In other words, you can create a mock version of an object that adds one to its argument whereas the real function subtracts one. If you change the behaviour of a given function that is being mocked in a test somewhere, you have to be careful to adjust the mock accordingly. If you don't, you'll wind up with passing tests, but you may still have introduced a bug into the application. This kind of problem tends to become more likely as the sophistication of the fake implementation increases - and pure fakes suffer from the same weakness. I do think that creating mock tests where the specified interactions become complicated, and the mock itself is a sophisticated fake that can respond to a wide variety of interactions, compounds the likelihood of running into this kind of problem. Also, on a more basic level, simply refactoring code can be difficult with mock objects. The mock frameworks sometimes use strings to represent the mock object's methods internally, so renaming a method using a refactoring tool may not actually update the mock, and your tests would suddenly fail just because you renamed a function. Of course, even a slightly more complicated refactoring, like breaking a method up into two, can also cause mock tests to fail trivially, telling you that yes indeed, you actually changed some code.

As you can tell, I am not a huge fan of extensive use of mock objects in the sense of specifying interactions. I believe that such objects can indeed be useful in specific cases, but that's not how I think about writing my code. When I write a test, I try to keep it simple and concentrate on what I can expect to happen as a result of running that test, not specifically what the execution path of the function under test will look like. There are of course cases where this type of interaction testing is useful. I think the scheduler example above is a good case in point: You want your test to make sure a method is called, but that's it; you're not interested in what the real implementation of that method may do. All in all, I tend to prefer to stick with simple hand-rolled stubs, fakes, and mocks in my TDD practice. Your mileage may vary. Martin Fowler has also written about the distinction between mocks and stubs/fakes.

I know that when I first encountered "mock objects", I had some trouble figuring out exactly what it meant and what all the fuss was about. If you've found this blog entry because you were experiencing the same confusion, I hope it's been of some help.


Monday, November 12, 2007

Even Grandmasters Get The Blues

I read an article some months ago about Vladimir Kramnik's incredible mistake in a game against the computer opponent Deep Fritz. Kramnik was at that point the undisputed world champion in chess - apparently the first person to achieve that status since Kasparov. In this game, however, he made a truly astounding blunder. Deep Fritz had made a move that threatened an immediate checkmate. Even someone like me, who barely knows the rules of chess, can see this fairly easily, but somehow Kramnik missed it. Here's a diagram of what happened:

[Diagram: the position after Deep Fritz's Qe4, threatening mate on h7.]
With Deep Fritz's queen at e4 threatening to checkmate at h7 with the very next move, Kramnik, instead of defending, blithely moved his queen from a7 to e3. In an instant, the game was over. Here is an account of the events:

Kramnik played the move 34...Qe3 calmly, stood up, picked up his cup and was about to leave the stage to go to his rest room. At least one audio commentator also noticed nothing, while Fritz operator Mathias Feist kept glancing from the board to the screen and back, hardly able to believe that he had input the correct move. Fritz was displaying mate in one, and when Mathias executed it on the board Kramnik briefly grasped his forehead, took a seat to sign the score sheet and left for the press conference, which he dutifully attended.

It's a fascinating thing. How could Kramnik have made such an error? There could be lots of explanations. He might have been ill or in a bad mood. He might have been experiencing a lot of stress because his opponent was a computer: It must be psychologically difficult to cope with the idea of an opponent with perfect memory who will never make a reading mistake or get tired; an opponent who cannot be intimidated with an aggressive move, or confused, or tricked. Nevertheless, the fact is that this mistake really did happen.

For me this might be an interesting example of a phenomenon I'll call the "expert's blunder," which can apply to anything, and in particular, to software development. A beginner in any field has a very small number of things he or she can keep track of. As one learns more and more about a given area, however, one has to think of an ever expanding tree of concepts and ideas. Anything you do as an expert carries more mental weight and is more difficult. The more you know, and the more tools available to you, the more effort is required to choose which tool or approach to use next. It becomes easier to overlook something completely simple that a beginner would spot right away. As Kramnik was playing this ill-fated game, I think he wasn't really looking at the current state of the board, but rather at his own mental picture, a cloud of variations and possibilities. He might have gotten out of sync with his place in the actual game.

I've noticed this kind of thing happening to me when I've found myself implementing a more complicated piece of code than was necessary. Here's a case I blogged about some time ago for example: My original blog entry, followed by my realization that my code was over-designed. Here I too was getting ahead of myself, thinking of classes and subclasses and interfaces instead of just solving the problem in the most straightforward way. I think this kind of mistake is a good example of why TDD (test-driven development) and pair programming are useful tools in software development. The more we focus on the simple step-by-step design process of TDD and the more we constantly subject our code to critique as we do in pair programming, the less likely we are to develop bloated and over-designed code - however well-intentioned we might have been in writing it in the first place. I think this also promotes the idea of pairing a seasoned veteran with a recent grad: The value flows in both directions, not just from the expert to the novice. Not only beginners can write bad code. By virtue of their experience, people with more knowledge can be just as susceptible. At the very least, it's safe to say that one should always pause and just "look at the board" every so often.

Saturday, November 10, 2007

Fun Introductory Software Projects

I've taught some evening/weekend courses in software development over the years. Usually I've taught adults, but sometimes I've had the chance to work with some talented kids too - from about 12 to 17 years of age. I have to admit, working with the kids was great. Someone recently suggested that I write up some of the projects that were done during these courses. I've always had students come up with their own ideas: I'm a big believer that if you come up with your own idea for something to work on, you'll be genuinely motivated. Therefore, I really suggest coming up with your own concept if you're looking for a project to work on... In case you're looking for some inspiration though, here are some of the types of things people in my courses have done. Have fun!

  • A version of the famous "snake" video game.
  • A really neat original 80's style arcade game which involved shooting a mothership and picking up space junk floating around on the screen to augment one's own spaceship. Let me tell you, it's quite an experience asking about how the swarming bullets were done and getting the reply "Oh, it's a very simple algorithm, but the swarming is emergent behaviour" from a 13/14 year old.
  • A simple contact manager which you can enter people's information and pictures into. This program used a file to store its data, so all of the storage/retrieval routines had to be written from scratch as opposed to using a database program, which proved quite instructive to the developer.
  • A very impressive network multi-player game along the lines of Warcraft (albeit much simpler). The programmer did good work with path-finding and making sure all players were in sync. The players were happy faces of different colors which became sad faces as you attacked them.
  • An online pizza-ordering application. I recall some good discussions about the user interface and whether it was ok to allow people to order who didn't want to enter in their credit-card information.
  • A chat program along the lines of ICQ/Instant Messenger
  • A "Towers of Hanoi" program in which the user could select the number of disks. The user could play the game him/herself or let the computer solve the puzzle. A challenging bonus (which in this case was not implemented) would be to incorporate a "hint for next move" feature.
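If the Towers of Hanoi project appeals to you, the heart of the computer-solver (and of a "hint for next move" feature) is a classic recursion. Here's a minimal Java sketch, with printing standing in for whatever a real program would do with each move:

public class Hanoi {
    // To move n disks from 'from' to 'to': move n-1 disks out of the
    // way onto the spare peg, move the largest disk, then move the
    // n-1 disks back on top of it.
    public static void solve(int n, char from, char to, char spare) {
        if (n == 0) {
            return;
        }
        solve(n - 1, from, spare, to);
        System.out.println("Move disk " + n + " from " + from + " to " + to);
        solve(n - 1, spare, to, from);
    }

    public static void main(String[] args) {
        solve(3, 'A', 'C', 'B'); // three disks, pegs A, B, C
    }
}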