Even tests have got to justify themselves
Let us get a few things out of the way:
- I am not using TDD.
- I am not using BDD.
- I am not using Test After.
- I am not ignoring testing.
I considered not posting this, because of the likely response, but it is something that I think is worth at least discussing. The event that made me decide to post it is the following bug:
public bool IsValid
{
    get { return string.IsNullOrEmpty(Url); }
}
As you can probably guess, I have an inverted conditional here. The real logic is that the filter is valid if the Url is not empty, not the other way around.
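The fix is simply the missing negation:

public bool IsValid
{
    get { return !string.IsNullOrEmpty(Url); }
}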
When I found the bug, I briefly considered writing a test for it, but it struck me as a bad decision. This is code that I don’t see any value in testing. It is too stupid to test, because I won’t have any ROI from the tests. And yes, I am saying that after seeing that the first time I wrote the code it had a bug.
The idea behind TDD is to use the tests to drive the design. Well, in this case, I don't have any design to drive. In recent years, I have moved away from the tenets of TDD toward a more system-oriented approach to testing.
I don't care about testing a specific class; I want to test the entire system as a whole. I may switch some parts of the infrastructure (for example, change the DB to an in-memory one) for performance's sake, but I usually try to test an entire component at a time.
My components may be as small as a single class or as big as the entire NH Prof sans the actual UI pieces. I have posted in the past showing how I implement features for NH Prof, including the full source code for the relevant sections. Please visit the link; it will probably make more sense to you afterward. It is usually faster, easier and more convenient to write a system test than to try to figure out how to write a unit test for the code.
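To make the shape of this concrete, here is a minimal sketch of what such a component-level test base might look like. The IDatabase, InMemoryDatabase and ComponentTestBase names are hypothetical stand-ins invented for illustration, not the actual NH Prof infrastructure:

using System;
using System.Collections.Generic;

// Hypothetical infrastructure seam; a real system would define its own.
public interface IDatabase
{
    void Save(object entity);
}

// Fast in-memory stand-in, used only by the tests.
public class InMemoryDatabase : IDatabase
{
    public readonly List<object> Saved = new List<object>();

    public void Save(object entity)
    {
        Saved.Add(entity);
    }
}

// Boots the component with its production wiring, except that the slow
// infrastructure is swapped out before anything runs.
public abstract class ComponentTestBase
{
    private readonly Dictionary<Type, object> services =
        new Dictionary<Type, object>();

    protected ComponentTestBase()
    {
        // Replace the real database with an in-memory one, purely for
        // speed; everything above this seam behaves as in production.
        Replace<IDatabase>(new InMemoryDatabase());
    }

    protected void Replace<T>(T implementation)
    {
        services[typeof(T)] = implementation;
    }

    protected T Resolve<T>()
    {
        return (T)services[typeof(T)];
    }
}

The point is that only the slow infrastructure edge is swapped; the component under test keeps its production wiring.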
Now, let us look at why people are writing tests:
- Higher quality code
- Safe from regressions
- Drive design
Well, as I said, I really like tests, but my method of designing software is no longer tied to a particular class. I have the design of the class handed to me by a higher authority (the concept), so that is out. Regressions are handled quite nicely using the tests that I do write.
What about the parts when I am doing design, when I am working on a new concept?
Well, there are two problems here:
- I usually try several things before I settle on a final design. During this bit of churn, it is going to take longer to do things with tests.
- After I have a design finalized, it is still easier to write a system level test than write unit tests for the particular implementation.
As a matter of fact, in many cases I don't really care about the implementation details of a feature; I just want to know that the feature works. As a good example, let us take a look at this test:
public class CanGetDurationOfQueries : IntegrationTestBase
{
    [Fact]
    public void QueriesSpecifyTheirDuration()
    {
        ExecuteScenarioInDifferentAppDomain<SelectBlogByIdUsingCriteria>();
        var first = model.RecentStatements
            .ExcludeTransactions()
            .First();
        Assert.NotNull(first.DurationViewModel.Inner.Value);
    }
}
NH Prof went through three different ways of measuring the duration of a query. The test didn't need to change. I have a lot of tests that work in the same manner: they specify the final intent, rather than each individual step.
There are some parts for which I would use Test First, usually parts that I have a high degree of uncertainty about. The “show rows from query” feature in NH Prof was developed using Test First, because I had absolutely no idea how to approach it.
But most of the time, I have a pretty good idea where I am and where I am going, and writing unit tests for every minuscule change is (for lack of a better phrase) hurting my style.
Just about any feature in NH Prof is covered in tests, and we are confident enough in our test coverage to release on every single commit.
But I think that even a test has got to justify its existence, and in many cases, I see people writing tests that have no real meaning. They duplicate the logic in a single class or method. But that isn’t what I usually care about. I don’t care about what a method or a class does.
I care about what the overall behavior is. And I shaped my tests to allow me to assert just that. I'll admit that NH Prof is somewhat of a special case, since you have a more or less central location from which you can navigate to everything else. In most systems, you don't have something like that.
But the same principle remains: if you set up your test environment so that you are testing the system, it is going to be much easier to test the system. That isn't a circular argument. Let us take a simple example: an online shop where we want to test the “email on order confirmed” feature.
One way of doing this would be to write a test saying that when the OrderConfirmed message arrives, a SendEmail message is sent, and another to verify that the SendEmail message actually sends an email.
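For contrast, here is a sketch of what that message-level style might look like; the OrderConfirmed, SendEmail, FakeBus and OrderConfirmedHandler types are all hypothetical, invented for illustration:

using System.Collections.Generic;
using Xunit;

// Hypothetical messages, bus and handler, for illustration only.
public class OrderConfirmed { public int OrderId; }
public class SendEmail { public string Subject; }

public class FakeBus
{
    public readonly List<object> SentMessages = new List<object>();
    public void Send(object message) { SentMessages.Add(message); }
}

public class OrderConfirmedHandler
{
    private readonly FakeBus bus;
    public OrderConfirmedHandler(FakeBus bus) { this.bus = bus; }

    public void Handle(OrderConfirmed message)
    {
        bus.Send(new SendEmail { Subject = "Order Confirmation" });
    }
}

public class OrderConfirmationTests
{
    [Fact]
    public void OrderConfirmedHandlerSendsEmailMessage()
    {
        var bus = new FakeBus();
        var handler = new OrderConfirmedHandler(bus);

        handler.Handle(new OrderConfirmed { OrderId = 42 });

        // This pins down an implementation detail: the handler reacts to
        // OrderConfirmed by putting a SendEmail message on the bus.
        Assert.Contains(bus.SentMessages, m => m is SendEmail);
    }
}

Notice how tightly such a test mirrors the handler: rename the message or reroute the email and the test breaks, even though the observable behavior (an email goes out) has not changed.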
I would rather write something like this, however:
[Fact]
public void WillSendEmailOnOrderConfirmation()
{
    // setup the system using an in memory bus
    // load all endpoints and activate them
    // execute the given scenario
    ExecuteScenario<BuyProductUsingValidCreditCard>();
    var confirmation = model.EmailSender.EmailsToSend
        .FirstOrDefault(x => x.Subject.Contains("Order Confirmation"));
    Assert.NotNull(confirmation);
}
I don’t care about implementation, I just care about what I want to assert.
But I think that I am getting sidetracked onto another subject, so I'll stop here and post about separating asserts from their scenarios at another time.
Comments
I think the only really controversial thing in here might be the statement #2 "I am not using BDD" :D
In my experience, effective BDD can rarely be achieved by testing 'units' in complete isolation b/c rarely do you get meaningful, testable 'behavior' out of a single 'unit' (taken to mean 'class'). Meaningful, observable, testable behavior usually requires some collaboration of multiple classes (you seem to be referring to this collection of collaborating classes as a 'component' here, I think).
If by 'doing BDD' we mean the developer self-flagellation that arises from meaningless adherence to the syntax of 'Given-When-Then' nonsense, then I think it's safe to say you're not 'doing BDD', but if we look at the intent behind much of BDD (testing actual meaningful behaviors of your SUT in such a way that trivial and meaningless changes in implementation during the design lifecycle don't break needlessly-brittle, low-level 'unit' tests), then in my book you're actually practicing BDD.
And this post is a great high-level overview of why more people should really be doing exactly that ;)
-Steve B.
As software developers, everything we do has to justify itself, and tests are no exception.
This is one of the tenants of BDD, if you're not driving value with what you're doing move on to something else.
Adding a test in your example isn't delivering business value so move on.
You seem to be outlining some of the reasons for the focus of BDD while saying you aren't using it.
Pragmatism will always win IMHO
O
Steve,
I have a post tomorrow talking about the concrete implementation.
I agree about the requirements for meaningful behavior.
The reason that I am rejecting BDD isn't because of the driving forces, but because of the way it is commonly portrayed and presented.
I think doing TDD gives great value in learning proper design techniques; you are kind of forced into it (if you are using the right tools). But I have to admit doing TDD quite easily gives brittle tests, where the implementation is hard to change. So behavioral tests are the better way to go; all I see here is that you do it at a higher level. Another thing I see is that experience matters: when you have the right experience you don't need to verify for yourself each and every piece of logic.
I had a good discussion about this during a geek beer: instead of setting something up in order to get the right internal state, you can just execute the actions to get to that state, then execute the action you want to test and verify the expected behavior. This made a lot of sense to me as well.
-Mark
You've hit on one of the universal programming principles: YMMV / It Depends. For you, you have the experience to know when you need tests and when you don't. When you're churning on one of several designs, you're essentially spiking a solution. Assuming you're either throwing away the code or writing some tests afterward to ensure it does what you expect (at whatever level - unit/integration/acceptance), I don't think too many people are going to think that's irresponsible or a bad practice. And as Mark wrote, it's easy to write brittle tests if you tie your tests too closely to your specific design/implementation. Tests, like code, can depend on abstractions and behavior, rather than mirroring the system under test (SUT). Thus, when the SUT changes, the tests can remain valid with minimal/no changes.
Oh, and since it was mentioned by more than one person between the post and comments, let me note that it's 'tenets' not 'tenants'. Tenants live in rented apartments. Tenets are principles.
I've got no problem with this on the condition that you're one of the few people out there whose experience and ability I trust to make the call when a test is necessary.
I honestly feel the "is this really worth it" doubt too and I experiment on my own projects but for anything mission critical I'm sticking close to the TDD/BDD script until I gain some more confidence.
+1
TDD and BDD usually force people to create a myriad of small, pointless unit tests that break after small changes in the core libraries. This creates a maintenance nightmare.
I've found it very useful to create higher-level integration/behavior tests. The general idea of what the software is intended to do rarely changes, hence these tests stay valid longer and provide a better safety net.
Hey wait, that's my line you stole!
Hi Ayende,
I get why you wouldn't write a check now. There are only two possible conditions, and now you know it is right.
To me, it's about value add. How much time did you have to spend tracking this down? How many of your customers were affected by it, and had to spend their time reporting it? And how much time would it have taken to have written a check which would have caught this?
And why didn't your system checks catch it? Did you modify those to catch this behavior? Because perhaps you did, and what you found was a small hole in your system checking logic.
There's plenty of times when I have made a decision not to write a check for something and it's bitten me. But it seems like what you wrote was, "I didn't write a test, and it bit me, and tests at a feature level are where it is at." I agree that your feature level checks are important - but that seems unrelated to not writing a check here.
What I'm really curious about is if you modified your system level checks to account for this, and if not, why not?
Uncle Bob will smack you on your hand!!
Hmm,
Well, your case is quite valid for a single programmer. It gets much worse when you have a team of more than 5 people handling the code. It's even worse when these five people are geo-distributed. 5 people, 5 different experience levels, cultural backgrounds, academic backgrounds. New people always coming and going.
For instance, one of the products I'm working on has lived several years, and over all these years lots of programmers have contributed to it. Some of them were not even programmers (mechanical engineers stuck in the software industry). So code introduced by such people works for them, and they know what they're doing, but it would probably live only as long as one release. It's only later in the release cycle that you start to find bugs, and when you try to solve them you hit a roadblock. Why? Because they didn't respect coding standards, design patterns, etc.
So TDD has one major advantage that I see. It lets you see things from the 'others' perspective before you even write code.
And trust me, regression is a HUGE issue. If you are the lone developer who knows his code-fu well, good. But that's an extremely ideal case.
Zaki Shaheen said it.
Cory,
It was an obvious issue, caught the first time that I ran the app. No user impact.
My system checks don't check stuff like that, for the reasons outlined in the post.
Eric,
I am a bit bigger :-)
Zaki,
I LOVE (not) it when people make assumptions about the context I am working in.
This is how we work for NH Prof, which has a team of 4 geo-distributed people and _every single build goes to customers_.
And did you SEE the numerous times that I referred to still having tests and using them for regression?
Anyway, this post should be marked MA (Mature Audience Only), so younger programmers won't use it as an excuse not to write unit tests because Ayende doesn't do it.
An interesting post. I know you said your system tests don't test for that sort of thing (the bug in question); I guess I'd be curious as to why (or even how).
At some point, the system tests should be calling some class that in turn calls that property which would always return the wrong value. At that point you'd think it would return an unintended result along the way.
Overall it seems that you are doing BDD without all the BDD-specific wrappers. Overall, I can't say I disagree with it too much. Changing a design and having it break 100 unit tests is a pain, but then again, that's more the exception than the rule.
Have you read msdn.microsoft.com/en-us/magazine/cc163665.aspx? The article contains a similar idea: not to test private members.
I'm not sure if I'm 100% following, but I definitely think balancing at what level to test is significant (and hard ... requires experience). At what level it turns to implementation details basically.
I know when I first started TDD I was making a ton of things public just so I could test it as a 'unit'. Looking back, this was all implementation details and it was actually making a worse design by adding a lot of public noise.
It also made refactoring hard because having tests on implementation devalues all tests -- i.e., if you can change the implementation and everything still works except your tests, you start to mistrust your tests.
I'm not sure how to know where to draw the line in the sand without experience though.
Hi, several people have made good points here. The issue that I see is (much less succinctly) the same as Alex Simkin's. TDD helps force you to write code that adheres to SOLID and other design principles. Once you're fluent in this sort of practice, indeed it does tend to seem a bit overkill. And if everyone on your team is good and trustworthy, then start to make adjustments. But if you have a team that does no testing and writes crap code (like a great deal of the web app shops), then TDD is an important principle to introduce, because it has a clear discipline and set of rules. Test everything. Once your code doesn't suck, then we can relax a bit.
Also... are you making smart use of IoC to help with testing? One reason you might test the order confirmation email differently is to be able to insert a mocked email sender.
If instead you configured IoC in your scenarios and then just manually pointed the IoC to the mocked email sender, or whatever you were interested in testing, you'd end up with fewer implementation concerns in your test.
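Something like this, say, with a hypothetical container API and a fake sender that just records what it was asked to send:

// Hypothetical container API and fake sender, for illustration only.
var container = new Container();
container.RegisterProductionDefaults();
container.Replace<IEmailSender>(new FakeEmailSender());

ExecuteScenario<BuyProductUsingValidCreditCard>(container);

var sender = (FakeEmailSender)container.Resolve<IEmailSender>();
Assert.NotEmpty(sender.SentEmails);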
Or am I missing something?
Gamer,
Yes, I did. I think that Roy and I are talking about different things here.
I'm not sure, but BDD seems to pretty much fit the bill for what you expect from tests without them getting in the way. The "churn" you refer to is really just a form of prototyping.
Purist TDD or BDD for that matter isn't for everyone. It can easily become a bit like "over-the-fence" where the process is expected to be followed for the sake of being followed. (But then some people find adherence to process to be comforting.)
Personally I use a form of BDD, but unlike TDD I don't adopt Red-Green-Refactor. Mine is more of an Amber-Green-Refactor. <:] I write code, think about it, maybe throw it out and rewrite it, think about it some more, and when I'm satisfied I write behaviour-based tests for the functionality. (The story I was working on.) It's Amber because usually the tests pass, but sometimes they fail, and that's a very good thing. From there, once the tests are passing, the code is safe(r) to refactor. Where I do use Red-Green-Refactor is when fixing production issues. Write the test to reproduce the bug, fix the bug, run all the tests. This is one area where I'm absolutely zealous about eliminating regression. Customers are typically tolerant of bugs, but one thing I've learned is that it really pisses them off when a bug regresses. Your name is mud if a "bug came back, the very next day, the bug came back, it wouldn't stay away...."
So I guess in your example case, write the unit test, and consider writing behaviour tests as you implement functionality. Yeah, it's an idiotic mistake to make and anyone can see it in hindsight, but a little discipline can help prevent those from reaching customers.
You're not the only one who thinks you shouldn't test that method. There's a cleverly-named project in the Java world called CRAP4J (http://www.crap4j.org/). It's an acronym, of course: Change Risk Analysis and Predictions.
It marries code coverage (a valuable testing statistic) with cyclomatic complexity. Untested complex code shows up big time. Something that isn't complex -- like the method you started with -- and also not tested isn't weighted much at all.
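If I remember the published formula correctly, it is CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m), where comp(m) is the cyclomatic complexity of method m and cov(m) is its test coverage percentage. Fully covered code scores just its complexity, while complex, uncovered code blows up fast; the IsValid getter above, with complexity 1 and zero coverage, would score a mere 1^2 * 1 + 1 = 2.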
I keep hoping someone will port it to .NET.
Steve,
No, it is not prototyping.
I don't care for the implementation details. And in fact, I change them on an as-needed basis, frequently.
That may be six months after the code was initially written.
Ayende,
Then that's refactoring, and there should be tests in place to ensure that the implementation you are changing doesn't have unexpected ramifications on behaviour.
When you're working on your own project and can take the stand "It will do what it does" with a disclaimer then the issue is moot.
When working against requirements tied to a contract, as part of a team, the risk of regression, changes in behaviour, and breaking something while "fixing something that isn't broken" justifies the assurances that a test-driven approach provides.
In hindsight this has gotten too pedantic. The issue isn't to test or not to test, it's the scope of how tests fit.
Still, I do believe that when a bug is uncovered, regardless of how idiotic, the first thing should be to assess why existing tests didn't cover it, and either update the tests to cover it or introduce a new test to cover it. Regardless of whether you use unit tests, behaviour tests, or choose to call your behaviour tests "scenario" tests. ;]
Steve,
No, it is not refactoring.
Refactoring means changing impl. and not behavior. I change _behavior_.
There are plenty of reasons to do stuff like that (performance, changing operation to make future changes easier, cleaner implementation, etc).
And note that I explicitly stated that I do have a regression suite; a regression suite and TDD have nothing to do with one another.
In WillSendEmailOnOrderConfirmation, you're asserting NotNull on a Where. I think you mean Any, Single or Default. Or am I missing something?
--Ruben Bartelink
Ruben,
You are right, fixed.
Some interesting comments about testing there, Ayende; the fact that it's come from you is going to give your opinion a lot of extra weight.
I'm glad you've written an article about this controversial topic, and it's good to see your opinion is factually founded and goes against the grain of the latest fad of TDD development.
I personally do not like TDD for designing new projects at all; I find it encourages poor architecture for complex systems.
I find it is only useful when adding/changing features in an existing system/code-base as is done in enterprise software development.
Over time I've adopted my own development style which I've labelled 'DDT' (Developer Driven Tests), although it does sound a lot like 'TestAfter'.
Having said that, I am finding BDD to be very useful; it has some very useful properties that other test styles don't have, i.e. a focus on business value, and by having it in a language that business users understand, they're able to readily see which of their requirements are being met and which ones aren't.
Ayende,
I have found the same thing over time. If things were too specific it could cause more work, so I tend to use a mix of techniques, TDD where appropriate, etc. A lot of my work is “systems integration” related, so the high-level tests have more value, change less, and become good regression tests post-delivery.
Thanks for sharing...
PK :-)