Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

More xUnit tweaks, dynamic test skipping


For a long time, xUnit’s developers have resisted adding support for skipping a test dynamically. You could create your own Fact class and handle that yourself, but that was quite a lot of work for something very specific.

In RavenDB, we needed to decide dynamically whether a test can run, based on the actual test situation, so I decided to add this to our xUnit fork. This turned out to be really simple to do. Just three lines of code :-)

https://github.com/ayende/xunit/commit/82accb4c850a3938187ac334fb73d6e81dc921e3#diff-3c2b9f2cb8392f32456d0bf81151b59fR57
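
The fork's actual change is in the commit linked above. As a rough, self-contained illustration of the general idea only (this is not the fork's code), a runner can treat one specific exception type as "skipped" rather than "failed". The names SkipTestException, TestOutcome and MiniRunner are hypothetical and exist only for this sketch:

```csharp
using System;

// Hypothetical exception: a test throws it to say "I cannot run in this environment".
public class SkipTestException : Exception
{
    public SkipTestException(string reason) : base(reason) { }
}

public enum TestOutcome { Passed, Failed, Skipped }

public static class MiniRunner
{
    // Runs a single test body and classifies the result.
    public static TestOutcome Run(Action test, out string message)
    {
        try
        {
            test();
            message = null;
            return TestOutcome.Passed;
        }
        catch (SkipTestException e)
        {
            // The test decided at run time that it cannot run: report it, don't fail.
            message = e.Message;
            return TestOutcome.Skipped;
        }
        catch (Exception e)
        {
            // Anything else is a genuine failure.
            message = e.Message;
            return TestOutcome.Failed;
        }
    }
}
```

A test body would then check its preconditions first and throw SkipTestException with a reason when they are not met.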

Tweaking xUnit


One of the interesting challenges that we have with RavenDB is the number and duration of our tests.

In particular, we currently have over three thousand tests, and they take hours to run. We are doing a lot of stuff there: “let us insert a million docs, write a map/reduce index, query on that, then do a mass update, see what happens”, etc. We are also doing a lot of stuff that really can’t be emulated easily. If I’m testing replication to a nonexistent target, I need to check the actual behavior, etc. Oh, and we’re probably doing silly stuff in there, too.

In order to shorten our feedback cycle, I made some modifications to xUnit. It now records the duration of each test, and the results look like this:

image

You can see that Troy is taking too long. In fact, those tests currently expose a bug that results in a timeout exception, which is why they take so long.

But this is just to demonstrate the issue. The real power here is that we also use this when deciding how to run the tests. We simply sort them by how long they took to run. If we don’t have a record for a test, we give it a run time of -1.

This has a bunch of interesting implications:

  • The faster tests are going to run first. That means that we’ll have earlier feedback if we broke something.
  • New tests (those that have never had a chance to run) will run first, and those are where the problems are more likely anyway.
  • We only record this for passing tests, which means that failed tests have no recorded duration and are going to run first as well.
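
The ordering rule described above can be sketched in a few lines. This is a minimal illustration, not the code in our xUnit fork; TestOrdering and its parameter names are made up for the example, and it assumes the recorded durations are available as a dictionary keyed by test name:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TestOrdering
{
    // recordedDurations holds the last known duration of each *passing* test.
    public static List<string> OrderByRecordedDuration(
        IEnumerable<string> testNames,
        IDictionary<string, TimeSpan> recordedDurations)
    {
        return testNames
            .OrderBy(name =>
            {
                TimeSpan duration;
                return recordedDurations.TryGetValue(name, out duration)
                    ? duration.TotalMilliseconds
                    : -1d; // no record: a new test, or one that failed last time
            })
            .ToList();
    }
}
```

Because tests without a record sort as -1, they land ahead of every test that has a real (non-negative) duration, which is what gives the three properties in the list.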

In addition to that, this will also give us better feedback on what our slow tests are. So we can actually give them some attention and see if they are really required to be slow or whether they can be fixed.

Hopefully, we can find a lot of the tests that are long, and just split them off into a separate test project, to be run at a later time.

The important thing is, now we have the information to handle this.

What does the test say?


Because RavenDB is a database, a lot of the tests we have to run are pretty long. For example, we need to touch the disk a lot, and we have a lot of networked tests.

That means that running this test suite can take a while. But the default information we get is pretty lousy. Just the test count and that is it. When a test hangs, and they do if we have bugs, that makes it very hard to figure out where the culprit is.

So we forked xunit and added a tiny feature to the console runner:

image

Defensive coding is your friend


We just had a failing test:

image

As you can see, we assumed that Fiddler is running when it isn’t. Here is the bug:

image

Now, this is great when I am testing things out, and want to check what is going on the wire using Fiddler, but I always have to remember to revert this change, otherwise we will have a failing test and a failing build.

That isn’t very friction free, so I added the following:

image

Now the code is smart enough to not fail the test if we didn’t do things right.
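
The actual fix is in the screenshot above. As a hedged sketch of the same kind of guard, assuming the debugging tweak rewrites localhost URLs to Fiddler's ipv4.fiddler alias, something like the following would only reroute traffic when Fiddler is really there. FiddlerSupport and both method names are hypothetical, and checking for a process named "Fiddler" is an assumption about how one might detect it:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class FiddlerSupport
{
    // Assumption: Fiddler shows up as a process named "Fiddler".
    public static bool IsFiddlerRunning()
    {
        return Process.GetProcessesByName("Fiddler").Any();
    }

    // Routes a localhost URL through Fiddler only when Fiddler is available;
    // otherwise the URL is returned untouched so the test still passes.
    public static string MaybeRouteThroughFiddler(string url, bool fiddlerRunning)
    {
        if (!fiddlerRunning)
            return url;
        return url.Replace("://localhost", "://ipv4.fiddler");
    }
}
```

Calling MaybeRouteThroughFiddler(url, FiddlerSupport.IsFiddlerRunning()) means that forgetting to revert the tweak no longer breaks the build.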

On failing tests


I made a change (deep in the guts of RavenDB), and then I ran the tests, and I got this:

image

I love* it when this happens, because it means that there is one root cause that I need to fix, really obvious and in the main code path.

I hate it when there is just one failing test, because it means that this is an edge condition or something freaky like that.

* obviously I would love it more if there were no failing tests.

Testing Rhino Service Bus applications


One of the really nice things about Rhino Service Bus applications is that we have created a structured way to handle inputs and outputs. You have messages coming in and out, as well as the endpoint local state to deal with. You don’t have to worry about how to deal with external integration points, because those are already going over messages.

And when you have basic input/output figured out, you are pretty much done.

For example, let us see the code that handles extending trial licenses in our ordering system:

public class ExtendTrialLicenseConsumer : ConsumerOf<ExtendTrialLicense>
{
    public IDocumentSession Session { get; set; }
    public IServiceBus Bus { get; set; }

    public void Consume(ExtendTrialLicense message)
    {
        var productId = message.ProductId ?? "products/" + message.Profile;
        var trial = Session.Query<Trial>()
            .Where(x => x.Email == message.Email && x.ProductId == productId)
            .FirstOrDefault();

        if (trial == null)
            return;
        
        trial.EndsAt = DateTime.Today.AddDays(message.Days);
        Bus.Send(new NewTrial
        {
            ProductId = productId,
            Email = trial.Email,
            Company = trial.Company,
            FullName = trial.Name,
            TrackingId = trial.TrackingId
        });
    }
}

How do we test something like this? As it turns out, quite easily:

public class TrailTesting : ConsumersTests
{
    protected override void PrepareData(IDocumentSession session)
    {
        session.Store(new Trial
        {
            Email = "you@there.gov",
            EndsAt = DateTime.Today,
            ProductId = "products/nhprof"
        });
    }

    [Fact]
    public void Will_update_trial_date()
    {
        Consume<ExtendTrialLicenseConsumer, ExtendTrialLicense>(new ExtendTrialLicense
        {
            ProductId = "products/nhprof",
            Days = 30,
            Email = "you@there.gov",
        });

        using (var session = documentStore.OpenSession())
        {
            var trial = session.Load<Trial>(1);
            Assert.Equal(DateTime.Today.AddDays(30), trial.EndsAt);
        }
    }

    // more tests here
}

All the magic happens in the base class, though:

public abstract class ConsumersTests : IDisposable
{
    protected IDocumentStore documentStore;
    private IServiceBus Bus = new FakeBus();

    protected ConsumersTests()
    {
        documentStore = new EmbeddableDocumentStore
        {
            RunInMemory = true,
            Conventions =
                {
                    DefaultQueryingConsistency = ConsistencyOptions.QueryYourWrites
                }
        }.Initialize();

        IndexCreation.CreateIndexes(typeof(Products_Stats).Assembly, documentStore);

        Products.Create(documentStore);
        using (var session = documentStore.OpenSession())
        {
            PrepareData(session);
            session.SaveChanges();
        }
    }

    protected T ConsumeSentMessage<T>()
    {
        var fakeBus = ((FakeBus)Bus);
        object o = fakeBus.Messages.Where(x => x.GetType() == typeof(T)).First();

        fakeBus.Messages.Remove(o);
        return (T) o;
    }

    protected void Consume<TConsumer, TMsg>(TMsg msg)
        where TConsumer : ConsumerOf<TMsg>, new()
    {
        var foo = new TConsumer();

        using (var documentSession = documentStore.OpenSession())
        {
            Set(foo, documentSession);
            Set(foo, Bus);
            Set(foo, documentStore);

            foo.Consume(msg);

            documentSession.SaveChanges();
        }
    }

    private void Set<T,TValue>(T foo, TValue value)
    {
        PropertyInfo firstOrDefault = typeof(T).GetProperties().FirstOrDefault(x=>x.PropertyType==typeof(TValue));
        if (firstOrDefault == null) return;
        firstOrDefault.SetValue(foo, value, null);
    }

    protected virtual void PrepareData(IDocumentSession session)
    {
    }

    public void Dispose()
    {
        documentStore.Dispose();
    }
}

And here are the relevant details for the FakeBus implementation:

public class FakeBus : IServiceBus
{
    public List<object>  Messages = new List<object>();

    public void Send(params object[] messages)
    {
        Messages.AddRange(messages);
    }
}

Now, admittedly, this is a fairly raw approach and we can probably do better. This is basically hand crafted auto mocking for consumers, and I don’t like the Consume<TConsumer,TMsg>() syntax very much. But it works, it is simple and it doesn’t really get in the way.

I won’t say it is the way to go about it, but it is certainly easier than many other efforts that I have seen. We just need to handle the inputs & outputs and have a way to look at the local state, and you are pretty much done.

Structuring your Unit Tests, why?


I am a strong believer in automated unit tests. And I read this post by Phil Haack with part amusement and part wonder.

RavenDB currently has close to 1,400 tests in it. We routinely ask for failing tests from users and fix bugs by writing tests to verify fixes.

But structuring them in terms of source code? That seems to be very strange.

You can take a look at the source code layout of some of our tests here: https://github.com/ayende/ravendb/tree/master/Raven.Tests/Bugs

It is a dumping ground, basically, for tests. That is, for the most part, I view tests as very important in telling me “does this !@#! work or not?” and that is about it. Spending a lot of time organizing them seems to be something of little value from my perspective.

If I need to find a particular test, I have R# code browsing to help me, and if I need to find who is testing a piece of code, I can use Find References to get it.

In the end, it boils down to the fact that I don’t consider tests to be, by themselves, a value to the product. Their only value is their binary ability to tell me whether the product is okay or not. Spending a lot of extra time on the tests distracts from creating real value: shippable software.

What I do deeply care about with regards to structuring the tests is the actual structure of the test. It is important to make sure that all the tests look very much the same, because I should be able to look at any of them and figure out what is going on rapidly.

I am not going to use the RavenDB example, because that is system software and usually different from most business apps (although we use a similar approach there). Instead, here are a few tests from our new ordering system:

[Fact]
public void Will_send_email_after_trial_extension()
{
    Consume<ExtendTrialLicenseConsumer, ExtendTrialLicense>(new ExtendTrialLicense
    {
        ProductId = "products/nhprof",
        Days = 30,
        Email = "you@there.gov",
    });

    var email = ConsumeSentMessage<NewTrial>();

    Assert.Equal("you@there.gov", email.Email);
}

[Fact]
public void Trial_request_from_same_customer_sends_email()
{
    Consume<NewTrialConsumer, NewTrial>(new NewTrial
    {
        ProductId = "products/nhprof",
        Email = "who@is.there",
        Company = "h",
        FullName = "a",
        TrackingId = Guid.NewGuid()
    });
    Trial firstTrial;
    using (var session = documentStore.OpenSession())
    {
        firstTrial = session.Load<Trial>(1);
    }
    Assert.NotNull(ConsumeSentMessage<SendEmail>());
    
    Consume<NewTrialConsumer, NewTrial>(new NewTrial
    {
        TrackingId = firstTrial.TrackingId,
        Email = firstTrial.Email,
        Profile = firstTrial.ProductId.Substring("products/".Length)
    });

    var email = ConsumeSentMessage<SendEmail>();
    Assert.Equal("Hibernating Rhinos - Trials Agent", email.ReplyToDisplay);
}

As you can probably see, we have a structured way to send input to the system, and we can verify the output and the side effects (creating the trial, for example).

This leads to a system that can be easily tested, but doesn’t force us to spend too much time in the ceremony of tests.

Async tests in Silverlight


One of the things that we do is build a lot of stuff in Silverlight. Usually, those things are either libraries or UI. Testing Silverlight was always a problem, but at least there is a solution OOTB for that.

Unfortunately, the moment that you start talking about async tests (for example, you want to run a web server to check things), you need to do things like this: EnqueueCallback, EnqueueConditional and other stuff that makes the test nearly impossible to read.

Luckily for us, Christopher Bennage stopped here for a while and created a solution.

It allows you to take the following sync test:

[Fact]
public void CanUpload()
{
    var ms = new MemoryStream();
    var streamWriter = new StreamWriter(ms);
    var expected = new string('a',1024);
    streamWriter.Write(expected);
    streamWriter.Flush();
    ms.Position = 0;

    var client = NewClient(); 
    client.UploadAsync("abc.txt", ms).Wait();

    var downloadString = webClient.DownloadString("/files/abc.txt");
    Assert.Equal(expected, downloadString);
}

And translate it to:

[Asynchronous]
public IEnumerable<Task> CanUpload()
{
    var ms = new MemoryStream();
    var streamWriter = new StreamWriter(ms);
    var expected = new string('a', 1024);
    streamWriter.Write(expected);
    streamWriter.Flush();
    ms.Position = 0;

    var client = NewClient();
    yield return client.UploadAsync("abc.txt", ms);

    var async = webClient.DownloadStringTaskAsync("/files/abc.txt");
    yield return async;

    Assert.AreEqual(expected, async.Result);
}

It makes things so much easier. To set this up, just reference the project and add the following in the App.xaml.cs file:

private void Application_Startup(object sender, StartupEventArgs e)
{
    UnitTestSystem.RegisterUnitTestProvider(new RavenCustomProvider());
    RootVisual = UnitTestSystem.CreateTestPage();
}

And you get tests that are now easy to write and run in Silverlight.
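
To make the IEnumerable<Task> trick less magical, here is a minimal sketch of how a runner can drive such a method: take the next yielded task, and only resume the iterator once that task has completed. This is an illustration of the technique, not the actual implementation behind RavenCustomProvider; TaskSequenceRunner is a name invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TaskSequenceRunner
{
    // Returns a task that completes when every yielded step has completed,
    // or faults as soon as the iterator or one of its steps fails.
    public static Task Run(IEnumerable<Task> steps)
    {
        var tcs = new TaskCompletionSource<bool>();
        MoveNext(steps.GetEnumerator(), tcs);
        return tcs.Task;
    }

    private static void MoveNext(IEnumerator<Task> enumerator, TaskCompletionSource<bool> tcs)
    {
        Task current;
        try
        {
            // Runs the test body up to its next "yield return" (asserts execute here).
            if (!enumerator.MoveNext())
            {
                enumerator.Dispose();
                tcs.TrySetResult(true);
                return;
            }
            current = enumerator.Current;
            if (current == null)
                throw new InvalidOperationException("A step yielded a null task.");
        }
        catch (Exception e)
        {
            enumerator.Dispose();
            tcs.TrySetException(e);
            return;
        }

        // Resume the iterator only after the yielded task has finished.
        current.ContinueWith(t =>
        {
            if (t.IsFaulted)
            {
                enumerator.Dispose();
                tcs.TrySetException(t.Exception.InnerExceptions);
            }
            else if (t.IsCanceled)
            {
                enumerator.Dispose();
                tcs.TrySetCanceled();
            }
            else
            {
                MoveNext(enumerator, tcs);
            }
        });
    }
}
```

The test framework then only has to wait on the task returned from Run and report its outcome, so the test itself reads top to bottom like the sync version.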

Cut the abstractions by putting test hooks


I have been hearing about testable code for a long time now. It looks somewhat like this, although I had to cut down on the number of interfaces along the way.

We go through a lot of contortions to be able to do something fairly simple: avoid hitting a data store in our tests.

Actually, that is inaccurate: we are putting a lot of effort into being able to do that without changing production code. There are even a lot of explanations of how testable code is decoupled, easy to change, etc.

In my experience, one common problem is that we put too much abstraction in our code. Sometimes it actually serves a purpose, but in many cases, it is just something that we do to enable testing. But we still pay the hit in the design complexity anyway.

We can throw all of that away, and keep only what we need to run production code, but that would mean that we would have a harder time with the tests. We can resolve the issue very easily by making the infrastructure aware of testing, such as in this example:

image

But now your production code was changed by tests?!

Yes, it was, so? I never really got the problem people had with this, but in this day and age, putting in the hooks to make testing easier just makes sense. Yes, you can go with the “let us add abstractions until we can do it” approach, but it is much cheaper and faster to go with this one.

Moreover, notice that this is part of the infrastructure code, which I don’t consider as production code (you don’t touch it very often, although of course it has to be production ready), so I don’t have any issue with this.
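
The example I showed is in the screenshot above. To give a feel for what such a hook can look like in general, here is a small, self-contained sketch. Every name in it (IDataStore, RealDataStore, InMemoryDataStore, DataStoreHolder, CreateStoreForTests) is hypothetical and invented for the illustration; it is not the code from the screenshot.

```csharp
using System;

// Hypothetical stand-ins for a real data store and an in-memory one.
public interface IDataStore
{
    string Describe();
}

public class RealDataStore : IDataStore
{
    public string Describe() { return "real server"; }
}

public class InMemoryDataStore : IDataStore
{
    public string Describe() { return "in memory"; }
}

// Infrastructure code that knows about testing.
public static class DataStoreHolder
{
    // Test hook: tests assign this before first use; production code never sets it.
    public static Func<IDataStore> CreateStoreForTests;

    private static IDataStore current;

    public static IDataStore Current
    {
        get
        {
            if (current == null)
            {
                current = CreateStoreForTests != null
                    ? CreateStoreForTests()
                    : new RealDataStore();
            }
            return current;
        }
    }

    // Lets each test start from a clean slate.
    public static void Reset()
    {
        current = null;
        CreateStoreForTests = null;
    }
}
```

A test sets DataStoreHolder.CreateStoreForTests to return an InMemoryDataStore and the rest of the code keeps calling DataStoreHolder.Current, with no extra interfaces threaded through the application just for the sake of testing.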

Nitpicker corner: Let us skip the whole TypeMock discussion, shall we?

TDD fanatic corner: I don’t really care about testable code, I care about tested code. If I have a regression suite, that is just what I need.

Hey you, developer! Learn a bit of business speak!


In a comment to one of my posts, Dave Mertens said:

I often wish I had the luxury the throw away the current code and start from scratch. Why. Because my manager ask me how long it takes to fix just this bug. 80% of the bugs (luckily we don't have that many bugs) are fixed within 2 hours including writing tests. Than he asks me how long it takes to rewrite the damn code.

It is interesting to note that the questions you ask select the answers you want.

When you give someone a choice between 2 hours of fixing code vs. 3 days of fixing code, the decision is always going to be in favor of 2 hours.

At one point in my life I was asked if I stopped beating up people. This is a twist on the old “have you stopped beating your wife yet? yes/no” question. I answered in Mu. Then I spent another ten minutes explaining what I meant.

When someone asks me those types of questions, I correct the question, before I answer. Because while you may ask questions to lead the answer in a certain way, there is nothing that says that I must go that way.

The right question is: How often are we hitting bugs in this same code path? Then ask how long it takes to fix the whole damn thing.

Oh, and make sure that you can fix it. There is nothing a business likes less than spending time & money on solving a problem and ending up not solving it. And that is leaving aside what it does to your credibility and the trust relationship with the business.
