Analyzing a performance problem – Is a prisoner dangerous?

time to read 12 min | 2292 words

Recently I ran into a performance problem in an application that I was reviewing, and I thought that it would make a great post. Since I can’t use the actual application model, I decided that I am tired of using the same old online shop model and turned to the one domain in which I am a domain expert. Prisons.

Let us imagine that this is part of the Prison 10’s* Dashboard. It looks pretty simple, right?

[Image: dashboard grid of the ten most recently arrived prisoners, including a Dangerous column]

Let us talk about this as SQL, ignoring all layers in the middle. We can express this as:

SELECT TOP 10 Prisoner.Id, Prisoner.FirstName, Prisoner.LastName, Prisoner.Status
FROM Prisoner JOIN Imprisonment on Prisoner.Id = Imprisonment.Prisoner
WHERE Imprisonment.Prison = 6 AND Imprisonment.IsCurrent = 1
ORDER BY Imprisonment.ArrivalDate DESC

Seems simple enough, right? But notice that there is no Prisoner.Dangerous column in the result. That is because there is no such column in the table. So what does our good developer do? He goes to the domain expert and asks him where he is supposed to get the data from. The domain expert’s answer is that this is not an attribute of the prisoner itself; it is a decision based on many factors, any of which can flag a prisoner as dangerous. He then gives the developer a list of a few of the common ones:

  • If the prisoner is not convicted
  • If the prisoner has a disciplinary notice in the last 2 months
  • If the prisoner has more than one disciplinary notice in the last 6 months
  • If the prisoner is charged with violent crime
  • If the prisoner is currently in withdrawal
  • If the prisoner’s request for vacation was denied in the last 3 months

As you can see, the list goes on (and on, and on). Our developer starts working on the problem. Since he is working against a small local database, he starts out with a spike that implements everything as in-memory filters. It is beautifully Object Oriented, and it looks like this:

public class NotConvicted : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.Status != PrisonerStatus.Convicted;
    }
}

public class DisciplinaryNoticeInLast2Months : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.DisciplinaryNotices
                .Any(x=>x.Date > DateTime.Today.AddMonths(-2));
    }
}

public class MoreThanOneDisciplinaryNoticeInLast6Months : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.DisciplinaryNotices
                .Where(x=>x.Date > DateTime.Today.AddMonths(-6))
                .Count() > 1;
    }
}

public class ChargedWithViolentCrime : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.Charges
                .Any(x=>x.IsViolent);
    }
}

public class CurrentlyInDrugWithdrawal : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.MedicalRecord
                .Withdrawals
                .Any(x=>x.IsCurrent);
    }
}

public class VacationRequestDeniedInLast3Months : IPrisonerDangerousEvaluator
{
    public bool Eval(Prisoner p)
    {
        return p.Requests
                .Where(x=>x.RequestType == RequestType.Vacation && x.Date > DateTime.Today.AddMonths(-3))
                .Any(x=>x.Status.Approved == false);
    }
}
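
For reference, the original snippet never shows the evaluator contract itself; given the calls above, it presumably looks something like this (the exact shape is my assumption):

// Assumed shape of the evaluator contract used by the classes above:
// each rule gets a loaded Prisoner and decides whether it flags him as dangerous.
public interface IPrisonerDangerousEvaluator
{
    bool Eval(Prisoner p);
}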

Now, this works, and from an OO perspective, it works just great. From a performance perspective, however, this is horrible. Let us do the math of how many queries this is going to generate, okay?

  • Get all prisoners – 1
    • For each prisoner – N(Prisoners)
      • NotConvicted – 0 (load data in the prisoner entity, which was already loaded)
      • DisciplinaryNoticeInLast2Months – 1 (load DisciplinaryNotices)
      • MoreThanOneDisciplinaryNoticeInLast6Months – 0 (DisciplinaryNotices already loaded)
      • ChargedWithViolentCrime – 1 (load Charges)
      • CurrentlyInDrugWithdrawal – 2 (load Medical Record, load Withdrawals)
      • VacationRequestDeniedInLast3Months – 2 (load Requests, load Status)

This isn’t SELECT N+1! This is so much worse. What we actually have is: SELECT 1 + N(1+1+2+2)

Or, in other words, assuming we want to display the first ten inmates, showing that little grid up above is going to result in 61 queries!
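
To make that number concrete, here is a hypothetical sketch of the dashboard code that composes the evaluators. The repository, its LoadCurrentPrisoners method, and the grid helper are names I made up for illustration; only the evaluators come from the code above.

// Hypothetical dashboard code, not taken from the application under review.
IList<Prisoner> prisoners = repository.LoadCurrentPrisoners(prisonId: 6, take: 10); // 1 query

IPrisonerDangerousEvaluator[] evaluators =
{
    new NotConvicted(),
    new DisciplinaryNoticeInLast2Months(),
    new MoreThanOneDisciplinaryNoticeInLast6Months(),
    new ChargedWithViolentCrime(),
    new CurrentlyInDrugWithdrawal(),
    new VacationRequestDeniedInLast3Months()
};

foreach (Prisoner prisoner in prisoners) // 10 rows in the grid
{
    // Every rule runs, and every rule that touches an unloaded collection fires
    // another query: DisciplinaryNotices (1), Charges (1), MedicalRecord +
    // Withdrawals (2), Requests + Status (2) = 6 more queries per prisoner.
    bool dangerous = false;
    foreach (IPrisonerDangerousEvaluator evaluator in evaluators)
        dangerous |= evaluator.Eval(prisoner);

    grid.AddRow(prisoner, dangerous); // assumed UI helper
}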

This actually shows us several important worst practices:

  • In-memory filtering should be avoided if possible.
  • Trying to ignore the realities of data access is going to bite you, hard.
  • There is a good reason for the call for aggregate roots and the avoidance of highly connected object graphs.

So, what is the obvious approach? We can change the way that we load the data. Instead of trying to go through the object model, we can query the data directly from the database. It is going to look something like this:

SELECT TOP 10 Prisoner.Id, Prisoner.FirstName, Prisoner.LastName, Prisoner.Status,
(
    SELECT 1 
    WHERE Prisoner.Status != 'Convicted'
    OR EXISTS (SELECT 1 FROM DisciplinaryNotices 
        WHERE DisciplinaryNotices.Prisoner = Prisoner.Id
        AND   DisciplinaryNotices.Date > getdate()-60)
    OR 1 < (SELECT COUNT(*) FROM DisciplinaryNotices 
        WHERE DisciplinaryNotices.Prisoner = Prisoner.Id
        AND   DisciplinaryNotices.Date > getdate()-180)
    OR  EXISTS (SELECT 1 FROM Charges
        WHERE Charges.Prisoner = Prisoner.Id
        AND Charges.IsViolent = 1)
    OR  EXISTS (SELECT 1 FROM MedicalRecords JOIN Withdrawals ON MedicalRecords.Id = Withdrawals.Record
        WHERE MedicalRecords.Prisoner = Prisoner.Id
        AND Withdrawals.IsCurrent = 1)
    OR  EXISTS (SELECT 1 FROM Requests JOIN RequestsStatus ON Requests.Id = RequestsStatus.Request
        WHERE Requests.Prisoner = Prisoner.Id
        AND   Requests.Date > getdate() - 90
        AND   RequestsStatus.Approved = 0
        AND   Requests.Type = 'Vacation')
) as Dangerous 
FROM Prisoner JOIN Imprisonment on Prisoner.Id = Imprisonment.Prisoner
WHERE Imprisonment.Prison = 6 AND Imprisonment.IsCurrent = 1
ORDER BY Imprisonment.ArrivalDate DESC

From the point of view of database access, we are in a much better position: we now have only a single query to execute. For that matter, we can change the IPrisonerDangerousEvaluator implementations to perform their evaluation in the database, retaining the nice OO model that we have. It would look something like this:

public class NotConvicted : IPrisonerDangerousEvaluator
{
    public void AddSelectionCriteria(DetachedCriteria crit)
    {
        crit.Add(Restrictions.NotEq("Status", PrisonerStatus.Convicted));
    }
}

public class DisciplinaryNoticeInLast2Months : IPrisonerDangerousEvaluator
{
    public void AddSelectionCriteria(DetachedCriteria crit)
    {
        crit.CreateCriteria("DisciplinaryNotices")
            .Add(Restrictions.Gt("Date", DateTime.Today.AddMonths(-2)));
    }
}

Note that doing something like this requires a fairly interesting infrastructure, to make sure that the results are consistent and that each rule isn’t stepping on the others’ toes.
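
One way to sketch that infrastructure is to have each rule hand back its own DetachedCriteria of matching prisoner ids, and then OR them together as subqueries, so that no rule can interfere with the joins or restrictions created by another. The interface shape and class names below are mine, not the application’s:

using System.Collections.Generic;
using NHibernate.Criterion;

// Sketch only: each rule builds an independent DetachedCriteria that projects
// the ids of the prisoners it flags as dangerous.
public interface IPrisonerDangerousRule
{
    DetachedCriteria MatchingPrisonerIds();
}

public class ChargedWithViolentCrimeRule : IPrisonerDangerousRule
{
    public DetachedCriteria MatchingPrisonerIds()
    {
        DetachedCriteria prisonersWithViolentCharges = DetachedCriteria.For<Prisoner>()
            .SetProjection(Projections.Id());
        prisonersWithViolentCharges
            .CreateCriteria("Charges")
            .Add(Restrictions.Eq("IsViolent", true));
        return prisonersWithViolentCharges;
    }
}

public static class DangerousPrisonerCriteria
{
    // "Prisoner matches at least one rule", built without any rule touching
    // the criteria created by another rule.
    public static ICriterion AnyRuleMatches(IEnumerable<IPrisonerDangerousRule> rules)
    {
        Disjunction anyRule = Restrictions.Disjunction();
        foreach (IPrisonerDangerousRule rule in rules)
            anyRule.Add(Subqueries.PropertyIn("Id", rule.MatchingPrisonerIds()));
        return anyRule;
    }
}

That is only one possible shape, of course; the point is that each rule contributes an isolated piece of the query instead of mutating shared criteria.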

Problem solved?

Not… really.

Take a look at the SQL that we have above. This query is hitting 8 tables, and remember that I explicitly stated that I kept the number of things that make a prisoner dangerous limited; there are about forty rules relating to this, not half a dozen. But even with just half a dozen, trying to optimize the above query is going to be… problematic.

This is where we figure out something really interesting. In most systems, the number of reads far outweighs the number of writes. The most basic optimization that we can do is to move work from the read phase of the application to the write phase, since the write phase executes far less often.

It seems like it would be a very simple solution for the application to execute the rules on save and set the IsDangerous property, right? Well, it would be, except that there are so many different places in the application that would need to do this. Remember, there are a lot of rules, and just about any change can affect them. So we would need to execute the rules whenever we change anything in the application.

Here, again, we see the importance of aggregates. Instead of spreading that logic all over the system, we can define Prisoner as the aggregate root and force all changes to it to occur through the entity. When saving a prisoner, we can then execute the rules.
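
As a rough sketch (the repository and the injected rule set are my names, not the application’s), the save path of the aggregate root becomes the single place where the rules run:

using System.Collections.Generic;
using System.Linq;
using NHibernate;

// Sketch only: because Prisoner is the aggregate root, saving it is the one
// choke point where the danger status needs to be recalculated.
public class PrisonerRepository
{
    private readonly ISession session;
    private readonly IEnumerable<IPrisonerDangerousEvaluator> rules;

    public PrisonerRepository(ISession session, IEnumerable<IPrisonerDangerousEvaluator> rules)
    {
        this.session = session;
        this.rules = rules;
    }

    public void Save(Prisoner prisoner)
    {
        // Re-run every rule on every save, so IsDangerous is always current...
        prisoner.IsDangerous = rules.Any(rule => rule.Eval(prisoner));
        session.SaveOrUpdate(prisoner);
    }
}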

However, there is another aspect: this can cause unacceptable performance when saving, since we expect the number of rules to grow and only become more complex over time. This looks like a good opportunity to shift work to a background process. There is also the matter that we have to process those rules again every day, to make sure that we re-evaluate any rules that depend on time.

I would build such a system using the following structure:

[Image: proposed architecture — the web server notifies a batch processing server, which evaluates the rules and updates the prisoners’ danger status]

The web server would send a notification to the batch processing server whenever a prisoner aggregate root is updated. The batch server will then process all the rules for that prisoner. Once a day, the batch server will also re-evaluate the rules for all prisoners. This solves the performance problem that we have when updating a prisoner, but it introduces a new one, a problem of consistency. Now there is going to be some delay between updating a prisoner and updating the IsDangerous status. In practice, I would expect the window of inconsistency to be on the order of the time it takes to process all the rules. I would be extremely surprised if it was more than a second, and probably a lot less.
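
A hypothetical sketch of the batch server side; the message type, the repository, and its methods are assumed names rather than actual infrastructure from the application:

using System.Collections.Generic;
using System.Linq;

// Sketch only: the web server publishes PrisonerChanged after saving the
// aggregate, and this handler re-runs the rules out of band.
public class PrisonerChanged
{
    public int PrisonerId { get; set; }
}

public class ReevaluateDangerStatus
{
    private readonly IPrisonerRepository repository;
    private readonly IEnumerable<IPrisonerDangerousEvaluator> rules;

    public ReevaluateDangerStatus(IPrisonerRepository repository,
                                  IEnumerable<IPrisonerDangerousEvaluator> rules)
    {
        this.repository = repository;
        this.rules = rules;
    }

    public void Handle(PrisonerChanged message)
    {
        Prisoner prisoner = repository.Load(message.PrisonerId);
        prisoner.IsDangerous = rules.Any(rule => rule.Eval(prisoner));
        repository.Save(prisoner);
    }

    // Scheduled once a day, so that time-based rules ("notice in the last
    // 2 months", "request denied in the last 3 months") expire correctly
    // even when nothing about the prisoner changes.
    public void ReevaluateAll()
    {
        foreach (int prisonerId in repository.AllPrisonerIds())
            Handle(new PrisonerChanged { PrisonerId = prisonerId });
    }
}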

In the meantime, we can show in the UI that the prisoner’s danger status is unknown. The UI can then use Ajax to poll the application periodically until the danger status becomes known again. From the user experience point of view, there would be almost no change, but in cases where the evaluation takes long enough for the user to notice, what is going on will be made explicit.
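
For the UI side, here is a minimal sketch of the endpoint the Ajax polling could hit, assuming an ASP.NET MVC style controller and a nullable IsDangerous where null means "not evaluated yet"; none of these names come from the actual application:

using System.Web.Mvc;

// Sketch only: the dashboard keeps polling this action via Ajax for any
// row whose status is still "Unknown".
public class DangerStatusController : Controller
{
    private readonly IPrisonerRepository repository;

    public DangerStatusController(IPrisonerRepository repository)
    {
        this.repository = repository;
    }

    public ActionResult DangerStatus(int prisonerId)
    {
        bool? isDangerous = repository.GetDangerStatus(prisonerId);

        string status = isDangerous == null ? "Unknown"
                      : isDangerous.Value   ? "Dangerous"
                                            : "Not dangerous";

        return Content(status);
    }
}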

Yes, it is a more complex solution, but it is also one that neatly avoids all the complications that we have run into during this post.

* I am sorry, but this post is going to be chock full of jokes that only I will probably understand.