Ayende @ Rahien


Modeling reference data in RavenDB

The question came up on the mailing list, and I thought it would be a good thing to post about.

How do you handle reference data in RavenDB? The typical example in most applications would be something like the list of states, their names and abbreviations. You might want to refer to a state by its abbreviation, but still be able to show the user its full name (in a dropdown of states, for example).

How do you handle this in RavenDB? Using a relational database, you would probably have a separate table just for states, and it is certainly possible to create something similar to that in RavenDB:

image

The problem is that this is a very wrong-headed approach for a document database. In a relational database, you have no choice but to treat anything that has many items as a table with many rows. In a document database, we have much better alternatives.

Instead of treating each state as an individual document, we can treat the whole set of states as a single value, like this:

image

In the sample, I included just a couple of states, but I think you get the idea. Note the name of this document: “Config/States”.
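
For illustration, here is a minimal sketch of what storing such a document could look like from C# (the StatesConfig class name and the specific states are my own example, not from the original post):

public class StatesConfig
{
    public string Id { get; set; }
    public Dictionary<string, string> States { get; set; }
}

using (var session = store.OpenSession())
{
    session.Store(new StatesConfig
    {
        Id = "Config/States", // a well known id, trivial to load and cache
        States = new Dictionary<string, string>
        {
            { "AL", "Alabama" },
            { "AK", "Alaska" }
        }
    });
    session.SaveChanges();
}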

What are the benefits of this approach?

  • We only have to retrieve a single document.
  • We are almost always going to treat the states as a list, never as individual items.
  • There is never a transactional boundary for modifying just a single state.

In fact, I would usually say that this isn’t good enough. I would try to centralize any/all of the reference data that we use in the application into a single document. There is a high likelihood that it would be a fairly small document even after doing so, and it would be insanely easy to cache.


Why I don’t like the MVC Mini Profiler codebase

image

I mentioned in passing, in the blog post about the MVC Profiler support for RavenDB, that I didn’t like its codebase.

There were three major reasons for that. First, I don’t like code like this:

image

This seems incredibly invasive; it is really not something that I want to force on my users. Considering how much time I have spent building and designing profilers, I have strong opinions about them. One of the things that I consider a huge selling point for my profilers is that they are pretty much fire and forget: once you initialize them, you can forget about all the rest.

Obviously, a general purpose profiler like this can’t do that, but I still don’t like it very much. And since we were talking about profiling RavenDB, I saw no reason to go to something like this.

Next, there is a pretty big focus on profiling SQL:

image

That is pretty meaningless for us when we are talking about profiling RavenDB.

And finally, there was the stack trace management. Providing stack trace information is incredibly useful for a profiler, and the MVC Mini Profiler supports that, but I didn’t like how it did it. Calling new StackTrace() is pretty expensive, and there are better ways of doing that.
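
The post doesn’t say what the better ways are, but as a hedged illustration of where the cost goes: a full trace resolves file and line information for every frame, while skipping frames (or grabbing a single frame) is much cheaper:

// expensive: walks the whole stack and resolves file / line info for each frame
var full = new StackTrace(true);

// cheaper: no file info, and skip the frames belonging to the profiler itself
var trimmed = new StackTrace(2, false);

// cheapest: when a single calling frame is enough
var frame = new StackFrame(2, false);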

One of the goals for the RavenDB profiler is to allow people to run it in production, same as I am doing right now in this blog, after all.

Mostly, the decision to go and write it from scratch was based on making it easier for us to build something that made sense for our purposes, not try to re-use something that was built for a completely different purpose.


Debugging Security Exceptions

One of the horrible things about SecurityException is just how little information you are given.

For example, let us take a look at:

image

Yes, that is informative. The real problem is that most security analysis is done at the method level, which means that _something_ in this method caused this problem. Which means that in order to debug this, I had to change my code to be like this:

image

Which gives me this:

image

Just to point out, First, Second, etc. are purely artificial method names, meant just to give me some idea about the problem areas.
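
A minimal sketch of what that splitting looks like (the method names and bodies are illustrative, not the actual code from the post):

public void DoWork()
{
    // security checks are evaluated per method, so splitting one big method
    // into steps makes the SecurityException's stack trace point at the
    // offending step instead of at the method as a whole
    First();
    Second();
    Third();
}

private void First()  { /* step 1, e.g. read a config file  */ }
private void Second() { /* step 2, e.g. open a registry key */ }
private void Third()  { /* step 3, e.g. hit the network     */ }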

Then we need to go into the code in the First method and figure out what the problem is there, and so on, and so forth.

Annoying, and I wish that I knew of a better way.


NHibernate & RavenDB Courses in New York, October/November 2011

I think that the NHibernate and RavenDB workshops at DevLink were awesome, and I have been asked to do more of them.

One thing that I learned in particular from the RavenDB workshop is that one day is just not enough, so we expanded the single day workshop into a two day course. I’ll be giving this course on November 3 – 4, 2011, in New York.

And, of course, there is my old companion, the Advanced NHibernate Course, for the people who want to grok NHibernate and really understand what is going on under the covers, and maybe even more importantly, why. That course will run October 31 – November 2, 2011, also in New York.

And if you are from the other side of the pond, I’ll be giving my Advanced NHibernate Course in Warsaw, Poland on October 17 – 19, 2011.

Register now for early bird pricing:

RavenDB Migrations: Rolling Updates

There are several ways to handle schema changes in RavenDB. When I am talking about schema changes, I am talking about changing the format of documents in production. RavenDB doesn’t have a “schema”, of course, but if your previous version of the application had a Name property for Customer, and your new version has FirstName and LastName, you need to have some way of handling that.

Please note that in this case I am explicitly talking about a rolling migration, not something that you need to do immediately.

We will start with the following code bases:

Version 1.0:

public class Customer
{
    public string Name {get;set;}
    public string Email {get;set;}
    public int NumberOfOrders {get;set;}
}

Version 2.0:

public class Customer
{
    public string FirstName {get;set;}
    public string LastName {get;set;}
    public string CustomerEmail {get;set;}
    public bool PreferredCustomer {get;set;}
}

As I said, there are several approaches, depending on exactly what you are trying to do. Let us enumerate them in order.

Removing a property – NumberOfOrders

As you can see, NumberOfOrders was removed from v1 to v2. In this case, there is absolutely no action required of us. The next time that this customer is loaded, the NumberOfOrders property will not be bound to anything. RavenDB will note that the document has changed (it is missing a property) and save it without the now invalid property. It is self-cleaning.

Adding a property – PreferredCustomer

In this situation, what we have is a new property, and we need to provide a value for it. If there isn’t any value for the property in the stored JSON, it won’t be set, which means that the default value (or the one set in the constructor) will be the one actually used. Again, RavenDB will note that the document has changed (it has an extra property) and save it with the new property. It is self-healing.

Modifying properties – Email –> CustomerEmail, Name –> FirstName, LastName

This is where things get annoying. We can’t rely on the default behavior for resolving this. Luckily, we have the extension points to help us.

public class CustomerVersion1ToVersion2Converter : IDocumentConversionListener
{
    public void EntityToDocument(object entity, RavenJObject document, RavenJObject metadata)
    {
        Customer c = entity as Customer;
        if (c == null)
            return;

        metadata["Customer-Schema-Version"] = 2;
        // preserve the old Name property, for now.
        document["Name"] = c.FirstName + " " + c.LastName;
        document["Email"] = c.CustomerEmail;
    }

    public void DocumentToEntity(object entity, RavenJObject document, RavenJObject metadata)
    {
        Customer c = entity as Customer;
        if (c == null)
            return;
        if (metadata.Value<int>("Customer-Schema-Version") >= 2)
            return;

        c.FirstName = document.Value<string>("Name").Split().First();
        c.LastName = document.Value<string>("Name").Split().Last();
        c.CustomerEmail = document.Value<string>("Email");
    }
}

Using this approach, we can easily convert between the two versions, including keeping the old properties in place in case we still need to be compatible with the old schema.
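
Wiring the converter up is just a matter of registering it with the document store; a minimal sketch, assuming the client’s RegisterListener extension point:

var store = new DocumentStore { Url = "http://localhost:8080" };
// every session opened from this store now runs the converter on load and save
store.RegisterListener(new CustomerVersion1ToVersion2Converter());
store.Initialize();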

Pretty neat, isn’t it?


RavenDB Migrations: When to execute?

When I am talking about migrations, I am talking about changing the format of documents in production. RavenDB doesn’t have a “schema”, of course, but if your previous version of the application had a Name property for Customer, and your new version has FirstName and LastName, you need to have some way of handling that.

Before getting around to discussing how to do this, I want to discuss when. There are basically two types of migrations when you are talking about RavenDB.

Immediate, shutdown – there is a specific cutoff point at which we will move from v1 to v2. That includes shutting down the application for the duration of the update. The shutdown is required because the v1 application expects to find a Name property, while the v2 application expects to find FirstName and LastName properties. If you try to run them both at the same time, you can get into a situation where each version of the application is fighting to enforce its own view. So the simplest solution is to shut down the application, convert the data and then start the new version.

Rolling update – your v2 application can handle the v1 format, which means that you can switch from the v1 app to the v2 app fairly rapidly, and whenever the v2 application encounters the old format, it will convert it for you to the new one, potentially preserving the old format for a certain period.

I’ll discuss exactly how to handle those in my next post.


Elegancy challenge: Cacheable Batches

Let us say that we have the following server code:

public JsonDocument[] GetDocument(string[] ids)
{
  var results = new List<JsonDocument>();
  foreach(var id in ids)
  {
    results.Add(GetDocument(id));
  }
  return results.ToArray();
}

This method is a classic example of batching requests to the server. For the purpose of our discussion, we have a client proxy that looks like this:

public class ClientProxy : IDisposable
{
  // the cache is here for you to use; making use of it is the challenge
  MemoryCache cache = new MemoryCache("client");

  public JsonDocument[] GetDocument(params string[] ids)
  {
    // make the request to the server, for example
    var request = WebRequest.Create("http://server/get?id=" + string.Join("&id=", ids));
    using(var stream = request.GetResponse().GetResponseStream())
    {
      return GetResults(stream);
    }
  }

  public void Dispose()
  {
    cache.Dispose();
  }
}

Now, as you can probably guess from the title and from the code above, the question relates to caching. We need to make the following pass:

using(var proxy = new ClientProxy())
{
  proxy.GetDocument("ayende", "oren"); // nothing in the cache, make request to server
  proxy.GetDocument("rhino", "hippo"); // nothing in the cache, make request to server
  proxy.GetDocument("rhino", "alligator"); // only request "alligator", "rhino" is in the cache
  proxy.GetDocument("rhino", "hippo");  // don't make any request, serve from cache
  proxy.GetDocument("rhino", "oren");   // don't make any request, serve from cache
}

The tricky part, of course, is to make this elegant. You can modify both server and client code.

I simplified the problem drastically, but one of the major benefits in the real issue was reducing the amount of data we have to fetch over the network, even for partially cached queries.
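
If you want a starting point, here is one possible direction (a sketch, not the challenge’s answer): split the requested ids into cached and missing, fetch only the missing ones, and merge the results:

public JsonDocument[] GetDocument(params string[] ids)
{
    var results = new Dictionary<string, JsonDocument>();
    var missing = new List<string>();

    foreach (var id in ids)
    {
        var cached = (JsonDocument)cache.Get(id);
        if (cached != null)
            results[id] = cached; // served locally, never hits the wire
        else
            missing.Add(id);
    }

    if (missing.Count > 0)
    {
        var request = WebRequest.Create("http://server/get?id=" + string.Join("&id=", missing));
        using (var stream = request.GetResponse().GetResponseStream())
        {
            foreach (var doc in GetResults(stream))
            {
                cache.Set(doc.Key, doc, new CacheItemPolicy());
                results[doc.Key] = doc;
            }
        }
    }

    // hand the documents back in the order they were asked for
    return ids.Select(id => results[id]).ToArray();
}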


Hibernating Rhinos is hiring

My company is hiring again, and since I had so much success in our previous hiring round, I decided to use this blog for the next one as well.

Some information that you probably need to know:

  • You have to know how to program in C#.
  • If you have OSS credentials, that is all for the better, but it is not required.
  • It is not a remote position, our offices are located in Hadera, Israel.

We are looking for developers, and are open to people with little to no commercial experience. I am looking for people who are passionate about programming, and I explicitly don’t list the whole alphabet soup that you usually see in a hiring notice. I don’t really care if you have / don’t have 5 years of experience in WCF 4.1 update 3. I care that you can think, that you can program your way out of problems, and that you also know when it is not a good idea to throw code at a problem.

If you are a reader of my blog, you probably have a pretty good idea about the sort of things that we are doing. But to be clear, we are talking about working on our flock of ravens (RavenDB, RavenFS and RavenMQ), our suite of profilers, as well as some other things that I won’t discuss publicly at this point.

What is required:

Commercial experience is not required, but I would want to see code that you wrote.

Additionally, we are still looking for interns, so if you are a student who would like to gain some real world commercial experience (and get paid for it, naturally), contact me.

Challenge: Recent Comments with Future Posts

We were asked to implement a comments RSS feed for this blog, and many people asked about the recent comments widget. That turned out to be quite a bit more complicated than it first appeared, and I thought it would make a great challenge.

On the face of it, it looks like a drop dead simple feature, right? Show me the last 5 comments in the blog.

The problem with that is that it ignores something that is very important to me: the notion of future posts. One of the major advantages of RaccoonBlog is that I am able to post stuff that goes into the queue (for example, this post was scheduled for about a month from the date of writing it), but still share that post with other people. Moreover, those other people can also comment on the post. To make things interesting, it is quite common for me to re-schedule posts, moving them from one date to another.

Given that complication, let us try to define how we want the Recent Comments feature to behave with regards to future posts. The logic is fairly simple:

  • Until the post is public, do not show the post comments.
  • If a post had comments on it while it was a future post, when it becomes public, the comments that were already posted there should also be included.

The last requirement is a bit tricky, so allow me to explain with an example, which luckily I have:

image

As you can see, this is a post that was created on the 1st of July, but was published on the 12th. Mike and I commented on the post shortly after it was written (while it was still hidden from the general public). Then, after it was published, grega_g and Jonty commented on it.

Now, let us assume that we query at 12 Jul, 08:55 AM: we will not get any comments from this post. But at 12 Jul, 09:01 AM, we should get both comments from this post. To make things more interesting, those should come before comments that were posted (chronologically) after them. Confusing, isn’t it? Again, let us go with a visual aid for explaining things.

In other words, let us say that we also have this:

image

Here is what we should see in the Recent Comments:

At 12 Jul, 2011 – 08:55 AM:

  1. 07/07/2011 07:42 PM – jdn
  2. 07/08/2011 07:34 PM – Matt Warren

At 12 Jul, 2011 – 09:05 AM:

  1. 07/03/2011 05:26 PM – Ayende Rahien
  2. 07/03/2011 05:07 PM – Mike Minutillo
  3. 07/07/2011 07:42 PM – jdn
  4. 07/08/2011 07:34 PM – Matt Warren

Note that the entries in the 1st and 2nd positions at 9:05 should sort after the 3rd and 4th (their comment dates are older), but they are sorted before them because of the post publish date, which we also take into account.

Given all of that, and regardless of the actual technology that you use, how would you implement this feature?
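
To make the required ordering concrete, here is a sketch of the sort key involved (illustrative types; this says nothing about making it efficient, which is the real challenge):

var now = DateTimeOffset.Now;

var recentComments = comments
    // comments on future posts stay hidden until the post goes public
    .Where(comment => comment.Post.PublishAt <= now)
    // a comment written while the post was still a future post sorts as if
    // it was written at the moment the post went public
    .OrderByDescending(comment => comment.CreatedAt > comment.Post.PublishAt
                                      ? comment.CreatedAt
                                      : comment.Post.PublishAt)
    .ThenByDescending(comment => comment.CreatedAt)
    .Take(5)
    .ToList();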

The horror of having to enforce SLAs

I am doing the NHibernate Course right now, and we got to the stage where everyone is tired and needs some comic relief, at which point I decided to show them the true meaning of an SLA.

image

If you commit this without telling anyone, you get to have so much fun when the computer suddenly starts screaming about those horribly inefficient pages.


RavenDB Lazy Requests

In my previous post, I discussed the server side implementation of lazy requests / Multi GET: the ability to submit several requests to the server in a single round trip. RavenDB has always supported the ability to perform multiple write operations in a single batch, but now we have the reverse functionality: the ability to make several reads at once. (The natural progression, the ability to make several read/write operations in a single batch, will not be supported.)

As it turned out, it was actually pretty hard to do, and it required us to do some pretty deep refactoring of the way we execute our requests, but in the end it is here and it is working. Here are a few examples of how it looks from the client API point of view:

var lazyUser = session.Advanced.Lazily.Load<User>("users/ayende");
var lazyPosts = session.Query<Posts>().Take(30).Lazily();

And up until now, there is actually nothing being sent to the server. The results of those two calls are Lazy<User> and Lazy<IEnumerable<Post>>, respectively.

The reason that we are using Lazy<T> in this manner is that we want to make it very explicit when you are actually evaluating the lazy request. All the lazy requests will be sent to the server in a single round trip the first time that any of the lazy instances is evaluated, or you can force this to happen by calling:

session.Advanced.Eagerly.ExecuteAllPendingLazyOperations();
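
Or, more commonly, you simply touch one of the values (a minimal usage sketch, continuing the example above):

// touching either value sends all pending lazy operations in one round trip
var user = lazyUser.Value;
var posts = lazyPosts.Value; // already materialized, no extra request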

In order to increase the speed even more, on the server side, we are actually going to process all of those queries in parallel. So you get several factors that would speed up your application:

  • Only a single remote call is made.
  • All of the queries are actually handled in parallel on the server side.

And the best part is that you usually don’t really have to think about it. If you use the Lazy API, it just works.


RavenDB Multi GET support

One of the annoyances of HTTP is that it is not really possible to make complex queries easily. To be rather more exact, you can make a complex query fairly easily, but at some point you’ll reach the URI length limit, and worse, there is no easy way to make multiple queries in a single round trip.

I have been thinking about this a lot lately, because it is a stumbling block for a feature that is near and dear to my heart, the Future Queries feature that is so useful when using NHibernate.

The problem was that I couldn’t think of a good way of doing this. Well, I could think of how to do this quite easily, to be truthful. I just couldn’t think of a good way to make this work nicely with the other features of RavenDB.

In particular, it was hard to figure out how to deal with caching. One of the really nice things about RavenDB’s RESTful nature is that caching is about as easy as it can be. But since we would need to tunnel requests through another medium for this to work, I couldn’t figure out how to make it work in a nice fashion. And then I remembered that REST doesn’t actually have anything to do with HTTP itself; you can do REST on top of any transport protocol.

Let us look at how requests are handled in RavenDB over the wire:

GET http://localhost:8080/docs/bobs_address

HTTP/1.1 200 OK

{
  "FirstName": "Bob",
  "LastName": "Smith",
  "Address": "5 Elm St."
}

GET http://localhost:8080/docs/users/ayende

HTTP/1.1 404 Not Found

As you can see, we have 2 request / reply calls.

What we did in order to make RavenDB support multiple requests in a single round trip is to build on top of this exact nature using:

POST http://localhost:8080/multi_get

[
   { "Url": "http://localhsot:8080/docs/bobs_address", "Headers": {} },
   { "Url": "http://localhsot:8080/docs/users/ayende", "Headers": {} },
]

HTTP/1.1 200 OK

[
  { "Status": 200, "Result": { "FirstName": "Bob", "LastName": "Smith", "Address": "5 Elm St." }},
  { "Status": 404 "Result": null },
]

Using this approach, we can handle multiple requests in a single round trip.
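
From the client side, issuing such a batch is a single POST; a rough sketch of the raw wire usage (not the actual RavenDB client API):

var request = WebRequest.Create("http://localhost:8080/multi_get");
request.Method = "POST";

using (var writer = new StreamWriter(request.GetRequestStream()))
{
    writer.Write(@"[
  { ""Url"": ""http://localhost:8080/docs/bobs_address"", ""Headers"": {} },
  { ""Url"": ""http://localhost:8080/docs/users/ayende"", ""Headers"": {} }
]");
}

using (var response = request.GetResponse())
using (var reader = new StreamReader(response.GetResponseStream()))
{
    // one HTTP round trip; an array of per-request results comes back
    var results = RavenJToken.Parse(reader.ReadToEnd());
}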

You might not be surprised to learn that it was actually very easy to do; we just needed to add an endpoint and have a way of executing the request pipeline internally. All very easy.

The really hard part was with the client, but I’ll touch on that in my next post.


Macto: Looking at warrants

After spending so much time talking about how important Warrants are, it is actually surprising to see the UI for a warrant:

image

It is pretty simple, because from a data entry perspective, there isn’t really much to it. It is the effects of the Warrants that make them such an interesting concept. One thing to note here is that the date that we care about for the Warrant isn’t the date it was issued, but the date the guy was actually arrested; that is where the clock starts ticking.

Adding a Warrant is also not that complex from a data entry perspective:

image

image

image

As you can see, thus far, it is pretty simple. But when you click the Finish button, the complexity starts.

We need to check that the Warrant is valid (issued by someone with the authority to do so), and then calculate the new duration for the Inmate.

And that is enough for today, I just wrote ~10 posts on Macto in the last 24 hours, it is time to do something else.


Answer: Modifying execution approaches

In RavenDB, we had this piece of code:

        internal T[] LoadInternal<T>(string[] ids, string[] includes)
        {
            if(ids.Length == 0)
                return new T[0];

            IncrementRequestCount();
            Debug.WriteLine(string.Format("Bulk loading ids [{0}] from {1}", string.Join(", ", ids), StoreIdentifier));
            MultiLoadResult multiLoadResult;
            JsonDocument[] includeResults;
            JsonDocument[] results;
#if !SILVERLIGHT
            var sp = Stopwatch.StartNew();
#else
            var startTime = DateTime.Now;
#endif
            bool firstRequest = true;
            do
            {
                IDisposable disposable = null;
                if (firstRequest == false) // if this is a repeated request, we mustn't use the cached result, but have to re-query the server
                    disposable = DatabaseCommands.DisableAllCaching();
                using (disposable)
                    multiLoadResult = DatabaseCommands.Get(ids, includes);

                firstRequest = false;
                includeResults = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Includes).ToArray();
                results = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Results).ToArray();
            } while (
                AllowNonAuthoritiveInformation == false &&
                results.Any(x => x.NonAuthoritiveInformation ?? false) &&
#if !SILVERLIGHT
                sp.Elapsed < NonAuthoritiveInformationTimeout
#else 
                (DateTime.Now - startTime) < NonAuthoritiveInformationTimeout
#endif
                );

            foreach (var include in includeResults)
            {
                TrackEntity<object>(include);
            }

            return results
                .Select(TrackEntity<T>)
                .ToArray();
        }

And we needed to take this same piece of code and execute it in:

  • Async fashion
  • As part of a batch of queries (sending multiple requests to RavenDB in a single HTTP call).

Everything else is the same, but in each case the marked line (the actual call to the server, DatabaseCommands.Get) is completely different.

I chose to address this by doing a Method Object refactoring. I created a new class, moved all the local variables to fields, and moved each part of the method to its own method. I also explicitly gave up control of execution, deferring that to whoever is calling us. We ended up with this:

    public class MultiLoadOperation
    {
        private static readonly Logger log = LogManager.GetCurrentClassLogger();

        private readonly InMemoryDocumentSessionOperations sessionOperations;
        private readonly Func<IDisposable> disableAllCaching;
        private string[] ids;
        private string[] includes;
        bool firstRequest = true;
        IDisposable disposable = null;
        JsonDocument[] results;
        JsonDocument[] includeResults;
                
#if !SILVERLIGHT
        private Stopwatch sp;
#else
        private    DateTime startTime;
#endif

        public MultiLoadOperation(InMemoryDocumentSessionOperations sessionOperations, 
            Func<IDisposable> disableAllCaching,
            string[] ids, string[] includes)
        {
            this.sessionOperations = sessionOperations;
            this.disableAllCaching = disableAllCaching;
            this.ids = ids;
            this.includes = includes;
        
            sessionOperations.IncrementRequestCount();
            log.Debug("Bulk loading ids [{0}] from {1}", string.Join(", ", ids), sessionOperations.StoreIdentifier);

#if !SILVERLIGHT
            sp = Stopwatch.StartNew();
#else
            startTime = DateTime.Now;
#endif
        }

        public IDisposable EnterMultiLoadContext()
        {
            if (firstRequest == false) // if this is a repeated request, we mustn't use the cached result, but have to re-query the server
                disposable = disableAllCaching();
            return disposable;
        }

        public bool SetResult(MultiLoadResult multiLoadResult)
        {
            firstRequest = false;
            includeResults = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Includes).ToArray();
            results = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Results).ToArray();

            return    sessionOperations.AllowNonAuthoritiveInformation == false &&
                    results.Any(x => x.NonAuthoritiveInformation ?? false) &&
#if !SILVERLIGHT
                    sp.Elapsed < sessionOperations.NonAuthoritiveInformationTimeout
#else 
                    (DateTime.Now - startTime) < sessionOperations.NonAuthoritiveInformationTimeout
#endif
                ;
        }

        public T[] Complete<T>()
        {
            foreach (var include in includeResults)
            {
                sessionOperations.TrackEntity<object>(include);
            }

            return results
                .Select(sessionOperations.TrackEntity<T>)
                .ToArray();
        }
    }

Note that this class doesn’t contain two very important things:

  • The actual call to the database, we gave up control on that.
  • The execution order for the methods, we don’t control that either.

That was ugly, and I decided that since I had to write another implementation as well, I might as well do the right thing and have a shared implementation. The key was to extract everything away except for the call to get the actual value. So I did just that, and we got a new class that does all of the functionality above, except controlling where and how the actual call to the server is made.

Now, for the sync version, we have this code:

internal T[] LoadInternal<T>(string[] ids, string[] includes)
{
    if(ids.Length == 0)
        return new T[0];

    var multiLoadOperation = new MultiLoadOperation(this, DatabaseCommands.DisableAllCaching, ids, includes);
    MultiLoadResult multiLoadResult;
    do
    {
        using(multiLoadOperation.EnterMultiLoadContext())
        {
            multiLoadResult = DatabaseCommands.Get(ids, includes);
        }
    } while (multiLoadOperation.SetResult(multiLoadResult));

    return multiLoadOperation.Complete<T>();
}

This isn’t the most trivial of methods, I’ll admit, but it is ever so much better than the alternative, especially since now the async version looks like:

/// <summary>
/// Begins the async multi load operation
/// </summary>
public Task<T[]> LoadAsyncInternal<T>(string[] ids, string[] includes)
{
    var multiLoadOperation = new MultiLoadOperation(this,AsyncDatabaseCommands.DisableAllCaching, ids, includes);
    return LoadAsyncInternal<T>(ids, includes, multiLoadOperation);
}

private Task<T[]> LoadAsyncInternal<T>(string[] ids, string[] includes, MultiLoadOperation multiLoadOperation)
{
    using (multiLoadOperation.EnterMultiLoadContext())
    {
        return AsyncDatabaseCommands.MultiGetAsync(ids, includes)
            .ContinueWith(t =>
            {
                if (multiLoadOperation.SetResult(t.Result) == false)
                    return Task.Factory.StartNew(() => multiLoadOperation.Complete<T>());
                return LoadAsyncInternal<T>(ids, includes, multiLoadOperation);
            })
            .Unwrap();
    }
}

Again, it isn’t trivial, but at least the core stuff, the actual logic that isn’t related to how we execute the code, is shared.

Macto: Talking to nasty people

Well, so far we have looked at the main screen and the Counting screen; it is time we introduce ourselves to the Inmate screen as well:

image

As you can see, there isn’t a lot going on here. We have the Inmate’s Id, name and location (including noting where he is located, if he is known to be outside the prison). We don’t usually care for detailed information in the normal course of things (the Inmate’s national id, for example). That information is important, but usually not relevant, so we relegate it to a separate screen.

The dates for incarceration and scheduled release are also important, but they aren’t available for editing, they are there only for information purposes.

The note is there to make sure that highly important information (such as whether the Inmate is suicidal or a flight risk) is clearly visible. The same is true for the Cliff Notes version of the Record.

It is there not for casual use, but to ensure that pertinent information is available. Most importantly, note the last line there. It means that if this Inmate is about to be released, we have to notify someone and get their approval for that. Well, approval is a strong word; we notify them that they need to give us a new Warrant for the Inmate, and we can delay releasing him (say, until midnight of the day he is to be released) while waiting for that Warrant.

Warrants are important, and you can see the ones that this guy has listed here. The last Warrant is the important one; the others are shown for completeness' sake and to ensure that we have continuous Warrants.

There are also several actions that we can perform on an Inmate. We can Transfer him to another prison, Release him or Add a Warrant. Each of those is a complex process of its own, and I’ll discuss them later on.


The customer is always right?

When you get this sort of an email, you almost always know that this is going to be bad:

image

Let us start with: Which product? What license key? What order? What do you expect me to do about it?

At least he is polite.

image

image

Hm, I wonder what is going on in here…

This error can occur because of a trial that has expired or a subscription that has not been renewed.

image

image

He attached a Trial license to this email.

image

image

It is like a Greek tragedy: you know that at some point this is going to arrive on the scene.

image

I mean, we explicitly added the notion of subscriptions to handle just such cases, of people who want to use the profiler just for a few days and don’t want to pay the full version price. And you can cancel that at any time, incurring no additional charges.

image

Sigh…

Macto: Counting is The Holy Grail

I might have mentioned before that Counting is somewhat important in prison. That is about as accurate as saying that you would somewhat prefer to keep breathing. Counting is the heartbeat of the prison, the thing that the entire operation revolves around.

Here we can see the counting screen:

image

You can see that we have an Open Count here, that is a count that is still in progress. Some cell blocks have reported their counts, and some are still in the process of making the count.

A Count is Closed when we know where all the Inmates are (when the two numbers line up properly). You probably noticed that there are two ways to report a count; the first is for Inmates who are outside the prison (court, hospital, etc.). Because they are outside the prison, we track them by name. For internal counts, we don’t really care about names, just the numbers.

There is another type of Count, a Named Count, which happens very rarely and is usually used to reconcile what is in the computer with what is actually in the cells.

It is important to understand the “Officer in charge” field: basically, he is the guy who has the legal responsibility and is signing off on those numbers. In other words, if there is something wrong, that guy is going to take a hard fall.


Macto: Getting Started, you never forget your first Inmate

People always love to start with CRUD, but it is almost never that simple. In this post, we will review the process required to accept an Inmate into the prison.

That process is composed of the following parts:

  1. Identification
    • Who is the guy?
    • Id number
    • Names
    • Photo
    • If we can’t identify him / he refuses to identify himself, we still need to be able to accept him.
  2. Lawful chain of incarceration
    • Go over all the documents for his arrest
    • Ensure that they are all in order
    • Ensure that they are continuous and valid
    • Check if there are any urgent things to do with him. For example, he may need to be at court today or the next day.
  3. Medical exam
    • Is it okay to hold the guy in prison?
    • If he is not healthy, can we take care of him in prison?
    • Does he require hospitalization?
    • Does he require medicine / treatment?
    • Are there any medical considerations regarding where to put him?
  4. Intelligence
    • Interviewing the guy
    • Report interesting details that are known about him
  5. Acceptability
    • Does he fit the level of Inmates we can accept? We usually don’t put murderers in minimum security prisons, for example.
    • Is there any medical reason to reject him?
    • Are there any problems with the incarceration documents?
    • Is there any intelligence warning about the guy?
  6. Placement
    • Decide where to put the Inmate
    • What type of an Inmate is he? (Just arrested, sentenced, sentenced for a long period, etc)
    • What is he in prison for?
    • What kind is he? (You want to avoid Inmate infighting, it creates paperwork, so you avoid putting Inmates in conflict if possible.)
    • Where is there room available?

Another important aspect to remember is that while we are allowed to reject invalid input (for example, we are allowed to say that the id number has to consist of only numeric characters), we are not allowed to reject input that is wrong.

What do I mean by that? Let us say that we have an Inmate at the door, and he doesn’t have his incarceration paperwork in order (well, not he, whoever brought him in, but you get the point). That means that legally, we can’t hold him. But Macto isn’t where things are actually happening; it is merely a support system that tracks what is going on in the real world. And the prison commander can decide to accept that guy anyway (say, because the paperwork is en route), and we have to allow for that. If we try to stop people from doing this, it is going to be worked around, and we don’t want that. The system is allowed, even encouraged, to warn the users when they are doing something wrong, but it cannot block them, as the sketch below illustrates.
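
A minimal sketch of what “warn, but never block” might look like in code (the types here are illustrative, not Macto’s actual model):

public class AcceptResult
{
    public List<string> Warnings = new List<string>();
}

public AcceptResult AcceptInmate(Dossier dossier)
{
    var result = new AcceptResult();

    // validation produces warnings, never exceptions; the real world has
    // already happened, the system only records it
    if (dossier.HasContinuousWarrants == false)
    {
        result.Warnings.Add("Incarceration paperwork is not in order");
        dossier.Flagged = true; // surfaced front and center on the main screen
    }

    roster.Add(dossier.Inmate); // the accept itself always goes through
    return result;
}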

The first part, Identification, is actually pretty easy, all told. This is a fairly simple data entry process. We’ll want to do some checkups on the data, such as checking that the id number is valid or checking the id against the name, etc. But we basically have to have some level of trust in the documents that we have. You usually don’t have an arrest warrant for “tall guy in brown shirt”. If we find any problems there, we can Flag the Dossier as having a potentially fraudulent name. This is also the stage where we want to check if the Inmate is a returning visitor, and bring the old Dossier back to life.

The second part is more complex, because there are many different types of Warrants, each with their own implications. An Arrest Warrant is valid for 24 hours, a Remand Warrant is good until sentencing, etc. We need to input all of those Warrants, and ensure that they are consistent, valid and continuous. If there is a problem with that, we need to Flag the Dossier, but we can’t reject it. We will discuss this in more detail in the next post.

The third part is basically external to Macto; we may need to provide the Inmate ID, for correlation purposes, but nothing beyond that. We do need to get approval from the doctor that the Inmate is in an OK condition to be held in the prison. That does get recorded in Macto.

The fourth part is, again, external to us. Usually any such information is classified and wouldn’t appear in Macto. We may get some intelligence brief about the guy, but usually we won’t.

The fifth part is important; this is where we actually take legal and physical ownership of the Inmate. Up until that point, we had him in our hands, but we weren’t responsible for him. Accepting the Inmate is a simple matter if everything is good, but if the Dossier was Flagged, we might need approval from the officer in charge. Accepting an Inmate means that he is added to the prison’s Roster.

The sixth part is pretty much “where do I have a spare bed”, after which he is added to the Roster of the cell block that now has him in its care.

It is important to note that Placement always happens. Even if immediately after Accepting an Inmate you rushed him to the hospital, that Inmate still has to be assigned to a cell block, because that assignment means that the cell block commander is in charge of him. That way we avoid potential mishaps where an Inmate is assigned to no one and doesn’t get Counted.

Okay, I think that this is enough for now. In the next post, we will discuss what exactly goes on in the second part; it is a pretty complex piece, and it deserves its own post.


RavenDB implementation frustrations

While working on the RavenDB server is usually a lot of fun, there is a part of the RavenDB client that I absolutely abhor. Every time that I need to touch the part of the client API that talks to the server, it is a pain. Why is that?

Let us take the simplest example, loading a document by id. On the wire, it looks like this:

GET /docs/users/ayende

It can’t be simpler than that, except that internally in RavenDB, we have three implementations of that:

  • Standard sync impl
  • Standard async impl
  • Silverlight async impl

The problem is that each of those uses a different API, and while we created a shared abstraction for async / sync, at least, it is hard to make sure that they all stay in sync with one another when we need to make a modification.

For example, let us look at the three implementations of Get:

public JsonDocument DirectGet(string serverUrl, string key)
{
    var metadata = new RavenJObject();
    AddTransactionInformation(metadata);
    var request = jsonRequestFactory.CreateHttpJsonRequest(this, serverUrl + "/docs/" + key, "GET", metadata, credentials, convention);
    request.AddOperationHeaders(OperationsHeaders);
    try
    {
        var requestString = request.ReadResponseString();
        RavenJObject meta = null;
        RavenJObject jsonData = null;
        try
        {
            jsonData = RavenJObject.Parse(requestString);
            meta = request.ResponseHeaders.FilterHeaders(isServerDocument: false);
        }
        catch (JsonReaderException jre)
        {
            var headers = "";
            foreach (string header in request.ResponseHeaders)
            {
                headers = headers + string.Format("\n\r{0}:{1}", header, request.ResponseHeaders[header]);
            }
            throw new JsonReaderException("Invalid Json Response: \n\rHeaders:\n\r" + headers + "\n\rBody:" + requestString, jre);
        }
        return new JsonDocument
        {
            DataAsJson = jsonData,
            NonAuthoritiveInformation = request.ResponseStatusCode == HttpStatusCode.NonAuthoritativeInformation,
            Key = key,
            Etag = new Guid(request.ResponseHeaders["ETag"]),
            LastModified = DateTime.ParseExact(request.ResponseHeaders["Last-Modified"], "r", CultureInfo.InvariantCulture).ToLocalTime(),
            Metadata = meta
        };
    }
    catch (WebException e)
    {
        var httpWebResponse = e.Response as HttpWebResponse;
        if (httpWebResponse == null)
            throw;
        if (httpWebResponse.StatusCode == HttpStatusCode.NotFound)
            return null;
        if (httpWebResponse.StatusCode == HttpStatusCode.Conflict)
        {
            var conflicts = new StreamReader(httpWebResponse.GetResponseStreamWithHttpDecompression());
            var conflictsDoc = RavenJObject.Load(new JsonTextReader(conflicts));
            var conflictIds = conflictsDoc.Value<RavenJArray>("Conflicts").Select(x => x.Value<string>()).ToArray();

            throw new ConflictException("Conflict detected on " + key +
                                        ", conflict must be resolved before the document will be accessible")
            {
                ConflictedVersionIds = conflictIds
            };
        }
        throw;
    }
}
 

This is the sync API, of course, next we will look at the same method, for the full .NET framework TPL:

public Task<JsonDocument> GetAsync(string key)
{
    EnsureIsNotNullOrEmpty(key, "key");

    var metadata = new RavenJObject();
    AddTransactionInformation(metadata);
    var request = jsonRequestFactory.CreateHttpJsonRequest(this, url + "/docs/" + key, "GET", metadata, credentials, convention);

    return Task.Factory.FromAsync<string>(request.BeginReadResponseString, request.EndReadResponseString, null)
        .ContinueWith(task =>
        {
            try
            {
                var responseString = task.Result;
                return new JsonDocument
                {
                    DataAsJson = RavenJObject.Parse(responseString),
                    NonAuthoritiveInformation = request.ResponseStatusCode == HttpStatusCode.NonAuthoritativeInformation,
                    Key = key,
                    LastModified = DateTime.ParseExact(request.ResponseHeaders["Last-Modified"], "r", CultureInfo.InvariantCulture).ToLocalTime(),
                    Etag = new Guid(request.ResponseHeaders["ETag"]),
                    Metadata = request.ResponseHeaders.FilterHeaders(isServerDocument: false)
                };
            }
            catch (WebException e)
            {
                var httpWebResponse = e.Response as HttpWebResponse;
                if (httpWebResponse == null)
                    throw;
                if (httpWebResponse.StatusCode == HttpStatusCode.NotFound)
                    return null;
                if (httpWebResponse.StatusCode == HttpStatusCode.Conflict)
                {
                    var conflicts = new StreamReader(httpWebResponse.GetResponseStreamWithHttpDecompression());
                    var conflictsDoc = RavenJObject.Load(new JsonTextReader(conflicts));
                    var conflictIds = conflictsDoc.Value<RavenJArray>("Conflicts").Select(x => x.Value<string>()).ToArray();

                    throw new ConflictException("Conflict detected on " + key +
                                                ", conflict must be resolved before the document will be accessible")
                    {
                        ConflictedVersionIds = conflictIds
                    };
                }
                throw;
            }
        });
}

And here is the Siliverlight version:

public Task<JsonDocument> GetAsync(string key)
{
    EnsureIsNotNullOrEmpty(key, "key");

    key = key.Replace("\\",@"/"); //NOTE: the presence of \ causes the SL networking stack to barf, even though the Uri seemingly makes this translation itself

    var request = url.Docs(key)
        .ToJsonRequest(this, credentials, convention);

    return request
        .ReadResponseStringAsync()
        .ContinueWith(task =>
        {
            try
            {
                var responseString = task.Result;
                return new JsonDocument
                {
                    DataAsJson = RavenJObject.Parse(responseString),
                    NonAuthoritiveInformation = request.ResponseStatusCode == HttpStatusCode.NonAuthoritativeInformation,
                    Key = key,
                    LastModified = DateTime.ParseExact(request.ResponseHeaders["Last-Modified"].First(), "r", CultureInfo.InvariantCulture).ToLocalTime(),
                    Etag = new Guid(request.ResponseHeaders["ETag"].First()),
                    Metadata = request.ResponseHeaders.FilterHeaders(isServerDocument: false)
                };
            }
            catch (AggregateException e)
            {
                var webException = e.ExtractSingleInnerException() as WebException;
                if (webException != null)
                {
                    if (HandleWebExceptionForGetAsync(key, webException))
                        return null;
                }
                throw;
            }
            catch (WebException e)
            {
                if (HandleWebExceptionForGetAsync(key, e))
                    return null;
                throw;
            }
        });
}

Did I mention it is annoying?

All of those methods are doing the exact same thing, but I have to maintain 3 versions of them. I thought about dropping the sync version (it is easy to do sync on top of async), which would mean that I would have only 2 implementations, but I don’t like it. The error handling and debugging support for async is still way below what you can get for sync code.

I don’t know if there is a solution for this, I just know that this is a big pain point for me.


Macto: The Main Screen

I usually like to think about the responsibilities of the system by showing the UI. It is a great way to communicate with both customers and developers.

Here is the main screen for the application:

image

This is actually a bad place to start, in terms of coding start points, because it requires so many other things as well. This particular screen is likely to be viewed pretty much everywhere; it is what the prison commander and the cell block commanders have at their desktops, and what the officers are using to do their daily work (usually for Counting, admittedly).

Even before we can start, it reveals quite a lot about the actual way things work. In this screen, we can see that we have Flagged Dossiers; those are Inmates that have some problem with their Dossier. We can accept Inmates with problematic Dossiers, but we want to fix that as soon as possible, so we make this something that is very much front and center.

The “Action required” section details Inmates that we have to take some action about, whether it is a court date that this Inmate has to be at, his sentence ending, or a warrant that needs extending.

Finally, and most importantly, we have the counts, which are the most important thing in the prison. You can see that the numbers at the bottom line up. If they don’t, we have A Problem.


Entity Framework 4.1 Update 1, Backward Compatibility and Microsoft

One of the things that you keep hearing about Microsoft products is how much time and effort is dedicated to ensuring Backward Compatibility. I have had a lot of arguments with Microsoft people about certain design decisions that they have made, and usually the argument ended up with “We have to do it this way to ensure Backward Compatibility”.

That sucked, but at least I could sleep soundly, knowing that if I had software running on version X of a Microsoft product, I could at least be pretty sure that it would work in X+1.

Until Entity Framework 4.1 Update 1 shipped, that is. Frans Bouma has done a great job in describing what the problem actually is, including going all the way and actually submitting a patch with the code required to fix this issue.

But basically, starting with Entity Framework 4.1 Update 1 (the issue also appears in the Entity Framework 4.2 CTP, but I don’t care about that much now), you can’t use generic providers with Entity Framework. Just to explain, a generic provider is pretty much the one way that you can integrate with Entity Framework if you want to write a profiler or a cache or a… you get the point.

Looking at the code, it is pretty obvious that this is not intentional, but just someone deciding to implement a method without considering the full ramifications. When I found out about the bug, I tried figuring out a way to resolve it, but the only workaround would require me to provide a different assembly for each provider that I wanted to support (and there are dozens that we support on EF 4.0 and EF 3.5 right now).

We have currently implemented a workaround for SQL Server only, but if you want to use Entity Framework Profiler with Entity Framework 4.1 Update 1 and a database other than SQL Server, we would have to create a special build for your scenario, rather than have you use the generic provider method, which has worked so far.

Remember the beginning of this post? How I said that Backward Compatibility is something that Microsoft is taking so seriously?

Naturally I felt that this (a bug that impacts anyone who extends Entity Framework in such a drastic way) is an important issue to raise with Microsoft. So I contacted the team with my findings, and the response that I got was basically: use the old version if you want this supported.

What I didn’t get was:

  • Any addressing of the issue of a breaking change of this magnitude that isn’t even in a major release; it is an update to a minor release.
  • Whether they are even going to fix it, and when the fix is going to be out.
  • Whether the next version (the 4.2 CTP also has this bug) is going to carry this issue forward or not.

I find this unacceptable. The fact that there is a problem with a CTP is annoying, but not that important. The fact that a fully released package has this backward compatibility issue is horrible.

What makes it even worse is the fact that this is an update to a minor version; people expect this to be a safe upgrade, not something that requires full testing. And anyone who is going to be running Update-Package in VS is going to hit this problem, and Update-Package is something that people do very often. And suddenly, Entity Framework Profiler can no longer work.

Considering the costs that the entire .NET community has to bear in order for Microsoft to preserve backward compatibility, I am deeply disappointed that when I actually need this backward compatibility, it isn’t there.

This is from the same guys that refused (for five years!) to fix this bug:

new System.ComponentModel.Int32Converter().IsValid("yellow") == true

Because, and I quote:

We do not have the luxury of making that risky assumption that people will not be affected by the potential change. I know this can be frustrating for you as it is for us. Thanks for your understanding in this matter.

Let me put this in perspective: anyone who is using EF Prof is likely (almost by chance) to get hit by that. When this happens, our only option is to tell them to switch back a version?

That makes us look very bad, regardless of the actual reason for it. It means that I have to undermine my users' trust in my product: "He isn't supporting 4.1, and he might not support 4.2, so we probably can't buy his product, because we want to have the newest version."

This is very upsetting. Not so much about the actual bug, those happen, and I can easily imagine the guy writing the code making assumptions that turn out to be false. Heaven knows that I have done the same many times before. I don’t even worry too much about this having escaped the famous Microsoft Testing Cycle.

What (to be frank) pisses me off is that the response that we got from Microsoft for this was that they aren’t going to fix it, and that the only choice that I have is to tell people to downgrade if they want to use my product (with all the implications that has for my product).


RavenDB Session Management with NServiceBus

I got a few questions about this in the past, and I thought that I might as well just post how I recommend dealing with RavenDB session management for NServiceBus:

public class RavenSessionMessageModule : IMessageModule
{
  ThreadLocal<IDocumentSession> currentSession = new ThreadLocal<IDocumentSession>();

  IDocumentStore store;
  
  public RavenSessionMessageModule(IDocumentStore store)
  {
    this.store = store;
  }

  public IDocumentSession CurrentSession
  {
    get{ return currentSession.Value; }
  }

  public void HandleBeginMessage()
  {
    // a new session per message, stored per thread
    currentSession.Value = store.OpenSession();
  }

  public void HandleEndMessage()
  {
    // the message was handled successfully: flush the unit of work
    using(var old = currentSession.Value)
    {
      currentSession.Value = null;
      old.SaveChanges();
    }
  }

  public void HandleError()
  {
    // the message failed: dispose the session without saving anything
    using(var old = currentSession.Value)
      currentSession.Value = null;
  }
}

When any of your message handlers needs to get the session, you need to wire the container in such a way that it provides RavenSessionMessageModule.CurrentSession to it, along the lines of the sketch below.
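
For example, with Castle Windsor the wiring might look something like this (a sketch; any of the containers NServiceBus supports can do the equivalent):

container.Register(
    Component.For<RavenSessionMessageModule>().LifeStyle.Singleton,
    Component.For<IDocumentSession>()
        // hand out the module's current per-message session on each resolve
        .UsingFactoryMethod(kernel => kernel.Resolve<RavenSessionMessageModule>().CurrentSession)
        .LifeStyle.Transient);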


Advanced NHibernate Course–Warsaw, October 2011

I was very pleasantly surprised by the overwhelmingly positive response to the idea of running the NHibernate course in Poland, which is why it gives me great pleasure to announce that I’ll be giving my Advanced NHibernate Course in Warsaw, from the 17th to the 19th of October, 2011.

You can register using this link. And in the meantime,

Happy NHibernating!

Macto: Warrants are for fools

Warrants are kinda important in a prison. They are the legal authority to limit someone’s freedom. It wouldn’t be overstating things to say that Warrants are one of the major factors being managed in Macto.

There are all kinds of Warrants in existence. To list just a few of them:

  • Arrest Warrant – Issued by an officer, generally holds for only 24 hours.
  • Detention Warrant – Issued by the court, generally for a short amount of time, up to a few weeks, in most cases.
  • Remand Warrant – Issued by the court, generally instructing the prison to hold the Inmate in custody until sentencing (not limited in time).
  • Sentencing Warrant – Issued by the court, specifying the total time that an Inmate is to be incarcerated.

There are other warrants, such as a Court Arrest Warrant, for example, but for the purpose of Macto, we won’t get into those. The type of activity currently required by the prison doesn’t really need them, but that might change in the future.

There is also another type of Warrant available, the Whatever The Judge Said Warrant, or as the lawyers call it, a Mandamus Warrant. It is basically an instruction to do something, and it can be just about anything: from letting the Inmate call his wife, to putting him in a different cell, to transferring him to a different prison, to ordering special food / treatment, to… Well, there is a reason I call it Whatever The Judge Said.

The rules for Warrants for incarceration are pretty simple. Each warrant type has an issuer (Arrest Warrants can only be given by Officers of rank Captain and above) and a duration (which in some cases may be limited, such as the 24 hour limit for Arrest Warrants). Depending on the type of Warrant, the duration can be in Hours, Days, Months or Years. The units in which a Warrant is specified are very important; in particular, there is a difference between 30 days of incarceration and 1 month of incarceration, for example. And hourly Warrants require that by the time the Warrant expires, you have either gotten a new one at court or let the Inmate go.
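
To make the units point concrete, here is a minimal sketch of how a unit-aware duration might be modeled (illustrative types, not Macto’s actual model):

public enum DurationUnit { Hours, Days, Months, Years }

public class WarrantDuration
{
    public int Amount { get; set; }
    public DurationUnit Unit { get; set; }

    // the unit matters: 30 days and 1 month end on different dates
    public DateTime EndsAt(DateTime start)
    {
        switch (Unit)
        {
            case DurationUnit.Hours:  return start.AddHours(Amount);
            case DurationUnit.Days:   return start.AddDays(Amount);
            case DurationUnit.Months: return start.AddMonths(Amount);
            case DurationUnit.Years:  return start.AddYears(Amount);
            default: throw new InvalidOperationException("Unknown unit: " + Unit);
        }
    }
}

// e.g. starting on 1 Feb: 30 days ends on 3 Mar, but 1 month ends on 1 Mar
var thirtyDays = new WarrantDuration { Amount = 30, Unit = DurationUnit.Days };
var oneMonth = new WarrantDuration { Amount = 1, Unit = DurationUnit.Months };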

The last issued Warrant is always the one that is valid, and all Warrants must be continuous. Gaps in the middle are considered to be a Very Bad thing.
