Challenge: Modifying execution approaches
In RavenDB, we had this piece of code:
internal T[] LoadInternal<T>(string[] ids, string[] includes)
{
    if(ids.Length == 0)
        return new T[0];

    IncrementRequestCount();
    Debug.WriteLine(string.Format("Bulk loading ids [{0}] from {1}", string.Join(", ", ids), StoreIdentifier));
    MultiLoadResult multiLoadResult;
    JsonDocument[] includeResults;
    JsonDocument[] results;
#if !SILVERLIGHT
    var sp = Stopwatch.StartNew();
#else
    var startTime = DateTime.Now;
#endif
    bool firstRequest = true;
    do
    {
        IDisposable disposable = null;
        if (firstRequest == false) // if this is a repeated request, we mustn't use the cached result, but have to re-query the server
            disposable = DatabaseCommands.DisableAllCaching();
        using (disposable)
            multiLoadResult = DatabaseCommands.Get(ids, includes);

        firstRequest = false;
        includeResults = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Includes).ToArray();
        results = SerializationHelper.RavenJObjectsToJsonDocuments(multiLoadResult.Results).ToArray();
    } while (
        AllowNonAuthoritiveInformation == false &&
        results.Any(x => x.NonAuthoritiveInformation ?? false) &&
#if !SILVERLIGHT
        sp.Elapsed < NonAuthoritiveInformationTimeout
#else
        (DateTime.Now - startTime) < NonAuthoritiveInformationTimeout
#endif
        );

    foreach (var include in includeResults)
    {
        TrackEntity<object>(include);
    }

    return results
        .Select(TrackEntity<T>)
        .ToArray();
}
And we needed to take this same piece of code and execute it:
- In an async fashion
- As part of a batch of queries (sending multiple requests to RavenDB in a single HTTP call).
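Everything else is the same, but in each case the marked line, the one that actually calls the server (multiLoadResult = DatabaseCommands.Get(ids, includes);), is completely different.
When we had only one additional option, I chose the direct approach and implemented it like this: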
public Task<T[]> LoadAsync<T>(string[] ids)
{
    IncrementRequestCount();
    return AsyncDatabaseCommands.MultiGetAsync(ids)
        .ContinueWith(task => task.Result.Select(TrackEntity<T>).ToArray());
}
You might notice a few differences between those approaches. The implementations behave the same most of the time, but all the behavior for edge cases is wrong in the async version. The reason for that, by the way, is that initially the Load and LoadAsync implementations were functionally the same, but the Load behavior kept getting more sophisticated, and I kept forgetting to also update the LoadAsync behavior.
When I started building support for batches, this really stumped me. The last thing that I wanted to do was to either try to maintain complex logic in three different locations or have different behaviors depending on whether you were using a direct call, a batch or an async call. Just trying to document that gave me a headache.
How would you approach solving this problem?
Macto: Non functional concerns, you are a legal system
Macto is a system that operates in a highly legislative environment. As such, we have to be prepared for the court to ask us to show our records about a particular Inmate. Part of that is ensuring that we preserve the history of the Inmate’s Dossier. An example where this would be relevant is when a lawyer contests the legality of incarcerating the Inmate. You have to show not only that you have legal authority to incarcerate the guy, you also have to show that you had that authority continuously throughout the incarceration period.
A typical case where there is a problem is shown below:
- 27 June 2011 20:52 – Arrest by Sergeant Azulay for car vandalizing.
- 29 June 2011 09:15 – Detention, 8 days by Judge Judy
- 5 July 2011 – Remanded in Custody by Judge Thachil Oti
- 14 Aug 2011 – Sentenced, 3 months by Judge Koev Li
- 27 Sep 2011 – Released at end of sentence
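Do you see the problem? You probably don’t, but for me, it shouts. The issue is that an Arrest is only valid for 24 hours. The arrest made on 27 June at 20:52 therefore expired on 28 June at 20:52, but the Detention was only ordered on 29 June at 09:15, which leaves more than 12 hours during which the Inmate was held without any legal authority. Because of the gap in the incarceration warrants, a lawyer can usually get an Inmate out.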
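To make the continuity check concrete, here is a minimal sketch of how such gaps could be detected. It is not Macto's actual design: the Warrant record and the WarrantChain class are hypothetical names invented for this illustration, and it assumes that each warrant carries an explicit start and expiry time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: a legal authority to hold an Inmate during [From, Until).
public record Warrant(string Kind, DateTime From, DateTime Until);

public static class WarrantChain
{
    // Returns every period, from the start of the first warrant up to
    // releaseTime, that is not covered by any warrant.
    public static List<(DateTime From, DateTime Until)> FindGaps(
        IEnumerable<Warrant> warrants, DateTime releaseTime)
    {
        var gaps = new List<(DateTime From, DateTime Until)>();
        DateTime? coveredUntil = null;

        foreach (var warrant in warrants.OrderBy(w => w.From))
        {
            // The next warrant starts after the current coverage ended: a gap.
            if (coveredUntil != null && warrant.From > coveredUntil.Value)
                gaps.Add((coveredUntil.Value, warrant.From));

            // Extend the coverage; overlapping warrants never shorten it.
            if (coveredUntil == null || warrant.Until > coveredUntil.Value)
                coveredUntil = warrant.Until;
        }

        // The Inmate was still held after the last warrant expired.
        if (coveredUntil != null && releaseTime > coveredUntil.Value)
            gaps.Add((coveredUntil.Value, releaseTime));

        return gaps;
    }
}
```

Given the Arrest (27 June 20:52, valid for 24 hours) and the Detention (assumed here to start when it was ordered, 29 June 09:15, and to run for 8 days), FindGaps reports a single gap from 28 June 20:52 to 29 June 09:15. The later entries in the list carry no times of day, so they are left out of this example.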
That means that part of what the system has to do is to be able to say not only what the current state is, but what the state was at any given point in time. Those are usually called Temporal Systems, or Append Only systems, since you are not allowed to make modifications to existing data, only create new data.
They also tend to be quite hard to work with, but this still isn’t a post about the technical stuff, so we will let it go until we get to the good parts.