Ayende @ Rahien

Oren Eini, aka Ayende Rahien, is the CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL open source document database.

time to read 4 min | 609 words

About five years ago, my wife got me a present, a FitBit. I hadn't worn a watch in a while, and I didn't really see the need, but it was nice to see how many steps I took, and we had a competition about who had the most steps each day. It was fun. I've had a few FitBits since then and I'm mostly wearing one. As it turns out, FitBit allows you to get an export of all of your data, so a few months ago I decided to see what kind of information I have stored there, and what kind of data I can get from it.

The export process is painless and I got a zip with a lot of JSON files in it. I was able to process that and get a CSV file that had my heartrate over time. Here is what this looked like:

[image: a sample of the exported heart rate data in CSV format]

The file size is just over 300MB and it contains 9.42 million records, spanning the last 5 years.

The reason I looked into getting the FitBit data is that I'm playing with timeseries right now, and I wanted a realistic data set. One that contains dirty data. For example, even in the image above, you can see that the measurements aren't done on a consistent basis. It seems like ten and five second intervals, but the range varies. I'm working on a timeseries feature for RavenDB, so that was a perfect testing ground for me. I threw that into RavenDB and got the data down to just under 40MB in size.

I'm using Gorilla encoding as a first pass and then LZ4 to further compress the data. In a data set where the duration between measurements is stable, I can stick over 10,000 measurements in a single 2KB segment. In the case of my heartrate, I can store an average of 672 entries in each 2KB segment. Once I have the data in there, I can start actually looking at interesting patterns.
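To illustrate why stable intervals compress so well, here is a minimal sketch of the delta-of-delta idea behind Gorilla timestamp compression. This is an illustration only, not RavenDB's actual implementation, and the names are mine:

```csharp
using System.Collections.Generic;

static class GorillaSketch
{
    // Toy illustration of Gorilla-style delta-of-delta timestamp encoding.
    // When samples arrive at a fixed interval, almost every value yielded
    // here is zero, which the real format can store in a single bit.
    public static IEnumerable<long> DeltaOfDeltas(IReadOnlyList<long> timestamps)
    {
        long previousDelta = 0;
        for (int i = 1; i < timestamps.Count; i++)
        {
            long delta = timestamps[i] - timestamps[i - 1];
            yield return delta - previousDelta;
            previousDelta = delta;
        }
    }
}
```

With measurements arriving every ten seconds, almost every output is zero; with my messy heart rate data, the deltas jump around, which is why each 2KB segment holds far fewer entries.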

For example, consider the following query:

[image: the query]

Basically, I want to know how I'm doing in a global sense, just to have a place to start figuring things out. The output of this query is:

[image: the query output]

These are interesting numbers. I don’t know what I did to hit 177 BPM in 2016, but I’m not sure that I like it.

What I do like is this number:

[image: the number in question]

I then ran this query, going for daily precision over all of 2016:

[image: the daily-precision query]

And I got the following results in under 120 ms.

[image: the query results]

These are early days for this feature, but I was able to take that and generate the following (based on the query above).

[image: a chart generated from the query results]

All of the results have been generated on my laptop, and we haven't done any performance work yet. In fact, I'm posting about this feature because I was so excited to see that I got queries to work properly now. This feature is still in its early stages.

But it is already quite cool.

time to read 7 min | 1227 words

Production ready code is a term that I don't really like. I much prefer the term: Production Ready System. This is because production readiness isn't really a property of a particular piece of code, but of the entire system.

The term is often thrown around, and usually it refers to adding error handling and robustness to a piece of code. For example, let's take an example from the Official Docs:
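It was something along the lines of the classic HttpClient sample from the .NET documentation; the sketch below reconstructs that kind of code from memory, so treat the exact shape as an assumption:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class ProductClient
{
    static readonly HttpClient client = new HttpClient();

    // Roughly the shape of the documentation sample under discussion.
    public static async Task<Product> GetProductAsync(string path)
    {
        Product product = null;
        HttpResponseMessage response = await client.GetAsync(path);
        if (response.IsSuccessStatusCode)
        {
            // ReadAsAsync<T> comes from the Microsoft.AspNet.WebApi.Client package.
            product = await response.Content.ReadAsAsync<Product>();
        }
        return product;
    }
}
```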

This kind of code is obviously not production ready, right? Asked to review it, most people would point out the lack of error handling if the request fails. I asked on twitter about this and got some good answers, see here.

In practice, to make this piece of code production worthy you’ll need a lot more code and infrastructure:

  • .NET specific - ConfigureAwait(false) to ensure this works properly with a SynchronizationContext
  • .NET specific – Http Client caches Proxy settings and DNS resolution, requiring you to replace it if there is a failure / on a timer.
  • .NET specific – Exceptions won’t be thrown from Http Client if the server sent an error back (including things like auth failures).
  • Input validation – especially if this is exposed to potentially malicious user input.
  • A retry mechanism (with a back off strategy) is required to handle transient conditions, but it needs either idempotent requests or a way to avoid duplicate actions.
  • Monitoring for errors, health checks, latencies, etc.
  • Metrics for performance, how long such operations take, how many ops / sec, how many failures, etc.
  • Metrics for the size of responses (which may surprise you).
  • Correlation id for end to end tracing.
  • Proper handling of errors – including reading the actual response from the server and surfacing it to the caller / logs.
  • Handling successful requests that don’t contain the data they are supposed to.

And these are just the things that pop to my head from looking at 10 lines of really simple code.

And after you have done all of that, you are still not really production ready. Mostly because if you implemented all of that in the GetProductAsync() function, you can’t really figure out what is actually going on.

This kind of operation is something that you want to implement once, via the infrastructure. There are quite a few libraries that do robust service handling which you can use, and using one will help, but it will only take you part of the way toward a production ready system.
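In the .NET space, Polly is one such library. A minimal sketch of wrapping the call above with a retry and exponential back off might look like this; the values and names are illustrative, not a recommendation:

```csharp
using System;
using System.Net.Http;
using Polly;

var client = new HttpClient();   // in real code, shared / injected
var path = "https://example.com/api/products/1";

// Retry up to three times on transport failures, with exponential back off.
// This only covers the transient-error bullet above; monitoring, metrics,
// correlation ids and the rest still belong in the infrastructure.
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

HttpResponseMessage response = await retryPolicy.ExecuteAsync(() => client.GetAsync(path));
```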

Let's take cars and driving as an example of a system. If you look at a car, you'll find that quite a bit of the car's design, constraints and feature set is driven directly by the need to handle failure modes.

A modern car will have (just the stuff that is obvious and pops to mind):

  • Drivers – require an explicit learning stage and passing a competency test, limits on driving in an impaired state, higher certification levels for more complex vehicles.
  • Accident prevention: ABS, driver assist and seat belt beeps.
  • Reduce injuries / death when accidents do happen – seat belts, air bags, crumple zones.
  • On the road – rumble strips, road fence, road maintenance, traffic laws, active and passive enforcement.

I'm pretty sure that anyone who actually understands cars will be shocked by how sparse my list is. It is clear, however, that accidents, their prevention and reducing their lethality and cost are part and parcel of all design decisions on cars. In fact, there is a multi layered approach for increasing the safety of drivers and passengers. I'm not sure how comparable the safety of a car is to production readiness of a piece of software, though. One of the ways that cars compete with one another is on safety features. So there is a strong incentive to improve there. That isn't usually the case with software.

It usually takes a few (costly) lessons about how much being unavailable costs you before you can really feel how much not being production ready costs you. And at this point, most people turn to error handling and recovery strategies. I think this is a mistake. A great read on the topic is How Complex Systems Fail; it is a great, short paper, highly readable and very relevant to the field of software development.

I consider a system to be production ready when it has not just error handling inside a particular component, but actual dedicated components related to failure handling (note the difference from error handling), management of failures and their mitigations.

The end goal is that you'll be able to continue execution and maintain a semblance of normalcy to the outside world. That means having dedicated parts of the system that are just about handling (potentially very rare) failure modes, as well as a significant impact on your design and architecture. That is not an inexpensive proposition. It takes quite a lot of time and effort to get there, and it is usually only worth it if you actually need the reliability this provides.

With cars, the issue is literally human lives, so we are willing to spend quite a lot on preventing accidents and reducing their impact. However, the level of robustness I expect from a toaster is quite different (don't go on fire, pretty much) and most of that is already handled by the electrical system in the house.

Erlang is a good example of a language and environment that has always prioritized production availability. Erlang systems famously have 99.9999999% availability (that is nine nines). That is about 32 milliseconds of downtime per year, which pretty much means less than the average GC pause in most systems. Erlang has a lot of infrastructure to support this kind of availability number, but that still requires you to understand the whole system.

For example, if your Erlang service depends on a database, a restart of a database server (which takes 2 minutes to cycle) might very well mean that your service processes will die, be restarted by their supervisors, only to die again and again. At this point, the supervisor itself gives up and dies, passing the buck up the chain. The usual response is to restart the supervisor again a few times, but the database is still down and we are in a cascading failure scenario. Just restarting is really effective in handling errors, but for certain failure scenarios, you need to consider how you'll actually make it work. A database being unavailable can make your entire system cycle through its restart options and die just as the database comes back online. For that matter, what happens to all the requests that you tried to process at that time?

I have had a few conversations that went something like: “Oh, we use Erlang, that is handled”, but production readiness isn't something that you can solve at the infrastructure level. It has a global impact on your architecture, design and the business itself. There are a lot of questions that you can't answer from a technical point of view. “If I can't validate the inventory status, should I accept an order or not?” is probably the most famous one, and that is something that the business itself needs to answer.

Although, to be honest, the most important answer that you need from the business is a much more basic one: “Do we need to worry about production readiness, and if so, by how much?”

time to read 2 min | 301 words

Reddit's front page contains a list of recent posts from all communities. In most cases, you want to show posts from communities that the user is subscribed to, but at the same time, you want to avoid flooding the front page with posts from any single community. You also need this to be really fast.

It turns out that doing this in RavenDB is actually very easy. We are going to create a map/reduce index that will aggregate the few most recent posts per community, like so:

[image: the map/reduce index definition]

What this index will do is provide us with the five most recent posts in each community, as well as their date. This is an interesting example of a map/reduce index, because we are using both aggregation and fanout in the index.
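As a rough sketch, such an index could look something like the following with RavenDB's C# index API. The class, property and collection names here are illustrative assumptions, and the exact shape may differ from the index in the original post:

```csharp
using System;
using System.Linq;
using Raven.Client.Documents.Indexes;

public class Post
{
    public string Id { get; set; }
    public string Community { get; set; }
    public DateTime PostedAt { get; set; }
}

// Illustrative only: aggregate per community, then fan out again so that each
// community contributes at most its five most recent posts.
public class RecentPostsByCommunity : AbstractIndexCreationTask<Post, RecentPostsByCommunity.Result>
{
    public class Result
    {
        public string Community { get; set; }
        public string PostId { get; set; }
        public DateTime PostedAt { get; set; }
    }

    public RecentPostsByCommunity()
    {
        Map = posts => from post in posts
                       select new
                       {
                           Community = post.Community,
                           PostId = post.Id,
                           PostedAt = post.PostedAt
                       };

        Reduce = results => from result in results
                            group result by result.Community into g
                            from recent in g.OrderByDescending(x => x.PostedAt).Take(5)
                            select new
                            {
                                Community = g.Key,
                                PostId = recent.PostId,
                                PostedAt = recent.PostedAt
                            };
    }
}
```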

The nice thing about this index is that we can project the results directly from it to the user. Let's see what the queries will look like:

[image: the query]

This is a simple query that does quite a lot. It gives us the most recent 15 posts across all the communities that the user cares about, with no single community able to contribute more than 5 posts. It sorts them by the posted date and fetches the actual posts in the same query. This is going to give you consistent performance regardless of how much data you have and how many updates you experience. The actual Reddit front page is a lot more complex, I'm sure, but this serves as a nice example of how you can do non trivial stuff in RavenDB's indexes that simplifies your life by a lot.
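And here is a sketch of how such a query might look from the C# client, against the illustrative index above; again, the names and the projection are assumptions rather than the actual query:

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents;
using Raven.Client.Documents.Linq;

// `store` is an already initialized IDocumentStore.
var subscribedCommunities = new List<string> { "programming", "dotnet", "gaming" };

using (var session = store.OpenSession())
{
    var frontPage = session.Query<RecentPostsByCommunity.Result, RecentPostsByCommunity>()
        .Include(x => x.PostId)                              // pull the referenced posts in the same round trip
        .Where(x => x.Community.In(subscribedCommunities))
        .OrderByDescending(x => x.PostedAt)
        .Take(15)
        .ToList();

    // Loading the posts now is served from the session, with no extra requests.
    var posts = frontPage.Select(x => session.Load<Post>(x.PostId)).ToList();
}
```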

time to read 4 min | 760 words

I talked about some of the requirements for proper workflow design in my previous post. As a reminder, the top ones are:

  • Cater for developers, not the business analysts. (More on this later).
  • Source control isn’t optional, meaning:
    • Multiple branches
    • Can diff & review changes
    • Merging
    • Multiple people can work at the same time
  • Encapsulate complexity

This may seem like a pretty poor list, because if you are a developer, you might be taking all of these for granted. Because of that, I wanted to show a small taste of what used to be Microsoft's primary workflow engine.

[image: a screenshot of the workflow designer]

A small hint… this kind of system is not going to be useful for anything relating to source control, change management, collaborative work, understanding what is going on, etc.

A better solution for this would be to use a tool that can work with source control, that developers are familiar with and can handle the required complexity.

That tool is called… code.

It checks all the boxes required, naturally. But it does have a distinct disadvantage. One of the primary reasons you want to use a workflow engine of some kind is to decouple the implementation of your business from the policies of the business. Coming back to the mortgage example, how you calculate late fee payments is fixed (in the contract itself, but usually also by law and many regulations), but figuring out whether late fees should be waived, on the other hand, is subject to the whims of the business.

That is a pretty simple example, but in most businesses, these kinds of workflows add up. You can easily end up with dozens to hundreds of different workflows without the business being too big or complex.

There is another issue, though. Code is pretty good when you need to handle straightforward tasks. A set of if statements (which is pretty much all most workflows are) is trivial to handle. But workflows have another property: they tend to be long. Not long on a computer scale (seconds), but long on a people scale (months and years).

The typical process of getting a loan may involve an initial submission, review by a doctor, asking for follow up documentation (rinse – repeat a few times), getting a doctor's appraisal and only then being able to generate a quote for the customer. Then we have a period of time in which the customer can accept, a qualifying period, etc. That can last for a good long while.

Trying to code long running processes like that requires a very unnatural approach to coding. Especially since you are likely to need to handle software updates while the workflows are running.

In short, we are in a strange position: we want to use code, because it is clear, supports software development practices that are essential and can scale up in complexity as needed. On the other hand, we don't want to use our usual codebase for that, because we'll have very different deployment strategies, the manner of working is very different and there is a lot more involvement of the business in what is going on there.

The way to handle that is to create a proper boundary between parts of the system. We'll have the workflow behavior, defined in scripts, that describes the policy of the system. These tend to be fairly high level concepts and are designed explicitly for the role of expressing business policy behaviors. The infrastructure for that, on the other hand, is just a standard application using normal software practices, that is driven by the workflow scripts.

And by a script, I meant literally a script. As in, JavaScript.

I want to give you a sneak peek into how I envision this kind of system, but I'll defer full discussion of what is involved to my next post.
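As a rough sketch (not the actual script from the post), a policy script in this style might look something like the following; every function name, state field and stage here is an illustrative assumption:

```javascript
// A hypothetical policy script; every name here is an illustrative assumption.
// The `state` object is persisted by the infrastructure between invocations,
// so the workflow can span months without the script having to care.
function onApplicationUpdated(ctx) {
    state.reviewRounds = (state.reviewRounds || 0) + 1;

    if (ctx.application.missingDocuments.length > 0) {
        // Ask for more paperwork and stop; we'll be invoked again when it arrives.
        requestDocuments(ctx.applicant, ctx.application.missingDocuments);
        return;
    }

    if (!state.doctorReviewed) {
        scheduleDoctorReview(ctx.application);
        return;
    }

    if (ctx.doctorRecommendation === 'approve') {
        // Life insurance setup is a completely different workflow that we invoke.
        startWorkflow('life-insurance-setup', { applicant: ctx.applicant });
    } else {
        sendRejection(ctx.applicant, ctx.doctorRecommendation);
    }
}
```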



The idea is that we use the script to define our policy, and then we use that to make decisions and invoke the next stage in the process. You might notice that we have the state variable, which is persisted between invocations. That allows us to use a programming model that is fairly common and obvious to developers. We can usually also show this, as is, to a business analyst and get them to understand what is going on easily enough. All the actual actions are abstracted. For example, life insurance setup is a completely different workflow that we invoke.

In my next post, I'm going to drill down a bit into the details of this approach and what kind of features we need there.

time to read 5 min | 815 words

One of the most common themes I run into when talking to customers, users and sundry people in tech is the repeated desire to fire developers.

Actually, that is probably too loaded a statement. It actually comes in two parts:

  • Developers usually want to focus on the interesting bits, and the business logic portions aren’t that much fun.
  • The business analysts usually want to get things done and having to get developers to do that is considered inefficient.

If only there was a tool, or a pattern, or a framework, or something that would allow the business analysts to directly specify the behavior of the system… Why, we could cut the developers from the process entirely! And speaking as a developer, that would be a huge relief.

I think the original name for that was CASE tools, and that flopped. In fact, literally every single one of the attempts to replace developers by a tool has flopped. They got such a bad rap that people keep trying to implement them using different names. Some stuff can be done fairly easily, though. WYSIWYG for GUI is well established and Wordpress and WIX, to name the two examples that come to mind immediately, show that you can have a non techie build a proper website. In fact, you can even plug in some pretty sophisticated functionality without burdening the user with too much.

But all that only takes you up to a point. And past that point, the drop off is harsh. Let's take another common tool that is used to reduce the dependency on developers, SharePoint.

You pay close to double for actual developer time on SharePoint, mostly because it is so painful to work with it.

In a recent conference, I got into a conversation about business workflows and how to best implement them. You can look at the image on the right to get a good idea about what kind of process they were talking about.

To make things real, I want to take a “simple” example, of accepting a life insurance policy. Here is what the (extremely simplified) workflow looks like for issuing a life insurance policy:

[image: the (extremely simplified) life insurance policy workflow]

This looks good, and it certainly should make sense to a business analyst. However, even after I pretty much reduced the process to its bare bones, and even those have been filed down, this is still pretty complex. The process of actually getting a policy is a lot more complex. Some questions don't require doctor evaluation (for example, smoking) and some require supplemental documentation (oh, you were hospitalized? Gimme all these records). The doctor may recommend different rates, rejecting entirely, some exceptions in the policy, etc. All of which need to be in the workflow. Actuarial tables need to be consulted for each of those cases, etc, etc, etc.

But something like the diagram above isn’t going to be able to handle this level of complexity. You are going to get lost very quickly if you try to put so many boxes on the screen.

So you need encapsulation.

And you’ll probably want to have a way to develop these business workflows, which means that they aren’t static.

So you need source control.

And if you have a complex business process, you likely have different people working on it.

So you need to be able to review changes, and merge them.

Note that this is explicitly distinct from being able to store the data in source control. Being able to actually diff two versions of such a process in a meaningful fashion is anything but trivial. Usually you are left with diffing the raw XML / JSON that stores the structure. Good luck with that.

If the workflow is complex, you need to be able to understand what is going on under various conditions.

So you need a debugger.

In fact, pretty soon you'll realize that you'll need quite a lot of the things that developers do. Except that your tool of choice doesn't do that, or if it does, it does it poorly.

Another issue is that even if you somehow manage to bypass all of those details, you are going to face the same drop-off that you see elsewhere with tools that attempt to get rid of developers. At some point, the complexity grows too large, and you'll call your development team and hand the system off to them. At which point they will be stuck with a very clunky tool that attempts to be quite clever and easy to use. It is also horribly limiting for a developer. Mostly because all of the “complexity” involved is in the business process itself, not in the actual complexity of what is going on.

There are better ways of handling that, and the easiest among them is to just use code. That can be… surprisingly versatile.

time to read 3 min | 468 words

I was reminiscing about some old code that I wrote a long while ago, in the heyday of ASP, when the dotcom bubble was just starting to inflate. At the time, I was either still in high school or just graduated, and I was fascinated by the ability to write web applications. I wrote quite a few of them, as I recall. Thankfully, none of them ever made it to this day and age. I remember one project in particular that I was quite proud of. I wrote a bunch of BBS / forum systems. One version used an Access file as the database. IIRC, that is literally how I learned SQL for the first time.

The other BBS system is what I'm here to talk about today. You couldn't always get Access, and having it installed on the server was a PITA. Especially given that I was pretty much limited to hosts that offered free hosting only. So I decided to write a BBS system that had no dependencies whatsoever and could be deployed on any host that could handle ASP. Note that this is classic ASP; .NET was still 2 years away from alpha status at the time, and Java was for applets.

I decided that I would write everything through file I/O, but that was quite complex. I needed something that would help me out. Then I realized that I could use ASP itself to help me. Instead of having to pull data at runtime from a file, parse it, process it and so on, I could lean on ASP itself for that.

Trigger warning: This code is newly written, but I still remember the shape of it quite well. This may cause you seizures. For the full beauty of this piece of code, you need to consider that this is a very small piece of a much larger codebase (all in a single file, of course), but it is very much a representative example.

I'll give you a moment to study the code. It deserves that much of your attention.

What you see here is a beautiful example of using code as data and data as code, self modifying code and some really impressive (even if I say so myself) capabilities of my past self to dig himself deep into a hole.

Bonus points if you can point out the myriad of issues that this code has. You can safely leave aside maintainability; I never had to maintain it, but over twenty years have passed, and I still remember the complexity involved in keeping all the state in my head.

And that was the first time that I actually wrote my own dedicated database.

time to read 7 min | 1321 words

Almost by accident, it turned out that I implemented a pretty simple, but non trivial task in both C and Rust and blogged about them.

Now that I’m done with both of them, I thought it would be interesting to talk about the differences in the experiences.

The Rust version clocks at exactly 400 lines of code and uses 12 external crates.

The C version has 911 lines of C code and another 140 lines in headers and depends on libuv and openssl.

Both took about two weeks of evenings of me playing around. If I was working full time on that, I could probably do that in a couple of days (but probably more, to be honest).

The C version was very straightforward. The C language is pretty much not there, and on the one hand, it didn’t get in my way at all. On the other hand, you are left pretty much on your own. I had to write my own error handling code to be sure that I got good errors, for example. I had to copy some string processing routines that aren’t available in the standard library, and I had to always be sure that I’m releasing resources properly. Adding dependencies is something that you do carefully, because it is so painful.

The Rust version, on the other hand, uses the default error handling that Rust has (and much improved since the last time I tried it). I’m pretty sure that I’m getting worse error messages than the C version I used, but that is good enough to get by, so that is fine. I had to do no resource handling. All of that is already handled for me, and that was something that I didn’t even consider until I started doing this comparison.

When writing the C version, I spent a lot of time thinking about the structure of the code, debugging through it (to understand what is going on, since I also learned how OpenSSL work) and seeing if things worked. Writing the code and compiling it were both things that I spent very little time on.

In comparison, the Rust version (although benefiting from the fact that I did it second, so I already knew what I needed to do) made me spend a lot more time on just writing code and getting it to compile.  In both cases, I decided that I wanted this to be a production worthy code, which meant handling all errors, producing good errors, etc. In C, that was simply a tax that needed to be dealt with. With Rust, that was a lot of extra work.

The syntax and language really make it obvious that you want to do that, but in most of the Rust code that I reviewed, there are a lot of unwrap() calls, because trying to handle all errors is too much of a burden. When you don't take that shortcut, your code size balloons, but the complexity of the code doesn't, which was a great thing to see.

What was really annoying is that in C, if I got a compiler error, I knew exactly what the problem was, and errors were very localized. In Rust, a compiler error could stymie me for hours, just trying to figure out what I need to do to move forward. Note that the situation is much better than it used to be, because I eventually managed to get there, but it took a lot of time and effort, and I don’t think that I was trying to explore any dark corners of the language.

What really sucked is that Rust, by its nature, does a lot of type inference for you. This is great, but this type inference goes both backward and forward. So if you have a function and you create a variable using HashMap::new(), the actual type of the variable depends on the parameters that you pass to the first usage of this instance. That sounds great, and for the first few times, it looked amazing. The problem is that when you have errors, they compound. A mistake in one location means that Rust has no information about other parts of your code, so it generates errors about that. It was pretty common to make a change, run cargo check and see three or four screens' worth of errors pass by, and then go into a "let's fix the next compiler error" mode for a while.
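A toy example of that backward inference (my illustration, not code from the project):

```rust
use std::collections::HashMap;

fn main() {
    // At this point the compiler doesn't know the key or value types yet.
    let mut latest = HashMap::new();

    // This first usage is what fixes the type as HashMap<&str, u64>.
    latest.insert("heartrate", 72u64);

    // If a line like the following is the mistake, the compiler may infer
    // HashMap<&str, &str> from it instead and then report errors on the
    // other, correct usages, far away from the real problem.
    // latest.insert("steps", "many");
}
```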

The type inference bit also comes into play when you write the code; because you don't have the types in front of you (and because Rust loves composing types), it can be really hard to understand what a particular method will return.

C’s lack of async/await meant that when I wanted to do async operations, I had to decompose that to event loop mode. In Rust, I ended up using tokio, but I think that was a mistake. I should have used the event loop model there as well. It isn’t as nice, in terms of the code readability, but the fact that Rust doesn’t have proper async/await meant that I had a lot more additional complexity to deal with, and that nearly caused me to give up on the whole thing.

I do want to mention that for C, I ran Valgrind a few times to catch memory leaks and invalid memory accesses (it found a few, even when I was extra careful). In Rust, the compiler was very strict and several times complained about stuff that, if allowed, would have caused problems. I did like that, but most of the time, it felt like fighting the compiler.

Speaking of which, the compilation times for Rust felt really high. Even with 400 lines of code, it can take a couple of seconds to compile (with cargo check, mind, not full build). I do wonder what it will do with a project of significant size.

I gotta say, though, compiling the C code meant that I would have to test the code. Compiling the Rust code meant that I could run things and they usually worked. That was nice, but at the same time, getting the thing to compile at all was a major chore many times. Even with the C code not working properly, the feedback loop was much shorter with C than with Rust. And some part of that was that I already had a working implementation for most of what I needed, so I had a lot less need to explore when I wrote the Rust code.

I don’t have any hard conclusions from the experience, I like the simplicity of C, and if I had something like Go’s defer to ensure resource disposal, that would probably be enough (I’m aware of libdefer and friends). I find the Rust code elegant (except the async stuff) and the standard library is great. The fact that the crates system is there means that I have very rich access to additional libraries and that this is easy to do. However, Rust is full of ceremony that sometimes seems really annoying. You have to use cargo.toml and extern crate for example.

There is a lot more to be done to make the compiler happy. And while it does sometimes catch you doing something you shouldn't, I found that it usually felt like busy work more than anything else. In some ways, it feels like Rust is trying to do too much. I would have liked to see something less ambitious. Just focusing on one or two concepts, instead of trying to be both a high and low level language, with type inference turned up to the max, a borrow checker and memory safety, etc. It feels like this is a very high bar to cross, and I haven't seen that the benefits are clearly on the plus side here.

time to read 2 min | 345 words

I'm pretty much done with my Rust protocol impl. The last thing that I wanted to try was to see how it would look when I allow for messages to be handled out of band.

Right now, my code consuming the protocol library looks like this:

This is pretty simple, but note that the function definition forces us to return a value immediately, and that we don’t have a way to handle a command asynchronously.

What I wanted to do is to change things around so I could do that. I decided to implement the command:

remind 15 Nap

Which should help me remember to nap. In order to handle this scenario, I need to provide a way to do async work and to keep sending messages to the client. Here was the first change I made:

[image: the changed function signature]

Instead of returning a value from the function, we are going to give it the sender (which will render the value to the client) and can return an error if the command is invalid in some form.

That said, it means that the echo implementation is a bit more complex.

There is… a lot of ceremony here, even for something so small. Let’s see what happens when we do something bigger, shall we? Here is the implementation of the reminder handler:

Admittedly, a lot of that is error handling, but there is a lot of code here to do something that simple.  Compare that to something like C#, where the same thing could be written as:
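Roughly, the C# being compared against is this kind of code; the method shape and names below are my assumptions, not the snippet from the original post:

```csharp
using System;
using System.Threading.Tasks;

public static class Reminders
{
    // Reply to the client right away, then push another message later,
    // without blocking the rest of the command processing.
    public static async Task RemindAsync(TimeSpan delay, string message, Func<string, Task> send)
    {
        await send($"Will remind you about '{message}' in {delay.TotalMinutes} minutes");
        await Task.Delay(delay);
        await send($"Reminder: {message}");
    }
}
```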

I’m not sure that the amount of complexity that is brought about by the tokio model, even with the async/await macros is worth it at this point. I believe that it needs at least a few more iterations before it is going to be usable for the general public.

There is way too much ceremony and work to be done, and a single miss and you are faced with a very pissed off compiler.

time to read 3 min | 410 words

After a lot of trouble, I'm really happy that I was able to build an async I/O implementation of my protocol. However, for real code, I think that I would probably recommend using the sync API instead, since at least that is straightforward and doesn't incur so much overhead at development time. The async stuff is still very much a “use at your own risk” kind of deal from my perspective. And I can't imagine trying to use it in a large project and not suffering from the complexity.

As a good example, take a look at the following bit of code:

[image: the code in question]

It doesn’t seem to be doing much, right? And it is clear what the intent of the code is.

However, if you try to compile this code you’ll get:

[image: the compiler error]

Now, it took me a long while to figure out what is going on.  The issue is that the code I’m seeing isn’t the actual code, because of macro expansions.

So let’s resolve this and see what the expanded code looks like:

This is after formatting, of course, but it certainly looks scary. Glancing at this code doesn’t tell me what the problem was, so I tried replacing the method with the expanded result, and I got the same error, but this time I got it on a line that helped me figure it out. Here is the issue:

[image: the offending line in the expanded code]

We use the ? to return early from the poll method, and the Receiver I’m using in this case is defined to have a Result<String, ()>, so this is the cause of the problem.

I returned my own error type as a result, giving me the ability to convert from (), but that was a really hard thing to resolve.
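For illustration, that fix boils down to something like the following sketch, with an invented error type name:

```rust
// A hypothetical error type for the connection handling code.
#[derive(Debug)]
struct ConnectionError(String);

// The channel receiver in question fails with `()`, so providing this
// conversion is what lets the `?` operator turn that into our own error.
impl From<()> for ConnectionError {
    fn from(_: ()) -> Self {
        ConnectionError("the channel receiver failed".to_string())
    }
}
```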

It might be better to have Rust also offer to show the error on the expanded code by default, because it was somewhat of a chore to actually get to this.

What made this oh so confusing is that I had the exact same code, but using a Stream<String, io::Error>, which worked, obviously. But it was decidedly non obvious to see what the difference was between two seemingly identical pieces of code.

time to read 3 min | 588 words

On my last post, I got really frustrated with tokio's complexity and wanted to move to use mio directly. The advantage is that the programming model is pretty simple, even if actually working with it is hard. Event loops can cause your logic to spread over many different locations and make it hard to follow. I started to go down that path until I figured out just how much work it would take. I decided to give tokio a second chance, and at this point, I looked into attempts to provide async/await functionality to Rust.

It seems that at least some work is already available for this, using futures + some Rust macros. That let me write code that is much more natural looking, and I actually managed to make it work.

Before I get to the code, I want to point out some concerns that I have right now. The futures-await crate (and indeed, all of tokio) seems to be in a state of flux. There is an await in tokio, and I think that there is some merging around of all of those libraries into a single whole. What I don’t know, and can’t find any information about, is what I should actually be using, and how all the pieces come together. I have to note that even with async/await, the programming model is still somewhat awkward, but it is at a level that I can live with. Here is how I built it.

First, we need to accept connections, which is done like so:

Note that I have two #[async] annotations. One for the method as a whole and one for the for loop. This just accepts the connection and spawns a task to handle it; the most interesting tidbits are in the actual processing of the connection:

You can see that this is fairly straightforward code. We first do the TLS handshake, then we validate the certificate. If there is an auth error, we send it to the user and back off. If we are successful, however, things get interesting.

I create a channel, which allows me to split off the read and write portions of the task. This means that I can send results out of order, if I wanted to, which is great for the actual protocol handling. The first thing to do is to send the OK string to the client, so they know that we successfully connected, then we spawn the read/write tasks. The write task is pretty simple, overall:

You can see the funny .0 references, which are an artifact of the fact that the write_all() function consumes the writer we pass to it and returns a (potentially different) writer in the result. This is pretty common for functional languages.

I’m pretty sure that I can avoid the two calls to write_all for the postfix, but that is easier for now.

Processing the commands is simple as well:

For each command we support, we have an entry on the server configuration and we fetch and invoke it. The result of the command will be written to the client by the write task. Right now we have a 1:1 association between them, but this is now easily broken.

And finally, having an actual command run and running the server itself:

This is pretty simple now, and it gives us a nice model to program commands and responses.

I pushed the whole code to this branch, if you care to look at it.

I have some more comments about this code, but I’ll reserve them for another post.
