Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 4 min | 797 words

This post is in reply to this one: Is a Shared Database in Microservices Actually an Anti-pattern?

The author does a great job outlining the actual problem. Given two services that need to share some data, how do you actually manage that in a microservice architecture? The author uses the Users and Orders example, which is great, because it is pretty simple and requires very little domain knowledge.

The first question to ask is: Why?

Why use microservices? Wikipedia says:

The benefit of decomposing an application into different smaller services is that it improves modularity. This makes the application easier to understand, develop, test, and become more resilient to architecture erosion.

I always like an application that is easier to understand, develop and test. Being resilient to architecture erosion is a nice bonus.

The problem is that this kind of architecture isn’t free. A system that is composed of microservices is a system that needs to communicate between these services, and that is usually where most of the complexity in such a system resides.

In the Orders service, if we need to access some details about the User, how do we do that?

We can directly call the Users service, but that creates a strong dependency between the services. If Users is down, then Orders is down. That sort of defeats the purpose of the architecture. It also means that we don’t actually have separate services; we have just exchanged the call assembly instruction for RPC and distributed debugging. All the costs, none of the benefits.

The post above rightly calls this problematic, and asks whether async integration between the services would work, using streams. I’m not quite sure what was meant there. My usual method of integrating different microservices is to not integrate them directly. Either we need to send a command to a different service (which is async) or we need to publish some data from a service (also async). Both of these options assume an unreliable channel and are built to tolerate failure. In other words, if I send a command to another service and I need to handle failure, I set up a timer so I’ll know to handle not being called back.
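A minimal sketch of that flow, using hypothetical bus and scheduler abstractions (none of these names come from a specific library):

using System;
using System.Threading.Tasks;

// Hypothetical abstractions, standing in for whatever bus / scheduler you actually use.
public interface ICommandBus { Task SendAsync(string service, object command); }
public interface ITimeoutScheduler { Task ScheduleAsync(string key, TimeSpan due); }

public record ComputeDiscount(string OrderId, string UserId);

public class OrderDiscounts
{
    private readonly ICommandBus _bus;
    private readonly ITimeoutScheduler _timeouts;

    public OrderDiscounts(ICommandBus bus, ITimeoutScheduler timeouts) =>
        (_bus, _timeouts) = (bus, timeouts);

    public async Task RequestUserDiscountAsync(string orderId, string userId)
    {
        // Send the command to the Users service. Async, and assumed to be unreliable.
        await _bus.SendAsync("users", new ComputeDiscount(orderId, userId));

        // Arrange to be told if nobody ever calls us back.
        await _timeouts.ScheduleAsync($"discount-timeout/{orderId}", TimeSpan.FromMinutes(5));
    }

    // Called by the scheduler if the timeout fires with no reply. What to do here
    // (default discount, alert a human, retry) is a business decision, not a technical one.
    public Task OnDiscountTimedOut(string orderId) => Task.CompletedTask;
}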

Even if you just need some data published from another service, and can use a feature such as RavenDB ETL to share it, you still need to take into account issues such as network failures causing you to have a laggy view of the data.

This is not an accident.

That is not your data; you have a (potentially stale) copy of data published by another service. You can use that for reference, but you cannot count on it. If you need to rely on that data, you need to send a command to the owning service, which can then make the actual decision.

In short, this is not a trivial matter. Even if the actual implementation can be done pretty easily.

The fact that each service owns a particular portion of the system is a core principle of the microservice architecture.

Having a shared database is like having a backstage pass. It’s great, in theory, but it is also open to abuse. And it will be abused. I can guarantee that with 100% confidence.

If you blur the lines between services, they are no longer independent. Have fun trying to debug why your Users’ login time spiked (Orders is running the monthly report). Enjoy breaking the payment processing system (you added a new type of user that the Orders system can’t process). And these are the good parts. I haven’t started to talk about what happens when the Orders service actually attempts to write to the Users’ tables.

The article suggests using DB ACLs to control that, but you already have something better: a different database, because it is a different service.

It might be better to think about the situation like a joint bank account. It’s reasonable to have a joint bank account with your spouse. It is not so reasonable to have a joint bank account with Mary from accounting just because that makes it easier to direct deposit your payroll. There is separation there for a reason, and yes, that does make things harder.

That’s the point, it is not an accident.

The whole point is that integration between services is going to be hard, so you’ll have less of it, and what you do have will run along very well defined boundaries. That means we can have proper boundaries and contracts between different areas, which leads to better modularity, and thus easier development, deployment and management.

If that isn’t something you want, that is fine; just don’t go with the microservice architecture. A monolith architecture is just fine, but a Frankenstein creation of a microservice architecture with a shared database is not. Just ask Mary from accounting…

time to read 1 min | 148 words

RavenDB 4.x uses X509 certificates for authentication. We got a question from a customer about that; they would much rather use API keys instead.

We actually considered this as part of the design process for 4.x, and we concluded that certificates can work in just the same manner as API keys. Here is how you can make that work.

You take the certificate file (usually a PFX) and convert it to a Base64 string, like so:


# Read the raw PFX bytes and emit them as a single Base64 string
[System.Convert]::ToBase64String( (gc "cert.pfx" -Encoding byte ) )

You can take the resulting string and store it like an API key, because that is effectively how it is treated. In your application startup, you can use:
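Something along these lines (a sketch; the environment variable name, URL and database name are placeholders of my choosing):

using System;
using System.Security.Cryptography.X509Certificates;
using Raven.Client.Documents;

// The Base64 string produced above, stored wherever you keep API keys today
// (environment variable, secrets store, config file).
var certBase64 = Environment.GetEnvironmentVariable("RAVEN_CLIENT_CERT");

var store = new DocumentStore
{
    Urls = new[] { "https://your.ravendb.server" },
    Database = "Northwind",
    // Rehydrate the certificate from the Base64 "API key".
    Certificate = new X509Certificate2(Convert.FromBase64String(certBase64))
};
store.Initialize();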

And this is it. For all intents and purposes, you can now use the certificate as an API key.

time to read 1 min | 185 words

Last week we had a couple of interesting milestones. The first of which is that we reached the End Of Life for RavenDB 3.0. If you are still running on RavenDB 3.0 (or any previous version), be aware that this marks the end of the support cycle for that version. You are strongly encouraged to upgrade to RavenDB 3.5 (which still has about 1.5 years of support).

I got an email today from a customer talking about maybe considering an upgrade from the RavenDB version that was released in Dec 2012, so I’m very familiar with slow upgrade cycles.

End of Life for 3.0 means that we no longer offer support for it. If your operations team is dragging their feet on the upgrade, please hammer this point home. We really want to see people running on at least 3.5.

The other side of the news is that the new bits for the RavenDB 4.2 Release Candidate are out. This release moves features such as Cluster Wide Transactions and Counters out of the experimental phase and introduces Graph Queries support. As usual, I would really love your feedback.

time to read 4 min | 671 words

A decade(!) ago I wrote that you should avoid soft deletes. Today I ran into a question on the mailing list and remembered writing about this; it turns out that there was quite the discussion on this at the time.

The context of the discussion at the time was deleting data from relational systems, but the same principles apply. The question I just fielded asked how you can translate a Delete() operation inside the RavenDB client to a soft delete (IsDeleted = true) operation. The RavenDB client API supports a few ways to interact with how we are talking to the underlying database, including some pretty interesting hooks deep into the pipeline.

What it doesn’t offer, though, is a way to turn a Delete() operation into an update (or an update into a delete). We do have facilities in place that allow you to detect (and abort) invalid operations. For example, invoices should never be deleted. You can tell the RavenDB client API that it should throw whenever an invoice is about to be deleted, but you have no way of saying that we should take the Delete(invoice) and turn that into a soft delete operation.

This is quite intentional, by design.

Having a way to transform basic operations (like delete → update) is a good way to end up pretty confused about what is actually going on in the system. It is better to allow the user to enforce the required behavior (invoices cannot be deleted) and let the calling code handle this differently.
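Enforcing the “invoices cannot be deleted” rule looks something along these lines (a sketch, assuming the client’s OnBeforeDelete event and a hypothetical Invoice class):

using System;
using Raven.Client.Documents;

public class Invoice { public string Id { get; set; } }   // hypothetical document type

public static class StoreSetup
{
    public static IDocumentStore Create()
    {
        var store = new DocumentStore
        {
            Urls = new[] { "https://your.ravendb.server" },
            Database = "Billing"
        };

        // Deleting an invoice is not a valid operation; abort it loudly.
        store.OnBeforeDelete += (sender, args) =>
        {
            if (args.Entity is Invoice)
                throw new InvalidOperationException(
                    $"Invoices are never deleted ({args.DocumentId}). Cancel them instead.");
        };

        return store.Initialize();
    }
}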

The natural response here, of course, is that this places a burden on the calling code. Surely we want to follow DRY and not write conditionals when the user clicks on the delete button. But this isn’t a case of extra duplicated code.

  • An invoice is never deleted, it is cancelled. There are tax implications there; you need to get it right.
  • A payment is never removed, it is refunded.

You absolutely want to block deletions of those types of documents, and you need to treat them (very) differently in code.
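And the calling code then exposes the real business operation instead of a generic delete. A hypothetical sketch:

using System;

public enum InvoiceStatus { Issued, Paid, Cancelled }

public class Invoice
{
    public string Id { get; set; }
    public InvoiceStatus Status { get; set; }
    public DateTime? CancelledAt { get; set; }
    public string CancellationReason { get; set; }
}

public static class InvoiceOperations
{
    // There is no Delete(invoice). Cancellation is its own operation, with its own
    // rules: audit trail, tax reporting, notifying the customer, and so on.
    public static void Cancel(Invoice invoice, string reason)
    {
        if (invoice.Status == InvoiceStatus.Paid)
            throw new InvalidOperationException("A paid invoice is refunded, not cancelled.");

        invoice.Status = InvoiceStatus.Cancelled;
        invoice.CancelledAt = DateTime.UtcNow;
        invoice.CancellationReason = reason;
    }
}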

In the ensuing decade since the blog posts linked at the top of this post were written, there have been a number of changes. Some of them are architecturally minor, such as the database technology of choice or the guiding principles for maintainable software development. Some of them are pretty significant.

One such change is the GDPR.

“Huh?!” I can imagine you thinking. How does the GDPR apply to an architectural discussion of soft deletes vs. business operations? It turns out that it is very relevant. One of the things that the GDPR mandates (and there are similar laws elsewhere, such as the CCPA) is the right to be forgotten. So if you are using soft deletes, you might actually run into real problems down the line: “I asked to be deleted, they told me they did, but they secretly kept my data!” The one thing that I keep hearing about the GDPR is that no one ever found it humorous. Not with the kind of penalties that are attached to it.

So when thinking about deletes in your system, you need to consider quite a few factors:

  • Does it make sense, from a business perspective, to actually lose that data? Deleting a note from a customer’s record is probably just fine. Removing the customer’s record entirely? Probably not.
  • Do I need to keep this data? Invoices are one thing that pops to mind.
  • Do I need to forget this data? That is the other direction, and what you can forget, and how, can be really complex.

At any rate, for all but the simplest scenarios, just marking IsDeleted = true is likely not going to be sufficient. And all the other arguments that have been raised (which I’m not going to repeat; read the posts, they are good ones) are still in effect.

time to read 7 min | 1227 words

Production ready code is a term that I don’t really like. I much prefer the term: Production Ready System. This is because production readiness isn’t really a property of a particular piece of code, but of the entire system.

The term is often thrown around, and usually it refers to adding error handling and robustness to a piece of code. For example, let’s take an example from the Official Docs:
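Roughly this shape (a sketch of that style of sample, not the exact snippet from the docs):

using System.Net.Http;
using System.Threading.Tasks;

public class Product { public string Name { get; set; } public decimal Price { get; set; } }

public class CatalogClient
{
    private static readonly HttpClient client = new HttpClient();

    // Happy path only: no error handling, no timeouts, no retries.
    public static async Task<Product> GetProductAsync(string path)
    {
        Product product = null;
        HttpResponseMessage response = await client.GetAsync(path);
        if (response.IsSuccessStatusCode)
        {
            // ReadAsAsync<T> comes from the Microsoft.AspNet.WebApi.Client package.
            product = await response.Content.ReadAsAsync<Product>();
        }
        return product;
    }
}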

This kind of code is obviously not production ready, right? Asked to review it, most people would point out the lack of error handling if the request fails. I asked on twitter about this and got some good answers, see here.

In practice, to make this piece of code production worthy you’ll need a lot more code and infrastructure:

  • .NET specific - ConfigureAwait(false) to ensure this works properly with a SynchronizationContext
  • .NET specific – Http Client caches Proxy settings and DNS resolution, requiring you to replace it if there is a failure / on a timer.
  • .NET specific – Exceptions won’t be thrown from Http Client if the server sent an error back (including things like auth failures).
  • Input validation – especially if this is exposed to potentially malicious user input.
  • A retry mechanism (with a back-off strategy) is required to handle transient conditions, but that needs either idempotent requests or a way to avoid duplicate actions (see the sketch below).
  • Monitoring for errors, health checks, latencies, etc.
  • Metrics for performance, how long such operations take, how many ops / sec, how many failures, etc.
  • Metrics for the size of responses (which may surprise you).
  • Correlation id for end to end tracing.
  • Proper handling of errors – including reading the actual response from the server and surfacing it to the caller / logs.
  • Handling successful requests that don’t contain the data they are supposed to.

And this is just the stuff that pops to mind from looking at 10 lines of really simple code.
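To make the retry bullet above concrete, here is a minimal hand-rolled sketch of exponential back-off (in practice a library such as Polly usually carries this weight, and it is only safe for idempotent requests):

using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Retry
{
    // Retries an operation with exponential back-off.
    // Only safe when the operation can be repeated without duplicating side effects.
    public static async Task<T> WithBackoffAsync<T>(
        Func<Task<T>> operation, int maxAttempts = 4, int baseDelayMs = 200)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Transient network failure: wait 200ms, 400ms, 800ms... then try again.
                await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
            }
        }
    }
}

// Usage: var product = await Retry.WithBackoffAsync(() => CatalogClient.GetProductAsync("api/products/1"));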

And after you have done all of that, you are still not really production ready. Mostly because if you implemented all of that inside the GetProductAsync() function, you could no longer figure out what is actually going on in there.

This kind of operation is something that you want to implement once, via the infrastructure. There are quite a few libraries for robust service handling that you can use, and they will help, but they will only take you part of the way toward a production ready system.

Let’s take cars and driving as an example of a system. If you look at a car, you’ll find that quite a bit of the car’s design, constraints and feature set is driven directly by the need to handle failure modes.

A modern car will have (just the stuff that is obvious and pops to mind):

  • Drivers – required explicit learning stage and passing competency test, limits on driving in impaired state, higher certification levels for more complex vehicles.
  • Accident prevention: ABS, driver assist and seat belt beeps.
  • Reduce injuries / death when accidents do happen – seat belts, air bags, crumple zones.
  • On the road – rumble strips, road fence, road maintenance, traffic laws, active and passive enforcement.

I’m pretty sure that anyone who actually understands cars will be shocked by how sparse my list is. It is clear, however, that accidents, their prevention and reducing their lethality and cost are part and parcel of all design decisions on cars. In fact, there is a multi layered approach to increasing the safety of drivers and passengers. I’m not sure how comparable the safety of a car is to the production readiness of a piece of software, though. One of the ways that cars compete with one another is on safety features, so there is a strong incentive to improve there. That isn’t usually the case with software.

It usually takes a few (costly) lessons about how much being unavailable costs you before you can really feel how much not being production ready costs you. And at this point, most people turn to error handling and recovery strategies. I think this is a mistake. A great read on the topic is How Complex Systems Fail; it is a short paper, highly readable and very relevant to the field of software development.

I consider a system to be production ready when it has not just error handling inside a particular component, but actual dedicated components for failure handling (note the difference from error handling), the management of failures and their mitigation.

The end goal is that you’ll be able to continue execution and maintain a semblance of normalcy to the outside world. That means having dedicated parts of the system that are just about handling (potentially very rare) failure modes, as well as a significant impact on your design and architecture. That is not an inexpensive proposition; it takes quite a lot of time and effort to get there, and it is usually only worth it if you actually need the reliability this provides.

With cars, the issue is literally human lives, so we are willing to spend quite a lot on preventing accidents and reducing their impact. However, the level of robustness I expect from a toaster is quite different (don’t go on fire, pretty much), and most of that is already handled by the electrical system in the house.

Erlang is a good example of a language and environment that has always prioritized production availability. Erlang systems famously have 99.9999999% availability (that is nine nines). That is about 32 milliseconds of downtime per year, which is less than the average GC pause in most systems. Erlang has a lot of infrastructure to support this kind of availability number, but that still requires you to understand the whole system.

For example, if your Erlang service depends on a database, a restart of a database server (which takes 2 minutes to cycle) might very well mean that your service processes will die, be restarted by their supervisors, only to die again and again. At this point, the supervisor itself gives up and dies, passing the buck up the chain. The usual response is to restart the supervisor again a few times, but the database is still down and we are in a cascading failure scenario. Just restarting is really effective in handling errors, but for certain failure scenarios you need to consider how you’ll actually make things work. A database being unavailable can make your entire system cycle through its restart options and die just as the database comes back online. For that matter, what happens to all the requests that you tried to process at that time?

I have had a few conversations that went something like: “Oh, we use Erlang, that is handled”, but production readiness isn’t something that you can solve at the infrastructure level. It has a global impact on your architecture, design and the business itself. There are a lot of questions that you can’t answer from a technical point of view. “If I can’t validate the inventory status, should I accept an order or not?” is probably the most famous one, and that is something that the business itself needs to answer.

Although, to be honest, the most important answer that you need from the business is a much more basic one: “Do we need to worry about production readiness, and if so, by how much?”

time to read 1 min | 90 words

I’m going to be talking at CodeNode in London on June 3rd.

The topic of this talk is a few bugs that we found in the CoreCLR framework, the JIT and even the Operating System, how we tracked them down and put them down.

I have blogged about some of these in the past, and I’m looking forward to giving this talk. You can expect a talk that ranges between WinDBG walk throughs and rants about the memory model assumptions made on a different platform six years ago.

time to read 7 min | 1333 words

The first part is here. As a reminder, Sled is an embedded database engine written in Rust. It takes a very different approach to how it stores data, which I’m really excited to see. And with that, let’s be about it. I stopped in my last post when getting to the flusher, which simply sleeps and calls flush on the iobufs.

The next file is iobuf.rs.

Note that there are actually two separate things here. We have the IoBuf struct:

[screenshot: the IoBuf struct]

And we have IoBufs struct, which contains the single buffer.

[screenshot: the IoBufs struct]

Personally, I would use a different name, because it is going to be very easy to confuse them. At any rate, this looks like an important piece, so let’s see what is going on in there.

You can see that the last three fields shown compose what looks like a ring buffer implementation. Next, we have these guys:

[screenshot: two more fields, annotated with “interesting thread interleaving” comments]

Both of them look interesting / scary. I can’t tell yet what they are doing, but the “interesting thread interleaving” stuff is an indication that this is likely to be complex.

The fun starts at the start() function:

[screenshot: the start() function]

This looks like it is used in the recovery process, so whenever you are restarting the instance. The code inside is dense; take a look at this:

[screenshot: the snapshot handling expression]

This tests to see whether the snapshot provided is big enough to be considered standalone (I think). You can see that we set the next_lsn (logical sequence number) and next_lid (last id? logical id? not sure here). So far, so good. But all of that is a single expression, and it goes on for 25 lines.

The end of the else clause also returns a value, which is computed from the result of reading the file. It works, and it is elegant, but it takes a bit of time to decompose.

There is something called SegmentAccountant that I’m ignoring for now because we finally got to the good parts, where actual file I/O is being performed. Focusing on the new system startup, which is often much simpler, we have:

[screenshot: the new system startup path]

You can see that we make some writes (not sure what is being written yet) and then sync. The maybe_fail! stuff is manual injection of faults, which is really nice to see. The else portion is also interesting.

[screenshot: the else branch]

This is when we can still write to the existing snapshot, I guess. And this limits the current buffer to the buffer size. From context, lid looks like the global value across files, but I’m not sure yet.

I ran into this function next:

[screenshot: the with_sa function]

On its own, it isn’t really that interesting. I guess that I don’t like the unwrap call, but I get why it is there. It would significantly complicate the code if you had to handle failures from both the passed function and from with_sa being unable to take the lock (and that should only be the case if a thread panicked while holding the mutex, which presumably kills the whole process).

Sled calls itself alpha software, but given the amount of scaffolding that I have seen so far (event log, debug_delay, not to mention the measurements for lock durations), a lot of work has gone into making sure that you have the required facilities to support it. Looking at the code history, it is a two-year-old project, so there has been time for it to grow.

Next we have this method:

[screenshot: the encapsulate method]

What is actually going on in here is that, based on the size of the in_buf, it will write it to a separate blob (using the lsn as the file name). If the value is stored externally, it creates a message header that points to that location. Otherwise, it creates a header that encodes the length of the in_buf and adds that to the out_buf. I don’t like the fact that this method gets passed the two bool variables. The threshold decision should probably be made inside the method and not outside. The term encapsulate is also not the best choice here; write_msg_to_output would probably be more accurate.


The next interesting bit is write_to_log(), which goes on for about 200 lines. This takes the specified buffer, pads it if needed and writes it to the file. It looks like it all goes to the same file, so I wonder how it deals with space recovery, but I’ll find out later. I’m also wondering why this is called a log, because it doesn’t look like a WAL system to me at this point. It does call fsync on the file after the write, though. There is also some segment behavior here, but I’m used to seeing that on different files, not all in the same one.

It seems that this method can be called concurrently on different buffers. There is some explicit handling for that, but I have to wonder about the overall efficiency of doing so. Random I/O, and especially the issuance of parallel fsyncs, is not likely to lead to the best system performance.

[screenshot: gap tracking for out-of-order writes]

This is where the concurrent and possibly out-of-order write notifications go. It remembers all the previous notifications and, if there is a gap, waits until the gap has been filled before notifying interested parties that the data has been persisted to disk.
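The idea itself is easy to sketch outside of Sled (a generic illustration, not Sled’s code): track completions that arrive out of order, and only acknowledge up to the last contiguous one.

using System.Collections.Generic;

// Generic illustration of the gap-tracking idea (not Sled's implementation):
// writes complete out of order, but callers are only told "persisted up to X"
// once every write before X has also completed.
public class StableMarker
{
    private readonly SortedSet<long> _completed = new SortedSet<long>();
    private long _stable = -1;   // highest contiguous completed sequence number

    public long MarkCompleted(long seq)
    {
        _completed.Add(seq);

        // Advance the stable mark only while there is no gap.
        while (_completed.Contains(_stable + 1))
        {
            _completed.Remove(_stable + 1);
            _stable++;
        }
        return _stable;   // notify interested parties up to this point only
    }
}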

Because I’m reading the code in a lexical order, rather than in a more reasonable layered or functional approach, this gives me a pretty skewed understanding of the code until I actually finish it all. I actually do that on purpose, because it forces me to consider a lot more possibilities along the way. There are a lot of questions that might be answered by looking at how the code is actually being used in parts that I haven’t read yet.

At any rate, one thing to note here is that I think there is a distinct possibility that a crash after segment N+1 was written (and synced) but before segment N was done writing might undo the successful write for N+1. I’m not sure yet how this is used, so that is just something to keep in mind for now.

I managed to find this guy:

[screenshot: the flush function]

If you’ll recall, this is called on a timer to write the current in memory state.

The only other interesting thing in this file is:

[screenshot: the buffer sealing check]

This looks like it is racing with other calls that are trying to write the current buffer. I’m not so sure what sealed means. I think that this is to prevent other threads from writing to the same buffer while you might be trying to write it out.

That is enough for now, it has gotten late again, so I’ll continue this in another post.
