Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,520
|
Comments: 51,141
Privacy Policy · Terms
filter by tags archive
time to read 13 min | 2479 words

RavenDB has a hidden feature, enabled by default and not something that you usually need to be aware of. It has built-in support for caching. Consider the following code:


async Task<Dictionary<string, int>> HowMuchWorkToDo(string userId)
{
    using var session = _documentStore.OpenAsyncSession();
    var results = await session.Query<Item>()
        .GroupBy(x =>new { x.Status, x.AssignedTo })
        .Where(g => g.Key.AssignedTo == userId && g.Key.Status != "Closed")
        .Select(g => new 
        {
            Status = g.Key.Status,
            Count = g.Count()
        })
        .ToListAsync();


    return results.ToDictionary(x => x.Status, x => x.Count);
}

What happens if I call it twice with the same user? The first time, RavenDB will send the query to the server, where it will be evaluated and executed. The server will also send an ETag header with the response. The client will remember the response and its ETag in its own memory.

The next time this is called on the same user, the client will again send a request to the server. This time, however, it will also inform the server that it has a previous response to this query, with the specified ETag. The server, when realizing the client has a cached response, will do a (very cheap) check to see if the cached response matches the current state of the server. If so, it can inform the client (using 304 Not Modified) that it can use its cache.

In this way, we benefit twice:

  • First, on the server side, we avoid the need to compute the actual query.
  • Second, on the network side, we aren’t sending a full response back, just a very small notification to use the cached version.

You’ll note, however, that there is still an issue. We have to go to the server to check. That means that we still pay the network costs. So far, this feature is completely transparent to the user. It works behind the scenes to optimize server query costs and network bandwidth costs.

We have a full-blown article on caching in RavenDB if you care to know more details instead of just “it makes things work faster for me”.

Aggressive Caching in RavenDB

The next stage is to involve the user. Enter the AggressiveCache() feature (see the full documentation here), which allows the user to specify an additional aspect. Now, when the client has the value in the cache, it will skip going to the server entirely and serve the request directly from the cache.

What about cache invalidation? Instead of having the client check on each request if things have changed, we invert the process. The client asks the server to notify it when things change, and until it gets notice from the server, it can serve responses completely from the local cache.

I really love this feature, that was the Good part, now let’s talk about the other pieces:

There are only two hard things in Computer Science: cache invalidation and naming things.

-- Phil Karlton

The bad part of caching is that this introduces more complexity to the system. Consider a system with two clients that are using the same database. An update from one of them may show up at different times in each. Cache invalidation will not happen instantly, and it is possible to get into situations where the server fails to notify the client about the update, meaning that we didn’t clear the cache.

We have a good set of solutions around all of those, I think. But it is important to understand that the problem space itself is a problem.

In particular, let’s talk about dealing with the following query:


var emps = session.Query<Employee>()
    .Include(x => x.Department)
    .Where(x => x.Location.City == "London")
    .ToListAsync();

When an employee is changed on the server, it will send a notice to the client, which can evict the item from the cache, right? But what about when a department is changed?

For that matter, what happens if a new employee is added to London? How do we detect that we need to refresh this query?

There are solutions to those problems, but they are super complicated and have various failure modes that often require more computing power than actually running the query. For that reason, RavenDB uses a much simpler model. If the server notifies us about any change, we’ll mark the entire cache as suspect.

The next request will have to go to the server (again with an ETag, etc) to verify that the response hasn’t changed. Note that if the specific query results haven’t changed, we’ll get OK (304 Not Modified) from the server, and the client will use the cached response.

Conservatively aggressive approach

In other words, even when using aggressive caching, RavenDB still has to go to the server sometimes. What is the impact of this approach when you have a system under load?

We’ll still use aggressive caching, but you’ll see brief periods where we aren’t checking with the server (usually be able to cache for about a second or so), followed by queries to the server to check for any changes.

In most cases, this is what you want. We still benefit from the cache while reducing the number of remote calls by about 50%, and we don’t have to worry about missing updates. The downside is that, as application developers, we know that this particular document and query are independent, so we want to cache them until we get notice about that particular document being changed.

The default aggressive caching in RavenDB will not be of major help here, I’m afraid. But there are a few things you can do.

You can use Aggressive Caching in the NoTracking mode. In that mode, the client will not ask the server for notifications on changes, and will cache the responses in memory until they expire (clock expiration or size expiration only).

There is also a feature suggestion that calls for updating the aggressive cache in a background manner, I would love to hear more feedback on this proposal.

Another option is to take this feature higher than RavenDB directly, but still use its capabilities. Since we have a scenario where we know that we want to cache a specific set of documents and refresh the cache only when those documents are updated, let’s write it.

Here is the code:


public class RecordCache<T>
{
    private ConcurrentLru<string, T> _items = 
        new(256, StringComparer.OrdinalIgnoreCase);
    private readonly IDocumentStore _documentStore;


    public RecordCache(IDocumentStore documentStore)
    {
        const BindingFlags Flags = BindingFlags.Instance | 
            BindingFlags.NonPublic | BindingFlags.Public;
        var violation = typeof(T).GetFields(Flags)
            .FirstOrDefault(f => f.IsInitOnly is false);
        if (violation != null)
        {
            throw new InvalidOperationException(
                "You should cache *only* immutable records, but got: " + 
                typeof(T).FullName + " with " + violation.Name + 
                " which is not read only!");
        }


        var changes = documentStore.Changes();
        changes.ConnectionStatusChanged += (_, args) =>
        {
            _items = new(256, StringComparer.OrdinalIgnoreCase);
        };
        changes.ForDocumentsInCollection<T>()
            .Subscribe(e =>
            {
                _items.TryRemove(e.Id, out _);
            })
            ;
        _documentStore = documentStore;
    }


    public ValueTask<T> Get(string id)
    {
        if (_items.TryGetValue(id, out var result))
        {
            return ValueTask.FromResult(result);
        }
        return new ValueTask<T>(GetFromServer(id));


    }


    private async Task<T> GetFromServer(string id)
    {
        using var session = _documentStore.OpenAsyncSession();
        var item = await session.LoadAsync<T>(id);
        _items.Set(id, item);
        return item;
    }
}

There are a few things to note about this code. We are holding live instances, so we ensure that the values we keep are immutable records. Otherwise, we may hand the same instance to two threads which can be… fun.

Note that document IDs in RavenDB are case insensitive, so we pass the right string comparer.

Finally,  the magic happens in the constructor. We register for two important events. Whenever the connection status of the Changes() connection is modified, we clear the cache. This handles any lost updates scenarios that occurred while we were disconnected.

In practice, the subscription to events on that particular collection is where we ensure that after the server notification, we can evict the document from the cache so that the next request will load a fresh version.

Caching + Distributed Systems = 🤯🤯🤯

I’m afraid this isn’t an easy topic once you dive into the specifics and constraints we operate under. As I mentioned, I would love your feedback on the background cache refresh feature, or maybe you have better insight into other ways to address the topic.

time to read 4 min | 728 words

I got into an interesting discussion on LinkedIn about my previous post, talking about Code Rot. I was asked about Legacy Code defined as code without tests and how I reconcile code rot with having tests.

I started to reply there, but it really got out of hand and became its own post.

“To me, legacy code is simply code without tests.” Michael Feathers, Working Effectively with Legacy Code

I read Working Effectively with Legacy Code for the first time in 2005 or thereabout, I think. It left a massive impression on me and on the industry at large. The book is one of the reasons I started rigorously writing tests for my code, it got me interested in mocking and eventually led me to writing Rhino Mocks.

It is ironic that the point of this post is that I disagree with this statement by Michael because of Rhino Mocks. Let’s start with numbers, last commit to the Rhino Mocks repository was about a decade ago. It has just under 1,000 tests and code coverage that ranges between 95% - 100%.

I can modify this codebase with confidence, knowing that I will not break stuff unintentionally. The design of the code is very explicitly meant to aid in testing and the entire project was developed with a Test First mindset.

I haven’t touched the codebase in a decade (and it has been close to 15 years since I really delved into it). The code itself was written in .NET 1.1 around the 2006 timeframe. It literally predates generics in .NET.

It compiles and runs all tests when I try to run it, which is great. But it is still very much a legacy codebase.

It is a legacy codebase because changing this code is a big undertaking. This code will not run on modern systems. We need to address issues related to dynamic code generation between .NET Framework and .NET.

That in turn requires a high level of expertise and knowledge. I’m fairly certain that given enough time and effort, it is possible to do so. The problem is that this will now require me to reconstitute my understanding of the code.

The tests are going to be invaluable for actually making those changes, but the core issue is that a lot of knowledge has been lost. It will be a Project just to get it back to a normative state.

This scenario is pretty interesting because I am actually looking back at my own project. Thinking about having to do the same to a similar project from someone else’s code is an even bigger challenge.

Legacy code, in this context, means that there is a huge amount of effort required to start moving the project along. Note that if we had kept the knowledge and information within the same codebase, the same process would be far cheaper and easier.

Legacy code isn’t about the state of the codebase in my eyes, it is about the state of the team maintaining it. The team, their knowledge, and expertise, are far more important than the code itself.

An orphaned codebase, one that has no one to take care of, is a legacy project even if it has tests. Conversely, a project with no tests but with an actively knowledgeable team operating on it is not.

Note that I absolutely agree that tests are crucial regardless. The distinction that I make between legacy projects and non-legacy projects is whether we can deliver a change to the system.

Reminder: A codebase that isn’t being actively maintained and has no tests is the worst thing of all. If you are in that situation, go read Working Effectively with Legacy Code, it will be a lifesaver.

I need a feature with an ideal cost of X (time, materials, effort, cost, etc). A project with no tests but people familiar with it will be able to deliver it at a cost of 2-3X. A legacy project will need 10X or more. The second feature may still require 2X from the maintained project, but only 5X from the legacy system. However, that initial cost to get things started is the killer.

In other words, what matters here is the inertia, the ability to actually deliver updates to the system.

time to read 3 min | 481 words

A customer called us about some pretty weird-looking numbers in their system:

You’ll note that the total number of entries in the index across all the nodes does not match. Notice that node C has 1 less entry than the rest of the system.

At the same time, all the indicators are green. As far as the administrator can tell, there is no issue, except for the number discrepancy. Why is it behaving in this manner?

Well, let’s zoom out a bit. What are we actually looking at here? We are looking at the state of a particular index in a single database within a cluster of machines. When examining the index, there is no apparent problem. Indexing is running properly, after all.

The actual problem was a replication issue, which prevented replication from proceeding to the third node. When looking at the index status, you can only see that the entry count is different.

When we zoom out and look at the state of the cluster, we can see this:

There are a few things that I want to point out in this scenario. The problem here is a pretty nasty one. All nodes are alive and well, they are communicating with each other, and any simple health check you run will give good results.

However, there is a problem that prevents replication from properly flowing to node C. The actual details aren’t relevant (a bug that we fixed, to tell the complete story). The most important aspect is how RavenDB behaves in such a scenario.

The cluster detected this as a problem, marked the node as problematic, and raised the appropriate alerts. As a result of this, clients would automatically be turned away from node C and use only the healthy nodes.

From the customer’s perspective, the issue was never user-visible since the cluster isolated the problematic node. I had a hand in the design of this, and I wrote some of the relevant code. And I’m still looking at these screenshots with a big sense of accomplishment.

This stuff isn’t easy or simple. But to an outside observer, the problem started from: why am I looking at funny numbers in the index state in the admin panel? And not at: why am I serving the wrong data to my users.

The design of RavenDB is inherently paranoid. We go to a lot of trouble to ensure that even if you run into problems, even if you encounter outright bugs (as in this case), the system as a whole would know how to deal with them and either recover or work around the issue.

As you can see, live in production, it actually works and does the Right Thing for you. Thus, I can end this post by saying that this behavior makes me truly happy.

time to read 4 min | 618 words

We recently got a support request from a user in which they had the following issue:


We have an index that is using way too much disk space. We don’t need to search the entire dataset, just the most recent documents. Can we do something like this?


from d in docs.Events
where d.CreationDate >= DateTime.UtcNow.AddMonths(-3)
select new { d.CreationDate, d.Content };

The idea is that only documents from the past 3 months would be indexed, while older documents would be purged from the index but still retained.

The actual problem is that this is a full-text search index, and the actual data size required to perform a full-text search across the entire dataset is higher than just storing the documents (which can be easily compressed).

This is a great example of an XY problem. The request was to allow access to the current date during the indexing process so the index could filter out old documents. However, that is actually something that we explicitly prevent. The problem is that the current date isn’t really meaningful when we talk about indexing. The indexing time isn’t really relevant for filtering or operations, since it has no association with the actual data.

The date of a document and the time it was indexed are completely unrelated. I might update a document (and thus re-index it) whose CreationDate is far in the past. That would filter it out from the index. However, if we didn’t update the document, it would be retained indefinitely, since the filtering occurs only at indexing time.

Going back to the XY problem, what is the user trying to solve? They don’t want to index all data, but they do want to retain it forever. So how can we achieve this with RavenDB?

Data Archiving in RavenDB

One of the things we aim to do with RavenDB is ensure that we have a good fit for most common scenarios, and archiving is certainly one of them. In RavenDB 6.0 we added explicit support for Data Archiving.

When you save a document, all you need to do is add a metadata element: @archive-at and you are set. For example, take a look at the following document:


{
    "Name": "Wilman Kal",
    "Phone": "90-224 8888",
    "@metadata": {
        "@archive-at": "2024-11-01T12:00:00.000Z",
        "@collection": "Companies",
     }
}

This document is set to be archived on Nov 1st, 2024. What does that mean?

From that day on, RavenDB will automatically mark it as an archived document, meaning it will be stored in a compressed format and excluded from indexing by default.

In fact, this exact scenario is detailed in the documentation.

You can decide (on a per-index basis) whether to include archived documents in the index. This gives you a very high level of flexibility without requiring much manual effort.

In short, for this scenario, you can simply tell RavenDB when to archive the document and let RavenDB handle the rest. RavenDB will do the right thing for you.

time to read 3 min | 485 words

I was talking to a colleague about a particular problem we are trying to solve. He suggested that we solve the problem using a particular data structure from a recently published paper. As we were talking, he explained how this data structure works and how that should handle our problem.

The solution was complex and it took me a while to understand what it was trying to achieve and how it would fit our scenario. And then something clicked in my head and I said something like:

Oh, that is just epoch-based, copy-on-write B+Tree with single-producer/ concurrent-readers?

If this sounds like nonsense to you, it is fine. Those are very specific terms that we are using here. The point of such a discussion is that this sort of jargon serves a very important purpose. It allows us to talk with clarity and intent about fairly complex topics, knowing that both sides have the same understanding of what we are actually talking about.

The idea is that we can elevate the conversation and focus on the differences between what the jargon specifies and the topic at hand. This is abstraction at the logic level, where we can basically zoom out a lot of details and still keep high intent accuracy.

Being able to discuss something at this level is hugely important because we can convey complex ideas easily. Once I managed to put what he was suggesting in context that I could understand, we were able to discuss the pros and cons of this data structure for the scenario.

I do appreciate that the conversation basically stopped making sense to anyone who isn’t already well-versed in the topic as soon as we were able to (from my perspective) clearly and effectively communicate.

“When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.”

Clarity of communication is a really important aspect of software engineering. Being able to explain, hopefully in a coherent fashion, why the software is built the way it is and why the code is structured just so can be really complex. Leaning on existing knowledge and understanding can make that a lot simpler.

There is also another aspect. When using jargon like that, it is clear when you don’t know something. You can go and research it. The mere fact that you can’t understand the text tells you both that you are missing information and where you can find it.

For software, you need to consider two scenarios. Writing code today and explaining how it works to your colleagues, and looking at code that you wrote ten years ago and trying to figure out what was going on there.

In both cases, I think that this sort of approach is a really useful way to convey information.

time to read 2 min | 326 words

Take a look at this wonderful example of foresightedness (or hubris).

In a little over ten years, Let’s Encrypt root certificates are going to expire. There are already established procedures for how to handle this from other Certificate Authorities, and I assume that there will be a well-communicated plan for this in advance.

That said, I’m writing this blog post primarily because I want to put the URL in the notes for the meeting above. Because in 10 years, I’m pretty certain that I won’t be able to recall why this is such a concerning event for us.

RavenDB uses certificates for authentication, usually generated via Let’s Encrypt. Since those certificates expire every 3 months, they are continuously replaced. When we talk about trust between different RavenDB instances, that can cause a problem. If the certificate changes every 3 months, how can I trust it?

RavenDB trusts a certificate directly, as well as any later version of that certificate assuming that the leaf certificate has the same key and that they have at least one shared signer. That is to handle the scenario where you replace the intermediate certificate (you can go up to the root certificate for trust at that point).

Depending on the exact manner in which the root certificate will be replaced, we need to verify that RavenDB is properly handling this update process. This meeting is set for over a year before the due date, which should give us more than enough time to handle this.

Right now, if they are using the same key on the new root certificate, it will just work as expected. If they opt for cross-singing with another root certificate, we need to ensure that we can verify the signatures on both chains. That is hard to plan for because things change.

In short, future Oren, be sure to double-check this in time.

time to read 22 min | 4283 words

Our task today is to request (and obtain approval for) a vacation. But before we can make that request, we need to handle the challenge of building   the vacation requesting system. Along the way, I want to focus a little bit on how to deal with some of the technical issues that may arise, such as concurrency.

In most organizations, the actual details of managing employee vacations are a mess of a truly complicated series of internal policies, labor laws, and individual contracts. For this post, I’m going to ignore all of that in favor of a much simplified workflow.

An employee may Request a Vacation, which will need to be approved by their manager. For the purpose of discussion, we’ll ignore all other aspects and set out to figure out how we can create a backend for this system.

I’m going to use a relational database as the backend for now, using the following schema. Note that this is obviously a highly simplified model, ignoring many real-world requirements. But this is sufficient to talk about the actual issue.

After looking at the table structure, let’s look at the code (again, ignoring data validation, error handling, and other rather important concerns).


app.post('/api/vacations/request', async (req, res) => {
    const { employeeId, dates, reason } = req.body;


    await pgsql.query(`BEGIN TRANSACTION;`);
    const managerId = await pgsql.query(
      `SELECT manager FROM Employees WHERE id = $1;`,
      [employeeId]).rows[0].id;
    const vacReqId = await pgsql.query(
      `INSERT INTO VacationRequests (empId,approver,reason,status)
       VALUES ($1,$2,$3,'Pending') RETURNING id;`,
       [employeeId,managerId,reason]).rows[0].id;


    for(const date of date) {
        await pgsql.query(
          `INSERT INTO VacationRequestDates
           (vacReqId, date, mandatory ,notes)
           VALUES ($1, $2, $3, $4);`, 
          [vacReqId, d.date, d.mandatory, d.notes]);
    }
     
    await pgsql.query(`COMMIT;`);


    res.status(201).json({ requestId: result.rows[0].id });
});

We create a new transaction, find who the manager for the employee is, and register a new VacationRequest for the employee with all the dates for that vacation. Pretty simple and easy, right? Let’s look at the other side of this, approving a request.

Here is how a manager is able to get the vacation dates that they need to approve for their employees.


app.get('/api/vacations/approval', async (req, res) => {
  const { whoAmI } = req.body;
 
  const vacations = await pgsql.query(
    `SELECT VRD.id, VR.empId, VR.reason, VRD.date, E.name,
           VRD.mandatory, VRD.notes
    FROM VacationRequests VR
    JOIN VacationRequestDates VRD ON VR.id = VRD.vacReqId
    JOIN Employees E ON VR.empId = E.id
    WHERE VR.approver = $1 AND VR.status = 'Pending'`,
    [whoAmI]);


  res.status(200).json({ vacations });
});

As you can see, most of the code here consists of the SQL query itself. We join the three tables to find the dates that still require approval.

I’ll stop here for a second and let you look at the two previous pieces of code for a bit. I have to say, even though I’m writing this code specifically to point out the problems, I had to force myself not to delete it. There was mental pressure behind my eyes as I wrote those lines.

The issue isn’t a problem with a lack of error handling or security. I’m explicitly ignoring that for this sort of demo code. The actual problem that bugs me so much is modeling and behavior.

Let’s look at the output of the previous snippet, returning the vacation dates that we still need to approve.

idempIdnamereasondate
8483391Johnbirthday2024-08-01
8484321Janedentist2024-08-02
8484391Johnbirthday2024-08-02

We have three separate entries that we need to approve, but notice that even though two of those vacation dates belong to the same employee (and are part of the same vacation request), they can be approved separately. In fact, it is likely that the manager will decide to approve John for the 1st of August and Jane for the 2nd, denying John’s second vacation day. However, that isn’t how it works. Since the actual approval is for the entire vacation request, approving one row in the table would approve all the related dates.

When examining the model at the row level, it doesn’t really work. The fact that the data is spread over multiple tables in the database is an immaterial issue related to the impedance mismatch between the document model and the relational model.

Let’s try and see if we can structure the query in a way that would make better sense from our perspective. Here is the new query (the rest of the code remains the same as the previous snippet).


SELECT VRD.id, VR.empId, E.name, VR.reason,
    (
        SELECT json_agg(VRD)
        FROM VacationRequestDates VRD
        WHERE VR.id = VRD.vacReqId
    ) AS dates
FROM VacationRequests VR
JOIN Employees E ON VR.empId = E.id
WHERE VR.approver = $1 AND VR.status = 'Pending'

This is a little bit more complicated, and the output it gives is quite different. If we show the data in the same way as before, it is much easier to see that there is a single vacation request and that those dates are tied together.

idempIdnamereasonstatusdate
8483391JohnbirthdayPending2024-08-01and 2024-08-02

8484321JanedentistPending2024-08-02

We are going to ignore the scenario of partial approval because it doesn’t matter for the topic I’m trying to cover. Let’s discuss two other important features that we need to handle. How do we allow an employee to edit a vacation request, and how does the manager actually approve a request.

Let’s consider editing a vacation request by the employee. On the face of it, it’s pretty simple. We show the vacation request to the employee and add the following endpoint to handle the update.


app.post('/api/vacation-request/date', async (req, res) => {
  const { id, date, mandatory, notes, vacReqId } = req.body;
 
 if(id typeof == 'number') {
  await pgsql.query(
    `UPDATE VacationRequestDates
    SET date = $1, mandatory = $2, notes = $3
    WHERE id = $4`,
    [date, mandatory, notes, id]);
 }
 else {
  await pgsql.query(
    `INSERT INTO VacationRequestDates (date, mandatory, notes, vacReqId)
    VALUES ($1, $2, $3, $4)`,
    [date, mandatory, notes, vacReqId]);
 }
 
  res.status(200);
});


app.delete('/api/vacation-request/date', async (req, res) => {
  const { id } = req.query;
 
  await pgsql.query(
    `DELETE FROM VacationRequestDates WHERE id = $1`,
    [id]);


  res.status(200);
});

Again, this sort of code is like nails on board inside my head. I’ll explain why in just a bit. For now, you can see that we actually need to handle three separate scenarios for editing an existing request date, adding a new one, or deleting it. I’m now showing the code for updating the actual vacation request (such as the reason for the request) since that is pretty similar to the above snippet.

The reason that this approach bugs me so much is because it violates transaction boundaries within the solution. Let’s assume that I want to take Thursday off instead of Wednesday and add Friday as well. How would that be executed using the current API?

I would need to send a request to update the date on one row in VacationRequestDates and another to add a new one. Each one of those operations would be its own independent transaction. That means that either one can fail. While I wanted to have both Thursday and Friday off, only the request for Friday may succeed, and the edit from Wednesday to Thursday might not.

It also means that the approver may see a partial state of things, leading to an interesting problem and eventually an exploitable loophole in the system. Consider the scenario of the approver looking at vacation requests and approving them. I can arrange things so that while they are viewing the request, the employee will add additional dates. When the approver approves the request, they’ll also approve the additional dates, unknowingly.

Let’s solve the problem with the transactional updates on the vacation request and see where that takes us:


app.post('/api/vacation-request/update', async (req, res) => {
  const { varRecId, datesUpdates } = req.body;
  await pgsql.query(`BEGIN TRANSACTION;`);


  for (const { op, id, date, mandatory, notes } of datesUpdates) {
    if (op === 'delete') {
      await pgsql.query(`DELETE FROM VacationRequestDates
        WHERE id = $1;`,
        [id]);
    }
    else if (op === 'insert') {
      await pgsql.query(`INSERT INTO VacationRequestDates
        (varRecId, date, mandatory, notes)
        VALUES ($1, $2, $3, $4);`,
        [varRecId, date, mandatory, notes]);
     
    }
    else {
      await pgsql.query(`UPDATE VacationRequestDates
        SET date = $1, mandatory = $2, notes = $3
        WHERE id = $4;`,
        [date, mandatory, notes, id]);
    }
  }


  await pgsql.query(`COMMIT;`);
  res.status(200);
});

That is… a lot of code to go through. Note that I looked into Sequelize as well to see what kind of code that would produce when using an OR/M, it wasn’t meaningfully simpler.

There is a hidden bug in the code above. But you probably won’t notice it no matter how much you’ll look into it. The issue is code that isn’t there. The API code above assumes that the caller will send us all the dates for the vacation requests, but it is easy to get into a situation where we may edit the same vacation requests from both the phone and the laptop, and get partial information.

In other words, our vacation request on the database has four dates, but I just updated three of them. The last one is part of my vacation request, but since I didn’t explicitly refer to that, the code above will ignore that. The end result is probably an inconsistent state.

In other words, to reduce the impedance mismatch between my database and the way I work with the user, I leaned too much toward exposing the database to the callers. The fact that the underlying database is storing the data in multiple tables has leaked into the way I model my user interface and the wire API. That leads to a significant amount of complexity.

Let’s go back to the drawing board. Instead of trying to model the data as a set of rows that would be visually represented as a single unit, we need to actually think about a vacation request as a single unit.