That No SQL ThingDocument Database Migrations

time to read 6 min | 1073 words

One of the things that I get ask frequently about No SQL is how to handle migrations. First, I want to clarify something very important:

Migrations sucks.

It doesn’t matter what you are doing, the moment that you need to be forward / backward compatible, or just update existing data to a new format, you are going to run into a lot of unpleasant situations. Chief of which is that what you thought was true about your data isn’t, and probably never was.

But leaving that aside, in the RDBMS world, we have three major steps when we migrate from one database version to another, non destructive update to the schema (add columns / tables), update the data (or move it to its new location), destructive schema update (delete columns / tables, introduce indexes, etc). With a document database, the situation is much the same, while we don’t have a mandatory schema, we do have the data in a certain shape.

I am going to use three examples, which hopefully demonstrate enough about migrations to give you the basic idea.

  • Renaming a column – Posts.Text should be renamed to Posts.Content
  • Update column value – Update tax rate in all orders after 2010-01-01
  • Change many-to-one association to many-to-many association – Posts.Category to Posts Categories

Renaming a column

Using SQL, we can do this using:

-- non destructive update
alter table Posts add Content nvarchar(max) null

-- update the data
update Posts
set Content = Text

-- destructive schema update
alter table Posts drop column Text
alter table Posts alter column Content nvarchar(max) not null

Using a Document Database, we run the following Linq query on the server (this is effectively an update):

from doc in docs
where doc.type == "post"
select new 
   type = doc.type,
   content = doc.text, // rename a column
   posted_at = doc.posted_at

Update column value

In SQL, it is pretty easy:

-- update the data
update Orders
set Tax = 0.16
where OrderDate > '2010-01-01'

And using an update Linq query:

from doc in docs
where doc.type == "order" && doc.order_date > new DateTime(2010,1,1)
select new
   tax = 0.16,
   order_date = doc.order_date,
   order_lines = doc.order_lines

Change many-to-one association to many-to-many association

Using SQL, it is pretty complex (especially when you add the FKs):

create table CategoriesPosts (PostId int, CategoryId int)

insert into CategoriesPosts (PostId, CategoryId)
select Id, CategoryId from Posts

alter table Posts drop column CategoryId

And using a doc db:

from doc in docs
where doc.type == "post"
select new 
   type = doc.type,
   content = doc.content,
   posted_at = doc.posted_at,
   categories = new [] { doc.category }

It is pretty much the same pattern over and over again :-)

More posts in "That No SQL Thing" series:

  1. (03 Jun 2010) Video
  2. (14 May 2010) Column (Family) Databases
  3. (09 May 2010) Why do I need that again?
  4. (07 May 2010) Scaling Graph Databases
  5. (06 May 2010) Graph databases
  6. (22 Apr 2010) Document Database Migrations
  7. (21 Apr 2010) Modeling Documents in a Document Database
  8. (20 Apr 2010) The relational modeling anti pattern in document databases
  9. (19 Apr 2010) Document Databases – usages