RavenDB Migrations: Rolling Updates
There are several ways to handle schema changes in RavenDB. When I talk about schema changes, I mean changing the format of documents in production. RavenDB doesn’t have a “schema”, of course, but if the previous version of your application had a Name property on a customer, and the new version has FirstName and LastName, you need some way of handling that.
Please note that in this case I am explicitly talking about a rolling migration, not something that you need to do immediately.
We will start with the following code bases:
Version 1.0:

```csharp
public class Customer
{
    public string Name { get; set; }
    public string Email { get; set; }
    public int NumberOfOrders { get; set; }
}
```

Version 2.0:

```csharp
public class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string CustomerEmail { get; set; }
    public bool PreferredCustomer { get; set; }
}
```
As I said, there are several approaches, depending on exactly what you are trying to do. Let us enumerate them in order.
Removing a property – NumberOfOrders
As you can see, NumberOfOrders was removed from v1 to v2. In this case, there is absolutely no action required of us. The next time this customer is loaded, the NumberOfOrders property will not be bound to anything; RavenDB will note that the document has changed (a property is missing) and save it without the now invalid property. It is self cleaning.
Adding a property – PreferredCustomer
In this situation, we have a new property, and we need to provide a value for it. If there isn’t a value for the property in the stored JSON, it won’t be set, which means that the default value (or the one set in the constructor) will be the one actually used. Again, RavenDB will note that the document has changed (it has an extra property) and save it with the new property. It is self healing.
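For example, if v2 should treat every customer saved before the property existed as preferred, a constructor default takes care of it (a sketch; the business rule here is an assumption for illustration):

```csharp
public class Customer
{
    public Customer()
    {
        // The constructor runs before deserialization binds values, so a
        // document that lacks the property keeps the value set here.
        PreferredCustomer = true; // assumed business default
    }

    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string CustomerEmail { get; set; }
    public bool PreferredCustomer { get; set; }
}
```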
Modifying properties – Email -> CustomerEmail, Name -> FirstName, LastName
This is where things get annoying. We can’t rely on the default behavior to resolve this. Luckily, we have extension points to help us.
```csharp
using System.Linq;
using Raven.Client.Listeners;
using Raven.Json.Linq;

public class CustomerVersion1ToVersion2Converter : IDocumentConversionListener
{
    public void EntityToDocument(object entity, RavenJObject document, RavenJObject metadata)
    {
        Customer c = entity as Customer;
        if (c == null)
            return;

        metadata["Customer-Schema-Version"] = 2;
        // Preserve the old Name property, for now.
        document["Name"] = c.FirstName + " " + c.LastName;
        document["Email"] = c.CustomerEmail;
    }

    public void DocumentToEntity(object entity, RavenJObject document, RavenJObject metadata)
    {
        Customer c = entity as Customer;
        if (c == null)
            return;
        if (metadata.Value<int>("Customer-Schema-Version") >= 2)
            return;

        var nameParts = document.Value<string>("Name").Split();
        c.FirstName = nameParts.First();
        c.LastName = nameParts.Last();
        c.CustomerEmail = document.Value<string>("Email");
    }
}
```
Using this approach, we can easily convert between the two versions, including keeping the old properties in place in case we still need to be compatible with the old schema.
Pretty neat, isn’t it?
More posts in "RavenDB Migrations" series:
- (26 Aug 2011) Rolling Updates
- (25 Aug 2011) When to execute?
Comments
Very nice :)
How would I register an IDocumentConversionListener? It would be nice to have a link to the relevant documentation.
Daniel, documentStore.RegisterListener(...)
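For reference, registration happens once, on the document store (a sketch; the URL is an assumption):

```csharp
using Raven.Client.Document;

var documentStore = new DocumentStore { Url = "http://localhost:8080" };
documentStore.Initialize();

// Registered listeners apply to every session opened from this store.
documentStore.RegisterListener(new CustomerVersion1ToVersion2Converter());
```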
Wouldn't DocumentToEntity break on a Name that only has one word? Or would we get the same word on both FirstName and LastName?
Jose, Maybe. This is code that is specific to a single use case, and as such can make a lot of assumptions.
I didn't mean to nit-pick and I understand the scope of the code above. My point is that given enough data rolling updates can be a nightmare and dangerous. But yes, RavenDB tackles it in a very elegant way.
Seems pretty straightforward for "trash" fields. But what about indices on top of changed fields? I guess if you need to, say, search by that field, you'd want your database to migrate all documents to the latest format version. What if RavenDB bundled a tool that would let you register the same converters in RavenDB and let it chew through documents in the background? :)
How would you handle the situation when a property moved to two new documents and the other way around?
Y, That is why RavenDB has support for set operations
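For readers who haven't seen them, set operations let you patch every document matched by an index on the server side; something along these lines (a sketch; the index name and patched value are assumptions):

```csharp
using Raven.Abstractions.Data;
using Raven.Json.Linq;

// Server-side, set-based patch: no documents are loaded into a session.
documentStore.DatabaseCommands.UpdateByIndex(
    "Customers/All",  // assumed index covering all customers
    new IndexQuery(), // empty query matches every document in the index
    new[]
    {
        new PatchRequest
        {
            Type = PatchCommandType.Set,
            Name = "PreferredCustomer",
            Value = new RavenJValue(false)
        }
    },
    false); // do not allow patching against a stale index
```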
Dmytrii, I don't understand the question, can you give an example?
Sure. Let's say we have a Company with Address, company number etc.
We change the model so that the company no longer has an address. Instead, the address is stored in a separate document: Branch.
And then how would you merge Branch back into Company?
Do you see what I mean?
I don't use ravendb - these are just general data migration questions
Is the support for rollback explicitly missing, i.e. is it assumed that if I have made a mistake, I code my way forward and do a new release? Assume that new data was created between the release and the discovery of the need to roll back to the last stable version, and I want to convert that data back to the old format (for which I have the code, written at the time I wrote the forward conversion). I want to roll back fast and reliably; no from-dusk-to-dawn, caffeine-powered coding sessions. What can I do?
How is the multiple changes scenario supported - there are sites I only use once every 6 months, but I'm sure they have multiple releases during that time. So when I log in, my records might be at version 1, while the current version could be 19. Is it one class per version, and each checks which version number it belongs to? Based on the snippet above, there is no framework support for that, or is there? Or one would just schedule a batch run outside the peak hours, forcing the not yet updated records to be loaded (and thus updated)?
Once you have many different types, all handlers need to be called. They can never be removed because some entities might still not be upgraded. This could become a perf problem (calling 100 listeners for every loaded object). I would have the listeners be registered for a specific type (and optionally for all types).
Hmm, the automatic nature of self healing and self cleaning could be troublesome behaviour in cases where developers didn't know better. Granted, in a perfect world everyone would be fully versed in the capabilities of their tools, paying due diligence to their changes, plus testing those changes thoroughly. But if someone is scoping out changes to a large document store, with changes across dozens of documents, under pressure, the tool isn't helping them catch situations they may have missed; it's actively hiding them.
Case in point: if you ran that scenario through without the listener, the self cleaning would erase "Email" and "Name" and add the new fields with default values, would it not?
My point is that the tool doesn't know whether you want to discard or translate old data, and it seems rather dangerous to have it pick a behaviour arbitrarily. My personal preference would be for a tool to detect such changes and require deliberate rules for the specific change. (discard, or translate.)
Do I need a listener if I move the Customer class to a different namespace?
Steve, It's just the default behavior of the Json serializer: it tries to do its best, ignoring schema differences and supplying default values for missing properties. I wouldn't call it self healing or self cleaning, because as you have said, sometimes it's just self destructing. It would be much better to have some schema validation mechanism that is an integral part of the database and that can't be easily bypassed (by not setting up the client correctly). As an example, please have a look at how it has been solved in Persevere: http://www.sitepen.com/blog/2008/11/17/evolving-schemas-with-persevere/
Dmytrii, That requires creating a separate document, probably by just replacing that with the id of the new branch. Although, I would probably do stuff like that as a one time process, since this is a pretty radical change, and not something that you can usually slip in as a gradual transformation
1) Rollback? Just the same way as the forward motion, only in reverse. Do the exact same thing, but reverse the steps. 2) You usually do this sort of thing for one version back, which means that at the next release you can do the big "check & modify" pass over the entire db, so you don't have to deal with the version from 3 versions back. 2.1) Or you can just keep all of those around and run the checks, in order, when you need them, based on the version of the entity.
Tobi, Have a MigrationStoreListener that would forward the call based on the type of the entity.
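A sketch of what that forwarding listener could look like; the MigrationStoreListener class itself is hypothetical, only IDocumentConversionListener comes from RavenDB:

```csharp
using System;
using System.Collections.Generic;
using Raven.Client.Listeners;
using Raven.Json.Linq;

public class MigrationStoreListener : IDocumentConversionListener
{
    // One converter per entity type, so a load only ever invokes the
    // single listener relevant to it instead of every registered one.
    private readonly Dictionary<Type, IDocumentConversionListener> listeners =
        new Dictionary<Type, IDocumentConversionListener>();

    public void Register<T>(IDocumentConversionListener listener)
    {
        listeners[typeof(T)] = listener;
    }

    public void EntityToDocument(object entity, RavenJObject document, RavenJObject metadata)
    {
        IDocumentConversionListener listener;
        if (listeners.TryGetValue(entity.GetType(), out listener))
            listener.EntityToDocument(entity, document, metadata);
    }

    public void DocumentToEntity(object entity, RavenJObject document, RavenJObject metadata)
    {
        IDocumentConversionListener listener;
        if (listeners.TryGetValue(entity.GetType(), out listener))
            listener.DocumentToEntity(entity, document, metadata);
    }
}
```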
Alex, No, it would resolve that automatically
Ayende, you are right.
Oren,
With the radical changes, when and how would you run the migration?
Doing it as a one-time process is not good enough. I need to be able to run this kind of migration in multiple environments.
Cheers.
Dmytrii, If you need to do that, then you don't do radical changes.
I don't argue that it is a radical change. I wonder how it would be handled with RavenDB.
For example, with SQL database I would create a migration (using Migrator.net or similar) that would change the schema accordingly and them migrate the data.
This process would be automated and easily repeatable.
Dmytrii, And you would do pretty much the same thing in RavenDB. But that isn't the scope of this post; it is about rolling updates, not point-in-time updates. For those sorts of updates, you don't do radical changes, you make things change slowly, across deployments.
That makes sense. But I was curious to see how you would do that (split radical changes into smaller ones?).
It would be amazing to see a write-up about doing this kind of stuff with RavenDB (analogy of Migrator.Net and similar).
Dmytrii, https://github.com/ayende/RaccoonBlog/tree/master/src/RaccoonBlog.Migrations
Thanks :) "Just roll your own" sounds easy enough here.
Very neat, but if you don't need a listener, is the schema version recorded anyway?
Mike, I don't understand the question
Not sure if you are still checking these comments, but I had a question.
Say you need to do a rolling update, you write the DocumentConversionListener above, and your objects start converting as you encounter them. So this is great for commonly accessed objects, but what about old/archived stuff? Say you have 1,000 Customers and 800 of them get updated in a week but 200 of them are fairly inactive. You don't want to turn off that converter until they are, but you don't want to shutdown the app just for some old data conversions either.
Do you recommend just running a script against the DB to manually convert the data and then pulling out the converter? Or is there another way.
Ryan, Yes, at that point you'll probably run a script that converts everything. The alternative is to just keep the converter in place for all time, which is also an option.
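Something like the following could serve as that script, assuming documentStore is the initialized store with the converter still registered (a sketch):

```csharp
using System.Linq;

int processed = 0;
while (true)
{
    using (var session = documentStore.OpenSession())
    {
        var customers = session.Query<Customer>()
            .Skip(processed)
            .Take(128) // arbitrary page size
            .ToList();

        if (customers.Count == 0)
            break;

        processed += customers.Count;

        // Loading ran DocumentToEntity; the session's change tracking sees
        // that the entities no longer match the stored JSON, so SaveChanges
        // writes them back in the v2 format.
        session.SaveChanges();
    }
}
```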
Great, thanks. I'm about to get involved in a project in its infancy that is currently being built on Raven, so I'm trying to familiarize myself with it more. I didn't like the idea of having to leave these converters all over the place every time the object model changed, so I just wanted to make sure there was a way to phase them out.