Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969


Upcoming conferences

time to read 1 min | 150 words

In the wake of RavenDB 4.0 Release Candidate, you are going to be seeing quite a lot of us.

Here is the schedule for the rest of the year. In all of these conferences we are going to have a booth and demo RavenDB 4.0 live. We are going to demonstrate a distributed database on the conference network, so expect a lot of demos of the failover behavior.

I’ll be speaking at Build Stuff about Modeling in Non Relational World and Extreme Performance Architecture, as well as giving a full day workshop about RavenDB 4.0.

RavenDB 4.0: Support options

time to read 2 min | 324 words

RavenDB 4.0 is going to have a completely free community edition that you could use to run production systems. We do this with the expectation that users will go with the community edition and either will be happy there or upgrade at some point to the commercial editions.

As part of the restructuring we are doing, we intend to also significantly simplify the support model. Our current support model is per RavenDB instance with professional support costing 2,000$ per instance and production (24/7) support costing 6,000$. We got a lot of feedback on this being complex to work with. In particular, the per instance cost meant that operations would need to talk to us during redeployments in order to maintain coverage of all their RavenDB instances.

As part of the Great Simplification we do in 4.0 we also want to tackle the issue of support. As a result, with the rollout of the RavenDB 4.0 RC we are going to move to flat support costs.

  • Professional Support will cost 15% of the license cost and give you access to our support engineers with a guaranteed next business day response time.
  • Production Support will cost 30% of the license cost and give you access to the core team members with 24/7 availability.
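
To make the flat model concrete, here is a minimal sketch of the arithmetic. The 15% and 30% rates come from the list above; the function names and the 1,000$ license price are hypothetical, picked only for illustration, and are not actual RavenDB prices.

```python
from decimal import Decimal

# Flat support rates from the post, expressed as a fraction of the license cost.
SUPPORT_RATES = {
    "professional": Decimal("0.15"),
    "production": Decimal("0.30"),
}


def support_cost(license_cost, tier):
    """Support cost for a given license cost and support tier."""
    return Decimal(license_cost) * SUPPORT_RATES[tier]


def total_cost(license_cost, tier):
    """License cost plus the matching support cost."""
    return Decimal(license_cost) + support_cost(license_cost, tier)


# Hypothetical license price, for illustration only.
example_license = 1000
print("professional support:", support_cost(example_license, "professional"))
print("license + production support:", total_cost(example_license, "production"))
```

Because the support cost is derived from the license rather than from the number of instances, redeploying or adding instances under the same license does not change the number.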

This is a significant reduction in price, because we are trying to encourage more people to get support and our previous approach was unbalanced.

The community support will continue to be offered, obviously, but we have no SLA around issues raised there.

The commercial support options will only be available for the Professional and Enterprise editions.

Here is how the costs change between RavenDB 3.x and RavenDB 4.0 for production support:

Edition                                   RavenDB 3.x   RavenDB 4.0   Savings
Standard + Production Support             6,698$        5,843$        13% reduction
Enterprise 4 Cores + Production Support   9,152$        6,864$        25% reduction

Practice makes perfect

time to read 3 min | 499 words

I ran into this on Twitter:

image

There were some suggestions there to go to meetups, find a mentor, etc. Those are important, but I consider them secondary to what you need to be a good developer.

My advice:

Write code, you'll likely write crap code, but write code, and a lot of it.

Read code, you'll not understand some, but try to.

The order matters.

The only way to be a good developer is to be a bad developer first. I have a drawer full of old hard disks that contain old code, some of it goes back over 20 years. I still remember being incredibly proud of writing a full BBS system in VBScript & ASP (classic!) that didn’t use a database but rather manipulated the HTML files on disk directly, so you had what is effectively a static website that would modify itself. The impressive thing was this was a single nested switch statement that went on for thousands of lines. I somehow managed to keep it all in my head enough to be able to actually complete the project.

It would never work in practice (I didn’t have any concept of “what happens if two requests happen at the same time”) and it was never deployed, but it was code that I wrote, and that taught me what works. More importantly, it taught me what doesn’t work. That meant reading errors, figuring out how to find faults in my program, getting used to the run <----> modify cycle, etc.

I wrote web systems, gesture recognition systems that would serve as hot keys in Windows, shell extensions and a lot of random stuff. Most of it was never meant to be anything, it was just a way for me to explore. The more I wrote, the more I knew what was going on.

At that point, reading other people’s code would have done nothing for me. I wasn’t at a level that I could grasp what other people were doing. It took a long time until I was ready to actually peek into other people’s code and actually be able to make sense of it. More to the point, it took a long time until I was able to actually learn something from that, rather than just go with a targeted “what do I need to make X work”.

Having other people there to help can be very useful, but it can also be a crutch. At least initially, you need to fall down a lot to figure things out. Mostly because people have a very hard time telling you how they found the problem in your code. “It’s obvious that this is here” doesn’t give you much to learn from except possibly that you are stupid for missing the obvious. A lot of the advice that this tweet got is absolutely something that I can get behind, but I would put it significantly later in the process.

Inside RavenDB 4.0: Chapter 6 is done

time to read 1 min | 73 words

I’ve just completed writing chapter 6 (distributed RavenDB) and pushed a preview up. This puts the page count at over 200 pages so far, with another two thirds or so left.

This chapter was really hard to write, and I would really appreciate any feedback that you have on the text and on the distributed nature of RavenDB 4.0 in general. It is both very similar to 3.x and a different beast entirely.

Zombies vs. Ghosts: The great debate

time to read 2 min | 398 words

We have a feature in RavenDB that may leave behind some traces when a document is gone. The actual details aren’t really important for the story. Those traces are there for a reason, and users have a good reason to want to see them in the UI.

That meant that we needed to come up with a name for them. After a short pause, we selected Zombies, because they are the remnants of real documents that are hanging around. That seemed to mesh well with the technical terminology already in use (zombie processes, for example) and to reference the current popularity of zombies in culture (books, movies, etc.), which many of our guys enjoy.

Note that in this case, I’m specifically using the term guys to refer to our male developers. One of our female developers didn’t like the terminology. Because Zombies are creepy, and we don’t want that in our UI.

There was a discussion on the terminology we’ll use that was very interesting, because it was on clearly defined gender lines. None of the guys had any issue with the term, and that included a few that considered zombie movies to be yucky as well. All the women, on the other hand, thought (to varying degrees) that zombies isn’t the appropriate term to use.

We threw a few other ones around, such as orphans, but one of the features we wanted to have is the ability to wipe those traces, and “kill all orphans” is not something that I think would go well in our UI.

Eventually the idea to use the term ghosts was brought up, and it was liked by all. It has all the connotations desired to explain what this is (the remnants of a deleted document), but the images it evoked were Casper the Friendly Ghost and Pacman, apparently.

None of the guys thought there was a problem with zombies, but no one was particularly attached to the name either. On the other hand, we had strong opposition to the term and an alternative that made everyone happy, so we switched to that terminology.

Fun fact, I was telling my wife this story and I wasn’t able to complete the description of the debate before she suggested using the Pacman image.

Inside RavenDB 4.0 book–Chapters 4 & 5 are done

time to read 1 min | 189 words

The RavenDB 4.0 book is going really well, this week I have managed to write about 20,000 words and the current page count is at 166. At this rate, it is going to turn into a monster in terms of how big it is going to be.

The book so far covers the client API, how to model data in a document database and how to use batch processing in RavenDB with subscriptions. The full drafts are available here, and I would really appreciate any feedback you have.

Next topic is to start talking about clustering and this is going to be real fun.

I’m also looking for a technical editor for the book. Someone who can both tell me that I missed a semicolon in a code listing and tell me why my phrasing / spelling / grammar is poor (and likely all three at once and some other stuff that I don’t even know). If you know someone, or better yet, are interested and capable yourself, drop me a line.

Beginning the RavenDB 4.0 book

time to read 12 min | 2208 words

You might have seen me talking about how close we are to a RavenDB beta release. Today marked a very important step along the route to an actual release. I’ve shifted my focus. Instead of going head down in the code and pushing things forward and doing all the sort of crazy stuff that you have seen me talking about for the past year and a half, I got started on the Inside RavenDB 4.0 book.

I say started because just the rough table of contents took me almost the entire day to complete. I’m expecting that this will take the majority of my time for the next few months, which means that you’ll get all the dribs and drabs from the raw drafts as they are composed. I’m also using this as a pretty nice way to go over the entire product and see how it all comes together as a cohesive whole, instead of looking at just a single piece every time.

Given that the period of putting bugs in the code is almost over, I feel that I can safely let the rest of the team fish out all the oopsies that I managed to get in and focus on the product, rather than the code. This is the second time that I have made such a shift (and the third time I’m writing a book), and it still feels awkward. On the other hand, there is a great sense of accomplishment when you see how things just click together and all that hard work is finally real in a way that no code review or artificial scenario can replicate.

Here is what I have planned so far for the book. Your comments are welcome as always.

One of the major challenges in writing this book came in considering how to structure it. There are so many concepts that relate to one another that it can be difficult to try to understand them in isolation. We can't talk about modeling documents before we understand the kind of features that we have available for us to work with, for example. Considering this, I'm going to introduce concepts in stages.

Part I - The basics of RavenDB

Focus: Developers

This part contains a practical discussion on how to build an application using RavenDB, and we'll skip theory and concepts in favor of getting things done. This is what you'll want new hires to read before starting to work with an application using RavenDB, we'll keep the theory and the background information for the next part.

  • Chapter 2 - Zero to RavenDB - focuses on setting you up with a RavenDB instance, introduces the studio and some key concepts and walks you through the Hello World equivalent of using RavenDB by building a very simple To Do app.
  • Chapter 3 - CRUD operations - discusses the basics of connecting from your application to RavenDB and the basics that you need to know to do CRUD properly. Entities, documents, attachments, collections and queries.
  • Chapter 4 - The Client API - explores more advanced client scenarios, such as lazy requests, patching, bulk inserts, and streaming queries and using persistent subscriptions. We'll talk a bit about data modeling and working with embedded and related documents.

Part II - Ravens everywhere

Focus: Architects

This part focuses on the theory of building robust and high performance systems using RavenDB. We'll go directly to working with a cluster of RavenDB nodes on commodity hardware, discuss distribution of data and work across the cluster and how to best structure your systems to take advantage of what RavenDB brings to the table.

  • Chapter 5 - Clustering Setup - walks through the steps to bring up a cluster of RavenDB nodes and work with a clustered database. This will also discuss the high availability and load balancing features in RavenDB.
  • Chapter 6 - Clustering Deep Dive - takes you through the RavenDB clustering behavior, how it works and how both servers & clients are working together to give you a seamless distributed experience. We'll also discuss error handling and recovery in a clustered environment.
  • Chapter 7 - Integrating with the Outside World - explores using RavenDB alongside additional systems, for integrating with legacy systems, working with dedicated reporting databases, ETL processes, long running background tasks and in general how to make RavenDB fit better inside your environment.
  • Chapter 8 - Clusters Topologies - guides you through setting up several different clustering topologies and their pros and cons. This is intended to serve as a set of blueprints for architects to start from when they begin building a system.

Part III - Indexing

Focus: Developers, Architects

This part discusses how RavenDB indexes data to allow for quick retrieval of information, whether it is a single document or aggregated data spanning years. We'll cover all the different indexing methods in RavenDB and how you should use each of them in your systems to implement the features you want.

  • Chapter 9 - Simple Indexes - introduces indexes and their usage in RavenDB. Even though we have performed queries and worked with the data, we haven't actually dealt with indexes directly so far. Now is the time to lift the curtain and see how RavenDB is searching for information and what it means for your applications.
  • Chapter 11 - Full Text Search - takes a step beyond just querying the raw data and shows you how you can search your entities using powerful full text queries. We'll talk about the full text search options RavenDB provides, using analyzers to customize this for different usages and languages.
  • Chapter 12 - Complex indexes - goes beyond simple indexes and shows us how we can query over multiple collections at the same time. We will also see how we can piece together data at indexing time from related documents and have RavenDB keep the index consistent for us.
  • Chapter 13 - Map/Reduce - gets into data aggregation and how using Map/Reduce indexes allows you to get instant results over very large data sets with very little cost. Making reporting style queries cheap and accessible at any scale. Beyond simple aggregation, Map/Reduce in RavenDB also allows you to reshape the data coming from multiple sources into a single whole, regardless of complexity.
  • Chapter 14 - Facet and Dynamic Aggregation - steps beyond static aggregation provided by Map/Reduce and gives you the ability to run dynamic aggregation queries on your data, or just facet your search results to make it easier for the user to find what they are looking for.
  • Chapter 15 - Artificial Documents and Recursive Map/Reduce - guides you through using indexes to generate documents, instead of the other way around, and then using that both for normal operations and to support recursive Map/Reduce and even more complex reporting scenarios.
  • Chapter 16 - The Query Optimizer - takes you into the RavenDB query optimizer, index management and how RavenDB is treating indexes from the inside out. We'll see the kind of machinery that is running behind the scenes to get everything going so when you make a query, the results are there at once.
  • Chapter 17 - RavenDB Lucene Usage - goes into (quite gory) details about how RavenDB is making use of Lucene to implement its indexes. This is intended mostly for people who need to know what exactly is going on and how everything fits together. This is how the sausage is made.
  • Chapter 18 - Advanced Indexing Techniques - digs into some specific usages of indexes that are a bit... outside the norm. Using spatial queries to find geographical information, generating dynamic suggestions on the fly, returning highlighted results for full text search queries. All the things that you would use once in a blue moon, but when you need them you really need them.

Part IV - Operations

Focus: Operations

This part deals with running and supporting a RavenDB cluster or clusters in production. From how you spin up a new cluster to decommissioning a downed node to tracking down performance problems. We'll learn all that you need (and then a bit more) to understand what is going on with RavenDB and how to customize its behavior to fit your own environment.

  • Chapter 19 - Deploying to Production - guides you through several production deployment options, including all the gory details about how to spin up a cluster and keep it healthy and happy. We'll discuss deploying to anything from a container swarm to bare metal, the networking requirements and configuration you need, security implications and anything else that the operation teams will need to comfortably support a RavenDB cluster in that hard land called production.
  • Chapter 20 - Security - focuses solely on security. How you can control who can access which database, running an encrypted database for highly sensitive information and securing a RavenDB instance that is exposed to the wild world.
  • Chapter 21 - High Availability - brings failure to the table, repeatedly. We'll discuss how RavenDB handles failures in production, how to understand, predict and support RavenDB in keeping all of your databases accessible and high performance in the face of various errors and disasters.
  • Chapter 22 - Recovering from Disasters - covers what happens after disaster strikes. When machines melt down and go poof, or someone issues the wrong command and the data just went into the incinerator. Yep, this is where we talk about backups and restore and all the various reasons why operations consider them sacred.
  • Chapter 23 - Monitoring - covers how to monitor and support a RavenDB cluster in production. We'll see how RavenDB externalizes its own internal state and behavior for the admins to look at and how to make sense out of all of this information.
  • Chapter 24 - Tracking Performance - gets into why a particular query or a node isn't performing up to spec. We'll discuss how one would track down such an issue and find the root cause responsible for such a behavior, a few common reasons why such things happen and how to avoid or resolve them.

Part V - Implementation Details

Focus: RavenDB Core Team, RavenDB Support Engineers, Developers who want to read the code

This part is the orientation guide that we throw at new hires when we sit them in front of the code. It is full of implementation details and background information that you probably don't need if all you want to know is how to build an awesome system on top of RavenDB.

On the other hand, if you want to go through the code and understand why RavenDB is doing something in a particular way, this part will likely answer all those questions.

  • Chapter 25 - The Blittable Format - gets into the details of how RavenDB represents JSON documents internally, how we got to this particular format and how to work with it.
  • Chapter 26 - The Voron Storage Engine - breaks down the low-level storage engine we use to put bits on the disk. We'll walk through how it works, the various features it offers and most importantly, why it ended up this way. A lot of the discussion is going to be driven by performance considerations, extremely low-level and full of operating system and hardware minutiae.
  • Chapter 27 - The Transaction Merger - builds on top of Voron and comprises one of the major ways in which RavenDB is able to provide high performance. We'll discuss how it came about, how it is actually used and what it means in terms of actual code using it.
  • Chapter 28 - The Rachis Consensus - talks about how RavenDB is using the Raft consensus protocol to connect together different nodes in the cluster, how they are interacting with each other and the internal details of how it all comes together (and falls apart and recovers again).
  • Chapter 29 - Cluster State Machine - brings the discussion one step higher by talking about how RavenDB uses the result of the distributed consensus to actually manage all the nodes in the cluster and how we can arrive independently on each node at the same decision reliably.
  • Chapter 30 - Lording over Databases - peeks inside a single node and explores how a database is managed inside that node. More importantly, how we are dealing with multiple databases on the same node and what kind of impact each database can have on its neighbors.
  • Chapter 31 - Replication - dives into the details of how RavenDB manages a multi-master distributed database. We'll go over how change vectors ensure conflict detection (and aid in its resolution) and how the data is actually being replicated between the different nodes in a database group.
  • Chapter 32 - Internal Architecture - gives you the overall view of the internal architecture of RavenDB. How it is built from the inside, and the reasoning why the pieces came together in the way they did. We'll cover both high-level architecture concepts and micro architecture of the common building blocks in the project.

Part VI - Parting

This part summarizes the entire book and provides some insight about what our future vision for RavenDB is.

  • Chapter 33 - What comes next - discusses our (rough) plans for the next major version and our basic roadmap for RavenDB.
  • Chapter 34 - Are we there yet? Yes! - summarizes the book and lets you go and start actually using all of this awesome information.

Conferences schedule

time to read 1 min | 88 words


I got the schedule for the upcoming conferences, and realized that I haven’t actually been talking about the conferences we go to, which is a shame, because that is a lot.
