What is new in RavenDB 3.0: Query diagnostics

We talked a lot about the changes we made for indexing, so now let us talk about the changes on the query side of things. More precisely, this is when we start asking questions about our queries.

Timing queries. While it is rare that we have slow queries in RavenDB, it does happen, and when it does, we treat it very seriously. However, in the last few cases that we have seen, the actual problem wasn’t with RavenDB, it was with sending the data back to the client when we had a large result set and large number of documents.

In RavenDB 3.0, we have added the ability to get detailed statistics about what is the cost of the query in every stage of the pipeline.

RavenQueryStatistics stats;
var users = session.Query<Order>("Orders/Totals")
    .Statistics(out stats)
    .Customize(x => x.ShowTimings())
    .Where(x=>x.Company == "companies/11" || x.Employee == "employees/2")
    .ToList();

foreach (var kvp in stats.TimingsInMilliseconds)
{
    Console.WriteLine(kvp.Key + ": " + kvp.Value);
}

Console.WriteLine("Total: " + stats.DurationMilliseconds);

The output gives us the cost of each stage of the query:

  • Lucene search: 10
  • Loading documents: 2
  • Transforming results: 0
  • Total: 21

As you can see, the total time for this query is 21 ms, of which 12 ms are accounted for by the stages above; the rest is overhead such as network traffic. This can help you diagnose more easily where the problem is, and hence how to solve it.

Query timeout and cancellation. As I mentioned, we don’t really have long queries in RavenDB very often. But it does happen, and we need a way to deal with it. RavenDB now places a timeout on the amount of time a query gets to run (including querying Lucene, loading documents and transforming the results). A query that doesn’t complete in time will be cancelled, and an error will be returned to the user.

You can also view the currently executing queries and kill a long running query (if you have specified a high timeout, for example).

Explaining queries. Sometimes it is easy to understand why RavenDB has decided to give you documents in a certain order. You asked for them sorted by date, and you get them sorted by date. But when you are talking about complex queries, that is much harder. RavenDB will sort the results by default based on relevancy, and that can sometimes be a bit puzzling to understand.

Here is how we can do this:

RavenQueryStatistics stats;
session.Advanced.DocumentQuery<Order>("Orders/Totals")
    .Statistics(out stats)
    .WhereEquals("Company", "companies/11")
    .WhereEquals("Employee", "employees/3")
    .ExplainScores()
    .ToList();

var explanation = stats.ScoreExplantaions["orders/759"];

The result of this would be something that looks like this:

0.6807194 = (MATCH) product of:
  1.361439 = (MATCH) sum of:
    1.361439 = (MATCH) weight(Employee:employees/3 in 469), product of:
      0.4744689 = queryWeight(Employee:employees/3), product of:
        2.869395 = idf(docFreq=127, maxDocs=830)
        0.165355 = queryNorm
      2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 469), product of:
        1 = tf(termFreq(Employee:employees/3)=1)
        2.869395 = idf(docFreq=127, maxDocs=830)
        1 = fieldNorm(field=Employee, doc=469)
  0.5 = coord(1/2)

And if we were to ask for the explanation for orders/237, we will get:

6.047595 = (MATCH) sum of:
  4.686156 = (MATCH) weight(Company:companies/11 in 236), product of:
    0.8802723 = queryWeight(Company:companies/11), product of:
      5.32353 = idf(docFreq=10, maxDocs=830)
      0.165355 = queryNorm
    5.32353 = (MATCH) fieldWeight(Company:companies/11 in 236), product of:
      1 = tf(termFreq(Company:companies/11)=1)
      5.32353 = idf(docFreq=10, maxDocs=830)
      1 = fieldNorm(field=Company, doc=236)
  1.361439 = (MATCH) weight(Employee:employees/3 in 236), product of:
    0.4744689 = queryWeight(Employee:employees/3), product of:
      2.869395 = idf(docFreq=127, maxDocs=830)
      0.165355 = queryNorm
    2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 236), product of:
      1 = tf(termFreq(Employee:employees/3)=1)
      2.869395 = idf(docFreq=127, maxDocs=830)
      1 = fieldNorm(field=Employee, doc=236)

In other words, we can see that orders/237 is ranked much higher than orders/759. That is because it matched both clauses of the query. And a match on Company is a much stronger indication of relevancy, because companies/11 appears in only 10 documents out of 830, while employees/3 appears in 127 out of 830.

For details about this format, see this presentation. It actually talks about Solr, but the data comes from Lucene, so it applies to both.

That is it for query diagnostics. Next, we’ll deal with transformers and another important optimization, the staleness reduction system.


What is new in RavenDB 3.0: Indexing enhancements


We talked previously about the kind of improvements we have in RavenDB 3.0 for the indexing backend. In this post, I want to go over a few features that are much more visible.

Attachment indexing. This is a feature that I am not so hot about, mostly because we want to move all attachment usage to RavenFS. But in the meantime, you can reference the contents of an attachment during indexing. That lets you do things like store large text data in an attachment, but still make it available to the indexes. That said, there is no tracking of the attachment, so if it changes, the document that referred to it won’t be re-indexed. But for the common case where both the attachment and the document are always changed together, that can be a pretty nice thing to have.

Optimized new index creation. In RavenDB 2.5, creating a new index would force us to go over all of the documents in the database, not just the documents in that collection. In many cases, that surprised users, because they expected some sort of physical separation between the collections. In RavenDB 3.0, creating a new index on a small collection (by default, fewer than 131,072 items) will only touch the documents that belong to the collections covered by that index. This alone represents a pretty significant change in the way we process indexes.

In practice, this means that creating a new index on a small collection completes much more rapidly. For example, I reset an index on a production instance; it covers about 7,583 documents out of 19,191. RavenDB was able to index those in just 690 ms, out of the roughly 3 seconds the entire index reset took.

What about the cases where we have new indexes on large collections? In 2.5, we would do round robin indexing between the new index and the existing ones. The problem was that 2.5 was biased toward the new index. That meant that it was busy indexing the new stuff, while the existing indexes (which you are actually using) took longer to run. Another problem was that in 2.5 creating a new index would effectively poison a lot of the performance heuristics, which were built for the assumption that all indexes run pretty much in tandem. When one or more of them weren’t doing so… well, that caused things to be more expensive.

In 3.0, we have changed how this works. We have separate performance optimization pipelines for each group of indexes, based on their rough indexing position. That lets us take advantage of batching many indexes together. We are also no longer going to interleave the indexes (running the new index first and then the existing ones). Instead, we run all of them in parallel, to reduce stalls and bring everything up to speed faster.

This uses our scheduling engine to ensure that we aren’t actually overloading the machine with computation work (concurrent indexing) or memory (number of items to index at once). I’m very proud of what we have done here, and even though this is actually a backend feature, it is too important to get lost in the minutiae of all the other backend indexing changes we talked about in my previous post.

Explicit Cartesian/fanout indexing. A Cartesian index (we usually call them fanout indexes) is an index that outputs multiple index entries per document. Here is an example of such an index:

from postComment in docs.PostComments
from comment in postComment.Comments
where comment.IsSpam == false
select new {
    CreatedAt = comment.CreatedAt,
    CommentId = comment.Id,
    PostCommentsId = postComment.__document_id,
    PostId = postComment.Post.Id,
    PostPublishAt = postComment.Post.PublishAt
}

For a large post, with a lot of comments, we are going to get an entry per comment. That means that a single document can generate hundreds of index entries.  Now, in this case, that is actually what I want, so that is fine.

But there is a problem here. RavenDB has no way of knowing upfront how many index entries a document will generate. That means it is very hard to allocate the appropriate amount of memory reserves for this, and it is possible to get into situations where we simply run out of memory. In RavenDB 3.0, we have added explicit instructions for this. An index has a budget; by default, each document is allowed to output up to 15 entries. If it tries to output more than 15 entries, indexing of that document is aborted, and it won’t be indexed by this index.

You can override this option either globally, or on an index by index basis, to increase the number of index entries per document that are allowed for an index (and old indexes will have a limit of 16,384 items, to avoid breaking existing indexes).

The reason this is done is that either you didn’t specify a value, in which case we are limited to the default 15 index entries per document, or you did specify what you believe is the maximum number of index entries outputted per document, in which case we can take advantage of that when doing capacity planning for memory during indexing.
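
For illustration, here is roughly how a per-index override might look when deploying an index definition from the client. This is only a sketch: the MaxIndexOutputsPerDocument property name is my assumption about how the budget is exposed, and the index itself just mirrors the fanout example above.

store.DatabaseCommands.PutIndex("PostComments/CreationDates", new IndexDefinition
{
    Map = @"from postComment in docs.PostComments
            from comment in postComment.Comments
            where comment.IsSpam == false
            select new { comment.CreatedAt, CommentId = comment.Id }",

    // raise the fanout budget for this index only (assumed property name)
    MaxIndexOutputsPerDocument = 64
});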

Simpler auto indexes. This feature is closely related to the previous one. Let us say that we want to find all users that have an admin role and have an unexpired credit card. We do that using the following query:

var q = from u in session.Query<User>()
        where u.Roles.Any(x=>x.Name == "Admin") && u.CreditCards.Any(x=>x.Expired == false)
        select u;

In RavenDB 2.5, we would generate the following index to answer this query:

from doc in docs.Users
from docCreditCardsItem in ((IEnumerable<dynamic>)doc.CreditCards).DefaultIfEmpty()
from docRolesItem in ((IEnumerable<dynamic>)doc.Roles).DefaultIfEmpty()
select new {
    CreditCards_Expired = docCreditCardsItem.Expired,
    Roles_Name = docRolesItem.Name
}

And in RavenDB 3.0 we generate this:

from doc in docs.Users
select new {
    CreditCards_Expired = (
        from docCreditCardsItem in ((IEnumerable<dynamic>)doc.CreditCards).DefaultIfEmpty()
        select docCreditCardsItem.Expired).ToArray(),
    Roles_Name = (
        from docRolesItem in ((IEnumerable<dynamic>)doc.Roles).DefaultIfEmpty()
        select docRolesItem.Name).ToArray()
}

Note the difference between the two. The 2.5 index would generate multiple index entries per document, while the RavenDB 3.0 one generates just one. What is worse is that 2.5 would generate a Cartesian product, so the number of index entries outputted in 2.5 would be the number of roles for a user times the number of credit cards they have. In RavenDB 3.0, we have just one entry, and the overall cost is much reduced. It was a big change, but I think it was well worth it, considering the alternative.

In my next post, I’ll talk about the other side of indexing, queries. Hang on, we still have a lot to go through.


What is new in RavenDB 3.0: Indexing backend


RavenDB indexes are one of the things that make it more than a simple key/value store. They are incredibly important for us. And like many other pieces in 3.0, they have had a lot of work done, and now they are sparkling & shiny. In this post I’m going to talk about a lot of the backend changes that were made to indexing, making it faster, more reliable and better all around. In the next post, I’ll discuss the actual user visible features we added.

Indexing to memory. Time and time again we have seen that actually hitting the disk is a good way of saying “Sayonara, perf”. In 2.5, we introduced the notion of building new indexes in RAM only, to speed up the I/O for new index creation. With 3.0, we have taken this further and indexes no longer go to disk as often. Instead, we index to an in memory buffer, and only write to disk once we hit a size/time/work limit.

At that point, we’ll flush those indexes to disk, and continue to buffer in memory. The idea is to reduce the amount of disk I/O that we do, as well as batch it to reduce the amount of time we spend hitting the disk. Under high load, this can dramatically improve performance, because we are so often I/O bound. In fact, for the common load spike scenario, we never have to hit the disk for indexing at all.

Async index deletion. Indexes in RavenDB are composed of the actual indexed data, as well as the relevant metadata about the index. Usually, the metadata is much smaller than the actual data. But the metadata of a map/reduce index includes all of the intermediary steps in the process, so we can do incremental map/reduce. If you use LoadDocument, we also need to maintain the document references, and on large databases, that can take a lot of space. As a result, index deletion could take a long time in RavenDB 2.5.

With RavenDB 3.0, we now do async index deletions. Deleting the index is a very short operation that frees up the index name, while the actual cleanup work happens in the background. And yes, you can restart the database midway through, and it will resume the async index deletion. The immediate result is that it is much easier to manage indexes, because you can delete a big index without having to wait for the cleanup to complete. This most commonly showed up when updating an index definition.

Index ids instead of names. Because we had to break the 1:1 association between an index and its name, we moved to an internal representation of indexes using numeric ids. That was pretty much forced on us, because we had to distinguish between the old Users/Search index (which is now in the process of being deleted) and the new Users/Search index (which is now in the process of being indexed).

A happy side effect is that we now have a more efficient internal structure for working with indexes in general. That speeds up writes and reads and reduces the overall disk space that is used.

Interleaved indexing & task execution. RavenDB uses the term task to refer to cleanup jobs (mostly) that run on the indexes. For example, removing deleted index entries, or re-indexing referencing documents when the referenced document has changed. In 2.5, it was possible to get a lot of tasks in the queue, which would stall indexing. For example, mass deletes were one common scenario where the task queue would fill up and take a while to drain. In RavenDB 3.0 we have changed things so a big task queue won’t impact indexing to that extent. Instead, we interleave indexing and task execution, so we won’t stall the indexing process.

Large documents indexing. RavenDB doesn’t place arbitrary limits on the size of documents. That is great as a user, but it poses a whole set of challenges for RavenDB when we need to index. Assume that we want to introduce a new index over a relatively large collection of documents. We will start indexing the collection, realize that there is a lot to index, and grow the number of items to be indexed in each batch. The problem is that it is pretty common that as the system grows, so do the documents. So a lot of the documents that were updated later on will be bigger than those we had initially. That can get to a point where we are trying to fetch a batch of 128K documents, but each of them is 250KB in size. That requires us to load 31 GB of documents to index in a single batch.

That might take a while, even if we are talking just about reading them from disk, leaving aside the cost in memory. This was made worse because in many cases, users with that much data would go for the compression bundle. So when RavenDB tried to figure out the size of the documents (by checking their on-disk size), it would get the compressed size. Hilarity did not ensue as a result, I tell you that. In 3.0, we are better prepared to handle such scenarios; we know to count the size of the document in memory, and we are more proactive about limiting the amount of memory we’ll use per batch.

I/O bounded batching. A core scenario for RavenDB is running on the cloud platforms. We have customers running us on anything from i2.8xlarge EC2 instances (32 cores, 244GB RAM, 8 x 800 GB SSD drives) to A0 Azure instances (shared CPU, 768 MB RAM, let us not talk about the disks, lest I cry). We actually got customer complaints of “Why isn’t RavenDB using all of the resources available to it”, because we could handle the load in about 1/4 of the resources that the machine in question had available. Their capacity calculations were based on another database product, and RavenDB works very differently, so while they had no performance complaints, they were kind of upset that we weren’t using more CPU/RAM – they paid for those, so we should use them.

Funny as that might seem, the other side is not much better. The low end cloud machines are slow, and starved of just about any resource that you can name. In particular, I/O rates for the low end cloud machines are pretty much abysmal. If you created a new index on an existing database on one of those machines, it would turn out that a lot of what you did was just wait for I/O. The problem would actually grow worse over time. We were loading a small batch (it took half a second to load from disk) and indexing it. Then we loaded another batch, and another, etc. As we realized that we had a lot to go through, we increased the batch size. But the time we spent waiting for disk grew higher and higher. From the admin perspective, it would appear as if we were hanging, and not actually doing any indexing.

In RavenDB 3.0, we are now bounding the I/O costs. We’ll try to load a batch for a while, but if we can’t get enough in a reasonable time frame, we’ll just give you what we already have, let you index that, and continue the I/O operation in a background thread. The idea is that by the time you are done indexing, you have another batch of documents available for you. Combined with the batched index writes, that gave us a hell of a perceived performance boost (oh, it is indexing all the time, and I can see that it is doing a lot of I/O and CPU, so everything is good) and real actual gains (we were able to better parallelize the load process this way).

 

Summary – pretty much all of those changes are not something that you’ll really see. These are all engine changes that happen behind the scenes. But all of them work together to give you a smoother, faster and overall better experience.


What is new in RavenDB 3.0: Simplicity

I’m not sure that there is a better word to describe it. We have a sign in the office, 3 feet by 6 feet that says: Reduce Friction. And that is something that we tried very hard to do.

Under simplicity we aggregate everything that might aggravate you, and what we did to reduce that.

That includes things like reducing the number of files and assemblies we ship. Compare the 2.5 output:

[screenshot: the 2.5 output folder]

To the 3.0 output:

[screenshot: the 3.0 output folder]

We did that by removing a lot of dependencies that we could do without, and internalizing a lot of the other stuff.

We went over the command line interface of the tooling we use and upgraded it. For example, restoring via the command line is now split into restoring a system database (an offline operation for the entire server) and restoring a regular database (the server stays fully online, and other databases can keep running during this time).

In secret and without telling anyone, we have also doubled the amount of parallel work that RavenDB can do. Previously, if you purchased a standard license, you were limited to 6 concurrent indexing tasks, for example. In RavenDB 3.0, the standard license still has 6 cores capacity, but it will allow up to 12 concurrent indexing tasks. If you have a 32-core Enterprise license, that would mean 64 concurrent indexing tasks, and you can follow the logic from there, I assume.

We have also dropped the Raven.Client.Embedded assembly. It isn’t necessary. The full functionality is still available, of course; it was just moved to the Raven.Database assembly. That reduces the number of dlls that you have to work with and manage.

You probably don’t care much, but we have done a lot of refactoring on the internals of RavenDB. The DocumentDatabase class (the core functionality in RavenDB) was broken up into many separate classes, and on the client side, we have done much the same to the DocumentStore class. We have also combined several listeners together, so now you don’t have to deal with Extended Conversion Listeners.

In terms of storage, obviously Voron gives us a huge boost, and it is designed to be a zero admin system that self optimizes. But we changed things on top of that as well. Previously, we were storing the data as BSON on disk. That decision had a lot to do with serialization costs and the size on disk. However, it created issues when we had to deal with the storage at a low level. So now RavenDB stores the data in text JSON format all the way through. And yes, it will seamlessly convert from BSON to JSON when you update documents; you don’t have to do anything to get it working. We ran extensive performance testing here, and it turned out that we were able to reduce the cost of writing by moving to a textual format.

Another small annoyance with RavenDB was the use of in memory databases. Those are usually used for testing, but we also have a number of clients that use them for production data, usually as a high throughput first responder, with replication to / from backend systems to ensure durability. Previously, you had to manually tell RavenDB that it shouldn’t tear down those databases when they went idle. Now we won’t tear down an in memory database even if it hasn’t done anything for a long while.

Another common issue was people adding / removing bundles on the fly. This isn’t supported, and it can cause problems, because it usually works, but not always. We made the process for doing that a bit more involved, and it now requires an explicit acknowledgment that you are doing something that might be unsafe.

Users sometimes have proxies / man in the middle services that manipulate the HTTP headers. A common example of that is New Relic. Since RavenDB uses HTTP headers to pass the document metadata, that caused issues. By now, we have pretty much filtered out all the common offenders, but since that always required us to make a new release, it had a prohibitive cost for the users. Instead, we now allow you to customize the list of headers that the server will ignore on the fly.

We did a lot for indexes in 3.0, but one of the changes is both simple and meaningful. We gave you the ability to ask whether your current index definition matches the one on the server. That is important during deployments, because you can check if an index is up to date or not, and then decide if you need to do an online index rebuild, schedule it for a later time with less load, or just move on because everything is the same.
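
A rough sketch of what that check might look like from the client; the IndexHasChanged call and the Orders_Totals index class here are assumptions for illustration, not necessarily the exact API shape:

var definition = new Orders_Totals().CreateIndexDefinition();

// true means the definition on the server differs, so deploying it would trigger a rebuild
bool changed = store.DatabaseCommands.IndexHasChanged("Orders/Totals", definition);
if (changed == false)
{
    // nothing to deploy, the server already has this exact definition
}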

In RavenDB 3.0, we have deprecated attachments. They still work (but will be removed in the next major release), but you are expected to use RavenFS for binary storage. RavenDB now comes with a migration tool to move all attachments from a RavenDB database to a RavenFS file system.

As I said, those are small things; none of them rises to the level of a major feature on its own. But in aggregate (and I mentioned just the top of a very big list) they represent a significant reduction in the amount of friction that you have to deal with when using RavenDB.


What is new in RavenDB 3.0: JVM Client API

RavenDB has always been accessible from other platforms. We have users using RavenDB from Python and Node.JS, we also have users using Ruby & PHP, although there isn’t a publicly available resource for that.

With RavenDB 3.0, we are releasing an official Java Client API for RavenDB. Using it is pretty simple if you are familiar with the RavenDB API or the Hibernate API.

We start by creating the document store:

IDocumentStore store = new DocumentStore(ravenDbUrl, "todo-db");
store.initialize();
store.executeIndex(new TodoByTitleIndex());

Note that we have a compiled index here as well, which looks like this:

public class TodoByTitleIndex extends AbstractIndexCreationTask {

    public TodoByTitleIndex() {
        map = "from t in docs.todos select new { t.Title, t.CreationDate } ";
        QTodo t = QTodo.todo;
        index(t.title, FieldIndexing.ANALYZED);
    }
}

Since Java doesn’t have Linq, we use Querydsl to handle queries. The index syntax is still Linq on the server side, though.

That is enough about setup; let us see how we can actually work with this. Here is us doing a search:

@Override
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {

    ServletContext context = request.getSession().getServletContext();
    IDocumentStore store = (IDocumentStore) context.getAttribute(ContextManager.DOCUMENT_STORE);

    String searchText = request.getParameter("search");

    try (IDocumentSession session = store.openSession()) {
        QTodo t = QTodo.todo;

        IRavenQueryable<Todo> query = session.query(Todo.class, TodoByTitleIndex.class)
            .orderBy(t.creationDate.asc());

        if (StringUtils.isNotBlank(searchText)) {
            query = query.where(t.title.eq(searchText));
        }

        List<Todo> todosList = query.toList();

        response.getWriter().write(RavenJArray.fromObject(todosList).toString());
        response.getWriter().close();

    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Basically, we get the store from the context, and then open a session. You can see the fluent query API, and how we work with sessions.

A more interesting example, albeit simpler, is how we write:

@Override
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException,
    IOException {
    ServletContext context = request.getSession().getServletContext();
    IDocumentStore store = (IDocumentStore) context.getAttribute(ContextManager.DOCUMENT_STORE);

    try (IDocumentSession session = store.openSession()) {
        Todo todo = new Todo(request.getParameter("title"));
        session.store(todo);
        session.saveChanges();

    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

This API should be instantly familiar for any RavenDB or Hibernate users. As for the actual entity definition, here it goes:

@QueryEntity
public class Todo {

    private Boolean completed;
    private Date creationDate;
    private Integer id;
    private String title;

    public Todo() {
        super();
    }

    public Todo(final String title) {
        super();
        this.title = title;
        this.creationDate = new Date(System.currentTimeMillis());
    }

    public Boolean getCompleted() {
        return completed;
    }

    public Date getCreationDate() {
        return creationDate;
    }

    public Integer getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }

    public void setCompleted(Boolean completed) {
        this.completed = completed;
    }

    public void setCreationDate(Date creationDate) {
        this.creationDate = creationDate;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    @Override
    public String toString() {
        return "Todo [title=" + title + ", completed =" + completed + ", creationDate=" + creationDate + ", id=" + id
            + "]";
    }
}
Except for the @QueryEntity annotation, it is a plain old Java class.
 
And of course, I’m using the term Java Client API, but this is actually available for any JVM language. Including Groovy, Scala and Clojure*.
 
* There is actually a dedicated RavenDB client for Clojure, written by Mark Woodhall.

RavenDB Days this week!

It is coming, and soon.

 

I’m going to talk about what is new in RavenDB 3.0 (like the posts you have seen so far, a lot has changed). Manuel will present a case study of building high scale systems, then Michael will give you a guided tour through the RavenDB internals.

After lunch, we have Mauro talking about indexes and transformers, then I’m going to talk about the additional database types that are coming to RavenDB, and finally Judah is going to show how RavenDB speeds up your development time by a factor of two.

You can register for it here.


What is new in RavenDB 3.0: Client side

While most of the time, when a user is working with RavenDB, they do so via the browser, developers spend most of their time in code, and when working with RavenDB, that means working with the RavenDB Client API. We have spent quite a bit of time on that as well, as you can imagine. Here are a few of the highlights.

Fully async, all the way through. Communication with RavenDB is done against a remote server, which means that a lot of time is spent just doing I/O. Recognizing this, we now take full advantage of the async APIs available to us, and the entirety of the RavenDB 3.0 Client API is async. Previously, we had an async and a sync version of the stack, which caused compatibility issues; some stuff worked for sync and not for async, or vice versa. With 3.0, we now have a single code path, and it is all async. For the sync API, we are just wrapping the async API. That gives us a single code path, and fully async I/O throughout, unless you choose to use the sync API, at which point we’ll wait, per your request.
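
For example, the async session, which is now the primary code path, is used like this (a trivial sketch; the Order class and the document id are just placeholders):

using (var session = store.OpenAsyncSession())
{
    var order = await session.LoadAsync<Order>("orders/1");
    order.Company = "companies/11";
    await session.SaveChangesAsync();
}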

Lazy async is fully supported now. As part of the fully async part, we now have support for lazy operations on the full async session.

Lazy timings. As long as we are speaking lazily, a frequent request was to be able to get the actual time that the lazy request spent on the server side. That was actually interesting, because when we first implemented this feature, we thought we had a bug. The total time for the request was lower than the time for each individual request. Then I realized that we are actually running the requests in parallel inside the server, so that made sense.

Embedded / remote client unification. Similar to the previous issue, embedded clients had a totally different code path than remote clients. That caused issues because sometimes you had something that would work embedded but not remote. And that sucked. We changed things so now everything works through the same (async!) pipeline. That means that you can do async calls on embedded databases, and that as far as RavenDB is concerned, there really is no difference if you are calling it from the same process or remotely.
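
For instance, something like the following sketch now works; it assumes the EmbeddableDocumentStore that lives in the Raven.Database assembly in 3.0, plus a placeholder Order class:

using (var store = new EmbeddableDocumentStore { RunInMemory = true }.Initialize())
using (var session = store.OpenAsyncSession())
{
    // the embedded store goes through the same async pipeline as a remote one
    await session.StoreAsync(new Order { Company = "companies/11" });
    await session.SaveChangesAsync();
}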

Admin operation split was done so we won’t have a developer accidentally try to use an operation that requires admin privileges. Now all of the privileged actions are under the cmds.Admin or cmds.GlobalAdmin properties. That gives you a good indication of the permission level you need for an operation. Either you just need access to the database (the obvious case), or you are a database admin (able to do things such as take backups, start / stop indexing, etc.), or you are a server admin, and can create or delete databases, change permissions, etc.
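
Roughly, it looks like this; a sketch only, and the exact method names on the admin interfaces may differ from what is shown here:

var cmds = store.DatabaseCommands;

// database admin level, e.g. stopping and starting indexing
cmds.Admin.StopIndexing();
cmds.Admin.StartIndexing();

// server admin level, e.g. creating a new database
cmds.GlobalAdmin.CreateDatabase(new DatabaseDocument
{
    Id = "Northwind",
    Settings = new Dictionary<string, string>
    {
        { "Raven/DataDir", "~/Databases/Northwind" }
    }
});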

Bulk insert errors are now handled much better. Previously we had to wait until the operation completed, which could be quite a while; now we detect and report errors immediately.

Bulk insert change detection. Users often use bulk insert to manage ETL processes. It is very common to have a nightly process that runs over the data and writes large portions of it. With 3.0, we have the ability to specify that the data might actually be the same, so the bulk insert process will check, and if the stored value and the new value are the same, the server will skip writing that document. In this way, we won’t have to re-index this document, re-replicate it, etc.
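
A sketch of what opting in might look like on the client; the SkipOverwriteIfUnchanged option name and the ordersFromEtl collection are assumptions for illustration:

using (var bulkInsert = store.BulkInsert(options: new BulkInsertOptions
{
    // assumed option: unchanged documents are not rewritten, re-indexed or re-replicated
    SkipOverwriteIfUnchanged = true
}))
{
    foreach (var order in ordersFromEtl)
        bulkInsert.Store(order);
}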

Reduced client side allocations. RavenDB users often run their software for months at a time; that means that anything we can do to help reduce resource consumption is good. We have changed how we work internally to reduce the number of allocations that we are doing, and in particular to reduce the number of large object heap allocations. This tends to reduce the overall resource cost significantly over time.

What changes? In RavenDB, you could ask the session if there have been any changes (i.e., if a call to SaveChanges() will go to the server), or if a particular entity has changes. Honestly, we added that feature strictly for our own testing; we didn’t think it would actually be used. Over time, we saw that users often had questions of the form: “I loaded the document, made no changes, but it is still saying that it changed!” The answer to those questions ranged from “You have a computed Age field that uses the current time to calculate itself” to “you added/removed a property, and now it needs to be saved.” Sometimes it was our change detection code that had a false positive as well, but any time this happened, it required a support call to resolve. Now, you can call session.Advanced.WhatChanged(), which will give you a dictionary with full details on all the changes that happened in the session.
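
Something along these lines; a sketch, assuming WhatChanged returns a map from document id to an array of change descriptions (HasChanges and HasChanged were already available in 2.5):

var order = session.Load<Order>("orders/1");
order.Company = "companies/12";

Console.WriteLine(session.Advanced.HasChanges);        // true
Console.WriteLine(session.Advanced.HasChanged(order)); // true

// full details, keyed by document id
var changes = session.Advanced.WhatChanged();
Console.WriteLine(changes["orders/1"].Length);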

Missing properties retention. This one is probably the first feature that we developed strictly for 3.0. The issue is what happens when you load a document from the database, but it has extra properties. What do I mean by that? Imagine that we changed our code to remove the FullName property. However, all our documents retain that property. In 2.5, loading and saving the document would overwrite the existing document with the new one, without the property. We got feedback from customers that this wasn’t a desirable behavior. Now, even though the FullName property isn’t accessible from your model, it will be retained (even if you have made changes to other properties).

Formatted indexes. In 2.5, you usually define indexes using the Linq query format, but when you look at them on the server side, they can be… quite ugly. We now use the same index formatting technique we have server side to prettify the indexes and make sure that they look very similar to the way you define them in code.

There is other stuff that went into the client: better Linq query support, includes on the results of a MoreLikeThis query, and a lot of other things that I just don’t have the time (and you the patience) to go over. Those are the highlights, the things that really stand out.


RavenDB, unrecoverable database and data loss

That is one scary headline, isn’t it? Go into the story and read all about it, there is a plot twist in the end.

This post was written at 05:40 AM, and I have spent the entire night up and awake*.

A customer called me in a state of panic; their database was not loading, and nothing they tried worked. I worked with him for a while on understanding what was going on, and how to try to recover from it.

Here is the story as I got it from the customer in question, and only embellished a lot to give the proper context for the story.

It all started with a test failure; in fact, it started with all the tests failing. The underlying reason was soon discovered: the test database disk was completely full, with not even enough disk space to put half a byte. The customer took a single look at that, and ran a search on the hard disk to find what was taking so much space. The answer was apparent. The logs directory was full; in fact, the Webster dictionary would need to search hard and wide to find a word to express how full it was.

So the customer in question did the natural thing, and hit Shift+Delete to remove all those useless debug logs that had been cluttering the disk. He then started the server again, and off to the races. Except that there was a slight problem: when trying to actually load the database, the server choked, cried and ended up curled into a fetal position, refusing to cooperate even when a big stick was fetched and waved in its direction.

The problem was that the log files that were deleted were not debug logs. Those were the database transaction logs. And removing them from the disk has the effect of making the database unable to recover, so it stopped and refused to work.

Now, remember that this is a test server, which explains why developers and operations guys are expected to do stuff to it. But the question was raised: what actually caused the issue? Can this happen in production as well? If it happens, can we recover from it? And more importantly, how can this be prevented?

The underlying reason for the unbounded transaction log growth was that the test server was an exact clone of the production system. And one of the configuration settings that was defined was “enable incremental backups”. In order to enable incremental backups, we can’t delete old journal files; we have to wait for a backup to actually happen, at which point we can copy the log files elsewhere, then delete them. If you don’t back up a database marked with enable incremental backups, it can’t free the disk space, and hence, the problem.

In production, regular backups are being run, and no tx log files were being retained. But no one bothered to do any maintenance work on the test server, and the configuration explicitly forbade us from automatically handling this situation. With a safe-by-default mindset, though, we want the operations guy to notice this with enough time to do something about it. That’s why in 3.0 we are taking a proactive step toward this case, and we will alert when the database is about to run out of free space.

Now, for the actual corruption issue. Any database makes certain assumptions, and chief among them is that when you write to disk, and actually flush the data, it isn’t going away or being modified behind our back. If that assumption breaks, which can be because of disk corruption, manual intervention in the files, overzealous anti virus software or just someone randomly deleting files by accident, all bets are off, and there isn’t much that can be done to protect ourselves from this situation.

The customer in question? Federico from Corvalius.


Note that this post is pretty dramatized. This is a test server, not a production system, so the guarantees, expectations and behavior toward them are vastly different. The reason for making such a big brouhaha from what is effectively a case of fat fingers is that I wanted to discuss the high availability story with RavenDB.

The general recommendation we are moving toward in 3.0 is that any high availability story in RavenDB has to take the shared nothing approach. In effect, this means that you will not be using technologies such as Windows Clustering, because they rely on a common shared resource, such as the SAN. Issues there, which actually creep up on you (running out of quota space in the SAN can happen very easily), can take down the whole system, even though you spent a lot of time and money on a supposedly highly available solution.

A shared nothing approach limits the potential causes for failure by having multiple nodes that can each operate independently. With RavenDB, this is done using replication: you define master/master replication between two nodes, and you run it with one primary node that your servers usually connect to. At that point, any failure in this node means automatic switching over to the secondary, with no downtime. You don’t have to plan for it, you don’t have to configure it, It Just Works.

Now, that is almost true, because you need to be aware that in a split brain situation, you might have conflicts, but you can set a default strategy for that (server X is the authoritative source) or a custom conflict resolution policy.

The two nodes mean that you always have a hot spare, which can also handle a scale out scenario by taking some of the load from the primary server if needed.
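
Setting up such a pair is mostly a matter of telling each node about the other. A minimal sketch, assuming the standard Raven/Replication/Destinations document and the ReplicationDocument / ReplicationDestination types from the client library; you would run the mirror image of this against the second node:

using (var session = store.OpenSession())
{
    session.Store(new ReplicationDocument
    {
        Destinations = new List<ReplicationDestination>
        {
            new ReplicationDestination
            {
                Url = "http://secondary:8080",
                Database = "Northwind"
            }
        }
    }, "Raven/Replication/Destinations");

    session.SaveChanges();
}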

Beyond replication, we also need to ensure that the data that we have is kept safe. A common request from admins is that a hot spare is wonderful, but they don’t trust that they have a backup unless they can put it on a tape and file it on a shelf somewhere. That also helps for issues such as offsite data storage in case you don’t have a secondary data center (if you do, put a replicated node there as well). This may sound paranoid, but having an offline backup means that if a batch job that was supposed to delete old customers deleted all customers instead, you won’t be very happy to realize that this batch delete was replicated to your entire cluster and your customer count is now zero, and declining from there. And that is the easy case; a bad case is when you had a bug in your code that wrote bad data over time. You really want to be able to go back to the database as it was two weeks ago, and you can only do that from cold storage.

One way to do that is to actually do backups. The problem with doing that is that you usually go for full backups, which means that you might be backing up tens of GB on every backup, and that is very awkward to deal with. Incremental backups are easier to work with, certainly. But when building highly available systems, I usually don’t bother with full backups. I already have the data in one or two additional locations, after all. I don’t care for quick restore at this point, because I can do that on one of the replicated nodes. What I do care about is having an offsite copy of the data that I can use if I ever need to. Because time to restore isn’t a factor, but convenience and management are, I would usually go with the periodic export system.

This is how it looks:

[screenshot: periodic export settings]

The Q drive is a shared network drive, and we do an incremental export to it every 10 minutes and a full export every 3 days.

I am aware that this is a pretty paranoid setup. We have multiple nodes holding the data, and exports of the data; sometimes I even have each node export the data independently, for the “no data loss, ever” goal.

Oh, and about Federico’s issue? While he was trying to see if he could fix the database in case such a thing ever happened in production (on the 3 live replicas at once), he was already replicating to the test sandbox from one of the production live replicas. With big databases this takes time, but a high-availability setup allows it. So even though the data file appeared to be corrupted beyond repair, everything is good now.

* To be fair, that is because I’m actually at the airport waiting for a plane to take me on vacation, but I thought it was funnier to state it this way.


What is new in RavenDB 3.0: The studio

It still feels funny to say that a major feature in a database product is the user interface, but I’m feeling a lot less awkward about saying that about the new studio now.

The obvious change here is that it is using HTML5, and not Silverlight. That alone would be great, because Silverlight has gotten pretty annoying, but we have actually done much more than that. We moved to HTML5 and we added a lot of new features.

Here is how it looks:

[screenshot: the new HTML5 studio]

Now, let me show you some of the new stuff. None of it is ground breaking on its own, but combined they create a vastly improved experience.

Indexes copy/paste allows you to easily transfer index definitions from one database to another, without requiring any external tools.

[screenshot: index copy/paste]

Also on indexing, we have the format index feature, which can take a nasty index and turn it into pretty and more easily understood code:

[screenshot: the format index feature]

Speaking of code and indexing, did you notice the C# button there? Clicking on that will give you this:

[screenshot: generated C# index creation code]

Like the copy/paste index feature, the idea is that you can modify the index in the studio, play with the various options, then hit this button and copy working index creation code into your project, and not worry any more about how you are going to deploy it.
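
The generated code is a standard strongly typed index creation task; for the Orders/Totals index used in the query post it would be roughly along these lines (a sketch, not the literal studio output):

public class Orders_Totals : AbstractIndexCreationTask<Order>
{
    public Orders_Totals()
    {
        Map = orders => from order in orders
                        select new
                        {
                            order.Company,
                            order.Employee
                        };
    }
}

// deploy it (and any other indexes in the assembly) during application startup
IndexCreation.CreateIndexes(typeof(Orders_Totals).Assembly, store);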

We also added some convenience features, such as computed columns. Let us see how that works. Here is the default view of the employees in the Northwind database:

[screenshot: the Employees collection]

That is nice, but it seems complex to me; all I care about is the full name and the age. So I head over to the settings and define a few common functions:

[screenshot: custom functions settings]

I then head back to the employees collection and click on the grid icon at the header row, which gives me this:

[screenshot: choosing the columns to display]

After pressing “Save as default”, I can see that the values shown for employees are:

[screenshot: computed columns in the Employees grid]

You can also do the same for the results of specific queries or indexes, so you’ll have a better experience looking at the data. The custom functions also serve additional roles, but I’ll touch on them in a future post.

Speaking of queries, here is how they look:

[screenshot: the query page]

 

Note the Excel icon on the top, you can export the data directly to Excel now. This is common if you need to send it to a colleague or anyone in the business side of things. For that matter, you can also load data into RavenDB from a CSV file:

[screenshot: CSV import]

There is actually a lot more that goes on in the studio, but I won’t talk about it now: replication tracking, better metrics, etc. I’ll talk about them in posts specific to the major bundles and a post (or posts) about better operations support.

I’ll leave you with one final feature, the map reduce visualizer:

[screenshot: the map/reduce visualizer]

More posts are coming.


What is new in RavenDB 3.0: RavenFS

A frequent request from RavenDB users was the ability to store binary data. Be that actual documents (PDF, Word), images (user’s photo, accident images, medical scans) or very large items (videos, high resolution aerial photos).

RavenDB can do that, sort of, with attachments. But attachments were never a first class feature in RavenDB.

With RavenFS, files now have first class support. Here is a small screenshot; a detailed description of how it works follows below.

[screenshot: RavenFS in the studio]

The Raven File System exposes a set of files, which are binary data with a specific key. However, unlike a simple key/value store, RavenFS does much more than just store the binary values.

It was designed upfront to handle very large files (multiple GBs) efficiently, at both the API and storage layers. To the point where it can find common data patterns in distinct files (or even in the same file) and just point to them, instead of storing duplicate information. RavenFS is a replicated and highly available system; updating a file will only send the changes made to the file between the two nodes, not the full file. This lets you update very large files, and only replicate the changes. This works even if you upload the file from scratch; you don’t have to deal with that manually.

Files aren’t just binary data. Files have metadata associated with them, and that metadata is available for searching. If you want to find all of Joe’s photos from May 2014, you can do that easily. The client API was carefully structured to give you full functionality even when sitting in a backend server; you can stream a value from one end of the system to the other without having to do any buffering.

Let us see how this works from the client side, shall we?

var fileStore = new FilesStore()
{
    Url = "http://localhost:8080",
    DefaultFileSystem = "Northwind-Assets",
};

using (var fileSession = fileStore.OpenAsyncSession())
{
    var stream = File.OpenRead("profile.png");
    var metadata = new RavenJObject
    {
        {"User", "users/1345"},
        {"Formal", true}
    };
    fileSession.RegisterUpload("images/profile.png", stream, metadata);
    await fileSession.SaveChangesAsync(); // actually upload the file
}

using (var fileSession = fileStore.OpenAsyncSession())
{
    var file = await fileSession.Query()
                    .WhereEquals("Formal", true)
                    .FirstOrDefaultAsync();

    var stream = await fileSession.DownloadAsync(file.Name);

    using (var output = File.Create("profile.png"))
    {
        await stream.CopyToAsync(output);
    }
}

First of all, you start by creating a FilesStore, similar to RavenDB’s DocumentStore, and then create a session. RavenFS is fully async, and we don’t provide any sync API. The common scenario is working with large files, where blocking operations are simply not going to cut it.

Now, we upload a file to the server. Note that at no point do we need to actually have the file in memory. We open a stream to the file, and register that stream to be uploaded. Only when we call SaveChangesAsync will we actually read from that stream and write to the file store. You can also see that we are specifying metadata on the file. Later, we are going to be searching on that metadata. The result of the search is a FileHeader object, which is useful if you want to show the user a list of matching files. To actually get the contents of the file, you call DownloadAsync. Here, again, we don’t load the entire file into memory; instead we give you a stream for the contents of the file that you can send to its final destination.

Pretty simple, and highly efficient process, overall.

RavenFS also has all the usual facilities you need from a data storage system, including full & incremental backups, full replication and high availability features. And while it has the usual file system folder model, to encourage familiarity, the most common usage is actually as a metadata driven system, where you locate a desired file by searching on its metadata.


What is new in RavenDB 3.0: Voron

If you have been following this blog at all, you must have heard quite a lot about Voron. If you haven’t been paying attention, you can watch my talk about it at length, or you can get the executive summary below.

The executive summary is that Voron is a high performance, low level, transactional storage engine, written from scratch by Hibernating Rhinos with the intent to move most or all of our infrastructure to it. RavenDB 3.0 can run on either Voron or Esent, and shows comparable performance using either one.

More importantly, because Voron was created by us, we can do more with it, optimizing it exactly to our own needs and requirements. And yes, one of those would be running on Linux machines.

But more important, having Voron also allows us to create dedicated database solutions much more easily. One of those is RavenFS, obviously, but we have additional offerings that are just waiting to get out and blow your minds away.


What is new in RavenDB 3.0?

“I don’t know, why are you asking me such hard questions? It is new, it is better, go away and let me play with the fun stuff, I think that I got the distributed commit to work faster now. Don’t you have a real job to do?”

That, more or less, was my response when I was told that we really do need a “What has changed” list for RavenDB. And after some kicking and screaming, I agreed that this is indeed something that is probably not going to be optional. While I would love to just put a sticker saying “It is better, buy it!”, I don’t think that RavenDB is targeting that demographic.

There is a reason why I didn’t want to compile such a list. Work on RavenDB 3.0 actually started before 2.5 was even out, and it currently encompasses 1,270 resolved issues and 21,710 commits. The team size (as in people actually paid to work on this full time, excluding any outside contributions) grew to just over 20. And we had close to two years of work. In other words, this release represents a lot of work.

The small list that I had compiled contained over a hundred items. That is just too big to do justice to all the kinds of things we did. So I won’t be doing a single big list with all the details. Instead, I’m going to do a rundown of the new stuff in a separate blog post per area.

All the indexing improvements in one blog post, all the client API changes in another, etc.

At a very high level, here are the major areas that were changed:

  • Voron
  • RavenFS
  • HTML5 Studio
  • JVM API
  • Operations
  • Indexes & Queries

I’ll get to the details of each of those (and much more) in the upcoming posts.

Because there is so much good stuff, I'm afraid that I'll have to break tradition. For the following week or so, we are going to be moving to a 2 posts a day mode.

Also, please remember that we're hosting two RavenDB events in Malmo and Stockholm, Sweden next week. We'll be talking about all the new stuff.

RavenDB Events


ayende.com is now running on Voron

After about a week of running on the Esent database (and no further issues), we have now converted the backend database behind this blog to Voron.

The process was done by:

  • Putting App_Offline.html file for ayende.com
  • Exporting the data from blog.ayende.com database using the smuggler.
  • Deleting the database configuration from RavenDB, but retaining the actual database on disk.
  • Creating a new RavenDB database with Voron as the new blog.ayende.com database.
  • Importing the data from the previous export using smuggler.
  • Deleting the App_Offline.html file.

Everything seems to be operating normally, at least for now.

To my knowledge, this is the first production deployment of Voron.


Analyzing (small) log file

I got a log file with some request trace data from a customer, and I want to have a better view about what is actually going on. The log file size was 35MB, so that made things very easy.

I know about Log Parser, but to be honest, it would take more time to learn to use that effectively than to write my own tool for a single use case.

The first thing I needed to do is actually get the file into a format that I could work with:

var file = @"C:\Users\Ayende\Downloads\u_ex140904\u_ex140904.log";
var parser = new TextFieldParser(file)
{
CommentTokens = new[] {"#"},
Delimiters = new[] {" "},
HasFieldsEnclosedInQuotes = false,
TextFieldType = FieldType.Delimited,
TrimWhiteSpace = false,
};

////fields
// "date", "time", "s-ip", "cs-method", "cs-uri-stem", "cs-uri-query", "s-port", "cs-username", "c-ip",
// "cs(User-Agent)", "sc-status", "sc-substatus", "sc-win32-status", "time-taken"

var entries = new List<LogEntry>();

while (parser.EndOfData == false)
{
var values = parser.ReadFields();
if (values == null)
break;
var entry = new LogEntry
{
Date = DateTime.Parse(values[0]),
Time = TimeSpan.Parse(values[1]),
ServerIp = values[2],
Method = values[3],
Uri = values[4],
Query = values[5],
Port = int.Parse(values[6]),
UserName = values[7],
ClientIp = values[8],
UserAgent = values[9],
Status = int.Parse(values[10]),
SubStatus = int.Parse(values[11]),
Win32Status = int.Parse(values[12]),
TimeTaken = int.Parse(values[13])
};
entries.Add(entry);
}

Since I want to run many queries, I just serialized the output to a binary file, to save the parsing cost next time. But the binary file (BinaryFormatter) was actually 41MB in size, and while parsing the text file took 5.5 seconds, the binary load process took 6.7 seconds.
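
The LogEntry class itself isn’t shown in the post; a minimal sketch that matches the fields parsed above, along with the BinaryFormatter round trip, might look like this (the cache file name is just a placeholder):

[Serializable]
public class LogEntry
{
    public DateTime Date;
    public TimeSpan Time;
    public string ServerIp;
    public string Method;
    public string Uri;
    public string Query;
    public int Port;
    public string UserName;
    public string ClientIp;
    public string UserAgent;
    public int Status;
    public int SubStatus;
    public int Win32Status;
    public int TimeTaken;
}

// cache the parsed entries so the next run can skip text parsing
var formatter = new BinaryFormatter();
using (var output = File.Create("entries.bin"))
    formatter.Serialize(output, entries);

// and load them back
using (var input = File.OpenRead("entries.bin"))
    entries = (List<LogEntry>)formatter.Deserialize(input);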

After that, I can run queries like this:

var q = from entry in entries
        where entry.TimeTaken > 10
        group entry by new { entry.Uri }
        into g
        where g.Count() > 2
        select new
        {
            g.Key.Uri,
            Avg = g.Average(e => e.TimeTaken)
        }
        into r
        orderby r.Avg descending
        select r;

And start digging into what the data is telling me.


Accidental code review

I’m trying to get better insight into a set of log files sent by a customer. So I went looking for a tool that could do that, and I found Inidihiang. There was an x86 vs. x64 issue that I had to work through, but then it just sat there trying to parse a 34MB log file.

I got annoyed enough that I actually checked, and this is the reason why:

[screenshot of the offending code]

Sigh…

I gave up on this and wrote my own stuff.


Talking in the Twin Cities code camp in October

I’m going to talk at the Twin Cities Code Camp this October, about Polyglot Persistence and dedicated database solutions. This is a natural extension of the talk I did at the RavenDB Conference, where I live-coded a database capable of millions of writes per second on my laptop.


Where do buzzwords retire to?

At lunch today at the office, we had an interesting discussion about the kinds of must-have technologies. Among the things that were thrown out were:

  • Service Oriented Architecture
  • Single Page Application
  • Cloud
  • Agile
  • TDD
  • Web 2.0
  • REST
  • AJAX
  • Data driven applications

All of those were things that you had to do, and everyone was doing them. And a few years later… they are no longer hot and fancy, but they are probably still in heavy use.

By now, they are as out of fashion as yellow Crocs.


All I wanted was to set my shortcuts

I’m trying to set the ReSharper keyboard shortcuts, and I got this:

[screenshot]

So I waited, and waited, and waited.

Then I downloaded Process Monitor and tried to see what the hell was going on in there.

All I could figure out was that it was reading some persistent solution cache. So I closed Visual Studio and opened it without opening a solution. I then tried the same thing again, and it completed in mere tens of seconds.


What kind of problems you’ll find only when you are dog fooding

Just minutes after I posted the previous post about dog fooding and the issues it surfaces, we ran into major trouble. Actually, several issues.

The first was that we started to get authentication issues, even though we had upgraded the server without changing any security related configuration. So what was going on? I tried logging in from my own machine, using the same credentials as the server, and everything worked! I then tried replacing the credentials with ones I knew worked, because they were being used by another system with no issues, and that didn’t work either.

Looking in Fiddler, there were no 401 requests, either. Very strange.

This ended up being an issue with lazy requests. Basically, a lazy request is one that carries multiple requests in a single round trip to the server. We changed how we handle those internally, so they look pretty much like any other request to the code, but it appears that we also forced them to go through authorization again, and obviously that didn’t work. Once we knew what was going on, fixing this was very easy.
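
For context, here is a rough sketch of lazy requests from the client side; the document type and ids are placeholders, and it is only meant to illustrate the “multiple requests in one round trip” behavior:

// Both operations are deferred; nothing is sent to the server yet.
var lazyOrder = session.Advanced.Lazily.Load<Order>("orders/1");
var lazyCompanyOrders = session.Query<Order>()
    .Where(x => x.Company == "companies/11")
    .Lazily();

// A single HTTP request carries both operations to the server.
session.Advanced.Eagerly.ExecuteAllPendingLazyOperations();

var order = lazyOrder.Value;
var companyOrders = lazyCompanyOrders.Value;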

The next issue was a little crazier. Some things were failing, and we weren’t sure why. We got an illegal duplicate key error from somewhere, but it made no sense. What was worse, it appeared to be happening in random places. It took me a while to figure out the root cause. In RavenDB 2.5, we tracked indexes using the index name. In RavenDB 3.0, we track indexes using numeric ids. However, the conversion process from 2.5 to 3.0 didn’t take that into account; while it gave the existing indexes ids, it didn’t set the next index id to the correct value. When we would try to create a new index, it would generate an index id that already existed, and that failed. The error could crop up when you ran a dynamic query that had to create a new index, so that was kind of funky.

The last error, however, is something that I consider to be purely evil. RavenDB 3.0 added a limitation that an index cannot output more than a set number of index entries per document. The idea is that we want to stop the fan out problem. An index with a high fan out can consume a lot of memory and other resources without control.

The idea was that we want to stop that explicitly, early on. But what about existing indexes? We added a way to increase the limit explicitly, but old indexes obviously wouldn’t have this option set. The problem is that the limit is only enforced during indexing, so you start the application and everything works. Then indexing starts to fail. Which is fine, except that another RavenDB feature then kicks in: if an index has too many errors, it is marked as failed. Failed indexes throw when queried.
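
To make the fan out problem concrete, here is a sketch of an index where a single document can produce many index entries. The index itself is illustrative, and the property used to raise the limit is an assumption about the API name, not a confirmed one:

// One index entry per order line, so a document with hundreds of lines
// produces hundreds of index entries on its own.
public class Orders_ByProduct : AbstractIndexCreationTask<Order>
{
    public Orders_ByProduct()
    {
        Map = orders => from order in orders
                        from line in order.Lines
                        select new
                        {
                            line.Product,
                            order.Company
                        };

        // Assumption: explicitly raising the 3.0 fan out limit for this index.
        MaxIndexOutputsPerDocument = 64;
    }
}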

The reason this is evil is that it takes quite a bit of time for the error to surface. So you run your tests and everything works, and then a few hours or days later, everything crashes.

All of those issues have been resolved. I found that the fix for the index id issue was actually there, but I appear to have removed it during an unrelated fix for the previous problem without noticing. The lazy requests now know that they are already authenticated, and the maximum fan out when loading from an existing system is 32K, which is big enough for pretty much anything you can think of. The behavior when you exceed the max fan out is also more consistent: it will skip the offending document, and only if you have a lot of them will it actually disable the index.

And yes, pushing things to ayende.com and seeing what breaks is a pretty good testing strategy, egg on face and all.


Dogfood sometimes doesn’t taste so good

We just finished rolling our internal servers’ migration to 3.0 back to 2.5. That was quite unpleasant, and it was actually noticed by users.

That isn’t pleasant, but it is always better if we get the egg all over our face than if a customer does. The actual issue that we ran into was pretty interesting.

The problem is that the database we use for running this blog (as well as most of our internal systems) has been through… a lot. It has gone through pretty much every released version, and many that weren’t actually released.

That means that from a storage perspective (only of interest to RavenDB developers), it is a bit of a mess. That in turn meant that we had to do extra work to convert the storage from the 2.5 format to the 3.0 format. That conversion used enough memory that we hit our limits on memory usage, and the conversion to 3.0 failed.

That meant that it was stuck. That is actually one of the reasons that we test those things on our own systems, so that was great.

The not so great part was that we also uncovered another interesting bug (actually, several of them in conjunction). The new studio had a tendency to read the stats from all the available databases, if the number we had was small enough. That was done so we could show the number of documents in each database on the databases page.

That meant that we would effectively start all of them in parallel (and consume resources that weren’t actually needed).

And that, in turn, exposed a race condition (in Esent!) that could result in a hard process crash. That was the hardest thing to get past, because obviously we don’t have source access to Esent, and it was kind of hard to pinpoint where this was actually happening and why.

All fixed and good now, and ready to try again.


RavenDB Days in Stockholm & Malmo

This is a reminder that we are going to be doing two full day events in Sweden in a few weeks.

So far, we have enough people registering that we actually had to move the event to a bigger venue! So we have more room for you to come and hear all about RavenDB 3.0 and the cool stuff that we are bringing with it.

You can register here.


RavenDB 3.0–Production dog fooding

The following screen shot is from our production RavenDB instance, which is now running RavenDB 3.0 build 3466.

This is part of our dog fooding strategy, and it pays heavy dividends (see the next posts).

[screenshot of the production server running RavenDB 3.0 build 3466]


Playing with Roslyn

We do a lot of compiler work in RavenDB. Indexes are one core example, where we take the C# language and beat both it and our heads against the wall until it agrees to do what we want it to.

A lot of that is happening using the excellent NRefactory library as well as the not so excellent CodeDOM API. Basically, we take a source string, convert it into something that can run, then compile it on the fly and execute it.

I decided to check the performance implications using a very trivial benchmark:

private static void CompileCodeDome(int i)
{
    var src = @"
class Greeter
{
static void Greet()
{
System.Console.WriteLine(""Hello, World"" + " + i + @");
}
}";
    CodeDomProvider codeDomProvider = new CSharpCodeProvider();
    var compilerParameters = new CompilerParameters
    {
        OutputAssembly = "Greeter.dll",
        GenerateExecutable = false,
        GenerateInMemory = true,
        IncludeDebugInformation = false,
        ReferencedAssemblies =
        {
            typeof (object).Assembly.Location,
            typeof (Enumerable).Assembly.Location
        }
    };
    // CodeDOM invokes the csc.exe compiler behind the scenes.
    CompilerResults compileAssemblyFromSource = codeDomProvider.CompileAssemblyFromSource(compilerParameters, src);
    Assembly compiledAssembly = compileAssemblyFromSource.CompiledAssembly;
}

private static void CompileRoslyn(int i)
{
    var syntaxTree = CSharpSyntaxTree.ParseText(@"
class Greeter
{
static void Greet()
{
System.Console.WriteLine(""Hello, World"" + " + i + @");
}
}");

    var compilation = CSharpCompilation.Create("Greeter.dll",
        syntaxTrees: new[] {syntaxTree},
        references: new MetadataReference[]
        {
            new MetadataFileReference(typeof (object).Assembly.Location),
            new MetadataFileReference(typeof (Enumerable).Assembly.Location),
        });

    // For the benchmark we only emit the assembly to memory; we never load or run it.
    // result.Success indicates whether compilation worked; it is ignored here for brevity.
    using (var file = new MemoryStream())
    {
        var result = compilation.Emit(file);
    }
}
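
The benchmark only emits the assembly and never runs it. For completeness, here is a minimal sketch (not part of the original benchmark) of loading and executing the Roslyn output, assuming the same compilation object as above:

using (var stream = new MemoryStream())
{
    var result = compilation.Emit(stream);
    if (result.Success)
    {
        var assembly = Assembly.Load(stream.ToArray());
        // Greet() has no access modifier, so it is private; reflect accordingly.
        var greet = assembly.GetType("Greeter")
            .GetMethod("Greet", BindingFlags.Static | BindingFlags.NonPublic);
        greet.Invoke(null, null); // prints "Hello, World" plus the counter
    }
}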

 

I ran it several times, and I got (run # on the X axis, milliseconds on the Y axis):

[chart: compile time per run, CodeDOM vs. Roslyn]

The very first Roslyn invocation is very costly. The following ones cost pretty much nothing. Granted, this is a trivial example, but CodeDOM (which invokes csc) is much more consistent, and also much more expensive in general.
