Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 9 min | 1650 words

Continuing with my work on porting leveldb to .NET, we ran into another problem: the log file. The log file is pretty important. It is how you ensure durability, so any problems there are a big cause for concern.

You can read a bit about the format used by leveldb here, but basically, it uses the following:

    block := record* trailer?
    record :=
      checksum: uint32     // crc32c of type and data[] ; little-endian
      length: uint16       // little-endian
      type: uint8          // One of FULL, FIRST, MIDDLE, LAST
      data: uint8[length]

Each block is 32 KB in size.

The type can be Full, First, Middle or Last, since it is legitimate to split a record across multiple blocks. The reasoning behind this format is outlined in the link above.

It is also a format that assumes that you know, upfront, the entire size of your record, so you can split it accordingly. That makes a lot of sense, when working in C++ and passing buffers around.
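To make the "known size upfront" point concrete, here is a minimal sketch of how a record of a known length could be split into fragments. It assumes the 7-byte header (4 + 2 + 1) and 32 KB block described above, and the numeric type values used by leveldb (Full = 1, First = 2, Middle = 3, Last = 4). The `LogFormatSketch` and `PlanFragments` names are made up for illustration, and checksums are left out entirely.

```csharp
using System;
using System.Collections.Generic;

public enum RecordType : byte { Full = 1, First = 2, Middle = 3, Last = 4 }

// Hypothetical helper, not part of leveldb or of the port discussed here.
public static class LogFormatSketch
{
    public const int BlockSize = 32 * 1024;   // 32 KB blocks
    public const int HeaderSize = 4 + 2 + 1;  // checksum + length + type

    // Plans the fragments for a record of a known length, starting at
    // blockOffset bytes into the current block.
    public static List<(RecordType Type, int Length)> PlanFragments(int blockOffset, int recordLength)
    {
        var fragments = new List<(RecordType Type, int Length)>();
        int remaining = recordLength;
        bool first = true;
        do
        {
            int leftInBlock = BlockSize - blockOffset;
            if (leftInBlock < HeaderSize)
            {
                // No room for a header: the rest of the block becomes the
                // trailer and the fragment starts in a fresh block.
                blockOffset = 0;
                leftInBlock = BlockSize;
            }

            int length = Math.Min(remaining, leftInBlock - HeaderSize);
            bool last = length == remaining;

            RecordType type = first && last ? RecordType.Full
                            : first ? RecordType.First
                            : last ? RecordType.Last
                            : RecordType.Middle;

            fragments.Add((type, length));
            remaining -= length;
            blockOffset += HeaderSize + length;
            first = false;
        } while (remaining > 0);

        return fragments;
    }
}
```

Under these assumptions, a 100,000 byte record that starts at the beginning of a block is planned as First, Middle, Middle, Last. Note that the decision between Full and First has to be made before the first byte is written, which is exactly why the size must be known in advance.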

 

This is straightforward in C++, where the API is basically:

Status Writer::AddRecord(const Slice& slice)

(Slice is basically just a byte array).

In .NET, we do not want to be passing buffers around, mostly because of the impact on the Large Object Heap (LOH). So we had to be a bit smarter about things. In particular, we had an interesting issue with streaming the results: if I want to write a document with a size of 100K, how do I handle that?

Instead, I wanted the code to look like this:

    var buffer = BitConverter.GetBytes(seq);
    await state.LogWriter.WriteAsync(buffer, 0, buffer.Length);
    buffer = BitConverter.GetBytes(opCount);
    await state.LogWriter.WriteAsync(buffer, 0, buffer.Length);

    foreach (var operation in writes.SelectMany(writeBatch => writeBatch._operations))
    {
        buffer[0] = (byte) operation.Op;
        await state.LogWriter.WriteAsync(buffer, 0, 1);
        await state.LogWriter.Write7BitEncodedIntAsync(operation.Key.Count);
        await state.LogWriter.WriteAsync(operation.Key.Array, operation.Key.Offset, operation.Key.Count);
        if (operation.Op != Operations.Put)
            continue;
        // Stream the value straight from the memtable into the log
        // (this assumes LogWriter can be used as the destination Stream)
        using (var stream = state.MemTable.Read(operation.Handle))
            await stream.CopyToAsync(state.LogWriter);
    }

The problem with this approach is that we don’t know, upfront, how large the record is going to be. This means that we don’t know how to split the record, because we don’t have the record until it is over. And we don’t want to (can’t, actually) go back in the log and change things to set the record straight (pun intended).

What we ended up doing is this:

[image: code sample that explicitly marks the start and end of a record]

Note that we explicitly mark the start / end of the record, and in the meantime, we can push however many bytes we want. Internally, we buffer up to 32 KB (a bit less, actually, but good enough for now), and based on the next call, we decide how the buffered data should be marked: if more data arrives, it is a First or Middle fragment, and if the record ends, it is a Last or Full one.
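A rough sketch of that idea, under several assumptions: the `StreamingRecordWriter` class and its `RecordStart` / `Write` / `RecordEnd` members are invented for illustration and are not the actual API of the port, the writer is synchronous to keep it short, and the checksum bytes are left as zeros instead of a real crc32c. The point it shows is that a fragment's type is only decided when the next call arrives.

```csharp
using System;
using System.IO;

// Hypothetical sketch: buffers one fragment and decides its type lazily.
public sealed class StreamingRecordWriter
{
    private const int BlockSize = 32 * 1024;
    private const int HeaderSize = 7; // checksum (4) + length (2) + type (1)
    private const byte Full = 1, First = 2, Middle = 3, Last = 4;

    private readonly Stream output;
    private readonly byte[] buffer = new byte[BlockSize - HeaderSize];
    private int buffered;     // bytes waiting in the buffer
    private int blockOffset;  // position inside the current block
    private bool inRecord, wroteFragment;

    public StreamingRecordWriter(Stream output) { this.output = output; }

    public void RecordStart()
    {
        if (inRecord) throw new InvalidOperationException("Record already started");
        inRecord = true;
        wroteFragment = false;
        Room(); // pad the block now if a header no longer fits
    }

    public void Write(byte[] data, int offset, int count)
    {
        if (!inRecord) throw new InvalidOperationException("Call RecordStart first");
        while (count > 0)
        {
            int room = Room();
            if (buffered == room)
            {
                // Buffer is full and more data arrived, so this fragment
                // cannot be the final one of the record.
                Flush(wroteFragment ? Middle : First);
                continue;
            }
            int n = Math.Min(count, room - buffered);
            Buffer.BlockCopy(data, offset, buffer, buffered, n);
            buffered += n; offset += n; count -= n;
        }
    }

    public void RecordEnd()
    {
        if (!inRecord) throw new InvalidOperationException("No record in progress");
        // The record is over, so whatever is buffered is the final fragment.
        Flush(wroteFragment ? Last : Full);
        inRecord = false;
    }

    // Data bytes that fit in the current block after a header, padding the
    // block with zeros first if not even a header fits.
    private int Room()
    {
        int left = BlockSize - blockOffset;
        if (left < HeaderSize)
        {
            output.Write(new byte[left], 0, left);
            blockOffset = 0;
        }
        return BlockSize - blockOffset - HeaderSize;
    }

    private void Flush(byte type)
    {
        var header = new byte[HeaderSize]; // bytes 0..3 (checksum) stay zero in this sketch
        header[4] = (byte)(buffered & 0xFF);        // length, little-endian
        header[5] = (byte)((buffered >> 8) & 0xFF);
        header[6] = type;
        output.Write(header, 0, HeaderSize);
        output.Write(buffer, 0, buffered);
        blockOffset += HeaderSize + buffered;
        buffered = 0;
        wroteFragment = true;
    }
}
```

With this shape the caller never needs the full record in one buffer; it can push a 100K document in small chunks, and the fragments still come out as First, Middle, Middle, Last on block boundaries.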

The reason this is important is that this allows us to actually keep the same format as leveldb, with all of the benefits for dealing with corrupted data, if we need to. I also really like the idea of being able to have parallel readers on the log file, because we know that we can just skip at block boundaries.

time to read 2 min | 317 words

Scripted Index Results (I wish it had a better name) is a really interesting new feature in RavenDB 2.5. As the name implies, it allows you to attach scripts to indexes. Those scripts can operate on the results of the indexing.

Sounds boring, right? But the options that it opens are anything but. Using Scripted Index Results you can get recursive map/reduce indexes, for example. But we won’t be doing that today. Instead, I’ll show how you can enhance entities with additional information from other sources.

Our sample database is Northwind, and we have defined the following index to get some statistics about our customers:

[image: the index definition]

And we can query it like this:

[image: querying the index]

However, what we want is to embed those values inside the company document, so we won’t have to query for them separately. Here is how we can use the new Scripted Index Results bundle to do this:

[image: the Scripted Index Results definition]

Once we have defined that, whenever the index is done, it will run these scripts, and that, in turn, means that this is what our dear ALFKI looks like:

[image: the resulting ALFKI company document]

I’ll leave recursive map/reduce as a tidbit for my dear readers :-)

time to read 3 min | 456 words

One of the nice things about having a more realistic demo data set is that we can now actually show demos on that data. I didn’t realize how much of an issue that was until we actually improved things.

Let us see how we are going to demo RavenDB’s dynamic reporting feature.

We start by creating the following index. It is a pretty simple one; the one thing to notice is that we are explicitly setting the Sort mode for Total to be Double.

[image: the index definition]

Now that we have done that, we are going to go to Query > Reporting:

[image: the Query > Reporting screen]

And then I can start issuing reporting queries:

[image: a reporting query and its results]

This is the equivalent of doing:

select EmployeeID, sum(tot.Total) Total from Orders o join 
    (
        select sum((Quantity * UnitPrice) * (1- Discount)) Total, OrderId from [Order Details]
        group by OrderID
    ) tot
    on o.OrderID = tot.OrderID
where o.CustomerID = @CustomerId
group by EmployeeID

The nice thing about this, and what makes this feature different from standard map/reduce, is that you can filter the input data into the aggregation.

In code, this would look something like this:

session.Query<Order>("Orders/Total")
  .Where(x=>x.Company == companyId)
  .AggregateBy(x=>x.Employee)
  .SumOn(x=>x.Total)
  .ToList();

Pretty nice, even if I say so myself.
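For readers without a RavenDB instance at hand, the same "filter first, then aggregate" shape can be sketched with plain LINQ over in-memory objects. This is only an illustration of the semantics, not how RavenDB executes the query; the `OrderTotal` class and the `ReportingSketch.TotalsByEmployee` method are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the entries of the Orders/Total index.
public class OrderTotal
{
    public string Company { get; set; }
    public string Employee { get; set; }
    public double Total { get; set; }
}

public static class ReportingSketch
{
    // Filters the input by company, then sums the totals per employee.
    public static Dictionary<string, double> TotalsByEmployee(
        IEnumerable<OrderTotal> orders, string companyId)
    {
        return orders
            .Where(o => o.Company == companyId)   // where o.CustomerID = @CustomerId
            .GroupBy(o => o.Employee)             // group by EmployeeID
            .ToDictionary(g => g.Key, g => g.Sum(o => o.Total));
    }
}
```

The filter runs before the grouping, so orders of other companies never reach the sum. That is the difference from a standard map/reduce index, where the aggregation is fixed when the index is defined.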

time to read 1 min | 109 words

RavenDB has always come with some sample data that you could use to test things out. Unfortunately, that sample data was pretty basic, and didn’t really cover a lot of interesting scenarios.

For RavenDB 2.5, we updated the sample data to use the brand new and amazing (wait for it…) Northwind database.

[image: the Northwind sample data]

At a minimum, it will make demoing stuff easier. And in order to make things even nicer, you can get the C# classes for the sample data here.

time to read 18 min | 3449 words

The following code has a very subtle bug:

    public class AsyncQueue
    {
        private readonly Queue<int> items = new Queue<int>();
        private volatile LinkedList<TaskCompletionSource<object>> waiters = new LinkedList<TaskCompletionSource<object>>();

        public void Enqueue(int i)
        {
            lock (items)
            {
                items.Enqueue(i);
                while (waiters.First != null)
                {
                    waiters.First.Value.TrySetResult(null);
                    waiters.RemoveFirst();
                }
            }
        }

        public async Task<IEnumerable<int>> DrainAsync()
        {
            while (true)
            {
                TaskCompletionSource<object> taskCompletionSource;
                lock (items)
                {
                    if (items.Count > 0)
                    {
                        return YieldAllItems();
                    }
                    taskCompletionSource = new TaskCompletionSource<object>();
                    waiters.AddLast(taskCompletionSource);
                }
                await taskCompletionSource.Task;
            }
        }

        private IEnumerable<int> YieldAllItems()
        {
            while (items.Count > 0)
            {
                yield return items.Dequeue();
            }
        }
    }

I’ll even give you a hint: try running the following client code:

    var q = new AsyncQueue();
    for (int i = 0; i < 1000 * 1000; i++)
    {
        q.Enqueue(i);
        if (i%100 == 0)
        {
            Task.Factory.StartNew(async () =>
                {
                    foreach (var result in await q.DrainAsync())
                    {
                        Console.WriteLine(result);
                    }
                });
        }
    }

Can you figure out what the problem is?

time to read 1 min | 74 words

Well, it is nearly the 29th of May, and that means that I have been married for two years.

To celebrate that, I am offering a 29% discount on all our products (RavenDB, NHibernate Profiler, Entity Framework Profiler, etc).

All you have to do is purchase any of our products using the following coupon code:

2nd anniversary

This offer is valid until the end of the month, so hurry up.

time to read 8 min | 1426 words

One of the things that we are planning for Raven 3.0 is the introduction of additional options. In addition to having RavenDB, we will also have RavenFS, which is a replicated file system with an eye toward very large files. But that isn’t what I want to talk about today. Today I would like to talk about something that is currently just in my head. I don’t even have a proper name for it yet.

Here is the deal: RavenDB is very good for data that you care about individually. Orders, customers, etc. You track, modify and work with each document independently. If you are writing a lot of data that isn’t really relevant on its own, but only as an aggregate, that is probably not a good use case for RavenDB.

Examples of such things include logs, click streams, event tracking, etc. The trivial example would be any reality show, where you have a lot of users sending messages to vote for a particular candidate, and you don’t really care about the individual data points, only the aggregate. Another example would be tracking how many items were sold in a particular period based on region, etc.

The API that I had in mind would be something like:

    foo.Write(new PurchaseMade { Region = "Asia", Product = "products/1", Amount = 23 } );
    foo.Write(new PurchaseMade { Region = "Europe", Product = "products/3", Amount = 3 } );

And then you can write map/reduce statements on them like this:

    // map
    from purchase in purchases
    select new
    {
        purchase.Region,
        purchase.Product,
        purchase.Amount
    }

    // reduce
    from result in results
    group result by new { result.Region, result.Product }
    into g
    select new
    {
        g.Key.Region,
        g.Key.Product,
        Amount = g.Sum(x=>x.Amount)
    }

Yes, this looks pretty much like you would have in RavenDB, but there are important distinctions:

  • We don’t allow modifying writes, nor deleting them.
  • Most of the operations are assumed to be made on the result of the map/reduce statements.
  • The assumption is that you don’t really care for each data point.
  • There are going to be a lot of those data points, and they are likely to be coming in at a relatively high rate.
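To make the idea a bit more tangible, here is a toy, in-memory sketch of such a store using LINQ to Objects. Everything in it is an assumption for illustration: the `AppendOnlyStore` class, its `Totals` method, and the `PurchaseMade` shape (taken from the Write calls above). A real implementation would maintain the reduce results incrementally instead of recomputing them on each call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PurchaseMade
{
    public string Region { get; set; }
    public string Product { get; set; }
    public int Amount { get; set; }
}

// Hypothetical append-only store: writes can be added, never modified or deleted.
public class AppendOnlyStore
{
    private readonly List<PurchaseMade> purchases = new List<PurchaseMade>();

    public void Write(PurchaseMade purchase)
    {
        purchases.Add(purchase);
    }

    // Runs the map and reduce steps over everything written so far.
    public List<(string Region, string Product, int Amount)> Totals()
    {
        // map: project each data point to the fields we aggregate on
        var mapped = from purchase in purchases
                     select new { purchase.Region, purchase.Product, purchase.Amount };

        // reduce: one result per (Region, Product) pair
        var reduced = from result in mapped
                      group result by new { result.Region, result.Product }
                      into g
                      select (Region: g.Key.Region,
                              Product: g.Key.Product,
                              Amount: g.Sum(x => x.Amount));

        return reduced.ToList();
    }
}
```

Callers only ever see the aggregated rows, which matches the assumption that the individual data points are not interesting on their own.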

Thoughts?
