Ayende @ Rahien

filter by tags archive

architecture (616) rss
bugs (451) rss
challanges (123) rss
community (381) rss
databases (481) rss
design (896) rss
development (642) rss
hibernating-practices (71) rss
miscellaneous (592) rss
performance (397) rss
programming (1088) rss
raven (1457) rss
ravendb.net (541) rss
reviews (184) rss

2025
- July (7)
- June (7)
- May (10)
- April (10)
- March (10)
- February (7)
- January (12)
2024
- December (3)
- November (2)
- October (1)
- September (3)
- August (5)
- July (10)
- June (4)
- May (6)
- April (2)
- March (8)
- February (2)
- January (14)
2023
- December (4)
- October (4)
- September (6)
- August (12)
- July (5)
- June (15)
- May (3)
- April (11)
- March (5)
- February (5)
- January (8)
2022
- December (5)
- November (7)
- October (7)
- September (9)
- August (10)
- July (15)
- June (12)
- May (9)
- April (14)
- March (15)
- February (13)
- January (16)
2021
- December (23)
- November (20)
- October (16)
- September (6)
- August (16)
- July (11)
- June (16)
- May (4)
- April (10)
- March (11)
- February (15)
- January (14)
2020
- December (10)
- November (13)
- October (15)
- September (6)
- August (9)
- July (9)
- June (17)
- May (15)
- April (14)
- March (21)
- February (16)
- January (13)
2019
- December (17)
- November (14)
- October (16)
- September (10)
- August (8)
- July (16)
- June (11)
- May (13)
- April (18)
- March (12)
- February (19)
- January (23)
2018
- December (15)
- November (14)
- October (19)
- September (18)
- August (23)
- July (20)
- June (20)
- May (23)
- April (15)
- March (23)
- February (19)
- January (23)
2017
- December (21)
- November (24)
- October (22)
- September (21)
- August (23)
- July (21)
- June (24)
- May (21)
- April (21)
- March (23)
- February (20)
- January (23)
2016
- December (17)
- November (18)
- October (22)
- September (18)
- August (23)
- July (22)
- June (17)
- May (24)
- April (16)
- March (16)
- February (21)
- January (21)
2015
- December (5)
- November (10)
- October (9)
- September (17)
- August (20)
- July (17)
- June (4)
- May (12)
- April (9)
- March (8)
- February (25)
- January (17)
2014
- December (22)
- November (19)
- October (21)
- September (37)
- August (24)
- July (23)
- June (13)
- May (19)
- April (24)
- March (23)
- February (21)
- January (24)
2013
- December (23)
- November (29)
- October (27)
- September (26)
- August (24)
- July (24)
- June (23)
- May (25)
- April (26)
- March (24)
- February (24)
- January (21)
2012
- December (19)
- November (22)
- October (27)
- September (24)
- August (30)
- July (23)
- June (25)
- May (23)
- April (25)
- March (25)
- February (28)
- January (24)
2011
- December (17)
- November (14)
- October (24)
- September (28)
- August (27)
- July (30)
- June (19)
- May (16)
- April (30)
- March (23)
- February (11)
- January (26)
2010
- December (29)
- November (28)
- October (35)
- September (33)
- August (44)
- July (17)
- June (20)
- May (53)
- April (29)
- March (35)
- February (33)
- January (36)
2009
- December (37)
- November (35)
- October (53)
- September (60)
- August (66)
- July (29)
- June (24)
- May (52)
- April (63)
- March (35)
- February (53)
- January (50)
2008
- December (58)
- November (65)
- October (46)
- September (48)
- August (96)
- July (87)
- June (45)
- May (51)
- April (52)
- March (70)
- February (43)
- January (49)
2007
- December (100)
- November (52)
- October (109)
- September (68)
- August (80)
- July (56)
- June (150)
- May (115)
- April (73)
- March (124)
- February (102)
- January (68)
2006
- December (95)
- November (53)
- October (120)
- September (57)
- August (88)
- July (54)
- June (103)
- May (89)
- April (84)
- March (143)
- February (78)
- January (64)
2005
- December (70)
- November (97)
- October (91)
- September (61)
- August (74)
- July (92)
- June (100)
- May (53)
- April (42)
- March (41)
- February (84)
- January (31)
2004
- December (49)
- November (26)
- October (26)
- September (6)
- April (10)

RavenDB - High-Performance NoSQL Document Database

May 24 2019

Backup goes wild

time to read 1 min | 143 words

Tweet Share Share 0 comments

Tags:

humor

We have a lot of internal test scenarios that run RavenDB through its paces.

One of them had an issue. It would test that RavenDB backups worked properly, but it would (sometimes) fail to clean up after itself.

The end result was that over time, the test database would accumulate more and more backup tasks that it had to execute (and all of them on the same schedule).

You can see here how RavenDB is allocating them to different nodes in an attempt to spread the load as much as possible.

We fixed the bug in the test case, but I also consider this a win because we now have protection from backup DoS attacks Smile . And I just love this image.

Jul 11 2018

Failing to prove you are human

time to read 1 min | 85 words

Tweet Share Share 1 comments

Tags:

humor

I didn’t intend to do another non technical post, but this was hilarious. My previous post was about taking my daughter to the ice cream shop.

My wife, naturally, wanted to comment on the post. She told me she gave up after failing to complete the CAPTCHA.

I’m not sure if I should be worried about the UX of the blog or upgrade my wife’s firmware.

Jul 10 2018

Toddlers, cursing and preparing ahead of time

time to read 2 min | 275 words

Tweet Share Share 0 comments

Tags:

My daughter is 3¼ years old now. About the time that she was born, I decided that I needed to make a small change in my language. Whenever I felt the urge to curse, I would say a food’s name. For example, after being puked on, my reaction would be some variant of: “PASTA”, “PASTA BOLOGNESE” or other pasta’s favorites.

As time went by, I got better and better at expressing emotion through increasingly disturbing food references. My current favorite is: “Pasta Bolognese with pickled carrots in a bun with anchovies and raw eggs”.

A couple of days ago, I took my daughter and a friend to an ice cream shop. As expected of an ice cream shop in the middle of (very hot) summer, the place was packed. My daughter was quite excited to go there and expressed her emotions by standing up and shouting at the top of her lungs (she is three, with a voice that carry like a foghorn): “PASTA! PASTA BOLOGNESE” over and over again.

This is a crowded shop, full of small kids and parents. I got some looks for the little girl holding up a full ice cream cone and shouting about pasta, but it was infinitely preferable to the alternative.

An unforeseen side effect, however, is that because I can, I’m very free with pasta based profanities. This had led to what is effectively a competition, with her trying to cause me to go overboard with that.

And now I must go back to work, before the gluten police arrival.

Jun 13 2018

.NET Core 2.1 broke my software, thank you very much!

time to read 1 min | 137 words

Tweet Share Share 5 comments

Tags:

humor

We just upgraded our stable branch to .NET Core 2.1. The process was pretty smooth overall, but we did get the following exchange in our internal Slack channel.

It went something like this:

is it known that import doesn't work ?
As you can imagine, Import is pretty important for us.
no
does it work on your machine ?
checking,,,
what's an error?
no error.
so UI is blocked?
do you have any errors in dev tools console?
`TypeError: e is undefined`
doesn't says to me much
same thing in incognito
export doesn't work either
lol the reason is: dotnet core 2.1

the websockets are faster and I had race in code
will push fix shortly

There you have it, .NET Core 2.1 broke our code. Now I have to go and add Thread.Sleep somewhere…

Apr 01 2018

This will be $11,509,586,000, please (excluding tip)

time to read 2 min | 251 words

Tweet Share Share 4 comments

Tags:

About a decade ago I was working at a client, and I was trying to get some idea about the scope of the project. This was a fairly large B2B system, with some interesting requirements around business logic behaviors.

For the life of me, I couldn’t get the customer to commit to even rough SLA or hard numbers around performance, capacity and scale. Whenever I asked, I got some variant of: “It has to be fast, really fast.”

But what is fast, and under what scenario, that was impossible to figure out. I got quite frustrated by this issue and pressed hard on this topic, and finally got something that was close to a hard number, “it has to be like Google”.

That was a metric I could do something with. I went to last year’s Google financial statements, figured out how much money they spent that year and sent the customer a quote for 11.5 billion US dollars.

As you might imagine, that number caught people’s attention, especially since I sent it to quite a few people at the customer. On a call with the customer I explained, “You want it to be like Google, so I used the same budget as Google”.

From that point it was much easier to actually get performance and scale numbers, although I did have to cut a few zeroes (like, all of them) from the quote.

Oct 19 2017

Giving Demeter PTSD

time to read 1 min | 113 words

Tweet Share Share 1 comments

Tags:

humor

The law of Demeter goes like this, a method m of an object O may only invoke the methods of the following kinds of objects:

O itself
m's parameters
Any objects created/instantiated within m
O's direct component objects
A global variable, accessible by O, in the scope of m

And then we have this snippet from a 1,741 lines index that was sent to us to diagnose some performance problems.

There are at least two separate leaks of customer data here, by the way, can you spot them?

This is it for this post, I really don’t have anything else left to say.

Sep 12 2017

Fractal release process

time to read 1 min | 66 words

Tweet Share Share 2 comments

Tags:

raven
humor

As part of the release process for RavenDB RC, we have different team members doing strange things. Here is what one of them sent me. This is a RavenDB cluster composed of 30+ nodes, with a single database that spans 27 nodes with full master/master replication between them.

Jun 28 2017

The things that come out late at night

time to read 2 min | 205 words

Tweet Share Share 2 comments

Tags:

humor

The following is the opening paragraphs for discussion RavenDB 4.0 clustering and distribution model in the Inside RavenDB 4.0 book.

You might be familiar with the term "murder of crows" as a way to refer to a group for crows[1]. It has been used in literature and arts many times. Of less reknown is the group term for ravens, which is "unkindness". Personally, in the name of all ravens, I'm torn between being insulted and amused.

Professionally, setting up RavenDB as a cluster on a group of machines is a charming exercise (however, that term is actually reserved for finches) that bring a sense of exaltation (taken too, by larks) by how pain free this is. I'll now end my voyage into the realm of ornithology's etymology and stop speaking in tongues.

On a more serious note, the fact that RavenDB clustering is easy to setup is quite important, because it means that it is much more approachable.

[1] If you are interested in learning why, I found this answer fascinating

It was amusing to write, and it got me to actually start writing that part of the book. Although I’m not sure if this will survive editing and actually end up in the book.

Mar 18 2016

Code through the looking glassAnd a linear search to rule them

time to read 1 min | 187 words

Tweet Share Share 9 comments

Tags:

The final piece for this candidate is the code to actually search through the sorted array. As you'll recall, the candidate certainly made our computer pay for that sorting and merging. But we are fine now, we have manage to recover from all of that abuse, and we are ready to rock.

Here is how the search (over sorted array) was implemented.

public List<int> EmailSearcher(string email)
{
    List<int> answer = new List<int>();
    for (int i = 0; i < emailsArray.Length; i++)
    {
        if (emailsArray[i].STR.ToString().Equals(email, StringComparison.OrdinalIgnoreCase))
        {
            answer = emailsArray[i].ID;
            return answer;
        }
    }

    return answer;
}

And with that, I have nothing left to say.

Mar 17 2016

Code through the looking glassSorting large data sets

time to read 3 min | 590 words

Tweet Share Share 10 comments

Tags:

The task in question was to read a CSV file (which is large, but can fit in memory) and allow quick searches on the data by email to find the relevant users. Most candidates handle that with a dictionary, but since our candidate decided to flip things around, he also included an implementation of doing this with sorted arrays. That is actually quite nice, in theory. In practice, it ended up something like this:

funny, funny pictures, funny photos, hilarious, wtf, weird, bizarre, funny animals, 17 Bizarre Photoshop Hybrid Animals

In order to truly understand the code, I have to walk you through a bunch of it.

It starts with the following fields:

List<Structure> emails = new List<Structure>();

Structure[] namesArray = new Structure[0];

Structure, by the way, is a helper class that just has a key and a list of ids. The duplicate is strange, but whatever, let move on.

The class constructor is reading one line at a time and add an structure instance with the email and the line id to the emails list.

public void ToArray()
{
    emailsArray = this.emails.ToArray();
}

This code just copy emails to the array, we are not sure why yet, but then we have:

public void Sort()
{
    Array.Sort(emailsArray, delegate(Structure x, Structure y) { return x.STR.CompareTo(y.STR); });
}

So far, this is right in line with the "use a sorted array" method that the candidate talked about. There is a small problem here, because emails are allowed to be duplicated, but no fear, our candidate can solve that…

public void UnifyIdLists()
{
    for (int i = 0; i < emailsArray.Length; i++)
    {
        if (i + 1 == emailsArray.Length)
            break;
        if (emailsArray[i].STR.Equals(emailsArray[i + 1].STR))
        {
            emailsArray[i].ID.AddRange(emailsArray[i + 1].ID);
            emailsArray[i + 1] = null;
            List<Structure> temp = emailsArray.ToList<Structure>();
            temp.RemoveAll(item => item == null);
            emailsArray = temp.ToArray();
        }
    }
}

The intent of this code is to merge all identical email values into a single entry in the list.

Now, to put things in perspective, we are talking about a file that is going to be around the 500MB in size, and there are going to be about 3.5 million lines in it.

That means that the emailsArray alone is going to take about 25MB.

Another aspect to consider is that we are using dummy data in our test file. Do you want to guess how many duplicates there are going to be there? Each of which is generating two 25MB allocations and multiple passes over an array of 3.5 million items in size.

Oh, and for the hell of it, the code above doesn't even work. Consider the case when we have three duplicates…

Oren Eini

Oren Eini

CEO of RavenDB

Backup goes wild

Failing to prove you are human

Toddlers, cursing and preparing ahead of time

.NET Core 2.1 broke my software, thank you very much!

This will be $11,509,586,000, please (excluding tip)

Giving Demeter PTSD

Fractal release process

The things that come out late at night

Code through the looking glassAnd a linear search to rule them

Code through the looking glassSorting large data sets

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed