Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 2 min | 316 words

imageNot all of our testing happened in a production settings. One of our test clusters was simply running a pretty simple loop of writes, reads and queries on all the nodes in the cluster while intentionally destabilizing the system.

After about a week of this we learned that this worked, there were no memory leaks or increased resource usage and also that the size of the data on disk was about three orders of magnitude too much.

Investigating this we discovered that the test process introduced conflicts because it wrote the same set of documents to each of the nodes, repeatedly. We are resolving this automatically but are also keeping the conflicted copies around so users can figure out what happened to their system. In this particular scenario, we had a lot of conflicted revisions, and it was hard initially to figure out what took that space.

In our production system, we also discovered that we log too much. One of the interesting feedback items we were looking for in this production test run is to see what kind of information we can get from the logs and make sure that the details there are actionable. A part of that was to see if we could troubleshoot something simply using the logs, and add missing details if there were stuff that we couldn’t figure out from them.

We also discovered that under load, we would log a lot. In particular, we had logs detailed every indexed document and replicated item. These are almost never useful, but they generate a lot of noise when we lowered the log settings. So that went away as well. We are very focused on logs usability, it should be possible to understand what is going on and why without drowning in minutia.

time to read 6 min | 1179 words

imageThe year 2018 just rolled by, and now it the time to talk about what we want to do in this year. The release of the 4.0 version is going to be just the start, to be honest.

In no particular order, I want the following things to happen in the near future:

  • Finishing the book (github.com/ravendb/book). I currently have more than 300 pages in it, and I’m afraid that I’m only 2/3 of the way, if that. RavenDB has gotten big and doing justice to everything it does take a lot of time. My wish list here is that I’ll finish writing all the content by the first quarter and have it out (as in, you can have it on your desk) by the second quarter. Note that you can read it right now, and the feedback would be very welcome.
  • All the client APIs RTM’ed. We currently have clients for .NET, Python, JVM, Go, Ruby and Node.JS. Some of them are already ready for production, some are at RC level and some are still beta quality. We’ll dedicate a some effort and release all of these in the first quarter as well. I think that alongside with being able to run on multiple operating systems, we want to give people the choice of using RavenDB from multiple platforms and having a client for a particular platform is the first step on that road.
  • Getting (and incorporating) users’ feedback. We have worked closely with several of our customers on the release of 4.0, and we have got people chomping at the bit to just get it out (who wants to say no to being 10 – 50 times faster). But RavenDB 4.0 is a huge undertaking, and there are going to be things that we missed. The feedback from the RC releases has been invaluable in finding scenarios and conditions that we didn’t consider. I’ve explicitly put aside time to handle that sort of feedback as people are rolling out RavenDB and need to smooth any rough corners that still remain.

These are all the near term plans, for the next few months. These mostly deal with actually dealing with the aftermath of a big release, with nothing major planned for the near future because I expect all of us to be dealing with all the other things that you need to do with a big release.

The last year had seen us grow by over 40% in terms of manpower and the flexibility of having some many great people working here which can push the product in so many directions at once is intoxicating. I have been dealing with a lot of retrospectives recently as we have been completing RavenDB 4.0 and it amazed me just how much was accomplished and how many irons we still have in the fire. So let’s talk about the big plans for 2018, shall we?

Additional storage types

In 4.0, we have JSON documents and binary attachments that you can add to a document. One of our goals in 2018 is to add two or three additional options, turning RavenDB from a document database to a true multi paradigm database. In particular, we want to add:

  • Distributed counters
  • Time series
  • Graph operations

These are all going to be living together with documents, so you have have a user’s document with a FitBit and a heartrate time series on that document that updates every 5 seconds. Or you can have a post document in a blog and use a counter to track how many likes it has gotten. And I don’t believe that I need to explain about graph operations. We want to allow you to define connections between documents and query them directly.

The idea here is that we got documents, but they aren’t always the best tool for the job, so we want to offer you the option to do that in a way that is optimized, fast, easy and convenient to use.

Better integration

We already have done a lot of work around working with additional services and environment, we just need to polish and expose that. This means things like being able to get a PouchDB instance that is running in your browser and have it sync automatically and securely to your RavenDB cluster.  Or being able to point RavenDB into an instance of a relational database and have it such all the data, build the document model and save your a lot of work on migrating to a document database.

Ever faster

Performance is addictive, and it has caught us. We are now orders of magnitude faster than ever before. We have actually been scaling down our production servers intentionally to be able to see if we can find more bottlenecks in the real world, so far we went down CPU by half and memory to one quarter and we are still seeing faster response times and better latencies. That said, we can do better, and we are planning to.

I’m looking forward to are things like Span<T> and Memory<T> which would really reduce overheads in some key scenarios.  We are also eagerly awaiting the arrival of SIMD intrinsics in the CoreCLR and already have some code paths that are going to be heavily optimized as a result. Early results show something like 20% – 40% improvement, but we’ll probably be able to get more over time.  One of the reasons I’m so excited about the release is that people get to actually use the software and see how much it improved, but also give us feedback on the things that can be made even faster.

Until we have actual people using us in production over a long period of time, it is hard to avoid optimizing in the dark, and that never gives a good ROI.

Community

Everything before was mostly technical things. Features that are upcoming and new things that you get to do with RavenDB. We are also going to invest heavily in getting the word out, showing up at conferences and users’ group. We are also scheduling a lot of workshops around the globe to teach RavenDB 4.0. The first round is already available here.

There is also the new community license for RavenDB, allowing you to go to production without needing to purchase a commercial license. This should reduce the barrier for adoption and we hope to see a lot of new users starting to come to RavenDB. We are now free to use, running on multiple operating systems and available in the most commonly used platform. And we are easy to get right, although it was anything but easy to get there. A good example of that is the setup video.

All in all, 2017 has been a major year for us, both in term of growth on any parameter we track and the culmination of years of efforts that lead us to the release of RavenDB and seeing the new version take its first steps on the first days of 2018.

Happy new year, everyone.

time to read 1 min | 155 words

It’s that time again, we are looking for more people to work on RavenDB. I’m going to assume that if you are reading this, you know what we do, so I’ll skip telling you how exciting, dynamic and buzzword of the day this position is. I’ll say that we are doing a lot of fun things, one of our guys just finished taking us from 200 req/sec in a particular scenario to 30,000 req/sec, for example Smile.

We are looking for someone who can build system software in C#, with really good understanding of the way computers work, and how to get the best out of them. If you have OSS contributions, that puts you at the head of the line.

Just ping us at jobs@ravendb.net with your CV.

This position is for Hadera, Israel, and is not available for remote work.

time to read 7 min | 1216 words

I run into this post, which updates the (by now) venerable Joel Test to our modern age. I remember reading the Joel Test (as well as pretty much anything by Joel) at the beginning of my career and I’m pretty sure that it influenced the way I choose employers and designed software. Seeing this post, I decided to see how Hibernating Rhinos would rank on this test today. I put both the original and updated version, and my comments are below.

  Original Updated
1

Do you use source control?

 
2

Can you make a build in one step?

Can you build and deploy your software in one step?

3

Do you make daily builds?

Do you build on every commit?

4

Do you have a bug database?

 
5

Do you fix bugs before writing new code?

 
6

Do you have an up-to-date schedule?

Do you measure your progress in terms of value delivered?

7

Do you have a spec?

Do you have a runnable spec?

8

Do programmers have quiet working conditions?

Does your environment foster collaboration?

9

Do you use the best tools money can buy?

 
10

Do you have testers?

Is testing everyone's responsibility?

11

Do new candidates write code during their interview?

 
12 Do you do hallway usability testing?  

 

  • Source control – Yes, thankfully, I think that the days of anyone not using source control for anything but a scratch project are behind us. Now the arguments are which source control.
  • Build & deploy in one step – Yes, the build process runs on a Team City server, and while it sometimes require some TLC (I’m looking at you , nuget), it pretty much runs without us having to pay much attention to it.
  • Build & verify on every commit – No. But yes. What we do is have the build server run the full suite on every Pull Request, which is our unit of integration. Commits are far less important, because we break them apart to make them easier to review.
  • Bug database – Yes, but see also the next topic. We mostly use it for bugs we find, and features / improvements, not so much for customers bugs.
  • Do you fix bugs before writing new code – No. But yes. The reason this is complex to answer is how you define bugs. A customer issue is typically handled from A to Z on the spot. We have a rotating function of support engineer that handle such scenarios, and they prioritize that over their routine work.
  • Do you have a schedule / do you measure progress in term of value – We have a rough schedule, with guidelines about this is hard deadline and this is a nice deadline. Hard deadline is about meeting outside commitments, typically. Nice deadlines are about things we would like to do, but we won’t kill ourselves doing them. We do have a sense of what is important and what isn’t. By that I mean is that we have a criteria for “we should be chasing after this” and “this is for when we run out of things to do”.
  • Do you have a (runnable) spec?  - Yes, we have a spec. It isn’t runnable, and I’m not sure what a runnable spec for a database would be. The spec outline thinks like the data format and how we do data fetches for indexes, architectural considerations and rough guidelines into where we are going. It isn’t detailed to the point of being runnable, and I don’t like the idea very much.
  • Developers have quite working conditions / environment encourage collaboration  – The typical setup we have is a separate office for every two developers. I typically see people move around the offices and collaborate on all sort of stuff. If it bugs the other dev in the room, they usually have headphones to deal with it, but that isn’t happening enough to be a major problem. A common issue for people who leave their workstation unattended and use headphones is that by the time they get back and put the headphones, the music has been changes to something suitably amusing, such as this one.
  • Best tools that money can buy – Procurement in Hibernating Rhinos is a process, it involves sending an email with “I need this tool”, and you must include the link to the tool. Then you have to wait anything between 15 minutes to 24 hours (depending on when you asked), and you’ll get the license. I have seen far too many stupid decisions of “oh, we don’t have a budget for this 200$ tool but we’ve no problem paying the 2000$ that it would cost us in time” to suffer that.
  • Testers / everyone is a responsible – Yes. Every single dev is writing tests, and every single PR is sent after tests has been run locally, and then on the build server.
  • Candidates write code in interview – Yes, oh yes they do.
  • Hallway usability testing – See below, too complex to answer here.

RavenDB has multiple level of “user interface”. The most obvious one is the RavenDB studio, but the one that we spend the most time on is the external (and internal) APIs. For the API, we have a review process in place to make sure that we are consistent and make sense. Most of the time we are doing things that follow the same design line as before, so there is not much to think about. For big things, we typically also solicit feedback from the community, to make sure that we aren’t looking into with colored glasses.

For our actual user interface, the Studio, we used to just have the other devs look at the new functionality.  But that led to a lot of stuff that worked, but the amount of attention we actually paid to the UI used to be really variable. Some features we would iterate over for multiple weeks, getting them just right (the most common operations, as we see them). But other stuff was just “we need to expose this functionality, let us do this”, which led to almost one to one mapping of the server side concept to the UI, which isn’t always helpful for the users.

We have started with a full UX study of the RavenDB Studio, and we are going to be doing full design analysis on each of our views with an eye to improve it significantly by 4.0.

time to read 2 min | 321 words

elemarjrToday is a great plus one news day to RavenDB Latin America. Elemar Junior is joining our team as official RavenDB consultant, on January 1st.

Elemar will work helping our customers, writing a lot of code, producing demos, videos and tutorials and in general focus on the getting started process easier as well as faster and smoother process for developers and operations people to get familiar and accustomed to getting the best out of RavenDB.

Elemar is well known for writing and speaking about advanced topics on development, design and software architecture. If you aren’t familiar with him, feel free to check his blog (pt-br only), or linkedin but the short gist of it is that he started to write computer code when he was nine years old and still love it. He has over seventeen years of professional experience developing software (used in over thirty countries) for manufacturing furniture and to design and planning of residential and commercial spaces.

With a strong focus on the Microsoft ecosystem, he has been awarded as Microsoft MVP since 2011. Elemar is the author of FluentIL, an emitting library for .NET platform, and leading figure of CodeCracker, a popular analyzer library for C# and VB that uses Roslyn to produce refactoring, code analysis, and other niceties.

Elemar can be reached via elemarjr@ravendb.net and is currently looking for someone that would Photoshop this image with a cat or ask him tough questions about RavenDB.

On a more serious note, this represent a bigger focus on having dedicated people to handle just working with customers, providing guidance and support and writing tutorials, sample applications and in general just making everything that much easier.

We are hiring

time to read 1 min | 84 words

It is that time again, and Hibernating Rhinos is hiring developers to work on our flagship product, RavenDB. We need passionate developers, and we don’t care if you are just starting out, or if you have a decade of experience.

If you want to join our team, drop us a line at jobs@ravendb.net, with a link to github profile page (or similar) with projects that you have worked on, as well as your CV.

This position is only available in Israel.

time to read 1 min | 70 words

Well, it is nearly the 29 May, and that means that I have been married for four years.

To celebrate that, I am offering a 29% discount on all our products (RavenDB, NHibernate Profiler, Entity Framework Profiler).

All you have to do is purchase any of our products using the following coupon code:

4th Anniversary

This offer is valid to the end of the month only.

Funding options

time to read 8 min | 1433 words

This is a divergence from my usual discussion on technical stuff. In this post, I want to talk about money. In particular, how you get it from other people. Note that I am neither an expert nor qualified to talk about the subject matter, this post came out of a lot of scribbled notes and is mostly meant to serve as a way to lay down a line of thought. All numbers are made up, and while I would like such a car, it would be mostly to inflict it on the employee of the month.

There are many cases in the lifecycle of a business where you need more cash than you currently have (or are willing to spend outright).

A common scenario is when you start a business, or when you want to expand it. For our discussion, we’ll use the example of the following drool worthy car:

I consider such a piece of art priceless, but let us say that I managed to convince the owner to sell it to me for the nice sum of 1,000,000$.

Unfortunately, I don’t have 1,000,000$. I only have 650,000$. So long, beautiful car, it was very nice to know you, but it is just not possible. Except that there is this thing where people give you a lump sum of money, and you give it back over time (although usually more than you got).

Funding is important for businesses in the same manner that breathing is for people. There are typically several ways to fund a business:

  • Direct cash infusion – That is usually how most business start. The amount of money put into the business depend on what it needs to do. A web developer would need the money buy a laptop and a Starbucks loyalty card, so that is easy. For a restaurant, you need enough money for rent, employees, equipment, etc. The smaller the amount you need to put into the business to kick start it, the easier it is to just use your own saving to do so.
  • Partners – This is pretty much the same as the previous one, but instead of having only one person do that, you have multiple people and more savings to dip into.
  • Angels/Investors – Those are people who for various reasons would give you money. Sometimes this is because they are related to you, but often time it is a calculated move, investing some money in a business in order to get a stake in it and cash it in afterward.
  • Government development loan / grant – Sometimes you can get this, and they usually have both very good terms, and really strict rules, regulation and hops to jump through.
  • Bank / credit loan – Well, you are presumably familiar with that. You get a loan, pay interest, mortgage some assets, etc.
  • Self funded – Your business is making more money than it is spending, therefor you have money to spend on the business.

The best choice is self funding, because that mean that you are profitable and aren’t beholden to someone. The other really depend on personal preferences. Here are mine:

  • Direct cash infusion – That works for starting a business with low starting overhead costs (see, single developer shop). It might also be viable if you have a lot of personal wealth that you can put into the business, but personally, I like to think about the money flowing in the other direction. Otherwise that is an indication that there is something strange going on here.
  • Partners – I used to work at a place that was owned by 7 founding members + 1 “silent partner”. I still remember when the entire company got an email from a co-CEO that was basically: “You are forbidden to discuss project X or anything related to it with the other co-CEO”. That left an… impression, shall we say. Also, this is again something that you would usually do in the beginning. Bringing a partner into an existing business implies one of a few things. You are in a big trouble (either personally or the business) and need cash infusion that you can’t/won’t supply or you are doing really well and people are flocking to join you.
  • Investors/Angels – This is very similar to the previous point, with the caveat that investors usually aren’t going to meddle in the day to day affairs, nor are they going to shoulder any burden. They are there to provide the money, some expertise/networking but that is basically it. They do create a pretty huge amount of bureaucracy, reports, compliance, etc. The investors needs to know that you aren’t blowing away their money, after all.
  • Government development loan / grant – This is pretty much the same as the previous one, only the investor is the government. If you thought that investors generated a lot of paperwork, you were mistaken.

The remaining two options are self funding and getting a loan. Now, assuming that no one else buy this magnificent car, I can put some numbers in Excel and predict that in a couple of years, I’ll have enough money to buy it outright. So all I need to do is ask the owner to not sell it to anyone, hope that my cash flow remain according to projections, hope the price doesn’t change and just wait.

Of course, that means that I can’t crash lift moral by making this the official company car in the meantime. I’m losing quite a lot of amusing moments by waiting, and that is assuming that it is still possible in two years. Of course, if in two years I would have the money to do so, I’m not so sure that I would still want to just purchase it directly. That would mean having no money at all. And that is kinda of scary, because salaries need to be paid, and this car doesn’t look like it has good gas/mileage ratio.

So the option that we have left is taking a loan. The nice thing about doing that is that we can mortgage the actual asset that we are buying, this magnificent car. Now, the bank may not value it as much as I do, so they are going to give it a price of only 900,000$, and then they are going to only agree to fund 80% of that, which gives us 720,000$.

In other words, that means that we need to puny up 380,000$, which is much more reasonable, and leave us with a bit of free cash cushion. That lead to a few interesting observations:

  • The loan amount and the money we already have are comparable. That means that the bank is going to be much nicer to us than if we wanted to borrow much more money than we already have (on the assumption that if we got this amount of money once, we’ll be able to get it again to pay them).
  • There is a valid asset to mortgage, which reduce the loan risk (and thus get us better terms).
  • The current interest environment is at an all times low, which mean that this is a great time to loan money (and bad time to try to save).

This means that this is a much simpler deal than going to a bank with a business plan and hoping that they will believe that we can make it. Now, let us get down to the financial details.

An offer from bank A is for an interest rate of 4%. That gives us a month payment over ten years of 7,290$ per month.

An offer from bank B is for an interest rate of 4.25%. Which gives a monthly payment of 7,375$ per month.

That is a simple number game, and we are pretty much done at this point, right? Almost, but let us project this over 10 years, and see where that put us.

Bank A: Total amount of interest paid is 154,800$

Bank B: Total amount of interest paid is 165,000$

In other words, the total difference is 10,200$. That means that while it is still a numbers game, it isn’t just the interest rate. The reason is that we now need to consider a lot more aspects. For example, Bank B may have an easier loan approval process, or require less security, or value the car higher than bank A. Bank A doesn’t allow early cash out, while bank B does, or a million and one other differences.

The question now becomes is whatever the other stuff beyond the raw interest rate can be quantified, and whatever it is worth more than 10,000$.

As I said earlier in this post, this is mostly settling things in my mind. Feel free to ignore this post all together.

time to read 1 min | 124 words

I am currently in the states, and I can’t go anywhere without seeing a lot of signs for Black Friday. Since it seems that this is a pretty widely spread attempt to do a load test on everyone’s servers (and physical stores, for some reason). I decided that I might as well join the fun and see how we handle the load.

You can use one of the following coupon codes (each one has between 4 – 16 uses) to get a 20% discount for any of our products, if you buy in the next 48 hours.

    1. pink-Sunday
    2. white-Monday
    3. green-Tuesday
    4. orange-Wednesday
    5. red-Thursday
    6. black-Friday
    7. yellow-Saturday

This explicitly includes 20% discounts for RavenDB and the Profilers.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}