Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969

Posts: 6,633 | Comments: 48,370

Inside RavenDB 4.0: Book update

time to read 1 min | 66 words

Just to let you know, the book is pretty much edited, which means that you won’t have to suffer through my horrible sentence structure.

You can read this here.

What remains to be done now is for me to go over the book again, verify that there aren’t any issues, and we are done.

In other words, we are now “Done, Done” in the “Done, Done, Done” scale.

How to really fail a coding interview

time to read 1 min | 154 words

Our current interview question is from this post. We use it between the phone interview and the actual interview to get a feel for a candidate’s abilities. You can learn a surprising amount from even a small amount of code.

Note that one of the primary goals of such a question isn’t to tell you “You should really hire this candidate” but to tell you that “You really shouldn’t”. To clarify, this is a “do it on your own, and you have the whole internet at your disposal” kind of question. Typically we give a week or so to answer this.

Sometimes we get a very clear signal from the code, like in the case of this code:


But I think the crowning glory was this code:

I picked two of the worst offenders, but there were more. Some things I can sort of let slide, and some things I’ll just say no to.

DotNetRocks show on RavenDB with Kamran Ayub

time to read 1 min | 107 words

Kamran Ayub did a great DotNetRocks show about RavenDB 4.0. Kamran is also behind the RavenDB 4.0 course on PluralSight, so he knows his stuff.

I got to say, it is… strange to listen to a podcast about RavenDB. I found myself nodding along quite often and the outside perspective is pretty awesome.

Kamran also tested the same application on RavenDB 3.5 and RavenDB 4.0, seeing a 20x performance improvement. Best quote from the show as far as I’m concerned:

So fast you aren’t sure it actually worked.

Kamran also has a follow-up post with some numbers and more details here.

Listen to the show here.

RavenDB online bootcamp is now updated to 4.0

time to read 1 min | 136 words

In addition to the book and the documentation, we are also working on making it easier to get started with RavenDB. The RavenDB Bootcamp is a self-directed course meant to give you an easy way to start using RavenDB.

This is a guided tour, walking you through the fundamentals of getting RavenDB up and running, how to put data in and query it, and how you can use indexing and MapReduce. These are short lessons, providing practical experience and guidance on how to start using RavenDB.

You can also register to get a lesson a day.

This is now updated to RavenDB 4.0, smoothing the learning curve and making it even simpler to get started.

The project is free, but I’ll charge you for reporting bugs

time to read 4 min | 607 words

The impetus for this post is this tweet:

I cannot express enough how much I object to this statement. I absolutely understand the reasoning, by the way. Drive by bug reports can be frustrating. Users of open source projects can have unreasonable expectations. Any open source project maintainer can tell you about posts with “Your code SUCKS” or people calling your phone at odd hours with hard to understand accents (and I’m not really one who can complain here) and demanding that you fix stuff. This sucks, absolutely. And in a popular project, you might run into a lot of “user error” bug reports, or even outright “fix my code” issues.

Nevertheless, I think that this approach focuses on only one side of the issue: how much of a burden it puts on the maintainers of the project. What it misses is that there is really valuable information contained in the bug report. It might be something that the software is not capable of doing, a wrong usage (no, you cannot use this class concurrently) or a real bug. Regardless, a bug report is valuable in and of itself. Someone put in the time to actually use your software, identified a problem (for their use case) and reported it.

That doesn’t mean that you are in any way (if you are an OSS project) obliged to fix this issue. In fact, I believe that even if you just threw the code on GitHub because you didn’t have anything else to do with it, bugs are still valuable.

Bug reports involve effort, and if you have a live OSS project, you want to respect that and answer those bugs. At a minimum, they tell you what users are doing with your software. That might not motivate you to fix those issues, but it is good to know anyway.

Sometimes you will get a good bug, either about an “obvious” missing functionality that you can add or “wow, we have that” that is critical to fix. Bugs that never impact you are also interesting. It might be a race condition that you were lucky to never hit, or a silent miscalculation that was never noticed, data corruption that will hit you in the future or even a security issue. Regardless, it is interesting.

All of the above doesn’t mean that you have to do something about any of these. In fact, even if you don’t ever intend to go back to this project, bugs are very useful. And not for you, for other people. If someone comes to your project and sees posted bugs, they can figure out what not to do. They can learn about possible workarounds and confirm that this is a bug / limitation and they aren’t going crazy.

As for how to actually handle such bugs in an OSS project, if you aren’t interested in fixing a bug, it is perfectly fine not to. A free OSS project can absolutely have policies on what are acceptable bugs, and closing the “fix my code” issues is a good policy in general.

For real issues in the code that you aren’t interested in fixing, it is okay to say: “Send me a pull request for this”.

For nasty replies, I found that: “I’ll be happy to refund your money” usually puts things in perspective.

Inside RavenDB 4.0: The book is done

time to read 1 min | 166 words

The Inside RavenDB 4.0 book is done. That means that all of the content is there and it covers every aspect of RavenDB.

There is still quite a bit to be done (editing and re-reads, mostly), but the hardest part (for me) is done. I got it all out of my head and into a format where others can look at it.

You can read the draft release here.

The book covers:

  1. Welcome to RavenDB
  2. Setting up RavenDB
  3. Document modeling
  4. Client API usage
  5. Batch processing with subscriptions
  6. Distributed RavenDB Clusters
  7. Scaling RavenDB
  8. Sharing data and ETL processes
  9. Querying
  10. Indexing in RavenDB
  11. Map Reduce and aggregations
  12. Managing and understanding indexes
  13. Securing your RavenDB cluster
  14. Encrypting your data
  15. Production deployments
  16. Monitoring and troubleshooting
  17. Backup and restore
  18. Operational recipes

A total of 18 chapters and 570 pages so far.

I’m still missing an index, intro and a bunch of stuff, but these are now more technical in nature. No need for creative juices to pump to get them working.

Feedback is welcome, I would really appreciate it. You can read it here.

Inside RavenDB 4.0: Chapter 17 is done

time to read 1 min | 156 words

You might have noticed that I’ve slowed down writing blog posts. This is because pretty much every word I write these days goes into the book.

I just completed chapter 17, and we are now standing at around 550 pages in length, and there is only one chapter left.

Chapter 17 talks about backups and restores, how they work in RavenDB and how to properly manage your backup strategies in RavenDB. It sounds deathly dull, but it can actually be quite interesting, since backing up a distributed database (and restoring one, which is harder) are non-trivial problems. I hope that I did justice to the topic.

Next, maybe even as soon as early next week, is Chapter 18, operational recipes, which will cover all sorts of single use case responses from the operations team on how to deal with various scenarios inside RavenDB.

You can read the draft here and your feedback is always appreciated.

It was your idea, change the diaper

time to read 3 min | 470 words

You learn a lot of things when talking to clients. Some of them are really fascinating, some of them are quite horrifying. But one of the most important things that I have learned to say to a client is: “This is out of scope.”

This can be an incredibly frustrating thing to say, both for me and the client, but it is sometimes really necessary. There are times when you see a problem, and you know how to resolve it, but it is simply too big an issue to take upon yourself.

Let me give a concrete example. A customer was facing a coordination problem with their system: they need to deal with multiple systems and orchestrate actions among them. Let’s imagine that this is an online shop (because that is the default example) and you need to process an order and ship it to the user.

The problem at this point is that the ordering process needs to coordinate the payment service, the fulfillment service, the shipping service, deal with backorders, etc. Given that this is a B2B system, the customer wasn’t concerned with the speed of the system but was really focused on the correctness of the result.

Their desire was to have a single transaction encompass all such operations. They were quite willing to pay the price in performance to achieve that goal. And they turned to us for help in this matter. They wanted the ability to persistently and transactionally store data inside RavenDB and only “commit” it at a given point.

We suggested a few options (draft documents, a flag in the document, etc), but we didn’t answer their core question. How could they actually get the transactional behavior across multiple operations that they wanted?

The reason we didn’t answer that question is that it is… out of scope. RavenDB doesn’t have this feature (for really good reasons) and that is clearly documented. There is no expectation for us to have this feature, and we don’t.  That is where we stop.

But what is the reason that we take this stance? We have a lot of experience in such systems and we can certainly help find a proper solution, why not do so?

Ignoring other reasons (such as this isn’t what we do), there is a primary problem with this approach. I think that the whole idea is badly broken, and any suggestion that I make will be used against us later. This was your idea, it broke (doesn’t matter you told us it would), now fix it. It is a bit more awkward to have to say “sorry, out of scope” ahead of time, but much better than having to deal with the dirty diapers at the end.

RavenDB Security Report: Collision in Certificate Serial Numbers

time to read 2 min | 209 words

This issue in the RavenDB Security Report is pretty simple: when we generate a certificate, we need to generate a certificate serial number. We were using a random number that is 64 bits in length, but that is too small. The problem is the birthday attack. For a 64-bit number, you only need about 5 billion attempts to have an even chance of generating a collision. In modern cryptography, that is actually a very low security threshold.
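
As a rough check on that figure, here is a minimal Python sketch (not RavenDB code) of the standard birthday approximation: in a space of N equally likely values, a collision becomes 50% likely after about sqrt(2 · N · ln 2) random draws. The function name `attempts_for_collision` is made up for illustration.

```python
import math


def attempts_for_collision(bits: int, probability: float = 0.5) -> float:
    """Approximate number of uniformly random draws from a space of 2**bits
    values needed to hit at least one collision with the given probability.

    Uses the usual birthday approximation n ~ sqrt(2 * N * ln(1 / (1 - p))).
    """
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


# A 64-bit serial number: roughly 5 billion draws for a 50% collision chance.
print(f"64 bits:  {attempts_for_collision(64):.2e} draws")
# A 160-bit (20 byte) serial number, for comparison.
print(f"160 bits: {attempts_for_collision(160):.2e} draws")
```

The 64-bit case lands at roughly 5 billion draws, while the 160-bit case is on the order of 10^24, which is why a 20 byte serial number is the safer choice.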

So we fixed it and used a random value that is 20 bytes in length. Or so we thought. This single issue is worth the trouble of publicly discussing the security report. As it turned out, I didn’t read the API docs properly and used this construction:

new BigInteger(20, random);

Here, random is a cryptographically secure random number generator. The problem is that this BigInteger constructor takes a length in bits, not bytes. That resulted in a security “fix” that was actually much worse than the previous situation (you only need a bit over a thousand tries to generate a collision). This has already been fixed, obviously, but I’m very happy that it was caught.
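
To make the bits-versus-bytes mix-up concrete, here is a small Python sketch. It does not use the BigInteger API from the post; it relies on Python's standard `secrets` module as a stand-in, and the helper names are hypothetical. `secrets.randbits(k)` also takes a bit count, so passing 20 yields a value below 2**20, while a 20 byte serial number needs 160 bits.

```python
import secrets

SERIAL_BYTES = 20  # intended serial number size from the post


def random_serial(num_bytes: int = SERIAL_BYTES) -> int:
    """Return a random non-negative integer drawn from num_bytes * 8 bits."""
    return secrets.randbits(num_bytes * 8)


def draws_until_collision(bits: int) -> int:
    """Draw random values of the given bit size until one repeats and
    return how many draws that took (including the repeated one)."""
    seen = set()
    draws = 0
    while True:
        value = secrets.randbits(bits)
        draws += 1
        if value in seen:
            return draws
        seen.add(value)


# The mistake: passing a byte count where the API expects a bit count.
too_small = secrets.randbits(20)   # only 2**20 (about a million) possible values
serial = random_serial()           # 2**160 possible values

print("bit lengths:", too_small.bit_length(), serial.bit_length())
print("first collision with 20 bits after", draws_until_collision(20), "draws")
```

The exact count varies from run to run, but with only 20 bits the first repeat typically shows up after something on the order of a thousand draws.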
