Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,597
|
Comments: 51,224
Privacy Policy · Terms
filter by tags archive
time to read 1 min | 196 words

I know that I’m usually critiquing Visual Studio, but I was working in VC++ today, and I had several Wow moments.

The intellisense story is very poor compare to C# or VB.Net, but C++ is a hard language to parse, so I’m not (too) bothered by the issue.  A C++ programmer is supposed to built his own API anyway, so it doesn’t matter J.

What really impressed me was the debugging story. I had the data tips for native C++ types like vector<wstring> (!!).  As far as debugging went, it felt nearly as good as the managed world.

One thing that I miss is the exception assistant for std::exception (or just a way to get the exception message), but I can live without it. Hell, even mutli threaded debugging wasn’t that big of a deal (for a simple scenario).

Really nice.

I implemented my own ref counting scheme and RAII for thread safety, and didn’t swear much at all :-D. If this continue, I may start going on tours through <algorithms>.

 

As an aside, the intellisense is familiar with Guard and MutexGuard, but I can’t find them mentioned in the help anywhere, and it causes compiler error when I tried to use it.

time to read 1 min | 171 words

I am trying to play Black & White 2, and I'm running into a lot of issues with my graphic card (ABIT R9600XT) the default drivers crash often when playing the game, and trying to update the drivers resulted in crashing the computer itself when playing. This is a fairly old card, and it is only capable of displaying B&W in the lowest level of detials, but I am still blown away by the game.

My requirements are:

  • Stable Drivers - I'm not a top notch gamer. I want stable drivers.
  • Dual head support - not something that I'm willing to give up ever.
  • Support the things that B&W 2 needs.
  • Not a big, noisy monsster.

Any recommendations?

time to read 2 min | 348 words

For the past week I have been doing quite a bit of work in unmanaged C++. It is mainly a set of small utilities that we need, and they can't have a dependency on anything.

I used to be a big bigot for C++. I understood RAII and shared_ptr, I could read templated code without a hitch and I occationaly even managed to write a program where pointer chasing was optional :-)

Now that I am working in C++ again, I am amazed to find out just how much I forgot, and just how much stuff is just not there.It was so bad that I actually had to do a search for "Hello World C++" just to find the correct syntax to get the application to compile.

I spent a several minutes searching for std::string.split() method. I knew that it had one, but I could find it (never mind the intelli sense story for C++). I slowly dawned on me that I was actually thinking about System::String, and that C++ really does have more than a single string type.

I still remember the high level concepts, but I lost a lot of the instincts that I had for C++.

For instnace, should a method have a return value of std::string, or std::string* ? What is the meaning of each? I had to actually consider the implications of those decisions before I could go on. And don't get me started about having to worry about releasing memory. I need to write some mutli threaded safe code in C++, and just making sure that all the threads are releasing memory appropriately is going to be a huge headache.

But even beyond that, there is the not small issue of needing to remember all the API that I need. It is not just shared_ptr vs auto_ptr, it is simple things like reading a file or consuming XML.

I feel so lonely without the BCL...

time to read 14 min | 2786 words

After my last post about the ADO.Net Entity Framework, I got a long comment from Pablo Castro, the ADO.NET Technical Lead. I took the time answerring it, mainly because I wanted to consider my words carefully. Again, I'm probably not a partial side in this matter, with my strong NHibernate bias, but I do have quite a bit of experiance with Object Relational Mapping and how they map to real world applications.

My previous post was mainly about extensibility in the framework, and how much I can extend it to fit the needs of the applications. So, without further ado :-), let me get to the points Pablo brought up:

We're not ignoring extensibility :), we're looking at some aspects of this right now. A couple of examples:
  - we are working on making sure that the data model itself is extensible so services can be built on top of the model and have service-specific metadata that's accessible through the metadata API and at the same time doesn't interfere with the code runtime that depends on it.
  - we're also working finishing the details of the provider model so everyone can plug-in their stores  into the entity framework.

I read this paragraph several times, and I am afraid that I'm lacking the background to understand exactly what he means here. The Provider model, as I understand it, will allow to use Oracle, MySQL, etc as the database engine underlying the technology. The services is something I'm puzzled about, I'm not sure what is a service in this particual instance. Is the LINQ wrapper on the EDM a service? Or does it mean application level service?

A few comments on your specific concerns:
- many to many relationships: it's not in the current bits, but it's not ruled out either as of yet. The system models certain things as link-tables already (which doesn't necessarily mean you have to use link tables for 1-n relationships in the actual store, of course, you can use the regular 2-table PK/FK story), although not all the 2details are in place for specifying and updating m-m relationships. We'll see were we land on this.

This worries me. Many to many is a very common idiom in databases and in domain models. Users and groups is a very simple example of this, but in any model there are many such relationships. I am aware of the amount of work that this feature require, but that is the reason we get those frameworks in the first place, so the developers wouldn't have to handle this.

- Indexed collections. The current collections are not indexable, but I'll take the feedback and look into it. As for dictionary, I acknowledge that there are scenarios for it, although we currently don't have plans for doing it. Again, feedback taken.

Sets match the database model very nicely, but as a developer, I often has other need from collections. For instance, a set of filtering rules where the orderring of the rules is important match very nicely to a list (indexed by the position). A set of attributes that can related to an object is modeled as a index collection of key and value pairs. Frankly, I don't see much of a difference between dictionaries and indexed collections at a high level.

Those are the simple things, by the way. What about cases where the collection's key has a valid business meaning. A good example is checking valid date ranges. My house' lease have a contract for a specific period only, this naturally maps to an object model that has the relation between the house and the current leaser is a dictionary of date ranges and customers. Add a couple more freakish requirements, and you have to have support for those issues. As much as I would like it to be, using simple sets is often just not possible, too much information goes away this way.

- Custom collections: we have had *really long* discussion over this among ourselves...right now we are focusing on the scenarios where our own classes are used for collections; that allows us to do relationship fixups and the ends of a relationship changes. It also makes it straightforward to do change-tracking over related sets. With custom collections we can't fix up ends of relationships, so let's say you have a Customer object and an SalesOrder object; when you add the SalesOrder to the Customer.Orders collection, you'll expect SalesOrder.Customer to point to the right customer...we do that by hooking up into the collection. It results in less surprises, particular for users that don't have a lot of experience with the tricky areas of object layers on top of databases. As for your particular scenario, is that something you can embed in the CLR type that represents the entity (e.g. as property in the user part of the partial class)?

I wrote my own collection classes for NHibernate that will do the fixups for the relationships, so I fully understand the need and how nice it is to have this. That said, please don't try to protect me from myself. If writing a custom collection is something really hard, document it with big red giant letters, and let me feel the pain of doing it myself. I will need to do this, period.

I understand the issues with exploding test matrixes and scenarios that you can't support because while you may infinite supply of resources, you don't have infinite supply of time :-). But I would much rather a "Here Be Dragons" sign over a locked door.

For the specific scenario, you can take a look at this post, I discussed a bit how I solved the issue. Note that all the History collections are of IDictionary<DateRange, T> type.

- The tricky part of this one is update. When you change the value in your entity from false (null) to true (not-null) what value do you put in the database? "not-null" is not specific enough.

It took me a moment to figure out what you meant here. To the readers who aren't versed in OR/M implementation details, the issue is this:

sale.Customer = currentCustomer;
context.SubmitChanges();

The OR/M nees to figure out what the Id of the currentCustomer is, so it can save it to the database. I'm not sure that I understand the problem here, though.

You have the instance of the related object, and you know the object model. It is very simple to get from those two pieces of information to the value that should go to the database. I understand that you are working slightly differently in your model, using EntityRef<T> to explicitly hold the ID value, but I don't see this as an unsolvable issue, or even a very hard one. The simplest issue here is to dictate that you need this value, and fail if it is not there. If I do my custom thing, I should play nice with the rest of the framework.

Thoughts and further questions are welcome :)

You'll probably regret saying this :-) but I will try.

First, the main assumtion that I am making here is that the ADO.Net Entity Framework is supposed to be used in both new projects and existing ones, and that it is supposed to be a fully featured package, and not one that is targeted at the simple scenarios. If I am wrong, than most of my comments are invalid by defination, but the way I see ADO.Net Entity Framework presented seems to support this assumtion.

  • Legacy support
    This is a major issue if you decide that you want to support existing projects and not just new ones. In this case, you need to support some fairly crazy database schemas. From the sane to the fully de-normalized ones to the Let Us Put This Data Somehere approach that some peole prefer.
    To give you a few examples:
    • A PK/FK relationship where the FK is found via substring(5, PK) in the parent table.
    • A boolean flag that has null / not null mapped to true or false, with completely random values as the non null values.
    • A table where each row contains several entities, and the relations between them.
  • Inheritance model
    From the documentation that I have seen so far, the inheritance models supported are table per hierarchy (discriminator) and table per sub-class. It there support for table per class, and how well it plays with the rest of the system.
  • Splitting a value from a table
    This is related to too much information in a row, but I may want to break a part of the row into a value object that is associated with the object.
  • Caching
    What is the caching story? I know that you have Identity Map in place, but what about application wide caching? I assume that you will use ASP.Net's cache, but what happen on a web farm scenario?
  • Master / slave scanerios
    What happen when I want to scale by making all my writes to a single server, and replicate from there?
  • Connection Control
    What happens if I want to explicitly control the connection lifetime and behavior?
    For that matter, how much control do I have for where exactly the data goes? Can I decide to move to a different database for a save, and then move back?
  • Change tracking
    Change tracking on entities is usually done by comparing them to their original values, when this is done on large object graphs, it can be a significant performance hit, especially if I'm only updating a few values, but I need to read a lot of data. Can I take action in this case?
  • SQL Functionality
    What happens if I have some specific functionality in the database that I need to use? UDF is one example, but my SQL Functions come to mind as well. Is it possible? How well it integrates into the rest of the systems?
  • Composite Keys
    Can I use them? How well do they play with relashions? What happen if the relation is based on only part of the compose key?
  • Custom Types
    I mentioned the null bool scenario, but there are many other cases where I want to get involved with the way the framework is creating and persisting properties in my objects.

Customer scenarios - This is here because of a post by Clemens Vasters about how much less affective he became in influencing the direction of WCF since he joined the team. Now he doesn't have the justification of a customer need to do it.

Here a few examples of scenarios that I personally run into:

  • Timed objects - I discussed how I used OR/M to make sense of a really complicated business model. This approach require customized collection and quite a bit flexibility on the side of the OR/M when defining the way the data is transformed from the database to the object graph.
  • Xml Extended Attributes - I have got a customer who want to keep a list of extended attributes inside the table itself, and has decided to use XML Column in order to do so. The challange here is to get the data from the XML Column inot an object property and persist it back to the XML. This allows extending the table for child objects without modifying the object structure.
  • Handling legacy schema such as this one, I know how I can map it cleanly so the object model has no idea how the database looks like.

I posted anoter list of features that I constant use in OR/M here.

off-topic: regarding data-adapter batching, I made that call early in the Whidbey cycle; it was painful, but it was unrelated to wanting/not-wanting extensibility. It was a scoping decision. At some point you have to decide where to cut a release and it forces you to prioritize scenarios, and that includes painful decisions. I completely understand your frustration, but you can be assured that it wasn't, and it's not now in the Entity Framework design, lack of interest for extensibility.

The issue that I have with this approach is that it targets the simple scenarios, those who probably wouldn't get that much benefit from this. The amount of effort that Microsoft is putting into OR/M efforts is proof that the DataSet model doesn't really scale for complex applications. The decision to make this accessible only via DataAdapter punishes anyone who decide that they can do better by going lower in the stack and build from there. I am used to having more options the lower I go, not the other way around.

In ASP.Net, for instance, I may not be able to change the Page Controller architecture in the WebForms level, but I can go down a couple of levels and use Front Controller architecture, and the only functionality that I lose is the one spesific for WebForms.

But this is crying over spilled milk, I am afraid. The most pressing question at the moment is will this be fixed in .Net 3.0? And yes, I am talking about the former WinFX release. I want and need this ability. This is the closest release, and the next one after that is sometimes in 2008, which is way too long.

time to read 12 min | 2350 words

First things first, don't get too excited, NHibernate doesn't support Stored Procedures (yet).

What I am talking about is encapsulating HQL queries in a similar manner to the way stored procedures work.

A little background before I start talking about the technical details. When using NHibernate, I'm usually pushing the use of NHibernate's Criteria approach, since this is by far the most easy way for people to comprehend how to query the database using NHibernate. While for simple uses HQL is very simple:

from

Employee e where e.MiddleName is not null

HQL can get complicated when you want to do complex stuff, and it has one big problem. It looks like SQL, but it is not SQL.

Take this example, for instnace:

from PresenseReport pr, EmployementContract ec

where pr.Employee = ex.Employee

and   pr.Date between ec.ValidityStart and ec.ValidityEnd

and   pr.Employee.IsMobile = true

and   pr.Date between :startDate and :endDate

and   pr.Employee.Manager = :manager

I will spare you the effort of trying to come up with the equilent SQL. This is still a simple query, but it is not something that can be expressed using NHibernate's Criterias.

Now we have got several options of how to use it. We can embed it in our code, but this has some serious disadvantages. I really hate to have strings in my application, and I would much rather use a seperate file or a resource somewhere, but it turns out that we don't need to. NHibernate has the concept of a Named Query, which allow you to define a query in the mapping files and give it a name, like this:

<?xml version="1.0" encoding="utf-8"?>

<hibernate-mapping xmlns:xsd="http://www.w3.org/2001/XMLSchema"

                               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

                              xmlns="urn:nhibernate-mapping-2.0">

       <query name="GetPresenceReportAndContractsForSubordinaceMobileEmployeesInDateRange">

              <![CDATA[

              from PresenseReport pr, EmployementContract ec

              where  pr.Employee = ex.Employee

              and    pr.Date between ec.ValidityStart and ec.ValidityEnd

              and    pr.Employee.IsMobile = true

              and    pr.Date between :startDate and :endDate

              and    pr.Employee.Manager = :manager

              ]]>

       </query>

</hibernate-mapping>

You may consider the query name too long, but I like to be explicit about such things.

Now we can use the same query in several locations, without copying and pasting the code. If you really like, you can even change the query without changing the client code (which is supposed to be the holy grail of stored procedures.

You can use this query like this:

session.GetNamedQuery("GetPresenceReportAndContractsForSubordinaceMobileEmployeesInDateRange").

       SetParameter("manager", currentEmployee).

       SetParameter("startDate", startDate).

       SetParameter("endDate", endDate).

       List();

Now, this still doesn't free the application developers from understanding what is going on, of course. For instnace, this query returns a list of tuples, of of those contains a precense report and the matching employement contract. I need to understand the return value and unpack it into a form that is easier to work with.

By the way, in this case, the idea is to get the employement contract and use that to validate the report. Notice that I am checking for the date of botht he employement contract validity and the precense report itself. It removes a whole set of issues that I suddenly do not hvave to think about.

If you wanted to be really fancy, it is trivial to take the XML above and use code generation to generate a query class that will wrap this functionality.

time to read 1 min | 189 words

My current readership seems to have stabilized on around 3,500 readers a day (after a peak of about 5,500 a day after I got linked from ScottGu's blog).

After seeing the dramatic difference that such a link can cause, I decided to check how I can bring in more readers. The simplest thing is to get myself linked from the aggregations sites such as Digg, DotNetKicks, etc. Of course that the high goal there is to get slashdotted, but I'm leaving this tactic for the future :-)

Now, putting my own posts on those sites is sleazy, but there is no reason that I wouldn't make it easy for you to do it. So I added links to deli.cio.us, Digg and DotNetKicks on each post. I used dasBlog's ability to define custom macros, and this custom macro library.

The link contains all the information that you can want on this, but be aware that the macros shown there wouldn't work as-is. For some reason Vasanth forgot to put <% on the beginning of each macro.

FUTURE POSTS

  1. Replacing developers with GPUs - 3 hours from now
  2. Memory optimizations to reduce CPU costs - 2 days from now
  3. AI's hidden state in the execution stack - 5 days from now

There are posts all the way to Aug 18, 2025

RECENT SERIES

  1. RavenDB 7.1 (7):
    11 Jul 2025 - The Gen AI release
  2. Production postmorterm (2):
    11 Jun 2025 - The rookie server's untimely promotion
  3. Webinar (7):
    05 Jun 2025 - Think inside the database
  4. Recording (16):
    29 May 2025 - RavenDB's Upcoming Optimizations Deep Dive
  5. RavenDB News (2):
    02 May 2025 - May 2025
View all series

Syndication

Main feed ... ...
Comments feed   ... ...
}