Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 5 min | 995 words

I talked about why statics are evil a couple of days ago. Now let us see why we want to use them anyway. Let us walk through a common scenario and see what we have there. The scenario that I present here is extremely simplistic, of course, but it should be enough for you to get the point.

Let us take the common scenario of displaying a web page. Most web pages are composed of many small pieces of data, and it is often not possible to fetch them from the same source. For the purpose of discussion, we will need to show a list of orders, the customer info, and a set of personalization information.

The calls we need to make include the following:

  • Get Recent Orders
  • Get Current Customer
  • Get Personalization Information
  • Get Shippers Statuses

Eventually, each of those calls will need to access a database. There are several ways to handle this issue. The simplest one is to have each call create its own connection, like this:

using(IDbConnection connection = DataBase.CreateConnection())
{
  //do work
}

This is simple, but it moves the responsibility for the data access to each of the classes involved. Even assuming that the cost of creating and disposing the connections is not important, because we have connection pooling turned on, there is still overhead associated with them. What bothers me even more than the performance issue is spreading the responsibility around like this.

Let us try something different, and pass the connection from outside, like this:

void OnPageLoad(object sender, EventArgs e)
{
  using(IDbConnection connection = DataBase.CreateConnection())
  {
    connection.Open();
    OrdersCollection orders = new OrderRepository(connection).GetRecentOrders(...);
    Customer cust = new CustomerRepository(connection).GetCurrentCustomer(...);
    // etc, etc 
  }
  // do something useful 
}

This gets rid of the issue of creating and disposing the connection, but it is still painful. I need to pass the connection explicitly, and now my UI layer knows about such things as databases. That in itself isn't really bad, but the code above is UI code that is managing database connections.

This is definitely not the responsibility of any UI layer that I have heard of.

Keeping those issues in mind, let us take a look at another scenario. I need to validate the orders, so I can display their status. For that, I have a set of business rules that run on each order and check it for consistency. At the beginning, I used this approach:

ValidationResult result = new Validator().ValidateOrder(order);

Validator will run each of the separate business rules, and aggregate their results:

public ValidationResult ValidateOrder(Order order)
{
  ValidationResult result = new ValidationResult();
  foreach(IBusinessRuleValidator validator in OrderValidators)
  {
    validator.Validate(order, result);
  }
  return result;
}

Very simple, isn't it? Until I need to add a business rule that needs to check the database as well. For example, validate that I have a contract with a supplier on a specific date. Now I need to modify the Validator class so I can pass it a connection, and modify all the business rules as well, just for the sake of a single rule. (This is assuming that the code is mine to change.)
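To make the ripple effect concrete, here is a minimal sketch of what the change might look like. The rule classes, the Errors list, and the Order.Date field are hypothetical names invented for illustration, and the actual database query is left out; the point is only that every rule's signature grows a connection parameter that a single rule uses.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Order
{
    public DateTime Date;
}

public class ValidationResult
{
    public readonly List<string> Errors = new List<string>();
}

public interface IBusinessRuleValidator
{
    // Every rule now receives a connection, whether it needs one or not.
    void Validate(Order order, ValidationResult result, IDbConnection connection);
}

// Hypothetical in-memory rule.
public class OrderHasDateRule : IBusinessRuleValidator
{
    public void Validate(Order order, ValidationResult result, IDbConnection connection)
    {
        // Pure in-memory check; the connection parameter is dead weight here.
        if (order.Date == default(DateTime))
            result.Errors.Add("Order has no date");
    }
}

// Hypothetical rule that is the only reason the connection is passed around.
public class SupplierContractRule : IBusinessRuleValidator
{
    public void Validate(Order order, ValidationResult result, IDbConnection connection)
    {
        // The real query is omitted; this sketch only reports a missing connection.
        if (connection == null)
            result.Errors.Add("Cannot check supplier contract without a connection");
    }
}

public class Validator
{
    private readonly IDbConnection connection;
    private readonly List<IBusinessRuleValidator> orderValidators =
        new List<IBusinessRuleValidator> { new OrderHasDateRule(), new SupplierContractRule() };

    public Validator(IDbConnection connection)
    {
        this.connection = connection;
    }

    public ValidationResult ValidateOrder(Order order)
    {
        ValidationResult result = new ValidationResult();
        foreach (IBusinessRuleValidator validator in orderValidators)
        {
            validator.Validate(order, result, connection);
        }
        return result;
    }
}
```

The caller also changes, from new Validator() to new Validator(connection), so the connection has to be available wherever validation happens.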

I'm going to leave it at this point, and post my preferred way to handle these types of issues.

Ten points for the first guy/gal that can post a solution that contains "service" in its description and actually makes sense :-)

time to read 1 min | 117 words

Note: You can probably ignore this message, it is here for a specific person.

To my reader from *.af.mil, I'm flattered that you link to me, but this address is not on the internet, and it is driving me crazy :-)

I don't know who you are or how to contact you, but I'm assuming that this is an internal army blog, and I'm pretty sure that it shouldn't go out and ping me from a secured network. If you can't turn pinging off, can you at least send me the posts? I feel like someone is whispering just outside my hearing.

Thanks.

time to read 4 min | 614 words

Moran has pointed me to the CROSS APPLY syntax in T-SQL 2005, which does allow joining against a table valued function. Using this, the query goes down to this:

SELECT
        EmpName,
        CONVERT(NVARCHAR, CurrentDate,103) Date,
        HasWorked = CASE SUBSTRING(Roster,
                dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)
                        WHEN '_' THEN 0
                        WHEN '*' THEN 1
                        ELSE NULL
            END
FROM Schedules CROSS APPLY DateRange(StartDate,EndDate)

This is a much nicer way to deal with it. Considering that I am using similar techniques all over the place, this is a really good thing to know.

time to read 1 min | 153 words

Okay, I managed to hold out for over a week without releasing it, but the code started snarling at me when I worked with it, so I suppose I had better set it free.

This release contains multiple fixes for ordering, including some really bizarre edge cases. In addition to that, Eric Nicholson was kind enough to send a patch that fixes an issue with mocking classes that have finalizers.

Just to make things interesting, you can find only the binaries here. I decided to keep the source for myself at the moment; network connectivity issues to the subversion repository had nothing to do with this decision.

Happy mocking,

  Ayende.

Update: Okay, I broke down and updated the source repository as well. That was 5 minutes of Rhino Mocks being "closed source".

time to read 55 min | 10901 words

I'm starting to get quite a bit of mail from this blog. Some of those questions are about subjects I can answer immediately, some require a fair amount of work (which can be had, if you really want), and the more interesting ones are those that require some thinking, but do not require too much time. This question from Dave is the best one so far, and I got his permission to blog about it, so I'm doubly happy.

The issue is working against a legacy database to get the data for further processing. I'll let Dave explain the issue, since he does it much better:

I have to write a query to generate a report over some interesting data.  It's basically scheduling which days people are working.  The data looks like this:

Id  EmpName  StartDate   EndDate     Roster
1   Bob      12/06/2006  18/06/2006  _*___**
2   Mary     12/06/2006  18/06/2006  *_*__*_

The trick is, the roster field contains a string with a _ or * depending on whether the person is scheduled to work that day or not, but the first character always starts on the Sunday.  The startdate and enddate can be any day of the week.
In the example above, the 12-jun is a monday, so monday corresponds to the second character in the roster string, so Bob's working and Mary's not.
The roster string wraps around, so the first character of the roster string actually corresponds with the enddate here!  Now, this roster string could be 7, 10, 14 days long. 

I could get the report out if I can write a query to get it to this:

 

Employee  DateWorking
Bob       12/06/2006
Bob       16/06/2006
Mary      13/06/2006
Mary      16/06/2006

 

By the way, I haven't asked, but I'll bet that this schema originated on a mainframe, if not currently, then in its recent past.

First I created the schema I needed:

 

CREATE TABLE Schedules
(
        Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        EmpName NVARCHAR(255) NOT NULL,
        StartDate DATETIME NOT NULL,
        EndDate DATETIME NOT NULL,
        Roster NVARCHAR(50) NOT NULL
);
GO

INSERT INTO Schedules
SELECT 'Bob','12-Jun-06','18-Jun-06','_*___**'
UNION ALL
SELECT 'Mary','12-Jun-06','18-Jun-06','*_*__*_'
GO

 

Then, I started playing with DatePart(), getting the day of the week of StartDate from each row. This gave me the index I needed into the Roster column. But this only told me whether the employee worked or didn't work on the start date, which isn't very helpful. What I needed was a way to check all the dates between StartDate and EndDate.

I posted about this issue a while ago, and I made use of that technique here:

 

CREATE FUNCTION DateRange ( @start datetime, @end datetime )
RETURNS @DateRange TABLE ( CurrentDate datetime )
AS
BEGIN
      WHILE (@start <= @end)
      BEGIN
            INSERT INTO @DateRange(CurrentDate) VALUES(@start)
            SELECT @start = DATEADD(day,1,@start)
      END
      RETURN
END
GO

Conceptually, what I wanted was this:

SELECT
      IndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)
FROM Schedules, DateRange(StartDate,EndDate)

Unfortunately, DateRange() is a table valued function, and what this query asks of SQL Server is to join each row in the Schedules table to a different table, one built from that row's own columns. This is not possible with a plain join, of course.

I settled on faking it using this approach:

WITH AllDatesInTable(CurrentDate) AS
(
        SELECT CurrentDate FROM dbo.DateRange(
                (SELECT MIN(StartDate) FROM Schedules),
                (SELECT MAX(EndDate) FROM Schedules) )
)
SELECT
      TestIndexInRoster = DatePart(dw,StartDate) + DateDiff(day,StartDate, CurrentDate)
FROM Schedules JOIN AllDatesInTable
ON CurrentDate BETWEEN StartDate AND EndDate

This query uses a Common Table Expression to define a table that has all the dates covered by the Schedules table, from the earliest StartDate to the latest EndDate. Notice that the join condition constrains each row to the dates in its own range. In essence, this gives me one row per date in the date range of each schedule row. This is the basis of solving this problem.

The other issue is the wrapping of the day index in the roster. This is a bit complicated because we need to take three things into account: SQL Server string handling is 1-based, not 0-based (argh!), we are shifting based on the day of the week of the start date, and we need to wrap around correctly. In order to handle this issue I created this function:

CREATE FUNCTION IndexInRoster(@StartDate DATETIME, @CurrentDate DATETIME, @RosterLen INT)
RETURNS INT AS
BEGIN
      DECLARE @Result int
      SET @Result = (DATEDIFF(day,@StartDate,@CurrentDate) + DATEPART(dw,@StartDate)) % (@RosterLen)
      IF @Result = 0
            RETURN @RosterLen

      RETURN @Result
END
GO

The check for @Result being 0 is there because SQL Server uses 1-based string handling: a remainder of 0 has to map to the last character of the roster.
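The arithmetic is easier to follow outside of SQL. The following is a rough C# port of the same calculation, not part of the original solution. The class and method names are made up for this sketch, and it assumes SQL Server's default DATEFIRST setting, where DATEPART(dw, ...) returns 1 for Sunday (which is (int)DayOfWeek + 1 in .NET). With a start date of 12-Jun-2006 (a Monday, so dw = 2) and a 7 character roster, 12-Jun gives (0 + 2) % 7 = 2, and 17-Jun gives (5 + 2) % 7 = 0, which is mapped to 7.

```csharp
using System;

public static class RosterMath
{
    // Hypothetical port of the T-SQL IndexInRoster function.
    // Assumes DATEPART(dw, ...) with Sunday = 1 (the default DATEFIRST setting).
    public static int IndexInRoster(DateTime startDate, DateTime currentDate, int rosterLen)
    {
        int daysSinceStart = (currentDate.Date - startDate.Date).Days;
        int startDayOfWeek = (int)startDate.DayOfWeek + 1; // Sunday = 1 ... Saturday = 7
        int result = (daysSinceStart + startDayOfWeek) % rosterLen;
        // A remainder of 0 means the last character of the roster (1-based index).
        return result == 0 ? rosterLen : result;
    }

    // Returns true for '*', false for '_', and null for anything else.
    public static bool? HasWorked(string roster, DateTime startDate, DateTime currentDate)
    {
        int index = IndexInRoster(startDate, currentDate, roster.Length);
        char mark = roster[index - 1]; // .NET strings are 0-based
        if (mark == '*') return true;
        if (mark == '_') return false;
        return null;
    }

    public static void Main()
    {
        DateTime start = new DateTime(2006, 6, 12); // a Monday
        for (int offset = 0; offset < 7; offset++)
        {
            DateTime day = start.AddDays(offset);
            Console.WriteLine("{0:yyyy-MM-dd} index={1} Bob worked={2}",
                day, IndexInRoster(start, day, 7), HasWorked("_*___**", start, day));
        }
    }
}
```

Keeping the 1-based result in IndexInRoster and subtracting one only at the string lookup keeps the port directly comparable to the SQL function.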

Bringing it all together, we get this:

WITH AllDatesInSchedulesTable(CurrentDate) AS
(
        SELECT CurrentDate FROM dbo.DateRange(
                (SELECT MIN(StartDate) FROM Schedules),
                (SELECT MAX(EndDate) FROM Schedules) )
)
SELECT
        EmpName,
        CONVERT(NVARCHAR, CurrentDate,103) Date,
        HasWorked = CASE SUBSTRING(Roster,
                dbo.IndexInRoster(StartDate,CurrentDate,LEN(Roster)), 1)
                        WHEN '_' THEN 0
                        WHEN '*' THEN 1
                        ELSE NULL
            END
FROM Schedules JOIN AllDatesInSchedulesTable
ON CurrentDate BETWEEN StartDate AND EndDate

And the result of this query:

EmpName  Date        HasWorked
Bob      12/06/2006  1
Mary     12/06/2006  0
Bob      13/06/2006  0
Mary     13/06/2006  1
Bob      14/06/2006  0
Mary     14/06/2006  0
Bob      15/06/2006  0
Mary     15/06/2006  0
Bob      16/06/2006  1
Mary     16/06/2006  1
Bob      17/06/2006  1
Mary     17/06/2006  0
Bob      18/06/2006  0
Mary     18/06/2006  1

And from here it is trivial to get to whatever format you want.

time to read 1 min | 188 words

Oren Ellenbogen has posted about generic constraints in the case where you have both an interface and a common base class. I use the same Interface + Default base class pattern often enough in my own code.

The issue I have with this is whether the interface is even needed. I often find myself extending only from the base class and never directly from the interface.

Take for instance the MethodRecorderBase and IMethodRecorder from Rhino Mocks. Everything works against IMethodRecorder, but all the concrete classes are descendants of MethodRecorderBase. In this case, whenever I need to add a method to the Method Recorders family, I need to update both the interface and the base class (and maybe the subclasses).
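A stripped-down sketch of the shape I am talking about follows. The names here are invented for illustration and are not the actual Rhino Mocks types or members; the comments mark the two places that have to change when a single member (Count, in this case) is added.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names; this is not the real Rhino Mocks API.
public interface IRecorder
{
    void Record(string call);
    int Count { get; }   // adding this member means touching the interface...
}

public abstract class RecorderBase : IRecorder
{
    protected readonly List<string> Calls = new List<string>();

    public virtual void Record(string call)
    {
        Calls.Add(call);
    }

    // ...and the base class, even though nothing implements IRecorder directly.
    public int Count
    {
        get { return Calls.Count; }
    }
}

// Every concrete class derives from the base class, never from the interface alone.
public sealed class OrderedRecorder : RecorderBase
{
}

public static class Demo
{
    public static void Main()
    {
        IRecorder recorder = new OrderedRecorder();
        recorder.Record("Foo()");
        Console.WriteLine(recorder.Count);
    }
}
```

If no class ever implements the interface without going through the base class, the interface declaration is the part that is being repeated.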

Just to be clear, I'm not saying that the general principle is wrong. It is definitely a Good Thing in certain cases (ICollection & CollectionBase come to mind), but I wonder about its usage when you start with new code.

Isn't this a violation of both DRY and YAGNI?

time to read 1 min | 156 words

Yesterday I was at the C# User Group meeting. The lecturer was Dan Amiga, and the subject was ASP.Net Hard Core.

I came out of the meeting completely satisfied. Dan has gone over quite a lot of subjects, some of them I was already using, but he went deeper into the stuff. And many of those things I wasn't even aware of. Very good stuff, overall, and I really enjoyed it.

The nicest tidbit in the entire lecture was how to package a User Control as a DLL. It is a bit of a hack, but it works. Compile the web site with aspnet_compiler using fixed names and use the resulting DLL.

 

The part about Vendor Lock In was when we tried to disperse. The elevator absolutely refused to take us all down, so we had to go down in very small groups, so the elevators wouldn't panic and decide that this was some mass exodus leaving Microsoft :-)

time to read 12 min | 2240 words

My last post about static variables drew a few comments, so I think that it needs more clarification. The first thing is to understand what I am talking about when I'm thinking about thread safety.

  • Thread safety issues when you initialize something (which I actually ran into today :-) ), where two threads try to access a lazy resource and end up initializing it twice. This is relatively easy to solve using a static constructor or static initializers. They are guaranteed to run before anybody accesses the field for the first time, and they are guaranteed to run once and only once.
    public static readonly IDbConnection Connection = CreateAndOpenConnection(); 

    This is not the kind of thread safety I am talking about here.
  • Thread safety when you use a variable from multiple threads. This is usually what worries me. It worries me because I often work with multiple threads, and because I usually work on the web, where thread affinity is not something I can rely on. Check out the sample code below.

First, we define the database class:

public class DataBase
{
       public static SqlConnection Connection = CreateAndOpenConnection();

       private static SqlConnection CreateAndOpenConnection()
       {
              SqlConnection sqlConnection = new SqlConnection("Data Source=localhost;Initial Catalog=test;Integrated Security=True");
              sqlConnection.Open();
              return sqlConnection;
       }
}

Notice that the SqlConnection variable is public static, and in its documentation there is a comment saying that public static variables of this type are safe for multi threading. The documentation is even correct in this instance. It is just talking about accessing the variable itself, not working with it.

Update: There is something wrong with my reading comprehension. Sergey posted a comment about the above sentence. I read it wrong. It is not about public static members of this type that I define. It is about the public static members that this type itself has. Damn! Sorry about the mistake.

Here is a simple method that works with this object. Notice that there is a slight delay to simulate heavier load:

private static void GetData()
{
       using (IDbCommand command = DataBase.Connection.CreateCommand())
       {
              command.CommandText = "SELECT * FROM Persons";
              using(IDataReader reader = command.ExecuteReader())
              {
                     Thread.Sleep(500);
                     while(reader.Read())
                     {
                           Console.WriteLine(reader.GetValue(0));
                     }
              }
       }
}

Pretty standard stuff so far, isn't it? Now, let us look at our Main method:

static void Main(string[] args)
{
       new Thread(GetData).Start();
       new Thread(GetData).Start();
       Console.ReadKey();
}

We are accessing the GetData() method from two different threads, and both threads end up using the same instance of SqlConnection.

Can you guess what the result will be?

In my tests, it consistently throws one of these exceptions:

  • Invalid attempt to Read when reader is closed.
  • There is already an open DataReader associated with this Command which must be closed first.

We have got ourselves a threading problem...

Clarification: The issue that I am raising here is a basic one, I know. I'm trying to reach some place and I want to go all the way, so I won't lose anything along the way. It should be clearer later. (Oh, and the static connection property is evil. It is just the simplest example I could think of.)

Static Issues

time to read 2 min | 349 words

The easiest way to get access to something is to make it static. This way, you can get to it from everywhere in the application, without having to instantiate it in every place you need it.

Consider this piece of code, which saves a new customer to the database:

using(IDbCommand command = DataBase.Connection.CreateCommand())
{
    command.CommandText = "INSERT INTO Customers VALUES('New','Customer')";
    command.ExecuteNonQuery();
}

Here is how we define the DataBase class:

public class DataBase
{
    static IDbConnection connection;

    public static IDbConnection Connection
    {
        get
        {
            if (connection == null)
            {
                connection = CreateAndOpenConnection();
            }
            return connection;
        }
    }
}

Even ignoring the pitfall of leaving connections open for a long time, can you see the problem?

What happens if we are using this on the web, and two users are trying to view a page at the same time? One of them is likely to get the "data reader is already open" error. The same issue appears if you want to work multi threaded.

This is a major issue when using static, and it is usually the reason for ruling them out. By the way, I'm using static as the word here, but everything I say here can be said about Singleton as well.

I'll post later how to handle this issue safely in multi threaded (and web) scenarios while maintaining the ease of use that static gives us.

 
