Ayende @ Rahien

Refunds available at head office

Git commits as code review?

I just had to go through a code base where I had a bunch of comments.

Instead of going the usual route of just noting the changes that I think should be made, I decided to do something else. I made each change myself and committed them individually.

This is what it looks like. Each commit is usually less than a single screen of changes (in diff mode).

image

I wonder if it is something that I can apply more generally.

UI Mockups

I have spoken in the past about the importance of UI mockups. I consider them an essential part of software design. It is often much easier to gather requirements when you are talking in concrete terms about what the software looks like.

It is important to note that I do not care about the look and feel of the mockup. In fact, I don’t want it to look good; it should have a drafty look that makes it clear that this is just a mockup. It is a communication tool as much as anything else. It just allows me to talk about things in concrete terms.

A while ago I found Balsamiq, and I was happy. Balsamiq does just about everything that I want from a UI mockup tool.

Except, it doesn’t do Hebrew :-(.

That prompted me to start looking at several other UI mockup tools, to see if there are any that support Hebrew and that I can tolerate.

I tried a few others, and got varying support for Hebrew. The best ones from a Hebrew support perspective were Mockup Screens and Blend’s SketchFlow.

My test scenario was this UI (in Hebrew):

image

This is the Balsamiq mockup, which took about 3 minutes to build, the first time I used Balsamiq.

I could get it in Mockup Screens after about ten minutes of playing with the options to get it working. Mockup Screens is nice, but it is giving me too many knobs to turn. I felt it especially when I tried to do the table. I had to specify columns & rows, and specifying the data was done in a separate dialog.

It works, but it is awkward.

The next one I tried is Blend’s SketchFlow. I heard a lot of excitement about it, so I gave it a spin. A Silverlight 3 SketchFlow project simply does not support Hebrew. A WPF SketchFlow project does, and that exposed me to a very nasty surprise. I was expecting a way to easily build sketches of the screens that I was interested in.

Basically, something similar to what Balsamiq gives me. What it turns out is that there are a few simple controls (check box, drop down, etc) and everything else that you want you need to actually sketch. That is, sit down and draw it.

The problem is that this takes an inordinate amount of time, especially if you are actually trying to do something like the screen above. I just want a stupid table with some data, but this was way too hard to do in SketchFlow. If I am reduced to literally drawing on the screen, I might as well draw it on paper. It is about as easy to manipulate, at that stage.

In the end, I think that what I’ll end up doing is using Excel or Word as the mockup tool. They both have good Hebrew support, and they can do things like tables, drop downs and checkboxes easily, which is pretty much all I need. It also makes it perfectly clear that this isn’t something executable.

And if you don’t care for Hebrew support, go and use Balsamiq, it is awesome.

Update:

Matt Kellogg has informed me that you can use Balsamiq with Hebrew support, all you need to do is set Use System Fonts:

image

And once that is done, it took only a few moments to handle this:

image

Now, it does suffer from all the usual RTL problems, but it is working, and it is still by far the best one out there.

What is up with the Entity Framework vNext?

Every now and then I do a quick check on the EF blog, just to see what their status is. My latest peek made me gulp. I am not sure where the EF is going with things, I just know that I don’t really like it.

For a start, take a look at the following sample from their Code Only mapping (basically Fluent NHibernate):

.Case<Employee>(
    e => new {
        manager = e.Manager.Id,
        thisIsADiscriminator = "E"
    }
)

There are several things wrong here. First, “manager” and “thisIsADiscriminator” are strings for all intents and purposes. The compiler isn’t going to check them; they aren’t there to do something, they are just there to avoid being a literal string. But they are strings.

Worse, “thisIsADiscriminator” is a magic string.
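
To make that point concrete, here is a minimal sketch of how a mapping API could read those names. It is not the EF implementation, which the sample above does not show; AnonymousNames.Of is a hypothetical helper, and the values are made up for illustration. The names only become visible through reflection, at runtime, as plain strings.

```csharp
using System;
using System.Linq;

public static class AnonymousNames
{
    // Hypothetical helper: lists the public property names of any object via reflection.
    public static string[] Of(object projection)
    {
        return projection.GetType()
            .GetProperties()
            .Select(p => p.Name)
            .ToArray();
    }
}

public static class AnonymousNamesDemo
{
    public static string[] Run()
    {
        // The compiler accepts any identifier here. Nothing ties
        // "thisIsADiscriminator" to a real column or class member.
        var projection = new { manager = 42, thisIsADiscriminator = "E" };
        return AnonymousNames.Of(projection);
    }
}
```

Rename either member and this still compiles. Only whatever consumes the names at runtime can notice the difference.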

Second, and far more troubling, I am looking at this class definition and I cringe:

public class Category{
    public int ID {get;set;}
    public string Name {get;set;}
    public List<Product> Products {get;set;}
}

The problem is quite simple: this class has no hooks for lazy loading. You have to load everything here in one go. Worse, you probably need to load the entire graph in one go. That scares me on many levels.

I am not sure how the EF is going to handle it, but short of IL Rewrite techniques, which I don’t think the EF is currently using, this is a performance nightmare waiting to happen.
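
For comparison, here is a rough sketch of the kind of hook that proxy-based lazy loading relies on. It assumes a virtual collection property and a subclass that intercepts access to it. LazyCategory and its loader delegate are hypothetical stand-ins written by hand for what a proxy generator would emit; this is not how the EF (or any specific OR/M) implements it.

```csharp
using System;
using System.Collections.Generic;

public class Product
{
    public string Name { get; set; }
}

// Same shape as the class above, except that Products is virtual and typed
// as an interface, which gives a subclass a place to intercept access.
public class Category
{
    public int ID { get; set; }
    public string Name { get; set; }
    public virtual IList<Product> Products { get; set; }
}

// Hand-written stand-in for the subclass a proxy generator would emit.
public class LazyCategory : Category
{
    private readonly Func<IList<Product>> loadProducts; // hypothetical loader
    private IList<Product> loaded;

    // Counts how many times the loader actually ran.
    public int LoadCount { get; private set; }

    public LazyCategory(Func<IList<Product>> loadProducts)
    {
        this.loadProducts = loadProducts;
    }

    public override IList<Product> Products
    {
        get
        {
            if (loaded == null)
            {
                LoadCount++;             // the query would run here, on first access
                loaded = loadProducts();
            }
            return loaded;
        }
        set { loaded = value; }
    }
}
```

With a non-virtual property of type List<Product>, there is nowhere to put the override, which is why the only options left are loading eagerly or rewriting the IL.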

Disposable servers

I have been using Amazon EC2 and GoGrid recently, and I found myself spinning up a server to do a specific task (converting a Subversion repository to a Git repository, copying a database, trying out VS 2010, etc).

Usually I do it because the cloud machines have a lot more bandwidth than I have locally, and I can also choose a specific OS + services that match what I care for. I am not used to thinking about servers in this fashion, but I find myself doing it fairly often recently.

Interesting shift.

Review: GoGrid vs. Amazon EC2

For the last year or so, I have been running my servers on GoGrid. The main reason for doing that is that I wanted to run on a Windows 2008 server, and there was no good UI for managing EC2 at the time. I have been playing around with EC2, trying to see if it fits my needs better. Well, in reality, I was trying to see if locating a server instance in Europe would give me better latency when I need to administer the servers (it does).

Nitpicker corner: The following is a review of both providers for my scenario. I know that you’ll ignore that, but do try paying attention to the fact that I am talking about my scenario, and not yours.

image

After playing around with EC2 for several days, I came to the following conclusions. GoGrid & EC2 only look similar. They are actually two different offerings targeting different types of scenarios.

EC2 is there to handle clouds, period. Server instances are meaningless, except maybe by their role. They come & go as they please and they are utterly disposable. That makes it great if what you want to do is have a cloud. It is very hard if all you want is to run server instances.

For my purposes, I don’t want to have a cloud of indistinguishable machines, I want to be able to run a lot of different sites on different machines. Maybe I want to have a set of machines for a single site, but I still want to be able to clearly and easily separate the machines that run nhprof.com and the ones that run ayende.com.

That made working with EC2 really uncomfortable, to tell you the truth. They don’t support anything like naming an instance, which makes sense, from their point of view. You should not get attached to an instance, you should get attached to an image of that instance.

That doesn’t work quite that well for my case, however. I was willing to accept this limitation, but I ran into a few others that were deal breakers for me.

  • Amazon EC2 doesn’t support Windows 2008. This is really annoying, both because there are some features that I could have used and because Windows 2003 (their only Windows offering) does not support RDP saved passwords and doesn’t support copy/paste in the login screen. Both of these make for a horrible login experience with the crazy passwords Amazon assigns.
  • No good way to recover an instance. I set up an instance, told it to install security updates, and rebooted it. It didn’t come up. I am not sure what I am supposed to do in this case. You have to enter a support contract to recover this, as far as I could tell (and I wasn’t willing to do so just for the trial). It seems that the response would have been: ‘just recreate the instance’.
  • Long instance provisioning time. Spinning up a new instance in EC2 seems to take about 15 – 30 minutes, which is annoyingly long. Yes, I know about reserved instances, not applicable for my scenario. (GoGrid feels faster, but I don’t have any data to say whether it is.)

GoGrid, on the other hand, takes a drastically different approach.

image

Basically, they give you a server farm, you can create instances of machines, name them and manage them as individuals. They do support Windows 2008, and their support is phenomenal.

I have been running nhprof.com there for a long while, and overall, I love what they are doing, their support level and the experience.

Their main disadvantage compared to EC2 was that they didn’t support cloning an image. They fixed that a while ago, which makes me very happy.

This subtle difference, from focusing on a set of instances to focusing on an instance, is a huge benefit for my scenario, because it allows me to manage the different parts of my infrastructure directly, instead of indirectly or via brute force.

One thing to note about both of them. They both consider instances to be disposable. If you have problems with a server, you are probably better off just creating a new instance and setting that up.

GoGrid is better than EC2 here, because they will try to salvage a dead instance, but you should have a backup and the means to restore it on a new machine ready. In the end, it is much easier than the alternative.

I am using a backup service (Mozy, if you care) for all my servers, and that takes care of that.

Note that my scenario is that I care about running my existing applications on a server in the cloud, not running in the cloud. That is why I didn’t even consider something like Azure or AppEngine. They don’t matter for this scenario.

If I was building a new application that required scaling, it would probably be a different sort of decision matrix, with potentially a different result.

From a pricing perspective, they seem comparable. GoGrid says that they are cheaper, but they would :-)

NHProf in the real world: NHibernate Search integration

I had a problem in one of my applications, a Lucene search was returning more results than I wanted it to, and I wasn’t sure what was going on.

image

I pre-marked the problematic line, but at the time, I had no idea why it was returning all those extra results that had nothing to do with my queries. But since NH Prof gave me the full Lucene query text, I could take this and plug it into Luke.

image

Luke is the preferred tool for working with Lucene indexes, and running the query allowed me to play around and figure out what my problem was:

image

I was specifying that everything that is active MUST be returned, and everything that matches the query should be returned. Obviously, anything that is active (regardless of the rest of the query), must be returned.

In effect, this is similar to this SQL query:

select * from Entries
where IsActive = 1 or ( ... )

Once I figured that one out, it was pretty easy to fix the query to use MUST_NOT IsActive == False. But it would have been much harder without NH Prof.
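
The matching rules at play can be sketched in a few lines. This is a simplified in-memory model of how MUST, SHOULD and MUST_NOT clauses combine, not the Lucene API; the Entry, Clause and BooleanQuerySketch types are made up for illustration, and scoring is ignored entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Occur { Must, Should, MustNot }

public class Entry
{
    public string Title { get; set; }
    public bool IsActive { get; set; }
}

public class Clause
{
    public Occur Occur { get; set; }
    public Func<Entry, bool> Matches { get; set; }
}

public static class BooleanQuerySketch
{
    // Simplified matching rules of a boolean query:
    //  - no MustNot clause may match
    //  - every Must clause has to match
    //  - Should clauses are required only when there is no Must clause
    public static bool IsMatch(Entry entry, IList<Clause> clauses)
    {
        var must = clauses.Where(c => c.Occur == Occur.Must).ToList();
        var should = clauses.Where(c => c.Occur == Occur.Should).ToList();
        var mustNot = clauses.Where(c => c.Occur == Occur.MustNot).ToList();

        if (mustNot.Any(c => c.Matches(entry))) return false;
        if (must.Any(c => !c.Matches(entry))) return false;

        if (must.Count == 0 && should.Count > 0)
            return should.Any(c => c.Matches(entry));

        // With Must clauses present, Should clauses are optional.
        // A query made only of MustNot clauses matches nothing.
        return must.Count > 0;
    }
}
```

With a Must clause on IsActive and a Should clause on the title, every active entry matches, whatever its title. Swapping the Must clause for a MustNot clause on "not active" makes the title clause mandatory again, which is the fix described above.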

The problem with running on virtual hardware…

I tested this in both GoGrid & EC2, and got similar results:

image

The disk I/O is really bad. Just to give you an idea about the difference, here is the same process on my local machine (1TB 7,200 RPM WD HD):

image

Git renames

One of the things that I like about Git is that I don’t have to think about operations that I make in my source. For example, I am working on the refactoring from NH Prof to ÜberProf, and I wanted to change the directories & project files. So I just went to explorer and renamed them.

Then I had to fix some namespace references in the project file. It looks like this:

image

Notice that it captures both the rename and the content change?

You can also see how it looks in the log file:

image

Trying to do stuff like that with SVN is just a PITA. With Git, I didn’t have to think about it.

What is your exit strategy?

My previous post relating to the business side of software seems to have met with positive reception, so here is another one. Let me know if you like it, or if you are interested in tech only content.

My father told me the secret for getting rich when I was just a little boy. It goes like this:

  • Buy cheap
  • Sell high

Unfortunately, I think that the .COM bubble made it clear that two yuppies in the living room don’t really add value that would make people buy from them.

So another strategy is needed.

One of the nicest pastimes is daydreaming.

Here are a couple of nice dreams of getting rich quick:

If I make this widget, I can sell it for 50$. It is going to cost me 15,000$ to write it, but then every sale is practically free. I only need 300 sales to start earning money! If I sell just 1,000 of them, I am going to earn 35,000$, and I can keep selling it forever.

It is like Manna from Heaven!

Or this one, which should be more familiar to anyone who isn’t in software:

I can go to the bank and get a mortgage on this house, they will cover 90% of the cost, so I only need to bring 15,000$ to get it. I pay 15,000$ and I have a 150,000$ asset in my hands. Mortgage payments are only 1,000$ a month for 15 years.

I can rent it for 1,500$, so the house covers the mortgage and I get 500$ a month for doing nothing. In 2.5 years, I get my original 15,000$ back, I still have the house, which is going to be worth 175,000$. I can sell it then and make 35,000$ profit!

It is like Manna from Heaven!

Now, I think that you can agree with me that those are pretty common thought patterns for people. They are also true. Think about it, wouldn’t you want to make 200% – 300% profit in a short amount of time?

Both schemes are valid ways to do so. Well, sort of.

The problem with the two schemes outlined above is that they have (in technical terms), absolutely no error handling. In fact, people paint the picture in terms so rosy that I suffer from pink overload.

Generally speaking, there are few things that hold true in business as well as this statement:

The higher the return on investment, the higher the risk.

With the real estate example, let us say that you have a month in which you have no renters, can you pay the mortgage? Out of your own pocket, that is. If not, you are likely to start spiraling down. And if you can pay the mortgage, can you pay city tax? What happens when you have a renter that is there, but doesn’t pay? You have legal costs to evict them.

And that is completely ignoring something like the current recession.

Just to give you some idea, you bought the house, rented it, and on the third month, the renter stopped paying rent and refused to leave. It is going to take 30 days to evict them, and another 2 weeks to get a new renter. Your out of pocket expenses are going to be 2,000$ for the mortgage (for the two months it took to resolve things), another 1,500$ for the legal fees and probably at least two bottles of antacid for your ulcer.

Oh, you might be able to recover some of the costs if you go after the renters (add more legal fees), but that assumes that they are able to pay, and the money is only going to show up sometime in the future (if at all). In the meantime, you had better be able to cover the unexpected 3,500$ expense that just dropped in your lap.

With software, let us say that you quit your job to work on your widget. You spent a lot of time & effort on that, then you start selling it. In the first three months, 50 people buy it. Your underlying assumption about the number of buyers was overly optimistic. You are out of a job, out 12,500$ and feeling cheated.

Where is the Manna from Heaven?!

One of the things that I like about being in the army is that it taught me to plan. I can write a pretty good op plan, and that is close enough to a business plan. Most op plans have standard sections:

  • Goal
  • Mission
  • Our forces
  • Enemy forces
  • Obstacles
  • Dangers
  • Contingency plans
  • Abort

Whenever you are going to plan something big, you have to sit down and plan for the things that are going to bite you in the ass. I generally try to divide things into two categories. Contingency plans are for when the situation is recoverable. Abort is for when it isn’t, and I want to get out with as little damage as possible.

With software, that means that you have things like user studies, betas, etc. With real estate, I am not an expert so I am not going to comment. I would say that not getting in over your head is a good plan no matter what.

Final thought: figure out what your exit strategy is before you go in.

Figure out the cost of that exit strategy along with the cost of entering the game in the first place. That is the amount of money that you are putting on the table. And another final word of advice, that money is at risk, and the higher the return, the higher the risk. If you want to sleep well at night, make sure that you aren’t risking the money that you need to buy bread.

The law of unintended consequences

A while ago I got a tattoo on my forearm (did I mention that DevTeach is wild?). Here is what it looks like:

image

To preempt the nitpickers, this was my logo first, I reused it for other stuff afterward (including Rhino Mocks & NH Prof), but it is mine. I did not tattoo NH Prof’s logo.

So a while ago I was buying something in the supermarket and the clerk asked me why I had this tattoo. I told her that I like rhinos, and she said, but it is not a rhino.

Took me a while to realize that from her perspective, it looked like this:

image

Do you see it? You may need to tilt your head a bit to see it.

I guess that if need be, I can prove that I am an OSS aficionado now.

How NH Prof integrates with a profiled application

In order to integrate with an application, NH Prof requires that you reference the appender dll and call NHibernateProfiler.Initialize(). For Linq To SQL, the process is similar, but you call LinqToSqlProfiler.Initialize().

This post describes the internal implementation of how the appender works. Well, that is a lie, I am actually using this post as a design document for the restructuring process that I am currently in, to support multiple OR/M backends. The idea is to give some background information so people who want to integrate with the profiler can do so.

There are two sides to integrating an OR/M with NH Prof (by the way, I am taking suggestions about what to call the uber product; so far xProf is the winning candidate, and I am not sure that I like it). The first part is actually integrating with the application; this is the job of the appender. It is responsible for capturing and sending the event stream from the profiled application to NH Prof. It is working under some fairly tight constraints:

  • Performance is a huge consideration, the overhead of the profiler on the profiled application side must be kept to a minimum.
  • The profiler may be started & stopped at any time, requiring application level support for reliable communication.
  • In the future, I want to be able to support production profiling, which is a whole can of worms in itself.

The second part of profiler integration is actually doing a lot of work on the profiler side. It means getting the event stream, parsing it, analyzing it, etc. This is actually fairly narrow part of the

Overall, the structure looks something like this:

image

At the top layer of the stack, we have integration with the actual OR/M. This is the point where the OR/M gets to output different

The output that we need is things like:

  • Session / Unit of Work Start & End notifications
  • Executed SQL (including parameters)
    • This should include capturing SQL executed as a result of lazy load operation.
  • Last query duration
  • Last query row count
  • Entity loaded (name + key)
  • Cached queries (for that matter, different types of queries, such as NHibernate Search’ queries)
  • Warning / Error messages about the OR/M usage
  • Statistics about the OR/M usage

The way NH Prof is dealing with this is quite simple, we have the following class:

public static class ProfilerIntegration
{
    public static void PublishProfilerEvent(
        Guid sessionId,
        string loggerName,
        string message
    );

    public static void PublishProfilerWarning(
        Guid sessionId,
        string message
    );

    public static void RegisterStatisticsSource(
      IStatisticsSource statsSource
    );
}

As you can see, the intent is to create an extremely simple integration point. The idea is that I don’t want to spend a lot of time integrating different OR/Ms. The only thing that I need to do is figure out how to get the OR/M to call those methods with the right information.

The format is textual and human readable. On the GUI side, NH Prof will take the information and start doing correlation and analysis. Part of the reason that I need to support different OR/Ms on the profiler side is that each of them has different ways of expressing those events, and I don’t want to try to do a single iota of extra work on the profiled application side if that can possibly be avoided.

Here is an example of how a minimal Linq To SQL implementation is using this infrastructure:

using System;

// This gets hooked into the L2S data context by black magic.
public class LinqToSqlAppender : IDisposable
{
    // One id per data context, used to correlate its events.
    public Guid id = Guid.NewGuid();

    public LinqToSqlAppender()
    {
        ProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.DataContext", "data context opened");
    }

    public void OnSqlExecuted(string sql)
    {
        ProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.SQL", sql);
    }

    public void Dispose()
    {
        ProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.DataContext", "data context closed");
    }
}

With this in place, I have a very short turn around time in the actual profiler implementation.
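
To give a feel for the event stream that comes out of this, here is a self-contained sketch. RecordingProfilerIntegration is a hypothetical in-memory stand-in that collects events in a list instead of sending them to the profiler, and RecordingLinqToSqlAppender is a copy of the appender above pointed at it. The message format is invented for the example and is not what NH Prof puts on the wire.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ProfilerIntegration: records events in memory.
public static class RecordingProfilerIntegration
{
    public static readonly List<string> Events = new List<string>();

    public static void PublishProfilerEvent(Guid sessionId, string loggerName, string message)
    {
        Events.Add(sessionId + " " + loggerName + ": " + message);
    }
}

// Same structure as the appender above, wired to the recording stand-in.
public class RecordingLinqToSqlAppender : IDisposable
{
    private readonly Guid id = Guid.NewGuid();

    public RecordingLinqToSqlAppender()
    {
        RecordingProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.DataContext", "data context opened");
    }

    public void OnSqlExecuted(string sql)
    {
        RecordingProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.SQL", sql);
    }

    public void Dispose()
    {
        RecordingProfilerIntegration.PublishProfilerEvent(id, "LinqToSql.DataContext", "data context closed");
    }
}

public static class AppenderDemo
{
    // Simulates one data context lifetime with a single executed statement.
    public static List<string> Run()
    {
        RecordingProfilerIntegration.Events.Clear();
        using (var appender = new RecordingLinqToSqlAppender())
        {
            appender.OnSqlExecuted("select * from Posts");
        }
        return RecordingProfilerIntegration.Events;
    }
}
```

One data context lifetime produces three textual events that share the same id: opened, the SQL statement, and closed. Correlating them by that id is the profiler side's job.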

Choosing between Active Record, Fluent NHibernate and NHibernate

A forum question:

I have seen videos where you created a domain model using ActiveRecord in real-time... elaborate on when and how you move to using full Hibernate in more detail.

The problem with presentations such as the one described in the question is that they are cheating. Oh, I really do work things on the fly, but it is like asking me to type with my eyes closed (as I am doing right now, just to see if I can).

I can do that, but it doesn’t really say much about my typing skills. The type of models that I create live tend to be very simple ones, things that I have done dozens or hundreds of times. Building a model live on stage looks impressive, especially since I insist on the people in the audience picking the model. But it really isn’t that hard.

That said, the actual question is more interesting, what should you choose? Active Record, Fluent NHibernate or mapping files?

I, personally, like using mapping files, they are simple, uncomplicated and expose everything that NHibernate can do. Active Record is nice if you want to put the persistence definition right there along the class definition. Another major advantage of Active Record is that it is working hard to infer things for you. It makes working with it very easy.

Fluent NHibernate is mapping using C#. I don’t quite see the point there, especially since in some things FN decided to diverge from the NHibernate terminology, but I get why people love it. The part about it that I simply adore is the auto mapping support. That is a great way of getting things started.

But usually, even if I am using Fluent NHibernate or Active Record, I am mostly using them as scaffolding. At some point, I’ll ask them to generate the HBM for me and start working with them.

That is me, however, your mileage might vary.

NH Prof New Feature: NHibernate Search Integration

Well, I was demoing how easy it is to add new features to NH Prof in a user group in JAOO today, and tomorrow I am doing my OR/M += 2 talk. Part of the things that I want to show is NHibernate Search, but the demo doesn’t really work unless I can point to NH Prof and show what is going on there.

I now consider outputting results to the console to be so 2005.

Here is the code that I am using for this post:

using (var s = factory.OpenSession())
{
    var posts = NHibernate.Search.Search.CreateFullTextSession(s)
        .CreateFullTextQuery<Post>("Title:NHibernate User.Name:ayende")
        .List<Post>();

    NHibernate.Search.Search.CreateFullTextSession(s)
        .CreateFullTextQuery<Post>("Title:NHibernate User.Name:ayende")
        .SetMaxResults(10)
        .List<Post>();

    foreach (var post in posts)
    {
        Console.WriteLine(post.Title);
    }
}

I had to make a small modification to NHibernate Search, to output the right information (which means that you can make this work with r1044 or up), but here is the result:

image

Notice that you can get not only the actual Lucene query text, but you also get the query duration and the number of returned results. The next query is NHibernate Search actually hitting the database to load the managed entities, after hitting the Lucene index to perform the actual search.

We can also generate some warnings on top of Lucene! As you can see here, we detect unbounded queries:

image

If you do specify a maximum number of entities to return, we are going to reflect that in the query directly:

image

We can also tell you if your Lucene index is out of date with respect to the database:

image 

Sweet, and it is currently warming up in the oven. Build 488 should have it.

JAOO: OR/M += 2

Just finished doing this presentation. I think it went very well, although I planned to do a 45 minute session + 15 minutes of questions and ended up hitting the session time limit without covering everything that I wanted.

You can get the source code that I have shown in the presentation here: http://github.com/ayende/Advanced.NHibernate

You can find the PDF of the presentation here: http://ayende.com/presentations.aspx

Using a service bus for queries

image

Yep, another forum question. Unfortunately, in this case all I have is the title. Even more unfortunately, I already used the stripper metaphor before.

There are some questions that I am really not sure how to answer, because there are several underlying premises that are flat out wrong in the mere asking of the question.

“Can you design a square wheel carriage?” is a good example of that, and using a service bus for queries is another.

The short answer is that you don’t do that.

The longer answer is that you still don’t do that, but also explains why the question itself is wrong. One of the things that goes along with a service bus is the concept of services.

image

Services are autonomous.

Does this ring a bell?

You don’t query a service for its state, because that would violate the autonomy tenet.

But I need the user’s data from the Personalization service to show the home page, I can hear you say. Well, sure, but you don’t perform queries across a service boundary.

Notice the terminology here. You don’t perform queries across a service boundary.

But you can perform queries inside a service boundary. The image on the right shows one such example of that.

We have several services in a single application, they communicate between services using a service bus.

But a service isn’t just something that is running in a server somewhere. The personalization service also has a user interface, business logic that needs to run on the UI, etc.

That isn’t just some other part of the application that is accessing the personalization service operations. It is a part of the personalization service.

And inside a service boundary, there are no limitations on how you get the data you need to perform some operation.

You can perform a query using whatever method you like (a synchronous web service call, hitting the service database, using local state).

Personally, I would either access the data store directly (which usually means querying the service database) or use local state. I spoke about building a system where all queries are handled using local state here.

Texo – My Power Shell Continuous Integration Server

Yes, I wrote my own CI server. I even did it in Power Shell, because that looks cool. You can find the source here. It is currently running in production and is responsible for pushing NH Prof builds out.

Now, what was I thinking when I built my own CI server? Put simply, I had the following goals:

  • Test WPF apps – CC.Net doesn’t allow it, since it is running as a service and that affects the way WPF tests behave. Texo shells out to a different process, so it doesn’t have this limitation. Most other CI servers do the same.
  • Don’t expose passwords – the thing that really killed me with CC.Net was looking at the build log and seeing my password right there in plain text (happens if there is connectivity error to the repository). Yes, I am also surprised this is a feature.
  • Handle Git pushes – Git allows you to push several changes to the repository in a single shot. When I tried to use CC.Net to build NH Prof from git, it only showed the last commit. Texo understands the notion of a push (it takes it from the GitHub API) and can pass that information to the build script.
  • Reactive – Texo doesn’t check the repository, in fact, most of the time it is completely passive (and is likely to be shut down). Whenever a push is made to the repository, GitHub will call Texo’s URL, providing the information about the current push. Texo will take that information and create the builder process, which will update / clone the new repository, and then execute the build command.
  • Small configuration footprint – there are only two types of configuration, the SMTP settings and the project information. Here is the full configuration file:
    <settings>
      <email>
        <smtpServer>smtp.gmail.com</smtpServer>
        <username>*****@gmail.com</username>
        <password>****</password>
        <useSSL>true</useSSL>
        <port>587</port>
        <from>*****@gmail.com</from>
      </email>
      <project
        url="https://github.com/ayende/Texo"
        name="Texo"
        git="C:\Work\Texo"
        ref="refs/heads/master"
        cmd="powershell .\psake.ps1 default.ps1 upload"
        build="3"
        workingDir="C:\Builds\Texo"
        email="ayende@ayende.com" />
    </settings>
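
The reactive flow described in the list above can be sketched roughly as follows. This is not Texo’s actual code. ProjectSettings mirrors a few attributes of the project element in the configuration, while PushNotification and PushDispatcher are hypothetical names, and the notification shape is a simplification rather than the GitHub payload format.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Mirrors a few attributes of the <project> element shown above.
public class ProjectSettings
{
    public string Url { get; set; }
    public string Ref { get; set; }
    public string Cmd { get; set; }
    public string WorkingDir { get; set; }
}

// Hypothetical summary of a push notification (not the GitHub payload format).
public class PushNotification
{
    public string RepositoryUrl { get; set; }
    public string Ref { get; set; }
}

public static class PushDispatcher
{
    // Returns the process to start for this push, or null when no
    // configured project matches the pushed repository and ref.
    public static ProcessStartInfo CreateBuild(PushNotification push, IEnumerable<ProjectSettings> projects)
    {
        var project = projects.FirstOrDefault(p =>
            string.Equals(p.Url, push.RepositoryUrl, StringComparison.OrdinalIgnoreCase) &&
            p.Ref == push.Ref);
        if (project == null) return null;

        // Split "powershell .\psake.ps1 ..." into executable and arguments.
        var parts = project.Cmd.Split(new[] { ' ' }, 2);
        return new ProcessStartInfo
        {
            FileName = parts[0],
            Arguments = parts.Length > 1 ? parts[1] : "",
            WorkingDirectory = project.WorkingDir,
            UseShellExecute = false
        };
    }
}
```

Nothing here polls the repository. The dispatcher only runs when a notification arrives, and handing the result to Process.Start is what shells out to the separate build process.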

What it doesn’t do:

  • UI, reports, tracking, whatever. Texo has one purpose in life, listen to changes and build the software, nothing else. The UI is a very simple email notification process, nothing more.
  • Hung build recovery.
  • Anything but git + github.

Texo is composed of two parts. The first is a web endpoint, written in C#, that plugs into IIS. I assume that the IIS website user is going to be the same one that the tests will run under (makes things much simpler). Once a notification arrives, the endpoint will invoke the second part, the Builder Power Shell script, to perform the actual CI process.

At some point I really have to make up a list of all the projects that I was involved in. I know how I am going to title it: “NIH R US”.

Oh, and can you figure out the naming?

Licensing a commercial product

This is another forum question, this time from Brendan Rice:

A lot of developers are unsure of how best to go about making money from a product, how do you go about implementing licensing, what pay system do you use, how do you accept payment, are there any legal issues...

Well, talk about an open ended question. There are several aspects to the answer: legal, licensing and payment processing. They are somewhat related, though.

On the legal side, you need to understand the basic legal concepts of software engineering: copyright, the idea of licensing software, which rights you care about and which you shouldn’t. I got a lot of my knowledge from simply researching the topic.

You might want to have a lawyer draft your EULA, but there are two major things that you want to remember about the EULA. The first is that some people actually read the bloody things. If you put things there that are too nefarious, people will get pissed at you. There is such a thing as bad publicity. You want to avoid that.

The second important thing about EULA is that if you take someone to court over it, you have already lost. I like to think about EULA as just setting the grounds for what is expected from either side. By all means, get that through your lawyer, but be sure that you know what is in there. And be sure that it is an agreement that you would be willing to sign.

From the licensing perspective, I had a disastrous experience using one licensing component, after which I decided that I might as well write my own. It is a pretty simple system, based on signed XML files: I have the secret private key and the application ships with the public key. It allows me to pass data around in a very simple form while protecting the license files from tampering. The code is available, and it is pretty simple, so I won’t get deeper into it.
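
For readers who have not seen the private key / public key split before, here is a minimal sketch of the idea. It is not the licensing code mentioned above: it signs the license text with a detached RSA signature rather than producing a signed XML file, and the `LicenseSketch` name and the license contents are invented for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class LicenseSketch
{
    // Vendor side: sign the license text with the private key.
    public static byte[] Sign(string licenseXml, RSA privateKey)
    {
        return privateKey.SignData(
            Encoding.UTF8.GetBytes(licenseXml),
            HashAlgorithmName.SHA256,
            RSASignaturePadding.Pkcs1);
    }

    // Application side: only the public key is shipped, so the application
    // can check a license but cannot mint a new one.
    public static bool IsValid(string licenseXml, byte[] signature, RSAParameters publicKey)
    {
        using (var rsa = RSA.Create())
        {
            rsa.ImportParameters(publicKey);
            return rsa.VerifyData(
                Encoding.UTF8.GetBytes(licenseXml),
                signature,
                HashAlgorithmName.SHA256,
                RSASignaturePadding.Pkcs1);
        }
    }

    public static void Main()
    {
        using (var vendorKey = RSA.Create())
        {
            // false = export the public half only.
            RSAParameters publicKey = vendorKey.ExportParameters(false);

            string license = "<license name=\"Jane Doe\" expiration=\"2030-01-01\" />";
            byte[] signature = Sign(license, vendorKey);

            Console.WriteLine(IsValid(license, signature, publicKey));                         // True
            Console.WriteLine(IsValid(license.Replace("2030", "2099"), signature, publicKey)); // False
        }
    }
}
```

Any edit to the license data, such as pushing the expiration date forward, invalidates the signature, which is the tampering protection described above.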

The last part, payment processing, is probably the most interesting bit. I use a payment provider, because trying to manage something like that yourself is a nightmare. My payment provider handles all sorts of payment options, including things that require someone to answer the phone or manually clear mailed checks, etc.

They also provide a nice admin site, where I can do things like generate coupons, like this one: NHP-45K2D46S27 (yes, it is a valid one, at least until someone uses it), refund people, handle taxation, view interesting reports and in general administer all aspects of accepting payments.

They take a commission that isn’t significantly larger than what most credit cards take, and in general they solve so many headaches that I am happy to pay them.

The result of a successful order in the payment provider is an email generated that is sent to a mailbox monitored by a service. That email is read, parsed, and the corresponding license file is then sent to the user.

Nothing really earth shattering in the whole process, yes, I know. But it is probably important to outline that clearly for people who haven’t done it yet. It isn’t complex or hard by any measure.

Linq to Sql Profiler: Spike results

Well, so far, so good.

I started by defining a simple Linq to SQL model, there are zero things that you need to do here to make things work:

image

And now to the actual code using this model:

static void Main()
{
    // The only extra step: initialize the profiler endpoint.
    LinqToSqlProfiler.Initialize();
    using (var db = new BlogModelDataContext())
    {
        var q = from blog in db.Blogs
                where blog.Title == "The Lazy Blog"
                select blog;

        foreach (var blog in q)
        {
            Console.WriteLine(blog.Title);

            // Touching blog.Posts lazy loads the posts, issuing one more query per blog.
            foreach (var post in blog.Posts)
            {
                Console.WriteLine(post.Title);
            }
        }
    }
}

I think that we can agree that this is pretty routine usage of Linq to SQL. The only thing extra that we need is to initialize the profiler endpoint.

The end result is:

image 

We can detect data contexts opening and closing, we can detect queries and their parameters, format them to display properly and show their results. We can even detect queries generated from lazy loading and the stack trace that caused each query (a hugely valuable feature).

Now, before you get too excited, this is a spike. A lot of the code is going to end up in the final bits, but there is a lot more to do yet.

Things that I am not going to be able to do:

  • Track local transactions (I’ll probably be able to track distributed transactions, however)
  • Show query row count
  • Track query duration
  • Show entities loaded by session

I am going to be able to show at least some statistics, however, which is pretty nice, all told.

Thoughts?

JAOO: Evolving the Key/Value Programming Model to a Higher Level

Billy Newport is talking about Redis, showing some of the special APIs that Redis offers.

  • Redis gives us first class List/Set operations, which simplify many tasks involving collections. It is easy to get into big problems afterward.
  • Can do 100,000 operations per second.
  • Redis encourages a column oriented view, you use things like:
R.set("user:123@firstname", "billy")
R.set("user:123@surname", "newport")
R.set("uid:bewport", 123)

Ayende’s comment: I really don’t like that. No transactions or consistency, and this requires lots of remote calls. 

  • Bugs in your code can corrupt the entire data store, causing severe issues in development.
  • There is a sample Twitter-like implementation, and the code is pretty interesting, it is a work-on-write implementation.
  • List/set operations are a problem. What happens when you have a big set? Case in point, Ashton has 4 million followers, work-on-write doesn’t work in this case.
  • 100,000 operations per second doesn’t mean much when a routine scenario results in millions of operations.
  • This is basically the usual SELECT N+1 issue.
  • Async approach is required, processing large operations in chunks.
  • Changing the way we work, instead of getting the data and working on it, send the code to the data store and execute it there (execute near the data).
    • Ayende’s note: That is still dangerous, what happens if you send a piece of code to the data store and it hangs?
  • Usual problems with column oriented stores: no reports, need export tools.
  • Maybe use closures as a way to send the code to the server?
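
The work-on-write problem and the chunked approach from the notes above can be sketched with an in-memory stand-in for the store. Nothing here talks to Redis: `FakeStore`, `FanOutSketch` and the `uid:...@timeline` key layout are invented for the illustration, and the only thing the sketch measures is how many store operations a single post costs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for a key/value store with list support.
// It counts every operation so the cost of a fan-out is visible.
public class FakeStore
{
    private readonly Dictionary<string, List<string>> lists =
        new Dictionary<string, List<string>>();

    public long Operations { get; private set; }

    public void Push(string key, string value)
    {
        Operations++;
        List<string> list;
        if (!lists.TryGetValue(key, out list))
            lists[key] = list = new List<string>();
        list.Add(value);
    }
}

public static class FanOutSketch
{
    // Work-on-write: a post is pushed to the timeline of every follower,
    // so one post costs one store operation per follower.
    // The work is yielded chunk by chunk so a caller could spread it over
    // time instead of doing millions of operations in a single request.
    public static IEnumerable<int> PostInChunks(
        FakeStore store, IReadOnlyList<int> followerIds, string postId, int chunkSize)
    {
        for (int start = 0; start < followerIds.Count; start += chunkSize)
        {
            int end = Math.Min(start + chunkSize, followerIds.Count);
            for (int i = start; i < end; i++)
                store.Push("uid:" + followerIds[i] + "@timeline", postId);
            yield return end - start; // size of the chunk that was just processed
        }
    }

    public static void Main()
    {
        var store = new FakeStore();
        var followers = Enumerable.Range(1, 10000).ToList();

        int chunks = PostInChunks(store, followers, "post:1", 1000).Count();

        // prints "10000 operations in 10 chunks"
        Console.WriteLine(store.Operations + " operations in " + chunks + " chunks");
    }
}
```

Scale the follower count to 4 million and a single post eats forty seconds of a 100,000 operations per second budget, which is why the raw throughput number says little on its own.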

Ayende’s thoughts:

I need to think about this a bit more, I have some ideas based on this presentation that I would really like to explore more.

JAOO: Working Effectively with Legacy Code 2 – Michael Feathers

I have a tremendous amount of respect for Michael Feathers, so it is a no brainer to see his presentation.

Michael is talking about why global variables are not evil. We already have global state in the application, and removing it is bad or impossible. Avoiding global variables leads to very deep argument passing chains, where something needs an object and it is passed through dozens of objects that just pass it down. We already have notions of how to test systems using globals (Singletons). He also talks about Repository Hubs & Factory Hubs – which provide the scope for the usage of a global variable.

  • Refactor toward explicit seams, do not rely on accidental seams, make them explicit.
  • Test Setup == Coupling, excessive setup == excessive coupling.
  • Slow tests indicate insufficient granularity of coupling <- I am not sure that I agree with that, see my previous posts about testing for why.
  • It is often easier to mock outward interfaces than inward interfaces (try to avoid mocking stuff that returns data)
  • One of the hardest things in legacy code is making a change and not knowing what it is affecting. Functional programming makes it easier, because of immutability.
  • Seams in functional languages are harder. You parameterize functions in order to get those seams.
  • TUF – Test Unfriendly Feature – IO, database, long computation
  • TUC – Test Unfriendly Construct – static method, ctor, singleton
  • Never Hide a TUF within a TUC
  • No Lie principle – Code should never lie to you. Ways that code can lie:
    • Dynamically replacing code in the source
    • Addition isn’t a problem
    • System behavior should be “what I see in the code + something else”, never “what I see minus something else”
    • Weaving & aspects
    • Impact on inheritance
  • The Fallacy of Restricted Languages
  • You want to rewrite if the architecture itself is bad; if you have issues making changes rapidly, it is time to refactor the rough edges out.
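
The TUF / TUC rule from the notes above is easier to see in code. This is a minimal sketch, not something shown in the talk: `LegacyConfig`, `TimeoutPolicy` and the `timeout.txt` file are made-up names, and the seam is created by parameterizing with a function, as the note on functional languages suggests.

```csharp
using System;
using System.IO;

public static class LegacyConfig
{
    // A TUF (file IO) hidden inside a TUC (static method):
    // anything that calls this needs a real file on disk to be tested.
    public static int ReadTimeoutSeconds()
    {
        return int.Parse(File.ReadAllText("timeout.txt").Trim());
    }
}

public class TimeoutPolicy
{
    // Explicit seam: the IO is passed in, so a test can supply a plain lambda.
    private readonly Func<string> readRawValue;

    public TimeoutPolicy(Func<string> readRawValue)
    {
        this.readRawValue = readRawValue;
    }

    // The logic worth testing: clamp the configured value to 1..300 seconds.
    public int TimeoutSeconds()
    {
        int value = int.Parse(readRawValue().Trim());
        return Math.Max(1, Math.Min(value, 300));
    }
}

public static class SeamDemo
{
    public static void Main()
    {
        // Production code would pass () => File.ReadAllText("timeout.txt").
        // A test passes canned text and never touches the disk.
        var policy = new TimeoutPolicy(() => "9999\n");
        Console.WriteLine(policy.TimeoutSeconds()); // 300
    }
}
```

The test setup for `TimeoutPolicy` is one lambda, which fits the "test setup == coupling" point as well.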

xProfiler – A generic OR/M Profiler

With the release of NH Prof v1.0, I started to look at whether I can extend what I am doing for NHibernate to other OR/Ms in the .NET space. My initial spiking makes me optimistic, this is certainly possible. I’ll probably talk at length about the actual architectural implementation, but for now I want to concentrate on the actual high level requirements. I want to be able to support the following:

  • Linq to SQL
  • SubSonic
  • LLBLGen
  • Plug your own DAL

While none of them are going to provide me with the detailed information that I can get from NHibernate, it turns out that I can get pretty good mileage from just pushing the basics along. The first spikes with Linq to SQL are promising (more about that will show up starting next week or the one after that), and I intend to allow you to:

  • Show DataContext
  • Show SQL Statements
    • Show you the actual formatted SQL, including parameters
    • Show you the stack trace of where that SQL was generated
  • Generate alerts for bad practices such as SELECT N+1 or issuing too many queries

There are things that I can do with NHibernate that are simply not possible with other OR/Ms (something like tracking loaded entities per session, for example, or showing cached queries), but since most of those are actually capabilities that NHibernate has and the others do not, I think it is still great.

Currently the plan is to have a separate product for each OR/M, that means that buying NH Prof will not get you L2S Prof, but we will most likely have some uber license that will cover all of them.

You’ll notice that the Entity Framework isn’t listed in my initial targets list. That is for a very simple reason, plugging into EF seems to be about nine times harder than doing it with anything else. I would need to get strong feedback that this is something that enough people are willing to pay for.

Message passing concurrency and shared state

Mike Rettig has left a somewhat snarky comment on a post detailing a deadlock issue that I ran into:

Locking on shared state? I thought you were a proponent of message based concurrency.  This post demonstrates exactly why concurrency combined with shared state is so hard.
Looking forward to your next thread race or deadlock,

The problem with message passing concurrency is that the underlying assumption here is that there isn’t any shared state. But in my situation, that is not a valid assumption.

Let us see if I can give a good example of what I mean. Let us assume that we have a message passing system that exchanges the following messages:

  • Session Created { Session Id }
  • Statement Executed { Session Id, Statement Text }
  • Query Sessions And Statements { }

Furthermore, you are not going to be able to make use of something like a DB to manage the state (which would handle the sharing issue for you), you have to manage everything in memory.

I would be very interested in hearing how you can design such a system without having shared state and locking.
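
To show where the shared state creeps in, here is a minimal sketch of the obvious in-memory design for those three messages. It is an illustration only, not NH Prof's code, and the `SessionTracker` name and method names are invented. The two write messages and the query message all touch the same dictionary, so the handler ends up guarding it with a lock.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class SessionTracker
{
    private readonly object sync = new object();

    // The shared state: every message type reads or writes this dictionary.
    private readonly Dictionary<Guid, List<string>> statementsBySession =
        new Dictionary<Guid, List<string>>();

    // Session Created { Session Id }
    public void SessionCreated(Guid sessionId)
    {
        lock (sync)
        {
            if (!statementsBySession.ContainsKey(sessionId))
                statementsBySession[sessionId] = new List<string>();
        }
    }

    // Statement Executed { Session Id, Statement Text }
    public void StatementExecuted(Guid sessionId, string statementText)
    {
        lock (sync)
        {
            List<string> statements;
            if (!statementsBySession.TryGetValue(sessionId, out statements))
                statementsBySession[sessionId] = statements = new List<string>();
            statements.Add(statementText);
        }
    }

    // Query Sessions And Statements { }
    // Returns a copy, so the caller never sees the lists while they change.
    public Dictionary<Guid, string[]> QuerySessionsAndStatements()
    {
        lock (sync)
        {
            return statementsBySession.ToDictionary(p => p.Key, p => p.Value.ToArray());
        }
    }

    public static void Main()
    {
        var tracker = new SessionTracker();
        var sessionId = Guid.NewGuid();
        tracker.SessionCreated(sessionId);

        // Messages arriving concurrently from several threads.
        Parallel.For(0, 1000, i => tracker.StatementExecuted(sessionId, "select " + i));

        Console.WriteLine(tracker.QuerySessionsAndStatements()[sessionId].Length); // 1000
    }
}
```

Funnelling every message through a single consumer thread would remove the explicit lock, but the dictionary is still state shared between the writers and the query, only now the queue is doing the serializing.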

What is happening with NH Prof?

I was a bit quiet on the NH Prof front lately, not because I didn’t work on it, but because I was fighting a really nasty bug. The way NH Prof is making use of WPF exposed a memory leak scenario inside WPF.

Luckily, once we were able to isolate the actual problem, it was relatively easy to find a workaround. If you care, the resolution was to keep a single instance bound to the view and replace its values, instead of providing a new instance when replacement was required. The problem is that this bug took forever to isolate.
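
The shape of that workaround can be sketched without any WPF types. This is an illustration of the pattern only, not NH Prof's code: `StatementViewModel` and `UpdateFrom` are hypothetical names, and the sketch makes no claim about why the leak happens. The view stays bound to one object for its whole life, and new data is copied into that object instead of swapping the object out.

```csharp
using System;
using System.ComponentModel;

public class StatementViewModel : INotifyPropertyChanged
{
    private string text;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Text
    {
        get { return text; }
        set
        {
            if (text == value) return;
            text = value;
            var handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs("Text"));
        }
    }

    // Replace the values in place instead of handing the view a new instance.
    public void UpdateFrom(StatementViewModel other)
    {
        Text = other.Text;
    }
}

public static class SingleInstanceDemo
{
    public static void Main()
    {
        // The instance the view is bound to; it is never replaced.
        var bound = new StatementViewModel { Text = "select 1" };

        int notifications = 0;
        bound.PropertyChanged += (sender, e) => notifications++;

        // New data arrives: copy it in, the binding sees a property change.
        bound.UpdateFrom(new StatementViewModel { Text = "select 2" });

        Console.WriteLine(bound.Text + " (" + notifications + " notification)");
    }
}
```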

Also included are a bunch of performance optimizations that I did along the way to resolve the OOM error. Those relate to better handling of batch statements, caching the result of SQL parsing and optimizing NH Prof’s idle state (I’ll have a separate post about that).

In addition to that, I gave some additional love to DDL statements, making sure that NH Prof treated them specially and didn’t generate DML errors for DDL statements. Same goes for cached statements.

It is a bunch of small changes, the biggest of them was tracking down and viciously attacking the OOM error. For the next week, I am going to be busy at JAOO, but I also intend to spend some free time continuing the final polishing & touch ups.

The week after that, I intend to seriously start working on the 1.1 feature set.

Should be interesting.

Do you sign the default contract?

This is just something that came up recently in a mailing list, we were talking about copyright, ownership and such. The topic of who owns the code you write on your own time (and on your own machines) came up.

The opinion of some people was that the employer may own the code even under those circumstances. It seems that it isn’t usually part of the law (that depends on where you are, of course), but it is part of standard employment contract templates.

When I started looking for a job, I insisted on taking the employment contract home and going over it with:

  • a calm mind
  • having another set of eyes go over it

I had one case of not properly reading what I was signing, with bad consequences. I have learned since then.

There is no such thing as a standard contract, you can always negotiate.

For that matter, I rejected an offer from one place after verbal agreements that we reached didn’t get into the contract (twice!). I decided that if they were trying to effectively cheat me when I wasn’t even working for them, I had better things to do than to put my head into this sickbed.

Some of the things that I found in employment contracts are of the sort that would make your hair curl. Non compete agreements that basically say that you are not allowed to do any work (for anyone) for 2 years after you stop working for the company. Ownership of anything you do (be it software artifacts, a book about flowers and quite possibly any children you have during your employment term).

Some of them are unenforceable in court, but you would be in a much better position if you didn’t have to deal with annoying sections in a contract that you have signed in the first place.

My usual approach to reading contracts is to debug them, assuming that the other side is nefarious, evil, double dealing and likes kicking puppies before breakfast. Most places will go with the “Try and you shall succeed” method for contracts. If you signed on to them without complaints, they are good. If you object to something, they can amend the contract to be more reasonable. It isn’t that they are nefarious, or that they even plan to act according to the contract. But it is best if they don’t have any leverage on you.

An interesting point that I ran into is that it is often useful to be bold when negotiating a contract. I deleted the non compete clause from my employment contract when I reviewed it, and required a lot of clarifications about which parts of my work amount to the company’s property. I followed the same logic as they did, “Try and you shall succeed”, if they didn’t care about that, I was good.

We ended up with a 1 year limitation for clients that they sent me to, and agreeing that any software work that I do on the company’s time or using their equipment belongs to the company, which I considered reasonable.

Not reading the contract is a crime. Once you have read it, be very careful in deciding what is acceptable and what isn’t. And if you have already signed a contract, make sure that you know what is in it.