Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 3 min | 406 words

Usually when I am doing a code review, I ask not just access to the code, but to the source control repository as well. There are quite a lot that you can learn from the source control of a project.

Here is something that is both good and bad. It is bad because we have no checkin comments. It is good because we can see that we have very frequent checkins.

image 

Another thing that you want to check is the bug tracking system. Is there such a thing (including email thread or a post it notes or a story wall)? Is it relevant?

Is there a defined build process? How do you deploy to production? How do you manage DB changes? What is you back off strategy from an ambitious feature? How do you rollback a failed deployment? Do you have a physical deployment plan?

I don't expect all projects to have detailed answers to the all the questions I put above, but I do expect to have a response. The worst possible thing that could happen is that there is no thought given to those.

My standard answer to "How do you manage DB changes?" is: I create a diff using SQL Compare and apply that when needing to update DB versions.

The most important thing in my opinion is the actual interaction in code reviews. I tend to have two stages of code review. The first is going over the code with only general idea about what is going on. This is to find if there are things that require special attention, if there are obvious issues there.

In general, you want to be able to view this from a clean mind. Then, you want to have a conversation with the developers. Based on that initial review. This allows you to capture more than just the code, it let you capture the actual dynamic of the team and how they work. During this phase I also get a lot more

The second phase of the code is much more in depth, I tend to try to go over every line in the code at least once. After that, I can sit with the developers again and discuss the overall impression.

Probably the most value in the process comes at this second state, I think.

time to read 3 min | 544 words

Logging is probably the most mentioned sweet spot of AOP. Probably because it is the simplest and most straightforward example most people can think of. I remember going over a piece of code that handles billions of transactions a day(that is for you, Jdn), and seeing something like this:

public void DoSomethingInteresting(string arg1, int arg2)
{
	LogGateway.Enter("DoSomethingInteresting", arg1, arg2);
	try
	{
		// actual method code 
		LogGateway.Success("DoSomethingInteresting", arg1, arg2);
	}
	catch(Exception e)
	{
		LogGateway.Error(e, "DoSomethingInteresting", arg1, arg2);
		throw e;
	}
	finally
	{
		LogGateway.Exit("DoSomethingInteresting", arg1, arg2);
	}
}

That was repeated for each and every method in the system. It was horrible.

I don't care how you do it, but there are at least 7 AOP approaches in the CLR, at least three of them are applicable in your environment.

I am going to show how to handle that with Windsor's AOP, but the same thing is applicable using other approaches. First, our service:

public interface ISmsSender
{
	int Send(string to, string msg);
}

public class SmsSender : ISmsSender
{
	public int Send(string to, string msg)
	{
		if(msg.Length>160)
			throw new ArgumentException("too long","msg");
		return to.Length;
	}
}

And now we need to define the interceptor:

public class LoggingInterceptor : IInterceptor
{
	public void Intercept(IInvocation invocation)
	{
		var logger = LogManager.GetLogger(invocation.TargetType);
		try
		{
			StringBuilder sb = null;
			if (logger.IsDebugEnabled)
			{
				sb = new StringBuilder(invocation.TargetType.FullName)
					.Append(".")
					.Append(invocation.Method)
					.Append("(");
				for (int i = 0; i < invocation.Arguments.Length; i++)
				{
					if (i > 0)
						sb.Append(", ");
					sb.Append(invocation.Arguments[i]);
				}
				sb.Append(")");
				logger.Debug(sb);
			}

			invocation.Proceed();
			if(logger.IsDebugEnabled)
			{
				
				logger.Debug("Result of " + sb + " is: " + invocation.ReturnValue);
			}
		}
		catch (Exception e)
		{
			logger.Error(e);
			throw;
		}
	}
}

The code is really simple, we get the logger instance ( in this case, from log4net ), check if logging is enabled and if so, write the method and parameters to the log. We then execute the code and log the return value as well.

If there was an error, we log that as well.

Here is my test code:

var container = new WindsorContainer()
	.Register(
		Component.For<LoggingInterceptor>(),
		Component.For<ISmsSender>().ImplementedBy<SmsSender>()
			.Interceptors(new InterceptorReference(typeof(LoggingInterceptor))).First
	);
BasicConfigurator.Configure(); // configure log4net
var sender = container.Resolve<ISmsSender>();
try
{
	sender.Send("ayende", "short");
	sender.Send("rahien", new string('q', 161));
}
catch (Exception)
{
	
}

And the output for that is:

241 [1] DEBUG ConsoleApplication9.SmsSender (null) - ConsoleApplication9.SmsSender.Int32 Send(System.String, System.String)(ayende, short)
272 [1] DEBUG ConsoleApplication9.SmsSender (null) - Result of ConsoleApplication9.SmsSender.Int32 Send(System.String, System.String)(ayende, short) is: 6
272 [1] DEBUG ConsoleApplication9.SmsSender (null) - ConsoleApplication9.SmsSender.Int32 Send(System.String, System.String)(rahien,
    qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
    qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq
    qqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqqq)
273 [1] ERROR ConsoleApplication9.SmsSender (null) - System.ArgumentException: too long
Parameter name: msg
   at ConsoleApplication9.SmsSender.Send(String to, String msg) in C:\Users\ayende\Documents\Visual Studio 2008\Projects\ConsoleApplication9\ConsoleApplication9\ISmsSender.cs:line 15
   at ISmsSenderProxyb24fc36182fb434fa7f466d148259c48.InvocationSend_1.InvokeMethodOnTarget()
   at Castle.DynamicProxy.AbstractInvocation.Proceed() in e:\OSS\Castle\Tools\Castle.DynamicProxy2\Castle.DynamicProxy\AbstractInvocation.cs:line 140
   at ConsoleApplication9.LoggingInterceptor.Intercept(IInvocation invocation) in C:\Users\ayende\Documents\Visual Studio 2008\Projects\ConsoleApplication9\ConsoleApplication9\Program.cs:line 65

Simple, elegant, and doesn't clutter my code.

time to read 2 min | 201 words

This post just went over the line...

image

I mean, how do you expect me to respond to such a message?

In the old tradition of, don't give the man a fish, here is how you find out.

There is this tool called Google. You can s-e-a-r-c-h things there. I used the advance technique of putting the search terms in the text box and clicked on Search.

Here is the result:

image

The third result, surprisingly enough, is from my site. You can see it right there in the link. Clicking on the link will lead you to this page:

 

image

Do you see the big red arrow? It point to a new fangled concept called a link. If you click on it, it takes you somewhere.

Here is where it takes you in this situation:

image

Have fun sharing this with your team, but I suggest taking a Google 101 course at the same time.

time to read 6 min | 1157 words

The question came up in the ALT.Net list, and it is fairly common in the DDD circles as well.

I have an User with Email address, which I want to keep encrypted in the database, but visible to the application. How do I handle this scenario in the entity?

We can do it in our service layer, and encrypt / decrypt that outside the entity, but that has two issues. It couple the entity to the service and it means that we can forget to encrypt the email at some point.It also smells.

Another option, which I call the brute force option, is to pass the dependency to the entity and let it handle its own concerns. Here is implementation:

image

Yes, it is ugly, and even using this approach you can improve upon it, but let us focus on the interesting tidbits. First, not only does our entity know about an infrastructure service, it is also aware of how to create it. It has to do so, because we have to provide an empty ctor for NHibernate to use.

We can generalize this a bit, leading us to this scenario:

image

This piece of code is akin to nails on board to me. It isn't how I want to see code written.

A better way exists. We will start by making a slight modification to the User class, we remove the empty ctor. Well, we can't remove it at the moment, there is check in NHibernate for this, which I need to find and remove, so for now, we have this:

image

Note that this also requires that you'll specify the unsaved value of the identifier in the configuration.

Now that we have established that we have no way of creating the User using the empty ctor, we let NHibernate know how it can create it:

public class EntityDependenciesInterceptor : EmptyInterceptor
{
    public override object Instantiate(string clazz, EntityMode entityMode, object id)
    {
        if(entityMode!=EntityMode.Poco)
            return base.Instantiate(clazz, entityMode, id);
        if(clazz != typeof(User).FullName)
            return base.Instantiate(clazz, entityMode, id);
        return new User(CryptoProvider.Instance);
    }
}

Just for User, we override the way NHibernate create entities. Now we can register it in the session, and we are done:

using (ISession session = sessionFactory.OpenSession(new EntityDependenciesInterceptor()))
{
    IList list = session.CreateCriteria(typeof(User)).List();
}

This is still not something that I particularly like. I don't see encryption as part of the responsibilities of the entity. It is just something we need in order to save encrypted values to the DB. So let make it explicit.

NHibernate has the notion of IUserType, which let you take actions on save / load of values from the DB.

First, we make User POCO again, remove all traces of the previous ugliness. Now, we change the mapping for User so the email property is defined so:

<property name="Email" type="Samples.EncryptedStringUserType, Samples" />		

And now we need to create the user type:

public class EncryptedStringUserType : IUserType
{
    public bool Equals(object x, object y)
    {
        return object.Equals(x, y);
    }

    public int GetHashCode(object x)
    {
        return x.GetHashCode();
    }

    public object NullSafeGet(IDataReader rs, string[] names, object owner)
    {
        object r = rs[names[0]];
        if (r == DBNull.Value)
            return null;
        return CryptoProvider.Instance.Decrypt((string) r);
    }

    public void NullSafeSet(IDbCommand cmd, object value, int index)
    {
        object paramVal = DBNull.Value;
        if (value != null)
            paramVal = CryptoProvider.Instance.Encrypt((string) value);
        IDataParameter parameter = (IDataParameter)cmd.Parameters[index];
        parameter.Value = paramVal;
    }

    public object DeepCopy(object value)
    {
        return value;
    }

    public object Replace(object original, object target, object owner)
    {
        return original;
    }

    public object Assemble(object cached, object owner)
    {
        return cached;
    }

    public object Disassemble(object value)
    {
        return value;
    }

    public SqlType[] SqlTypes
    {
        get { return new SqlType[]{new StringSqlType()}; }
    }

    public Type ReturnedType
    {
        get { return typeof(string); }
    }

    public bool IsMutable
    {
        get { return false; }
    }
}

And that is it, you can now get transparent encryption / decryption when you are going to the DB, at no cost to the application design.

time to read 2 min | 258 words

Adi has posted a post that I am not in agreement with: there are some things a good developer is NOT required to know.

[Quoting Jeff Atwood and Peter Norvig] Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), read consecutive words from disk, and seek to a new location on disk

I did learn these things, but after reading Jeff's post I have been trying to remember a project I worked on during the past 10 years which required this type of knowledge - and I got nothing.

Well, I used knowledge about reading & seeking to / from disks the last two weeks. They are extremely important when you are thinking about high perf large I/O. Fetching words from memory, and the implications of ensuring good locality have huge affects on the type of performance that you are getting to get. As a simple example, a large part of the reason that thread switching is slow is that the CPU cache is mostly misses when you switch a thread.

You should know those things, and you should know how to apply them. No, they don't come often, and I can think of a lot more things that I would like a developer to know (OO, to start with) more than those low level details, but those are important to know.

Useless Java

time to read 5 min | 867 words

This article (The 'Anti-Java' Professor and the Jobless Programmers) has really annoyed me. Not because of its content, but because the professor being interview is so narrowly focused that he is completely out of it.

Let us take this quote:

“Furthermore, Java is mainly used in Web applications that are mostly fairly trivial,” Dewar says, with his characteristic candor. “If all we do is train students to be able to do simple Web programming in Java, they won't get jobs, since those are the jobs that can be easily outsourced. What we need are software engineers who understand how to build complex systems.”

You know what, I might agree with this, trivial web programming is something that doesn't require highly skilled worker. But the context in which he says this makes all the difference in the world. Here is the next statement:

“By the way Java has almost no presence in such systems. At least as of a few months ago, there was not a single line of safety-critical Java flying in commercial or military aircraft. I recently talked to a customer who had a medium-sized application in Java, which was to be adapted to be part of a safety-critical avionics system. They immediately decided that this meant it would have to be recoded in a suitable language, probably either C or Ada.”

So, if it is not aircraft code, it is "simple Web programming"? Excuse me?!

Leaving aside the part about Java's license not being suitable for life critical systems, it annoys me that this is the only criteria that he think is worth mentioning. His bank is running Java, as well as the stock exchange, it is likely that the payroll system in his university as well. Relegating Java to "simple Web programming" is a huge mistake. Java sweet spot is building complex systems, where a lot of the design decisions that make it hard to do the simple stuff pay off tremendously.

I can think of quite a few mission critical (people life on the balance) that are written in C# (for the purpose of this post, equivalent to Java). That is not to say that aircraft systems shouldn't be written in Ada or C, I never written one, so I won't comment, but to say that this is the only field of any complexity is showing a remarkable amount of ignorance and bigotry.

There was another part that caught my eye, this professor interviewing's technique:

Dewar says that if he were interviewing applicants for a development job, he would quickly eliminate the under-trained by asking the following questions:

1.) You begin to suspect that a problem you are having is due to the compiler generating incorrect code. How would you track this down? How would you prepare a bug report for the compiler vendor? How would you work around the problem?

2.) You begin to suspect that a problem you are having is due to a hardware problem, where the processor is not conforming to its specification. How would you track this down? How would you prepare a bug report for the chip manufacturer, and how would you work around the problem?

“I am afraid I would be met by blank stares from most recent CS graduates, many of whom have never seen assembly or machine language!” he says

I can tell quite a lot about what he is doing, just by the questions he ask, but let me try answering them.

1) Well, I would start by creating an isolated test case, seeing if I can still see the odd behavior. I would remove all code that is even half way smart, trying to reduce the amount of work that the compiler has to do. I would get the language spec and pore over it, trying to make sure that I am not relying on unspecified behavior or misunderstanding the spec. Hopefully, I have a decompiler as well, since that would be an independent corroboration that there is or isn't an issue. Then I would take the output and verify disassemble it, comparing it to what is supposed to happen. I will take that info and submit to the compiler vendor. Working around this issue... by now I have some understanding on what triggers it, so I would change the code accordingly, simplifying it, removing any smart tricks, etc. I'll also leave a big comment about why this is done.

2) Try it on a different machine with the same processor, try it on another machine with a different processor. That rules problems with a specific machine and problems with my code, respectively. Reduce the problem to the smallest thing that can repro it, go over the spec, verify understanding. Eventually I should have a small program that demonstrate the issue. This can be incredibly hard to do if you run into something like memory barriers do not work in some scenario on SMP machines, for example. Work around it... depending on context...

time to read 2 min | 320 words

I am working on the versioning chapter for the DSL book, and it is going amazingly well. When I started it I had only the vaguest of idea how to go about writing it, but I wrote 10 pages today, and it is going very fast, and the flow seems to be good. What creeps me out is that I spent months thinking about this, without getting any concrete idea about how to express myself, and than it all flow at once.

The problem, of course, is that it doesn't really flow into the book, here is my current blogging queue (unordered):

  1. NHibernate Interception with Post Sharp
  2. Entities dependencies best practices
  3. Creating logging interceptor with Castle Windsor
  4. Evaluating Prism
  5. Usage scenarios for Cloud Computing
  6. Evaluating Kym
  7. Distributed Hash Tables: Locks usage best practices
  8. Distributed Hash Tables: Locality
  9. Distributed Hash Tables: Namespaces
  10. Distributed Hash Tables: Time sensitive updates
  11. Distributed Hash Tables: Lookup by property
  12. Distributed Hash Tables: Lookup by range
  13. Code Critique: Transactional Queue
  14. Rhino Queues
  15. Useless Java
  16. A DSL trip to the Zoo
  17. I'll get to your application in a week- First, we need to build the framework
  18. Production Quality: A DSL Sample

And I am probably forgetting a few along the way. Just getting this turned around would be hard enough, but I also have three webcasts that I need to record & edit:

  • First Steps With Rhino Commons
  • Building Domain Specific Language with Boo
  • Production Quality Software

Way too much on my plate, I am going to bed.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}