Ayende @ Rahien


A vision of enterprise platform: A database that you don't hide in the attic

For some reason, the moment that people start working on Enterprise Solutions, there is a... tendency to assume that because we are going to build a big and complex application, we can afford to ignore best practices and proven approaches.

That is just wrong. It applies in many ways to many parts of the application, but nowhere does it apply as obviously as it does to the database.

These are just the results of some very short research:

  • The metadata driven approach, AKA the 5 tables DB
  • The Blob
  • The 5th normal form DB
  • The 5th denormalized form DB
  • The table a day DB
  • The obfuscated database - create table tbl15 (fld_a int, fld_b int, fld_g nvarchar(255) )

And those are just about the database structure; we haven't even gotten to the database behavior yet. Here we have interesting approaches such as rolling your own "Transactions" table, requiring 40 joins to get a simple value, using nvarchar as the ultimate extensibility mechanism, etc.

So, how do we approach building a database that we can actually show in public? Let us start with thinking about the constraints that we have for the application. The database needs to support...

  • OLTP for the application itself.
  • Reports.
  • Performance.
  • ETL processes to take data in and out of the system.
  • Large amount of data.
  • Large amount of entities.

Notice how we don't have any suggestion about integrating with the application at the database level. That is a fairly common antipattern.

We are going to try to keep a one to one mapping between the entity and the table structure, because that will make it significantly easier to work with the system. One to one means that the Account Name translates to Account.Name in the entity model, and Accounts.Name in the database model.

 

image

We probably want the usual auditing suspects, such as Id, CreatedAt, ModifiedAt, CreatedBy, ModifiedBy, Owner, OrganizationUnit. Maybe a few more, if that makes sense. Probably not too many, because that can get to be a problem.
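As a sketch, the shared columns could live in a common base class; the names come from the list above, while the key type and the base-class shape are just assumptions, not a prescribed design:

public abstract class AuditedEntity
{
	public virtual int Id { get; set; }
	public virtual DateTime CreatedAt { get; set; }
	public virtual DateTime ModifiedAt { get; set; }
	public virtual string CreatedBy { get; set; }
	public virtual string ModifiedBy { get; set; }
	public virtual string Owner { get; set; }
	public virtual string OrganizationUnit { get; set; }
}

public class Account : AuditedEntity
{
	// one to one with the Accounts table: Account.Name maps to Accounts.Name
	public virtual string Name { get; set; }
}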

Accessing and manipulating this in the application layer is going to be trivial, so I am not really going to cover that in depth.

What I do want to talk about is how we expose it to the outside world. And by that I mean reports and ETL processes.

We have several options for doing that. We can just let the reports and ETL processes read from the database tables directly. This is the simplest approach, I think.

Other options include views, stored procedures, and other DB thingies that we can use. I have seen systems where an entity was composed of several tables, and the report was done off of a view that joined it all together.

The underlying problem is that I have versioning to consider. I am going to dedicate a full post to the problem of building upgradable, user-customizable applications, so I'll defer it until then, but let us just say that radically changing the database schema between versions will be painful for the users. The usual way to handle that is to only make promises for the views, and not for the tables themselves. That is a good way to handle it, in my opinion.

I would suggest putting those in a separate schema, to make it clearer that the separation is important. This also gives you the ability to later do a view-to-table refactoring, and maintain that with triggers. That is usually a bad idea, but for performance reasons, it may very well be a good solution.

ETL processes can use the same mechanism that reports use to read data from the database efficiently, but nothing writes to the database except the application. At one point I wrote a change-DB-password utility that ran every two hours; it would change the database password and update the relevant config files.

I think that you can guess what the driving force behind that was, no?

Getting data into the database can be done through application services (not necessarily web services, btw). A simple example would be an API similar to this one:

void Insert(params Account[] accounts);
void Update(params Account[] accounts);
void Delete(params Account[] accounts);

This API explicitly allows for bulk operations, so it can be very nice to work with, instead of having to do things on a per-row basis, which basically kills performance.
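A usage sketch (accountService and the variables are just for illustration): the params signature makes the single-item case and the batch case look the same to the caller, and the batch goes over in a single round trip.

accountService.Insert(newAccount);              // a single entity
accountService.Update(dirtyAccounts.ToArray()); // a whole batch in one call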

How to get good performance from this system is another hard question. In this case, I would usually recommend getting a good DBA to look at the performance characteristics of the application and optimize the database structure if needed. But a much easier solution to performance problems on the database server is to not hit the DB server at all, and use caching instead. Distributed caching solutions, like Memcached, NCache, etc. are a really good way of handling that.
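To make the caching point concrete, here is a minimal cache-aside sketch; the ICache interface, the key format and the expiration are assumptions standing in for whatever Memcached/NCache client you actually use.

public interface ICache
{
	object Get(string key);
	void Put(string key, object value, TimeSpan expiration);
}

public class AccountRepository
{
	private readonly ICache cache;

	public AccountRepository(ICache cache)
	{
		this.cache = cache;
	}

	public Account GetAccount(int id)
	{
		string key = "Account#" + id;
		Account account = (Account)cache.Get(key);
		if (account == null)
		{
			account = LoadFromDatabase(id); // only hit the DB server on a cache miss
			cache.Put(key, account, TimeSpan.FromMinutes(5));
		}
		return account;
	}

	private Account LoadFromDatabase(int id)
	{
		// the actual data access is not the point here
		throw new NotImplementedException();
	}
}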

No business logic in the DB! This is important: if you put business logic in the DB, you have to get to the DB in order to execute the business logic. This kills scalability, hurts the ability to understand the solution, and in general makes life miserable all around.

Reports are an interesting problem. How do you deal with security, for instance? Consider the security infrastructure that I already presented. This security infrastructure should also come with a database function that you can use like this:

SELECT * FROM Accounts 
WHERE IsAllowed(Accounts.EntitySecurityKey, @UserId, 'Account.View')

Or this:

SELECT * FROM Accounts
WHERE Accounts.EntitySecurityKey IN (
	SELECT EntitySecurityId FROM GetAllowed(@UserId, 'Account.View') 
)

Both of which provide a really easy way to get security for the reports. If we wanted to enforce that, we could force the report writer to write something like this:

SELECT * FROM GetAllowedAccounts(@UserId, 'Account.View')

We can probably get away with assuming that 'Account.View' is the default operation, so it is even shorter. Among other things, this actually has valid performance characteristics.

This post is turning out to be a "don't make stupid mistakes" post, because I don't think that I am writing anything new here. As for how to avoid making stupid mistakes, that is fairly simple as well. Get a good DBA (that is left as an exercise for the reader), give him/her a big stick and encourage good design through superior firepower.

Environment Validation and Windsor Extensibility

So I was in Jeremy Miller's talk about a maintainable software ecosystem, and one of the things that he mentioned that StructureMap does really well is the ability to ask the container to perform environment validations, to make sure that the environment is ready for us.

I really liked the idea, so I pulled up the laptop and started spiking how to handle this issue. First, let us see Jeremy's solution:

public class FileTextReader : ITextReader
{
	private readonly string fileName;

	public FileTextReader(string fileName)
	{
		this.fileName = fileName;
	}

	[ValidateConfiguration]
	public void ValidateFileExistance()
	{
		if (File.Exists(fileName) == false)
			throw new FileNotFoundException("Could not find file " + fileName);
	}
}

So, when you ask StructureMap to validate the environment, it will run all the methods that have been decorated with [ValidateConfiguration].

So, how do we do that in Windsor?

The most important thing to realize about Windsor is that it is a container that was built to be extensible. Something like this is not going to be a change to the container itself, it will be an extension. Extensions are usually facilities, like this one:

public class ValidationFacility : AbstractFacility
{
	private readonly List<string> componentsToValidate = new List<string>();

	protected override void Init()
	{
		Kernel.AddComponent<ValidateConfiguration>();
		IHandler handler = Kernel.GetHandler(typeof(ValidateConfiguration));
		handler.AddCustomDependencyValue("componentsToValidate",
			 componentsToValidate
			);
		Kernel.ComponentRegistered += OnComponentRegistered;
	}

	public void OnComponentRegistered(string key, IHandler handler)
	{
		foreach (MethodInfo method in handler.ComponentModel.Implementation.GetMethods())
		{
			bool isValidateMethod = method
				.GetCustomAttributes(typeof(ValidateConfigurationAttribute), true)
				.Length != 0;
			if (isValidateMethod)
			{
				componentsToValidate.Add(key);
				break;
			}
		}
	}
}

This extends the container, so whenever a component is registered, I am checking whether I need to add it to the list of components that need validation. I am doing a tiny bit of cheating here and passing componentsToValidate as a reference to the component; it is simpler that way, but the component gets the same instance, which is probably not what I would want in other cases. I would usually go with a sub resolver that handled that issue, if I was doing something like this for more interesting purposes.

Anyway, here is how the ValidateConfiguration class is built:

public class ValidateConfiguration
{
	private readonly ICollection<string> componentsToValidate;
	private readonly ILogger logger;
	private readonly IKernel kernel;

	public ValidateConfiguration(
		ICollection<string> componentsToValidate,
		ILogger logger,
		IKernel kernel)
	{
		this.componentsToValidate = componentsToValidate;
		this.logger = logger;
		this.kernel = kernel;
	}

	public void PerformValidation()
	{
		foreach (string key in componentsToValidate)
		{
			ValidateComponent(key);
		}
	}

	private void ValidateComponent(string key)
	{
		IHandler handler = kernel.GetHandler(key);
		if (handler == null)
		{
			logger.Warn("Component {0} was removed before it could be validated", key);
			return;
		}
		try
		{
			object component = handler.Resolve(CreationContext.Empty);
			foreach (MethodInfo method in component.GetType().GetMethods())
			{
				bool isValidateMethod = method.GetCustomAttributes(typeof(ValidateConfigurationAttribute), true).Length != 0;
				if (isValidateMethod)
				{
					ExecuteValidationMethod(component, method);
				}
			}
		}
		catch (TargetInvocationException e)
		{
			logger.Error("Failed to run validation for {0}, because: {1}", key, e.InnerException);
		}
		catch (Exception e)
		{
			logger.Error("Failed to run validation for {0}, because: {1}", key, e);
		}
	}

	private void ExecuteValidationMethod(object component, MethodBase method)
	{
		try
		{
			method.Invoke(component, new object[0]);
		}
		catch (Exception e)
		{
			logger.Error("Failed to validate {0}.{1}. Error: {2}",
				method.DeclaringType,
				method.Name,
				e);
		}
	}
}

This is a class that has a deep association with the container. That is usually not something that I would like in my application services, but it is fine for infrastructure pieces, like this one.

Now that I have that, I can actually test the implementation:

IWindsorContainer container = new WindsorContainer();
container.AddFacility("validation", new ValidationFacility());
container.AddComponent<ITextReader, FileTextReader>();
container.Kernel.GetHandler(typeof(ITextReader))
	.AddCustomDependencyValue("fileName", "foo");
container.AddComponent<ILogger, ConsoleLogger>();

ValidateConfiguration resolve = container.Resolve<ValidateConfiguration>();
resolve.PerformValidation();

And this will go over everything and perform whatever validations need to be done.

As I said, I really like the idea, and extending this to a build task is really trivial (especially if you are using Boo Build System to do things).

The main point, however, is that I managed to write this piece of code (around 100 lines or so) during Jeremy's talk, so from the time he talked about that feature to the time that he finished, I already had it done. This actually has nothing to do with my personal prowess with code, but it has a lot to do with the way Windsor is built, as a set of services that can be so readily extended.

Now that I have gotten used to the style that Windsor has, it is getting addictively easy to start extending the container in interesting ways. I highly recommend that you take a look at those features; they are interesting both from a "what can I do with them" perspective and from a "what design allowed this" perspective.

Code Review: PetShop 3.0

I got some feedback about my previous review, that PetShop 2.0 was recognized as architecturally unsound, and that I should look at version 3.0 of the code, which is:

Version 3.x of the .NET Pet Shop addresses the valuable feedback given by reviewers of the .NET Pet Shop 2.0 and was created to ensure that the application is aligned with the architecture guideline documents being produced by the Microsoft.

I have to say, it looks like someone told the developers, we need an architecture, go build one. The result is... strange. It makes my spider sense tingle. I can't really say that it is wrong, but it makes me uncomfortable.

Take a look at the account class, and how it is structured:

image

Well, I don't know about you, but that is a poor naming convention to start with. And I am seeing an architecture by rote here, if that makes any sort of sense.

Then there are such things as:

image

Which leads us to this:

image

The MSG_FAILURE is:

image

I am sorry, but while there was some effort made here over the previous version, I am not really impressed with it. As I said, the architecture is now probably sound, if suspicious because of its lack of character, but the implementation is still not one that I would call decent. I have to admit to a strong bias here, though. I don't like the naked CLR, but even so, the code has missed a lot of opportunities to avoid unnecessary duplication and work.

I have also been asked what I would consider a good sample application, and I would like to recommend Cuyahoga, as the application that probably models my thinking the best. SubText is also good, but it is a more interesting case, because I don't like its reliance on stored procedures. Nevertheless, it is a valid approach, and it is certainly serving this blog very well.

Code review: The PetShop Application

I gave a talk about ReSharper today, and I used the PetShop demo app as the base code. I had purposefully avoided looking at the source code of the sample until today, because I wanted to get a genuine experience, rather than a rehearsed one. I don't think it went as well as it could have, but that is not the point of this post. The point is to talk about just the code quality of the PetShop application.

First, let us see what the PetShop application is:

The Microsoft .NET Pet Shop 2.0 illustrates basic and advanced coding concepts within an overall architectural framework

The Pet Shop example is supposed to illustrate "coding concepts", but mostly it demonstrates the ones you want to avoid. I was shocked by what I was seeing there.

I am going to touch on just the account class, because it has enough issues all on its own. The account class is supposed to be a domain entity, but what do you see when you open it?

image

And I really don't like to see SQL inside my code.

And then there is this:

image

And suddenly I am confused, because I see a class that is doing the work of three classes, and that is just by casual browsing.

And then we had this:

image

Somehow, public fields and violating the .NET naming conventions don't really strike me as a good idea.

Then there is code duplication, like between Account.Insert() and Account.Update(), both of which share significant duplication in the form of:

image 

I thought that the whole idea of a sample app was to show off best practices, and that is... quite not so, in this case.

Slipping under the radar

I am having quite a few interesting discussions at DevTeach, and one of those had to do with introducing practices and processes against opposition. For myself, I am a... bit forceful about such suggestions, especially in the face of stupid opposition.

One of the things that came up was simply to do it, the old "it is easier to ask for forgiveness than permission". I am both supportive for that and not really comfortable with the idea.

I support it because it is a way to actually get things done, but I heard a really good example of why it is not always a smart idea. The story was about using Rhino Mocks for mocking, with some team members starting to use it without a proper introduction.

The result was tests that passed, but had a strong coupling to the code under test (too many mocks, too many expectations). When the code changed, the tests broke, because they were specifying too much.

I have seen similar issues result from slipping stuff under the radar, which is why I am not comfortable with it in most cases.

That is not always the case; continuous integration is one example where there usually isn't a problem in just setting it up. But if you are adding a dependency to the system, you need to make it clear to the team how it works. Doing otherwise introduces a bus factor, damages the ability of the team, and causes a host of other problems.

By the way, this doesn't mean that all your team members have to have a vote on every dependency or pattern, but it does mean that they should all be aware of them.

The difference between meta programming and IL weaving

Ivan also posted a response to Jeremy's C# vNext features, and he said something that caught my eye:

5. Metaprogramming => no objection here although you can do it now with reflection emit

No, there is a big difference between reflection emit and byte code weaving on the one hand, and meta programming on the other. Reflection emit is done after compilation is completed; meta programming occurs during compilation. This matters because it means that with the former, your code cannot refer to the results of the change being made.

A case in point: I am using IL weaving to add the Dispose method to a class, but because I am doing it after compilation is completed, I cannot call the Dispose() method on the class, or put it in a using statement, in the same assembly. I would get a compiler error, because the method (and the interface) are not there yet.
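To make that concrete, here is a hypothetical sketch (the weaver and its marker attribute are invented): the class only implements IDisposable after the weaver has run, so the compiler, which sees the pre-weaving code, rejects the using statement in the same assembly, while a cast that is only checked at runtime works fine.

// [GenerateDispose]  <- hypothetical marker the IL weaver would look for
public class MyResource
{
	// no Dispose() here at compile time; the weaver adds it after the build
}

public class Consumer
{
	public static void Use()
	{
		// using (MyResource r = new MyResource()) { }  // compiler error in this assembly:
		// as far as the compiler knows, MyResource is not IDisposable yet.

		object r = new MyResource();
		((IDisposable)r).Dispose(); // compiles, and works at runtime once the interface is woven in
	}
}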

Using meta programming, the compiler will know that those now exist, and it can start using them, because the change happened during the compilation process.

The implications of that are pretty significant, if you are talking about what you can and can't do in terms of enriching the language.

Indistinguishable

Nikhil has a post about using MS Ajax with MS MVC*.

What was particularly interesting to me was that it reminded me very strongly of posts that I wrote exploring Ajax in MonoRail. The method used was the same; the only changes were extremely minute details, such as different method names with the same intention, etc.

* Can we please get better names for those?

What does Mixin mean?

Bill Wagner has a proposal about the usage of mixins. He is talking about having a marker interface with minimal methods (or no methods), and then extending that using extension methods. To a point, he is correct; this will give you some sense of what a mixin is. But it is not enough.

It is not enough because of the following reasons:

  • It is not really a cohesive solution. There is no really good way to specify something like SnapshotMixin. You need an interface and a static class, and you have to inherit from the marker interface, etc. Those are... problematic. I want to just be able to say: "also give me the functionality of this class".
  • A more important issue is one of state. The examples in Bill's proposal are all stateless methods, but I want to have a stateful mixin. I can think of several hacks around that (see the sketch below), but they are hacks, not a proper way to work.
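Here is a sketch of the extension-method approach and of where the state problem bites (all names invented for illustration): the stateless part is a one-liner, while the stateful part degenerates into a static lookup keyed by instance, which is exactly the kind of hack I mean (and it also keeps every "mixed in" object alive for the lifetime of the application).

public interface ISnapshotMixin { } // marker interface, no members

public static class SnapshotMixin
{
	// stateless behavior is easy to bolt on with an extension method
	public static string TakeSnapshot(this ISnapshotMixin self)
	{
		return self.GetType().Name + " @ " + DateTime.Now;
	}

	// state is where it falls apart: a static dictionary keyed by instance is a hack
	private static readonly Dictionary<ISnapshotMixin, string> lastSnapshots =
		new Dictionary<ISnapshotMixin, string>();

	public static void RememberSnapshot(this ISnapshotMixin self)
	{
		lastSnapshots[self] = TakeSnapshot(self);
	}
}

public class Customer : ISnapshotMixin { }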

C# vNext

Jeremy Miller asks what we want in C# vNext. I have only one real request: meta programming of sufficient power, after which we will be able to add all the required semantics without having to argue with the compiler team.

I am not holding my breath on that one, though. I can just imagine the arguments against it (let us start with the potential for abuse, move on to versioning and backward compatibility hell, and go from there).

I want to go over Jeremy's list, and see what I can add there.

  1. Mixins - Agree 102%. This is something that would be so useful, I can't really understand how it is not already there. Make it a magic attribute, something like [Mixin(typeof(IFoo), typeof(FooImpl))], and you can get away with it with just compiler magic, no changes required to the CLR.
  2. Symbols - I am ambivalent on that one. Syntactic sugar is important, but I have other things that I would value more.
  3. Make hashes a language feature - I think that you can do it right now with this syntax:
     var hash = new Hash(
     	Color => "red",
     	Width => 15
     );
  4. Automatic delegation a la Ruby or Objective C - Um, isn't this just mixins?
  5. Metaprogramming! - absolutely. This is something that I have come to consider as basic. I am getting tired of having to fight the compiler to get the code that I want to have. The code should express my meaning; I shouldn't have to dance to the compiler's tune.
  6. Everything is virtual by default to make mocking easier - I certainly would like that, but I fear that this is not something that will be changed. AOP as a platform concept, now that is something that I want to see.

My own requests cover:

  1. memberinfo() - the CLR already has this concept, so we just need the syntax for it.
  2. Method Interception - let us start with the easy stuff, I want to be able to intercept methods from any type. I can do it now if I want to mess with the profiler API, but that is not something that I can really make use of for production.
  3. IDynamicObject - I want method missing, damn it! It just scratches the surface of meta programming, but this is something that you could probably add to the compiler in a week.
  4. Static interfaces. Since we already have generics to allow us to treat types as interchangeable, I want to extend this concept just a bit, to get it to work in a more reasonable manner.

I have a few more, but they just called my flight.

Off to dev teach

See you there in about 20 hours or so, if I manage to get through this with my sanity intact.

I am going to be at the Party with Palermo, I'll meet you there.

Why we need Domain Specific Languages

I speak quite often about DSLs and how to build them, but what I have not done so far is explain why you need a DSL at all. After all, since you are reading this blog, you already know how to program. Can’t we just use “normal” programming languages to do the same job? We can even add a dash of fluent interfaces and Domain Driven Design to make the code easier to read.

We need to inspect the different needs that drive the move toward a DSL.

A Technical DSL

A technical DSL is supposed to be consumed by someone who understands development. It is meant to express matters in a more concise form, but it is still very much a programming language at heart. The main difference is that it is a language focused on solving the specific problem at hand. As such, it has all the benefits (and drawbacks) of single-purpose languages. Examples of technical DSLs include Rake, Binsor, Rhino ETL, etc.

The driving force here is that you want richer expressiveness in specifying what you want to happen. A technical DSL is usually easier to write, because your target audience already understands programming.

In fact, the use of programming features can make a DSL very sweet indeed. We have already seen a Rake sample, so let us see a Binsor sample:

for type in Assembly.Load("MyApplication.Web").GetTypes():
     continue unless type.IsAssignableFrom(Controller)
     component type.FullName, type

So we take three lines to register all the controllers in the application. That is quite a bit of expressiveness. Of course, it assumes that you are familiar with the API and its usage, which is not always true, and that leads nicely to the business-focused DSL.

A Business DSL

A business DSL is focused on being (at least) readable to a businessperson with no background in programming.

This type of DSL is mainly expressive in the terms of the domain, and it puts a lot less emphasis on the programming features that may still exist. It also tends to be much more declarative than a technical DSL, and a lot more emphasis is placed on the nature of the DSL itself, so that those programming features would not really be necessary.

I can’t really think of a good example of a business DSL in the open. A good example of a DSL that I have run into involves a cellular company that had to have some way to handle the myriad of different contracts and benefits that it offered. It also needed to handle this with a fairly rapid time to market, since the company needed to respond quickly to market conditions.

The end result was a DSL in which you could specify the different conditions and their results. For instance, to specify that you get 300 minutes free if you speak over 300 minutes a month, you would write something similar to this:

if CallMinutesForCurrentMonth > 300 and HasBenefit "300 Minutes Free!!!":
          AddCallMinutes -300, "300 Minutes Free!!!"

It was fairly simple to define a small language that could describe most of the types of benefits that the company wanted to express. The rest was a matter of naming conventions and dropping files in a specified folder, to be picked up and processed at regular intervals. The structure that surrounds a DSL is a subject that deserves quite a bit of attention on its own.

A businessperson may not always be able to write actions using a business DSL (more on that later), but they should be able to read and understand it. After all, it is their business and their domain that you are trying to describe.

Now, why shouldn’t a businessperson be able to write actions using a business DSL?

The main reason, as I see it, is error handling. No, I don’t mean in the actual running of the DSL action, I mean when writing it.

A DSL is supposed to read like a language, but it is still a programming language, and those have little tolerance for such things as bad casing of keywords, for instance. Certain types of users will simply be unable to get over the first hurdle in the road they face, because of this.

It is important to know your audience, and it is even more important not to be contemptuous toward that mythical businessperson. You may not think that this person can understand programming, only to discover that there is quite a bit of automation going on in their life already, powered by VBScript and Excel macros.

If you can leverage this knowledge, you have a very powerful combination in your hand, because you can provide that businessperson the tools, and he can provide the knowledge and the required perspective.

Automation DSL

I am not quite sure about this classification, but it certainly has its place. Another name for this may be the IT DSL. This type of DSL is often used to expose the internals of an application to the outside world.

Modern games are usually engines that are configured using some sort of DSL. In fact, I fondly remember building levels in Neverwinter Nights.

More serious uses for this can certainly be found, such as a DSL that lets you go into the internals of an application and manage it. Think about the ability to run a script that will re-route all traffic from a server, wait for all current work to complete, and then take the server down, update it and bring it up again.

Right now, it is possible to do this with shell scripts of various kinds, but most enterprise applications could certainly offer more visibility into themselves than they already do, and a DSL that allows me to inspect and modify their internal state would be very welcome.

I can certainly think of a few administrators who would be grateful to have more power in their hands.

 

To conclude, can you think of other types of DSL in use?

How to visualize a Domain Specific Language

Andrey Shchekin made a good point when he commented on my post about graphical DSL:

I do not use DSLs that are purely graphical. But talking about documentation, I think it is very useful to have a graphical _representation_ for some DSLs that is synchronized with (or generated from) the DSL itself.
For example, any workflows or dataflows (think Rhino ETL) are much easier to see at a glance on visual surface. Editing them on visual surface is also an option, but not a requirement.

I certainly am a believer that a DSL should be mostly declarative, and indeed, taking Rhino ETL as an example, it is something that can have a visual representation.

The only remaining question is how?

Long ago there were the Lithium and Netron projects, which were good for building the kind of UI that a graphical DSL is composed of. Those have gone commercial now, and while I can probably find the originals somewhere, that is not a very promising option. There is always the Visual Studio DSL toolkit, but I have a strong suspicion that I would not like it (COM is involved), and anyway, I want to do it outside of Visual Studio.

Anyone familiar with useful libraries that can make it happen? Ideally, it would allow me to generate UI similar to the class designer. It should also be simple, but I am not sure how to really quantify that.

Graphical Domain Specific Languages

Another form of DSL is the graphical domain specific language. 

What do I mean by that? I am talking about a DSL that is not textual, but rather uses shapes and lines in order to express intent.

A good example would be UML. That is a DSL for describing software systems. In fact, quite a lot of money and effort has been devoted to making UML the One True Model, from which you can generate the rest of the application.

There has also been a lot of effort invested in making it possible to write your own graphical DSL. Microsoft has the Visual Studio DSL Tools, which is a framework that allows you to build tools similar to the class designer and generate code off of them.

Existing examples of graphical DSLs that come to mind are:

  • UML
  • BizTalk
  • SQL Server Integration Services
  • Workflow Foundation

I don’t like graphical DSLs. To be rather more exact, I simply love graphical DSLs for documentation, but I find that they do a rather poor job when it comes to actual development.

I have some experience with all of the above, and I can say confidently that they all share common problems, and that those are inherent to the graphical DSL model.

For a start, any problem of any real complexity would be very hard to build using a graphical DSL. Just arranging all the shapes in a way that lets you browse them easily takes a significant amount of time, and then you get to the issue of actually understanding what this thing does.

The whole point of a graphical DSL is to hide information that you don’t want to see, to let you see the “big picture”. All of which means that you can’t really see the whole at the same time, and this leads to a lot of time spent jumping between various elements on the DSL surface, trying to gather all the required data.

And then there are UI issues that are important. How do I do a search & replace operation in a graphical DSL? How can I just look for something? And, of course, there is a reason why mouse driven development is not a good idea, if only for your wrist.

Beyond that, there are serious issues with how you make a reasonable diff between two versions. No, reading an XML file diff (assuming the files are even diffable) doesn’t work; you need some way to express diffs in a graphical way, and so far I haven’t seen any good way to do that.

From an implementation perspective, there are other issues that you need to consider. In a graphical DSL, the need to express things visually is pretty important, so we need to have some conventions about what shapes and connections we have. The actual what and how will depend on the actual language that you are trying to implement.

If you haven’t guessed by now, I am not a fan of graphical DSLs. Not for programming, at least; they are a valuable tool for documentation and design, but quite a few people are pushing them too hard.

Tasers

Steve is talking about tasers and taser-related deaths.

I remember when we got a few tasers. The rules for using them were pretty strict; they were considered firearms for all intents and purposes. Using them as a method to "subdue" people once they are handcuffed would fall under the category of torture. I can just imagine what I, personally, would do to anyone who tried that, and I would only be the first step in the chain.

Then, of course, you have the nice legal situation where we literally had to get a doctor's permit to use it. The open-fire conditions were the same, but if the inmate had a heart problem, we were to shoot using old-fashioned bullets rather than use a taser. Old-fashioned firearms didn't need a doctor's permit to shoot.

This is a screwed-up world. Then again, we never got into a situation where we actually had to shoot someone, for which I am grateful. I abhor violence. It begets too much paperwork.

A vision of enterprise platform: Hot & Distributed Deployment

If deployment means a week with little sleep and beepers going off just as you nod off to sleep, then you are doing it wrong. I am saying that as someone whose standard deployment procedure has two steps, "svn up" and "build.cmd". Actually, I lied; I have a single step, deploy.cmd, which runs them both for me.

Now, you can't always get this procedure to work; this is especially true if you need to deploy several different parts of the same application at the same time. The classic example is making a change that requires modifying the database as well. More interesting examples involve several layers of the same application (web servers, application servers, dedicated task servers and the database servers), all of which need to be touched in order to deploy the new version of the application.

One of my customers takes the simple approach of shutting down everything and systematically upgrading all the systems while the application itself is offline. That is probably the simplest approach; you don't have to worry about concurrent state and multiple versions, but it is also the one that is hardest for the business to deal with.

In fact, service level agreements may very well limit or prevent doing that. Here is a simple table of common SLAs and what they mean in terms of offline system maintenance:

Uptime      Time for offline maintenance, monthly
99%         7 hours, 12 minutes
99.9%       43 minutes
99.99%      4 minutes
99.999%     26 seconds
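Where those numbers come from, as a quick back-of-the-envelope calculation for a 30-day month (the exact figures shift a little with the length of the month):

TimeSpan month = TimeSpan.FromDays(30);
foreach (double uptime in new[] { 0.99, 0.999, 0.9999, 0.99999 })
{
	TimeSpan allowed = TimeSpan.FromSeconds(month.TotalSeconds * (1 - uptime));
	Console.WriteLine("{0:P3} uptime -> {1} allowed downtime per month", uptime, allowed);
}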

The last number is the one we should pay attention to; it is the magic five nines that SLAs often revolve around. Now, SLAs often talk about unplanned downtime, and in many systems you can safely take the application down outside of business hours and tinker with it as much as you like. Those are not the kind of systems that I want to talk about today. The systems that you can take down safely are fairly easy to upgrade in a safe manner, as was previously mentioned.

The systems that are more interesting are the ones that you want to be able to update on the fly, without a painfully long deployment cycle. This is directly related to the ability to meet the required SLA, but it also has another critically important factor: developer productivity. If making a change in the application requires stopping the application, updating files, starting it up again and then being hit with the initial startup cost, it is painful, annoying and will lead developers to jump through all sorts of really nasty hoops along the way.

So easy deployment is critical both for the developer and for production, but the needs of each are vastly different. For developers, having it work "most of the time" is fine, having it consume more memory is also fine, because the application is not going to run for days on end. For production, it must work all the time, and you must watch your resource consumption very carefully.

A typical hot deployment for production will usually look like this:

image

This assumes a typical production scenario in which you have duplicates of everything, for fail over and performance.

This is just an extension of the previous cold deployment, since we are taking servers down, just not the application as a whole.

This assumes that we actually can take vertical parts of the application down, which is not always the case, of course.

But, basically, we start with both servers operational, move all traffic to the second server, and then perform a cold deploy to the first set of servers.

Then we move all traffic to the new server and perform cold deployment on the second server.

In the end, this usually means that you can perform the entire upgrade without loss of service (although you may suffer a loss of capacity).

As I said, this is probably the ideal scenario for performing hot deployments, but it makes quite a few assumptions that are not always valid. The main one is that the application is structured vertically in such a way that we can close and open parts of it and there are no critical points along the way.

I can think of one common scenario where this is not the case: several web servers sitting on top of the same database server, for instance. So, while this is a good approach, we probably want to consider others as well.

Another option is to design the system with multiple concurrent versions from the get-go. This gives us the following diagram:

image

We have several versions running at the same time, and we can migrate users slowly from one version to the next. This also gives us a nice rollback strategy if we have a botched version.

The problem with this approach is that it is a complicated procedure. This is especially true if the application is composed of several independent services that need to be upgraded separately. Again, database and web servers are a common example, but even web services that need to be upgraded require some interesting mechanics when you have enough of them.

I am going to talk about multi versioned databases in the next post in the series, so let us just take as a given that this is possible, and consider the implications.

If I can run multiple versions at the same time, this means that I can probably propagate a change throughout a cluster of servers without having to worry about the change-over semantics (the time between the first server getting the change and the last one getting it).

Before we go on with production deployments, I want to talk a bit about developer deployments. What do I mean by that?

Well, deployment is probably the wrong term here, but this is the way most of the enterprise platforms make you go about it. And that is a problem. Deploying your changes just in order to run them is a problematic issue. It is problematic because it interferes with the make-a-change/run approach, which is critically important for developer productivity.

As mentioned above, any multi-step process is going to be painful as hell over any significant stretch of development. I often have to go through several dozen change/run cycles before I can say that I am finished with a particular feature.

There is a strong incentive to make it as easy as possible, and as it turns out, we actually have the tools in place to do it very easily indeed. I have produced a screencast that talks about hot deployment of compiled code, and applying the same principles to code in its textual form is demonstrated here.

Here is the directory structure that I envisioned for the platform. As you can guess, this is backed by source control, but more importantly, it is alive (cue mad laughter here).

But what do I mean by that? I mean that any change whatsoever that is made to these directories will be automatically picked up and incorporated by the system as soon as the change is detected.
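A minimal sketch of the watching part, assuming a hypothetical CompileAndReplace(path) that recompiles the changed script and swaps the component in the container (the path and filter are made up):

FileSystemWatcher watcher = new FileSystemWatcher(@"D:\Platform", "*.boo")
{
	IncludeSubdirectories = true,
	NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.FileName
};
watcher.Changed += (sender, e) => CompileAndReplace(e.FullPath);
watcher.Created += (sender, e) => CompileAndReplace(e.FullPath);
watcher.EnableRaisingEvents = true; // from here on, edits to the directories are picked up live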

This supports the change & run mentality very nicely. But what actually goes into these folders?

The parts that are important to us are the entities, controllers and views. For discussing those, we will take the Account entity as an example.

For the system, an Account entity is composed of the following files:

image

There is a reason why I chose to use Boo as the primary language for extending the platform. Not just because of personal bias (which exists), but because it makes it so much easier to deal with quite a few issues.

One of them is the ability to define my own syntax, so the content of the Account.boo file would be similar to this:

entity Account:
    AccountId as int
    Name as string

The ability to define entity as a keyword means that I don't need to do anything more to handle persistence concerns; even though I intend to use this as an Active Record class, it is all handled by the framework.

I do intend to allow extension using compiled code, which is what the binaries folder is there for, and you can certainly define C# classes, but the main idea here is to allow easy editing of the whole thing, which means that a compilation step is not necessarily a good thing.

So, after this explanation, let us go back a bit and talk about what deployment means in this scenario. Well, the first thing it means is that once a change is detected, you want to recompile the file and keep on moving without breaking your stride. Brail itself is a good example of this. Brail templates are instantly updated if changed, but they are usually compiled (thus supposedly faster). From the developer's perspective, there isn't any difference between the two. It works very well in practice, and the productivity boost means that it encourages the small-steps approach. All in all, I am quite happy with it.

I am going to leave the technical details aside for now, let us just say that it is equally easy to do in both source and binary form, and you can either see the webcast or check the post about each one.

There are a few things that we should be worried about, however; mainly, recompiling files all over the place will cause an assembly leak, which can have an adverse effect on memory consumption. Here we get to some interesting design decisions. Batching compilation will help greatly here, so we can grab all the controllers and compile them into a single assembly, then recompile the changed ones into separate assemblies. This is the solution used by Brail, for instance.

This seems like it can cause problems, and in theory, it will, but in practice, there are several mitigating factors:

  • During development, we can expect the application lifetime to be short enough that assembly leakage is rarely an issue; if it is, there is a small cost to restarting the application.
  • In production, we rarely expect to have a lot of churn, so we can handle the extra memory requirement (on the order of a single megabyte or so, in most cases).

More advanced scenarios call for either an AppDomain restart (the way ASP.NET does it) or a separate AppDomain that can be safely unloaded. Personally, I think that this would make the situation much harder, and I would like to avoid it if possible. The simplest solution works, in my experience.

What this all means is that a developer can go and make a change to the Account controller, hit the page and immediately see the changes. Deployment now means that we commit to the development branch, merge to the production branch, and then request that the production system update itself from source control. A form of CI process is a valid idea in this scenario, and will probably be the default approach to pushing out changes in a single-system scenario. We have to have a way to disable that, because we may want to upgrade only some of the servers at a time.

This leaves us with the following issues:

Debugging - How can we debug scripts? Especially dynamically changing scripts? Despite what some of the Ruby guys say, I want to have the best debugger that I can get. A platform with no or poor debugging support is at a distinct disadvantage.

As it turns out, the problem isn't that big. The Visual Studio debugger is a really smart one, and it is capable of handling this in most cases with aplomb. And since Boo code is compiled to IL, VS has few issues with debugging it. For the more complex scenarios, you can use C# and just direct the build output to the binaries folder.

In any case, debugging would usually involve "Attach to process" rather than "Start with debugger", but this is something that I can live with (Attach is faster anyway).

Database modifications - let us hand-wave the way we are going to handle those for a minute. We will just assume we can do something like UpdateSchema() and it will happen; we still need to think about the implications of that.

Now we need to think about how we are going to handle that when we update an entity. Do we want this update to the schema to be automatic, or do we want it to happen as a result of user input? Furthermore, changing the entity basically invalidates the ability to call the database, so how do we handle that?

Do we disable the usage of this entity until the DB is updated? Do we just let the user run into errors? What do we do in production, for that matter? I am definitely not going to let my system run with DDL permissions in production, so that is another problem.

I can give arguments for each of those, but a decision still has to be reached. Right now I think that the following set of policies would serve well enough:

  • For production, all database changes must be handled after an explicit user action. Probably by going to a page and entering the credentials of a user that can execute DDL statements on the database.
  • For development, we will support the same manual approach, but we will also have a policy to auto-update the database on entity change (sketched below).
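A sketch of that policy split; the enum and the SchemaUpdater class are inventions that just encode the two bullets above, not an existing API:

public enum SchemaUpdatePolicy
{
	ManualOnly,          // production: explicit user action, supplying DDL-capable credentials
	AutoOnEntityChange   // development: update the schema whenever an entity definition changes
}

public class SchemaUpdater
{
	private readonly SchemaUpdatePolicy policy;

	public SchemaUpdater(SchemaUpdatePolicy policy)
	{
		this.policy = policy;
	}

	public void OnEntityChanged(Type entityType)
	{
		if (policy == SchemaUpdatePolicy.AutoOnEntityChange)
			UpdateSchema(entityType);
		// otherwise we wait for an explicit request, e.g. an admin page asking for DDL credentials
	}

	private void UpdateSchema(Type entityType)
	{
		// hand-waved, exactly as in the post
	}
}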

We are still in somewhat muddy water with regard to deploying changes that affect the entire system to production, to wit, breaking database changes. I am going to discuss that in the next installment; this one got just a tad out of hand, I am afraid.

Domain Specific Language: Losing the original language

Here is an interesting question: at what point do you drift so far from the original language that you lose the benefits of an internal DSL?

The following examples are several ways to represent the same thing, going from the extreme DSL through less and less DSL down to plain C#. I think that they all have value, and none of them cross that line, but I believe that they will help make the point. The example here is a simple MonoRail controller:

#1:

controller Field:
    action Tag:
           @defaultTags = [ "green", "trees", "growth" ]

#2:

controller Field:
    def Tag():
         @defaultTags = [ "green", "trees", "growth" ]

#3:

controller Field:
    def Tag():
          PropertyBag["defaultTags"] = [ "green", "trees", "growth" ]

#4:

class FieldController(Controller):
   def Tag():
          PropertyBag["defaultTags"] = [ "green", "trees", "growth" ]

#5:

public class FieldController : Controller
{
	public void Tag()
	{
		PropertyBag["defaultTags"] = new string[] { "green", "trees", "growth" };
	}
}

Any comments?

A different UI approach

I have been doing some work with MonoRail recently, and I noticed that I am structuring the UI in a completely different way than the way I would using WebForms.

When I used WebForms, I at first tried to make significant use of the built-in controls and components, thinking that they would make my work easier. But it turned out that they added complexity instead of removing it. Then I tried to use WebForms in as pure an HTML-generation capacity as possible. I made heavy use of repeaters and similar controls, some of which I built myself.

When I just started to use MonoRail, I had much the same style, but now I find that I am doing things differently. The more recent solutions rely a lot more on generating JS for the data, and then generating the UI using JS. The somewhat primitive approach outlined below represents a spike for a new feature, where we simply wanted to see how we could get it started, but I think that it makes the point fairly obvious.

<script type="text/javascript">
	var orders = [];
	var lines = [];
<% 
	for order in orders: 
%>
	orders.push({
			id: "!{order.Id}",
			name: "!{order.Name}"
		});
<%
		for orderLine in order.OrderLines: 
%>
	lines.push({
		id: "!{orderLine.Id}",
		orderId: "!{order.Id}",
		orderName: "!{order.name}",
		cost:	"!{orderLine.Cost}",
		account: "!{account.name}" 
	});
<% 
	end 
end
%>

Event.observe(window, 'load', function(){
	for(var i = 0; i< lines.length; i++)
	{
		var line = lines[i];
		var td = $('orderlinessTR').down('td');
		var div = document.createElement("div");
		div.className = "order";
		div.tag = line;
		div.innerHTML = line.orderName;
		td.appendChild(div);
	}
	
	for(var i = 0;i<orders.length; i++)
	{
		$('ordersDDL').options.add(new Option(orders[i].name, orders[i].id));
	}

	Event.observe($('ordersDDL'), 'change', function(){
		var e = $('ordersDDL');
		var id = e.options[e.selectedIndex].value;
		var divs = $$('div.line');
		for(var i = 0; i< divs.length; i++)
		{
			if(divs[i].tag.orderId == id)
				divs[i].style.background = "red";
		}
	});
});

</script>

As you can see, we start by generating JS for the data, and then process it somehow. I considered using JSON to serialize my entities directly to JSON strings, but I am absolutely against tying the JavaScript to my domain model. An explicit translation step is much more welcome in this case, although the one above is not a very robust one, I will admit.

In fact, as a direct result of this trend, I find myself doing a lot more work that centers around creating JavaScript controls. Those are rich, incredibly easy to develop, frustrating at times (usually when I am working in Internet Explorer) and framework independent. I have completely bought into Prototype/jQuery (I really should post about that) as an underlying framework to work with, but I have yet to start working with the JS control libraries that I hear other people raving about. I made some use of script.aculo.us, but that is about it.

Am I doing something strange? Or is it just a common trend that I missed?

Geek Scripture

At first, there was the bit. And the bit shifted left, and the bit shifted right, and there was the byte. The byte grew into a word, and then into double word. And the developer saw the work and it was good. And the evening and the morning were the first day.

On the next day, the developer came back to the work and couldn’t figure out what he was thinking last night, and spent the whole day figuring that out.

A false sense of security

After reading my post about enterprise security infrastructure, Comlin asks:

Ayende, What would you say if I give you a real world scenario:
1. Server running Microsoft Windows hosts a web site which includes your security features.
2. Separate server hosts database.
Well, common thing isn't it?
We are pretty sure that the first server(it's on the Internet) would be hacked sooner or later and the hacker will acquire admin privileges. And all we have is your "entity security" inside the BL, who needs it anyway, you will take the connection string and perform all the malicious actions inside the database!
I see two ways to handle this situation: developing an app server to process queries or building security infrastructure inside DB using stored procedures or row-level security. What do you think?

Let us go through several of those iterations in turn, shall we?

First, I disagree with the implicit assumption that the server will eventually be broken into, and broken into with admin privileges. But even then... Well, getting admin access to the box doesn't mean that you get my connection string, not by any means. That connection string is DPAPI-encrypted and keyed to the user running the application. Even as admin, you have no way to get that out.
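For the curious, this is roughly what "DPAPI-encrypted and keyed to the user" looks like with System.Security.Cryptography.ProtectedData; a sketch that assumes connectionString holds the plain text, with entropy and error handling omitted:

byte[] plain = Encoding.UTF8.GetBytes(connectionString);

// CurrentUser scope: only code running under the same Windows account can decrypt this blob
byte[] encrypted = ProtectedData.Protect(plain, null, DataProtectionScope.CurrentUser);

// ...later, inside the application, running as that same account:
byte[] decrypted = ProtectedData.Unprotect(encrypted, null, DataProtectionScope.CurrentUser);
string recovered = Encoding.UTF8.GetString(decrypted);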

Nevertheless, let us assume that a nefarious person did get their hands on a connection string to my database, one that had read and write access. Well, what can they do?

Well, they can do something like "UPDATE Sales SET TotalSales = TotalSales * 100 WHERE UserId = @myUserId" and get a hefty commission all of a sudden. But is there really a way to stop that at this point?

It seems to me that at that point, anything more is basically pointless; it is closing the doors with the fox already in the coop. Let us consider the idea of using stored procedures, shall we? What extra security measure do we get from that? The user is already in possession of a connection string with privileges to do things in the database. They can now do things like "while @count < 100 exec AddSaleTo @myUserId".

Oh, you solve that by using Windows authentication all the way to the database, and validating permissions for the logged-on user? Well, to do that, the web server must also be marked as trusted for delegation. Which means that if the "hacked" server tells the DB that it is Super Mario, the DB will believe it.

Row-level security is bad for the same reasons as before, but it has an additional twist. Doing security at the database level utterly excludes the option of caching data. That is bad, really bad.

image

 

Then you have the idea of an application server in the middle. Now we have a bit more structure, and we may even be able to do some business logic validation on the incoming requests, but, and this is the critical part, the application server has no way to know that the web server was hacked. As far as it is concerned, the requests now being made to it are valid ones, and it will try to honor the thousands of AddSaleToUserMessage requests coming its way.

Now, you could structure your application in such a way that nothing trusts anything else, but that is not a good way to structure an application. To be rather more exact, the cost of not trusting the inner circle is very high, both in terms of the effort involved and as a result of the design, which would be awkward to work with long term.

Now, the diagram here is fairly typical, I would assume; in fact, this is how OS security is built, by having a trusted kernel. The kernel cannot protect itself from itself. That is why rootkits are possible.

And, as rootkits demonstrate, it contains a very problematic flaw: you cannot ensure that all the nodes will remain trustworthy. And when one of them turns red, it is endgame.

A better option would be to go with the firewall approach, with the external application explicitly not trusted by the internal one, and all communication inspected for suspicious activity.

 

You can picture it a little like the diagram to the right. This is another typical architecture design. This time we are going the application server route, with a firewall (or an application firewall) in the middle.

That point of yellow is an inspection port; in addition to the usual checks a firewall will make, this inspection port will try to analyze the request with more domain knowledge. It can then trigger alarms or deny access if it suspects that the external application is behaving strangely. Things that might trigger it are malformed business requests, too much focus on a single item (such as trying to add sales to a single salesperson), etc.
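A sketch of the kind of domain-aware check such an inspection port might run; the message type, the threshold and the shape of the rule are all invented for illustration:

public class AddSaleInspectionRule
{
	private const int SuspiciousThreshold = 50; // per inspection window, say
	private readonly Dictionary<int, int> salesPerSalesperson = new Dictionary<int, int>();

	// returns false to deny the request and raise an alarm
	public bool Inspect(AddSaleToUserMessage message)
	{
		int seenSoFar;
		salesPerSalesperson.TryGetValue(message.SalespersonId, out seenSoFar);
		seenSoFar++;
		salesPerSalesperson[message.SalespersonId] = seenSoFar;

		// too much focus on a single salesperson looks like the attack described above
		return seenSoFar <= SuspiciousThreshold;
	}
}

public class AddSaleToUserMessage
{
	public int SalespersonId { get; set; }
}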

Then again, something else that might trigger it could be a data entry clerk who types really fast, or simply a sale on the site that means a lot of people are suddenly buying this one item, etc.

Something that I would definitely be interested to have is an audit trail, so I can revert changes if needed, or at least follow their logic.

In the end, there really isn't one good way to secure an application; it requires cooperation from developers, IT, networking, etc. If there were, everyone would be using it. And while I do believe in defense in depth, I also believe that once the king is taken, the game is over. Starting from the premise that the attacker has gained admin control over one of your machines is not a position that you want to be in.

A vision of enterprise platform: Security Infrastructure

I have been asked how I would design a security infrastructure for my vision of an enterprise platform, and here is an initial draft of the ideas.

As with anything in this series, no actual code was written to build this. What I am doing is going through the steps that I would usually go through before I actually sit down and implement something.

While most systems go for the Users & Roles metaphor, I have found that this is rarely a sufficient approach in real enterprise scenarios. You often want to do more than just users & roles, such as granting and revoking permissions for individuals, business-logic-based permissions, etc.

What are the requirements for this kind of an infrastructure?

  • Performant
  • Human understandable
  • Flexible
  • Ability to specify permissions using the following scheme:
    • On a
      • Group
      • Individual users
    • Based on
      • Entity Type
      • Specific Entity
      • Entity group

Let us give a few scenarios and then go over how we are going to solve them, shall we?

  1. A helpdesk representative can view account data, but cannot edit it. The helpdesk representative also cannot view the account's projected revenue.
  2. Only managers can handle accounts marked as "Special Care"
  3. A team leader can handle all the cases handled by members of the team; team members can handle only their own cases.

The security infrastructure revolves around this interface:

image

The purpose of IsAllowed should be clear, I believe, but let us talk a bit about the AddPermissionsToQuery part, shall we?

Once upon a time, I built a system that had a Security Service, that being a separate system running on a different machine. That meant that in order to find out if the user had permission to perform some action, I had to send the security service the entity type, id and the requested operation. This worked, but it was problematic when we wanted to display more than a single entity at a time to the user. Because the system was external, we couldn't involve it in the query directly, which meant that we had to send the entire result set to the external service for filtering. Beyond the performance issue, there is another big problem: we had no way to reliably perform paged queries, since the service could decide to chop off 50% of the returned results, and we would need to compensate for that somehow. That wasn't fun, let me tell you that.

So, for the next application that I built, I used a different approach. Instead of an external security service, I had an internal one, and I could send all my queries through it. The security service would enhance the query so permissions would be observed, and everything just worked. It was very good to observe. In that case, we had a lot of methods that did it, because we had a custom security infrastructure. In this case, I think we can get away with a single AddPermissionsToQuery method, since the security infrastructure in place is standardized.

Now, why do we have a Why method there? Pretty strange method, that one, no?

Well, yes, it is. But this is also something that came up through painful experience. In any security system of significant complexity, you will have to ask yourself questions such as: "Why does this user see this information?" and "Why can't I see this information?"

I remember once getting a Priority Bug that some users were not seeing information that they should see, and I sat there and looked at it, and couldn't figure out how they got to that point. After we gave up on understanding it on our own, we started debugging it, and we reproduced the "error" on our machines. After stepping through it ten or twenty times, it suddenly hit me: the system was doing exactly what it was supposed to do. I stepped over the line that did it in each and every one of the times that I debugged it, but I never noticed it.

You really want transparency in such a system, because "Access Denied" is about the second most annoying error to debug if the system gives you no further information.
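
Since the interface only appears here as an image, here is roughly how I picture it, as a sketch. The method names are the ones discussed above; the interface name, parameter order and return types (and using NHibernate's ICriteria as the query object) are my current guesses, not a committed contract:

// Sketch only; Operation, User and Entity are the domain types from the
// table model shown below, and ICriteria is NHibernate's query interface.
public interface IAuthorizationService
{
    // Entity-based check: does 'user' have 'operation' on 'entity'?
    bool IsAllowed(Operation operation, User user, Entity entity);

    // Feature-based check: does 'user' have 'operation' at all?
    bool IsAllowed(Operation operation, User user);

    // Enhances the query so that only permitted entities come back, which
    // keeps paging and counts correct instead of filtering after the fact.
    void AddPermissionsToQuery(Operation operation, User user, ICriteria query);

    // Explains the decision, for the inevitable "why can't I see this?" bug.
    string Why(Operation operation, User user, Entity entity);
}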

Now, I am going to show you the table structure. This is not set in stone, and don't try to read too much into seeing a table model here; it simply makes it easier to follow the connections than a class diagram would.

image

Let us go over some of the concepts that we have here, shall we?

Users & Groups should be immediately obvious, so let us focus for a moment on the Operations and Permissions. What is an operation? An operation is an action that can happen in the application. Examples of operations are:

  • Account.View
  • Account.Edit
  • Account.ProjectedRevenue.View
  • Account.ProjectedRevenue.Edit
  • Account.Assign
  • Account.SendEmail

As you can see, we have a fairly simple convention here: [Entity].[Action] and [Entity].[Field].[Action]. This allows me to specify granular permissions in a very easy to grok fashion. The above mentioned operations are entity-based operations; they operate on a single entity instance at a time. We also have feature-based operations, such as:

  • Features.HelpDesk
  • Features.CustomerPortal

Those operate without an object to verify against, and are a way to turn on/off permissions for an entire section of the application. Since some operations are naturally grouped together, we also have relations between operations, so we will have the "Account" operation, which will include "Account.Edit" and "Account.View" as children. If you are granted the Account operation on an entity, you automatically get "Account.Edit" and "Account.View" on that entity as well.

This makes the design somewhat more awkward, because now we need to go through two levels of operations to find the correct one, but it is not a big deal, since we are going to be smart about how we do it.
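
Being smart about it mostly means expanding the requested operation into its ancestors up front, rather than walking the operation relations at check time. Here is a minimal sketch that cheats by using the naming convention instead of the stored parent/child relation (the class and method names are mine):

using System.Collections.Generic;

public static class OperationNames
{
    // "Account.ProjectedRevenue.View" yields itself, then "Account.ProjectedRevenue",
    // then "Account", so a grant at any level up the chain can satisfy the check.
    public static IEnumerable<string> Expand(string operationName)
    {
        string current = operationName;
        while (true)
        {
            yield return current;
            int lastDot = current.LastIndexOf('.');
            if (lastDot < 0)
                yield break;
            current = current.Substring(0, lastDot);
        }
    }
}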

Permissions are the set of allow / revoke rules for an operation on an EntitySecurityKey (which will be explained shortly), associated with a Group, User or EntityGroup.

A simple example may be something like:

  • For User "Ayende", Allow "Account" on the "Account Entity" EntitySecurityKey, Importance 1
  • For Group "Managers", Revoke "Case.Edit" on "Case Entity" EntitySecurityKey, Importance 1
  • For Group "Users", Revoke "Account.Edit" on "Important Accounts Entity Group" EntitySecurityKey, Importance 1
  • For Group "Managers", Allow "Account.Edit" on "Important Accounts Entity Group" EntitySecurityKey, Importance 10
  • For User "Bob from Northwind", Revoke "Account" on "Northwind Account"  EntitySecurityKey, Importance 1

The algorithm for IsAllowed(account, "Account.Edit", user) is something like this: get all the operations relevant to the current entity, default to denying access, then check the operations. A Revoke operation gets a +1, so it is more important than an Allow operation at the same importance level. Or, in pseudo code (i.e., it doesn't really handle all the complexity involved):

bool isAllowed = false;
int isAllowedImportance = 0;
foreach(Operation operation in GetAllOperationsForUser(user, operationName, entity.EntitySecurityKey))
{
	int importance = operation.Importance;
	if(operation.Allow == false)
		importance += 1; // a Revoke outranks an Allow of the same importance
	if(isAllowedImportance < importance)
	{
		isAllowed = operation.Allow;
		isAllowedImportance = importance;
	}
}
return isAllowed;

As you have probably noticed already, we have the notion of an Entity Security Key. What is that?

Well, when you define an entity, you also need to define its default security; this way, you can specify who can view and edit it. Then, when we create an entity, its EntitySecurityKey is copied from the default one. If we want to set special permissions on a specific entity, we will create a copy of all the current permissions on the entity type, and then edit that, under a different EntitySecurityKey, which is related to its parent.

All the operations in the child EntitySecurityKey are automatically more important than the ones in the parent EntitySecurityKey, regardless of the importance score that the parent operations have.

In addition to all of that, we also have the concept of an EntityGroup to consider. Permissions can be granted and revoked on an Entity Group, and those apply to all the entities that are members of this group. This way, business logic that touches permissions doesn't need to be scattered all over the place; when a state change affects the permissions on an entity, the entity is added to or removed from an entity group, which has well known operations defined on it.
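
To tie this back to scenario 2 above: when an account is marked as "Special Care", the state-change code simply moves it into the right entity group, and the permissions already defined on that group decide who can handle it from then on. A hypothetical sketch, with an invented group name and repository interface:

// Hypothetical: a state change only touches entity group membership,
// never individual permission rows.
public void MarkAsSpecialCare(Account account, IEntityGroupRepository entityGroups)
{
    account.SpecialCare = true;

    EntityGroup specialCareAccounts = entityGroups.GetByName("Special Care Accounts");
    specialCareAccounts.Add(account);

    // Defined once, elsewhere:
    //   For Group "Managers", Allow  "Account" on "Special Care Accounts", Importance 10
    //   For Group "Users",    Revoke "Account" on "Special Care Accounts", Importance 1
    // Membership alone changes what IsAllowed returns for this account.
}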

Now that you probably understand the overall idea, let us talk about what problems we have with this approach.

Performance

The security scheme is complex, and off the top of my head, given all the variables, I can't really think of a single query that will answer it for me. The solution for that, as in all things, is not to solve the complex problem, but to break it down into easier problems.

The first thing that we want to consider is what kind of questions we are asking the security system. Right now, I am thinking that the IsAllowed method should have the following signatures:

public bool IsAllowed(Operation, User, Entity);
public bool IsAllowed(Operation, User);

This means that the questions that we will always ask are "Does 'User' have 'Operation' on 'Entity'?" and "Does 'User' have 'Operation'?". The latter is applicable to feature-based operations only, of course.

So, given that these are the questions we have, how can we answer them efficiently? Let us try to take the above mentioned table structure and de-normalize it to make queries more efficient. My first attempt is this:

image

This allows you to very easily query by the above semantics, and get all the required information in a single go.

A lot of the rules that I have previously mentioned will already be calculated in advance when we write to this table, so we have a far simpler scenario when we come to check the actual permissions.

For instance, the EntitySecurityKey that we send is always the one on the entity, so the DenormalizedPermissions table will always have the permissions from the parent EntitySecurityKey copied over, with pre-calculated values.

Since everything is based around the EntitySecurityKey, we also have a very simple time when it comes to updating this table.

All we need to do is rebuild the permissions for that particular EntitySecurityKey.

This makes things much easier, all around.

 

Querying

What this means, in turn, is that we have the following query to issue when we come to check permissions:

SELECT dp.Allow, dp.Importance FROM DenormalizedPermission dp
WHERE  dp.EntitySecurityKey = :EntitySecurityKey
AND    dp.Operation = :Operation
AND    ( dp.User = :User
         OR dp.Group IN (@UserGroups)
         OR dp.EntityGroup IN (@EntityGroups) )

All we need to do before the query is find out all the groups that the user belongs to, directly or indirectly, and all the Entity Groups that the entity belongs to.
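
Resolving those groups is the only recursive part, and it happens before the query rather than inside it. A sketch, assuming the user exposes its direct groups and each group knows the groups that contain it (both of which are assumptions about the model, not a given):

using System.Collections.Generic;

public static class GroupResolver
{
    // Collects a user's groups, including groups of groups, so the permission
    // query can use a simple IN clause.
    public static ICollection<Group> GetAllGroupsFor(User user)
    {
        var result = new HashSet<Group>();
        var pending = new Stack<Group>(user.Groups);
        while (pending.Count > 0)
        {
            Group group = pending.Pop();
            if (result.Add(group) == false)
                continue; // already visited; also protects against cycles
            foreach (Group parent in group.ParentGroups)
                pending.Push(parent);
        }
        return result;
    }
}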

When it comes down to checking a feature-based operation, we can issue the same query, sans the EntitySecurityKey, and we are done.

Another important consideration is the ability to cache this sort of query. Since we will probably issue a lot of those, and since we are probably also going to want an immediate response to changes in security, caching is important, and a write-through caching layer can do wonders for keeping this optimized.
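
By write-through I mean something along these lines: cache the answers keyed by EntitySecurityKey, and have the same code path that rebuilds the DenormalizedPermissions rows for a key evict that key's cached answers, so a security change shows up on the very next check. The class below is only an illustration (it assumes the key is a Guid and is not thread safe):

using System;
using System.Collections.Generic;

public class PermissionCache
{
    // Answers for one EntitySecurityKey, keyed by "user/operation".
    private readonly Dictionary<Guid, Dictionary<string, bool>> answersByKey =
        new Dictionary<Guid, Dictionary<string, bool>>();

    public bool? TryGet(Guid entitySecurityKey, string user, string operation)
    {
        Dictionary<string, bool> answers;
        bool allowed;
        if (answersByKey.TryGetValue(entitySecurityKey, out answers) &&
            answers.TryGetValue(user + "/" + operation, out allowed))
            return allowed;
        return null;
    }

    public void Store(Guid entitySecurityKey, string user, string operation, bool allowed)
    {
        Dictionary<string, bool> answers;
        if (answersByKey.TryGetValue(entitySecurityKey, out answers) == false)
            answersByKey[entitySecurityKey] = answers = new Dictionary<string, bool>();
        answers[user + "/" + operation] = allowed;
    }

    // Called right after the DenormalizedPermissions rows for this key are rebuilt.
    public void InvalidateKey(Guid entitySecurityKey)
    {
        answersByKey.Remove(entitySecurityKey);
    }
}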

What is missing

Just to note: this is not complete. I can think of several scenarios that this has no answer for, from "the owner can do things others cannot" to granting permissions when the organization unit of the entity and the user is the same. However, adding those is fairly easy to build within the system. All we need to do is define an action that would add the owner's permissions explicitly to the entity, and remove them when the owner changes. The same can be done for entities in an organization unit: you would have the group of users in Organization Unit Foo and the Entity Group of entities in Organization Unit Foo, which will have a permission set for that group.

Final thoughts

This turned out to be quite a bit longer than anticipated. I am waiting expectantly for you, dear reader, to tell me how grossly off I am.

Next topics:

  • Hot deployments and distributed deployments
  • A database that doesn't make you cry
  • Supporting upgrades
  • Platform I/O - integration with the rest of the enterprise

A vision of enterprise platform

A while ago I asked what kind of constraints an enterprise platform should deal with, and I have been thinking about it ever since. Let us go over the list of constraints that have been brought up, and then I am going to discuss how I think about solving them.

  • Extensible in an easy manner - note that this holds for business analysts and for developers; both are groups that are likely to do work on the system. Ideally we can have some sort of a common interface that would make both groups happy.
    • New entities
    • User Interface:
      • Forms
      • UI elements
      • Editing existing forms
    • Replacing core services
  • Upgradable - We want to allow the users to move from version 2.0 to 3.0 without having to re-write everything. This means that we need to make clear what we allow the user to do to the system if they want to have a successful upgrade.
  • Auditable - I don't suppose that I have to explain why, right?
  • Performant - well, it should be. Considering the other contestants, that could be a major selling point.
  • Scalable - I want to be able to scale wide, so the CRM can scale as I add more servers.
  • IT friendly - expose state in a way that makes sense to the admin, allows monitoring and tracing easily, dashboard, interactive console.
  • Scriptable
  • Security - both for features and for the data
  • Data integration - allow to easily get and extract data from the system, including in bulk.
  • Hostable - can be run in a datacenter, can have multiple instances on a single machine securely
  • Easily Migratable - Data/Plugins/Code needs to be easily migratable/promotable from Development to Test to Production
  • Easily Comparable - I know my Production and Test environment are exactly the same (except for the data)... aren't they?
  • Exportable/Importable - I'll back it up to a file, import it onto a new server. Change a few settings like hostname and connection strings and have a clone site for whatever nefarious purposes.
  • Xcopy Deployable  - Well, as much as possible anyway
  • Easy Backup/Restore process
  • Customizable layout

I should point out, again, that this is a theoretical exercise, but I am willing to stand by the assertion that I am making here: everything is both possible and not too hard.

In addition to the above mentioned constraints, there is also my post about Evaluating a Business Platform. The list of requirements from that post has some duplication with the constraints, naturally, but they also deserve a mention:

  • Source Control - should be easy, simple and painless.
  • Ease of deployment
  • Debuggable - easily
  • Testable - easily
  • Automation of deployment
  • Separation of Concerns
  • Don't Repeat Yourself
  • Doesn't shoot me in the foot
  • Make sense - that is hard to explain, but it should be obvious what is going on there
  • Possible to extend - hacks are not something that I enjoy doing

I am going to just go through the solution I have in mind, and point out how it answers the constraints above.

First of all, we are talking about a web application, which is much easier than a client/server system, although that is an option as well. As such, I am going to use my default architecture here. This means that in terms of technology, we are going to be based on NHibernate and ActiveRecord for data access, MonoRail for a web framework, Windsor as a core concept and Binsor to configure it.

Why this stack? Because it is familiar to me, performant, extensible, easy to scale from a simple demo app to complex enterprise applications, and it promotes good practices.

The main concepts are change management (source control), distributed deployment and a tiny core that can be extended upon by the users. The core handles such things as managing the application state, displaying the UI, etc. Everything else, including all the core services, is handled via extensions. From a deployment perspective, here is what I envision:

image One of the more important ideas is the ability to run multiple instances of the application on the same machine.

This ensures that it is a developer friendly application, not one that attempts to take over the entire machine.

The diagram paints the extreme case; obviously we can collapse all of those onto the same machine.

You will note that caching is built directly into the deployment design; this is because I intend to make heavy use of it to make the application scalable.

Another interesting thing to note is the SCM server as an underlying concept. Deployment in this application is done by committing a change to the /production branch.

This gives you some nice benefits out of the box: auditing of all the changes in the application, for one, as well as a very easy way to push changes to production, using well known and debugged tools. It also means that it is trivial to work in a team and that branching is not an issue.

The async server is there to handle long running tasks, from batch processes to workflows to ETL processes.

Each application gets two databases (probably more, as a matter of fact; I skipped a logging & auditing DB, which I like to keep separate). One is for the application to manage itself, the second is for the actual data that you put inside.

I am not sure about the read only copies, though. Probably this is something that is better off handled by the database itself, using a master/slave arrangement.

I have started from the deployment diagram on purpose, because it is a good overview of the concepts involved. Source control and auditing are checked. Scalability through throwing cheap machines at it, checked. There may be some issues around the database, but this is why I introduced caching at this level already. Hostable is also checked; we can run several of those on a single machine. Easy migration is also checked.

XCopy deployment... well, that is a bitch in such a setup, but for a new machine, it should involve getting the code there and running deploy.cmd, not a messy setup process. Backup & restore: we basically need to back up the SCM and the DB, nothing more, so that shouldn't be a problem.

Let us talk a bit about what I mean by putting the SCM in the deployment diagram, okay? Here is the structure that I have in mind right now, but remember that this is just speculation at the moment.

image

What is the point in this?

This layout pretty much mirrors the main extension points that I have for the application. I am still undecided about this structure, for several reasons, but let us go through how it works in order to explain what problems I have with it.

The views, controllers and entities are pretty much self explanatory in their purpose, if not their implementation. As I mentioned, except for the core, everything is an extension. Let us say that I want to go and define the Account entity. A screen will be provided by the core application for creating and editing entities. This screen will provide UI for defining a new entity, defining new fields and defining their UI characteristics.

This screen will be responsible for generating three files: /entities/Account.boo, /controllers/AccountController.boo and /views/Account/edit.brail.

These files represent the sum total of information that the system needs to know about an entity. They are also versioned files, and more importantly, they are very malleable to user modifications.

If the developer prefers to use C#, they can build a project which will contain the Account and AccountController classes, and place the resulting assembly in the /binaries folder. In both cases, the application will recognize and load the new code on the fly. This is critically important to ensuring developer productivity and happiness, I have found. If external projects are used, they should be stored in the /src folder.

Obviously, both the boo files and the compiled binaries can also use other assemblies that are located in the binaries folder.
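
As for how "load the new code on the fly" might work for the /binaries folder, here is a minimal sketch using a FileSystemWatcher. It deliberately ignores the hard parts: a real implementation needs shadow copying or separate AppDomains so the files aren't locked and old versions can be retired:

using System;
using System.IO;
using System.Reflection;

public class BinariesWatcher
{
    private readonly FileSystemWatcher watcher;

    public event Action<Assembly> ExtensionLoaded;

    public BinariesWatcher(string binariesFolder)
    {
        watcher = new FileSystemWatcher(binariesFolder, "*.dll");
        watcher.Created += (sender, e) => Load(e.FullPath);
        watcher.Changed += (sender, e) => Load(e.FullPath);
        watcher.EnableRaisingEvents = true;
    }

    private void Load(string path)
    {
        Assembly assembly = Assembly.LoadFrom(path);
        // Here the container would be asked to register any controllers and
        // entities found in the newly loaded assembly.
        if (ExtensionLoaded != null)
            ExtensionLoaded(assembly);
    }
}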

The usage of explicit controllers means that you are free to put whatever logic you want into the process, as well as modify the view to contain additional UI elements that you can use to express logic better. Persistence is easy, because using NHibernate and ActiveRecord makes it trivial, but this also points to an important issue: we can now access the data from the CRM in a very easy fashion, using all the power that NHibernate gives us, which is far from trivial.

Now, what goes into the /tasks and /workflows folders?

Tasks are scheduled tasks, occurring at specified intervals to execute some sort of a business operation (canceling all outstanding orders that have been on hold for more than 30 days, for instance). Workflows are just that, a way to specify actions to happen in a more global fashion, using a DSL. A trivial example may be:

on case:
   when case.IsSolved == false and case.DueDate < CurrentTime:
        raiseAlert "Should have handled ${case.Id} by ${case.DueDate}, but the case is still ${case.Status}."

This is also a way for a business user to extend the system. A UI layer for that is probably also in order.

The /config folder contains the configuration of the application, basically the Binsor configuration and maybe the web.config files. Using that, you can override the default services of the CRM with your own.

One of the interesting challenges in this system is the need to modify existing tables on the fly. I can think of at least three solutions: build the table sync logic myself, use Hibernate's schema update support, or do a create temp table, copy, drop original and rename (like SQLWB does). No comment on that yet.
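
For the record, the "create temp, copy, drop original and rename" route looks something like the statements below, spelled out for an imaginary change that adds a nullable ProjectedRevenue column to Accounts. The table and column names are placeholders, and the whole dance wants to run inside a transaction where the database allows DDL in one:

// Placeholder table and column names; the point is the shape of the dance.
string[] steps =
{
    "CREATE TABLE Accounts_Tmp (Id UNIQUEIDENTIFIER NOT NULL PRIMARY KEY, " +
        "Name NVARCHAR(255) NOT NULL, ProjectedRevenue DECIMAL(18,2) NULL)",
    "INSERT INTO Accounts_Tmp (Id, Name) SELECT Id, Name FROM Accounts",
    "DROP TABLE Accounts",
    "EXEC sp_rename 'Accounts_Tmp', 'Accounts'"
};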

Security - Role based security is a common theme, but what is needed is actually feature, row and field level security, with the additional requirement of being able to specify permissions on roles, nested roles and users. Care should be taken to ensure that we can replace that with other security options, such as limiting a user's ability to submit questions to products the user has previously bought.

Authentication - we probably want mixed mode authentication, with the option to define users that are not in Active Directory.

Full text indexing of everything - just something to toy with as well; push Lucene for everything there. There should be no reason that I couldn't search for customer:3424 and get it.

I got some nifty ideas about how to implement the security infrastructure, but it is 05:21 AM already, and I am not sure why I am still awake.

Let me know what you think.

An apt description of ALT.Net

Charlie Poole, on the ALT.Net mailing list:

So one reaction I had to Alt.NET was that it was a group of folks who don't do stupid things: sort of like forming a club for people who don't play in traffic or don't juggle sharp instruments.
Oddly, as others have pointed out to me, such a group is actually needed in the .NET world.

ROTFL.

Ignorance vs. Negligence

Recently there have been a few posts about people who... misrepresent themselves when it comes to knowledge and authority. A common question that came up in the comments on those posts was to specify who the guilty party is, if not by name, then by description.

First, I want to be clear about the separation I am making. I am seeing a lot of bad code, and I have posted some choice tidbits here in the past. Some of it is inexcusable for a software professional; some of it is valid, because the author is not a developer.

I don't have any issue with bad code coming from people who are not developers (but if you call yourself an architect and you can't code, get thyself away and ponder your sin). I can either work with them to make sure that they will become developers, or limit the amount of work they need to do on the system.

What drives me into sputtering rages is seeing so called professional developers and architects who couldn't find a clue if I hit them with it. Some random stories, all featuring professional developers and architects, to make sure that I am clear about who I am talking about.

  • Senior Java Architect calls a high level meeting to discuss my "unreasonable behavior": I am not willing to give him access to the application database. My objection to the needless coupling is "unreasonable, and there is no other way to do it". My suggestion that I'll expose a web service that will be called from his application is "obviously impossible" because "Java can't consume web services from .NET" and anyway "Java doesn't support web services"
  • The Project Manager whose stated goal is "have very stupid code, we want to hire stupid people, they are cheap" for a business critical application.
  • A team that suggested "hot new tech" that they "have several experts on" that can "make miracles with it" which turns out to be a single guy that read a blog post about it and "really we intend to use this vital project to learn this technology"
  • Senior Developer who designed a three tier system for serving a page. A request arrives, the page puts a message in a queue and then does a synchronous receive from another queue. Another machine processes the requests. The message is an SQL statement, with no parameters allowed. There is no concurrency control in the app, so two pages making two totally different requests can get each other's results. I couldn't follow the reasoning for the decision, but apparently it had something to do with performance.
  • Developer who built the "thread per row" batch process, and actually had a bigger server bought to handle the "big application" that processed "lot of business critical data (15,000 rows)"
  • The Senior Consultant who took the code with him on a USB key whenever he left home, as a "theft precaution"
  • The consulting company that we had to go and fix things after, where all the tables were named "a1" to "a46" and the columns "b5" to "b91", an explicit design decision to keep them in place.
  • The Expert who had a business buy a $40,000 BizTalk solution to copy one directory to another. We ended up replacing it with a scheduled task that executed "robocopy \\server\dir \\anotherserver\dir" every 5 minutes.
  • Whoever it was that designed MS CRM

I could go on, but this is a good representation of the types, I think.

They all have a few things in common: they represent themselves as experts, senior, knowledgeable people. In all those cases, they actively acted to harm the business they were working for, by action, inaction or misaction.

I have no issue with people not knowing any better, but I do expect people who ought to know better to... actually know better.

Don't use MS CRM! It is not worth the ulcer you will get!

Yesterday I reached the breaking point with regards to MS CRM.

We needed to push a build to staging, so the customer could start acceptance testing. The way we built it, we have a central CRM development server, and each developer is working mostly locally, but against the same server. I don't like it, but the CRM doesn't really give me much choice.

At any rate, we started the move by trying to move the customizations from the development server to the production server. Naturally, there is no way to diff things, so I have no idea what changed since we last went to production.

The import process failed with some SQL error. There is a reason that I use the term "some SQL error"; that is the entire information that I have about this error. Checking the event log, trace files, etc. reveals no further information. That went beyond annoying. We ended up having to import entity by entity, and merge by hand the entities that failed to import successfully.

As you can imagine, that took: (a) time, (b) aggravation.

Both the development and production machines have the same version installed, and it is not as if I was trying to do something out of the ordinary; it is supposed to be a common operation.

I have had enough of this lapsus alumni platform. I officially declare it a nightmare platform.

MS CRM - don't bother even trying, you will thank me for it.