Ayende @ Rahien

Fluent Pipelines

I am having a discussion with Jon Skeet about the merits of using Linq for pipelines, and delegates/lambdas instead of classes.

I kept saying that I don't really see the point, so I went ahead and implemented this:

GenerateData(10000)//enumerator from 0 .. 10000
	.Where(i => i % 3 == 0)
	.Transform(i => (i * 2).ToString() )
	.Act(i => Console.WriteLine(i))
	.Execute();

This uses the same approach as my previous pipeline, but it does so in C# 3.0, so it can use features such as extension methods, which make it nicer. The same is possible in C# 2.0, but it takes a ridiculous amount of code.
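For readers who want to see what sits behind a chain like the one above, here is a minimal sketch of how such extension methods could be written. The method names come from the snippet, but the bodies are assumptions rather than the implementation used for this post. Where is simply LINQ's own operator, and GenerateData is assumed to stop just short of its argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers: names follow the snippet above, bodies are guesses.
public static class PipelineExtensions
{
    // Assumed behaviour: yields 0, 1, ..., count - 1.
    public static IEnumerable<int> GenerateData(int count)
    {
        for (int i = 0; i < count; i++)
            yield return i;
    }

    // Lazily maps each item to a new value (the same idea as LINQ's Select).
    public static IEnumerable<TOut> Transform<TIn, TOut>(
        this IEnumerable<TIn> source, Func<TIn, TOut> selector)
    {
        foreach (TIn item in source)
            yield return selector(item);
    }

    // Lazily runs a side effect on each item, then passes the item along.
    public static IEnumerable<T> Act<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }

    // Drains the sequence; nothing upstream runs until this is called.
    public static void Execute<T>(this IEnumerable<T> source)
    {
        foreach (T item in source) { }
    }
}
```

Because every stage is an iterator, the whole chain stays lazy and only does work once Execute pulls items through it.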

This code is much simpler than the code I have shown here, no?

Why do I use the first approach then?

Scalability.

What we are seeing here is about as trivial as it can get. What happens when we have more complex semantics?

Let us take writing to the database as an example. We need to do quite a bit there, certainly more than we would put in a lambda expression. We can extract it to a method, but then we run into another problem: we can't do method inheritance. This means that I have no easy way of abstracting the common parts out. Well, I can use the template method pattern, but that works if and only if there is a single place where I want to change behavior.

As an example of scaling, let us take this piece of code:

public class FibonacciBulkInsert : SqlBulkInsertOperation
{
	public FibonacciBulkInsert() : base("test", "Fibonacci")
	{
	}

	protected override void PrepareSchema()
	{
		Schema["id"] = typeof (int);
	}
}

Which uses this base class to handle the bulk of the work.
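To make the division of labour concrete, here is a rough sketch of the shape such a base class could take. It is not the real SqlBulkInsertOperation: BulkInsertSketch, FibonacciSketch, Target and the row type are all made-up names, and the database write is replaced by a simple schema check so the example runs anywhere.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a bulk-insert base class; not the real implementation.
public abstract class BulkInsertSketch
{
    private readonly string connectionName;
    private readonly string tableName;
    protected readonly IDictionary<string, Type> Schema = new Dictionary<string, Type>();

    protected BulkInsertSketch(string connectionName, string tableName)
    {
        this.connectionName = connectionName;
        this.tableName = tableName;
    }

    public string Target
    {
        get { return connectionName + "." + tableName; }
    }

    // The one point each subclass customises.
    protected abstract void PrepareSchema();

    // Shared work lives here. A real operation would write to the database;
    // this sketch only checks each row against the schema and counts it.
    public int Execute(IEnumerable<IDictionary<string, object>> rows)
    {
        PrepareSchema();
        int accepted = 0;
        foreach (var row in rows)
        {
            foreach (var column in Schema)
            {
                object value;
                if (!row.TryGetValue(column.Key, out value) || !column.Value.IsInstanceOfType(value))
                    throw new InvalidOperationException(
                        "Row does not match the schema of " + Target + ": " + column.Key);
            }
            accepted++;
        }
        return accepted;
    }
}

public class FibonacciSketch : BulkInsertSketch
{
    public FibonacciSketch() : base("test", "Fibonacci") { }

    protected override void PrepareSchema()
    {
        Schema["id"] = typeof(int);
    }
}
```

The subclass stays as small as the one above because everything that is common to all bulk inserts sits in the base class.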

One interesting thing Jon mentioned was the ability to take advantage of Linq-specific improvements, such as PLinq. This is indeed a consideration, but upon reflection I realized that the two are not mutually exclusive. If I want to take advantage of any of that, all I need to do is modify the pipeline to iterate using PLinq rather than foreach.
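As a sketch of what that swap might look like, the two methods below drain a sequence sequentially and in parallel. ParallelPipelineSketch and its method names are hypothetical. The parallel version relies on the AsParallel and ForAll methods from System.Linq, and assumes the action is safe to call from several threads at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper showing a foreach drain next to a PLINQ drain.
public static class ParallelPipelineSketch
{
    // Sequential drain: items are processed one at a time, in order.
    public static void Execute<T>(IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
            action(item);
    }

    // Parallel drain: the same contract, but the action may run
    // concurrently and in any order, so it must be thread-safe.
    public static void ExecuteParallel<T>(IEnumerable<T> source, Action<T> action)
    {
        source.AsParallel().ForAll(action);
    }
}
```

The rest of the pipeline does not need to know which drain is used, which is why the two approaches are not mutually exclusive.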

Comments

01/07/2008 09:02 AM by Jon Skeet

You correctly mentioned the idea of a DelegateOperation in the original discussion, as a way of using delegates when you want to (although it means even simple things end up having a lot of fluff around them).

The same is true the other way round here - you can still have your SqlBulkInsertOperation and your FibonacciBulkInsert, with a public method with the appropriate signature, and either create a delegate directly from the relevant method or use a lambda expression to call it to avoid the cast. (I can now see how the bug around output type inference could be a problem...) So, four examples of code:

Example 1:

// (Not good if creating a new FibonacciBulkInsert takes time)
var pipeline = ...
	.ForEach(item => new FibonacciBulkInsert().Action(item));

Example 2:

SqlBulkInsertOperation inserter = new FibonacciBulkInsert();
var pipeline = ...
	.ForEach(item => inserter.Action(item));

Example 3:

var pipeline = ...
	.ForEach((Action<Order>) new FibonacciBulkInsert().Action);

Example 4:

Action<Order> insertAction = new FibonacciBulkInsert().Action;
var pipeline = ...
	.ForEach(insertAction);

If output type inference worked, of course, we could just do:

var pipeline = ...
	.ForEach(new FibonacciBulkInsert().Action);

I can't remember offhand whether you can define implicit conversions from a class to a delegate type. If you could, you could put one on SqlBulkInsertOperation (or a suitable abstract base class) to Action, then do:

var pipeline = ...
	.ForEach(new FibonacciBulkInsert());

which is pretty much your original code. I'm not sure I'd recommend it even if it's feasible though - I don't like implicit conversions.

It really seems to me that the two approaches are just different angles on exactly the same thing. I like the fact that MS has done a lot of the grunt work around Select, Where, SelectMany etc for me :) It looks like it's also a case of optimising syntax for the simple cases or optimising syntax for complex cases. The right choice there depends on what you're doing, of course :)

Jon

01/07/2008 09:04 AM by Jon Skeet

Oh, quick other comment if you ever want to put complete code here with minimal extra classes - GenerateData is already implemented as Enumerable.Range(). I haven't found it particularly useful yet outside sample code, but it's nice to know about just in case :)
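For reference, here is a small sketch of the same kind of chain built only from standard LINQ operators, with Enumerable.Range as the source. RangeSketch and DoubledMultiplesOfThree are made-up names for illustration. Note that the second argument to Range is a count, not an upper bound.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper around a Range-based pipeline.
public static class RangeSketch
{
    // Enumerable.Range(start, count) yields count integers beginning at start,
    // so Range(0, 10) produces 0 through 9.
    public static List<string> DoubledMultiplesOfThree(int count)
    {
        return Enumerable.Range(0, count)
            .Where(i => i % 3 == 0)
            .Select(i => (i * 2).ToString())
            .ToList();
    }
}
```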

01/07/2008 10:19 AM by Jon Skeet

Just tried, and you can indeed write an implicit operator converting to a delegate.

Jon
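A minimal sketch of such a conversion, under assumed names: CollectingOperation, Process and ForEachSketch are hypothetical, and int stands in for whatever item type the pipeline carries. The operation stays a full class with its own state, yet it can be passed anywhere an Action<int> is expected.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical operation class; it keeps state like any other class.
public class CollectingOperation
{
    private readonly List<int> seen = new List<int>();

    public IList<int> Seen
    {
        get { return seen; }
    }

    public void Process(int item)
    {
        seen.Add(item);
    }

    // User-defined implicit conversion from the class to a delegate type.
    public static implicit operator Action<int>(CollectingOperation operation)
    {
        return operation.Process;
    }
}

public static class ForEachSketch
{
    // Minimal ForEach that takes a delegate; assumed, not a framework method.
    public static void ForEach(this IEnumerable<int> source, Action<int> action)
    {
        foreach (int item in source)
            action(item);
    }
}
```

With this in place, a call such as `new[] { 1, 2, 3 }.ForEach(new CollectingOperation())` compiles, because the compiler applies the implicit operator to the argument.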

01/07/2008 11:25 AM by Ayende Rahien

Jon,

Using a delegate prevents me from dealing with the operations in a smarter way.

Take a look at the code there.

We are passing context into the operation, which it can make use of.

With the delegate trick, we can make it seamless to the outside, while retaining the power of having full-blown classes rather than function pointers.

01/07/2008 11:27 AM by Ayende Rahien

Jon,

Wow! Implicit delegate conversions will be put to cool use.

Comments have been closed on this topic.