Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 5,949 | Comments: 44,548

filter by tags archive

A challenge: Getting a list of products


A few days ago I posted about two phase tests, I have been thinking about this lately, and decided that I have a good example for this, which also demonstrate some important design decisions.

The task is listing the first 10 products that we can sell to a customer. The UI is console application, and the database design and data access method are whatever you want.

That is pretty easy, right?

Expected input is:

/dev/product_listing/bin> list_products

Expected output is:

Milk                  $1.0
Bread               $1.3
Sausage            $2.5
Horror Movie    $5.0

Next requirement is that given the following input:

/dev/product_listing/bin> list_products -pg13

The output is:

Milk                  $1.0
Bread               $1.3
Sausage            $2.5

Next requirement is that given the following input:

/dev/product_listing/bin> list_products -vegetarian

Expected output is:

Milk                  $1.0
Bread               $1.3
Horror Movie    $5.0

Additional requirements of this type will follow, and they can be combined. That is, we can also have:

/dev/product_listing/bin> list_products -pg13 -vegetarian

Expected output is:

Milk                  $1.0
Bread               $1.3

How would you solve this?


Comments

Joshua McKinney

http://ayende.com/Blog/archive/2008/01/05/Pipes-and-filters-The-IEnumerable-appraoch.aspx

Ayende Rahien

Not nice quoting myself to me.

You are supposed to work on it.

Bil Simser

Not sure what the pipe and filters thing is about (I remember reading it but have to go back to refer to it and see if it's appropriate to this solution). However I would go with Chris on using a specification. I'll write up a blog post with an approach to a solution and would like you to critique it please.

silky

i don't get the problem, and the example is strange. since when are 'horror movies' vegetarian? do what? filter on two things? eh?

Evan

Given that the number of products in an average ecommerce site is very low, you can solve it in a number of ways and do "good enough".

Now, if you are amazon, it's going to be a bit trickier..

Did you have something large scale in mind, or just an average app with 50-100 products?

Evan

A trickier version might be 30k products, against an average of 6 filters per query, and a hard response deadline of 30ms..

Paul Cowan

So by vegetarian you mean everything that is not meat.

Surely you would just select everything that is vegetarian or are you looking for the candidate to only look at the requirements and not what the candidate can read between the lines.

I like the IEnumerable of operations approach as well or why stop there why not provide your own linq provider.

How many days does the test take? I could maybe get a beta out in a week

Frans Bouma

isn't this 1st year CS course material?

alberto

Seems like a good candidate for LINQ.

Here's a first and dirty approach to it (excuse possibles mistakes in my syntax, these are my very first lines on LINQ and your textbox's intellisense is broken :)).

  
IEnumerable FilterByPG13Compliance(IEnumerable products)
  
{
  
	var result = from product in products
  
		select where product.RequiredAge  FilterByVegetarianFriendliness(IEnumerable products)
  
{
  
	var result = from product in products
  
		select where product.VegetarianFriendly
  
	return result;
  
}
  
  
  
//main
  
var result = from product in products select product;
  
int i;
  
for (i = 1; i 

                
Ayende Rahien

Frans Bouma,

The actual code, sure.

The design, no.

Ayende Rahien

alberto,

Now imagine that you have 15 of those, how would you scale the problem to handle this?

Benny Thomas

I would say you should handle the args as IEnumerable.

Lets have a registerfase thats register all args we could handle first.

Tuna Toksoz

public interface IFilter

{

 bool Process(IOrderedQueryable<T> q,string parameter);

}

And I loop every item entered from the console. If the process method returns false, i would try this parameter with next filter, if it returns true, i will reiterate the filter list for the next parameter.

Tuna Toksoz

Needless to say, I would implement filters for paging , and so.

The problem would be the order of parameters, I mean if person first enters parameter for page, then types the vegetarian, it would be the problem.

Well, I would change the interface to

public interface IFilter

{

  int CanProcess(string parameter); //returns the order of filter, if it is Integer.Max, it will be processed in the end

  bool Process(IOrderedQueryable<T> queryable,string parameter);

}

and use something like priority queue.

This would require double-iteration of filters, so it is O(m.n) where m is the number of filters and n is the number of parameters.

Tuna Toksoz

Sorry, I was wrong in complexity of the case,

there will be a need for priority queue.

There will be n parameters and m filters.

m*n will be required for iterating and finding the right filter.

n parameter&filter pair will be placed into priority queue, nlogn will be required.

and then they will be popped. O(mn) wasn't right

Ayende Rahien

Tuna,

Assume that there will be 100 filters (very high).

Using the CanExecute, Execute approach, you would have at most 200 iterations.

Since they only add filters, we don't care about that.

Tuna Toksoz

Well, I thought you would ask "what if we had 10000 parameters" if i wouldn't care about the complexity(even I cared, I was wrong and can still be wrong)

Ayende Rahien

If we have 1000 parameters, we need a completely different approach

alwin

If it is small scale / demo purpose, i would use the Filter thing. But without the Linq and with the CanProcess as bool. Just 'yield' everything, and the processing order depends on the order in the registration list.

For larger scale where you need to make a db query, maybe a i would use a specification. I've used this in my last project after a tip from Ayende on the castle user list :)

Dictionary<string, Action> specModifiers;

specModifiers.Add("vegetarian", spec => spec.Vegetarian = true);

Ralf Kretzschmar

How about a solution using a windows command line batch? :-)

here is the code:


@echo off

if "%echo%" neq "" echo %echo%

set prodGroupSearch=

set vegCodeSearch=

:nextArg

if "%1"=="" goto :process

set arg=%1

if "%arg:~0,3%"=="-pg" set prodGroupSearch=%arg:~1,5%

if "%arg%"=="-vegetarian" set vegCodeSearch=veg

shift

goto :nextArg

:process

for /F "eol=# delims=;# tokens=1-4" %%i in (prods) do call :output "%%i" %%j %%k %%l

exit /b

:output

set prod=%~1                                   .

set prod=%prod:~0,15%

set price=%2

set prodGroup=%3

set vegCode=%4

set query=

set val=

if "%prodGroupSearch%" neq "" (set query=%prodGroupSearch%& set val=%prodGroup%)

if "%vegCodeSearch%" neq "" (set query=%query%%vegCodeSearch%& set val=%val%%vegCode%)

if "%val%" neq "%query%" goto :skip 

echo %prod% %price%

:skip

exit /b

The products file looks like this:


Milk;$1.0;pg13;veg#

Bread;$1.3;pg13;veg#

Sausage;$2.5;pg13;nonveg#

Horror Movie;$5.0;pg31;veg#


Btw: I did it TDD. Here are my tests


@echo off

call list_products.bat > out.txt

fc out.txt case1.txt > nul || @echo case1 failed

call list_products.bat -pg13 > out.txt

fc out.txt case2.txt > nul || @echo case2 failed

call list_products.bat -vegetarian >out.txt

fc out.txt case3.txt > nul || @echo case3 failed

call list_products.bat -pg13 -vegetarian >out.txt

fc out.txt case4.txt > nul || @echo case4 failed


Ayende Rahien

The goggles, they do nothing! :-)

That is an unexpected solution, which is utterly unreadable to me,

Alex Simkin

So, did Ralf pass the test?

Ralf Kretzschmar

Full code, test, and testdata for the dos solution can be found on http://www.filedropper.com/ayendechallenge

Dimitar E Dimitrov

Applying some kind of Pipe processing seems nice from design point of view, but when it comes to reading and filtering data, which always comes from relational database I think it is better to let the database do the work, that it supposed to do – find the data, that you need the fastest way.

I don’t accept the argument, that you can’t always use database - even for a sample app with 100 records I would use Sqlite and still have feature a complete database at my disposal.

Reading all data in memory and filtering with line of pipes isn't that feasible to me.

What is wrong doing to most obvious thing like:

public interface IDatabase {

IList<Product> ListProducts(IList<Filter> filters);

}

If you want to take this further here a little more code in case my point isn't that clear:

public class Filter {

public string CmdSwtich { get; private set;}

public string DBColumn  { get; private set;}

public virtual string GetCondition();

}

public class BooleanFilter : Filter {

public override string GetCondition() { return DBColumn + "=1 "; }

}

public class Query {

string query;

public IList<Filter> Filters { get; private set;}


public Query(string query, IList<Filter> filters) {

    this.query  = query;

    Filters     = filters;

}


public string GetSql() {

    if (Filters == null || Filters.Count == 0)

        return query;

    string result = query;

    result += " WHERE ";

    for(int i = 0; i < Filters.Count; i++){

        result += Filters[i].GetCondition();

        if (i < Filters.Count - 1) result += ", ";

    }

    return result;

}

}

Frans Bouma

"The actual code, sure.

The design, no."

  • argument parsing. The biggest challenge. Argument parsing is error prone.

  • filtering a set using multiple predicates. erm... if that's hard, one hasn't passed 1st year of a CS course, IMHO.

-> parse every argument into a predicate. The argument parser then builds a predicate expression with these predicates, with solely AND operators. So this is very easy: you simply execute all predicates in the expression. if they all are true, you have a match.

The predicates are simple classes. You simply pass in the object to filter on and you interpret the predicate with the object. This can be done for example by creating Func instances, like the pg age delimiter.

Func<Product, bool> f = p=>(p.Aspects[Aspect.Pg] !=null) && (bool)p.Aspects[Aspect.pg];

the predicate is then simply the Func. The predicate expression is then simply a list of Func<Product, bool> instances. A product is an instance of a Product class which has Aspects, which is a Dictionary<Aspect, object>, so you can store aspects of it inside the product. Otherwise you can't filter on it in a generic way and have to build custom code per type in the set.

You then traverse the complete set from front to back, applying the predicate expression on all products. If the predicate expression returns true, it's a match, otherwise you move on to the next.

The code is actually more challenging than the design. The crucial thing the participant has to realize is that it has to traverse the set by applying a predicate expression. But that's basic database theory.

So the challenge is actually: how to implement this? I therefore disagree that the code is 'simple' and the design isn't. THe design is straight forward, the code has some irritating aspects because you have to parse arguments. This always sucks 8)

Ayende Rahien

Dimitar,

Very nice, and what I would strive for in these sort of situations

This is also pipe and filters, but your filter build a query, which fits my definition

Ayende Rahien

Frans,

Most people would implement this using ifs until the cow comes home.

Choosing an approach that fits the open close principal is what I was aiming at.

Frans Bouma

I hope this all fits in the post

using System;

using System.Collections.Generic;

namespace Challenge

{

public enum Aspect

{

    Pg,

    ContainsMeat,

    // etc.

}


public class Product

{

    public Product(HashSet<Aspect> initialAspects)

    {

        this.Aspects = new HashSet<Aspect>();

        foreach(Aspect a in initialAspects)

        {

            this.Aspects.Add(a);

        }

    }


    public string Name { get; set;}

    public decimal Price { get; set;}

    public HashSet<Aspect> Aspects { get; private set;}

}



public class ListProducts

{

    public static void Main(string[] args)

    {

        List<Product> products = BuildProducts();

        List<Func<Product, bool>> predicates = ParseArguments(args);

        List<Product> filteredProducts = FilterProducts(products, predicates);

        ShowProducts(filteredProducts);

    }


    private static void ShowProducts(List<Product> toShow)

    {

        Console.WriteLine("Products found:");

        foreach(Product p in toShow)

        {

            Console.WriteLine("{0}\t\t${1}", p.Name, p.Price);

        }

    }


    private static List<Product> FilterProducts(List<Product> products, List<Func<Product, bool>> predicates)

    {

        List<Product> toReturn = new List<Product>();


        foreach(Product p in products)

        {

            bool acceptProduct = true;

            foreach(Func<Product, bool> predicate in predicates)

            {

                acceptProduct &= predicate(p);

                if(!acceptProduct)

                {

                    break;

                }

            }

            if(acceptProduct)

            {

                toReturn.Add(p);

            }

        }


        return toReturn;

    }


    private static List<Func<Product, bool>> ParseArguments(string[] args)

    {

        List<Func<Product, bool>> toReturn = new List<Func<Product, bool>>();

        foreach(string argument in args)

        {

            Func<Product, bool> toAdd = null;

            switch(argument)

            {

                case "-pg13":

                    toAdd = p=>!p.Aspects.Contains(Aspect.Pg);

                    break;

                case "-vegetarian":

                    toAdd = p=>!p.Aspects.Contains(Aspect.ContainsMeat);

                    break;

                default:

                    throw new ArgumentException(string.Format("Unknown argument: {0}", argument));

            }


            toReturn.Add(toAdd);

        }

        return toReturn;

    }


    private static List<Product> BuildProducts()

    {

        List<Product> toReturn = new List<Product>();

        toReturn.Add(new Product(new HashSet<Aspect>()) { Name="Milk", Price=1.0M});

        toReturn.Add(new Product(new HashSet<Aspect>()) { Name="Bread", Price=1.3M});

        toReturn.Add(new Product(new HashSet<Aspect>() { Aspect.ContainsMeat}) { Name="Sausage", Price=2.5M});

        toReturn.Add(new Product(new HashSet<Aspect>() { Aspect.Pg}) { Name="Horror Movie", Price=5M});


        return toReturn;

    }

}

}

Alex Simkin

Interesting that none paid attention to the first requirement:

"listing the first 10 products"

alberto

Ayende, THEN I would refactor (in fact, if it were a face-to-face interview I would have asked you a few questions to get some insight in what you were expecting from me, but since this is not so fluent, I opted for KISS).

Anyway, I have posted my solution in my blog (it was a little longer and I can format it a little there ;))

http://sharpbites.blogspot.com/2008/04/ayendes-challenge.html

Michael Minutillo

I had a quick go with LINQ to SQL. Given the way LINQ to SQL works does the Pipes and Filters Pattern begin to converge with the Specification Pattern in anyone elses mind?

http://wolfbyte-net.blogspot.com/2008/04/pipe-and-filters-fluent-apis-and-linq.html

alberto

So, did I get the job? :D

Alex

SELECT TOP 10 * FROM product WHERE productid NOT IN (SELECT productid FROM producttag WHERE tagid IN ({tagList}))

You guys are Architecture Astronauts.

http://www.joelonsoftware.com/articles/fog0000000018.html

http://www.joelonsoftware.com/items/2005/10/21.html

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. The RavenDB Comic Strip (3):
    28 May 2015 - Part III – High availability & sleeping soundly
  2. Special Offer (2):
    27 May 2015 - 29% discount for all our products
  3. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  4. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  5. Interview question (2):
    30 Mar 2015 - fix the index
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats