Ayende @ Rahien

It's a girl

Searching ain’t simple

The problem statement is best described using:

image

This seems like a nice and easy problem, right? We join the architects table to the designs table and we are done.

select d.* from Designs d 
 join ArchitectsDesigns da on d.Id = da.DesignId
 join Architects a on da.ArchitectId = a.Id
where a.Name = @name

This is a trivial solution, and shouldn’t take a lot of time to build…

It is also the entirely wrong approach for the problem, can you tell me why?

Comments

Jonathan Gibb
03/22/2012 10:21 AM by
Jonathan Gibb

Beyond the search being exact and not allowing for partial matches, spelling etc - the model doesn't allow for collaboration between architects i.e. multiple architects working together to produce 1 design.

Rémi BOURGAREL
03/22/2012 10:28 AM by
Rémi BOURGAREL

@Jonathan, we surely can, the table ArchitectsDesigns is an association table.

I can't see what's wrong here, I'd use subquery instead of join (you're not using the joined table's column) but that's all.

Marc
03/22/2012 10:32 AM by
Marc

when the architect name is not unique you would show designs of different architects

Phillip
03/22/2012 10:33 AM by
Phillip

OH OH!!

It's an unbound result set!

laurts
03/22/2012 10:51 AM by
laurts

Result may show dublicate designs. Query should be something like

select ds.* from Designs ds join ( select distinct d.Id from Designs d join ArchitectsDesigns da on d.Id = da.DesignId join Architects a on da.ArchitectId = a.Id where a.Name = @name) dd dd.Id = ds.Id

Graham Clark
03/22/2012 10:53 AM by
Graham Clark

If there's loads of results, you might want to page the data... Maybe the query doesn't want to select out data on thousands of designs, but only a certain block of designs at a time?

grega_g
03/22/2012 11:06 AM by
grega_g

"...wrong approach for the problem"

I don't think he meant paging or partial matches. He is probably gonna explain how we (aka. users) don't want to search for designs by architect but for group of designs (grouped by structure) by design firm.

Or something.

morcs
03/22/2012 11:16 AM by
morcs

Perhaps because there are two queries hidden behind one? (Find architect by name, find designs by architect).

If you get no results, is it because the architect has no designs, or is it because there is no architect with that name? You have no way of knowing.

mk
03/22/2012 11:41 AM by
mk

maybe a combo with "available architects" ? or grouped results by architect ?

mk
03/22/2012 11:47 AM by
mk

.. or not using LIKE ?

Matt
03/22/2012 12:45 PM by
Matt

Ludwig Mies van der Rohe's "proper" surname is simply "Mies" - the "van der Rohe" is his mother's maiden name, and Mies added took it on in his 30s. In this case, an architect can have multiple names, so it looks like there's some extra work to be done - a simple name lookup isn't sufficient. Names of individuals can change, so don't always assume that there is a single correct spelling of a name.

Nige
03/22/2012 01:00 PM by
Nige

We are a little short in the requirements here. What is the scenario? Is this a public portal with possibly thousands of architects to find or is it a business with a much more limited set of architects to choose from? Also we are not given the structure of any of these tables.

However in any case, our user needs to have sucessfully identified the architect he is searching for prior to the request for "designs by architect" being submitted to the database engine. Just throwing a search directly at the database like this will no have good outcomes, either for the user or the database. As Matt suggests, names may change so our architects designs need an Id but our architects need the possibility for aliases, which further may need the ability for further identification, e.g. Ludwig Mies (Belgiun 1909 - 1975), Mies van der Rohe (Nederlands 1965 - Now)

Markus Zywitza
03/22/2012 01:23 PM by
Markus Zywitza

You don't want all (d.*) about the designs for that architect. Create a view model, consisting of id, name, preview_url instead of querying the whole design row for a list-page.

nick
03/22/2012 01:34 PM by
nick

Because it doesn't use RavenDb and is thus inadequate :)

Felice Pollano
03/22/2012 02:13 PM by
Felice Pollano

I vote for the Unbounded reultset as Phillip point out, but also the fact the query will start returns something just when we type the exact name, user will expect to see somethign while typing, to choose an exact architect name

shahz
03/22/2012 02:59 PM by
shahz

I can think of many answers depending on what the situation is:

1) You are essentially joining two aggregate roots (Architects and Designs) if using the DDD model. Depending on your coding style and architecture, this may not be the right thing to do. 2) Like morcs said above, you can’t tell if architect has no design or there is no architect. You may display "no designs found" which is the wrong message to display if the cause is incorrect architect name. 3) Separating out into two queries/aggregates can be very flexible and beneficial for number of reasons other than 1 and 2 above. The first query can immediately tell you the count and you can display “Loading design xxx of xxx” and the use the other query to then load the data. This can make the app more interactive. The second can even be a web service loaded through client-side to make the app more user-friendly. If design is a long list, you may not even load the remaining designs unless the user scrolls to it. This is just from usability perspective. From architecture prespective, there are many benefits to keeping the two separate.

J
03/22/2012 03:27 PM by
J

Relevance.

Alwin
03/22/2012 04:22 PM by
Alwin

I think the main problem is: what if a user wants to search diagrams by something other than the architect? Do you make a different screen? One screen for each searchable field? Or does the user get a very complicated screen with dozens of searchable fields? Users expect Google-like search, not something as specific as this screen.

Steve Sheldon
03/22/2012 04:31 PM by
Steve Sheldon

I can't say what the right approach is, but using SQL like this always ends up being an area where we spend a lot of time working with QA to resolve why the search doesn't work like google.

configurator
03/22/2012 05:22 PM by
configurator

Why am I getting such a strong sensation of Deja Vu? I know I've seen this or something very similar before.

Bill
03/22/2012 06:06 PM by
Bill

Its not kosher?

roger
03/22/2012 07:08 PM by
roger

Because it needs an 'outer join' between the architects and architectdesigns tables; you want to be able to return a row for the situation where the architect has no designs. Getting 0 rows with the outer join query means you didn't find the architect. The outer join scenario is super-common with 'searching/finding' use cases. Other problems:unusual to do 'exact' searches for names; nearly always need some 'wild-carding' (which potentially gives multiple architects, so needs grouping as well).

edward
03/22/2012 08:57 PM by
edward

roger is closest i think if your result set needs more data.

But if you search for architects only, you only need to search the architect table

Thomas Krause
03/22/2012 09:07 PM by
Thomas Krause

In my opinion the best way to make this user friendly is to have a suggestion box for the architect while typing. If you choose a suggestion from this list it should automatically go to the results using the chosen architect as an Id instead of using the name. In case the user doesn't choose a name from the list, you can still check if there is a single architect matching the name and if not ask the user to choose an architect from the list before he can continue.

In any case from the server side of view this would be two separate queries. One for the architect and one for the designs, like others mentioned already.

I'm curious however how to model this in RavenDB or other document databases. Am I correct, that you would have 2 aggregate roots (architect and design) and that in the design you would save the id of the architect along with the properties of the architect that you usually need to display along with the design (e.g. his name), so that you can view designs and architects independently without the need to join them?

jm
03/22/2012 11:07 PM by
jm

No. Oh come on, you make me deel inadequate!

Stuart Booth
03/23/2012 01:43 AM by
Stuart Booth

Maybe it's because we're querying the wrong place, looks like this db is the Write model, with all the relations etc.

We could have a Read model where we're querying a single table so no joins .. the single table in the read model would be populated 'eventually' after the Write model is updated ... so CQRS ..

The other thing we could be doing instead is searching use Lucene etc. .. this is 'sort' of similar to the separate Read/Write models.

Rafal
03/23/2012 05:51 AM by
Rafal

Ayende look what you've done - the first idea of many commenters is to demolish a relational database or 'fix' a db server with a chainsaw. Isn't it some kind of nosql aberration?

ThomasW
03/23/2012 06:35 AM by
ThomasW

It's a trick question. Everything is alright.

Zoltan
04/22/2012 12:23 PM by
Zoltan

Hi,

I have an NHibernate mapping problem described on the following link:

http://stackoverflow.com/questions/10266347/nhibernate-composite-id-mapping-issue-bi-uni-directional-onetomany-manytoone-a

It's occuring very headache for me, somebody cal help me please?

Thank you, Zoltan

Ayende Rahien
04/22/2012 12:26 PM by
Ayende Rahien

Zoltan, Please use the NHibernate mailing list http://groups.google.com/group/nhusers

Alternatively, you may choose our commercial support option: http://nhprof.com/commercialsupport

Comments have been closed on this topic.