Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 5,953 | Comments: 44,410

filter by tags archive

Searching ain’t simple


The problem statement is best described using:

image

This seems like a nice and easy problem, right? We join the architects table to the designs table and we are done.

select d.* from Designs d 
 join ArchitectsDesigns da on d.Id = da.DesignId
 join Architects a on da.ArchitectId = a.Id
where a.Name = @name

This is a trivial solution, and shouldn’t take a lot of time to build…

It is also the entirely wrong approach for the problem, can you tell me why?


Comments

Jonathan Gibb

Beyond the search being exact and not allowing for partial matches, spelling etc - the model doesn't allow for collaboration between architects i.e. multiple architects working together to produce 1 design.

Rémi BOURGAREL

@Jonathan, we surely can, the table ArchitectsDesigns is an association table.

I can't see what's wrong here, I'd use subquery instead of join (you're not using the joined table's column) but that's all.

Marc

when the architect name is not unique you would show designs of different architects

Phillip

OH OH!!

It's an unbound result set!

laurts

Result may show dublicate designs. Query should be something like

select ds.* from Designs ds join ( select distinct d.Id from Designs d join ArchitectsDesigns da on d.Id = da.DesignId join Architects a on da.ArchitectId = a.Id where a.Name = @name) dd dd.Id = ds.Id

Graham Clark

If there's loads of results, you might want to page the data... Maybe the query doesn't want to select out data on thousands of designs, but only a certain block of designs at a time?

grega_g

"...wrong approach for the problem"

I don't think he meant paging or partial matches. He is probably gonna explain how we (aka. users) don't want to search for designs by architect but for group of designs (grouped by structure) by design firm.

Or something.

morcs

Perhaps because there are two queries hidden behind one? (Find architect by name, find designs by architect).

If you get no results, is it because the architect has no designs, or is it because there is no architect with that name? You have no way of knowing.

mk
mk

maybe a combo with "available architects" ? or grouped results by architect ?

mk
mk

.. or not using LIKE ?

Matt

Ludwig Mies van der Rohe's "proper" surname is simply "Mies" - the "van der Rohe" is his mother's maiden name, and Mies added took it on in his 30s. In this case, an architect can have multiple names, so it looks like there's some extra work to be done - a simple name lookup isn't sufficient. Names of individuals can change, so don't always assume that there is a single correct spelling of a name.

Nige

We are a little short in the requirements here. What is the scenario? Is this a public portal with possibly thousands of architects to find or is it a business with a much more limited set of architects to choose from? Also we are not given the structure of any of these tables.

However in any case, our user needs to have sucessfully identified the architect he is searching for prior to the request for "designs by architect" being submitted to the database engine. Just throwing a search directly at the database like this will no have good outcomes, either for the user or the database. As Matt suggests, names may change so our architects designs need an Id but our architects need the possibility for aliases, which further may need the ability for further identification, e.g. Ludwig Mies (Belgiun 1909 - 1975), Mies van der Rohe (Nederlands 1965 - Now)

Markus Zywitza

You don't want all (d.*) about the designs for that architect. Create a view model, consisting of id, name, preview_url instead of querying the whole design row for a list-page.

nick

Because it doesn't use RavenDb and is thus inadequate :)

Felice Pollano

I vote for the Unbounded reultset as Phillip point out, but also the fact the query will start returns something just when we type the exact name, user will expect to see somethign while typing, to choose an exact architect name

shahz

I can think of many answers depending on what the situation is:

1) You are essentially joining two aggregate roots (Architects and Designs) if using the DDD model. Depending on your coding style and architecture, this may not be the right thing to do. 2) Like morcs said above, you can’t tell if architect has no design or there is no architect. You may display "no designs found" which is the wrong message to display if the cause is incorrect architect name. 3) Separating out into two queries/aggregates can be very flexible and beneficial for number of reasons other than 1 and 2 above. The first query can immediately tell you the count and you can display “Loading design xxx of xxx” and the use the other query to then load the data. This can make the app more interactive. The second can even be a web service loaded through client-side to make the app more user-friendly. If design is a long list, you may not even load the remaining designs unless the user scrolls to it. This is just from usability perspective. From architecture prespective, there are many benefits to keeping the two separate.

J
J

Relevance.

Alwin

I think the main problem is: what if a user wants to search diagrams by something other than the architect? Do you make a different screen? One screen for each searchable field? Or does the user get a very complicated screen with dozens of searchable fields? Users expect Google-like search, not something as specific as this screen.

Steve Sheldon

I can't say what the right approach is, but using SQL like this always ends up being an area where we spend a lot of time working with QA to resolve why the search doesn't work like google.

configurator

Why am I getting such a strong sensation of Deja Vu? I know I've seen this or something very similar before.

Bill

Its not kosher?

roger

Because it needs an 'outer join' between the architects and architectdesigns tables; you want to be able to return a row for the situation where the architect has no designs. Getting 0 rows with the outer join query means you didn't find the architect. The outer join scenario is super-common with 'searching/finding' use cases. Other problems:unusual to do 'exact' searches for names; nearly always need some 'wild-carding' (which potentially gives multiple architects, so needs grouping as well).

edward

roger is closest i think if your result set needs more data.

But if you search for architects only, you only need to search the architect table

Thomas Krause

In my opinion the best way to make this user friendly is to have a suggestion box for the architect while typing. If you choose a suggestion from this list it should automatically go to the results using the chosen architect as an Id instead of using the name. In case the user doesn't choose a name from the list, you can still check if there is a single architect matching the name and if not ask the user to choose an architect from the list before he can continue.

In any case from the server side of view this would be two separate queries. One for the architect and one for the designs, like others mentioned already.

I'm curious however how to model this in RavenDB or other document databases. Am I correct, that you would have 2 aggregate roots (architect and design) and that in the design you would save the id of the architect along with the properties of the architect that you usually need to display along with the design (e.g. his name), so that you can view designs and architects independently without the need to join them?

jm
jm

No. Oh come on, you make me deel inadequate!

Stuart Booth

Maybe it's because we're querying the wrong place, looks like this db is the Write model, with all the relations etc.

We could have a Read model where we're querying a single table so no joins .. the single table in the read model would be populated 'eventually' after the Write model is updated ... so CQRS ..

The other thing we could be doing instead is searching use Lucene etc. .. this is 'sort' of similar to the separate Read/Write models.

Rafal

Ayende look what you've done - the first idea of many commenters is to demolish a relational database or 'fix' a db server with a chainsaw. Isn't it some kind of nosql aberration?

ThomasW

It's a trick question. Everything is alright.

Zoltan

Hi,

I have an NHibernate mapping problem described on the following link:

http://stackoverflow.com/questions/10266347/nhibernate-composite-id-mapping-issue-bi-uni-directional-onetomany-manytoone-a

It's occuring very headache for me, somebody cal help me please?

Thank you, Zoltan

Ayende Rahien

Zoltan, Please use the NHibernate mailing list http://groups.google.com/group/nhusers

Alternatively, you may choose our commercial support option: http://nhprof.com/commercialsupport

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. The RavenDB Comic Strip (3):
    28 May 2015 - Part III – High availability & sleeping soundly
  2. Special Offer (2):
    27 May 2015 - 29% discount for all our products
  3. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  4. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  5. Interview question (2):
    30 Mar 2015 - fix the index
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats