re: Entity Framework Core performance tuning–part I


Oct 04 2017

I ran into a really interesting article about performance optimizations with EF Core and I thought that it deserves a second and third look. You might have noticed that I have been putting a lot of emphasis on performance, and I have literally spent years optimizing relational database access patterns, including building a profiler dedicated to inspecting what an OR/M is doing. I got the source and ran the application.

I have a small bet with myself, saying that in any application using a relational database, I’ll be able to find a SELECT N+1 issue within one hour. So far, I think that my rate is 92% or so. In this case, I found the SELECT N+1 issue on the very first page load.

Matching this to the code, we have:

Which leads to:

And here we can already tell that there is a problem: we aren’t accessing the authors. This actually happens here:

So we have the view that is generating 10 out of 12 queries. And the more results per page you have, the more this costs.
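To make the pattern concrete, here is a minimal sketch of a SELECT N+1 and one way around it. It uses SQLite through Python's `sqlite3` rather than EF Core and SQL Server, and the `Books` / `Authors` / `BookAuthor` schema, the sample data, and the function names are all assumptions for illustration, not the article's actual code. It models only the page query plus the per-book author lookups, not every query the real page issues.

```python
import sqlite3


def build_db(book_count=25):
    """Create an in-memory database where every book has two authors."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Books (BookId INTEGER PRIMARY KEY, Title TEXT NOT NULL);
        CREATE TABLE Authors (AuthorId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
        CREATE TABLE BookAuthor (BookId INTEGER NOT NULL, AuthorId INTEGER NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO Authors VALUES (?, ?)",
        [(a, f"Author {a}") for a in range(1, 6)],
    )
    for book_id in range(1, book_count + 1):
        conn.execute("INSERT INTO Books VALUES (?, ?)", (book_id, f"Book {book_id}"))
        for author_id in (book_id % 5 + 1, (book_id + 1) % 5 + 1):
            conn.execute("INSERT INTO BookAuthor VALUES (?, ?)", (book_id, author_id))
    conn.commit()
    return conn


def load_page_n_plus_one(conn, page_size=10):
    """One query for the page, then one more query per book for its authors."""
    queries = 1
    books = conn.execute(
        "SELECT BookId, Title FROM Books ORDER BY BookId LIMIT ?", (page_size,)
    ).fetchall()
    page = []
    for book_id, title in books:
        queries += 1  # this lookup runs once per row on the page
        names = conn.execute(
            """
            SELECT a.Name
            FROM Authors AS a
            JOIN BookAuthor AS ba ON ba.AuthorId = a.AuthorId
            WHERE ba.BookId = ?
            ORDER BY a.Name
            """,
            (book_id,),
        ).fetchall()
        page.append((title, [name for (name,) in names]))
    return page, queries


def load_page_joined(conn, page_size=10):
    """Fetch the page and its authors in a single round trip."""
    rows = conn.execute(
        """
        SELECT b.BookId, b.Title, a.Name
        FROM (SELECT BookId, Title FROM Books ORDER BY BookId LIMIT ?) AS b
        LEFT JOIN BookAuthor AS ba ON ba.BookId = b.BookId
        LEFT JOIN Authors AS a ON a.AuthorId = ba.AuthorId
        ORDER BY b.BookId, a.Name
        """,
        (page_size,),
    ).fetchall()
    page = {}
    for book_id, title, name in rows:
        authors = page.setdefault(book_id, (title, []))[1]
        if name is not None:  # a book without authors yields a single NULL row
            authors.append(name)
    return list(page.values()), 1


if __name__ == "__main__":
    db = build_db()
    _, slow_queries = load_page_n_plus_one(db)
    _, fast_queries = load_page_joined(db)
    print(f"per-row lookups: {slow_queries} queries, joined: {fast_queries} query")
```

With a page size of 10, the first function issues 11 queries and the second issues 1, and both return the same page. In EF Core the equivalent fix is usually to load the related data as part of the original query instead of letting the view trigger a lookup per row.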

But this is easily fixed once you know what you are looking at. Let us look at something else: the actual root query. It looks like this:

Yes, I too needed a minute to recover from this. We have:

One JOIN
Two correlated sub queries

Jon was able to optimize his code from 660ms to 80ms, which is pretty awesome. But that is all by making modifications to the access pattern in the database.

Given what I do for a living, I’m more interested in what it does inside the database, and here is what the query plan tells us:

There are only a few tens of thousands of records and the query is basically a bunch of index seeks and nested loop joins. But note that the way the query is structured forces the database to evaluate all possible results, then filter just the top few. That means that you have to wait until the entire result set has been processed, and as the size of your data grows, so will the cost of this query.
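The effect of the sort key can be seen directly in a query plan. What follows is a minimal sketch, assuming SQLite (through Python's `sqlite3`) as a stand-in for SQL Server, so the plan text will not match the one discussed here. Table and column names follow the SQL quoted in the comments; the index names and the helper functions are hypothetical. When the sort key is an indexed column, the engine can read rows in index order and stop after the top few. When the sort key is computed per row, such as the average review score, it has to compute the key for every candidate row and sort before it can return anything.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def needs_sort(conn, sql):
    """True when SQLite must build a temporary b-tree to satisfy ORDER BY."""
    return any("TEMP B-TREE FOR ORDER BY" in detail for detail in plan(conn, sql))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Books (
        BookId INTEGER PRIMARY KEY,
        Title TEXT NOT NULL,
        PublishedOn TEXT NOT NULL,
        SoftDeleted INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE Review (
        ReviewId INTEGER PRIMARY KEY,
        BookId INTEGER NOT NULL,
        NumStars INTEGER NOT NULL
    );
    -- Hypothetical index names; the key columns match the filter and the sort.
    CREATE INDEX IX_Books_SoftDeleted_PublishedOn ON Books (SoftDeleted, PublishedOn);
    CREATE INDEX IX_Review_BookId_NumStars ON Review (BookId, NumStars);
    """
)

# Sort key is an indexed column: rows come back in index order.
BY_DATE = """
    SELECT BookId, Title
    FROM Books
    WHERE SoftDeleted = 0
    ORDER BY PublishedOn DESC
    LIMIT 10
"""

# Sort key is computed per book: every candidate row is evaluated, then sorted.
BY_VOTES = """
    SELECT b.BookId, b.Title,
           (SELECT AVG(r.NumStars) FROM Review AS r
            WHERE r.BookId = b.BookId) AS ReviewsAverageVotes
    FROM Books AS b
    WHERE b.SoftDeleted = 0
    ORDER BY ReviewsAverageVotes DESC
    LIMIT 10
"""

if __name__ == "__main__":
    for name, sql in (("by date", BY_DATE), ("by votes", BY_VOTES)):
        print(name, "-> explicit sort step:", needs_sort(conn, sql))
        for detail in plan(conn, sql):
            print("   ", detail)
```

Only the second query needs an explicit sort step, which is why its cost grows with the number of books and not with the page size.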

I don’t think that there is much that can be done here, given the relational nature of the data access (no worries, I intend to write another post in this series; you can guess what I’m going to write there, right? :-) ).

Comments

04 Oct 2017
14:01 PM

Jedak

I skimmed the original article. Did it mention the database engine being used? That SQL looks pretty horrible, performance wise. Especially if it's hand written.

04 Oct 2017
14:02 PM

Oren Eini

Jedak, This is against SQL Server, I assume, and the SQL was probably generated by EF Core.

04 Oct 2017
14:09 PM

Paul

What application is generating those screenshots? I want it :)

04 Oct 2017
14:12 PM

Oren Eini

Paul, That is the Entity Framework Profiler, and the screen shots were taken using the Windows Snipping Tool

04 Oct 2017
19:37 PM

Rafal

Why do you say that the query has to evaluate all possible results just to take the top few? The order is on publish date and the filter is on soft deleted, so if you have an index on these there should be no need to check all the records.

Anyway, over the last several years I've been gradually transitioning from 'everything in the application/use ORM and forget about SQL' back to 'just do it in SQL and nothing will beat it'. I don't know about your experiences, but after repeatedly having to meticulously build and adjust LINQ queries to generate the SQL I'd like to see (never mind the fact that it's too easy to find LINQ constructs unsupported by the provider), I started wondering why go through this at all if I already know what the query should be. And another thing is transactions: it's so easy to shoot yourself in the foot when you try to do transactional operations in the application code.

07 Oct 2017
17:57 PM

Chris Hynes

"I don’t think that there is much that can be done here, given the relational nature of the data access "...

There's still a ton of low hanging fruit in that query. Any time you look at the query plan and see a table being accessed multiple times, and/or multiple index seeks for one statement there's a potential for improvement.

First off, without even changing the SQL, you can create a covering index on Review.BookId, Review.NumStars so you're not scanning two indexes.

Next, combine the subqueries to one so you're not hitting Review twice for no reason:

SELECT TOP(@pageSize) [b].[BookId], [b].[Title],
    [b].[Price], [b].[PublishedOn],
    CASE
        WHEN [p.Promotion].[PriceOfferId] IS NULL
        THEN [b].[Price] ELSE [p.Promotion].[NewPrice]
    END AS [ActualPrice],
    [p.Promotion].[PromotionalText] AS [PromotionPromotionalText],
    [dbo].AuthorsStringUdf([b].[BookId]) AS [AuthorsOrdered],
    r.*
FROM [Books] AS [b]
LEFT JOIN [PriceOffers] AS [p.Promotion]
    ON [b].[BookId] = [p.Promotion].[BookId]
OUTER APPLY (
    SELECT
        COUNT(*) AS [ReviewsCount],
        AVG(CAST([y].[NumStars] AS float)) AS [ReviewsAverageVotes]
    FROM [Review] AS [y]
    WHERE [b].[BookId] = [y].[BookId]
) r
WHERE ([b].[SoftDeleted] = 0)
ORDER BY [ReviewsAverageVotes] DESC

I would also try selecting the page of reviews first and then joining in Book, but the SoftDeleted check may preclude that depending on whether the requirements are to return exactly @pageSize records or @pageSize or less.
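As a rough check of the idea above, the sketch below compares the two correlated subqueries against a single aggregate pass over Review and confirms that they return the same rows. It runs on SQLite through Python's `sqlite3`, which has no OUTER APPLY, so the combined form is written as a LEFT JOIN on a grouped derived table instead; that is a substitution, not the query from this comment. The sample data and the index name are made up, and the index puts both columns in the key, whereas on SQL Server NumStars could be an included column.

```python
import sqlite3

# Shape of the original: Review is visited once per aggregate, per book.
TWO_SUBQUERIES = """
    SELECT b.BookId,
           (SELECT COUNT(*) FROM Review AS y
            WHERE y.BookId = b.BookId) AS ReviewsCount,
           (SELECT AVG(CAST(y.NumStars AS REAL)) FROM Review AS y
            WHERE y.BookId = b.BookId) AS ReviewsAverageVotes
    FROM Books AS b
    WHERE b.SoftDeleted = 0
    ORDER BY b.BookId
"""

# Combined form: both aggregates come from one pass over Review.
ONE_AGGREGATE = """
    SELECT b.BookId,
           COALESCE(r.ReviewsCount, 0) AS ReviewsCount,
           r.ReviewsAverageVotes
    FROM Books AS b
    LEFT JOIN (
        SELECT BookId,
               COUNT(*) AS ReviewsCount,
               AVG(CAST(NumStars AS REAL)) AS ReviewsAverageVotes
        FROM Review
        GROUP BY BookId
    ) AS r ON r.BookId = b.BookId
    WHERE b.SoftDeleted = 0
    ORDER BY b.BookId
"""


def build_db():
    """Tiny made-up data set: book 3 has no reviews, book 4 is soft deleted."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Books (BookId INTEGER PRIMARY KEY,
                            SoftDeleted INTEGER NOT NULL DEFAULT 0);
        CREATE TABLE Review (ReviewId INTEGER PRIMARY KEY,
                             BookId INTEGER NOT NULL,
                             NumStars INTEGER NOT NULL);
        -- Hypothetical name; covers both the lookup column and the aggregated one.
        CREATE INDEX IX_Review_BookId_NumStars ON Review (BookId, NumStars);
        INSERT INTO Books (BookId, SoftDeleted) VALUES (1, 0), (2, 0), (3, 0), (4, 1);
        INSERT INTO Review (BookId, NumStars) VALUES (1, 5), (1, 4), (2, 3), (4, 1);
        """
    )
    return conn


def fetch(conn, sql):
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    db = build_db()
    print(fetch(db, TWO_SUBQUERIES) == fetch(db, ONE_AGGREGATE))
```

The COALESCE is needed only because of the LEFT JOIN substitution: a book with no reviews gets no row from the grouped subquery, while an aggregate without GROUP BY inside OUTER APPLY always returns one row with a count of 0.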

08 Oct 2017
05:40 AM

Oren Eini

Chris, Well, to start with, you'll need to get EF Core to generate this query, and I'm not sure that you could. There is always the option of feeding it the SQL, of course. Then, there is AuthorsStringUdf, which I'm not sure about, but if this does a lookup per row, that is going to be expensive.

08 Oct 2017
06:30 AM

Chris Hynes

Oren... in the source article you referenced, step 2 of 4 was using Dapper to run an optimized query. It's about optimizing SQL access in general, not EF Core in particular -- 2/3 of the article is non-EF ways to optimize, or ways to use EF to warehouse fields vs reading them.

That's a standard technique with OR/M's -- use them for CRUD where performance isn't an issue, but once you're optimizing it only makes sense to hand code SQL vs throwing up your hands and saying "I don’t think that there is much that can be done here, given the relational nature of the data access".

At that point, it's not about generating an EF query, but a SQL query, and I'd bet that an optimized query would be pretty close to the warehoused version -- at least close enough to prefer it.

08 Oct 2017
08:46 AM

Oren Eini

Rafal, You're correct, I was thinking about the query where you sort on the average number of stars. And yes, in many cases, raw SQL is easier when you have more complex queries.

14 Oct 2017
02:00 AM

Nelson

The last picture is too vague to see clearly.

15 Oct 2017
06:41 AM

Oren Eini

Nelson, Yes, I'm sorry, the original was a few thousand pixels wide, and I had to shrink it to fit. The idea was to give you the rough general outline of the plan, not to inspect it.

Oren Eini

CEO of RavenDB
