﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2021 (c) 2026</copyright><ttl>60</ttl><item><title>Petr Antoš commented on Normalization is from the devil</title><description>Guys, very exciting discussion here. You are for sure experts in your domains - massive websites are totally different to intranets and ERPs with predictable users loads and uncomparable for read/write ratios, but RDBMs are still the best known thing to data integrity and if someone invents better approach, it will be for at least Turings if not Nobels prize, I think.
  
@karhgath said probably the full truth, and I like what @Brown said too. But @evereq's point about "immutability" also interests me, as I posted a similar question here bit.ly/9nPSXn (no wars please :-), related to "entity instance versioning". I simply don't know whether any existing RDBMS supports this approach, or at least whether some ORMs are aware of it. Could someone reply with some theory and/or products that can do such timestamping of data records, up to generic support for relation history? I know of only one, already dead, from the MS-DOS (even CP/M) era, to which I was almost addicted, and I am frustrated to develop larger DB apps without it :-)
  
Regarding the "customer's name" issue: when a TYPO occurs, I would like to update the already existing (logically immutable) version, but when a MARRIAGE occurs, I would like to create a new version with the same ID but a new TS and the appropriate changes. Is that possible anywhere??
  
Thanks a lot in advance!
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment66</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment66</guid><pubDate>Thu, 14 Oct 2010 09:47:15 GMT</pubDate></item><item><title>Misha commented on Normalization is from the devil</title><description>The "ProductCost" (on the OrderLine table) would only need to be "normalized away" (to the Product table) iff there is a business rule stating that the ProductCost is uniquely determined by something smaller than a candidate key of OrderLine (ProductId, in this case).
  
  
[And similarly for ProductName, of course]
  
  
In the absence of this business rule (e.g. where there is the contrary business rule that the ProductCost on an order line must stay constant even if the ProductCost in the Product changes later), removing ProductCost from OrderLine to Product in this way isn't normalizing -- it's failing to implement the requirements properly.
  
  
The temporal aspects tend to bite people because the business users who are writing requirements will often think in terms of the rules that apply at a point-in-time -- so may well state that OrderLine.ProductCost is uniquely determined by the ProductId, but omit the implied qualification "at the time the order is made". It's the job of the requirements analyst to find out what the real requirement is.
  
  
[And, as MarkC noted, dropping the link to Product is crazy -- particularly as we are anticipating product name changes, so there would be no way to link the record back up again afterwards.]
  
  
Ayende's design may well be the right one for certain sets of requirements. It's basically a modified temporal model, where we only retain the historical values of ProductCost for products if there was at least one order using that value of ProductCost, and we physically denormalise that historical record to OrderLine, on the understanding that it is archive data and thus constant and therefore not problematic from an 'update to multiple copies of denormalized data' p-o-v.
  
  
But without understanding the requirements, it's impossible to tell.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment65</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment65</guid><pubDate>Thu, 23 Sep 2010 21:23:34 GMT</pubDate></item><item><title>MarkC commented on Normalization is from the devil</title><description>I agree to some extent regarding the fact this isn't normalisation, the price of an item on an order is not duplicate data of the "stock" price in the product table, because it can change.
  
  
However, I wouldn't break the link between order line and product; you still need it there. Why? Simple: the boss says "tell me our top 10 products" and you can't do it, because all you have to rely on to identify a product in the OrderLine table is the name, which could change.
  
If you kept the Product ID in OrderLine you'd be able to reliably and safely identify each product sold.
  
Taking it one step further, imagine you want to track purchases to offer other products (as Amazon does, for example).
  
  
In summary, I'd say this was an oversimplification, not de-normalisation.
  
  
It may be convenient for an ORM but I don't think it's very useful in the real world.
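A minimal sketch of the point above, assuming a hypothetical SQLite schema (the table and column names are illustrative, not taken from the post): the order line keeps the ProductId link and also its own copy of the price at order time, so a "top products" query stays reliable after a rename or a price change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, Name TEXT, Cost REAL);
    CREATE TABLE OrderLine (
        Id INTEGER PRIMARY KEY,
        ProductId INTEGER REFERENCES Product(Id),  -- link to Product is kept
        ProductCost REAL,                          -- price at the time of the order
        Quantity INTEGER
    );
    INSERT INTO Product VALUES (1, 'Widget', 10.0), (2, 'Gadget', 25.0);
""")

def add_order_line(product_id, quantity):
    # Copy the current price onto the order line: a snapshot, not a duplicate.
    cost = conn.execute(
        "SELECT Cost FROM Product WHERE Id = ?", (product_id,)
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO OrderLine (ProductId, ProductCost, Quantity) VALUES (?, ?, ?)",
        (product_id, cost, quantity),
    )

add_order_line(1, 3)
add_order_line(2, 1)
# The product is renamed and repriced after the first orders.
conn.execute("UPDATE Product SET Name = 'Widget Pro', Cost = 12.0 WHERE Id = 1")
add_order_line(1, 2)

# Grouping by ProductId still identifies each product despite the rename.
top_products = conn.execute("""
    SELECT p.Name, SUM(ol.Quantity) AS Units, SUM(ol.Quantity * ol.ProductCost) AS Revenue
    FROM OrderLine ol JOIN Product p ON p.Id = ol.ProductId
    GROUP BY ol.ProductId, p.Name
    ORDER BY Units DESC
    LIMIT 10
""").fetchall()
print(top_products)
```

The earlier order lines keep the old price of 10.0 while the report shows the product under its current name.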
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment64</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment64</guid><pubDate>Mon, 20 Sep 2010 07:03:46 GMT</pubDate></item><item><title>MikeG commented on Normalization is from the devil</title><description>The problem with OrderLine and Product isn't caused by too much normalization. The problem is caused by taking an entity that should be a snapshot in time (OrderLine) and associating it with an entity that is not designed as a snapshot in time (Product). To make this correct, you would need to use something like a ProductVersion entity in place of Product that has a start and end date. Any time a change to the product is accepted by the people responsible for products, a new ProductVersion is created, and new OrderLine entities would use this new version of the product.
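A minimal sketch of the ProductVersion idea, assuming a hypothetical SQLite schema (all names here are illustrative, and a NULL EndDate marking the current version is an assumption of this sketch, not something stated above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductVersion (
        Id INTEGER PRIMARY KEY,
        ProductId INTEGER NOT NULL,
        Name TEXT NOT NULL,
        Cost REAL NOT NULL,
        StartDate TEXT NOT NULL,
        EndDate TEXT              -- NULL marks the current version
    );
    CREATE TABLE OrderLine (
        Id INTEGER PRIMARY KEY,
        ProductVersionId INTEGER NOT NULL REFERENCES ProductVersion(Id),
        Quantity INTEGER NOT NULL
    );
""")

def publish_version(product_id, name, cost, on_date):
    # Close the current version (if any), then open a new one.
    conn.execute(
        "UPDATE ProductVersion SET EndDate = ? WHERE ProductId = ? AND EndDate IS NULL",
        (on_date, product_id),
    )
    conn.execute(
        "INSERT INTO ProductVersion (ProductId, Name, Cost, StartDate) VALUES (?, ?, ?, ?)",
        (product_id, name, cost, on_date),
    )

def add_order_line(product_id, quantity):
    # New order lines always point at the version that is current right now.
    version_id = conn.execute(
        "SELECT Id FROM ProductVersion WHERE ProductId = ? AND EndDate IS NULL",
        (product_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO OrderLine (ProductVersionId, Quantity) VALUES (?, ?)",
        (version_id, quantity),
    )

publish_version(1, "Widget", 10.0, "2010-01-01")
add_order_line(1, 3)
publish_version(1, "Widget Pro", 12.0, "2010-09-01")
add_order_line(1, 2)

# Each order line reports the name and cost that applied when it was created.
lines = conn.execute("""
    SELECT pv.Name, pv.Cost, ol.Quantity
    FROM OrderLine ol JOIN ProductVersion pv ON pv.Id = ol.ProductVersionId
    ORDER BY ol.Id
""").fetchall()
print(lines)
```

Here the order line stays a snapshot without copying any columns, at the cost of one extra join on every read.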
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment63</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment63</guid><pubDate>Fri, 17 Sep 2010 22:10:18 GMT</pubDate></item><item><title>Mike Katchourine commented on Normalization is from the devil</title><description>I am definitely with Mr. Brown here.
  
  
There should definitely be a separation in data access strategies in analytical vs transactional approaches.
  
  
It would be much more common to require de-normalized historical information in a reporting context. That can be easily achieved with an analytical extract that is completely denormalized and THEN indexed.
  
  
For high volume transactional processing data integrity is a much bigger concern than space and even performance. Data integrity is the cornerstone of the normalized relational model advantage, not space considerations.
  
  
I understand Ayende too though. Any developer above average is usually obsessively concerned with performance, sometimes at the expense of usability and maintainability, especially if he is part of a larger team that consists of a mix of different kinds of developers who are not necessarily above average, but do know how to copy and paste well. What works for Ayende because of his exceptional design skills will not work for the copyandpasters. They need simple rules that can be engraved on their skulls, not erotic DB arts and crafts that are featured here.
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment62</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment62</guid><pubDate>Thu, 16 Sep 2010 21:24:28 GMT</pubDate></item><item><title>Michael Brown commented on Normalization is from the devil</title><description>I'm not sure if someone has mentioned this already, but this is where separating the transactional db from the reporting db comes in. And this is why CQRS is gaining traction. The problem you bring up (regarding querying highly normalized data) can be solved by exporting the transactional data to a reporting DB.
  
  
I think that is the solution to the read/write compromise. DON'T! Write to a normalized transactional database (or even better an event store), export to and read from a de-normalized query database. It makes everything simpler. The transactional DB remains your ONE TRUTH while the query DB provides quick reads.
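A minimal sketch of that split, assuming a hypothetical SQLite schema in which both sides live in one in-memory database for brevity (in practice they would likely be separate stores, and all table names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Write side: normalized, the single source of truth.
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER REFERENCES Customer(Id));
    CREATE TABLE OrderLine (OrderId INTEGER REFERENCES Orders(Id), Product TEXT, Amount REAL);

    -- Read side: one flat row per order, rebuilt from the write side.
    CREATE TABLE OrderSummary (
        OrderId INTEGER PRIMARY KEY, CustomerName TEXT, LineCount INTEGER, Total REAL
    );

    INSERT INTO Customer VALUES (1, 'Alice');
    INSERT INTO Orders VALUES (100, 1), (101, 1);
    INSERT INTO OrderLine VALUES
        (100, 'Widget', 30.0), (100, 'Gadget', 25.0), (101, 'Widget', 10.0);
""")

def export_read_model():
    # Full rebuild for simplicity; a real export would likely be incremental.
    conn.execute("DELETE FROM OrderSummary")
    conn.execute("""
        INSERT INTO OrderSummary (OrderId, CustomerName, LineCount, Total)
        SELECT o.Id, c.Name, COUNT(*), SUM(ol.Amount)
        FROM Orders o
        JOIN Customer c ON c.Id = o.CustomerId
        JOIN OrderLine ol ON ol.OrderId = o.Id
        GROUP BY o.Id, c.Name
    """)

export_read_model()
# Reads hit the flat table only: no joins at query time.
summary = conn.execute("SELECT * FROM OrderSummary ORDER BY OrderId").fetchall()
print(summary)
```

The joins are paid once at export time rather than on every read, and the read table can be dropped and rebuilt at any point because it holds no data of its own.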
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment61</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment61</guid><pubDate>Wed, 15 Sep 2010 21:18:30 GMT</pubDate></item><item><title>evereq commented on Normalization is from the devil</title><description>Re @Ayende "I deal with denormalized data with NHibernate all the time, that is what the component tag is for." 
  
  
Sure you do :)
  
What I see from other comments is that NOT everybody knows how to deal with it, and especially how to deal with it efficiently (especially with updates in the DB) :)
  
  
Maybe that's a reason why a lot of people refuse denormalization-like strategies (or at least are afraid to use them with ORMs?)
  
  
That is why I see sense in having more information (via your posts, for example) about denormalization in ORMs (at least NHibernate!).
  
  
Sure, it's up to you :)
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment60</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment60</guid><pubDate>Thu, 09 Sep 2010 13:57:08 GMT</pubDate></item><item><title>Ramon Smits commented on Normalization is from the devil</title><description>@Ayende,
  
  
Sorry :) sometimes I type faster than I think. I just meant NF. Like 1NF, 2NF, 3NF, etc.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment59</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment59</guid><pubDate>Thu, 09 Sep 2010 12:42:29 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Ramon,
  
BNF?? Backus–Naur Form ?
  
That is a way to define a syntax for a language, I think that you are talking about something else.
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment58</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment58</guid><pubDate>Thu, 09 Sep 2010 11:53:50 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Evereq,
  
I deal with denormalized data with NHibernate all the time, that is what the component tag is for.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment57</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment57</guid><pubDate>Thu, 09 Sep 2010 11:35:53 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Stephane,
  
That works well in some systems, but it requires that you have two models.
  
And it doesn't help when you need to look at previous data (what was the name in Nov 2009?)
  
And I am actually more concerned with the read perf, not so much with how you store the data
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment56</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment56</guid><pubDate>Thu, 09 Sep 2010 11:32:21 GMT</pubDate></item><item><title>Ramon Smits commented on Normalization is from the devil</title><description>Also, not having the foreign key references explicit is quite normal. They are three separate models with possibly three different databases or even services.
  
  
It is also a very normal design pattern for apps being developed nowadays, with concepts like composite views in, for example, the browser.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment55</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment55</guid><pubDate>Thu, 09 Sep 2010 11:28:23 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>@Frans,
  
a) except that normalization has been drilled so hard into people's heads that it has become not only the default, but the "you must do so". That is something that requires thinking about, for scenarios such as the one that I mentioned in the post.
  
b) even with good ORM support, loading data from a large number of tables is going to be costly.
  
  
For example, consider:
  
  
Order.Payments
  
Order.OrderLines
  
Order.Discounts
  
Order.Payments.Transactions
  
Order.SupportCalls
  
  
Loading all of that is a costly thing, no matter how you do it.
  
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment54</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment54</guid><pubDate>Thu, 09 Sep 2010 11:26:22 GMT</pubDate></item><item><title>Ramon Smits commented on Normalization is from the devil</title><description>I usually like your posts but this is the first post in a long time where I had a wtf ayende what are you thinking :-).
  
  
As a db expert you don't really know about BNF? By the way, in your tiny model I would expect something like a product SKU to use instead of a name as it would be silly to use a name.
  
  
But your post clearly shows that one should first look at the data, then normalize it, and possibly denormalize it for performance reasons. Which is quite logical in most environments.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment53</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment53</guid><pubDate>Thu, 09 Sep 2010 11:25:18 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Frank,
  
You keep a reference to the id, obviously.
  
My point is that "let us make everything a reference" is usually something that requires consideration.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment52</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment52</guid><pubDate>Thu, 09 Sep 2010 11:24:00 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Jonathan,
  
*snort*, I worked extensively with temporal databases, and to be frank, you can't really get a better example of why RDBMS are painful than using them in that scenario.
  
Just to give you an idea, we had a system where every single property had to be temporal. And we had to track the change on each of them independently.
  
That was a PITA, to say the least. And query performance when you had to access those temporal data sucked (show me the employee's salary for Nov 2009, where you had to show the employee name as it was at that time, for example)
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment51</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment51</guid><pubDate>Thu, 09 Sep 2010 11:19:33 GMT</pubDate></item><item><title>Ayende Rahien commented on Normalization is from the devil</title><description>Jim,
  
It isn't ORMs that I am talking about here, it is the relational model.
  
Admittedly, an ORM can mask the cost of traversing the graph, and small objects in the OO world are an anti-pattern in the relational world if you have 1:1 mapping.
  
That is why NHibernate allows you to split a single table into multiple objects, maintaining the best view in both worlds (using the component tag).
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment50</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment50</guid><pubDate>Thu, 09 Sep 2010 11:16:04 GMT</pubDate></item><item><title>Slappy commented on Normalization is from the devil</title><description>@Karhgath: well, I don't want to argue back and forth about it but I still disagree with your definition of fact/truth/knowledge.  You make it sound like they are all different concepts when (in terms of data) they are pretty much all the same.  What gets stored in the database is (presumably) the best-known information at the time the data was written.  If your fact tables have incorrect information in them, they are still fact tables, their contents are just not factual.  If my name is in the database as Swappy, then the system thinks my name is Swappy.  It's a "fact" as far as the system is concerned.
  
  
It doesn't even sound like we disagree on this necessarily, but I don't see how this information sheds any light on the larger discussion.
  
  
I don't think anyone is arguing with the method you describe with your example, except it's not denormalization at all (as MANY have pointed out--I'm not going to rehash that) .  So if the DBAs and architects you work with would argue for NOT storing a snapshot address (or product price, or whatever) on the order/orderline, you are working with DBAs and architects who don't understand TNF.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment49</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment49</guid><pubDate>Wed, 08 Sep 2010 19:43:07 GMT</pubDate></item><item><title>Karhgath commented on Normalization is from the devil</title><description>@Slappy, @evereq
  
  
Slappy is right in saying that normalization's main focus is NOT to save space, which makes Ayende's reasoning invalid. That it saved space was a nice result, and probably a really appreciated one, though.
  
  
It was, however, done mostly for integrity purposes. Integrity was important back then since you had one central DB and many applications or processes accessing it. And often you didn't have applications, just a bunch of people accessing it directly. Putting the rules in each of them and trusting each one to maintain integrity was not a good approach.
  
  
However today, in some fields at least, people are starting to view data as "embedded" in a single business process (and vendor lock-ins are contributing to this a lot). This data is served by a specific application. If you want to manage customers, use the Customer Management platform. No, you cannot come and play with the Customer data without going through the app; we'll share data with you through an external process (services, data replication, batch import, etc.). In that kind of environment, normalization is still important but has a much narrower scope and "raison d'etre".
  
  
This is why I believe Ayende had a valid point, but he missed the target with his reasoning.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment48</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment48</guid><pubDate>Wed, 08 Sep 2010 18:42:59 GMT</pubDate></item><item><title>Karhgath commented on Normalization is from the devil</title><description>@Slappy
  
  
You're right, I should have defined the terms I used. I used the definition of a fact from a data warehouse perspective, which needs to be more scientific: it is a verifiable and provable statement. If your user is misspelled in the DB as "Swappy", it is not the truth, but is used as a fact. The problem in data warehouses is that facts are useless since they aren't truth. s/fact/truth/ in my post. That might be clearer?
  
  
Facts are indeed local to specific applications (local truths). My customer/order/shipping example is indeed very straightforward and no one will probably argue. This highlights the fact that no single database holds the "truth", just knowledge (or the layman definition of fact, which is just a statement: this user is called "Swappy".)
  
  
Today DBAs and architects in large enterprises want to normalize ALL data and enforce data rules in ONE place. It doesn't mean they are right.
  
  
Let's take a DDD principle of boundaries (I'm not a DDD), Order and Customer are 2 different things. You should absolutely normalize what a customer is, 3rd normal form and all, and store it that way. Same for order. But individually and separately.
  
  
You'd probably end up with the same model as if you'd taken everything together, normalized it to 3rd form and then denormalized it based on needs throughout the project. Most denormalization cases arise because you cross data/object/aggregate root boundaries (Order history crosses the boundaries of Products and Customers, for example).
  
  
In Ayende's example, I would have 3 boundaries:
  
- Customer
  
- Order
  
- Product
  
  
I'd have a Customer, which is a nicely normalized customer (with multiple addresses with history, and such).
  
  
I'd have a Product with related normalized tables.
  
  
I'd have the Order, but linked to OrderCustomer (and not Customer), which probably has just a single Shipping Address and a single Billing Address, with only the information needed for the order, no multiple tables with history and such. We can identify that a Customer and an OrderCustomer are the same, and even have some type of constraint, but they aren't normalized together, since a Customer is not an OrderCustomer.
  
  
Then OrderLine, with OrderLineProduct, and so on.
  
  
In my experience, unless you have really simple rules and no scaling or extensibility needs, this is a great way to normalize the DB, and it reduces denormalization to a minimum. This principle has served me well.
  
  
The awesome thing is... no, you DO NOT NEED to update 1000+ tables when you update the address of a Customer. Why? Because an OrderCustomer does not follow the same rules and does not need to be updated the same way; they aren't really linked. This is because an address in the context of a customer and an address in the context of an order do not mean the same thing! Even if they are the same addresses, they are different because of context.
  
  
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment47</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment47</guid><pubDate>Wed, 08 Sep 2010 18:32:48 GMT</pubDate></item><item><title>Slappy commented on Normalization is from the devil</title><description>@evereq 
  
  
&gt;Very strange - we read SAME text, but for some reason READ it differently! :) WHERE EXACTLY you see that Ayende say that ONLY ONE "purpose of normalization" IS for example "compression" ??? 
  
  
But you DO agree that it WAS one of his points, and that point is COMPLETELY INVALID, right?
  
  
&gt;"In essence, normalization is compressing the data" - you do not agree with this??? Take 2 database and compare size of normalized and not normalized! :)
  
  
No one is arguing that.  But the easily inferred conclusion, that one of the major justifications for normalization is/was to save space, is false.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment46</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment46</guid><pubDate>Wed, 08 Sep 2010 14:14:25 GMT</pubDate></item><item><title>evereq commented on Normalization is from the devil</title><description>1) Re @Jeff Darcy: "Since the original claim about the purpose of normalization was false, the remainder based on that claim is meaningless."...
  
  
Very strange - we read SAME text, but for some reason READ it differently! :) WHERE EXACTLY you see that Ayende say that ONLY ONE "purpose of normalization" IS for example "compression" ??? 
  
  
Here is quotes from blog post:
  
  
 "normalization in RDBMS had such a major role because storage was expensive. It made sense to try to optimize this with normalization." Yes, it's TRUE! Not only because of this sure thing (and we all know this), but ALSO because of this!
  
  
"In essence, normalization is compressing the data" - you do not agree with this??? Take 2 database and compare size of normalized and not normalized! :)
  
  
2) Re @Rocky Rocketeer:
  
"Updating 100 places means more lock. If you don't know this problem or not what a DB lock is, you surely lack real world experience in large scale applications.". Sure, agreed: more locks :) But maybe in a whole LOT of applications, MORE locks are actually LESS of a problem than, for example, 100s of JOINs! :D :D :D It is exactly in such situations (and a lot of others) that denormalization works ;-)
  
Ah, and with row-level locks, it has actually become much less of an issue than it was some time ago, when, say, MyISAM supported only whole-table locks :)
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment45</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment45</guid><pubDate>Wed, 08 Sep 2010 13:27:18 GMT</pubDate></item><item><title>Rocky Rocketeer commented on Normalization is from the devil</title><description>Hmm, makes me feel awkward to see that even an educated guy like Ayende knows only a little about real world enterprise work. I agree with all those who identified his misconception about what he wants to store in the tables in the first place.
  
  
Some more thoughts:
  
- Updating 100 places means more lock. If you don't know this problem or not what a DB lock is, you surely lack real world experience in large scale applications.
  
- If you pump billions of transactions per week you begin to care about storage even today.
  
- Indeed you will denormalize for performance. Indexed / materialized views come to mind. It is sometimes necessary to have a real table, but that requires triggers or similar, which should be avoided as long as possible.
  
- If you need historic data, consider implementing one of the known history patterns for sql databases.
  
  
I have the same discussion about NoSQL in huge enterprises. It might work for some cases, but who has real world experience with it in the weird real world DBs with one gazillion processes you are not allowed to change, craftfully mapped to 200+ tables?
  
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment44</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment44</guid><pubDate>Wed, 08 Sep 2010 12:45:16 GMT</pubDate></item><item><title>Jeff Darcy commented on Normalization is from the devil</title><description>I think Rafal said the one thing that really needed to be said: normalization is not a form of compression.
  
  
To elaborate a bit, I doubt that you'd find much in the original relational literature (Codd et al) to support the claim that normalization has any relationship to storage cost.  It's a mathematical, not economic, construct.  Its goal is to reduce duplication of data to avoid anomalies - i.e. inconsistency.  Sometimes you do want to copy the data and allow the copy to change independently, as in the "current price" vs. "price at time of purchase" example.  That's fine, because those are conceptually *different values* and it makes sense to represent or store them separately.  On the other hand, normalization has to do with the *same value* being used in different contexts, in which case it should be represented and stored once - not to reduce storage cost, but so that an update to the value affects all uses of it.
  
  
Since the original claim about the purpose of normalization was false, the remainder based on that claim is meaningless.
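A minimal sketch of the update anomaly described above, assuming a hypothetical SQLite schema (the table names and the email column are illustrative only): when the *same value* is copied onto every row, an update that misses one copy leaves the data inconsistent, whereas a value stored once is seen by every use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same value copied onto every row: the copies can drift apart.
    CREATE TABLE FlatOrder (Id INTEGER PRIMARY KEY, CustomerId INTEGER, CustomerEmail TEXT);
    INSERT INTO FlatOrder VALUES (1, 7, 'old@example.com'), (2, 7, 'old@example.com');

    -- Same value stored once: every use sees the update.
    CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Email TEXT);
    CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER REFERENCES Customer(Id));
    INSERT INTO Customer VALUES (7, 'old@example.com');
    INSERT INTO Orders VALUES (1, 7), (2, 7);
""")

# An update that misses one copy leaves the flat table inconsistent.
conn.execute("UPDATE FlatOrder SET CustomerEmail = 'new@example.com' WHERE Id = 1")
flat_emails = conn.execute(
    "SELECT DISTINCT CustomerEmail FROM FlatOrder WHERE CustomerId = 7 ORDER BY CustomerEmail"
).fetchall()

# One update on the normalized side reaches every order.
conn.execute("UPDATE Customer SET Email = 'new@example.com' WHERE Id = 7")
normalized_emails = conn.execute("""
    SELECT DISTINCT c.Email
    FROM Orders o JOIN Customer c ON c.Id = o.CustomerId
    WHERE o.CustomerId = 7
""").fetchall()

print(flat_emails)        # two different values for one customer
print(normalized_emails)  # a single value
```

Neither table is meaningfully smaller than the other here; the difference is only in whether an inconsistent state can be represented at all.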
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment43</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment43</guid><pubDate>Wed, 08 Sep 2010 12:30:43 GMT</pubDate></item><item><title>Set commented on Normalization is from the devil</title><description>&gt;If you are writing the next MMORPG or something that requires that kind of throughput, you might start making these kinds of tradeoffs from the start, but there are plenty of enterprise applications in the world that rely on RDBMSs and meet all of these requirements.
  
  
From what I recall, WOW backend is oracle...
  
Any financial application is built over oracle/ms sql/sybase.
  
Happily that rdbms can't scale...
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment42</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment42</guid><pubDate>Wed, 08 Sep 2010 09:25:57 GMT</pubDate></item><item><title>evereq commented on Normalization is from the devil</title><description>@Slappy:
  
Re  "False premise in here somewhere. Databases can't perform well, be scalable and efficient and provide security? ":
  
  
Yes, they can! Otherwise why would we have used them for so long?? Actually, "why have we used RDBMS for so long" is a good title for a dedicated blog post :D as it seems that only recently has the community massively started to see OTHER ways, including but not limited to document / object databases etc.!
  
  
BUT sometimes we SHOULD help the database engine meet all the requirements together, not only the data consistency requirements! And denormalization / other technologies just help us in such cases :D
  
  
Reading most of the comments, I understand that a whole lot of developers just haven't come across situations where they must search for other solutions or tweak existing ones... RDBMS work really well up to some extent :) just make sure you know how to help them do their job better :)
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment41</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment41</guid><pubDate>Wed, 08 Sep 2010 09:22:33 GMT</pubDate></item><item><title>Slappy commented on Normalization is from the devil</title><description>@evereq 
  
  
&gt;it's sometimes one way work and sometimes another way (related to where place validation / consistency rules). 
  
  
I do agree with this, of course.  You should indeed consider all tools when building something.  However, for most data-dependent applications, a properly normalized model is the correct choice (or at least a good choice) a very high percentage of the time.  There are a LOT of benefits to enforcing data rules in the database, but I'm sure I don't need to retread those arguments.
  
  
&gt;for me if you do NOT apply 3rd normal form - it's "denormalization" (at least probably and hope to 2nd normal form :D)
  
for you maybe something different :) 
  
  
No, it's not different.  Like many others here, I am saying that it *does not violate TNF*, and is therefore *not* denormalization.
  
  
&gt;b) Re to "If you need to update data in 100 places, but somehow miss something and only update 99, you are going to have a bug that may not present itself for years, and then have the potential for a data problem that can never be fixed." ... yep, you or me can miss whole a lot :D We developers made bugs, but I not sure it's will be really "a problem" to fix such a bug ones you found it ;-) You can always create some additional logic to do checks for consistency in storage, more so in some storage's like Azure Table Storage even MSFT engineers DO implement some "retry" logic to simulate transactions etc etc :D 
  
  
The difference is that a good RDBMS will enforce the rules for you out of the box, whereas if you have to write the code yourself to enforce the rules, it is much more prone to creating these bugs.  Everyone loves the expression "if all you have is a hammer, everything starts looking like a nail".  This is a case of "If you have a hammer and a nail, stop looking all over the place for a goddamn sledgehammer".
  
  
  
&gt;c) Re to "In 99%+ of cases, data is 10000000 times more important than any other aspect of your application, so treating data with the utmost care should always be the first priority of every developer" :) OK, NOT agree :) If you application store data correctly and even enforce all data rules in Database, but does not fit some non-functional requirements like Performance, Scalability, Efficiency or even Security who will use it!??? Yep, you should NOT lost your data in any case, but sorry I did not see how fact that you store data in 100 places can give you some "issues" with possible data lost ;-)
  
  
False premise in here somewhere.  Databases can't perform well, be scalable and efficient and provide security?  If you are writing the next MMORPG or something that requires that kind of throughput, you might start making these kinds of tradeoffs from the start, but there are plenty of enterprise applications in the world that rely on RDBMSs and meet all of these requirements.
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment40</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment40</guid><pubDate>Tue, 07 Sep 2010 22:43:03 GMT</pubDate></item><item><title>Slappy commented on Normalization is from the devil</title><description>@Karhgath: 
  
  
&gt;Holding facts gets exponentially harder the more complex things get because facts aren't actually facts; they are only facts in a specific context (or dimension, or business view, or whatever).
  
  
This is the very definition of a fact in the database.  "Fact", in terms of data, does not mean the same as "fact" in the sense of an absolute certainty (like a mathematical fact).
  
  
&gt;If you really need complex business views, you can use OLAP/star-schema and have dimensions. I've worked in banks with huge OLAP, and still they have to do a lot of work on the data itself because of inaccuracies, incompleteness, errors and such. These are not facts.
  
  
From the database's perspective, these ARE facts.  If my name is spelled incorrectly in the database, it's not a fact?  Something being incorrect or incomplete does not mean it's not a fact (in the database definition).
  
  
I don't understand much of the rest of your post--it seems that most of it is predicated on not understanding the two different definitions of the word "fact".  You also have a whole paragraph of questions that are meant to be rhetorical?  I say that because I don't see the point of these questions.  Because data can get into an inconsistent state in some situations (which can all be mitigated btw), we should abandon trying to achieve consistency?
  
  
Finally, the customer/order example is not a good one, as no one would argue against storing snapshots of information this way.  As others have pointed out, it's not a question of normalization or denormalization; it seems more like a misunderstanding of what those terms mean.
  
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment39</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment39</guid><pubDate>Tue, 07 Sep 2010 22:32:12 GMT</pubDate></item><item><title>evereq commented on Normalization is from the devil</title><description>+1 to @Karhgath: 100% agree - sometimes it works one way and sometimes another (depending on where you place validation / consistency rules). And dogma like "keep all your data consistent at the DB level" (or "Enforce data rules in the database.") sometimes really CAN be "pushed" to the "devil" :D
  
  
@Slappy: 
  
  
a) let's not argue about what to call the technique that Ayende applies :) He describes it the right way - he doesn't mention "denormalization" anywhere at all - instead he said "my default instinct to apply 3rd normal form has been muted..." :D :D :D So you can call it whatever you like - for me, if you do NOT apply 3rd normal form, it's "denormalization" (probably, and hopefully, down to at least 2nd normal form :D); for you, maybe something different :) 
  
  
b) Re to "If you need to update data in 100 places, but somehow miss something and only update 99, you are going to have a bug that may not present itself for years, and then have the potential for a data problem that can never be fixed." ... yep, you or I can miss a whole lot :D We developers make bugs, but I'm not sure it will really be "a problem" to fix such a bug once you've found it ;-) You can always add some logic to check for consistency in storage; what's more, for some storage systems like Azure Table Storage, even MSFT engineers DO implement some "retry" logic to simulate transactions, etc. :D 
  
c) Re to "In 99%+ of cases, data is 10000000 times more important than any other aspect of your application, so treating data with the utmost care should always be the first priority of every developer" :) OK, I do NOT agree :) If your application stores data correctly and even enforces all data rules in the database, but does not meet non-functional requirements like performance, scalability, efficiency or even security, who will use it?! Yep, you should NOT lose your data in any case, but sorry, I don't see how the fact that you store data in 100 places can give you "issues" with possible data loss ;-)
  
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment38</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment38</guid><pubDate>Tue, 07 Sep 2010 19:34:14 GMT</pubDate></item><item><title>Erik Eckhardt commented on Normalization is from the devil</title><description>Someone is smoking dope. As others have said, the "address at time of order" and "product name at time of order" are discrete facts separate from "current address" and "current product name." All you've done is normalized one thing and denormalized something else completely different that is going to hurt you down the road.
</description><link>http://ayende.com/4620/normalization-is-from-the-devil#comment37</link><guid>http://ayende.com/4620/normalization-is-from-the-devil#comment37</guid><pubDate>Tue, 07 Sep 2010 17:29:54 GMT</pubDate></item></channel></rss>