There ain’t no such thing as the definitive entity definition
I was at a customer site, and we were talking about a problem they had with modeling their domain. Actually, we were discussing a proposed solution, a central and definitive definition for all of their entities, so all of the applications could use that.
I had a minor seizure upon hearing that, but after I recovered, I was able to articulate my objections to this approach.
To start with, it breaks the Single Responsibility Principle, the Open Closed Principle and the Interface Segregation Principle. It also makes versioning hard, and introduces a central place that everyone must coordinate with. Think about the number of people that have to be involved whenever you make a change.
Let us take the customer as the representative entity for this discussion. We can all agree that a customer has to have a name, an email and an id. But billing also needs to know his credit card information, help desk needs to track what support contracts he has and sales needs to know what sort of products we sold the guy, so we can sell him upgrades.
Now, would you care to be the guy who has to mediate between all of those different concerns?
And what about changes and updates? Whenever you need to make a change, you have to wait for all of those teams and applications to catch up, update and deploy their apps.
And what about actual usage? You actually don’t want the help desk system to be able to access the billing information, and you most certainly don’t want them to change anything there.
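To make that coupling concrete, here is a minimal Python sketch of what a single shared customer definition tends to look like. Everything in it is hypothetical and invented for illustration (the `SharedCustomer` class, its field names and the `help_desk_add_contract` function); none of it comes from the customer's actual system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SharedCustomer:
    """One 'definitive' customer used by every application (hypothetical)."""
    # The part everyone agrees on.
    id: int
    name: str
    email: str
    # Billing's concern.
    credit_card_number: str = ""
    # Help desk's concern.
    support_contracts: List[str] = field(default_factory=list)
    # Sales' concern.
    products_sold: List[str] = field(default_factory=list)


def help_desk_add_contract(customer: SharedCustomer, contract: str) -> None:
    """Help desk code only needs support contracts, yet it receives
    the whole entity, billing fields included."""
    customer.support_contracts.append(contract)


customer = SharedCustomer(id=1, name="Ann", email="ann@example.com",
                          credit_card_number="4111-xxxx")
help_desk_add_contract(customer, "gold-support")

# Nothing in the shared type stops help desk code from reading
# or overwriting billing data.
customer.credit_card_number = "overwritten-by-help-desk"
```

Every team's fields end up on the same class, so any change to one team's part touches a type that all the other teams depend on, and any consumer can reach any field.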
And does it matter if we have concurrent modifications to the entity by both help desk and billing?
All of those things argue very strongly against having a single source of truth about what an entity is. In my next post, I’ll discuss a solution for this problem, Composite Entities.
Comments
"Now, would you care to be the guy who has to mediate between of all of those different concerns?"
No, and I agree that having an actual entity in the form of a central library wouldn't work, but there needs to be a central schema that people can look up for attribute definitions. And the person to maintain such a schema would be the programming equivalent of a librarian, or a metadata specialist. Sometimes this is called a data dictionary.
Otherwise you get two different departments interpreting 'contract' in their own ways. We face this issue constantly.
I am not sure how deeply such a schema could be usefully integrated with the information system. For example you could limit it to people who need to look up concepts, you can have it drive documentation schema, use it for hbm file validation, auto-generate UML...
Surely - bounded contexts, golden source & service bus...
In my experience the desire for a single entity reference model emerges from application/reporting integration problems. Is that the case here too?
Hmm so it's bad to have a single shared library across multiple applications? No surprise that's exactly how my current job does it to "make things easier".
I'm guessing the solution is going to be having slightly different entities based on each functional module? So the CRM module has a Customer class with only what the CRM needs from a Customer; the AR module has a Customer class that might have the other properties from CRM but also Payment information; the Helpdesk module has a Customer class with support ticket information.
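A rough Python sketch of what this comment describes might look like the following. The class and field names are hypothetical, and the source does not confirm that this matches the Composite Entities approach promised in the post; it only illustrates the "one Customer class per module" guess.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrmCustomer:
    """What the CRM module needs from a customer (hypothetical)."""
    id: int
    name: str
    email: str


@dataclass
class ArCustomer:
    """What the AR module needs: identity plus payment information."""
    id: int
    name: str
    payment_method: str


@dataclass
class HelpDeskCustomer:
    """What the help desk module needs: identity plus support tickets."""
    id: int
    name: str
    open_tickets: List[str] = field(default_factory=list)


# The same real-world customer, seen by three modules.
# Only the id is shared across them.
crm_view = CrmCustomer(id=42, name="Ann", email="ann@example.com")
ar_view = ArCustomer(id=42, name="Ann", payment_method="invoice")
help_desk_view = HelpDeskCustomer(id=42, name="Ann")
help_desk_view.open_tickets.append("Cannot log in")
```

Each module can add or change its own fields without coordinating with the others, and help desk code never sees payment data because its class has no such field.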
Shashi, if you're using some sort of SOA, you can just worry about the service definitions for the services you depend on. WSDL or some such, possibly; or a client library.
Aside from those interface boundaries, everything is internal to a particular service. Standardize how you define your domain classes and leave it at that.
In my experience some domains can be much more intertwined than it might initially seem. In your example, maybe some senior help desk personnel SHOULD be able to see billing info to help customers with issues in the billing process. And billing might want to be able to pull up an old bill and see any help desk information associated with it.
If you know you are going to have a very closely knit set of applications that need most of the same data and logic around that data it makes sense to have a central definition and data store.
The single source of truth should exist: it is the database. Its representation to the application(s) can vary, though.
Tyler, I believe you're right: senior help desk personnel should be able to see the billing info in some cases, but this can be done using composition. You could compose several contexts in the UI in such a way that a user with a higher role can see information from different contexts and act on it, each context with its own stuff (UI part, domain part, etc.). I believe that having only one entity comes from the data side of things: entities as data holders, especially if they're 'generated' from the database using some kind of tool. I find the responsibilities of entities easier to see when they have behavior, so you don't end up with XXXEntity-Manager/Service classes holding ALL the methods that such a big entity might want. What do you think?
Shashi, And have to wait for someone else because I need to track some additional bit of info for one particular app? That sounds extremely inefficient.
Tyler, In a single app? That is usually not the way it goes. Sure, you may have UI composition that makes it possible, but there are very few apps that handle both help desk and billing as part of the same system. And there are even fewer organizations that do that.
Tobi, You make an assumption that there IS a single database. What if you have ten?
You should try integrating into SAP. They have a couple of API document definitions (i.e. Business_Partner and Order_Save) which represent the twisted mother of all canonical data models.
Splitting across databases is to be avoided. You don't do this without reason.
While reading the post, I was thinking: but but but, sure it's a common pitfall, but there are legitimate use cases, and a good way to implement something like that. And then I read the last two words: "Composite Entities" and smiled. Looking forward to the next post on this.
@tobi - It happens all the time. Companies grow organically. There is no one person dictating technology purchases across all business units. Consider 3rd party reporting databases. Or mergers and acquisitions. Sometimes departments have their own embedded IT people. Sometimes you just write a "stop gap" solution and eventually it becomes business critical. One of my customers had 13 definitions of what a "policy" is. It's cost prohibitive to undo that. You stop the bleeding and move on to the next most important business problem.
I've come across this too, and I've had a difficult time convincing people that sharing entities is a bad approach. In my experience, a hand-written DAL is the primary reason why people want to share entities, because writing DA code is expensive. Once you convince them to throw away the hand-written DAL, the next step is to convince them that sharing entities is a bad thing because behaviors change depending on the context they are used in.
Code should not be re-used because of the data shape, nor because of cross cutting concerns such as data access.
SRP trumps DRY in almost every case.
Tobi, I have multiple applications, developed by separate teams, deployed at different times, managed by different parts of the org. You REALLY want to have as few deps between them as possible.
You can't have definitive anything anywhere, unless time freezes and stays that way until the end of time.
However, having multiple entities that you bring to a common denominator while still allowing specific extensions could be a blessing in an environment where everything has to be integrated together.
But changing entity definitions in every system to a common denominator would be a deadly blow to the IT budget of every organization in existence.
That's why integration gateways and ESBs have been invented... and I can see where a Document Database (like Raven) would fit really nicely into this story :)
@Ayende - I would be interested in hearing a comparison ( or at least your thoughts on it) of the use of a common information domain.
Specifically I am referring to something like TMF SID.
Alistair, I don't believe that you CAN have a common domain, not in a single org, and certainly not in multiple orgs.
What if you have a large amount of different things happening all within the context of a single application (e.g. a single Intranet website that covers help desk, payroll and a host of other tools and products)?
cbp, You split this into multiple applications and compose them at the UI level.
Udi Dahan talks about this all the time at his presentations and, I'm told, at his courses. His last blog post a few days ago was on the very same topic.
I'm looking forward to comparing how Ayende sees things, if any different.
Ayende, you have convinced me.
@Christopher: That would work, if everything were available via services. It doesn't work if you have everything from MS Access, Java enterprise apps to MVC 3, PHP, etc.
@Ayende:
For the most part, no, but it depends, and that's why I talk about 'how deeply do you want to integrate?'
For example, where I work (Big10 university), certain pieces of data have classification levels for compliance reasons, which determine how they are allowed to be used/exposed in applications.
If a new piece of information you want to track falls into a compliance bucket, you are better off waiting for a definitive answer than releasing to production and finding that being compliant will force feature changes.
A shared library won't work, but a mechanism to ensure you are doing the right thing is necessary. We have data stewards to provide guidance for this purpose.