Database Schemas
I was asked to comment on the use of DB schemas, so here it is. The first thing that we need to do is decide what a schema is.
A schema is an organization unit inside the database. You can think about it as a folder structure with an allowed depth of 1. (Yes, just like MS-DOS 1.0). Like folders in the real file system, you can associate security attributes to the schema, and you can put items in the schema. There is the notion of the current schema, and that about it.
Well, so this is what it is. But what are we going to use if for?
People are putting schemas to a lot of usages, from application segregation to versioning. In general, I think that each application should have its own database, and that versioning shouldn't be a concern, because when you upgrade the application, you upgrade the database, and no one else has access to your database.
What we are left with is security and organization. In many applications, the model layout naturally fall out into fairly well define sections. A good example is the user's data (Users, Preferences, Tracking, etc). It is nice to be able to treat those as a cohesive unit for security purposes (imagine wanting to limit table access to the Accounting schema). It is nice, but it is not really something that I would tend to do, mostly because, again, it is only the applications that is accessing the database.
Defense in depth might cause me to have some sort of permission scheme for the database users, but that tends to be rare, and only happen when you have relatively different operation modes.
What I would use schemas for is simply organization. Take a look at Rhino Security as a good example, but default, it will tack its tables into their own schema, to avoid cluttering the default schema with them.
In short, I use schemas mostly for namespacing, and like namespaces elsewhere, they can be used for other things, but I find them most useful for simply adding order.
Comments
Wow... 0 comments. Totally expected a flamewar on this one. Anyhow, totally agree. I'm sure someone will come up with a few technical reasons why schemas help with this or that, but by and large, I think you're right, Ayende.
To throw my 2p worth...
I was recently asked to set up schemas for workflow purposes. One schema was to be called Draft and another was called Live. The schemas effectively separated data in to those workflow stages. We had a sproc to mirror all data in pre-live over to live at the press of a "Publish" button.
This was the domain of content management, so Draft was read/write and Live as read only to the app.
It was neat, but I felt a little like it was going against the grain.
My feeling is that using namespaces like this can get messy: changes to workflow invite more schemas! DRY is violated. Plus every db migration has to be applied across all schemas. And you need to apply tests to all "mirrors" of the schema.
Think I'd prefer to keep schemas reserved for simple namespace as you suggest.
Schema per application only makes sense if your applications are very segregated. But if you are working on internal apps at a company you will have multiple aplications accessing the same data.
Let's say you work at a finance company and you need to represent investments. You also need to represent trades which relate to investments and while there are separate apps that deal only with investments and some with investments and trades, those two things have a relation and are modeled as such in the database, foreign key's etc..
First sentence should be "Database per application", i.e. perhaps a database for a business unit might include multiple apps and make more sense than trying to keep a separate one per app and not be able to use the nice features of a database such as foreign keys, etc. that start to cause hassles when you have to cross databases.
Comment preview