Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Designing a document database: Attachments

In a previous post, I asked about designing a document DB, and brought up the issue of attachments, along with a set of questions that need to be answered:

  • Do we allow them at all?

We pretty much have to; otherwise users will stick them into the document directly, resulting in very inefficient use of space (binaries in JSON format suck).
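To put a rough number on that inefficiency, here is a minimal sketch. It assumes the binary would be embedded as a base64 string, which is the usual workaround but is not something this post specifies; the `json_overhead` helper and the `attachment` field name are made up for illustration.

```python
import base64
import json
import os


def json_overhead(payload: bytes) -> float:
    """Return the size of a JSON document embedding the payload, relative to the raw payload."""
    # Assumption: the binary is embedded as base64 text, since JSON has no binary type.
    encoded = base64.b64encode(payload).decode("ascii")
    document = json.dumps({"attachment": encoded})
    return len(document.encode("utf-8")) / len(payload)


# A 300,000 byte blob of random data standing in for an image or PDF.
payload = os.urandom(300_000)
ratio = json_overhead(payload)
print(f"JSON document is {ratio:.2f}x the size of the raw attachment")
```

Base64 turns every 3 bytes into 4 characters, so the document ends up roughly a third larger than the file itself, before counting the cost of parsing that string every time the document is read.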

  • How are they stored?
    • In the DB?
    • Outside the DB?

Storing them in the DB will lead to very large databases. There is also the simple question of whether a document DB is the appropriate storage for BLOBs. I think there are better alternatives for that than the document DB: things like Rhino DHT, S3, the file system, a CDN, etc.

  • Are they replicated?

Out of scope for the document DB, I am afraid. That depends on the external storage that you choose.

  • Should we even care about them at all? Can we apply SoC and say that this is the task of some other part of the system?

Yes we can and we should.

However, we still want to be able to add attachments to documents. I think we can resolve this pretty easily by adding the notion of document attributes. That would allow us to add external information to a document, such as the attachment URLs. Attributes should be used for things that are related to the actual document, but are conceptually separate from it.

An attribute would be a typed key/value pair, where both the key and the value are strings. The type is an additional piece of information that says what kind of attribute this is. This will allow us to do things like add relations, specify attachment types, etc.
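As a rough sketch of what that could look like (not an actual API; the `Attribute` and `Document` classes, the `attributes_of_type` helper, the type names and the URL are all hypothetical), a document carries a list of typed key/value attributes alongside its body:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Attribute:
    """A typed key/value pair; key and value are both plain strings."""
    type: str   # what kind of attribute this is, e.g. "attachment" or "relation"
    key: str
    value: str


@dataclass
class Document:
    id: str
    body: dict
    # Attributes live next to the document body, not inside it.
    attributes: List[Attribute] = field(default_factory=list)

    def attributes_of_type(self, type_: str) -> List[Attribute]:
        """Return all attributes with the given type, in insertion order."""
        return [a for a in self.attributes if a.type == type_]


doc = Document(id="invoices/1", body={"total": 120})
# The attachment itself lives in external storage; only its URL is recorded here.
doc.attributes.append(
    Attribute("attachment", "scan.pdf", "https://files.example.com/invoices/1/scan.pdf")
)
doc.attributes.append(Attribute("relation", "customer", "customers/42"))

for attachment in doc.attributes_of_type("attachment"):
    print(attachment.key, "->", attachment.value)
```

The document body stays pure JSON, and the database never has to know what is behind the URL.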

More posts in "Designing a document database" series:

  1. (17 Mar 2009) What next?
  2. (16 Mar 2009) Remote API & Public API
  3. (16 Mar 2009) Looking at views
  4. (15 Mar 2009) View syntax
  5. (14 Mar 2009) Aggregation Recalculating
  6. (13 Mar 2009) Aggregation
  7. (12 Mar 2009) Views
  8. (11 Mar 2009) Replication
  9. (11 Mar 2009) Attachments
  10. (10 Mar 2009) Authorization
  11. (10 Mar 2009) Concurrency
  12. (10 Mar 2009) Scale
  13. (10 Mar 2009) Storage

Comments

Ayende Rahien

You do realize that I am not going to use SQL Server as the backend, right?

josh

Do you want to support change tracking or version info for the attachments? Or allow multiple attachments per document? Both those are design considerations. For simplicity, I'd assume a user has the same rights on the attachment as the document, but you may think otherwise.

Ayende Rahien

Multiple attachments, certainly.

Version info? I don't think so.

Permissions might be an interesting problem if we store this externally.

josh

I certainly recommend the external storage approach. It worked out well for stuff I've done in the past.

Rafal

Versioning attachments is easy if you track not only versions of the document body, but also its metadata and treat the attachment files as immutable.
