Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969


Time series feature design: System behavior


It is easy to forget that a database isn’t just about storing and retrieving data. A lot of work goes into the actual behavior of the system beyond the storage aspect.

In the case of a time series database, just storing the data isn’t enough, because we very rarely actually want to access all the data. If we have a sensor that sends us a value once a minute, that comes to 43,200 data points per 30-day month. There is very little that we actually want to do with that amount of raw data. Usually we want to do things over some rollup of the data. For example, we might want to see the mean per day, or the standard deviation on a weekly basis, etc.
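To make the rollup idea concrete, here is a minimal sketch in Python. It is not how any particular database implements this. The `rollup`, `by_day` and `by_iso_week` names are hypothetical, the sensor values are made up, and everything is held in memory purely for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def rollup(points, bucket_of, aggregate):
    """Group (timestamp, value) pairs by bucket_of(timestamp), then aggregate each group."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[bucket_of(ts)].append(value)
    return {key: aggregate(values) for key, values in sorted(buckets.items())}


def by_day(ts):
    # Bucket key: the calendar date of the reading.
    return ts.date()


def by_iso_week(ts):
    # Bucket key: (ISO year, ISO week number).
    year, week, _ = ts.isocalendar()
    return (year, week)


# One made-up reading per minute over a 30-day month: 30 * 24 * 60 = 43,200 points.
start = datetime(2014, 4, 1)
points = [(start + timedelta(minutes=i), float(i % 60)) for i in range(30 * 24 * 60)]

daily_mean = rollup(points, by_day, mean)          # 30 values instead of 43,200
weekly_std = rollup(points, by_iso_week, pstdev)   # one value per ISO week
```

The point is the reduction in volume: the caller asks for 30 daily means, not 43,200 raw values. A real system would most likely compute this on the server side rather than ship the raw points to the client.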

We might also want to do some downsampling. By that I mean that we take a series whose values are stored on a per minute / second basis, store just the per day totals, and delete the old raw data to save space.
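A rough sketch of that downsampling step, again in Python and under stated assumptions: the `downsample` function and the `cutoff` parameter are hypothetical. The idea of keeping recent data raw and only collapsing points older than a cutoff is my assumption about how "old data" would be chosen.

```python
from datetime import datetime, timedelta


def downsample(raw, cutoff):
    """Replace points older than cutoff with per-day totals; keep newer points raw.

    Returns (daily_totals, kept), where daily_totals maps a date to the sum of
    the old values for that day, and kept is the list of raw points at or
    after the cutoff.
    """
    daily_totals = {}
    kept = []
    for ts, value in raw:
        if ts < cutoff:
            day = ts.date()
            daily_totals[day] = daily_totals.get(day, 0.0) + value
        else:
            kept.append((ts, value))
    return daily_totals, kept


# Three days of made-up per-minute readings, each with value 1.0.
start = datetime(2014, 2, 14)
raw = [(start + timedelta(minutes=i), 1.0) for i in range(3 * 24 * 60)]

# Collapse the first two days into totals and keep the last day as-is.
totals, recent = downsample(raw, cutoff=start + timedelta(days=2))
```

Here 2,880 old points shrink to two daily totals, while the most recent 1,440 points remain available at full resolution. Note that the operation is lossy: once the raw data is deleted, only the aggregates that were stored can be recovered.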

The reason that I am using time series data for this series of blog posts is that there really isn’t all that much that you can do with time series data, to be honest. You store it, aggregate over it, and… that is about it. Users might be able to do derivations on top of that, but that is out of scope for a database product.

Can you think of any other behaviors that the system needs to provide?

More posts in "Time series feature design" series:

  1. (04 Mar 2014) Storage replication & the bee’s knees
  2. (28 Feb 2014) The Consensus has dRafted a decision
  3. (25 Feb 2014) Replication
  4. (20 Feb 2014) Querying over large data sets
  5. (19 Feb 2014) Scale out / high availability
  6. (18 Feb 2014) User interface
  7. (17 Feb 2014) Client API
  8. (14 Feb 2014) System behavior
  9. (13 Feb 2014) The wire format
  10. (12 Feb 2014) Storage
