@framp
Last active August 29, 2015 14:05
Weird thoughts about a weird problem

http://stackoverflow.com/questions/25415424/whats-an-efficient-way-to-store-a-time-series

Group the data by TYPE (what I'd like, ideally)
A timestamp in ms is 8 bytes, and the 2 floats can be 4-byte reals, so each record takes 16 bytes
16 * 3091472167 = 49463554672 bytes ~ 46 GiB, nearly 3 times the size of the zipped files
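
A quick sketch to check the arithmetic above. The exact field layout (little-endian, timestamp first, no padding) is an assumption made for illustration; the notes only give the field sizes.

```python
import struct

# Assumed layout: one 8-byte integer timestamp (ms) followed by two 4-byte floats.
# "<" means little-endian with no padding, so the size is exactly 8 + 4 + 4.
RECORD = struct.Struct("<qff")

N_RECORDS = 3091472167  # record count taken from the notes

total_bytes = N_RECORDS * RECORD.size
total_gib = total_bytes / 2**30

print(RECORD.size, total_bytes, round(total_gib, 2))
```

Packing a single record with `RECORD.pack(timestamp_ms, a, b)` yields the 16 bytes counted here, though note that a 4-byte float only keeps about 7 significant decimal digits.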

Group by month
Each timestamp is now an offset between 0 and 2678400000 ms (31 days), which fits a 4-byte unsigned integer (it is too large for a signed one)
The second float is stored as a 2-byte smallint holding its difference from the first float
All the types are columns
Assuming every timestamp has information about all 14 types
(4 + (4+2)*14) * 3091472167/14 = 19432110764 bytes ~ 18 GiB, still worse than zip
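
The same estimate as a sketch, under the assumptions stated in the notes (14 types, every timestamp carrying all of them). The row layout and the names below are hypothetical, chosen only to reproduce the byte counts.

```python
import struct

N_TYPES = 14
N_RECORDS = 3091472167  # record count taken from the notes
MS_PER_31_DAYS = 31 * 24 * 60 * 60 * 1000

# Assumed row: a 4-byte unsigned offset into the month, then for each type
# a 4-byte float plus a 2-byte signed integer (the smallint difference).
ROW = struct.Struct("<I" + "fh" * N_TYPES)

# The longest month must fit an unsigned 32-bit integer.
fits_uint32 = MS_PER_31_DAYS <= 2**32 - 1
fits_int32 = MS_PER_31_DAYS <= 2**31 - 1

# One row per timestamp, so the row count is N_RECORDS / N_TYPES.
total_bytes = ROW.size * N_RECORDS // N_TYPES
total_gib = total_bytes / 2**30

print(ROW.size, fits_uint32, fits_int32, total_bytes, round(total_gib, 2))
```

A 2-byte difference only holds integers from -32768 to 32767, so this layout presumably needs the difference scaled to a fixed precision first; the notes do not say how.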