Gorilla Compression algorithm
https://blog.acolyer.org/2016/05/03/gorilla-a-fast-scalable-in-memory-time-series-database/
https://www.timescale.com/blog/time-series-compression-algorithms-explained/
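A small sketch of the delta-of-delta idea that the Gorilla paper describes for timestamps: store the first timestamp, then the first delta, then only the change between consecutive deltas. This shows the arithmetic only. The real scheme also packs these numbers into variable-length bit fields, which is left out here. The function names and sample timestamps are made up for illustration.

```python
def delta_of_delta(timestamps):
    """Encode as [first timestamp, first delta, then deltas of deltas]."""
    if len(timestamps) < 2:
        return list(timestamps)
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return [timestamps[0], deltas[0]] + dods


def decode(encoded):
    """Rebuild the original timestamps from the encoded list."""
    if len(encoded) < 2:
        return list(encoded)
    timestamps = [encoded[0], encoded[0] + encoded[1]]
    delta = encoded[1]
    for dod in encoded[2:]:
        delta += dod  # recover the next delta from the delta-of-delta
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# Hypothetical readings every 600 seconds, with one arriving 2 seconds late.
ts = [36000, 36600, 37200, 37802, 38400]
encoded = delta_of_delta(ts)
print(encoded)          # [36000, 600, 0, 2, -4]
print(decode(encoded))  # [36000, 36600, 37200, 37802, 38400]
```

When readings arrive at a steady interval, most of the encoded values are 0 or close to it, and small numbers need very few bits to store.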
Richard Feynman Technique
RIVERS principle (check Addy Osmani's post)
Learn basics, basic principles, foundation principles, first principles. A good and solid foundation is key!
Data recorded over a period of time is time series data
Stock price over a period of time.
Weather over a period of time.
CPU usage over a period of time.
For CPU usage over a period of time, an example might look like this:
75% at 10am
74% at 10:10am
73% at 10:20am
Not sure.
One way is to just write ✍️ down the values with corresponding time values. That would be a simple written representation like the above CPU usage example. Another way is to plot the data values against time values on a graph. The time is on the X axis and the data is on the Y axis
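A minimal sketch of the written representation, as (time, value) pairs in Python. The date is arbitrary and chosen only to make the timestamps complete. The names cpu_usage and format_series are made up for illustration.

```python
from datetime import datetime

# Hypothetical sample: the CPU usage readings from the example above.
cpu_usage = [
    (datetime(2024, 1, 1, 10, 0), 75.0),
    (datetime(2024, 1, 1, 10, 10), 74.0),
    (datetime(2024, 1, 1, 10, 20), 73.0),
]


def format_series(series):
    """Return one 'HH:MM  value%' line per data point, in time order."""
    ordered = sorted(series, key=lambda point: point[0])
    return [f"{ts:%H:%M}  {value:.0f}%" for ts, value in ordered]


for line in format_series(cpu_usage):
    print(line)
```

The same list of pairs is what a plotting library would take: the first element of each pair goes on the X axis and the second on the Y axis.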
It's open source software. It's a relational database. It runs on top of PostgreSQL as an extension.
What does Timescale mean when it says it can be used for analytical purposes too? Is it an Online Analytical Processing (OLAP) DB? Or Online Transactional Processing (OLTP) DB or hybrid? HTAP
What's the compatibility matrix of Timescale and PostgreSQL? Are all versions of Timescale compatible with all versions of PostgreSQL?
How to install Timescale extension on PostgreSQL? Like any other extension? Or are there any exceptions?
How easy is it to upgrade PostgreSQL and Timescale independently after installing Timescale? And after inserting data into the database? Inserting data using Timescale SQL features
How can we use Timescale's features using SQL?
What are hyper functions? Why are they called that? "Hyperfunctions"
How do hyperfunctions make time series easier? Why should it be easy? Does easy come with tradeoffs?
How would one go about storing, updating and retrieving (and deleting) time series data with vanilla PostgreSQL without Timescale extension? Can it be better than Timescale? How is Timescale 10x to 100x faster?
Is Timescale faster than vanilla PostgreSQL, Influx DB, MongoDB? 10x faster? 100x faster? How? Also, data? Benchmark results?
How does one store time series data in Influx DB, in MongoDB?
"Write millions of data points per second per node. Horizontally scale to petabytes. Don’t worry about cardinality."
How is it possible to write ✍️ millions of data points per second per node? What would be the kind and size of data, and the kind, size and architecture of the node? How come we can do horizontal scaling with Timescale DB? Is it true? Or are there catches in horizontal scaling? Also, what is cardinality? And why would anyone worry about cardinality?
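On the cardinality question: in time series databases the term usually means the number of distinct series, that is, the number of unique combinations of tag (label) values. A rough sketch of counting it, with made-up rows and a made-up helper name:

```python
# Hypothetical metric rows: each row carries tags plus a reading.
rows = [
    {"host": "a", "region": "eu", "cpu": 75.0},
    {"host": "a", "region": "eu", "cpu": 74.0},
    {"host": "b", "region": "eu", "cpu": 60.0},
    {"host": "b", "region": "us", "cpu": 61.0},
]


def series_cardinality(rows, tag_keys):
    """Count distinct combinations of tag values, i.e. distinct series."""
    return len({tuple(row[key] for key in tag_keys) for row in rows})


print(series_cardinality(rows, ["host"]))            # 2
print(series_cardinality(rows, ["host", "region"]))  # 3
```

Each extra tag can multiply the number of series, which is one commonly cited reason people worry about cardinality: whatever the database tracks per series grows with it.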
How is it possible to scale and store petabytes of data in Timescale DB?
Pros and cons of Timescale? Usage, cost, operational ease etc
What are hyper tables? In Timescale DB
What are the different problems that Timescale DB team faced while building Timescale DB and how did they solve them?
What's the difference between Timescale V1 and Timescale V2?
How has Timescale DB evolved over time?
In what language is Timescale DB written and why? History? Reasoning?
Why did Timescale DB team build Timescale? How did it happen?
How good is Timescale DB? Its performance, stability, reliability? Benchmarks and other results?
Who are Timescale DB users? Which big and small companies (startups etc) use them? How do they use them? What are they saying about their experience and usage of Timescale DB? Pros, cons. Problems. Advantages, disadvantages. Happy things. Sad things. Annoying things. Blog posts about issues and resolutions at tech level or process level etc
Which users (big and small companies, startups, popular individuals) endorse and advocate for Timescale DB? Why?
What are the features required in a DB, especially a time series DB?
Different operations people perform while working with time series data?
Does Timescale DB do indexing? Searching? Sharding? Does it have High Availability? Cluster mode? Multi node? Multi master?
How does Timescale help with cost reduction / optimized cost?
How does Timescale compress data? Does it compress time series data? Or all kinds of data? What compression algorithm(s) does it use? Can we use custom algorithms?
How does Timescale decouple compute and storage? Sounds like serverless DB 🤔🤨
What kind of data retention policies can we enforce on the time series data in Timescale?
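To pin down what a retention policy means in general: data older than some age gets dropped. A toy sketch of that idea on an in-memory list, not Timescale's actual mechanism. The function name and sample data are made up for illustration.

```python
from datetime import datetime, timedelta


def apply_retention(series, now, keep_for):
    """Keep only points whose timestamp is within keep_for of now."""
    cutoff = now - keep_for
    return [(ts, value) for ts, value in series if ts >= cutoff]


# Hypothetical readings; the date is arbitrary.
readings = [
    (datetime(2024, 1, 1, 10, 0), 75.0),
    (datetime(2024, 1, 1, 10, 10), 74.0),
    (datetime(2024, 1, 1, 10, 20), 73.0),
]

now = datetime(2024, 1, 1, 10, 20)
kept = apply_retention(readings, now, timedelta(minutes=15))
print(kept)  # the 10:10 and 10:20 readings remain
```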
What is downsampling?
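Downsampling generally means replacing many fine-grained points with fewer coarse ones, for example one average per 30 minutes instead of one reading per 10 minutes. A minimal sketch, assuming the bucket size in minutes divides 60 evenly. The function name and the readings are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def downsample_mean(series, bucket_minutes):
    """Average the values that fall into each bucket_minutes-wide bucket.

    Assumes bucket_minutes divides 60, so buckets align to the hour.
    """
    buckets = defaultdict(list)
    for ts, value in series:
        # Floor the timestamp to the start of its bucket.
        start = ts.replace(
            minute=ts.minute - ts.minute % bucket_minutes,
            second=0,
            microsecond=0,
        )
        buckets[start].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Hypothetical CPU readings every 10 minutes; the date is arbitrary.
readings = [
    (datetime(2024, 1, 1, 10, 0), 75.0),
    (datetime(2024, 1, 1, 10, 10), 74.0),
    (datetime(2024, 1, 1, 10, 20), 73.0),
    (datetime(2024, 1, 1, 10, 30), 70.0),
    (datetime(2024, 1, 1, 10, 40), 72.0),
    (datetime(2024, 1, 1, 10, 50), 74.0),
]

for start, mean in downsample_mean(readings, 30):
    print(f"{start:%H:%M}  {mean:.1f}%")
```

Six points become two. The detail inside each half hour is lost, which is the tradeoff for storing and scanning less data.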
What's the difference between managed Timescale DB vs self hosted Timescale DB? In terms of cost (price, management costs like paying the team managing the Timescale DB). Why would anyone go with self hosted Timescale DB? Self hosted Timescale DB can be a bummer if not done well, right? That would increase costs due to issues / errors / problems. (Error budget etc)
What are users saying about the self hosted Timescale DB and the managed Timescale DB? Given the managed service is paid, Timescale DB could be built to be hard to operate too, to incentivise getting the managed service which has good management services etc
Is there an enterprise version of Timescale DB? That has features different from open source Timescale DB. Like Influx DB vs paid Influx DB
Behind the scenes in Timescale DB, how does it provide all the features that it provides? The algorithm, the mechanism, the logic, math, research etc
Below are the features of Timescale DB on the Cloud:
Time-series Analytics
Data Lifecycle management
Operational management
Goal: Teach Timescale DB to a newbie audience
About the audience: The audience have heard the term database, that's all. They don't know databases, database internals or time series database or Timescale DB
Questions that might come up from the audience:
Content formats for explanations
What are some examples of Time Series Databases?
OpenTSDB, Influx DB, Graphite DB, Timescale DB, Goku DB (by Pinterest)