Skip to content

Instantly share code, notes, and snippets.

@phpfour
Created July 29, 2022 00:38
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save phpfour/575f408ccd09311ae8927b7232be6a39 to your computer and use it in GitHub Desktop.
Save phpfour/575f408ccd09311ae8927b7232be6a39 to your computer and use it in GitHub Desktop.
Logging DB analysis

Current Implementation

  • Currently, all log information is in the logging table and as of now it has around ~2,13,92,493 records
  • Total size of data is ~36 GB (retrieved from information_schema table)
  • A record can have a size between 1KB to 10KB based on the amount of data stored.
  • With the current indexes in the table, logs for a single user can be fetched pretty quickly and uses const ref
mysql> EXPLAIN SELECT * FROM `logging` WHERE `member_id` = 45373 ORDER BY `id` DESC LIMIT 100 OFFSET 0;
+----+-------------+---------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-------------+
| id | select_type | table   | partitions | type | possible_keys             | key                       | key_len | ref   | rows | filtered | Extra       |
+----+-------------+---------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | logging | NULL       | ref  | logging_member_id_foreign | logging_member_id_foreign | 5       | const |    1 |   100.00 | Using where |
+----+-------------+---------+------------+------+---------------------------+---------------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.05 sec)

Alternative Datastores

I have tried to see which of the available datastores might fit the need, here's an overview:

Name Description PHP/Laravel Package Recommend
ClickHouse ClickHouse® is an open-source, high performance columnar OLAP database management system for real-time analytics using SQL. https://github.com/smi2/phpClickHouse https://github.com/bavix/laravel-clickhouse Yes
Elasticsearch Elasticsearch, Kibana, Beats, and Logstash (also known as the ELK Stack). Reliably and securely take data from any source, in any format, then search, analyze, and visualize it in real time. https://github.com/elastic/elasticsearch-php https://github.com/sleimanx2/plastic Yes
PrestoDB Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. No well-maintained PHP client No, lacks good integration
InfluxDB At its heart, InfluxDB is a database purpose-built to handle the epic volumes and countless sources of time-stamped data produced by sensors, applications and infrastructure. https://github.com/influxdata/influxdb-client-php No, unrelated to our need
Apache HBase Apache HBase™ is good when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. No well-maintained PHP client, requires Throttle protocol No, lacks good integration
Tarantool Tarantool keeps all the data in random-access memory (RAM). Solutions of this class are fast, but they often demonstrate a lack of data integrity. Tarantool solves all these problems. https://github.com/tarantool-php/client No, unrelated to our need

Suggestions

  • Keep using MySQL in a separate, dedicated database and if needed in the future, introduce sharding data based on some logic (i.e. year).

  • Introduce Elasticsearch, which has a good official PHP client and popular Laravel packages. It also has other bundled tools for data visualization which might come in handy.

  • Introduce ClickHouse, which has a community-developed PHP client and less-used Laravel package. It features SQL like syntax so it can be used for financial data analytics and dashboard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment