Hashtopolis hack script to work around DB resource limitation
<?php
/*
This script is a temporary hack to work around a Hashtopolis DB resource limitation.
Original thread: https://github.com/hashtopolis/server/issues/644
Script location: https://gist.github.com/tosiara/a0c4ebfb2e0707fc1920cf204f6cde0f
tl;dr: when a task has too many chunks in the table, apache will fetch them all
and process them in a loop in order to calculate task progress - this will cause
out of memory at some point, depending on the server's RAM and CPU.
This script will query the DB, find a continuous sequence of chunks (a chain) and
merge it into one big (lengthy) chunk.
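For illustration (made-up numbers): processed chunks (skip=0, length=100),
(skip=100, length=50) and (skip=150, length=25) form a continuous chain that
can be merged into a single chunk (skip=0, length=175); a next chunk with
skip=300 would break the chain, because the range 175..300 is not covered.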
*/
require_once("inc/conf.php");

$conn = new mysqli($CONN["server"], $CONN["user"], $CONN["pass"], $CONN["db"]);
if ($conn->connect_error)
    die("Could not connect to DB: " . $conn->connect_error . "\n");

if (!isset($argv[1]))
    die("Usage: php " . $argv[0] . " taskId\n");

// taskId
$id = intval($argv[1]);

// state 4 means the chunk has been fully processed
$stmt = $conn->prepare("SELECT * FROM Chunk WHERE taskId = ? AND state = 4 ORDER BY skip");
if ($stmt === FALSE)
    die("Could not prepare statement. Bad SQL?\n");
$stmt->bind_param("i", $id);
$stmt->execute();
$result = $stmt->get_result();
if ($result === FALSE)
    die("Could not query, result is FALSE\n");
if ($result->num_rows < 1)
{
    die("Unexpected empty result set. Bad task id?\n");
}
// we start from the first chunk, which has "skip" (or offset) 0
$chain = 0;
$count = 0;
while ($row = $result->fetch_assoc())
{
    $chunkId = $row["chunkId"];
    $skip = $row["skip"];
    $length = $row["length"];
    $progress = $row["progress"];
    if ($progress != 10000)
    {
        echo ($chunkId . "\n");
        echo ("WARN: the chunk is done, but the progress is not 100% (" . $progress . ")\n");
    }
    if ($length < 1)
    {
        echo ($chunkId . "\n");
        echo ("WARN: the length is invalid (" . $length . ")\n");
    }
    // looks like we have got a chunk that is out of the continuous sequence, stop here
    if ($skip != $chain)
        break;
    // all good, account for this chunk and move on to the next one
    $chain = $skip + $length;
    $count++;
}
// we did not even find the first chunk (the one with skip = 0), nothing to do
if ($chain == 0)
    die("Could not find a continuous batch of chunks to optimize\n");

echo ("Checked " . $count . " records. Found last chunk offset: " . $chain . "\n");
echo ("DELETE FROM Chunk WHERE taskId = " . $id . " AND state = 4 AND skip < " . $chain . " AND skip <> 0\n");
echo ("UPDATE Chunk SET length = " . $chain . ", checkpoint = " . $chain . " WHERE taskId = " . $id . " AND state = 4 AND skip = 0\n");

$result->close();
$stmt->close();
$conn->close();
?>
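Sample output (a reconstruction, assuming the script is saved as merge_chunks.php - the file name is hypothetical - and run against task 20 from the transcript below; the record count is illustrative). Note that the script does not modify the DB itself, it only prints the DELETE and UPDATE statements for review:

$ php merge_chunks.php 20
Checked 295951 records. Found last chunk offset: 5545364397099
DELETE FROM Chunk WHERE taskId = 20 AND state = 4 AND skip < 5545364397099 AND skip <> 0
UPDATE Chunk SET length = 5545364397099, checkpoint = 5545364397099 WHERE taskId = 20 AND state = 4 AND skip = 0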
Hashtopolis won't allow merging chunks that have cracked hashes assigned to them.
To do so, we first need to migrate those hashes to the chunk with skip = 0:
mysql> select chunkId FROM Chunk WHERE taskId = 20 AND state = 4 AND skip = 0;
+---------+
| chunkId |
+---------+
| 117863 |
+---------+
1 row in set (0.54 sec)
mysql> UPDATE Hash set chunkId = 117863 where chunkId in (select chunkId FROM Chunk WHERE taskId = 20 AND state = 4 AND skip < 5545364397099 AND skip <> 0);
Query OK, 1317 rows affected (0.46 sec)
Rows matched: 1317 Changed: 1317 Warnings: 0
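As an optional sanity check (not part of the original procedure), you can verify the migration by counting the hashes now attached to the surviving chunk; the count should include the 1317 migrated rows:

mysql> SELECT COUNT(*) FROM Hash WHERE chunkId = 117863;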
After that you will be able to merge chunks:
mysql> DELETE FROM Chunk WHERE taskId = 20 AND state = 4 AND skip < 5545364397099 AND skip <> 0;
Query OK, 295950 rows affected (29.54 sec)

This would drop close to 300k records from the database.

mysql> UPDATE Chunk SET length = 5545364397099 WHERE taskId = 20 AND state = 4 AND skip = 0;
Query OK, 1 row affected (0.04 sec)
Rows matched: 1  Changed: 1  Warnings: 0
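Note that the statement printed by the script also sets checkpoint to the same value. If you ran the UPDATE above without it, you may want to set it as well (an assumption on my side - verify the column semantics against your Hashtopolis version):

mysql> UPDATE Chunk SET checkpoint = 5545364397099 WHERE taskId = 20 AND state = 4 AND skip = 0;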