Skip to content

Instantly share code, notes, and snippets.

@stephanGarland
Created September 30, 2023 18:06
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save stephanGarland/06ff4820f99e5966ba097aa8b10ec750 to your computer and use it in GitHub Desktop.
Save stephanGarland/06ff4820f99e5966ba097aa8b10ec750 to your computer and use it in GitHub Desktop.
Benchmarking MySQL bulk load into InnoDB and MyISAM
#!/usr/bin/env bash
function create_tables() {
tbl_1=$(cat <<EOF
CREATE TABLE IF NOT EXISTS test_innodb (
user_id CHAR(32) NOT NULL PRIMARY KEY,
user_email VARCHAR(254) NOT NULL,
created_at DATETIME NOT NULL,
tags JSON NOT NULL DEFAULT ('{}'),
shared_with JSON NOT NULL DEFAULT ('{}'),
is_completed TINYINT NOT NULL DEFAULT 0,
is_overdue TINYINT NOT NULL DEFAULT 0,
description TEXT NOT NULL,
title VARCHAR(255) NOT NULL,
user_name VARCHAR(255) NOT NULL
);
EOF
)
tbl_2=$(cat <<EOF
CREATE TABLE IF NOT EXISTS test_myisam (
user_id CHAR(32) NOT NULL PRIMARY KEY,
user_email VARCHAR(254) NOT NULL,
created_at DATETIME NOT NULL,
tags JSON NOT NULL DEFAULT ('{}'),
shared_with JSON NOT NULL DEFAULT ('{}'),
is_completed TINYINT NOT NULL DEFAULT 0,
is_overdue TINYINT NOT NULL DEFAULT 0,
description TEXT NOT NULL,
title VARCHAR(255) NOT NULL,
user_name VARCHAR(255) NOT NULL)
ENGINE = MYISAM
;
EOF
)
declare tables=()
tables+=("$tbl_1")
tables+=("$tbl_2")
for tbl in "${tables[@]}"; do
mysql -D test -Be "$tbl"
done
}
function truncate() {
local tbl="$1"
printf "%s\n" "TRUNCATING table ${tbl}"
mysql -D test -Be "TRUNCATE ${tbl}"
}
function get_lines() {
local filename="$1"
num_lines=$(($(wc -l "$filename" | cut -d' ' -f1,1)-1))
printf "%s\n" "${num_lines}"
}
function load() {
local load_time=""
local tbl="$1"
local file_path="$2"
temp_file=$(mktemp)
printf "%s\n" '\sql SET GLOBAL local_infile=1;' > "$temp_file"
printf "%s\n" '\use test;' >> "$temp_file"
printf "%s\n" "util.importTable('${file_path}', {table: '${tbl}', fieldsTerminatedBy: ',', fieldsEnclosedBy: \"'\", fieldsOptionallyEnclosed: true, fieldsEscapedBy: '\\\\', showProgress: true, sessionInitSQL: [\"SET time_zone='+00:00';\"]})" >> "$temp_file"
mysqlsh --quiet-start=1 -i <<< $(cat $temp_file)
rm -f "$temp_file"
}
innodb_tbl_name="test_innodb"
myisam_tbl_name="test_myisam"
file_path="/mnt/ramdisk/25M.csv"
file_path_split_1="/mnt/ramdisk/xaa.csv"
file_path_split_2="/mnt/ramdisk/xab.csv"
file_size=$(stat -c "%s" "${file_path}" | numfmt --to=iec-i)
printf "%s\n" "Getting line count of input file ${file_path}"
num_lines=$(get_lines "${file_path}")
printf "%s\n" "${file_path} has ${num_lines} lines and is ${file_size}"
printf "%s\n" "Splitting ${file_path} into two equal chunks"
pushd "$(dirname ${file_path})"
rm -f xaa.csv xab.csv
tail -n +2 "$file_path" | split -l $((${num_lines} / 2)) --additional-suffix ".csv"
popd
printf "\n%s\n" "Creating tables for InnoDB (${innodb_tbl_name}) and MyISAM (${myisam_tbl_name})"
create_tables
printf "\n%s\n" "Truncating tables"
truncate "$innodb_tbl_name"
truncate "$myisam_tbl_name"
printf "\n%s\n" "Loading data into empty ${innodb_tbl_name} with LOAD DATA INFILE"
load "$innodb_tbl_name" "$file_path_split_1"
printf "\n%s\n" "Loading data into empty ${myisam_tbl_name} with LOAD DATA INFILE"
load "$myisam_tbl_name" "$file_path_split_1"
printf "\n%s\n" "Loading data into non-empty ${innodb_tbl_name} with LOAD DATA INFILE"
load "$innodb_tbl_name" "$file_path_split_2"
printf "\n%s\n" "Loading data into non-empty ${myisam_tbl_name} with LOAD DATA INFILE"
load "$myisam_tbl_name" "$file_path_split_2"
@stephanGarland
Copy link
Author

❯ ./bench_mysql.sh
Getting line count of input file /mnt/ramdisk/25M.csv
/mnt/ramdisk/25M.csv has 25000000 lines and is 5.2Gi
Splitting /mnt/ramdisk/25M.csv into two equal chunks
/mnt/ramdisk ~
~

Creating tables for InnoDB (test_innodb) and MyISAM (test_myisam)

Truncating tables
TRUNCATING table test_innodb
TRUNCATING table test_myisam

Loading data into empty test_innodb with LOAD DATA INFILE
Creating a Classic session to 'sgarland@192.168.1.179'
Your MySQL connection id is 430273
Server version: 8.0.34 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
Query OK, 0 rows affected (0.0002 sec)
Default schema set to `test`.
Importing from file '/mnt/ramdisk/xaa.csv' to table `test`.`test_innodb` in MySQL Server at 192.168.1.179:3306 using 8 threads
...
100% (2.75 GB / 2.75 GB), 0.00 B/s
File '/mnt/ramdisk/xaa.csv' (2.75 GB) was imported in 6 min 57.2374 sec at 6.59 MB/s
Total rows affected in test.test_innodb: Records: 12500000  Deleted: 0  Skipped: 0  Warnings: 0
Bye!

---

Loading data into empty test_myisam with LOAD DATA INFILE
Creating a Classic session to 'sgarland@192.168.1.179'
Your MySQL connection id is 430492
Server version: 8.0.34 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
Query OK, 0 rows affected (0.0007 sec)
Default schema set to `test`.
Importing from file '/mnt/ramdisk/xaa.csv' to table `test`.`test_myisam` in MySQL Server at 192.168.1.179:3306 using 8 threads
...
100% (2.75 GB / 2.75 GB), 1.44 MB/s
File '/mnt/ramdisk/xaa.csv' (2.75 GB) was imported in 11 min 18.4550 sec at 4.05 MB/s
Total rows affected in test.test_myisam: Records: 12500000  Deleted: 0  Skipped: 0  Warnings: 0
Bye!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment