Skip to content

Instantly share code, notes, and snippets.

@willkg
Last active August 29, 2015 14:13
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save willkg/7c8091a74bbe5f2c34b8 to your computer and use it in GitHub Desktop.
Save willkg/7c8091a74bbe5f2c34b8 to your computer and use it in GitHub Desktop.
table space/rows

README

This is my total crap script(s) to figure out the space used on disk and number of rows for tables in kitsune.

Do:

  1. edit get_schema.sh replacing paths with paths for your system
  2. run get_schema.sh.

    it'll generate a schema.sql file and then use that to generate a table_rows.txt file.

    then it'll do an ls -lha on the directory with the .idb files in it and then tweak that. it'll eventually end up in a file called table_space.txt.

  3. run python merge.py which will spit out the merged data.
#!/bin/bash
# Get row counts
(mysql kitsune -e "SELECT CONCAT('SELECT \"', table_name, '\" AS table_name, COUNT(*) AS exact_row_count FROM ', table_schema, '.', table_name, ' UNION ') FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = 'kitsune';" | grep -v "CONCAT" | sed 's/UNION//' | sed 's/|//' | sed 's/-//' | sed 's/$/;/' > schema.sql
mysql kitsune < schema.sql | grep -v "table_name") > table_rows.txt
# Get table space usage
sudo -H -u root ls -lha /var/lib/mysql/kitsune/ | grep -v "^total" | grep -v "^d" | grep '.ibd' | sed 's/.ibd//' > table_space.txt
sudo -H -u root chmod 755 /home/willkg/mozilla/kitsune/bin/table_space.txt
# Run python script to merge everything
# FIXME
#!/usr/bin/env python
import re
def get_rows(fn):
SPLITTER = re.compile(r'^([^\s]+)\s+([^\s]+)$')
parts = [SPLITTER.match(line) for line in open(fn, 'r').readlines()]
return dict(
(part.group(1), part.group(2))
for part in parts
)
def get_space(fn):
lines = open(fn, 'r').readlines()
lines = [line.split() for line in lines]
return dict(
(line[8], line[4])
for line in lines
)
def convert_space(text):
num = float(text[:-1])
if text.endswith('K'):
return num * 1000
if text.endswith('M'):
return num * 1000000
if text.endswith('G'):
return num * 1000000000
raise ValueError(text)
def main():
row_counts = get_rows('./table_rows.txt')
row_space = get_space('./table_space.txt')
# Merge the two together into one data set of tuples
dataset = []
for table_name in row_counts.keys():
dataset.append((table_name, row_counts[table_name], row_space[table_name]))
for line in sorted(dataset, key=lambda item: convert_space(item[2]), reverse=True):
print '%45s %8s %8s' % line
if __name__ == '__main__':
main()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment