# Gist by @btelles, created November 2, 2012.
# This Gist helps you analyze mysql slow query logs.
# This Ruby code should not be used as an example of tested,
# production ready, or idiomatic Ruby. You've been warned.
# 1. Download and extract slow logs.
# 2. Specify a directory to analyze, using Dir[] syntax:
DIRECTORY_TO_ANALYZE = "./mysql_slow_logs/2012/*/*/mysqld_slow.log"
# 3. Run the script with "ruby ./mysql_slow_log_parser.rb"
# The script organizes each slow query into a single row,
# with all valid parameters in separate columns.
# It also generalizes query strings by replacing 'some_string' with 'S'
# and replacing numbers with N.
# 4. Import the output with the following MySQL statements:
#
#
# CREATE TABLE `slow_query_logs` (
# `id` int(11) NOT NULL AUTO_INCREMENT,
# `executed_at` datetime DEFAULT NULL,
# `query_time` decimal(18,9) DEFAULT NULL,
# `lock_time` decimal(18,9) DEFAULT NULL,
# `rows_sent` int(11) DEFAULT NULL,
# `rows_examined` int(11) DEFAULT NULL,
# `queries` varchar(1500) DEFAULT NULL,
# PRIMARY KEY (`id`)
# ) ENGINE=InnoDB AUTO_INCREMENT=1074941 DEFAULT CHARSET=latin1;
#
# LOAD DATA INFILE '/var/lib/mysql/analysis/parsed_mysql.csv'
# INTO TABLE slow_query_logs
# FIELDS TERMINATED BY '\t'
# ENCLOSED BY '"'
# IGNORE 1 LINES
# ( @line_date,
# @a,
# query_time,
# lock_time,
# rows_sent,
# rows_examined,
# @query1,
# @query2,
# @query3,
# @query4,
# @query5)
# SET
# executed_at = str_to_date(@line_date, '%b %d %H:%i:%s'),
# queries = concat(@query1, @query2, @query3);
#
#
# 5. Analyze the data with the following (or your own variation):
# select queries,
# count(id) as count,
# sum(query_time) as sum_query_time,
# sum(lock_time) as sum_lock_time,
# sum(rows_sent) as sum_rows_sent,
# sum(rows_examined) as sum_rows_examined,
# avg(query_time) as avg_query_time,
# avg(lock_time) as avg_lock_time,
# avg(rows_sent) as avg_rows_sent,
# avg(rows_examined) as avg_rows_examined,
# std(query_time) as std_query_time,
# std(lock_time) as std_lock_time,
# std(rows_sent) as std_rows_sent,
# std(rows_examined) as std_rows_examined
# from slow_query_logs
# group by queries
# order by count(id) desc;
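# The number/string generalization described above can be sketched in
# isolation. A minimal, hypothetical illustration (the string-literal regex
# here is simplified relative to the one used in clean_line):

```ruby
# Replace literal numbers with N, collapse runs of N in IN-lists, and
# replace quoted strings with 'S' -- a simplified version of the rewrite
# performed by clean_line.
def generalize(query)
  q = query.gsub(/\d+/, 'N')                  # numbers -> N
  q = q.gsub(/N, N/, 'N') while q =~ /N, N/   # "N, N, N" -> "N"
  q = q.gsub(/N,N/, 'N') while q =~ /N,N/
  q.gsub(/'[^']+'/, "'S'")                    # string literals -> 'S'
end

puts generalize("SELECT * FROM users WHERE id IN (1, 2, 3) AND name = 'bob'")
# => SELECT * FROM users WHERE id IN (N) AND name = 'S'
```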
class Parser
  def initialize(input_exp, output_file)
    @input_exp = input_exp
    @output_file = output_file
    @full_entry = ''
  end

  # Walk every file matching the input glob, clean each line, and append
  # completed entries to the output file.
  def parse_all
    File.open(@output_file, 'w') do |io|
      io.puts "Date\tQuery Time\tLock Time\tRows Sent\tRows Examined\tQueries"
    end
    Dir[@input_exp || DIRECTORY_TO_ANALYZE].each do |input_file|
      @output_io = File.open(@output_file, 'a')
      puts "processing: #{input_file}"
      File.open(input_file, 'r') do |input|
        input.each_line { |line| write_or_store(clean_line(line)) }
      end
      @output_io.close
      puts "completed: #{input_file}"
    end
  end

  # Strip the syslog prefix, replace literal numbers with N and quoted
  # strings with 'S', and turn the slow-log field labels into tab
  # separators so each entry becomes one tab-delimited row.
  def clean_line(line)
    return '' if line =~ /SET timestamp=/
    if line[53..53] != '#' # query text rather than a "# Query_time" header
      line = line[14..-1] || '' # drop the syslog timestamp prefix
      line.gsub!(/\d+/, 'N')
      line.gsub!(/N, N/, 'N') while line =~ /N, N/
      line.gsub!(/N,N/, 'N') while line =~ /N,N/
      line.gsub!(/;N$/, ';')
      line.gsub!(/^N /, '')
      line.gsub!(/'[\w,\d,\s,\(,\),%,\!,@,#,$,^,&,\\',\-,\+,\.,\|,\[,\],\/]+'/, "'S'")
    end
    line.gsub!(/# Query_time:|Rows_examined:|Lock_time:|Rows_sent:|app-suite-db-master mysqld_slow_log: /, "\"\t\"")
    line.gsub(/\n/, "\"\t\"").squeeze("\t")
  end

  # A "# Time:" or "# User@Host" header marks the start of a new slow-log
  # entry: flush whatever has accumulated. Otherwise keep accumulating.
  def write_or_store(line_text)
    if line_text && line_text[19..30] =~ /^# Time: |^# User@Host/
      write_contents
    elsif line_text && !line_text.empty?
      @full_entry += line_text
    end
  end

  def write_contents
    @output_io.puts('"' + @full_entry + '"')
    @full_entry = ''
  end
end
input_exp = ARGV[0]
output_file = ARGV[1] || 'parsed_mysql.csv'
parser = Parser.new(input_exp, output_file)
parser.parse_all
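# If you'd rather skip the MySQL import, the step-5 aggregation can be
# sketched directly in Ruby. A hypothetical helper that assumes the parsed
# rows have already been read into hashes with :queries and :query_time
# keys (only a subset of the columns, for brevity):

```ruby
# Group rows by generalized query string and compute count, total, and
# average query time, ordered by count descending -- mirroring the
# GROUP BY / ORDER BY query shown in step 5.
def aggregate(rows)
  rows.group_by { |r| r[:queries] }.map do |query, group|
    times = group.map { |r| r[:query_time] }
    { queries: query,
      count: group.size,
      sum_query_time: times.sum,
      avg_query_time: times.sum / group.size }
  end.sort_by { |h| -h[:count] }
end

rows = [
  { queries: 'SELECT * FROM users WHERE id = N', query_time: 2.0 },
  { queries: 'SELECT * FROM users WHERE id = N', query_time: 4.0 },
  { queries: 'UPDATE posts SET views = N',       query_time: 1.0 },
]
top = aggregate(rows).first
# top[:count] == 2, top[:sum_query_time] == 6.0, top[:avg_query_time] == 3.0
```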