Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save kost/8445cd68f0af98d27ca69e5d53192d61 to your computer and use it in GitHub Desktop.
Save kost/8445cd68f0af98d27ca69e5d53192d61 to your computer and use it in GitHub Desktop.
Parsing the MySQL slow query log via Logstash (the easy way?)

The MySQL slow query log is a difficult format to extract information from. After looking at various examples with mixed results, I realized that it's much easier to configure MySQL to write the slow query log to a table in CSV format!

From the MySQL documentation:

By default, the log tables use the CSV storage engine that writes data in comma-separated values format. For users who have access to the .CSV files that contain log table data, the files are easy to import into other programs such as spreadsheets that can process CSV input.

my.cnf

Note: don't forget to open up permissions on your slow query log CSV file so logstash can read it!

# enable slow query log
slow_query_log = 1
# configure log output to CSV table
log_output = table

# optionally tweak slow query log behavior
#log_queries_not_using_indexes = 1
#long_query_time = 10
#min_examined_row_limit = 100

mysql_slow_log.conf

Example logstash configuration:

input {
  file {
    # slow query table is located at:
    #   <mysql datadir>/mysql/slow_log.CSV
    path => "/var/lib/mysql/mysql/slow_log.CSV"
    type => "mysql_slow_query"
    start_position => "beginning"
  }
}

filter {
  # mysql escapes double quotes with backslashes, but
  # ruby expects pairs of double quotes
  mutate { gsub => [ "message", '\\"', '""' ] }

  csv {
    columns => [ "start_time", "user_host", "query_time", "lock_time",
                 "rows_sent", "rows_examined", "db", "last_insert_id",
                 "insert_id", "server_id", "sql_text" ]
  }

  # convert various fields to integer
  mutate { convert => [ "rows_sent", "integer" ] }
  mutate { convert => [ "rows_examined", "integer" ] }
  mutate { convert => [ "last_insert_id", "integer" ] }
  mutate { convert => [ "insert_id", "integer" ] }
  mutate { convert => [ "server_id", "integer" ] }

  # convert start_time to @timestamp
  date {
    match => [ "start_time", "YYYY-MM-DD HH:mm:ss" ]
    remove_field => [ "start_time" ]
  }

  # normalize query_time from HH::mm::ss to seconds
  ruby { code => "event['query_time'] = event['query_time'] ? event['query_time'].split(':').inject(0){|a, m| a = a * 60 + m.to_i} : 0" }

  # normalize lock_time from HH:mm:ss to seconds
  ruby { code => "event['lock_time'] = event['lock_time'] ? event['lock_time'].split(':').inject(0){|a, m| a = a * 60 + m.to_i} : 0" }

  # remove original message
  mutate { remove => [ "message" ] }

}

output {
  stdout { codec => rubydebug }
}

The logstash filter was adapted from the slow_log table definition:

| slow_log | CREATE TABLE `slow_log` (
  `start_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `user_host` mediumtext NOT NULL,
  `query_time` time NOT NULL,
  `lock_time` time NOT NULL,
  `rows_sent` int(11) NOT NULL,
  `rows_examined` int(11) NOT NULL,
  `db` varchar(512) NOT NULL,
  `last_insert_id` int(11) NOT NULL,
  `insert_id` int(11) NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `sql_text` mediumtext NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='Slow log' |

logstash output

{
          "@version" => "1",
        "@timestamp" => "2015-01-07T18:20:02.000Z",
              "type" => "mysql_slow_query",
              "host" => "hera",
              "path" => "/var/lib/mysql/mysql/slow_log.CSV",
         "user_host" => "root[root] @ localhost []",
        "query_time" => 0,
         "lock_time" => 0,
         "rows_sent" => 0,
     "rows_examined" => 21,
                "db" => "foo",
    "last_insert_id" => 0,
         "insert_id" => 0,
         "server_id" => 0,
          "sql_text" => "select * from person where name like \"foo%\""
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment