Skip to content

Instantly share code, notes, and snippets.

@brenes
Created November 30, 2010 11:27
Show Gist options
  • Save brenes/721548 to your computer and use it in GitHub Desktop.
Save brenes/721548 to your computer and use it in GitHub Desktop.
# We have some cached pages (not every page) and we need to analyze if there's a huge concentration of misses as we have poor performance
# This simple script gets a rails log file and parses it to find the hits/misses to the cache and the response times
# key points:
# - Cache expires with time, so it's easy to parse the logs and detect misses
# The ouput is a CSV format with four columns: Minute, misses, hits and average request time
# CHANGES:
# 2010-11-30 16:11 : Extracted the regexp so we can use it for another logfiles
# 2010-11-30 16:11 : Included URL regexps so we can define which URLs shoud be analyzed and wich ones ignored
require 'time'
cache_expiration_time = 15 * 60
misses = {}
hits = {}
ignored = {}
cached_times = {}
response_times = {}
current_time = "0000-00-00 00:00:00"
current_minute = "0000-00-00 00:00"
hits[current_minute] = []
misses[current_minute] = []
ignored[current_minute] = []
response_times[current_minute] = {:request => [], :render => [], :db => []}
# rails log format regexps
# The processing line is: "Processing controller#action (for it at yyyy-mm-dd hh:ii:ss) [GET]" so we can extract the hour
time_regexp = /[0-9]*-[0-9]*-[0-9]* [0-9]*:[0-9]*:[0-9]*/
time_line_regexp = /^Processing /
# The completed line is "Completed in time (X reqs/sec) | Rendering: rendering time (rendering ratio) | DB: database time (database ratio) | status [url]"
url_regexp = /\[[^\]]*\]/
request_completion_line_regexp = /^Completed in*/
allowed_url_regexp = /(url1|url2)/
response_time_regexp = /[0-9]+\.[0-9]+/
File.open("production.log").each do |line|
# if it's a line with tiem info we check if we are in a new minute and reload the info
if line.index(time_line_regexp)
current_time = line.gsub(time_regexp).first
if current_minute != current_time[0..-4]
puts "#{current_minute} , #{misses[current_minute].count} , #{hits[current_minute].count} , #{ignored[current_minute].count} , #{response_times[current_minute][:request].empty? ? 0 : response_times[current_minute][:request].inject(0) {|total, t| total += t.to_f } /response_times[current_minute][:request].length}"
current_minute = current_time[0..-4]
hits[current_minute] = []
misses[current_minute] = []
ignored[current_minute] = []
response_times[current_minute] = {:request => [], :render => [], :db => []}
end
end
# If it's a completed line we check if the cache expires and store the info about the miss/hit and the response times
if line.index(request_completion_line_regexp)
url = line.gsub(url_regexp).first
if url.index(allowed_url_regexp)
cache_time = cached_times[url]
if cache_time.nil? or ((Time.parse(current_time) - Time.parse(cache_time)) > cache_expiration_time)
cached_times[url] = current_time
misses[current_minute] << url
else
hits[current_minute] << url
end
line_times = line.gsub(response_time_regexp).map{|t|t}
response_times[current_minute][:request] = line_times[0]
response_times[current_minute][:render] = line_times[1]
response_times[current_minute][:db] = line_times[2]
else
ignored[current_minute] << url
end
end
end
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment