@jorinvo
Last active August 29, 2015 14:13
A solution for the Dev Challenge. Analyses a log file.
#!/usr/bin/env ruby

# USAGE:
# The script can be called from the command line with `./log_stats.rb`.
# Optionally, a log file can be specified: `./log_stats.rb path/to/file.log`.

# This function kicks off the script in case it's called from the command line.
def run!
  # Optional:
  # Use the first argument as the path to the log file.
  # Ignore other arguments.
  file_path = ARGV.first || 'sample.log'

  # Instantiate LogStats with the urls of interest
  # and output the statistics.
  puts LogStats.new.
    add_url('get', '/api/users/{user_id}/count_pending_messages').
    add_url('get', '/api/users/{user_id}/get_messages').
    add_url('get', '/api/users/{user_id}/get_friends_progress').
    add_url('get', '/api/users/{user_id}/get_friends_score').
    add_url('post', '/api/users/{user_id}').
    add_url('get', '/api/users/{user_id}').
    # Collect information from the log file.
    parse(file_path)
end

# This is the main class for the script.
# It supports multiple urls to collect statistics for,
# and it can read log files.
class LogStats
  # Each LogStats instance has its own set of urls.
  def initialize
    @urls = []
  end

  # Add a new url.
  #
  # method - The HTTP method for the url.
  # url    - The url itself.
  #
  # Returns self so calls can be chained.
  def add_url(method, url)
    @urls << Url.new(method, url)
    self
  end

  # Parse a log file and add the collected data to the urls.
  #
  # file_path - Path to the log file to read.
  #
  # Returns self so calls can be chained.
  def parse(file_path)
    File.open(file_path, 'r') do |f|
      f.each_line do |l|
        update_url Line.new(l)
      end
    end
    self
  end

  # Returns the formatted statistics for all urls, one url per line.
  def to_s
    @urls.join "\n"
  end

  private

  # Find the first url that matches the given line and update it with the line's data.
  # Lines that match none of the urls are ignored.
  #
  # line - The Line to add to the urls.
  def update_url(line)
    url = @urls.find { |u| u.matches? line }
    url.update(line) if url
  end
end

# A data structure containing the information of a line from a log file.
class LogStats::Line
  # Accessible data for a line.
  attr_reader :method, :url, :dyno, :time

  # Regexp to extract the needed data from a line String.
  MATCH_PARTS = /method=(.+?) .*?path=(.+?) .*?dyno=(.+?) .*?connect=(.+?)ms.*?service=(.+?)ms/

  # Parses a log line and stores its data.
  # The time is the sum of the connect and service times in ms.
  #
  # str - String containing a raw line from the log file.
  def initialize(str)
    parts = MATCH_PARTS.match str
    # Lines without the expected fields (e.g. blank lines) keep all
    # attributes nil, so they never match a Url.
    return unless parts
    @method = parts[1].downcase
    @url = parts[2]
    @dyno = parts[3]
    @time = parts[4].to_i + parts[5].to_i
  end
end

# Collect data for a single url.
# Calculate statistics from the data and display them nicely formatted.
class LogStats::Url
  # Placeholder String used in urls for the user id.
  PLACEHOLDER = '{user_id}'

  # Initializes all the data fields used for counting.
  #
  # method - The HTTP method for the url.
  # url    - The url itself. Urls are identified by method + url.
  def initialize(method, url)
    @url = url
    @method = method.downcase
    @count = 0
    @dyno_counts = Hash.new(0)
    @times = []
    # Generate a regex from the url and the placeholder to compare to actual line data later.
    # The url is escaped so that only the placeholder acts as a pattern;
    # an optional query string is allowed.
    pattern = Regexp.escape(url).gsub(Regexp.escape(PLACEHOLDER), '[0-9]+')
    @url_matcher = Regexp.new "\\A#{pattern}(\\?.*)?\\z"
  end

  # Checks if a Line belongs to this Url.
  #
  # line - Line to compare the Url to.
  #
  # Returns a Boolean indicating if the line matches or not.
  def matches?(line)
    @method == line.method && @url_matcher === line.url
  end

  # Add line to the data of this Url.
  #
  # line - Line containing data to add.
  def update(line)
    @count += 1
    @times << line.time
    @dyno_counts[line.dyno] += 1
  end

  # Returns a formatted String with calculated statistics for this Url.
  def to_str
    return "#{@method.upcase} #{@url} - no calls" if @times.empty?
    "#{@method.upcase} #{@url} - calls: #{@count}, mean: #{mean}ms, " +
      "median: #{median}ms, mode: #{mode}ms, most used dyno: #{max_dyno}"
  end

  private

  # Calculate the mean response time for this url.
  #
  # Returns an Integer with the average time in ms, rounded down.
  def mean
    @times.inject(:+) / @times.size
  end

  # Calculate the median response time for this url.
  #
  # Returns an Integer with the median time in ms.
  def median
    sorted = @times.sort
    middle = sorted.size / 2
    if sorted.size.odd?
      sorted[middle]
    else
      # Even count: average the two middle values (integer division).
      (sorted[middle - 1] + sorted[middle]) / 2
    end
  end

  # Calculate the mode response time for this url.
  # If there is no unique maximum just one of them is returned.
  #
  # Returns an Integer with the mode time in ms.
  def mode
    counts = @times.each_with_object(Hash.new(0)) { |t, c| c[t] += 1 }
    max_val(counts)
  end

  # Find the dyno that was used the most.
  # If there is no unique maximum just one of them is returned.
  #
  # Returns the dyno with the most counts.
  def max_dyno
    max_val(@dyno_counts)
  end

  # Find the key of a Hash with the highest value.
  #
  # hash - Hash mapping keys to counts.
  #
  # Returns the found key.
  def max_val(hash)
    hash.max_by { |_, v| v }[0]
  end
end

# Run if called from command line.
run! if __FILE__ == $0
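
The script above never shows what a log line looks like, so here is a minimal, standalone sketch of the extraction and matching steps. The regex is copied from `MATCH_PARTS`. Everything else is an assumption made for illustration: the sample line is invented (a router-style line, not taken from `sample.log`), and `parse_line` and `url_matcher` are hypothetical helper names that do not exist in the script.

```ruby
# Same pattern as MATCH_PARTS in the script.
MATCH_PARTS = /method=(.+?) .*?path=(.+?) .*?dyno=(.+?) .*?connect=(.+?)ms.*?service=(.+?)ms/

# Hypothetical log line, made up for this example.
SAMPLE_LINE = '2014-01-09T06:16:53.742892+00:00 heroku[router]: at=info ' \
  'method=GET path=/api/users/1234/get_messages host=example.com ' \
  'fwd="10.0.0.1" dyno=web.3 connect=2ms service=30ms status=200 bytes=512'

# Hypothetical helper: returns a Hash of the extracted fields,
# or nil when the line does not contain the expected fields.
def parse_line(str)
  parts = MATCH_PARTS.match(str)
  return nil unless parts
  {
    method: parts[1].downcase,
    url: parts[2],
    dyno: parts[3],
    # Response time is connect time plus service time, in ms.
    time: parts[4].to_i + parts[5].to_i
  }
end

# Hypothetical helper: builds an anchored Regexp from a url template,
# treating '{user_id}' as a run of digits and allowing a query string.
def url_matcher(template)
  pattern = Regexp.escape(template).gsub(Regexp.escape('{user_id}'), '[0-9]+')
  Regexp.new "\\A#{pattern}(\\?.*)?\\z"
end

entry = parse_line(SAMPLE_LINE)
puts entry[:method] # get
puts entry[:url]    # /api/users/1234/get_messages
puts entry[:dyno]   # web.3
puts entry[:time]   # 32

puts url_matcher('/api/users/{user_id}/get_messages') === entry[:url] # true
puts url_matcher('/api/users/{user_id}') === entry[:url]              # false
```

Because the matcher is anchored at both ends, the shorter template `/api/users/{user_id}` does not swallow the longer paths, so the order in which urls are registered should not change which one a line is counted under.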