@iNecas
Created October 10, 2018 10:46
capsule-sync-analyze.rb
#!/usr/bin/env ruby
require 'rubygems'
require 'nokogiri'
require 'set'
require 'yaml'
require 'time'
# Usage:
#
# # run inside a directory with extracted task export
# ruby capsule-sync-analyze.rb | tee capsule_sync_analysis.csv
#
puts "file, capsule_name, started_at, ended_at, duration, status, result, number_of_repos, orgs, envs"
# Files may be given as arguments; default to every HTML export in the current directory.
files = ARGV
files = Dir.glob('*.html') if files.empty?
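
# Take the first two dash-separated segments of a repo id as organization
# and environment. Returns nil when the first segment is numeric or empty.
def repo_id_to_org_env(repo_id)
  parts = repo_id.split('-')
  return if parts.first =~ /^\d*$/ # skip malformed formats
  org, env = parts
  return org, env
end

# Parse timestamps such as "2018-10-10 10:46:00 UTC".
def parse_time(str)
  Time.strptime(str, '%F %T %Z')
end
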
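files.reverse.each do |filename|
  # Read the file once; File.read closes the handle, unlike a bare open().
  page = Nokogiri::HTML(File.read(filename))
  ps = page.css('body').css('p')
  # The first six paragraphs are read as "Field:\nvalue" header pairs.
  headers = ps.map(&:text)[0..5]
  task_data = {}
  headers.each do |line|
    field, value = line.strip.split("\n").map(&:strip)
    next unless field # skip empty paragraphs
    # Normalize the field name, e.g. "Started at:" becomes "started_at".
    task_data[field.downcase.tr(':', '').tr(' ', '_')] = value
  end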
  # The first YAML block is expected to carry the smart proxy (capsule) name.
  if (proxy_data = page.css('.lang-yaml').first)
    data = YAML.load(proxy_data.text)
    if data.is_a?(Hash) && data['smart_proxy']
      task_data['capsule'] = data['smart_proxy']['name']
    end
  end
  # Only capsule sync tasks are analyzed.
  next unless task_data['label'] == "Actions::Katello::CapsuleContent::Sync"
  run = page.css('#run').css("table.flow")
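  repo_ids = Set.new
  orgs = Set.new
  envs = Set.new
  # Collect the repo_id from the YAML block of each non-flow cell in the run table.
  run.css('td:not(.flow)').each do |td|
    yaml_node = td.css('.lang-yaml').first
    next unless yaml_node # cell without a YAML block
    data = YAML.load(yaml_node.text)
    repo_ids << data['repo_id'] if data.is_a?(Hash) && data.key?('repo_id')
  end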
  repo_ids.each do |repo_id|
    org, env = repo_id_to_org_env(repo_id)
    if org
      orgs << org
      envs << env
    end
  end
  if task_data['ended_at']
    duration = parse_time(task_data['ended_at']) - parse_time(task_data['started_at'])
  end
  # might be useful to group the repos by environments
  # repo_by_env = repo_ids.group_by do |repo_id|
  #   _, env = repo_id_to_org_env(repo_id)
  #   env
  # end
  puts [filename,
        task_data['capsule'],
        task_data['started_at'],
        task_data['ended_at'],
        duration,
        task_data['status'],
        task_data['result'],
        repo_ids.size,
        orgs.to_a.join(' '),
        envs.to_a.join(' ')].map(&:to_s).join(',')
end
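
The two helpers in the script can be tried on their own. The sketch below copies them unchanged and runs them on sample values. The repo ids and timestamps are invented for illustration and are not taken from a real task export. It assumes repo ids put the organization and environment in the first two dash-separated segments, which is what the script itself relies on.

```ruby
require 'time'

# Copied from the script above.
def repo_id_to_org_env(repo_id)
  parts = repo_id.split('-')
  return if parts.first =~ /^\d*$/ # skip malformed formats
  org, env = parts
  return org, env
end

def parse_time(str)
  Time.strptime(str, '%F %T %Z')
end

# Hypothetical repo id: organization "ACME", environment "Library".
org, env = repo_id_to_org_env('ACME-Library-rhel7')

# An id whose first segment is numeric is treated as malformed.
skipped = repo_id_to_org_env('1-ACME-Library')

# Duration is the difference of two Time objects, in seconds (a Float).
duration = parse_time('2018-10-10 10:30:15 UTC') - parse_time('2018-10-10 10:00:00 UTC')

puts [org, env, skipped.inspect, duration].join(',')
```

Because the duration is a plain number of seconds, it can be sorted or averaged directly in whatever tool reads the resulting CSV.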