@lukaszgryglicki
Created April 21, 2017 08:23
Check whether angular has more issues than kubernetes (all angular and kubernetes repos combined, all-time data, with statistics collected manually from each separate GitHub repo)
For each angular and kubernetes repo that was in the Top 50, go to its GitHub page manually and:
Click Issues and note the open and closed counts
Click Pull Requests and note the open and closed counts
Click Code and note: commits, authors, branches, releases, watch, star, fork
Repeat for all kubernetes and angular repos.
Save this data manually in `all_time.csv`
Created a Ruby tool, `manual.rb`, which:
-Sums data per org (kubernetes and angular)
-Groups by org; the summary row is named "repo1+repo2+...+repoN"
-Sums all values except "authors" (the same authors can appear in several repos); for authors, uses the max across all repos.
-Creates issues and PRs entries (issues = open issues + closed issues, PRs = open PRs + closed PRs)
-Creates an output array containing the separate repos and the per-org summary rows.
-Sorts it by number of issues, descending, and saves it to `all_time_combined.csv`
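The aggregation rules above can be sketched in a few lines of Ruby. This is a minimal illustration rather than the `manual.rb` script itself: the `rows` array is an inline sample of three repos copied from `all_time.csv` and restricted to a few columns, and the `combine` helper and its `max_cols` parameter are hypothetical names used only here.

```ruby
# Inline sample: a few columns for three repos, copied from all_time.csv.
rows = [
  { 'org' => 'kubernetes', 'repo' => 'kubernetes', 'issues open' => 5057, 'issues closed' => 13313, 'authors' => 1158 },
  { 'org' => 'kubernetes', 'repo' => 'dashboard', 'issues open' => 128, 'issues closed' => 549, 'authors' => 70 },
  { 'org' => 'angular', 'repo' => 'angular', 'issues open' => 1223, 'issues closed' => 8803, 'authors' => 429 }
]

# Hypothetical helper: build one summary row for the repos of a single org.
# Columns listed in max_cols take the max across repos; all others are summed.
def combine(org, repos, max_cols: ['authors'])
  summary = { 'org' => org, 'repo' => repos.map { |r| r['repo'] }.join('+') }
  (repos.first.keys - %w[org repo]).each do |col|
    values = repos.map { |r| r[col] }
    summary[col] = max_cols.include?(col) ? values.max : values.sum
  end
  summary['issues'] = summary['issues open'] + summary['issues closed']
  summary
end

summaries = rows.group_by { |r| r['org'] }.map { |org, repos| combine(org, repos) }
sorted = summaries.sort_by { |s| -s['issues'] }
sorted.each { |s| puts s.inspect }
```

With this three-repo sample the kubernetes summary row sorts first, because only part of each org's data is included; the real comparison uses every repo in `all_time.csv`.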
See:
angular (sum of all repos): 29077 issues.
kubernetes (sum of all repos): 21424 issues.
So angular has more issues than kubernetes (all-time data).
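As a quick arithmetic check of these two totals, the open and closed issue counts from `all_time.csv` can be added up directly. The snippet below is only a sketch with the numbers copied inline; the `issues` and `totals` names are illustrative.

```ruby
# [open issues, closed issues] per repo, copied from all_time.csv.
issues = {
  'angular' => [[1223, 8803], [496, 3777], [402, 2027], [595, 7550], [174, 2863], [184, 983]],
  'kubernetes' => [[5057, 13313], [328, 782], [319, 494], [128, 549], [133, 321]]
}

# Total issues per org = sum of (open + closed) over all of its repos.
totals = issues.transform_values do |pairs|
  pairs.sum { |opened, closed| opened + closed }
end

totals.each { |org, total| puts "#{org}: #{total} issues" }
# angular: 29077 issues
# kubernetes: 21424 issues
```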
Files: `all_time.csv` (manually created input), `manual.rb` (tool that generates the combined summary), `all_time_combined.csv` (output of the Ruby tool)
org,repo,issues open,issues closed,PRs open,PRs closed,commits,authors,branches,releases,watch,star,fork
kubernetes,kubernetes,5057,13313,645,25739,46954,1158,31,237,1714,22520,7848
kubernetes,kubernetes.github.io,328,782,42,2317,4878,624,11,0,55,165,1156
kubernetes,contrib,319,494,67,1678,3469,212,9,11,146,975,921
kubernetes,dashboard,128,549,16,1176,1613,70,3,18,82,969,308
kubernetes,test-infra,133,321,32,2067,5310,126,1,0,44,77,129
angular,angular,1223,8803,156,5957,7326,429,14,121,2444,23321,5918
angular,angular-cli,496,3777,72,1678,1423,223,4,75,766,9136,1837
angular,material2,402,2027,77,1684,1418,138,6,16,882,8365,1338
angular,material,595,7550,77,2380,4336,311,19,89,953,15403,3356
angular,protractor,174,2863,19,1179,1628,228,8,85,439,6513,1617
angular,angular.io,184,983,34,2339,2859,344,40,0,133,886,996
org,repo,issues open,issues closed,PRs open,PRs closed,commits,authors,branches,releases,watch,star,fork,issues,PRs
angular,angular+angular-cli+material2+material+protractor+angular.io,3074,26003,435,15217,18990,429,91,386,5617,63624,15062,29077,15652
kubernetes,kubernetes+kubernetes.github.io+contrib+dashboard+test-infra,5965,15459,802,32977,62224,1158,55,266,2041,24706,10362,21424,33779
kubernetes,kubernetes,5057,13313,645,25739,46954,1158,31,237,1714,22520,7848,18370,26384
angular,angular,1223,8803,156,5957,7326,429,14,121,2444,23321,5918,10026,6113
angular,material,595,7550,77,2380,4336,311,19,89,953,15403,3356,8145,2457
angular,angular-cli,496,3777,72,1678,1423,223,4,75,766,9136,1837,4273,1750
angular,protractor,174,2863,19,1179,1628,228,8,85,439,6513,1617,3037,1198
angular,material2,402,2027,77,1684,1418,138,6,16,882,8365,1338,2429,1761
angular,angular.io,184,983,34,2339,2859,344,40,0,133,886,996,1167,2373
kubernetes,kubernetes.github.io,328,782,42,2317,4878,624,11,0,55,165,1156,1110,2359
kubernetes,contrib,319,494,67,1678,3469,212,9,11,146,975,921,813,1745
kubernetes,dashboard,128,549,16,1176,1613,70,3,18,82,969,308,677,1192
kubernetes,test-infra,133,321,32,2067,5310,126,1,0,44,77,129,454,2099
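require 'csv'

# Columns where the per-org value is the max across repos instead of the sum
# (the same author can contribute to several repos).
max_cols = ['authors']
sort_by_col = 'issues'
orgs = {}

# Read the manually collected per-repo stats and accumulate per-org totals.
CSV.foreach('./all_time.csv', headers: true) do |row|
  h = row.to_h
  org = h['org']
  orgs[org] = { items: [], sum: {} } unless orgs.key?(org)
  orgs[org][:items] << h
  h.each do |k, v|
    # Convert integer-looking strings to Integer; keep everything else as String.
    v = v.to_i.to_s == v ? v.to_i : v
    zero_v = v.is_a?(String) ? '' : 0
    orgs[org][:sum][k] = zero_v unless orgs[org][:sum].key?(k)
    if max_cols.include?(k)
      orgs[org][:sum][k] = [orgs[org][:sum][k], v].max
    else
      # Strings are joined with '+', numbers are added.
      orgs[org][:sum][k] += v.is_a?(String) ? '+' + v : v
    end
    # Store the converted numeric value back on the repo row.
    orgs[org][:items].last[k] = v unless v.is_a?(String)
  end
end

# Derive total issues and PRs (open + closed) for summary rows and for each repo.
orgs.each do |_, data|
  data[:sum]['issues'] = data[:sum]['issues open'] + data[:sum]['issues closed']
  data[:sum]['PRs'] = data[:sum]['PRs open'] + data[:sum]['PRs closed']
  data[:items].each do |item|
    item['issues'] = item['issues open'] + item['issues closed']
    item['PRs'] = item['PRs open'] + item['PRs closed']
  end
end

# Clean up joined string columns: drop the leading '+' and remove duplicates,
# so 'org' stays a single name and 'repo' becomes "repo1+repo2+...+repoN".
orgs.each do |_, data|
  data[:sum].each do |col, val|
    next unless val.is_a?(String)
    data[:sum][col] = val[1..-1].split('+').uniq.join('+')
  end
end

# Collect summary rows and individual repo rows, then sort by issues, descending.
ary = []
orgs.each do |_, data|
  ary << data[:sum]
  data[:items].each { |item| ary << item }
end
ary = ary.sort_by { |row| -row[sort_by_col] }

# Write the header row followed by all data rows.
CSV.open('all_time_combined.csv', 'w', headers: ary[0].keys) do |csv|
  csv << ary[0].keys
  ary.each do |row|
    csv << row.values
  end
end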