@ohpauleez
Created March 1, 2011 06:57
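(ns clopi.core
  ;; clojure.java.io is referenced by its full name below, so it must be loaded
  (:require clojure.java.io)
  (:import (java.util.zip GZIPInputStream)
           (java.io StringReader)))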
(def *feed-archive-url* "http://clojars.org/repo/feed.clj.gz")
(defn gunzip
  "Unzip a gzip archive into an input stream"
  [archive]
  (GZIPInputStream. (clojure.java.io/input-stream archive)))
(defn istream->urls
  "Read the feed input stream line by line, generating a set of all the URLs"
  [istream]
  (let [rdr (clojure.java.io/reader istream)]
    (reduce (fn [url-set line]
              (let [jar-map (read-string line)]
                (conj url-set (get jar-map :url ""))))
            #{}
            (line-seq rdr))))
(defn project->map
  "Process a project.clj, generating a map of the data"
  [project-str]
  (if (< 1 (count project-str))
    (let [project-map-str (-> project-str
                              (.replace "(" "")
                              (.replace ")" "")
                              (.replaceFirst "defproject" "{")
                              (str "}"))]
      (read-string project-map-str))
    {}))
(defn fetch-url
  "Fetch the dependencies and dev-dependencies of a project.clj url"
  [url]
  (let [rdr (try
              (clojure.java.io/reader url)
              (catch Exception e (clojure.java.io/reader (StringReader. ""))))
        proj-map (try
                   (project->map (apply str (line-seq rdr)))
                   (catch Exception e {}))]
    (select-keys proj-map [:dependencies :dev-dependencies])))
(defn fetch-github
  "Fetch the dependencies and dev-dependencies of a GitHub-hosted project"
  [url]
  (try
    (fetch-url (.replaceFirst (str url "/raw/master/project.clj") "http:" "https:"))
    (catch Exception e {})))
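(defn count-dep-vec
  "Fold a vector of [artifact version] pairs into stats-map-start,
  incrementing the count stored under each artifact name and version"
  [dep-vector stats-map-start]
  (reduce (fn [stats-map [proj-index version-str]]
            (let [proj-index (str proj-index)]
              (update-in stats-map [proj-index version-str] (fnil inc 0))))
          stats-map-start
          dep-vector))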
(defn count-deps
  "Create a map of all artifacts, and how many times someone has a dependency on a specific version"
  [deps]
  (reduce (fn [stats-map dep-map]
            (try
              (count-dep-vec (into (get dep-map :dependencies [])
                                   (get dep-map :dev-dependencies []))
                             stats-map)
              (catch Exception e (do (println dep-map) stats-map))))
          (sorted-map)
          deps))
(ns clopi.github
  (:require [clj-github.repos :as github]))
;; this needs to be way smarter about finding the end page, and about when to rate-limit/sleep
(defn clojure-repos
  "Search GitHub for Clojure repositories, fetching result pages 1 through 49
  and sleeping one second before each request"
  []
  (let [auth {:user "OhPauleez" :token "XXXXXXXXXXXXXXXXXXXXXXXXX"}]
    (reduce (fn [res-vec page]
              (into res-vec (do (Thread/sleep 1000)
                                (github/search-repos auth "clojure" :language "Clojure" :start-page page))))
            []
            (range 1 50))))
(defn results->urls
  "Take a search results vector, and return all the urls"
  [results]
  (into #{} (map #(get %1 :url "") results)))
@ohpauleez

A union of the GitHub and Clojars results gives me 1735 repos/projects.
From Feb 28th to April 26th, that's 324 new projects in 57 days (about 5 to 6 a day).
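
The union and the per-day figure can be sketched roughly as follows. This is a minimal illustration, not the code actually used for the numbers above: the two sample sets are made up, `all-project-urls` and `projects-per-day` are hypothetical helper names, and dropping the empty-string placeholder (which `istream->urls` and `results->urls` emit for entries without a `:url`) is an assumption about how the count might be cleaned up.

```clojure
(require '[clojure.set :as set])

;; Made-up sample data standing in for the real output of
;; istream->urls (Clojars) and results->urls (GitHub).
(def clojars-urls #{"http://github.com/a/one" "http://github.com/b/two" ""})
(def github-urls  #{"http://github.com/b/two" "http://github.com/c/three"})

(defn all-project-urls
  "Union the two URL sets, removing the empty-string placeholder
  (assumption: entries without a URL should not be counted)."
  [clojars github]
  (disj (set/union clojars github) ""))

(defn projects-per-day
  "Average number of new projects per day, as an exact ratio."
  [new-projects days]
  (/ new-projects days))

;; Three distinct projects in the sample data.
(count (all-project-urls clojars-urls github-urls))

;; 324 new projects over 57 days, as a decimal: between 5 and 6.
(double (projects-per-day 324 57))
```

Keeping the rate as a ratio until the last step avoids rounding early; `double` is only applied for display.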

I wrote (and am finishing up) BitBucket bindings, which will allow us to see a large portion of all Clojure projects.
