;; Gist by @corecode, created March 8, 2013 18:00
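(ns domlink-neo4j-import.core
  ;; clojure.java.io is referenced by its full name in -main, so load it explicitly
  (:require [clojure.java.io]
            [clojure.string :as s])
  (:gen-class))
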
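(defn get-name-id
  "Looks up the id for name-str in the state map, allocating the next free
  id if the name is new. Returns [id updated-state]. An empty state map
  starts with no names and next-id 1."
  [name-str {:keys [names next-id] :as r :or {names {} next-id 1}}]
  (let [id (names name-str)]
    (if id
      [id r]
      (let [id next-id
            next-id (inc next-id)]
        [id {:names (assoc names name-str id) :next-id next-id}]))))
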
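(defn process-data
  "Reads whitespace-separated `from to link-count` lines from reader inf.
  Writes one `from-id to-id link-count` line per input line to writer outf,
  followed by one tab-separated `name id` line per distinct name."
  [inf outf]
  (binding [*flush-on-newline* false
            *out* outf]
    (->> (line-seq inf)
         ;; skip blank lines and trim, so stray whitespace never becomes a name
         (remove s/blank?)
         (map #(s/split (s/trim %) #"\s+"))
         (reduce
          (fn [names [from to link-count]]
            (let [[from-id names] (get-name-id from names)
                  [to-id names] (get-name-id to names)]
              (println from-id to-id link-count)
              names))
          {})
         ;; the reduce returns the get-name-id state; take the name->id map from it
         :names
         (map #(println (s/join "\t" %)))
         ;; map is lazy, so force it to make sure every line is printed
         dorun)
    nil))
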
(defn -main
  "Reads an input file whose lines contain
     from-name to-name link-count
  and replaces each name with a unique integer id. After the rewritten
  lines, writes the name-to-id mapping, one tab-separated pair per line.
  Output goes to stdout, or to outname when it is given."
  ([inname]
   (with-open [inf (clojure.java.io/reader inname)
               outf (clojure.java.io/writer java.lang.System/out)]
     (process-data inf outf)))
  ([inname outname]
   (with-open [inf (clojure.java.io/reader inname)
               outf (clojure.java.io/writer outname)]
     (process-data inf outf))))
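
;; REPL sketch, not part of the original gist: a minimal usage example,
;; assuming an in-memory java.io.StringReader and java.io.StringWriter stand
;; in for the input and output files. The names "a", "b" and "c" are made-up
;; sample data. line-seq needs a java.io.BufferedReader, hence the wrapper.
(comment
  (let [in  (java.io.BufferedReader.
             (java.io.StringReader. "a b 3\nb c 1\n"))
        out (java.io.StringWriter.)]
    (process-data in out)
    ;; The returned string starts with the rewritten edge lines
    ;; "1 2 3" and "2 3 1", followed by one tab-separated name/id
    ;; line for each of the three distinct names.
    (str out)))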