/Output.clj Secret

Created November 4, 2015 01:55
(time (process-stars Double/POSITIVE_INFINITY))
Total Stars Processed: 115372
Minimum Distance = 0.0
Maximum Distance = 55928.959904381874
Mean Distance = 43.352289954998135
"Elapsed time: 1197280.621103 msecs"
Program Description
A program to read star information from the HYG database, ignoring stars whose
distances are not accurately known, and compute the minimum, maximum, and mean
nearest-neighbor distances between the stars. That is, for each star, it finds
the distance to its nearest neighbor, then reports the minimum, maximum, and
mean of those distances over all stars.
The HYG Database was downloaded from https://github.com/astronexus/HYG-Database.
The file used was hygxyz.csv, which contained 119617 lines of stars.
Stars with a recorded distance of 10,000,000, the value the program treats as not accurately known, were filtered out.
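The filtering rule can be illustrated on a few made-up records. This is only a sketch: `sample-stars`, `known-distance?`, and `kept-ids` are hypothetical names that do not appear in the program, and the sample values are invented.

```clojure
;; Hypothetical sample data: three star maps, one of which carries
;; the 10,000,000 distance that the program filters out.
(def sample-stars
  [{:id 1 :distance 4.2}
   {:id 2 :distance 10000000}
   {:id 3 :distance 250.0}])

(defn known-distance?
  "True when the star's distance is not the 10,000,000 value."
  [star]
  (not= (:distance star) 10000000))

;; Keep only the stars with a usable distance and collect their ids.
(def kept-ids (mapv :id (filter known-distance? sample-stars)))
;; kept-ids is [1 3]
```

Note that `not=` treats an integer and a floating-point number as different values, so this check relies on the value being read as the integer 10000000.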
Program Design
My program reads each line of the CSV file and splits it on the comma
character, storing the results. The relevant fields from each line (the id,
the distance, and the x, y, and z coordinates) are then used to create a Star
object. These Star objects are the basis for all the remaining processing.
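As a rough illustration of that parsing step, the sketch below turns one comma-separated line into a star map. It assumes a shortened, made-up line with only five fields; the real hygxyz.csv rows have many more columns, and the program itself reads them with clojure.data.csv rather than splitting by hand. `sample-line` and `line->star` are hypothetical names.

```clojure
(require '[clojure.string :as str])

;; Hypothetical, shortened line: id, distance, x, y, z.
(def sample-line "7,219.78,1.1,-2.5,0.3")

(defn line->star
  "Splits a comma-separated line and builds a star map from its fields."
  [line]
  (let [[id dist x y z] (str/split line #",")]
    {:id       (Long/parseLong id)
     :distance (Double/parseDouble dist)
     :x        (Double/parseDouble x)
     :y        (Double/parseDouble y)
     :z        (Double/parseDouble z)}))

(line->star sample-line)
;; => {:id 7, :distance 219.78, :x 1.1, :y -2.5, :z 0.3}
```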
After the Star objects are created, the program loops through them to compute
the nearest-neighbor distance for each. Every Star object is checked against
the set of all other Star objects to find the closest one, resulting in an
O(n^2) algorithm.
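The all-pairs search can be seen on a tiny example. This sketch uses three invented points whose pairwise distances are easy to verify by hand (5, 12, and 13); `points`, `dist`, `nearest-distance`, and `nearest` are hypothetical names, and `Math/sqrt` and `Math/pow` stand in for the numeric-tower functions used in the program.

```clojure
;; Hypothetical sample points.
(def points
  [{:id 1 :x 0.0 :y 0.0 :z 0.0}
   {:id 2 :x 3.0 :y 4.0 :z 0.0}
   {:id 3 :x 3.0 :y 4.0 :z 12.0}])

(defn dist
  "Euclidean distance between two points."
  [a b]
  (Math/sqrt (+ (Math/pow (- (:x a) (:x b)) 2)
                (Math/pow (- (:y a) (:y b)) 2)
                (Math/pow (- (:z a) (:z b)) 2))))

(defn nearest-distance
  "Distance from p to the closest other point in ps (brute force)."
  [p ps]
  (apply min (for [q ps :when (not= (:id p) (:id q))]
               (dist p q))))

;; One pass over all points, each compared against all others: O(n^2).
(def nearest (mapv #(nearest-distance % points) points))
;; nearest is [5.0 5.0 12.0]

(println "min" (apply min nearest)
         "max" (apply max nearest)
         "mean" (/ (reduce + nearest) (count nearest)))
```

For these points the minimum is 5.0, the maximum is 12.0, and the mean is about 7.33.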
(ns clojure-noob.core
  (:gen-class)
  (:require [clojure.data.csv :as csv]
            [clojure.java.io :as io]
            [clojure.math.numeric-tower :as math]))

(defn take-csv
  "Takes file name and reads data into list of vectors"
  [fname]
  (with-open [in-file (io/reader fname)]
    (doall
      (csv/read-csv in-file))))

(def input-seq (take-csv "hygxyz.csv"))
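
(defn to-num
  "Outputs a number for nth item in collection"
  [coll n]
  ; read-string handles both integer and decimal fields; for untrusted
  ; input, clojure.edn/read-string is the safer reader
  (read-string (nth coll n)))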

(defn create-map
  "Transforms list of star vectors into list of maps"
  [stars]
  (map (fn [row]
         {:id       (to-num row 0)
          :distance (to-num row 9)
          :x        (to-num row 17)
          :y        (to-num row 18)
          :z        (to-num row 19)})
       stars))

(defn distance-filter
  "Filter star maps to only desired stars"
  [stars max-distance]
  (filter (fn [star]
            ; Filter out distances greater than input
            ; Also filter out 10 mil distances
            (and (<= (:distance star) max-distance)
                 (not= (:distance star) 10000000)))
          ; Ignore the first line (the CSV header)
          (rest stars)))

(defn compute-distance
  "Computes the distance between two stars"
  [a b]
  (math/sqrt
    (+ (math/expt (- (:x a) (:x b)) 2)
       (math/expt (- (:y a) (:y b)) 2)
       (math/expt (- (:z a) (:z b)) 2))))

(defn mean
  "Returns the mean of a collection"
  [coll]
  (/ (reduce + coll) (count coll)))

(defn find-nearest-neighbor
  "Returns the distance to nearest neighboring star"
  [star stars]
  (reduce (fn [best x]
            (min best (compute-distance star x)))
          (compute-distance star (first stars))
          (rest stars)))

(defn remove-self
  "Removes the given star from stars collection"
  [star stars]
  (remove #(= (:id star) (:id %)) stars))

(defn process-stars
  "Get the min, max, and mean distances of all nearest neighbors"
  [max-distance]
  (let [; create a list of stars that are within specified distance
        stars (distance-filter (create-map input-seq) max-distance)
        ; create a list of the distances to nearest neighbor for each star
        distances (pmap #(find-nearest-neighbor % (remove-self % stars))
                        stars)]
    (println "Total Stars Processed: " (count stars))
    (println "Minimum Distance = " (apply min distances))
    (println "Maximum Distance = " (apply max distances))
    (println "Mean Distance = " (mean distances))))