I hereby claim:
- I am erasmas on github.
- I am dmorozov (https://keybase.io/dmorozov) on keybase.
- I have a public key whose fingerprint is 6200 D027 37D4 E068 BDE5 78AB 186A 11CD 1418 AD13
To claim this, I am signing this object:
(ns hello-world.core
  (:require
    [cljs.core.async :as async :refer [put! chan <! >! close!]]
    [cljs.nodejs :as node])
  (:require-macros
    [cljs.core.async.macros :refer [go]]))
(enable-console-print!)
(node/enable-util-print!)
(testing
  "take-ordered returns the first N elements of an RDD using the natural ordering"
  (is (= (-> (s/parallelize c [[1 -1] [2 -2] [3 -3] [4 -4]])
             (s/take-ordered 1))
         [[1 -1]])))
(testing
  "take-ordered returns the first N elements of an RDD as defined by the specified comparator"
  (is (= (-> (s/parallelize c [[1 -1] [2 -2] [3 -3] [4 -4]])
             (s/take-ordered 1 (comparator (fn [[_ y1] [_ y2]] (< y1 y2)))))
         [[4 -4]])))
; The R language has a very handy function that reads FWF (fixed-width format) files without any hassle:
; http://stat.ethz.ch/R-manual/R-patched/library/utils/html/read.fwf.html
; The following Clojure function parses a single line of an FWF file.
; As in read.fwf, a negative width skips that many characters.
(defn fixed-width-format
  [s widths]
  (->> (reductions (fn [acc n] (+ acc (Math/abs n))) (cons 0 widths)) ; cumulative column offsets
       (partition 2 1)                                                 ; one [start end] pair per column
       (map (fn [[start end]] (subs s start end)))
       (keep-indexed (fn [idx e] (when (pos? (nth widths idx)) e))))) ; drop the skipped columns
The following steps set up a build status icon in your OS X menu bar using AnyBar. My current project uses Bamboo for CI, which provides a REST API for querying build statuses. You can show the result in the menu bar as a green or red icon, depending on the state of the build.
Install Dependencies
# install cask and anybar
brew install caskroom/cask/brew-cask
brew cask install anybar
# install socat and jq
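With the tools above in place, the status check itself could look roughly like the sketch below. This is a hedged illustration, not the original script: the function names `state_to_color` and `notify_anybar` are made up for this example, as are the `BAMBOO_URL` and `PLAN_KEY` placeholders, and the Bamboo state strings ("Successful", "Failed") and the `.state` JSON field are assumptions to check against your Bamboo version. AnyBar accepts a colour name as a plain-text UDP message, by default on port 1738.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: translate a Bamboo build state into an AnyBar colour.

# ASSUMPTION: Bamboo reports build states as "Successful" or "Failed".
# Anything else (unknown plan, build in progress, network error) maps to
# AnyBar's "question" icon.
state_to_color() {
  case "$1" in
    Successful) echo "green" ;;
    Failed)     echo "red" ;;
    *)          echo "question" ;;
  esac
}

# Send a colour name to AnyBar over UDP (default port 1738).
notify_anybar() {
  printf '%s' "$1" | socat - UDP4:localhost:1738
}

# Example wiring, left commented out because BAMBOO_URL and PLAN_KEY are
# placeholders and the REST path and ".state" field are assumptions:
# state=$(curl -s "$BAMBOO_URL/rest/api/latest/result/$PLAN_KEY/latest.json" | jq -r '.state')
# notify_anybar "$(state_to_color "$state")"
```

Keeping the state-to-colour mapping in its own function makes it easy to check without a running Bamboo server or AnyBar instance.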
(defn dirs-with-parquet
  "Recursively find directories on HDFS that contain *.parquet files"
  [path]
  (let [fs (FileSystem/getLocal (Configuration.))
        directory (Path. path)
        dir? #(.isDirectory fs %)
        contains-parquet (fn [path] (not (empty? (.globStatus fs (Path. path "*.parquet")))))
        dirs (tree-seq dir?
                       (fn [path] (map #(.getPath %) (.listStatus fs path)))
                       directory)]
package com.mycompany;
import cascading.flow.Flow;
import cascading.flow.FlowConnector;
import cascading.flow.FlowDef;
import cascading.flow.hadoop2.Hadoop2MR1FlowConnector;
import cascading.pipe.GroupBy;
import cascading.pipe.Pipe;
import cascading.property.AppProps;
import cascading.scheme.Scheme;
2015-02-04 12:47:45,300 INFO [LocalJobRunner Map Task Executor #0] mapred.Task (Task.java:initialize(581)) - Using ResourceCalculatorProcessTree : null
2015-02-04 12:47:45,327 INFO [LocalJobRunner Map Task Executor #0] io.MultiInputSplit (MultiInputSplit.java:readFields(161)) - current split input path: file:/tmp/staging/data/staging-path/part-00000
2015-02-04 12:47:45,328 INFO [LocalJobRunner Map Task Executor #0] mapred.MapTask (MapTask.java:updateJobWithSplit(462)) - Processing split: cascading.tap.hadoop.io.MultiInputSplit@29033afe
2015-02-04 12:47:45,333 WARN [LocalJobRunner Map Task Executor #0] io.MultiInputFormat (Util.java:retry(768)) - unable to get record reader, but not retrying
java.io.IOException: file:/tmp/staging/data/staging-path/part-00000 not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1850)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1810)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1759)
    a
(ns cascalog-class.core
  (:require [cascalog.api :refer :all]
            [cascalog.ops :as c]))
(defmapcatop split
  [^String sentence]
  (.split sentence "\\s+"))
(def -main
  (?<- (stdout)
(let [fs (FileSystem/getLocal (Configuration.))
      path (Path. "/data/in/*.txt.gz")]
  (map #(.getPath %) (.globStatus fs path)))