Antonios Chalkiopoulos Antwnis

version: '2'
services:
  zookeeper:
    image: "confluentinc/cp-zookeeper"
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      - ZOOKEEPER_CLIENT_PORT=2181
  kafka:
for (Footer f : ParquetFileReader.readFooters(conf, fs, false)) {
  // Sum the row counts of every row group (block) in each file footer
  for (BlockMetaData b : f.getParquetMetadata().getBlocks()) {
    rowCount += b.getRowCount();
  }
}
Antwnis / gist:0dea8383788549302eb3
Created January 29, 2015 11:25
Capturing Deleted/Inserted records
package org.fannan.etl.examples

import cascading.pipe.joiner.{OuterJoin, LeftJoin}
import com.twitter.scalding._

class IUDJob(args: Args) extends Job(args) {
  val schema = List('CustID, 'AccountID, 'LastUpdateDate)
  val old_data = List(
Antwnis / gist:eb248ebf6d8812e902e6
Created November 20, 2014 09:41
install_maven_centos.sh
## Maven ##
wget http://www.motorlogy.com/apache/maven/maven-3/3.2.1/binaries/apache-maven-3.2.1-bin.zip -O /usr/local/src/maven-3.2.1.zip
unzip /usr/local/src/maven-3.2.1.zip -d /opt
mv /opt/apache-maven-3.2.1 /opt/maven
ln -s /opt/maven/bin/mvn /usr/bin/mvn
bash -c "echo 'export MAVEN_HOME=/opt/maven' > /etc/profile.d/maven.sh"
bash -c "echo 'export MAVEN_OPTS=\"-Xms512m -Xmx2g -XX:MaxPermSize=512m -XX:ReservedCodeCacheSize=512m\"' >> /etc/profile.d/maven.sh"
bash -c "echo 'export CLASSPATH=.' >> /etc/profile.d/maven.sh"
source /etc/profile.d/maven.sh
Antwnis / install_gradle_centos.sh
Last active September 22, 2015 11:03
Install Gradle 2.2 on CentOS-7
#!/bin/bash
# installs to /opt/gradle
# existing versions are not overwritten/deleted
# seamless upgrades/downgrades
# $GRADLE_HOME points to latest *installed* (not released)
gradle_version=2.2
mkdir -p /opt/gradle
wget -N http://services.gradle.org/distributions/gradle-${gradle_version}-all.zip
unzip -oq ./gradle-${gradle_version}-all.zip -d /opt/gradle
ln -sfnv gradle-${gradle_version} /opt/gradle/latest
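
The versioned-directory plus `latest` symlink layout described in the comments of this script can be tried safely outside /opt. The following is a minimal sketch, assuming a throwaway directory from mktemp and empty placeholder directories standing in for real Gradle distributions; the names root and v are illustrative and not part of the original script.

```shell
#!/bin/bash
# Sketch only: reproduces the "one directory per version + latest symlink"
# layout in a temp directory. No Gradle download happens here.
set -eu

root=$(mktemp -d)   # stand-in for /opt/gradle (assumption)

# Simulate installing two versions one after the other.
for v in 2.1 2.2; do
  mkdir -p "$root/gradle-$v"          # existing versions are left in place
  ln -sfn "gradle-$v" "$root/latest"  # repoint the symlink to the newest install
done

# Show where "latest" points after both installs.
readlink "$root/latest"
```

Because every version keeps its own directory, switching back is a single `ln -sfn` to the older directory name. The temp directory is left behind so it can be inspected; remove it with `rm -rf "$root"` when done.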
Antwnis / install_scala_centos.sh
Last active September 9, 2022 20:09
Install Scala CentOS
export SCALA_VERSION=scala-2.11.5
sudo wget http://www.scala-lang.org/files/archive/${SCALA_VERSION}.tgz
echo "SCALA_HOME=/usr/local/scala/${SCALA_VERSION}" | sudo tee /etc/profile.d/scala.sh > /dev/null
echo 'export SCALA_HOME' | sudo tee -a /etc/profile.d/scala.sh > /dev/null
sudo mkdir -p /usr/local/scala
sudo -s cp $SCALA_VERSION.tgz /usr/local/scala/
cd /usr/local/scala/
sudo -s tar xvf $SCALA_VERSION.tgz
sudo rm -f $SCALA_VERSION.tgz
sudo chown -R root:root /usr/local/scala
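
Writing to a file under /etc/profile.d from a non-root shell is commonly done by piping into `tee`, because an output redirection is opened by the calling shell rather than by the command that sudo runs. Below is a minimal sketch of that pattern, assuming a temp file as the target so it runs without root; PROFILE_FILE and SUDO are illustrative names that do not appear in the original script.

```shell
#!/bin/bash
# Sketch only: write environment settings to a profile script through tee.
set -eu

PROFILE_FILE=${PROFILE_FILE:-$(mktemp)}  # on a real host: /etc/profile.d/scala.sh
SUDO=${SUDO:-}                           # set SUDO=sudo when the target is root-owned

# tee runs under $SUDO, so tee (not the calling shell) opens the target file.
printf '%s\n' \
  'SCALA_HOME=/usr/local/scala/scala-2.11.5' \
  'export SCALA_HOME' | $SUDO tee "$PROFILE_FILE" > /dev/null

# Load the file into the current shell and print the variable it defines.
. "$PROFILE_FILE"
echo "$SCALA_HOME"
```

On a real machine the same script could be run as `SUDO=sudo PROFILE_FILE=/etc/profile.d/scala.sh`, again as an assumption about how one might wire it up.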
Antwnis / install_sbt_centos.sh
Created November 5, 2014 12:38
Install SBT 0.13.1
echo 'SBT_HOME=/usr/local/sbt/sbt-0.13.1' | sudo tee /etc/profile.d/sbt.sh > /dev/null
echo 'export SBT_HOME' | sudo tee -a /etc/profile.d/sbt.sh > /dev/null
sudo mkdir -p /usr/local/sbt
wget http://repo.scala-sbt.org/scalasbt/sbt-native-packages/org/scala-sbt/sbt/0.13.1/sbt.tgz
sudo -s cp sbt.tgz /usr/local/sbt/
cd /usr/local/sbt/
sudo -s tar xvf sbt.tgz
sudo mv sbt sbt-0.13.1
sudo rm -f sbt.tgz
sudo chown -R root:root /usr/local/sbt
Antwnis / build.sbt
Last active April 4, 2018 12:55
Exclude SBT dependencies for "hadoop-common"
import AssemblyKeys._

assemblySettings

net.virtualvoid.sbt.graph.Plugin.graphSettings

name := "hbasetest"

version := "1.0"
Antwnis / gist:4953c1effc38800e9d0b
Created August 19, 2014 10:40
Hashing all Fields in Scalding
import cascading.pipe.Pipe
import cascading.tuple.{TupleEntry, Fields}
import com.twitter.scalding._

// One trait + one object = Custom Operations
trait HashOperations extends FieldConversions {
  def self: RichPipe

  def generateHash: Pipe = self
    .map(Fields.ALL -> 'hash) { te: TupleEntry =>
      val tuple = te.getTuple