Olalekan Fuad Elesin OElesin

Examples for Python and Spark


  • Word Count
import sys
from operator import add
from pyspark import SparkContext

if __name__ == "__main__":
    if len(sys.argv) != 2:
-- This is a Hive program. Hive is an SQL-like language that compiles
-- into Hadoop Map/Reduce jobs. It's very popular among analysts at
-- Facebook, because it allows them to query enormous Hadoop data
-- stores using a language much like SQL.
-- Our logs are stored on the Hadoop Distributed File System, in the
-- directory /logs/ They're ordinary Apache
-- logs in *.gz format.
-- We want to pretend that these gzipped log files are a database table,
OElesin /
Last active August 29, 2015 14:24 — forked from mikedfunk/

This uses Twitter Bootstrap classes for CodeIgniter pagination.

Drop this file into application/config.

OElesin /
Last active August 29, 2015 14:24 — forked from staltz/

The introduction to Reactive Programming you've been missing

(by @andrestaltz)

So you're curious about learning this new thing called Reactive Programming, particularly its variant comprising Rx, Bacon.js, RAC, and others.

Learning it is hard, and the lack of good material makes it even harder. When I started, I tried looking for tutorials. I found only a handful of practical guides, but they just scratched the surface and never tackled the challenge of building the whole architecture around it. Library documentation often doesn't help when you're trying to understand some function. I mean, honestly, look at this:

Rx.Observable.prototype.flatMapLatest(selector, [thisArg])

Projects each element of an observable sequence into a new sequence of observable sequences by incorporating the element's index and then transforms an observable sequence of observable sequences into an observable sequence producing values only from the most recent observable sequence.
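The description quoted above is easier to follow with a small simulation. The following is a minimal sketch in plain Python, not the Rx implementation: it models each stream as a list of (time, value) pairs, and keeps a value from an inner sequence only if it arrives before the next outer element replaces that sequence. The names `flat_map_latest`, `search`, and `outer` are hypothetical, and the sketch leaves out the element index that the real selector also receives.

```python
from typing import Callable, Iterable, List, Tuple

Event = Tuple[float, str]  # (timestamp, value); an assumption made for this sketch


def flat_map_latest(
    outer: Iterable[Event],
    selector: Callable[[str], Iterable[Event]],
) -> List[Event]:
    """Offline simulation of "only the most recent inner sequence".

    `selector` returns inner events as (delay, value), where the delay is
    measured from the moment the outer element arrived.
    """
    outer_events = sorted(outer, key=lambda event: event[0])
    result: List[Event] = []
    for i, (start, value) in enumerate(outer_events):
        # The inner sequence stays "latest" until the next outer element arrives.
        if i + 1 < len(outer_events):
            next_start = outer_events[i + 1][0]
        else:
            next_start = float("inf")
        inner = sorted(
            ((start + delay, inner_value) for delay, inner_value in selector(value)),
            key=lambda event: event[0],
        )
        # Drop anything the inner sequence would emit after being replaced.
        result.extend(event for event in inner if event[0] < next_start)
    return result


def search(query: str) -> List[Event]:
    # Hypothetical lookup: two results, arriving 1 and 3 time units after the query.
    return [(1, query + ":first"), (3, query + ":second")]


# Three queries typed at times 0, 2, and 10.
outer = [(0, "a"), (2, "ab"), (10, "abc")]
for event in flat_map_latest(outer, search):
    print(event)
```

The second result for "a" would arrive at time 3, but "ab" was typed at time 2, so that result is discarded. This is the behavior that makes the operator useful for cases such as search-as-you-type, where results for a stale query should not show up.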

OElesin / Mail.scala
Created October 9, 2015 05:41 — forked from mariussoutier/Mail.scala
Sending mails fluently in Scala
package object mail {
implicit def stringToSeq(single: String): Seq[String] = Seq(single)
implicit def liftToOption[T](t: T): Option[T] = Some(t)
sealed abstract class MailType
case object Plain extends MailType
case object Rich extends MailType
case object MultiPart extends MailType
OElesin / codeigniter-rating-lib.php
Created December 5, 2015 14:00 — forked from escapeboy/codeigniter-rating-lib.php
CodeIgniter Rating Library + Microdata (optional)
<?php if ( ! defined('BASEPATH')) exit('No direct script access allowed');
* Rating Library
* Using jQuery Raty plugin to rate products
* @author Nikola Katsarov
* @website
class Rating {
package botkop.sparti.receiver
import com.rabbitmq.client._
import org.apache.spark.Logging
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.receiver.Receiver
import scala.reflect.ClassTag
OElesin /
Created March 9, 2016 18:12 — forked from Antwnis/
Install Scala CentOS
export SCALA_VERSION=scala-2.11.5
sudo wget${SCALA_VERSION}.tgz
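# sudo does not apply to shell redirection, so write the file through tee instead
echo "SCALA_HOME=/usr/local/scala/scala-2.11.5" | sudo tee /etc/profile.d/ > /dev/null
echo 'export SCALA_HOME' | sudo tee -a /etc/profile.d/ > /dev/null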
sudo mkdir -p /usr/local/scala
sudo -s cp $SCALA_VERSION.tgz /usr/local/scala/
cd /usr/local/scala/
sudo -s tar xvf $SCALA_VERSION.tgz
sudo rm -f $SCALA_VERSION.tgz
sudo chown -R root:root /usr/local/scala
OElesin / LDA_SparkDocs
Created July 22, 2016 15:33 — forked from jkbradley/LDA_SparkDocs
LDA Example: Modeling topics in the Spark documentation
This example uses Scala. Please see the MLlib documentation for a Java example.
Try running this code in the Spark shell. It may produce different topics each time (since LDA includes some randomization), but it should give topics similar to those listed above.
This example is paired with a blog post on LDA in Spark:
import scala.collection.mutable
OElesin /
Created September 1, 2016 14:11 — forked from alekseyl1992/
Apache Spark. Training MLP on MNIST
from __future__ import print_function
from pyspark import SparkContext, SparkConf
from pyspark.mllib.linalg import DenseVector, VectorUDT
from pyspark.sql import SQLContext
from import MultilayerPerceptronClassifier
from import MulticlassClassificationEvaluator
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, ArrayType