Bill Metangmo billmetangmo

  • France
@billmetangmo
billmetangmo / page_rank_amazon.scala
Created February 5, 2018 12:27
Simple PageRank code for spark-shell using Amazon Movies & TV reviews as data source
import org.apache.spark._
import org.apache.spark.graphx._
import java.io.File
import java.time._
import java.util.Calendar
import scala.util.MurmurHash
import java.io.PrintWriter
import org.apache.spark.rdd.RDD
billmetangmo / NutchToElastic.scala
Created February 5, 2018 12:38
Index Nutch data in Elasticsearch 5.x from spark-shell
import org.elasticsearch.hadoop._
import org.elasticsearch.spark._
import java.time._
import java.time.format._
// Get parameters
// args(0) = NameNode IP address (Resource Manager on the MapR distribution)
// args(1) = nutch segment name
// args(2) = Elasticsearch index name
val args = sc.getConf.get("spark.driver.args").split("\\s+")
billmetangmo / install_ansible_docker_offline.sh
Created February 5, 2018 12:51
Install and clean up Ansible and Docker offline or without access to EPEL
#!/usr/bin/env bash
## default variables values ##
## TODO: ansible_mirror ####
## TODO: des_copy == src_install ##
## see parameter expansion: https://opensource.com/article/17/6/bash-parameter-expansion?sc_cid=70160000001273HAAQ ###
mirror=ftp://195.220.108.108/linux
docker_mirror=https://yum.dockerproject.org/repo/main/centos/7/Packages
dest_copy=${HOME}/ansible/Packages
billmetangmo / git_lf_ending_only.sh
Created February 5, 2018 13:19
Configure Git to work with LF line endings on Windows
## Reference: https://stackoverflow.com/questions/1967370/git-replacing-lf-with-crlf
git config --local core.autocrlf input
billmetangmo / DockerFile
Created February 5, 2018 13:49
Dockerfile for Spark + RDMA from HiBD (Ohio State University)
FROM centos:latest
LABEL maintainer="Bill METANGMO @billmetangmo github" \
description="Optimized for HPC machine learning & Graph processing apps"
#dependencies="bash,procps,openjdk8-jre-base,openssh,ca-certificates"\
#external="scala"
# Proxies are not defined as env variables because this image will be used in client environments where our proxy doesn't make sense
ARG http_proxy=
ARG https_proxy=
ARG no_proxy=localhost,127.0.0.1
billmetangmo / README-Template.md
Created February 14, 2018 14:55 — forked from PurpleBooth/README-Template.md
A template to make good README.md

Project Title

One Paragraph of project description goes here

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

billmetangmo / registry.yml
Last active May 30, 2018 13:57
ansible playbook to set up a docker registry with xfs + proxies + limesurvey
---
- hosts: registry
  become: true
  vars:
    public_ip: 10.197.138.130
  tasks:
    - name: set selinux to permissive
      selinux:
        policy: targeted
billmetangmo / unzip.go
Created April 30, 2018 11:17
Extract files from a zip archive that contains only file entries (no directory entries), even when those files sit inside directories
// Iterate through all the zip files in multi-part form-data (memory or disk).
// For each file, extract it to destDir and change the file's owner to the given user.
// If the zip file is a directory, empty subdirectories are ignored.
func extractFilesWithUserRights(files []*zip.File, filePath string, username string) error {
	uid, gid, err := getIdentifier(username)
	if err != nil {
		return err
	}
billmetangmo / binary_search.py
Created March 20, 2019 18:10
Binary search for a value greater than or equal to an item
def binary_search(alist, item):
    first = 0
    last = len(alist) - 1
    found = False
    while first <= last and not found:
        # Floor division keeps midpoint an integer index on Python 3
        midpoint = (first + last) // 2
        if alist[midpoint] >= item:
            found = True
        else:
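The preview above stops at the `else:` branch, so the rest of the gist is not shown here. As a minimal sketch only, assuming the intent is to return the index of the first element greater than or equal to `item` in a sorted list, the search could be completed as follows. The name `lower_bound` and the `-1` "not found" value are assumptions for illustration, not taken from the gist.

```python
def lower_bound(alist, item):
    """Return the index of the first element >= item in a sorted list.

    Returns -1 when every element is smaller than item (hypothetical
    convention chosen for this sketch).
    """
    first = 0
    last = len(alist) - 1
    result = -1
    while first <= last:
        midpoint = (first + last) // 2
        if alist[midpoint] >= item:
            # Candidate found; keep searching the left half for an earlier one.
            result = midpoint
            last = midpoint - 1
        else:
            # Everything up to midpoint is too small; search the right half.
            first = midpoint + 1
    return result


if __name__ == "__main__":
    print(lower_bound([1, 3, 5, 7], 4))
```

Unlike a loop that stops at the first match it meets, this variant keeps narrowing to the left, so with duplicates it returns the leftmost qualifying index. The standard library's `bisect.bisect_left` gives the same position, except that it returns `len(alist)` rather than `-1` when no element qualifies.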
billmetangmo / remove_unicode_escape
Last active April 10, 2021 21:51
OpenRefine Jython interpreter
return value.encode("utf-8").decode('raw-unicode-escape')
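For illustration, the one-liner can be tried outside OpenRefine. This is a sketch under plain Python 3 rather than Jython; the wrapper name `remove_unicode_escape` is borrowed from the gist's filename, and the sample string is an assumption.

```python
def remove_unicode_escape(value):
    # Turn literal "\uXXXX" sequences in the text into the characters they name.
    return value.encode("utf-8").decode("raw-unicode-escape")


if __name__ == "__main__":
    # The input holds a backslash, "u", and four hex digits, not the accented letter.
    print(remove_unicode_escape("caf\\u00e9"))  # prints: café
```

Because `raw-unicode-escape` reads every byte outside an escape sequence as Latin-1, this round trip is only safe when the input is ASCII apart from the escapes. Text that already contains non-ASCII characters would likely come out garbled.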