GitHub gists by Dimitri Sifoua (dksifoua)
__all__ = ['EMO_UNICODE', 'UNICODE_EMO', 'EMOTICONS', 'EMOTICONS_EMO']
EMOTICONS = {
u":‑\)": "Happy face or smiley",
u":\)": "Happy face or smiley",
u":-\]": "Happy face or smiley",
u":\]": "Happy face or smiley",
u":-3": "Happy face smiley",
u":3": "Happy face smiley",
u":->": "Happy face smiley",
$ sudo useradd kafka -m
The -m flag ensures that a home directory will be created for the user.
This home directory, /home/kafka, will act as our workspace directory for executing commands in the sections below.
Set the password using passwd:
$ sudo passwd kafka
Add the kafka user to the sudo group:
$ sudo adduser kafka sudo
# Create the file /etc/systemd/system/zookeeper.service with the following content:
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
User=dimitri_sifoua
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
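As written, the unit has no [Install] section, so it can be started with `systemctl start zookeeper` but not enabled at boot. A minimal addition (assuming the usual multi-user target) would be:

```ini
[Install]
WantedBy=multi-user.target
```

After editing the unit file, reload systemd with `sudo systemctl daemon-reload` before starting or enabling the service.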
@dksifoua
dksifoua / gcloud
Last active September 23, 2019 11:58
## SSH
> gcloud compute ssh spark-cluster-m --zone=us-east1-c --ssh-flag="-D" --ssh-flag="10000" --ssh-flag="-N"
The flag -D enables dynamic port forwarding (a SOCKS proxy); 10000 is the local port it listens on.
The flag -N instructs ssh not to open a remote shell, so the command only creates the tunnel.
## Start a new browser session that uses the SOCKS proxy through the SSH tunnel created above.
> "DIR of chrome.exe" "http://spark-cluster-m:8080" --proxy-server="socks5://localhost:10000" --host-resolver-rules="MAP * 0.0.0.0 , EXCLUDE localhost" --user-data-dir=/tmp/spark-cluster-m
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType
from pyspark.sql.functions import udf

spark = SparkSession.builder.getOrCreate()
# UDF that labels each row "adult" or "child" based on the age column
maturity_udf = udf(lambda age: "adult" if age >= 18 else "child", StringType())
df = spark.createDataFrame([{'name': 'Alice', 'age': 1}])
df = df.withColumn("maturity", maturity_udf(df.age))
df.show()
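The lambda inside the UDF is ordinary Python, so the thresholding logic can be checked outside Spark. A minimal sketch (the function name `maturity` is mine, for illustration):

```python
def maturity(age: int) -> str:
    # Same rule as the lambda passed to udf(): 18 and over is "adult"
    return "adult" if age >= 18 else "child"

print(maturity(1))   # child
print(maturity(18))  # adult
```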
import nltk
from textblob import TextBlob
from nltk.corpus import wordnet

nltk.download('averaged_perceptron_tagger')
nltk.download('punkt')
nltk.download('wordnet')

def to_wordnet(tag):
    # Map a Penn Treebank POS tag to the corresponding WordNet constant
    if tag in ("NN", "NNS", "NNP", "NNPS"):
        return wordnet.NOUN
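The truncated helper above maps Penn Treebank POS tags to WordNet parts of speech. A complete, self-contained sketch of that mapping follows; the name `to_wordnet_pos` and the single-letter return codes (which match NLTK's wordnet constants NOUN='n', VERB='v', ADJ='a', ADV='r') are my assumptions:

```python
def to_wordnet_pos(tag):
    # Nouns: NN, NNS, NNP, NNPS
    if tag in ("NN", "NNS", "NNP", "NNPS"):
        return "n"
    # Verbs: VB, VBD, VBG, VBN, VBP, VBZ
    if tag.startswith("VB"):
        return "v"
    # Adjectives: JJ, JJR, JJS
    if tag.startswith("JJ"):
        return "a"
    # Adverbs: RB, RBR, RBS
    if tag.startswith("RB"):
        return "r"
    # Anything else has no WordNet counterpart
    return None

print(to_wordnet_pos("NNS"))  # n
print(to_wordnet_pos("VBD"))  # v
```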
@dksifoua
dksifoua / gist:d2c775e2de272091e150f9aa259680ee
Created October 11, 2019 15:41
run_python_script_background.sh
# Run the script in the background, immune to hangups; output goes to output.log
nohup python index.py > output.log &
# Find the background process
ps ax | grep index.py
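The same pattern in a self-contained form, capturing the PID with `$!` instead of grepping for it. Here `sleep 30` is a stand-in for `python index.py`, so the sketch runs anywhere:

```shell
# Stand-in for "python index.py": any long-running command works here
nohup sleep 30 > output.log 2>&1 &
# $! holds the PID of the most recent background job; keep it for later
pid=$!
# Confirm the job is running, then stop it
ps -p "$pid" > /dev/null && echo "running as PID $pid"
kill "$pid"
```

Capturing `$!` is more robust than `ps ax | grep`, which also matches the grep process itself.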
[Unit]
Description=Zeppelin service
After=syslog.target network.target
[Service]
Type=forking
ExecStart=/opt/zeppelin-0.8.2-bin-all/bin/zeppelin-daemon.sh start
ExecStop=/opt/zeppelin-0.8.2-bin-all/bin/zeppelin-daemon.sh stop
ExecReload=/opt/zeppelin-0.8.2-bin-all/bin/zeppelin-daemon.sh reload
Restart=always
# On the master
# The master will start and print its address, e.g. spark://IP:PORT
$ sudo ./spark-class org.apache.spark.deploy.master.Master
# On the workers
$ sudo ./spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT
# Start the shell
$ sudo ./spark-shell --master spark://IP:PORT
@dksifoua
dksifoua / openai-gym-rendering-colab.py
Last active March 3, 2020 21:33
Rendering OpenAI Gym in Google Colaboratory
# Install gym dependencies
!apt-get update > /dev/null 2>&1
!apt-get install python-opengl -y > /dev/null 2>&1
!apt install xvfb -y --fix-missing > /dev/null 2>&1
!apt-get install ffmpeg > /dev/null 2>&1
!apt-get install x11-utils > /dev/null 2>&1
# Install rendering environment
!pip install pyvirtualdisplay > /dev/null 2>&1
!pip install pyglet > /dev/null 2>&1