Gui Braccialli gbraccialli

gbraccialli / clickstream.ipynb
Last active February 23, 2021 04:45
clickstream
#(crontab -l 2>/dev/null; echo "*/1 * * * * /root/check_memory.sh") | crontab -
# Free memory in GB, read from the "-/+ buffers/cache" row printed by older versions of free
free=$( free -g | grep "buffers/cache" | awk '{print $4}')
if [ "$free" -le 100 ]
then
echo "memory usage too high, only $free GB free, killing python at $(date)" | tee -a /var/log/memory_check.log
ps aux --sort -rss | tee -a /var/log/memory_check.log
pkill -9 python
else
echo "memory ok $free free at $(date)" | tee -a /var/log/memory_check.log
fi
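The check above depends on the "buffers/cache" row, which newer versions of `free` no longer print. As a rough sketch only, the same threshold test could read `MemAvailable` from `/proc/meminfo` instead. The function names, the sample text, and the choice of `MemAvailable` are assumptions for illustration and are not part of the original script.

```python
# Hypothetical sketch: decide whether free memory is at or below a threshold,
# using /proc/meminfo-style text instead of the output of `free`.

SAMPLE_MEMINFO = (
    "MemTotal:       16777216 kB\n"
    "MemFree:         1048576 kB\n"
    "MemAvailable:    2097152 kB\n"
)


def available_gib(meminfo_text):
    """Return the MemAvailable value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 * 1024)
    raise ValueError("MemAvailable not found")


def should_kill(meminfo_text, threshold_gib=100):
    """Mirror the shell test: True when available memory <= threshold."""
    return available_gib(meminfo_text) <= threshold_gib
```

On a live Linux host the text would come from reading `/proc/meminfo`. The default of 100 simply mirrors the threshold in the shell script.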
cd /tmp
mkdir -p /app
rm -f /tmp/Anaconda3-2019.10-Linux-x86_64.sh
wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
chmod 744 ./Anaconda3-2019.10-Linux-x86_64.sh
/tmp/Anaconda3-2019.10-Linux-x86_64.sh -b -p /app/anaconda3/
rm -f /tmp/Anaconda3-2019.10-Linux-x86_64.sh
/app/anaconda3/bin/conda install -y -c conda-forge jupyterhub
/app/anaconda3/bin/pip install jupyter-server-proxy
/app/anaconda3/bin/pip install ipykernel
cd /tmp
sudo sh -c 'mkdir -p /app'
sudo sh -c 'rm -f /tmp/Anaconda3-2019.10-Linux-x86_64.sh'
sudo sh -c 'wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh'
sudo sh -c 'chmod 744 ./Anaconda3-2019.10-Linux-x86_64.sh'
sudo sh -c '/tmp/Anaconda3-2019.10-Linux-x86_64.sh -b -p /app/anaconda3/'
sudo sh -c 'rm -f /tmp/Anaconda3-2019.10-Linux-x86_64.sh'
sudo sh -c '/app/anaconda3/bin/conda init'
sudo sh -c '/app/anaconda3/bin/conda create -y -n xxx python=3.6'
sudo sh -c 'aws configure set s3.signature_version s3v4'
#option 2, obtain token by username/password
import requests
# login_url is not defined in this preview; it is assumed to point at the hub's login endpoint
username = 'jupyter'
password = 'jupyter'
# step 1: login with username + password
r = requests.post(login_url, data={'username': username, 'password': password}, allow_redirects=False)
r.raise_for_status()
cookies = r.cookies
#-----------------------------------------------------------------------------------------------
#copy emr conf
#-----------------------------------------------------------------------------------------------
emr_ip=10.135.241.137
sudo rm -rf /etc/yum.repos.d/emr-*.repo
sudo rm -rf /var/aws/emr/repoPublicKey.txt
sudo mkdir -p /var/aws/emr/
sudo chmod +r -R /var/aws/
sudo rm -rf /etc/spark/
sudo rm -rf /etc/hadoop/
GIT_PROJECT = "xxxx"
PROJECT = "aaaaa"
USERNAME = "guilherme"
BRANCH = "develop"
SPARK_MODE = "local" # local or yarn
%run /home/jupyter/kedro_load.py $GIT_PROJECT $PROJECT $USERNAME $BRANCH $SPARK_MODE
######################################################################
def randomString(stringLength=10):
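The preview cuts off right after the signature, so the original body is unknown. A minimal sketch of what such a helper might look like, assuming it returns random lowercase ASCII letters (the alphabet and the use of the `random` module are guesses, not the author's code):

```python
import random
import string


def randomString(stringLength=10):
    """Return stringLength random lowercase ASCII letters.

    Assumption: lowercase letters only. The random module is not suitable
    for secrets; the secrets module would be the choice for tokens.
    """
    letters = string.ascii_lowercase
    return "".join(random.choice(letters) for _ in range(stringLength))
```

A call such as `randomString(8)` yields an eight-character name, which could serve as a throwaway suffix for temporary tables or directories.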
mkdir ~/spark
cd ~/spark
wget https://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-without-hadoop.tgz
wget https://archive.apache.org/dist/hadoop/core/hadoop-3.1.1/hadoop-3.1.1.tar.gz
tar xvf hadoop-3.1.1.tar.gz
tar xvf spark-2.4.3-bin-without-hadoop.tgz
cd ~
##################
import seaborn as sns
pal = sns.color_palette(n_colors=50)
pal.as_hex()