Kevin ROBERT krokrob

🥞
Data Warehousing

Apple Silicon x TensorFlow

Config

Open a Terminal window.

Open your zsh config file:

code ~/.zshrc
REQUIRED=('pytest' 'pylint' 'ipdb' 'PyYAML' 'nbresult' 'autopep8' 'flake8' 'yapf' 'lxml' 'requests' 'bs4' 'jupyterlab' 'pandas' 'matplotlib' 'seaborn' 'plotly' 'scikit-learn' 'tensorflow' 'nbconvert' 'xgboost' 'statsmodels' 'pandas-profiling' 'jupyter-resource-usage' 'dtale')
PACKAGES=$(pip freeze)
PACKS=()
while read -r line; do
  PACKS+=("$line")
done <<< "$PACKAGES"
missing=()
for r in "${REQUIRED[@]}"; do
  present=0
  for p in "${PACKS[@]}"; do
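
The preview above is cut off before the comparison step. As a rough illustration of the same idea (listing which required packages are not installed), here is a minimal Python sketch using the standard library's importlib.metadata. The function name find_missing and the example package names are hypothetical and not part of the original gist.

```python
from importlib import metadata


def find_missing(required):
    """Return the names in `required` that have no installed distribution."""
    missing = []
    for name in required:
        try:
            # Raises PackageNotFoundError when the distribution is absent
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical example list; the second name is deliberately made up
required = ["pandas", "surely-not-a-real-package-xyz"]
missing = find_missing(required)
print("Missing packages:", missing)
```

Unlike parsing the output of pip freeze, this looks each distribution up by name, so it does not depend on the exact version strings pip prints.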

Manual K-fold Cross Validation without data leakage

Consider a dataset, data, that is already one-hot encoded:

n_split = 5
len_split = data.shape[0] // n_split

# Select only numerical values for this example
data_num = data.select_dtypes(exclude=['object'])
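
The preview stops before the folds are built. The sketch below shows one way the loop could continue, under stated assumptions: the synthetic data_num, the target column name, and the choice of StandardScaler with LinearRegression are all hypothetical stand-ins, not taken from the original gist. The point it illustrates is that the scaler is fitted on the training fold only, so no statistics from the validation fold leak into training.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the numerical dataset; "target" is an assumed label column
rng = np.random.default_rng(0)
data_num = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"])
data_num["target"] = 2 * data_num["a"] - data_num["b"] + rng.normal(scale=0.1, size=100)

n_split = 5
len_split = data_num.shape[0] // n_split

scores = []
for i in range(n_split):
    start = i * len_split
    # The last fold absorbs any leftover rows
    stop = (i + 1) * len_split if i < n_split - 1 else data_num.shape[0]

    val = data_num.iloc[start:stop]
    train = pd.concat([data_num.iloc[:start], data_num.iloc[stop:]])

    X_train, y_train = train.drop(columns="target"), train["target"]
    X_val, y_val = val.drop(columns="target"), val["target"]

    # Fit the scaler on the training fold only, then reuse it on the validation fold
    scaler = StandardScaler().fit(X_train)
    model = LinearRegression().fit(scaler.transform(X_train), y_train)
    scores.append(model.score(scaler.transform(X_val), y_val))

mean_score = float(np.mean(scores))
print(f"Mean R2 over {n_split} folds: {mean_score:.3f}")
```

Fitting the scaler on the full dataset before splitting would be the leaky variant: the validation rows would influence the mean and standard deviation used to transform the training rows.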
print('Loading pandas...')
import pandas as pd
df = pd.DataFrame({'pandas':['OK']})
df.shape
print('✅ pandas OK')
print('Loading Scikit-learn...')
from sklearn.decomposition import PCA
pca = PCA()
print('✅ Scikit-learn OK')
print('Loading TensorFlow...')
REQUIRED=('pytest' 'pylint' 'ipdb' 'PyYAML' 'nbresult' 'autopep8' 'flake8' 'yapf' 'lxml' 'requests' 'bs4' 'jupyterlab' 'pandas' 'matplotlib' 'seaborn' 'plotly' 'scikit-learn' 'tensorflow' 'nbconvert' 'xgboost' 'statsmodels' 'pandas-profiling' 'jupyter-resource-usage' 'dtale')
PACKAGES=$(pip freeze)
PACKS=()
while read -r line; do
  PACKS+=("$line")
done <<< "$PACKAGES"
missing=()
arch_name="$(uname -m)"
if [ "${arch_name}" = "x86_64" ]; then
  if [ "$(sysctl -in sysctl.proc_translated)" = "1" ]; then
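
The shell check above is truncated. For reference, the same test (is this process running under Rosetta translation?) can be sketched in Python. The function name is_rosetta_translated is hypothetical; the sysctl key is the one used in the shell snippet, and the function simply returns False on non-macOS systems.

```python
import platform
import subprocess


def is_rosetta_translated():
    """Return True when the current process runs under Rosetta on macOS."""
    if platform.system() != "Darwin":
        return False
    try:
        result = subprocess.run(
            ["sysctl", "-in", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # sysctl is not on the PATH
        return False
    # "1" means translated; "0" or empty output means native or unknown
    return result.stdout.strip() == "1"


print("Machine:", platform.machine())
print("Rosetta:", is_rosetta_translated())
```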
import os
import pandas as pd
class Olist:
    def get_data(self):
        """
        This function returns a Python dict.
        Its keys should be 'sellers', 'orders', 'order_items' etc...
from time import sleep
class Trainer:
    def run(self):
        print("Training launched.")
        print("Training running...")
        sleep(2)
        print("Training finished.")
if __name__ == '__main__':
krokrob / minimal_requirements.txt
Last active November 21, 2022 01:43
Le Wagon Data Science Bootcamp minimal requirements to start a fresh virtualenv.
pytest
pylint
ipdb
jupyterlab
numpy
pandas
matplotlib
seaborn
scikit-learn
packgenlite @ git+https://github.com/krokrob/packgenlite.git@master

How to install hub?

macOS 🍏

brew install hub

Then restart your terminal.

Windows Git Bash 🖼

import os
import pandas as pd
class Olist:
    def get_data(self):
        """
        01-01 > This function returns all Olist datasets
        as DataFrames within a Python dict.