@pancodia
pancodia / bootstrap_jupyter.sh
Last active July 12, 2021 05:51 — forked from nicor88/bootstrap_jupyter.sh
Bootstrap action to install Conda and Jupyter on EMR
#!/usr/bin/env bash
set -x -e
JUPYTER_PASSWORD=${1:-"myJupyterPassword"}
NOTEBOOK_DIR=${2:-"s3://myS3Bucket/notebooks/"}
# home backup
if [ ! -d /mnt/home_backup ]; then
  sudo mkdir /mnt/home_backup
  sudo cp -a /home/* /mnt/home_backup
hass:account
hass:alert
hass:alert-circle
hass:altimeter
hass:apple-safari
hass:apps
hass:arrow-bottom-left
hass:arrow-down
hass:arrow-left
hass:arrow-right
pancodia / Liberal Regex Pattern for URLs
Created June 14, 2018 22:26 — forked from winzig/Liberal Regex Pattern for URLs
Updated @gruber's regex with a modified version that looks for 2-13 letters rather than trying to look for specific TLDs. Given the recent addition of ~1400 gTLDs, it may be time to give up on that front. (UPDATE 2018-05-15: Naked URLs without protocol prefix now capable of matching more advanced URLs. Also escaped / and " so it's easier to copy…
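The core idea in the description, accepting any 2-13 character TLD instead of a fixed list, can be sketched in Python. This is a deliberately simplified, hypothetical pattern and not the regex from the gist: the names `SIMPLE_URL` and `find_urls` are made up for illustration, and the sketch skips the gist's handling of nested parentheses, trailing punctuation, and email addresses.

```python
import re

# Hypothetical, simplified pattern (NOT the gist's regex): an optional
# http(s) scheme, dot-separated host labels, a 2-13 letter TLD, and an
# optional path that stops at whitespace, parentheses, or angle brackets.
SIMPLE_URL = re.compile(
    r"\b(?:https?://)?[a-z0-9]+(?:[.\-][a-z0-9]+)*\.[a-z]{2,13}\b(?:/[^\s()<>]*)?",
    re.IGNORECASE,
)


def find_urls(text):
    """Return every substring of `text` matched by the simplified pattern."""
    return [match.group(0) for match in SIMPLE_URL.finditer(text)]


sample = "See https://example.com/path and example.photography today"
print(find_urls(sample))
```

Because the TLD is matched by length rather than by name, newer gTLDs such as `.photography` match without any change to the pattern. The trade-off is more false positives, for example on file names like `notes.txt`.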
# Single-line version:
(?i)\b((?:https?:(?:\/{1,3}|[a-z0-9%])|[a-z0-9.\-]+\.(?:[a-z0-9]{2,13})\/)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’])|(?:(?<!@)[a-z0-9]+(?:[.\-][a-z0-9]+)*[.](?:[a-z0-9]{2,13})\b\/?(?!@)(?:[^\s()<>{}\[\]]+|\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\))+(?:\([^\s()]*?\([^\s()]+\)[^\s()]*?\)|\([^\s]+?\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’])))
# Commented multi-line version:
(?xi)
\b
( # Capture 1: entire matched URL
(?:
https?: # URL protocol and colon
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import Ridge, RidgeCV
# import seaborn as sns  # left commented out: plot markers failed to show when seaborn was imported
# Load data (Hitters dataset)
hitters_df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Hitters.csv')
hitters_clean_df = hitters_df.dropna()
hitters_clean_df = pd.get_dummies(hitters_clean_df, drop_first=True)
pancodia / sklearn_ridge_cv_issue.py
Created July 8, 2017 04:27
RidgeCV gives a different result from running Ridge with manually implemented CV
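A minimal sketch of this kind of comparison, on synthetic data rather than the gist's dataset. The helper `manual_cv_alpha` and all data here are assumptions made for illustration. One thing worth checking in such a comparison is that `RidgeCV` with the default `cv=None` uses an efficient leave-one-out scheme rather than K-fold, so it is only expected to agree with a manual K-fold loop when the same splitter is passed explicitly. Whether that explains the result reported in this gist is not confirmed by the source.

```python
import numpy as np
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import KFold

# Synthetic regression data (an assumption; the gist uses its own dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=100)

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]
kfold = KFold(n_splits=5, shuffle=True, random_state=0)


def manual_cv_alpha(X, y, alphas, cv):
    """Pick the alpha with the highest mean R^2 across the folds of `cv`."""
    mean_scores = []
    for alpha in alphas:
        fold_scores = []
        for train_idx, test_idx in cv.split(X):
            model = Ridge(alpha=alpha).fit(X[train_idx], y[train_idx])
            # Ridge.score returns R^2 on the held-out fold.
            fold_scores.append(model.score(X[test_idx], y[test_idx]))
        mean_scores.append(np.mean(fold_scores))
    return alphas[int(np.argmax(mean_scores))]


manual_alpha = manual_cv_alpha(X, y, alphas, kfold)

# Same splitter passed explicitly, so RidgeCV evaluates the same folds.
kfold_ridge_cv = RidgeCV(alphas=alphas, cv=kfold).fit(X, y)

# Default cv=None: efficient leave-one-out, which may select another alpha.
loo_ridge_cv = RidgeCV(alphas=alphas).fit(X, y)

print("manual K-fold alpha:", manual_alpha)
print("RidgeCV(cv=kfold) alpha:", kfold_ridge_cv.alpha_)
print("RidgeCV(cv=None) alpha:", loo_ridge_cv.alpha_)
```

Fixing `random_state` on the splitter matters here: a shuffled `KFold` without a seed produces different folds on each call, which alone can make the two approaches disagree.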
'''
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
In [1]: import sklearn
In [2]: print(sklearn.__version__)
0.18.1
'''
import pandas as pd
from sklearn.model_selection import KFold