João Carabetta JoaoCarabetta

JoaoCarabetta / multiprocessing_multivariable.py
Last active June 1, 2020 23:26
Ready to go parallel process implementation for functions with more than one argument
from multiprocessing.pool import Pool
from functools import partial
import time
n_processes = 3
def func(extra, t):
    time.sleep(t + extra)
    print('t', t, 'extra', extra)
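The preview above stops before the pool is used. A minimal sketch of how a two-argument function can be mapped over a pool, assuming the fixed argument is bound with `functools.partial`; the names `add_offset` and `run_parallel` are hypothetical and not taken from the gist:

```python
from functools import partial
from multiprocessing.pool import Pool


def add_offset(extra, t):
    # Hypothetical worker: 'extra' is the fixed argument, 't' varies per task.
    return t + extra


def run_parallel(values, extra, n_processes=3):
    # Bind the fixed argument so the pool only has to supply 't'.
    worker = partial(add_offset, extra)
    with Pool(n_processes) as pool:
        # Pool.map returns results in the same order as 'values'.
        return pool.map(worker, values)
```

On platforms that spawn worker processes, call `run_parallel([1, 2, 3], extra=10)` from inside an `if __name__ == "__main__":` guard.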
@JoaoCarabetta
JoaoCarabetta / list_in_chuncks.py
Created March 26, 2020 19:05
Break a list into chunks
break_list_in_chuncks = lambda data, chunck: [data[x:x+chunck] for x in range(0, len(data), chunck)]
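The same slicing idea written as a named function, as a sketch; the name `break_list_in_chunks` and the size check are additions for illustration:

```python
def break_list_in_chunks(data, chunk_size):
    # Guard against a zero or negative step, which range() would reject or loop on.
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Slice the list every 'chunk_size' items; the last chunk may be shorter.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


chunks = break_list_in_chunks([1, 2, 3, 4, 5], 2)
print(chunks)  # [[1, 2], [3, 4], [5]]
```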
@JoaoCarabetta
JoaoCarabetta / safely_access_key_from_nested_dict.py
Created February 11, 2020 20:36
Safely access key from nested dict
# Recursive version
def accessr(d, keys, default=None):
    if len(keys) and d is not None:
        return accessr(d.get(keys[0], default), keys[1:], default)
    else:
        return d

# Iterative version
def access(d, keys, default=None):
    for k in keys:
        if d is not None:
            d = d.get(k, default)
    return d
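Both versions call `.get` on whatever the previous lookup returned, so they assume every intermediate value is a dict or `None`. A sketch of a stricter variant that returns the default as soon as the path breaks; the name `safe_get` is hypothetical:

```python
def safe_get(d, keys, default=None):
    # Walk the key path; stop with 'default' if a level is missing
    # or the current value is not a dict.
    for key in keys:
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d


config = {"a": {"b": {"c": 1}}}
print(safe_get(config, ["a", "b", "c"]))     # 1
print(safe_get(config, ["a", "x", "c"], 0))  # 0
```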
@JoaoCarabetta
JoaoCarabetta / safe_create_path.py
Created February 6, 2020 17:17
Safely create a path in python
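import os
import shutil

def safe_create_path(path, replace=False):
    try:
        if replace:
            # rmtree only removes directories, so check isdir rather than isfile
            if os.path.isdir(path):
                shutil.rmtree(path)
        os.makedirs(path)
    except OSError:
        # The path already exists or cannot be created; leave it as is
        pass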
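An alternative sketch with `pathlib`, assuming the goal is "make sure this directory exists, optionally starting empty"; the name `ensure_dir` is hypothetical. It relies on `exist_ok=True` instead of catching exceptions, so real failures such as permission errors still surface:

```python
import shutil
from pathlib import Path


def ensure_dir(path, replace=False):
    path = Path(path)
    # Optionally wipe an existing directory tree before recreating it.
    if replace and path.is_dir():
        shutil.rmtree(path)
    # parents=True creates intermediate directories; exist_ok=True makes
    # the call a no-op when the directory is already there.
    path.mkdir(parents=True, exist_ok=True)
    return path
```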
@JoaoCarabetta
JoaoCarabetta / request2pdf2string.py
Created February 2, 2020 21:45
From request to pdf to string
import requests
# Download
res = requests.get('https://www.camara.leg.br/proposicoesWeb/prop_mostrarintegra?codteor=938381&filename=PL+2699/2011')
# To PDF
with open('metadata.pdf', 'wb') as f:
    f.write(res.content)
# To string
@JoaoCarabetta
JoaoCarabetta / cufflinks_template.py
Last active January 10, 2020 21:03
Cufflinks Templates

How to use

This class is supposed to be used inside a Jupyter notebook with Keplergl activated.

It just makes it easier to organize maps and save configuration files.

The advantage is that once the config is saved, any map with the same identifier_string will load with that saved config.
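The save-and-reload behaviour described above can be sketched with the standard library alone. This is not the gist's class: the functions `save_config` and `load_config`, and the one-JSON-file-per-identifier layout, are assumptions for illustration.

```python
import json
from pathlib import Path


def save_config(identifier_string, config, directory="."):
    # Assumed layout: one JSON file per map identifier.
    path = Path(directory) / f"{identifier_string}.json"
    path.write_text(json.dumps(config, indent=2))
    return path


def load_config(identifier_string, directory=".", default=None):
    # Return the saved config for this identifier, or 'default' if none exists yet.
    path = Path(directory) / f"{identifier_string}.json"
    if path.exists():
        return json.loads(path.read_text())
    return default
```

A map wrapper could call `load_config` on creation and `save_config` whenever the user stores the current layout, so the identifier alone is enough to restore the view.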

Initialize Map

@JoaoCarabetta
JoaoCarabetta / strict_polygon_overlay_over_linestring.py
Last active November 25, 2019 15:19
Linestring Polygon Overlay with Geopandas
def line_polygon_intersection(line_df, poly_df):
    column_geom_poly = poly_df._geometry_column_name
    column_geom_line = line_df._geometry_column_name
    spatial_index = line_df.sindex
    bbox = poly_df.geometry.apply(lambda x: x.bounds)
    sidx = bbox.apply(lambda x: list(spatial_index.intersection(x)))
    nei = []
@JoaoCarabetta
JoaoCarabetta / plotly_cufflinks_subplot_uniquelegend.py
Created October 30, 2019 20:04
Subplot with cufflinks with unique legend
@JoaoCarabetta
JoaoCarabetta / stratified_train_test_split.sql
Last active October 23, 2019 20:36
Split a dataset into stratified train and test sets using SQL
/* Splits a dataset into stratified training and test sets.
   It guarantees that each group has the minimum size to be split.
*/
with ssize as (
    select
        group
    from to_split_table
    group by group
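The query preview is cut off at this point. For comparison, the idea of a stratified split with a minimum group size can be sketched in plain Python. This is an illustration, not a translation of the gist's SQL: the function name, the `min_group_size` parameter, and the choice to keep undersized groups entirely in the training set are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, test_fraction=0.2, min_group_size=2, seed=0):
    rng = random.Random(seed)
    # Bucket rows by their group value.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    train, test = [], []
    for name in sorted(groups):
        members = list(groups[name])
        if len(members) < min_group_size:
            # Assumption: groups too small to split stay in the training set.
            train.extend(members)
            continue
        rng.shuffle(members)
        # Take at least one test row, but always leave at least one for training.
        n_test = max(1, round(len(members) * test_fraction))
        n_test = min(n_test, len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test
```

Each group contributes roughly `test_fraction` of its rows to the test set, so the group proportions in train and test stay close to those of the full dataset.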