Bruno Görresen Mello bgmello

  • Military Engineering Institute (IME)
  • Rio de Janeiro
bgmello / dataLoader.py
Created May 29, 2020 21:29
Class to load data from multiple URLs
import os
import asyncio
import requests
from concurrent.futures import ThreadPoolExecutor
class DataLoader():
    def __init__(self, urls, fnames, data_dir, workers=200, verbose=True):
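The preview above stops at the constructor signature, so the rest of the class is not visible here. As a rough illustration of how a loader with that signature could fetch many URLs in parallel, here is a minimal sketch. It is not the author's implementation: `DataLoaderSketch`, `fetch_bytes`, `load`, and the extra `fetch` parameter are hypothetical names, and the sketch uses the standard library's `urllib.request` instead of `requests` so that it runs without third-party packages.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_bytes(url):
    """Default fetcher (an assumption): read the whole response body."""
    with urlopen(url) as response:
        return response.read()


class DataLoaderSketch:
    """Hypothetical stand-in for the gist's DataLoader, not the original code."""

    def __init__(self, urls, fnames, data_dir, workers=200, verbose=True,
                 fetch=fetch_bytes):
        if len(urls) != len(fnames):
            raise ValueError("urls and fnames must have the same length")
        self.urls = list(urls)
        self.fnames = list(fnames)
        self.data_dir = data_dir
        self.workers = workers
        self.verbose = verbose
        self.fetch = fetch  # injectable so the class can be exercised offline

    def _download_one(self, url, fname):
        # Fetch one URL and write its bytes to data_dir/fname.
        path = os.path.join(self.data_dir, fname)
        with open(path, "wb") as fh:
            fh.write(self.fetch(url))
        if self.verbose:
            print(f"saved {url} -> {path}")
        return path

    def load(self):
        """Download every URL concurrently; return saved paths in input order."""
        os.makedirs(self.data_dir, exist_ok=True)
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(self._download_one, self.urls, self.fnames))
```

Downloads are I/O-bound, which is why a thread pool helps here; `pool.map` also keeps results in the same order as the input lists, so each path lines up with its URL.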
bgmello / remove_output.py
Created May 4, 2020 17:06 — forked from damianavila/remove_output.py
Remove output from an IPython notebook from the command line (dev version 1.0)
"""
Usage: python remove_output.py notebook.ipynb [ > without_output.ipynb ]
Modified from remove_output by Minrk
"""
import sys
import io
import os
from IPython.nbformat.current import read, write
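The preview above relies on `IPython.nbformat.current`, which belongs to older IPython releases. As an alternative illustration of the same idea, here is a minimal sketch that strips outputs using only the standard `json` module. It assumes the notebook is stored in nbformat 4 layout (a top-level `cells` list); `strip_outputs` and `main` are hypothetical names, not functions from the gist.

```python
import json
import sys


def strip_outputs(notebook):
    """Clear outputs and execution counts of code cells, in place.

    Assumes an nbformat-4 style dict with a top-level "cells" list.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def main(argv):
    # Mirrors the usage line of the gist: read the notebook named on the
    # command line and write the cleaned JSON to stdout.
    with open(argv[1], encoding="utf-8") as fh:
        notebook = json.load(fh)
    json.dump(strip_outputs(notebook), sys.stdout, indent=1, ensure_ascii=False)
    sys.stdout.write("\n")
```

Calling `main(sys.argv)` from a script entry point would reproduce the `python remove_output.py notebook.ipynb > without_output.ipynb` workflow. Markdown and raw cells pass through unchanged.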
bgmello / fingerprint.py
Last active March 27, 2020 14:21 — forked from cjdd3b/fingerprint.py
Python implementation of the Google Refine (now OpenRefine) fingerprinting algorithms described here: https://github.com/OpenRefine/OpenRefine/wiki/Clustering-In-Depth
# -*- coding: utf-8 -*-
import re, string
import unicodedata, html
PUNCTUATION = re.compile('[%s]' % re.escape(string.punctuation))
class Fingerprinter(object):
    '''
    Python implementation of Google Refine fingerprinting algorithm described here:
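The preview is cut off before any method of the class appears. For orientation, here is a minimal sketch of the key-collision fingerprint as I understand it from the OpenRefine clustering documentation: trim, lowercase, remove punctuation, fold to ASCII, then split into tokens, deduplicate, sort, and rejoin. The standalone `fingerprint` function is a hypothetical name and not a method of the gist's `Fingerprinter` class; the sketch also skips the control-character removal that the documentation mentions.

```python
import re
import string
import unicodedata

PUNCTUATION = re.compile("[%s]" % re.escape(string.punctuation))


def fingerprint(text):
    """Key-collision fingerprint: normalize, tokenize, dedupe, sort, rejoin."""
    text = text.strip().lower()
    text = PUNCTUATION.sub("", text)
    # Decompose accented characters, then drop anything outside ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Splitting on whitespace, deduplicating, and sorting makes the key
    # independent of word order and repeated words.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


print(fingerprint("Tom Cruise"))   # cruise tom
print(fingerprint("Cruise, Tom"))  # cruise tom
```

Strings that produce the same key, such as the two names above, fall into the same cluster and can be reviewed as likely duplicates.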