Martina Pugliese martinapugliese

# Imports
import pandas as pd
import numpy as np
from scipy.stats import entropy
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from matplotlib import pyplot as plt

A collection of little libraries that help with the workflow

TQDM

Shows a progress bar in a notebook cell.

from time import sleep
from tqdm import tqdm
for i in tqdm(range(10), 'wasting time', unit='iterations wasted'):
    sleep(0.5)

Pyplot reference stuff

Those things that I always forget how to do.

from matplotlib import pyplot as plt

Matplotlib styles

A collection of useful command line hacks (Unix)

Memory usage

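macOS

vm_stat is the command. It reports memory in pages, so the Perl one-liner below reads the page size from the header and converts each page count to MiB, which makes the output user friendly.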

vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages\s+([^:]+)[^\d]+(\d+)/ and printf("%-16s % 16.2f Mi\n", "$1:", $2 * $size / 1048576);'
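
For reference, the same conversion can be sketched in Python. This is only an illustration of what the one-liner does: the function name `vm_stat_to_mib` and the sample text are made up here, and the sample numbers are not real measurements.

```python
import re


def vm_stat_to_mib(text):
    """Convert raw vm_stat output into a {label: size in MiB} dict."""
    page_size = None
    usage = {}
    for line in text.splitlines():
        # The header line carries the page size in bytes.
        size_match = re.search(r'page size of (\d+)', line)
        if size_match:
            page_size = int(size_match.group(1))
        # Same pattern as the Perl version: label, then the page count.
        pages_match = re.search(r'Pages\s+([^:]+)[^\d]+(\d+)', line)
        if pages_match and page_size is not None:
            pages = int(pages_match.group(2))
            usage[pages_match.group(1)] = pages * page_size / 1048576
    return usage


# Made-up sample in the shape of vm_stat output (illustrative values only).
sample = (
    'Mach Virtual Memory Statistics: (page size of 4096 bytes)\n'
    'Pages free:                               25600.\n'
    'Pages active:                            512000.\n'
)

for label, mib in vm_stat_to_mib(sample).items():
    print('%-16s % 16.2f Mi' % (label + ':', mib))
```

On a Mac, the real input could be obtained with `subprocess.check_output(['vm_stat'], text=True)` and passed to the same function.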

Pandas reference things

df is a DataFrame.

Grouping df, aggregating with multiple functions and dropping the hierarchical column level

grouped_df = df.groupby(['colA', 'colB']) \
    .agg(
 {
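
A minimal, self-contained sketch of this pattern follows. The toy data, the column `colC` and the choice of `sum` and `mean` are assumptions made for illustration; they are not taken from the snippet above.

```python
import pandas as pd

# Toy DataFrame (hypothetical data, for illustration only).
df = pd.DataFrame({
    'colA': ['x', 'x', 'y', 'y'],
    'colB': ['p', 'p', 'q', 'q'],
    'colC': [1, 2, 3, 4],
})

# Aggregating one column with several functions yields MultiIndex columns:
# ('colC', 'sum') and ('colC', 'mean').
grouped_df = df.groupby(['colA', 'colB']).agg({'colC': ['sum', 'mean']})

# Drop the top level so only the function names remain as column names.
grouped_df.columns = grouped_df.columns.droplevel(0)

# Turn the group keys back into ordinary columns.
grouped_df = grouped_df.reset_index()
print(grouped_df)
```

Recent pandas versions also accept named aggregation, e.g. `.agg(total=('colC', 'sum'))`, which avoids creating the hierarchical level in the first place.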
martinapugliese / string_builtins.py
Created August 12, 2016 11:59
Collection of examples of Python built-in methods for manipulating strings
# Copyright (C) 2016 Martina Pugliese
def run_methods():
    print('\n')
    print('* Count occurrences of substring in string')
    print('Martina'.count('art'))
    print('Martina'.count('a'))
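
As a quick sketch of how `str.count` behaves beyond these two calls: it counts non-overlapping occurrences and accepts optional start and end positions. The variable names below are just for illustration.

```python
name = 'Martina'

count_art = name.count('art')     # 1
count_a = name.count('a')         # 2

# Matches do not overlap: 'aaaa' holds two separate 'aa', not three.
non_overlapping = 'aaaa'.count('aa')   # 2

# Optional start index: count only from position 2 onwards ('rtina').
from_index_2 = name.count('a', 2)      # 1

print(count_art, count_a, non_overlapping, from_index_2)
```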
martinapugliese / nltk_plotfreqs.py
Last active August 17, 2016 20:53
Plotting the frequencies in a FreqDist in NLTK instead of the counts.
# Copyright (C) 2016 Martina Pugliese
def plot_freqdist_freq(fd,
                       max_num=None,
                       cumulative=False,
                       title='Frequency plot',
                       linewidth=2):
    """
    As of NLTK version 3.2.1, FreqDist.plot() plots the counts and has no kwarg for normalising to frequencies. This function works around that.
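
The normalisation step the docstring refers to can be sketched with the standard library alone, since `FreqDist` is a subclass of `collections.Counter`. The helper name `counts_to_frequencies` is hypothetical, and dividing by the total over all samples (rather than only the plotted ones) is an assumption about the intended behaviour, not the gist's actual code.

```python
from collections import Counter


def counts_to_frequencies(counts, max_num=None):
    """Return (sample, relative frequency) pairs, most common first."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Assumption: normalise by the total over all samples,
    # even when only the top max_num are returned.
    return [(sample, count / total)
            for sample, count in counts.most_common(max_num)]


counts = Counter('abracadabra')   # stands in for an NLTK FreqDist
freqs = counts_to_frequencies(counts, max_num=2)
all_freqs = counts_to_frequencies(counts)
print(freqs)
```

The resulting pairs can then be passed to `plt.plot` as x tick labels and y values, with a running sum applied first if a cumulative plot is wanted.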
martinapugliese / printingclass.py
Created August 7, 2016 20:56
A class for styled printing (coloured/styled text, time of execution available), attributes combinable.
# Copyright (C) 2016 Martina Pugliese
# Imports
from datetime import datetime
# #################### ANSI Escape codes for terminal #########################
codes_dict = {
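
A minimal sketch of the idea behind styled printing with combinable attributes is given below. The names `ANSI_CODES` and `styled`, the selection of codes and the timestamp option are assumptions for illustration; they do not reproduce the class in this gist.

```python
from datetime import datetime

# A few standard ANSI SGR escape codes (hypothetical selection).
ANSI_CODES = {
    'bold': '\033[1m',
    'underline': '\033[4m',
    'red': '\033[31m',
    'green': '\033[32m',
    'reset': '\033[0m',
}


def styled(text, *attributes, with_time=False):
    """Wrap text in the given ANSI attributes, optionally with a timestamp."""
    prefix = ''.join(ANSI_CODES[name] for name in attributes)
    stamp = datetime.now().strftime('%H:%M:%S') + ' ' if with_time else ''
    # Always reset at the end so the style does not leak into later output.
    return prefix + stamp + text + ANSI_CODES['reset']


print(styled('done', 'bold', 'green'))
print(styled('failed', 'underline', 'red', with_time=True))
```

Attributes combine simply by concatenating their escape sequences before the text, which is why any number of them can be passed.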