Michael Gao michaelgao8
  • Duke University
  • Durham, NC
@fny
fny / shrink_dataframe.py
Created December 13, 2019 00:12
Reduce Memory Usage of a Pandas Dataframe
import numpy as np
def shrink_df(df, categorize=False, verbose=False):
    """Reduces the memory use of a data frame by using more compact types.

    Args:
        df (pandas.DataFrame): The dataframe
        categorize (bool): Whether strings should be converted to categorical values.
            Note this may cause memory use to increase slightly.
        verbose (bool): Whether to print memory savings to stdout.
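
The preview above is cut off before the function body, so the gist's actual implementation is not shown here. As a rough illustration of the idea the docstring describes, the following is a minimal sketch that assumes numeric columns are downcast with `pandas.to_numeric` and string columns are optionally converted to the `category` dtype. The name `shrink_df_sketch` is hypothetical and is not the gist's code.

```python
import pandas as pd


def shrink_df_sketch(df, categorize=False, verbose=False):
    """Hypothetical sketch: return a copy of df using more compact dtypes."""
    before = df.memory_usage(deep=True).sum()
    out = df.copy()
    for col in out.columns:
        kind = out[col].dtype.kind
        if kind == "i":
            # Signed integers: pick the smallest signed type that holds all values.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif kind == "u":
            # Unsigned integers: pick the smallest unsigned type.
            out[col] = pd.to_numeric(out[col], downcast="unsigned")
        elif kind == "f":
            # Floats: float32 is the smallest type to_numeric will choose.
            out[col] = pd.to_numeric(out[col], downcast="float")
        elif kind == "O" and categorize:
            # Strings: categoricals help when values repeat often, but can
            # cost more memory when most values are unique.
            out[col] = out[col].astype("category")
    after = out.memory_usage(deep=True).sum()
    if verbose:
        print(f"memory: {before} -> {after} bytes")
    return out
```

Calling `shrink_df_sketch(df, categorize=True, verbose=True)` on a frame with small integers, floats, and repeated strings should report a lower byte count. The input frame is left unchanged because the sketch works on a copy.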
@HarshTrivedi
HarshTrivedi / pad_packed_demo.py
Last active May 11, 2024 19:28 — forked from Tushar-N/pad_packed_demo.py
Minimal tutorial on packing (pack_padded_sequence) and unpacking (pad_packed_sequence) sequences in pytorch.
import torch
from torch import LongTensor
from torch.nn import Embedding, LSTM
from torch.autograd import Variable
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence
## We want to run LSTM on a batch of 3 character sequences ['long_str', 'tiny', 'medium']
#
# Step 1: Construct Vocabulary
# Step 2: Load indexed data (list of instances, where each instance is list of character indices)
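
The preview stops after the second step. To illustrate the pack/unpack round trip the tutorial title refers to, here is a minimal sketch that assumes a small embedding and LSTM, with sizes chosen arbitrarily, and uses `enforce_sorted=False` so the batch does not need to be sorted by length first. The variable names are illustrative and are not taken from the gist.

```python
import torch
from torch.nn import Embedding, LSTM
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

seqs = ["long_str", "tiny", "medium"]

# Step 1: build a character vocabulary; index 0 is reserved for padding.
vocab = ["<pad>"] + sorted(set("".join(seqs)))
char_to_idx = {ch: i for i, ch in enumerate(vocab)}

# Step 2: convert each sequence to a list of character indices.
indexed = [[char_to_idx[ch] for ch in s] for s in seqs]

# Pad every sequence with zeros up to the longest length in the batch.
lengths = torch.tensor([len(s) for s in indexed])
padded = torch.zeros(len(indexed), int(lengths.max()), dtype=torch.long)
for i, s in enumerate(indexed):
    padded[i, : len(s)] = torch.tensor(s)

# Embedding and hidden sizes (4 and 5) are arbitrary choices for this sketch.
embed = Embedding(len(vocab), 4, padding_idx=0)
lstm = LSTM(input_size=4, hidden_size=5, batch_first=True)

# Pack so the LSTM skips the padded positions, then unpack the outputs.
packed = pack_padded_sequence(
    embed(padded), lengths, batch_first=True, enforce_sorted=False
)
packed_out, (h_n, c_n) = lstm(packed)
output, out_lengths = pad_packed_sequence(packed_out, batch_first=True)
```

After unpacking, `output` has shape `(batch, max_len, hidden_size)` and `out_lengths` holds the original sequence lengths in the original batch order. Positions past each sequence's length are filled with zeros.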