Lê Anh Duy (duylebkHCM)

@rcanepa
rcanepa / gist:8a334d7c4f46df948c676aab489fe2c2
Last active June 28, 2024 12:39
Undo committed files (move them back to the staging area without discarding the changes)
- Reset the current branch to the parent of HEAD:
  git reset --soft HEAD~
- Unstage the unwanted files so they are left out of the commit:
  git reset HEAD path/to/unwanted_file
- Commit again, reusing the same commit message:
  git commit -c ORIG_HEAD
@stormwild
stormwild / conda-environments.md
Last active September 5, 2023 04:55
Creating a Python 2 environment in conda

Conda Environment

Create and activate an environment

conda create -n python2 python=2.7 anaconda
source activate python2

Deactivate the environment

source deactivate

(On newer conda versions, use conda activate / conda deactivate instead of the source commands.)
@thomwolf
thomwolf / gradient_accumulation.py
Last active January 16, 2024 02:38
PyTorch gradient accumulation training loop
model.zero_grad()                                   # Reset gradient tensors
for i, (inputs, labels) in enumerate(training_set):
    predictions = model(inputs)                     # Forward pass
    loss = loss_function(predictions, labels)       # Compute loss function
    loss = loss / accumulation_steps                # Normalize our loss (if averaged)
    loss.backward()                                 # Backward pass
    if (i+1) % accumulation_steps == 0:             # Wait for several backward steps
        optimizer.step()                            # Now we can do an optimizer step
        model.zero_grad()                           # Reset gradient tensors
        if (i+1) % evaluation_steps == 0:           # Evaluate the model when we...
            evaluate_model()                        # ...have no gradients accumulated (placeholder helper)
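A note on the normalization step above: when loss_function averages over the batch, dividing the loss by accumulation_steps makes the accumulated gradient match what a single batch of accumulation_steps × batch_size would produce; without it, the accumulated gradients would be accumulation_steps times too large.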
@GuiMarthe
GuiMarthe / pandas_caching_decorator.py
Last active November 15, 2023 19:10
This decorator caches a pandas.DataFrame-returning function. It saves the DataFrame as a parquet file in the cache_dir.
import pandas as pd
from pathlib import Path
from functools import wraps

def cache_pandas_result(cache_dir, hard_reset: bool):
    '''
    This decorator caches a pandas.DataFrame returning function.
    It saves the pandas.DataFrame in a parquet file in the cache_dir.
    It uses the following naming scheme for the caching files:
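The preview above is cut off inside the docstring, before the naming scheme and the decorator body. Purely as illustration, here is a minimal sketch of how such a decorator could be completed (reusing the imports shown above); the one-parquet-file-per-function-name naming scheme and the exact hard_reset behaviour are assumptions rather than the author's actual code, and writing parquet files requires pyarrow or fastparquet.

def cache_pandas_result(cache_dir, hard_reset: bool):
    '''Cache the DataFrame returned by the decorated function as a parquet file in cache_dir (sketch).'''
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            cache_path = Path(cache_dir) / f'{func.__name__}.parquet'  # assumed naming scheme
            if not hard_reset and cache_path.exists():
                return pd.read_parquet(cache_path)    # cache hit: load the stored result
            result = func(*args, **kwargs)            # cache miss: compute the DataFrame
            cache_path.parent.mkdir(parents=True, exist_ok=True)
            result.to_parquet(cache_path)             # store it for the next call
            return result
        return wrapper
    return decorator

# Usage:
# @cache_pandas_result('data/cache', hard_reset=False)
# def load_big_table() -> pd.DataFrame:
#     ...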
@duylebkHCM
duylebkHCM / book_splitter.py
Last active December 8, 2023 16:27
Automatically split ICDAR proceedings into separate papers
import re
import fitz
from fitz import Page
import argparse
import pandas as pd
from pathlib import Path
from collections import defaultdict
EXCLUDE_KEYWORD = [
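The preview stops at the EXCLUDE_KEYWORD list, before any of the splitting logic. As a rough illustration of the core step the title describes, here is a minimal sketch that writes one PDF per paper with PyMuPDF (fitz), assuming the paper boundaries have already been detected (for example from the table of contents); split_proceedings and the (title, first_page, last_page) format are illustrative, not the author's actual interface, and the imports shown above are reused.

def split_proceedings(src_pdf, papers, out_dir):
    '''Write one PDF per (title, first_page, last_page) entry in papers (0-based page numbers).'''
    doc = fitz.open(src_pdf)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for title, first, last in papers:
        safe_title = re.sub(r'[^\w\- ]+', '', title).strip()[:80]  # file-system-safe name
        part = fitz.open()                                   # new, empty PDF
        part.insert_pdf(doc, from_page=first, to_page=last)  # copy the paper's page range
        part.save(str(out / f'{safe_title}.pdf'))
        part.close()
    doc.close()

# Usage (hypothetical page ranges):
# split_proceedings('icdar_proceedings.pdf',
#                   [('Paper One', 0, 14), ('Paper Two', 15, 29)],
#                   'papers/')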