@jxmorris12
Created May 24, 2023 19:45
Python
map a function to a list: map(f, list), NOT the other way around (function first, then the list)
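A tiny sketch of the argument order, in case it helps. The names `square` and `numbers` are made up for illustration:

```python
def square(x):
    return x * x

numbers = [1, 2, 3, 4]

# Function first, iterable second. map() is lazy in Python 3,
# so wrap it in list() to actually see the results.
squares = list(map(square, numbers))
print(squares)  # [1, 4, 9, 16]

# Swapping the arguments fails, because a function is not iterable.
try:
    map(numbers, square)
except TypeError as err:
    print("wrong order:", err)
```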
set a breakpoint: import pdb; pdb.set_trace()
-> ACTUALLY, starting in Python 3.7 you can just call breakpoint()!
best way to profile any python code: pip install pyinstrument; python -m pyinstrument ./myprog.py
run a pytest test by pattern: pytest -k <pattern>
will run all tests that match the pattern
grammar-check text:
import language_tool_python
tool = language_tool_python.LanguageTool('en-US') # use a local server (automatically set up), language English
text = … # some text to check
matches = tool.check(text) # to get errors
fixed_text = tool.correct(text)
GIFs
optimize a GIF with lots of blank space, like a screen cast:
gifsicle -O3 --no-extensions --colors=64 input.gif -o input_optimized.gif
Google Docs / Google Sheets
Make any table look nice in google sheets
Format > Alternating colors # just pick a nice color scheme that you like
Latex
easy way to make tables: make a pandas df then do print(df.to_latex())
newline: \\
unbreakable space: ~
fancy quotes: ``like this'' not "like this"!
fix bbl file undefined citation error / set up project for posting on arxiv:
(1) rm *.aux *.bbl
(2) pdflatex <file>.tex
(3) bibtex <file>
(4) pdflatex <file>.tex
(5) pdflatex <file>.tex # yes, you have to do it twice
compile a latex manuscript and make sure there are no missing references/citations/label mismatches/etc.
pdflatex emnlp2020.tex | grep undefined
Miscellaneous
Run a VSCode server from anywhere — and code!
curl -fsSL https://code-server.dev/install.sh | sh # install
code-server --link # to run VS Code server
search all subfiles matching a pattern for another pattern:
for FILE in $(find . -name "*.csv"); do echo "Searching file $FILE"; grep "WorkerId" $FILE; done
# Used this specific command to find all CSVs in a repository and make sure none of them contained the sensitive field ‘WorkerId’ (from Mechanical Turk)
revert some specific person’s commits
git log --author somedummy --oneline
git revert commithash
see git log in one line
git log --oneline
undo your last commit
git reset --soft HEAD~1
reset every file to last commit
git reset --hard
print a cool tree view of git history
git log --graph --all --oneline
how to see who’s running a process
# good way to figure out who’s using the gpus!
ps -o user -p <process_id>
how to copy a file between any two servers
# upload
curl --upload-file ./config.yml https://transfer.sh/
https://transfer.sh/66nb8/config.yml
# download
wget https://transfer.sh/66nb8/config.yml
fix bad state for a site in Google Chrome
navigate to chrome://settings/siteData
find site
fix out-of-sync Django database (when models update but database doesn’t)
python manage.py makemigrations
python manage.py migrate --run-syncdb
Show full row in Pandas:
pd.set_option('display.max_colwidth', None) # shows full-length strings, etc (older pandas versions used -1 here)
# undo
pd.reset_option('display.max_colwidth')
Tensorflow
see if tensorflow2 can find gpus
import tensorflow as tf; tf.config.experimental.list_physical_devices('GPU')
Pytorch
add a fake batch dimension
If you have a single sample, just use input.unsqueeze(0)
check GPU usage
nvidia-smi
check cuda GPU integration:
python_command="import torch; print('Torch version:', torch.__version__); print('CUDA is available:', torch.cuda.is_available());"
echo $python_command | python
debug CUDA errors synchronously
CUDA_LAUNCH_BLOCKING=1 python my_program.py
# this is nice because CUDA typically overlaps computation with transferring variables on or off the GPU. so
# when an error occurs, the stack trace might not appear in the right place
Bash/admin stuff
extract a .tar.gz and print a progress dot every 100 records
tar -xzf <somefile.tar.gz> --checkpoint=.100
re-run a command you recently ran without typing the full thing out
<Ctrl-R> <type part of that command>
on Mac: open a folder in Finder, or a file using its default program
open . # to open cwd in Finder
open <folder_name> # to open folder in finder
open README.md # to open file in its default program
kill process running on a certain port (Linux):
kill $(lsof -ti:3000)
better way to kill process on a port (Linux; requires npm >= 5.2.0)
npx kill-port 6379
convert a folder full of .mp3 files to .wav:
for f in *.mp3; do ffmpeg -i "$f" "${f%.mp3}.wav"; done
5 most recent files:
ls -1t | head -5
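If you need the same thing from inside a Python script, here is a rough stdlib equivalent. The function name `most_recent_files` is made up, and it assumes you want to skip dotfiles the way plain `ls` does:

```python
from pathlib import Path

def most_recent_files(directory=".", n=5):
    """Return the names of the n most recently modified entries, newest first."""
    # Skip dotfiles to mirror plain `ls` (an assumption; drop the filter to keep them).
    entries = [p for p in Path(directory).iterdir() if not p.name.startswith(".")]
    entries.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in entries[:n]]

for name in most_recent_files():
    print(name)
```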
Easy way to clean your home folder
rm -rf ~/.cache (this includes the pip cache) and conda clean --all
find big files and delete them
brew install ncdu (or sudo apt-get install)
ncdu / # browse everything under / by the size of each folder’s contents; you can sort folders, enter them, or delete files
list <n> largest folders in a directory
# usage
du -a /my/directory | sort -n -r | head -n <n>
# example
du -a /home/jxm | sort -n -r | head -n 15
diff the output of two commands
diff <(ls old) <(ls new)
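Process substitution is a bash/zsh feature, so here is a hedged Python sketch of the same idea for when you are not in one of those shells. `diff_command_outputs` is a made-up helper, and the two demo commands are placeholders that just print a few lines:

```python
import difflib
import subprocess
import sys

def diff_command_outputs(cmd_a, cmd_b):
    """Run two commands and return a unified diff of their stdout, line by line."""
    out_a = subprocess.run(cmd_a, capture_output=True, text=True).stdout.splitlines()
    out_b = subprocess.run(cmd_b, capture_output=True, text=True).stdout.splitlines()
    return list(difflib.unified_diff(
        out_a, out_b,
        fromfile=" ".join(cmd_a), tofile=" ".join(cmd_b),
        lineterm="",
    ))

# Placeholder commands; swap in e.g. ["ls", "old"] and ["ls", "new"].
old = [sys.executable, "-c", "print('a'); print('b')"]
new = [sys.executable, "-c", "print('a'); print('c')"]
for line in diff_command_outputs(old, new):
    print(line)
```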
how to rename screen session:
C-a :sessionname mySessionName
How to download from google drive on linux
pip install gdown
gdown --id FILEID -O FILENAME
[FILEID is the hash in a google drive URL]
Grep and print out files that match
grep 'pattern' file /dev/null
install nodejs in conda
conda install -c conda-forge nodejs
run command and redirect output to file without buffering:
# (without buffering you can see outputs in the file as they happen, instead of
# waiting for the buffer to fill or the program to terminate, though there is an overhead
# from writing to the file each time. -oL makes stdout line-buffered.)
stdbuf -oL command > output
shuffle lines of multiple files into a single file:
cat file1.txt file2.txt | sort -R > out.txt
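One caveat: GNU `sort -R` sorts by a random hash of each line, so identical lines end up next to each other. If that matters, a uniform shuffle in Python is a possible alternative. This is only a sketch; `shuffle_lines` and its `seed` parameter are invented for illustration:

```python
import random

def shuffle_lines(input_paths, output_path, seed=None):
    """Read every line from input_paths, shuffle uniformly, write to output_path."""
    lines = []
    for path in input_paths:
        with open(path) as f:
            lines.extend(f.read().splitlines())
    # A dedicated Random instance keeps the shuffle reproducible when seed is set.
    random.Random(seed).shuffle(lines)
    with open(output_path, "w") as f:
        for line in lines:
            f.write(line + "\n")
    return lines
```

Usage would look like `shuffle_lines(["file1.txt", "file2.txt"], "out.txt")`.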
play audio from command line [MacOS]
afplay <sound_file>
fix sound when it randomly stops working [Mac OS Catalina]
sudo killall coreaudiod
[thanks to https://apple.stackexchange.com/questions/52939/restarting-speakers-without-rebooting]
SSH escape commands (on Mac):
type ~? in SSH session for help
~. quit SSH session # especially helps if ur SSH session hangs bc broken internet connection
~^Z (~ then ctrl-Z) to suspend connection [can resume with jobs; fg %jobid]
suspend running jobs: ctrl-Z
see running jobs with `jobs`
then run in background with bg
or bring it back to the foreground with fg
see which process has a port open:
netstat -nlp | grep <portnum>
# helpful when I was accidentally running a process on someone else’s server
# and i needed to shut it off but couldn’t figure out where it was running from
print out the path to a file:
# I used to cd to a folder then run pwd, copy that, and type filename after…
# then i found realpath
realpath /path/to/dir/* # or realpath /path/to/one/file
kill a running process by name or keyword:
killall -9 <processname>
# or, the two-line format, for clarity
ps | grep <processname>
kill <pid>
kill all processes by a grep
# to see processes by grep
ps | grep my_string
# to kill them all (includes SSH session!)
pkill -f my_string
kill a process after a timeout
timeout <time> python program.py # or whatever command to timeout
# especially useful when your program hangs and only prints useful output in the first few ms
# for example, i’m running `timeout 0.5 make qemu` to debug my homework
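If the thing you are launching is already driven from Python, `subprocess.run` has a `timeout` argument that does roughly the same job. A minimal sketch; `run_with_timeout` is a made-up wrapper and the sleeping child process is just a stand-in for a program that hangs:

```python
import subprocess
import sys

def run_with_timeout(cmd, seconds):
    """Run cmd; return (returncode, stdout), with returncode None on timeout."""
    try:
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=seconds)
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # The child is killed on timeout; exc.stdout holds whatever was captured
        # before that (it can be None or bytes, so normalize it to str).
        partial = exc.stdout or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return None, partial

# Stand-in for a hanging program: sleeps far longer than the timeout.
code, out = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 0.5)
print("timed out" if code is None else f"exit code {code}")
```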
name a file after the current directory (without the full path)
vim ${PWD##*/}.cpp # example: C++ file named after directory
fix homebrew permissions errors:
sudo chown -R $(whoami) $(brew --prefix)/*
Reverse output (by lines)
use `tac` via pipe
# `history | tac | less`
# view your bash commands, *backwards*, in less
Kill all processes that match a string
kill $(ps aux | grep 'LanguageTool' | awk '{print $2}')
# this kills all my processes that have ‘LanguageTool’ somewhere in their info (file path, process name, etc)
# be careful with this! :)
print the n-th column of output w `awk`
command-that-prints-columns best | awk '{print $4}' # where 4 is the column #
# note that if different rows have different # printed columns, this can be very confusing
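For the ragged-rows problem, a small Python sketch that skips rows too short to have the column at all (awk would print an empty line for them instead). `nth_column` and the sample text are made up for illustration:

```python
def nth_column(text, n):
    """Return the n-th whitespace-separated field (1-indexed) of each line."""
    values = []
    for line in text.splitlines():
        fields = line.split()
        # Rows with fewer than n fields are skipped rather than printed as blanks.
        if len(fields) >= n:
            values.append(fields[n - 1])
    return values

sample = "alice 10 x y\nbob 20\ncarol 30 p q"
print(nth_column(sample, 4))  # ['y', 'q']
```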
Kill all running processes that I might have forgotten about:
for pid in $(ps -aux | grep python | awk '{print $2}'); do kill $pid; done
# replace ‘grep python’ with desired filter
notes on bash history
The shell keeps an ordered list of all the commands that you have entered. Each command is given a number according to the order it was entered.
% history (show command history list)
In bash (as in the C shell, where this syntax comes from), you can use the exclamation character (!) to recall commands easily.
% !! (recall last command)
% !-3 (recall third most recent command)
% !5 (recall 5th command in list)
% !grep (recall last command starting with grep)
zsh doesn’t store bash history by default, so you need to run this when you configure a shell for zsh