Skip to content

Instantly share code, notes, and snippets.

View nateraw's full-sized avatar
💯

Nathan Raw nateraw

💯
View GitHub Profile
@simonw
simonw / gist:104413
Created April 30, 2009 11:19
Turn a BeautifulSoup form in to a dict of fields and default values - useful for screen scraping forms and then resubmitting them
def extract_form_fields(self, soup):
"Turn a BeautifulSoup form in to a dict of fields and default values"
fields = {}
for input in soup.findAll('input'):
# ignore submit/image with no name attribute
if input['type'] in ('submit', 'image') and not input.has_key('name'):
continue
# single element nome/value fields
if input['type'] in ('text', 'hidden', 'password', 'submit', 'image'):
@jiffyclub
jiffyclub / markdown_doc
Last active August 1, 2023 11:16
This script turns Markdown into HTML using the Python markdown library and wraps the result in a complete HTML document with default Bootstrap styling so that it's immediately printable. Requires the python libraries jinja2, markdown, and mdx_smartypants.
#!/usr/bin/env python
import argparse
import sys
import jinja2
import markdown
TEMPLATE = """<!DOCTYPE html>
<html>
@yrevar
yrevar / imagenet1000_clsidx_to_labels.txt
Last active June 30, 2024 06:36
text: imagenet 1000 class idx to human readable labels (Fox, E., & Guestrin, C. (n.d.). Coursera Machine Learning Specialization.)
{0: 'tench, Tinca tinca',
1: 'goldfish, Carassius auratus',
2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
3: 'tiger shark, Galeocerdo cuvieri',
4: 'hammerhead, hammerhead shark',
5: 'electric ray, crampfish, numbfish, torpedo',
6: 'stingray',
7: 'cock',
8: 'hen',
9: 'ostrich, Struthio camelus',
@hadware
hadware / bytes_to_wav.py
Last active April 26, 2024 10:50
Convert wav in bytes for to numpy ndarray, then back to bytes
from scipy.io.wavfile import read, write
import io
## This may look a bit intricate/useless, considering the fact that scipy's read() and write() function already return a
## numpy ndarray, but the BytesIO "hack" may be useful in case you get the wav not through a file, but trough some websocket or
## HTTP Post request. This should obviously work with any other sound format, as long as you have the proper decoding function
with open("input_wav.wav", "rb") as wavfile:
input_wav = wavfile.read()
@loretoparisi
loretoparisi / ffmpeg_frames.sh
Last active April 19, 2024 05:17
Extract all frames from a movie using ffmpeg
# Output a single frame from the video into an image file:
ffmpeg -i input.mov -ss 00:00:14.435 -vframes 1 out.png
# Output one image every second, named out1.png, out2.png, out3.png, etc.
# The %01d dictates that the ordinal number of each output image will be formatted using 1 digits.
ffmpeg -i input.mov -vf fps=1 out%d.png
# Output one image every minute, named out001.jpg, out002.jpg, out003.jpg, etc.
# The %02d dictates that the ordinal number of each output image will be formatted using 2 digits.
ffmpeg -i input.mov -vf fps=1/60 out%02d.jpg
@mkocabas
mkocabas / coco.sh
Created April 9, 2018 09:41
Download COCO dataset. Run under 'datasets' directory.
mkdir coco
cd coco
mkdir images
cd images
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/zips/test2017.zip
wget http://images.cocodataset.org/zips/unlabeled2017.zip
@ZijiaLewisLu
ZijiaLewisLu / Tricks to Speed Up Data Loading with PyTorch.md
Last active June 26, 2024 02:45
Tricks to Speed Up Data Loading with PyTorch

In most of deep learning projects, the training scripts always start with lines to load in data, which can easily take a handful minutes. Only after data ready can start testing my buggy code. It is so frustratingly often that I wait for ten minutes just to find I made a stupid typo, then I have to restart and wait for another ten minutes hoping no other typos are made.

In order to make my life easy, I devote lots of effort to reduce the overhead of I/O loading. Here I list some useful tricks I found and hope they also save you some time.

  1. use Numpy Memmap to load array and say goodbye to HDF5.

    I used to relay on HDF5 to read/write data, especially when loading only sub-part of all data. Yet that was before I realized how fast and charming Numpy Memmapfile is. In short, Memmapfile does not load in the whole array at open, and only later "lazily" load in the parts that are required for real operations.

Sometimes I may want to copy the full array to memory at once, as it makes later operations

@Mahedi-61
Mahedi-61 / cuda_11.8_installation_on_Ubuntu_22.04
Last active June 12, 2024 10:40
Instructions for CUDA v11.8 and cuDNN 8.9.7 installation on Ubuntu 22.04 for PyTorch 2.1.2
#!/bin/bash
### steps ####
# Verify the system has a cuda-capable gpu
# Download and install the nvidia cuda toolkit and cudnn
# Setup environmental variables
# Verify the installation
###
### to verify your gpu is cuda enable check
@GuruMulay
GuruMulay / video_to_frames_writer.py
Last active January 11, 2023 03:53
A script to read a video using PyAV and write its individual frames using PIL or skimage.
import av
import PIL
import skimage.io
from skimage.transform import resize, pyramid_reduce
import numpy as np
import os
import argparse
"""
@psa-jforestier
psa-jforestier / lambda_function.py
Created February 5, 2019 10:13
AWS Lambda function to gzip compress file when upload to S3 (will replace original file with gz version)
###
### This gist contains 2 files : settings.json and lambda_function.py
###
### settings.json
{
"extensions" : ["*.hdr", "*.glb", "*.wasm"]
}
### lambda_function.py