Useful Pandas Snippets
A personal diary of DataFrame munging over the years.
Data Types and Conversion
Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
javascript: Promise.all([import('https://unpkg.com/turndown@6.0.0?module'), import('https://unpkg.com/@tehshrike/readability@0.2.0'), ]).then(async ([{ | |
default: Turndown | |
}, { | |
default: Readability | |
}]) => { | |
/* Optional vault name */ | |
const vault = ""; | |
/* Optional folder name such as "Clippings/" */ |
A personal diary of DataFrame munging over the years.
Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
The dplyr
package in R makes data wrangling significantly easier.
The beauty of dplyr
is that, by design, the options available are limited.
Specifically, a set of key verbs form the core of the package.
Using these verbs you can solve a wide range of data problems effectively in a shorter timeframe.
Whilse transitioning to Python I have greatly missed the ease with which I can think through and solve problems using dplyr in R.
The purpose of this document is to demonstrate how to execute the key dplyr verbs when manipulating data using Python (with the pandas
package).
dplyr is organised around six key verbs:
torch.manual_seed(42) | |
x_tensor = torch.from_numpy(x).float() | |
y_tensor = torch.from_numpy(y).float() | |
# Builds dataset with ALL data | |
dataset = TensorDataset(x_tensor, y_tensor) | |
# Splits randomly into train and validation datasets | |
train_dataset, val_dataset = random_split(dataset, [80, 20]) |
Pretty print tables summarizing properties of tensor arrays in numpy, pytorch, jax, etc. | |
Now on pip! `pip install arrgh` https://github.com/nmwsharp/arrgh |
HTTP is a stateless protocol. Sessions allow us to chain multiple requests together into a conversation between client and server.
Sessions should be an option of last resort. If there's no where else that the data can possibly go to achieve the desired functionality, only then should it be stored in the session. Sessions can be vulnerable to security threats from third parties, malicious users, and can cause scaling problems.
That doesn't mean we can't use sessions, but we should only use them where necessary.
package main | |
import ( | |
"database/sql" | |
"gopkg.in/gorp.v1" | |
"log" | |
"strconv" | |
"github.com/gin-gonic/gin" | |
_ "github.com/go-sql-driver/mysql" |