Sid sidneyarcidiacono

  • Los Angeles, CA
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 21:01
Set optimizer and define train/test functions
# Set our optimizer (Adam)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Define our loss function
criterion = torch.nn.CrossEntropyLoss()
# Define our train function
def train():
    model.train()
    for data in train_loader:  # Iterate in batches over the training dataset.
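`torch.nn.CrossEntropyLoss` takes raw, unnormalized class scores (logits) together with integer class labels, and by default averages the per-example loss over the batch. The sketch below reproduces that computation in plain Python, purely for illustration. `cross_entropy` and the toy values are made up here and are not part of PyTorch or of the original gist.

```python
import math


def cross_entropy(logits, targets):
    """Mean negative log-likelihood of each target class under a softmax over its row of logits."""
    total = 0.0
    for row, target in zip(logits, targets):
        # Log-sum-exp with the row maximum subtracted for numerical stability.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[target]
    return total / len(targets)


# Toy batch: two graphs, two classes (PROTEINS is a binary classification task).
logits = [[2.0, 0.0], [0.0, 0.0]]
targets = [0, 1]
loss = cross_entropy(logits, targets)
```

A confident, correct prediction (first row) contributes a small loss, while an undecided one (second row) contributes log 2.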
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:59
Building our model with pytorch-geometric
# Import everything we need to build our network:
import torch
from torch.nn import Linear
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.nn import global_mean_pool
# Define our GCN class as a PyTorch Module
class GCN(torch.nn.Module):
    def __init__(self, hidden_channels):
        super(GCN, self).__init__()
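`global_mean_pool`, imported above, averages the node feature vectors of each graph in a batch, using a batch vector that maps every node to its graph. Each graph therefore ends up with one fixed-size embedding regardless of how many nodes it has. Below is a rough pure-Python sketch of the idea. The name `mean_pool` and the toy values are hypothetical, and this is not the library's implementation.

```python
def mean_pool(x, batch):
    """Average node feature rows per graph; batch[i] is the graph index of node i.

    Assumes graph indices are 0..G-1 and every graph has at least one node.
    """
    num_graphs = max(batch) + 1
    dim = len(x[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for features, g in zip(x, batch):
        counts[g] += 1
        for j, value in enumerate(features):
            sums[g][j] += value
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


# Three nodes with two features each: the first two belong to graph 0, the last to graph 1.
x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
batch = [0, 0, 1]
pooled = mean_pool(x, batch)
```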
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:57
Initialize loaders for batching with pytorch-geometric
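# Import DataLoader for batching
# (in PyTorch Geometric 2.0 and later, import it from torch_geometric.loader instead)
from torch_geometric.data import DataLoader
# Our DataLoader stacks the adjacency matrices of a batch block-diagonally (one large graph
# made of isolated subgraphs) and concatenates node features and targets in the node dimension.
# This allows differing numbers of nodes and edges over examples in one batch.
# (from the PyTorch Geometric docs)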
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)
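The block-diagonal batching described in the comments above can be illustrated with dense matrices in plain Python. This is only a sketch of the concept: PyTorch Geometric stores edges sparsely rather than as dense matrices, and `batch_graphs` is a hypothetical helper, not a library function.

```python
def batch_graphs(adjs):
    """Combine square adjacency matrices block-diagonally.

    Returns the combined matrix and a batch vector mapping each node to its graph.
    """
    total = sum(len(a) for a in adjs)
    big = [[0] * total for _ in range(total)]
    batch = []
    offset = 0
    for graph_idx, adj in enumerate(adjs):
        n = len(adj)
        for i in range(n):
            for j in range(n):
                big[offset + i][offset + j] = adj[i][j]
        batch.extend([graph_idx] * n)
        offset += n
    return big, batch


a = [[0, 1], [1, 0]]                   # 2-node graph with one edge
b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # 3-node path graph
big_adj, batch = batch_graphs([a, b])
```

Because the off-diagonal blocks stay zero, no messages pass between graphs in the same batch, which is why graphs of different sizes can share a batch.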
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:56
Train/test split with pytorch-geometric
# Now, we need to perform our train/test split.
# We create a seed, and then shuffle our data
torch.manual_seed(12345)
dataset = dataset.shuffle()
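# Once it's shuffled, we slice the data to split: the first 150 graphs become the test set,
# and training uses everything in between (the last 150 graphs end up in neither split here)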
train_dataset = dataset[150:-150]
test_dataset = dataset[0:150]
# Take a look at the training versus test graphs
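To see what the two slices select, the same indexing can be applied to a plain list of indices. The sketch assumes the dataset holds 1113 graphs (the size of PROTEINS), and the variable names are made up for illustration.

```python
num_graphs = 1113  # assumed dataset size (PROTEINS)
indices = list(range(num_graphs))

test_idx = indices[0:150]      # first 150 graphs
train_idx = indices[150:-150]  # everything between the first and last 150
unused_idx = indices[-150:]    # last 150 graphs, selected by neither slice

print(len(train_idx), len(test_idx), len(unused_idx))
```

Under that assumption, 813 graphs are used for training, 150 for testing, and 150 remain available, for example as a validation set.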
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:55
Checking out our data with pytorch-geometric
# Let's take a look at our data. We'll look at dataset (all data) and data (our first graph):
data = dataset[0] # Get the first graph object.
print()
print(f'Dataset: {dataset}:')
print('====================')
# How many graphs?
print(f'Number of graphs: {len(dataset)}')
# How many features?
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:54
Download dataset from pytorch-geometric
import torch
from torch_geometric.datasets import TUDataset
# Like Spektral, PyTorch Geometric provides us with the benchmark TUDataset collection
dataset = TUDataset(root='data/TUDataset', name='PROTEINS')
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:53
Installing packages for pytorch-geometric tutorial
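# Install required packages. The wheel index passed with -f must match the installed
# torch and CUDA versions (here torch 1.8.0 with CUDA 10.1).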
!pip install -q torch-scatter -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
!pip install -q torch-sparse -f https://pytorch-geometric.com/whl/torch-1.8.0+cu101.html
!pip install -q torch-geometric
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:53
Performing test validation with Spektral GCN
# And feed the test loader to our model by calling .load()
loss = model.evaluate(test_loader.load(), steps=test_loader.steps_per_epoch)
print('Test loss: {}'.format(loss))
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:52
Creating test loader for Spektral GCN
# To evaluate, let's instantiate another loader to test
test_loader = BatchLoader(data_test, batch_size=32)
sidneyarcidiacono / PROTEINS_embedding.py
Created May 17, 2021 20:51
Calling fit on our Spektral GCN
# Now we can train! We don't need to specify a batch size, since our loader already yields batches (it is basically a generator)
# But we do need to specify the steps_per_epoch parameter, so Keras knows where each epoch ends
model.fit(loader.load(), steps_per_epoch=loader.steps_per_epoch, epochs=10)
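Why `steps_per_epoch` is needed can be sketched with an ordinary Python generator. A generator that keeps yielding batches has no natural end, so the consumer has to be told how many batches make up one pass over the data. This is a simplified stand-in, not Spektral's loader: `endless_batches` and the toy data are hypothetical.

```python
import math
from itertools import islice


def endless_batches(items, batch_size):
    """Yield batches forever, restarting from the beginning after each full pass."""
    while True:
        for start in range(0, len(items), batch_size):
            yield items[start:start + batch_size]


items = list(range(70))
batch_size = 32
# One epoch is the number of batches needed to cover every item once.
steps_per_epoch = math.ceil(len(items) / batch_size)

# Take exactly one epoch's worth of batches from the endless generator.
one_epoch = list(islice(endless_batches(items, batch_size), steps_per_epoch))
```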