Kartik sachdevkartik

| Architecture | Model_I | Model_II | Model_III |
| --- | --- | --- | --- |
| Convolutional Vision Transformer | 91.54% | 99.41% | 99.04% |
| CrossFormer | 91.20% | 97.42% | 98.13% |
| LeViT | 91.64% | 97.12% | 97.97% |
| TwinsSVT | 91.08% | 97.44% | 98.48% |
| CCT | 89.71% | 69.68% | 99.48% |
| CrossViT | 84.20% | 91.33% | 81.29% |
| CaiT | 67.93% | 62.97% | 69.38% |
| T2TViT | 88.12% | 94.33% | 77.29% |
| PiT | 40.63% | 33.60% | 34.18% |
TWINSSVT_CONFIG = {
    "network_type": "TwinsSVT",
    "pretrained": False,
    "image_size": 224,
    "batch_size": 64,
    "num_epochs": 15,
    "optimizer_config": {
        "name": "AdamW",
        "weight_decay": 0.01,
        "lr": 0.001,
| Hyperparameter | Value |
| --- | --- |
| Batch size | 64 |
| Epochs | 15 |
| Image channels | 1 |
| Optimizer | AdamW |
| Weight decay | 0.01 |
| Learning rate schedule | Cosine schedule with warmup |
| Initial learning rate | 0.01 |
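
The "cosine schedule with warmup" entry can be illustrated with a minimal sketch. The function name `cosine_warmup_lr`, the linear warmup shape, the number of warmup steps, and the decay to zero are all assumptions made for illustration; the source only states the schedule type and the initial learning rate of 0.01.

```python
import math


def cosine_warmup_lr(
    step: int, total_steps: int, warmup_steps: int, base_lr: float = 0.01
) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay (hypothetical helper)."""
    if step < warmup_steps:
        # Assumed linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Half-cosine decay from base_lr down to 0 (the floor of 0 is an assumption).
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With, say, 100 total steps and 10 warmup steps, the rate climbs linearly to 0.01 by step 10 and then falls along a half cosine to roughly zero at step 100. In a PyTorch training loop a function like this would typically be wrapped in a scheduler and stepped once per batch, though whether the original code steps per batch or per epoch is not stated.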
from typing import Optional
from torchvision import transforms
from PIL import Image
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torch.utils.data import DataLoader, Dataset
def get_transform_train(
    upsample_size: int, final_size: int, channels: Optional[int] = 1