@m-klasen
Last active November 22, 2023 19:48

Get pretrained weights:

wget https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth

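Remove the class weights:

import torch
# Load the pretrained checkpoint on CPU and drop the COCO classification head,
# whose output size depends on the number of classes.
checkpoint = torch.load("detr-r50-e632da11.pth", map_location="cpu")
del checkpoint["model"]["class_embed.weight"]
del checkpoint["model"]["class_embed.bias"]
torch.save(checkpoint, "detr-r50_no-class-head.pth")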
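
The same deletion can be written as a small helper that removes every key under a given prefix, which avoids a KeyError when a key is already gone. This is only a sketch: `strip_prefixes` is a hypothetical name, not part of DETR or PyTorch, and the dictionary below uses placeholder strings instead of real tensors so that it runs without the checkpoint file.

```python
def strip_prefixes(state_dict, prefixes):
    """Return a copy of state_dict without keys starting with any prefix."""
    prefixes = tuple(prefixes)
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Placeholder values stand in for tensors; only the key names matter here.
model_state = {
    "class_embed.weight": "W",
    "class_embed.bias": "b",
    "bbox_embed.layers.0.weight": "W0",
    "query_embed.weight": "Q",
}
kept = strip_prefixes(model_state, ["class_embed."])
print(sorted(kept))
```

With the real checkpoint, the equivalent call would be `checkpoint["model"] = strip_prefixes(checkpoint["model"], ["class_embed."])` before `torch.save`.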

Then make sure to enable non-strict weight loading in main.py, so that the missing class head does not raise an error:

model_without_ddp.load_state_dict(checkpoint['model'], strict=False)

Your dataset should ideally be in COCO format. Write your own dataset builder (alternatively, rename your train/valid/annotation files to match the COCO dataset layout). In datasets/coco.py, add:

# Path, CocoDetection and make_coco_transforms are already available in datasets/coco.py.
def build_your_dataset(image_set, args):
    root = Path(args.coco_path)
    assert root.exists(), f'provided COCO path {root} does not exist'
    # Image folder and annotation file for each split.
    PATHS = {
        "train": (root / "train", root / "annotations" / "train.json"),
        "val": (root / "valid", root / "annotations" / "valid.json"),
    }

    img_folder, ann_file = PATHS[image_set]
    dataset = CocoDetection(img_folder, ann_file, transforms=make_coco_transforms(image_set), return_masks=args.masks)
    return dataset
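
Before training, it can help to check that the directory passed as --coco_path actually has the layout the builder above expects. The following is a minimal sketch under that assumption; `missing_paths` is a hypothetical helper, and the demo runs on a temporary directory, not on real data.

```python
import tempfile
from pathlib import Path


def missing_paths(root):
    """List the paths build_your_dataset expects that do not exist under root."""
    root = Path(root)
    expected = [
        root / "train",
        root / "valid",
        root / "annotations" / "train.json",
        root / "annotations" / "valid.json",
    ]
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


# Demo on a throwaway directory that has only the training split in place.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train").mkdir()
    (Path(tmp) / "annotations").mkdir()
    (Path(tmp) / "annotations" / "train.json").write_text("{}")
    problems = missing_paths(tmp)
print(problems)
```

An empty list means both splits and both annotation files were found.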

In datasets/__init__.py, add your builder as an option (it also has to be imported there, e.g. from .coco import build_your_dataset):

def build_dataset(image_set, args):
    if args.dataset_file == 'coco':
        return build_coco(image_set, args)
    if args.dataset_file == 'your_dataset':
        return build_your_dataset(image_set, args)
    [...]

Lastly, define how many classes you have in models/detr.py:

def build(args):
    [...]
    if args.dataset_file == 'your_dataset':
        num_classes = 4
    [...]
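
As far as the comment in DETR's `build` function describes it, num_classes corresponds to the largest category id plus one, not to the plain count of categories. Assuming that reading, the value can be derived from the annotation file. The helper name and the example categories below are hypothetical.

```python
import json


def num_classes_from_coco(annotation_json):
    """Return max category id + 1 for a COCO-format annotation string."""
    categories = json.loads(annotation_json)["categories"]
    return max(c["id"] for c in categories) + 1


# Hypothetical annotation content with three categories numbered 1 to 3.
example = json.dumps({
    "images": [],
    "annotations": [],
    "categories": [
        {"id": 1, "name": "cat"},
        {"id": 2, "name": "dog"},
        {"id": 3, "name": "bird"},
    ],
})
print(num_classes_from_coco(example))
```

For a real dataset, the string would come from reading annotations/train.json.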

Run your model (example):

python main.py --dataset_file your_dataset --coco_path data --epochs 50 --lr=1e-4 --batch_size=2 --num_workers=4 --output_dir="outputs" --resume="detr-r50_no-class-head.pth"

@Madhusakth

Hi, I am trying to train on a custom dataset with 38K training images and 6 classes. I fine-tuned the ResNet-50 DETR model (detr-r50) for about 15 epochs and the mAP remains at zero. What is the recommended number of epochs for a dataset like this?
Since I have just one GPU, each epoch takes about an hour, and I want to make sure I have the other parameters right before I train for longer.

Thanks!

@1chimaruGin

Can you please share an inference script for running predictions with a custom-trained checkpoint?
Thanks in advance!

Check it

https://github.com/woctezuma/finetune-detr/blob/master/finetune_detr.ipynb

It works for me.

@rsharmapty

Hello,
I tried changing num_queries to 500, since my images contain approximately 450 objects each,
and received the following error:

Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main(args)
  File "main.py", line 178, in main
    model_without_ddp.load_state_dict(checkpoint['model'], strict=False)
  File "/home/rsharma/git/detr/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DETR:
    size mismatch for query_embed.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([500, 256]).

Can anybody please help me with this?

@woctezuma

Maybe:

If you're fine-tuning, I don't recommend changing the number of queries on the fly, it is extremely unlikely to work out of the box. In this case you're probably better off retraining from scratch (you can change the --num_queries arg from our training script).

facebookresearch/detr#9 (comment)

@rsharmapty

@woctezuma
I am still not able to resolve it.

@m-klasen
Author

m-klasen commented Oct 8, 2020

Hi,
If you change the number of queries, you will unfortunately have to train pretty much from scratch (except for the ResNet backbone). You cannot use transformer weights trained with num_queries=100 for a transformer with 500; the same goes for class_embed and bbox_embed. Basically, everything except the backbone has to be retrained. Sorry, and good luck.
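
On the mechanics of the error: `load_state_dict(..., strict=False)` tolerates missing or unexpected keys but still raises on a shape mismatch, which is what the traceback shows for query_embed.weight. A minimal sketch of filtering out mismatched entries before loading follows. It only makes the checkpoint loadable; the dropped parameters start from random initialization, so the caution about retraining still applies. `drop_mismatched` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for tensors so the example runs without PyTorch.

```python
from types import SimpleNamespace


def drop_mismatched(checkpoint_state, model_state):
    """Keep only checkpoint entries whose shape matches the model's entry."""
    kept, dropped = {}, []
    for name, value in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(value.shape) != tuple(target.shape):
            dropped.append(name)
        else:
            kept[name] = value
    return kept, dropped


# Stand-ins with a .shape attribute, mimicking tensors.
ckpt = {
    "query_embed.weight": SimpleNamespace(shape=(100, 256)),
    "input_proj.bias": SimpleNamespace(shape=(256,)),
}
model = {
    "query_embed.weight": SimpleNamespace(shape=(500, 256)),
    "input_proj.bias": SimpleNamespace(shape=(256,)),
}
kept, dropped = drop_mismatched(ckpt, model)
print(dropped)
```

With real tensors, `model_state` would be `model_without_ddp.state_dict()` and `kept` would be passed to `load_state_dict` with `strict=False`.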

@rsharmapty

@m-klasen thanks for your quick response.
I have a small dataset of about 2k images with up to 450 objects each. Without transfer learning, the results show 0 mAP (no accuracy) with the default hyperparameters. Can you suggest something that would let me predict 500 objects per image with such a small dataset?

@m-klasen
Author

m-klasen commented Oct 8, 2020

Transformers are notoriously difficult to train and take a long time to converge (200 epochs on COCO). Unless you can split your images into smaller crops that each contain fewer than 100 detections, it is going to be difficult.
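
One way to judge whether a given crop size keeps each crop under the default 100 queries is to count annotated boxes per tile. This is a rough sketch and not something from the DETR code base: `boxes_per_tile` is a hypothetical helper, each box is assigned to a tile by its center, and the boxes are synthetic.

```python
def boxes_per_tile(boxes, image_w, image_h, tile):
    """Count boxes per square tile, assigning each box by its center point.

    boxes are (x, y, w, h) in pixels, as in COCO annotations.
    """
    counts = {}
    for x, y, w, h in boxes:
        cx, cy = x + w / 2, y + h / 2
        # Clamp so centers on the right/bottom edge fall into the last tile.
        col = min(int(cx // tile), (image_w - 1) // tile)
        row = min(int(cy // tile), (image_h - 1) // tile)
        counts[(row, col)] = counts.get((row, col), 0) + 1
    return counts


# Hypothetical 1000x1000 image with 450 small boxes on a regular grid.
boxes = [(40 * (i % 25), 50 * (i // 25), 10, 10) for i in range(450)]
counts = boxes_per_tile(boxes, 1000, 1000, tile=250)
print(max(counts.values()))
```

If the largest count stays below the number of queries, that tile size is a candidate; boxes that straddle tile borders would still need to be clipped or duplicated when the crops are actually produced.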

@rsharmapty

Just to be clear:

  1. There is no way to change num_queries and still use transfer learning.
  2. If I train from scratch, I can get decent results in 200 epochs (I ask because I see 0 mAP after 300 epochs in this case as well).

@woctezuma

woctezuma commented Oct 8, 2020

See facebookresearch/detr#216 to work around the issue.
