@kwea123
Last active March 9, 2024 19:23
nerf_colab.ipynb
@JianingHe0115

Hi kwea123 :)
When I run the Installation cell in the Colab notebook, I get the error below. It used to run successfully before. Can you help me?

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
ERROR: Could not find a version that satisfies the requirement torch==1.4.0 (from versions: 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0)
ERROR: No matching distribution found for torch==1.4.0

@xiafengdongzhi

I ran into the same problem. Did you manage to fix it?

@aedutta

aedutta commented May 3, 2023

Go to requirements.txt and change the version of torch. In this case, I believe it does not matter what version you use.
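
As a rough sketch of that edit, the pin can also be rewritten programmatically. The helper name `repin` and the sample requirements text are assumptions for illustration; `1.13.1` is simply one of the versions pip listed as available in the error above.

```python
import re


def repin(requirements_text: str, package: str, new_version: str) -> str:
    """Return the requirements text with `package` pinned to `new_version`.

    Hypothetical helper for illustration; lines for other packages are kept as-is.
    """
    # Match the package name followed by nothing or by a version specifier,
    # so that "torch" does not also match "torchvision".
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(?:[=<>!~].*)?$", re.IGNORECASE
    )
    lines = []
    for line in requirements_text.splitlines():
        if pattern.match(line):
            lines.append(f"{package}=={new_version}")
        else:
            lines.append(line)
    return "\n".join(lines)


# Sample text only; the real requirements.txt may contain other entries.
original = "torch==1.4.0\npytorch-lightning==0.7.5\nkornia==0.2.0"
updated = repin(original, "torch", "1.13.1")
print(updated)
```

Only the `torch` line changes here. Other pins may still need adjusting to stay compatible with the new torch version.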

@Dawn11041107

Could someone tell me the latest working version of each package? Could you share your requirements.txt?

@sukikotori

sukikotori commented Nov 3, 2023

Here is my modified requirements.txt. It does build the environment, but I'm not sure whether it works with NeRF:


torch==2.1.0
torchvision==0.16.0
pytorch-lightning==0.7.5
test-tube
kornia==0.2.0
opencv-python==4.8.1.78
matplotlib
jupyter


Remember to make sure your Colab runtime is connected to a GPU.

@Dawn11041107

Dawn11041107 commented Nov 3, 2023

Thank you very much for your reply! After I modified the file as you suggested, the following error occurred:
Traceback (most recent call last):

File "/content/nerf_pl/train.py", line 7, in <module>
  from datasets import dataset_dict
File "/content/nerf_pl/datasets/__init__.py", line 1, in <module>
  from .blender import BlenderDataset
File "/content/nerf_pl/datasets/blender.py", line 9, in <module>
  from .ray_utils import *
File "/content/nerf_pl/datasets/ray_utils.py", line 2, in <module>
  from kornia import create_meshgrid
File "/usr/local/lib/python3.10/dist-packages/kornia/__init__.py", line 12, in <module>
  from kornia import augmentation
File "/usr/local/lib/python3.10/dist-packages/kornia/augmentation/__init__.py", line 1, in <module>
  from .augmentation import *
File "/usr/local/lib/python3.10/dist-packages/kornia/augmentation/augmentation.py", line 7, in <module>
  from . import functional as F
File "/usr/local/lib/python3.10/dist-packages/kornia/augmentation/functional.py", line 488
  input = input.view((-1, (*input.shape[-3:])))

SyntaxError: cannot use starred expression here
Maybe this is because Colab uses Python 3.10 and this version of kornia does not support it; something is wrong with the starred expression inside the tuple.
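
To illustrate the syntax problem in isolation, here is a minimal sketch that needs neither kornia nor torch. It checks whether the form from the traceback and an equivalent form without the extra parentheses compile on the running interpreter. The `compiles` helper and the sample `shape` are assumptions for illustration, not part of kornia.

```python
def compiles(source: str) -> bool:
    """Return True if `source` is syntactically valid on the running interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Form from the traceback: the starred expression is wrapped in its own parentheses.
broken = "new_shape = (-1, (*shape[-3:]))"
# Equivalent form that unpacks directly inside the tuple.
fixed = "new_shape = (-1, *shape[-3:])"

print("broken form compiles:", compiles(broken))
print("fixed form compiles:", compiles(fixed))

shape = (8, 3, 32, 32)  # illustrative (B, C, H, W) shape
new_shape = (-1, *shape[-3:])
print(new_shape)
```

Since the offending line lives inside the installed kornia package, installing a kornia release that does not contain it is probably simpler than patching the file by hand.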

@Dawn11041107

Dawn11041107 commented Nov 6, 2023

Guys, this is my final requirements.txt (as of Nov. 4th, 2023), and it works for me!

torch==1.11.0
torchvision==0.1.6
pytorch-lightning==1.6.0
test-tube==0.7.5
kornia==0.7.0
opencv-python==3.4.0.14
matplotlib
jupyter

# for mesh
PyMCubes
pycollada
trimesh
pyglet

# for point cloud
plyfile
open3d

I also changed some code in train.py to match these versions:

import os
from opt import get_opts
import torch
from collections import defaultdict

from torch.utils.data import DataLoader
from datasets import dataset_dict

# models
from models.nerf import Embedding, NeRF
from models.rendering import render_rays

# optimizer, scheduler, visualization
from utils import *

# losses
from losses import loss_dict

# metrics
from metrics import *

# pytorch-lightning
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.loggers import TestTubeLogger
from pytorch_lightning.profiler import PyTorchProfiler

class NeRFSystem(LightningModule):
    def __init__(self, hparams):
        super(NeRFSystem, self).__init__()
        self.save_hyperparameters(hparams)
        # self.hparams = hparams

        self.loss = loss_dict[hparams.loss_type]()

        self.embedding_xyz = Embedding(3, 10) # 10 is the default number
        self.embedding_dir = Embedding(3, 4) # 4 is the default number
        self.embeddings = [self.embedding_xyz, self.embedding_dir]

        self.nerf_coarse = NeRF()
        self.models = [self.nerf_coarse]
        if hparams.N_importance > 0:
            self.nerf_fine = NeRF()
            self.models += [self.nerf_fine]

    def decode_batch(self, batch):
        rays = batch['rays'] # (B, 8)
        rgbs = batch['rgbs'] # (B, 3)
        return rays, rgbs

    def forward(self, rays):
        """Do batched inference on rays using chunk."""
        B = rays.shape[0]
        results = defaultdict(list)
        for i in range(0, B, self.hparams.chunk):
            rendered_ray_chunks = \
                render_rays(self.models,
                            self.embeddings,
                            rays[i:i+self.hparams.chunk],
                            self.hparams.N_samples,
                            self.hparams.use_disp,
                            self.hparams.perturb,
                            self.hparams.noise_std,
                            self.hparams.N_importance,
                            self.hparams.chunk, # chunk size is effective in val mode
                            self.train_dataset.white_back)

            for k, v in rendered_ray_chunks.items():
                results[k] += [v]

        for k, v in results.items():
            results[k] = torch.cat(v, 0)
        return results

    def prepare_data(self):
        dataset = dataset_dict[self.hparams.dataset_name]
        kwargs = {'root_dir': self.hparams.root_dir,
                  'img_wh': tuple(self.hparams.img_wh)}
        if self.hparams.dataset_name == 'llff':
            kwargs['spheric_poses'] = self.hparams.spheric_poses
            kwargs['val_num'] = self.hparams.num_gpus
        self.train_dataset = dataset(split='train', **kwargs)
        self.val_dataset = dataset(split='val', **kwargs)

    def configure_optimizers(self):
        self.optimizer = get_optimizer(self.hparams, self.models)
        scheduler = get_scheduler(self.hparams, self.optimizer)
        
        return [self.optimizer], [scheduler]

    def train_dataloader(self):
        return DataLoader(self.train_dataset,
                          shuffle=True,
                          num_workers=4,
                          batch_size=self.hparams.batch_size,
                          pin_memory=True)

    def val_dataloader(self):
        return DataLoader(self.val_dataset,
                          shuffle=False,
                          num_workers=4,
                          batch_size=1, # validate one image (H*W rays) at a time
                          pin_memory=True)
    
    def training_step(self, batch, batch_nb):
        log = {'lr': get_learning_rate(self.optimizer)}
        rays, rgbs = self.decode_batch(batch)
        results = self(rays)
        log['train/loss'] = loss = self.loss(results, rgbs)
        typ = 'fine' if 'rgb_fine' in results else 'coarse'

        with torch.no_grad():
            psnr_ = psnr(results[f'rgb_{typ}'], rgbs)
            log['train/psnr'] = psnr_

        return {'loss': loss,
                'progress_bar': {'train_psnr': psnr_},
                'log': log
               }

    def validation_step(self, batch, batch_nb):
        rays, rgbs = self.decode_batch(batch)
        rays = rays.squeeze() # (H*W, 3)
        rgbs = rgbs.squeeze() # (H*W, 3)
        results = self(rays)
        log = {'val_loss': self.loss(results, rgbs)}
        typ = 'fine' if 'rgb_fine' in results else 'coarse'
    
        if batch_nb == 0:
            W, H = self.hparams.img_wh
            img = results[f'rgb_{typ}'].view(H, W, 3).cpu()
            img = img.permute(2, 0, 1) # (3, H, W)
            img_gt = rgbs.view(H, W, 3).permute(2, 0, 1).cpu() # (3, H, W)
            depth = visualize_depth(results[f'depth_{typ}'].view(H, W)) # (3, H, W)
            stack = torch.stack([img_gt, img, depth]) # (3, 3, H, W)
            self.logger.experiment.add_images('val/GT_pred_depth',
                                               stack, self.global_step)

        log['val_psnr'] = psnr(results[f'rgb_{typ}'], rgbs)
        return log

    def validation_epoch_end(self, outputs):
        mean_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        mean_psnr = torch.stack([x['val_psnr'] for x in outputs]).mean()

        return {'progress_bar': {'val_loss': mean_loss,
                                 'val_psnr': mean_psnr},
                'log': {'val/loss': mean_loss,
                        'val/psnr': mean_psnr}
               }
    def lr_scheduler_step(self, scheduler, optimizer_idx, monitor_val=None):
        scheduler.step()


if __name__ == '__main__':
    hparams = get_opts()
    system = NeRFSystem(hparams)
    
    checkpoint_callback = ModelCheckpoint(dirpath=os.path.join(f'ckpts/{hparams.exp_name}',
                                                                '{epoch:d}'),
                                          monitor='val/loss',
                                          mode='min',
                                          save_top_k=5,)

    logger = TestTubeLogger(
        save_dir="logs",
        name=hparams.exp_name,
        debug=False,
        create_git_tag=False
    )

    early_stop_callback = EarlyStopping(
        monitor=None,
        min_delta=0.0,
        patience=3,
        verbose=True,
        mode='min'
    )


    trainer = Trainer(max_epochs=hparams.num_epochs, 
              callbacks=[early_stop_callback],
              checkpoint_callback=checkpoint_callback,
              resume_from_checkpoint=hparams.ckpt_path,
              logger=logger,
              weights_summary=None,
              progress_bar_refresh_rate=1,
              gpus=hparams.num_gpus,
              # distributed_backend='ddp' if hparams.num_gpus>1 else None,
              strategy='ddp' if hparams.num_gpus>1 else None,
              num_sanity_val_steps=1,
              benchmark=True,
              profiler=PyTorchProfiler()
              # profiler=hparams.num_gpus==1
              )

    trainer.fit(system)

@aedutta

aedutta commented Nov 6, 2023

Awesome! I will add these changes and see if it works for me.

@utkrshkmr

I am still getting the kornia SyntaxError (cannot use starred expression here). How can I resolve it?

@Dawn11041107

Hey, my friend! Try the answer I posted on Nov 6, where I changed the versions and the code. Hopefully it helps you too~

@utkrshkmr

utkrshkmr commented Dec 18, 2023

You did help resolve the issue with building the requirements, but I still have problems running the training code. Is there a specific version of Python that works?

Could you share a snapshot of your requirements and the final code? I know I am asking a lot, but please help me.

@Dawn11041107

1. When running this code in Google Colab, note that the built-in Python version is 3.10. You can check the version with the following code:
import sys
print(sys.version)
2. The code I provided (see the following screenshot) is my final requirements.txt along with the train.py code. Feel free to ask if you need any help!
image

@JeffreyLuW

@Dawn11041107 Hi my friend. I made the same changes to requirements.txt and train.py that you described. However, when I ran the code, I got the following error:


Building wheels for collected packages: test-tube, opencv-python, pycollada
  Building wheel for test-tube (setup.py) ... done
  Created wheel for test-tube: filename=test_tube-0.7.5-py3-none-any.whl size=25328 sha256=e81f157fbb28dda5795f5ff96e34cc4bf30c5d1cb8a112c9380b0b1ee0b34e4d
  Stored in directory: /root/.cache/pip/wheels/28/d4/8b/1aeb47c0dedd931b8e6aec55a8091864a69ac6f0adc5b12ea9
  error: subprocess-exited-with-error
  
  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Building wheel for opencv-python (setup.py) ... error
  ERROR: Failed building wheel for opencv-python
  Running setup.py clean for opencv-python
  Building wheel for pycollada (setup.py) ... done
  Created wheel for pycollada: filename=pycollada-0.7.2-py3-none-any.whl size=127017 sha256=3e607381f7ab945b02a3fc1c02ff15efafc3136b12304969d6f1633d9181e774
  Stored in directory: /root/.cache/pip/wheels/d5/ba/33/1e99a7e7defd1d77f0210e7a39ff58de2a2d8d4c22466bb2da
Successfully built test-tube pycollada
Failed to build opencv-python
ERROR: Could not build wheels for opencv-python, which is required to install pyproject.toml-based projects
/content/nerf_pl/torchsearchsorted

Thank you in advance for your help

@pauloabner

pauloabner commented Jan 10, 2024

@JeffreyLuW, I am having the same issue. Have you made any progress?

@Dawn11041107

Dawn11041107 commented Jan 12, 2024

Hi, my friend. I tried again and found that the problem you mentioned does occur. I ran !pip show test-tube opencv-python pycollada to check whether they were installed, and I still found some other issues...

In the end, I removed the version specifiers from requirements.txt and kept using the train.py I wrote. Surprisingly, it worked without any problems.

image
image
PS: I ran into an OSError, so I added import os to train.py.

@NyaNyav2

@Dawn11041107 Hi my friend, I followed your guide and hit this error in train.py:
/content/nerf_pl
Traceback (most recent call last):
File "/content/nerf_pl/train.py", line 47, in <module>
  from pytorch_lightning.loggers import TestTubeLogger
ImportError: cannot import name 'TestTubeLogger' from 'pytorch_lightning.loggers' (/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loggers/__init__.py)
Can you show me how to debug this, please?
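
The ImportError suggests that the installed pytorch-lightning release no longer provides `TestTubeLogger`, which seems plausible once the version pins are removed. One possible workaround, sketched below under that assumption, is to fall back to `TensorBoardLogger` from the same module. The helper `first_available` is hypothetical, and the script also runs (selecting nothing) when pytorch-lightning is not installed.

```python
import importlib


def first_available(candidates):
    """Return the first attribute importable from a list of (module, name) pairs.

    Hypothetical helper for illustration; returns None if nothing can be imported.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module is missing, try the next candidate
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


# Prefer TestTubeLogger where it still exists, otherwise use TensorBoardLogger.
LoggerClass = first_available([
    ("pytorch_lightning.loggers", "TestTubeLogger"),
    ("pytorch_lightning.loggers", "TensorBoardLogger"),
])
print("selected logger class:", LoggerClass)
```

If the fallback is selected, the `debug` and `create_git_tag` arguments used in train.py would probably need to be dropped, keeping only `save_dir` and `name`.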

@AtharvaJadhav7

AtharvaJadhav7 commented Mar 9, 2024

@Dawn11041107 I am getting the following error:

Validation DataLoader 0: 0% 0/1 [00:01<?, ?it/s]
Epoch 0: 100% 3536/3536 [16:02<00:00, 3.67it/s, loss=0.0143, v_num=2]
Traceback (most recent call last):
File "/content/nerf_pl/train.py", line 197, in <module>
trainer.fit(system)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
self._call_and_handle_interrupt(
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 724, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1237, in _run
results = self._run_stage()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1324, in _run_stage
return self._run_train()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1354, in _run_train
self.fit_loop.run()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/base.py", line 205, in run
self.on_advance_end()
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/loops/fit_loop.py", line 297, in on_advance_end
self.trainer._call_callback_hooks("on_train_epoch_end")
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1637, in _call_callback_hooks
fn(self, self.lightning_module, *args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/early_stopping.py", line 179, in on_train_epoch_end
self._run_early_stopping_check(trainer)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/early_stopping.py", line 190, in _run_early_stopping_check
if trainer.fast_dev_run or not self._validate_condition_metric( # disable early_stopping with fast_dev_run
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/early_stopping.py", line 145, in _validate_condition_metric
raise RuntimeError(error_msg)
RuntimeError: Early stopping conditioned on metric None which is not available. Pass in or modify your EarlyStopping callback to use any of the following: ``
[W observer.cpp:104] Warning: Leaked callback handle: 1 (function operator())
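
The message indicates that the EarlyStopping callback was created with monitor=None (as in the train.py posted above) and that no logged metric was available to it. A likely fix, not verified against this repository, is to pass the name of a metric the model actually logs (for example one recorded with `self.log` during validation), or to leave the callback out of the Trainer's callbacks list. The sketch below is plain Python that only mimics this kind of check; it is not Lightning's implementation, and `check_monitor` is a hypothetical name.

```python
def check_monitor(monitor, logged_metrics):
    """Mimic a monitor-availability check (illustrative, not Lightning's code)."""
    if monitor not in logged_metrics:
        available = ", ".join(sorted(logged_metrics))
        raise RuntimeError(
            f"Early stopping conditioned on metric {monitor} which is not "
            f"available. Available metrics: {available}"
        )
    return logged_metrics[monitor]


# With monitor=None and nothing logged, the check fails much like the traceback.
try:
    check_monitor(None, {})
except RuntimeError as err:
    print("failed:", err)

# With a metric name that is actually logged, the check passes.
value = check_monitor("val/loss", {"val/loss": 0.0143})
print("monitored value:", value)
```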
