@stevenkolawole · Created February 9, 2023 03:41
2023-02-09 11:38:54,186 INFO:Run name: EUR-Lex_bert_20230209113854
2023-02-09 11:38:54,800 INFO:Created a temporary directory at /tmp/tmp5_bwr35i
2023-02-09 11:38:54,800 INFO:Writing /tmp/tmp5_bwr35i/_remote_module_non_scriptable.py
2023-02-09 11:38:55,241 INFO:Global seed set to 1337
2023-02-09 11:38:55,246 INFO:Using device: cuda
2023-02-09 11:38:56,089 INFO:Load data from data/EUR-Lex/train.txt.
2023-02-09 11:38:56,930 INFO:Load data from data/EUR-Lex/test.txt.
2023-02-09 11:38:57,126 INFO:Finish loading dataset (train: 12359 / val: 3090 / test: 3865)
2023-02-09 11:38:57,128 INFO:Initialize model from scratch.
2023-02-09 11:38:57,136 INFO:Read 3956 labels.
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/parsing.py:268: UserWarning: Attribute 'network' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['network'])`.
    rank_zero_warn(
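(The warning above can be silenced by excluding the nn.Module attribute from hyperparameter checkpointing, as the message itself suggests. A minimal sketch, assuming a LightningModule that receives its network at construction; the attribute name 'network' comes from the warning, but the class name `Model` and the `learning_rate` parameter are illustrative, not LibMultiLabel's actual code:)

import pytorch_lightning as pl
import torch.nn as nn

class Model(pl.LightningModule):
    def __init__(self, network: nn.Module, learning_rate: float = 1e-3):
        super().__init__()
        # Skip the nn.Module when saving hyperparameters: its weights are
        # already written to the checkpoint's state_dict, so storing it again
        # as a "hyperparameter" is redundant (hence the warning).
        self.save_hyperparameters(ignore=['network'])
        self.network = network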
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=1)` instead.
    rank_zero_deprecation(
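(The deprecated `gpus` flag maps directly onto the `accelerator`/`devices` pair. A one-line sketch of the replacement the warning recommends, applied wherever the Trainer is constructed, e.g. in torch_trainer.py:)

from pytorch_lightning import Trainer

# Deprecated since PL v1.7, removed in v2.0:
#   trainer = Trainer(gpus=1)
# Equivalent replacement per the deprecation warning:
trainer = Trainer(accelerator='gpu', devices=1)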
2023-02-09 11:38:58,756 INFO:GPU available: True (cuda), used: True
2023-02-09 11:38:58,757 INFO:TPU available: False, using: 0 TPU cores
2023-02-09 11:38:58,757 INFO:IPU available: False, using: 0 IPUs
2023-02-09 11:38:58,757 INFO:HPU available: False, using: 0 HPUs
2023-02-09 11:38:58,800 INFO:`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used.
2023-02-09 11:38:58,800 INFO:`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used.
2023-02-09 11:38:58,800 INFO:`Trainer(limit_test_batches=1.0)` was configured so 100% of the batches will be used.
2023-02-09 11:38:58,800 INFO:Finish writing log to ./runs/EUR-Lex_bert_20230209113854/logs.json.
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:616: UserWarning: Checkpoint directory /home/tmp/LibMultiLabel/runs/EUR-Lex_bert_20230209113854 exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
Traceback (most recent call last):
  File "main.py", line 222, in <module>
    main()
  File "main.py", line 209, in main
    trainer.train()
  File "/home/tmp/LibMultiLabel/torch_trainer.py", line 210, in train
    self.trainer.fit(self.model, train_loader, val_loader)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
    self._call_and_handle_interrupt(
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1147, in _run
    self.strategy.setup(self)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/strategies/single_device.py", line 73, in setup
    self.model_to_device()
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/strategies/single_device.py", line 70, in model_to_device
    self.model.to(self.root_device)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 113, in to
    return super().to(*args, **kwargs)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 989, in to
    return self._apply(convert)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 664, in _apply
    param_applied = fn(param)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 987, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 90.00 MiB (GPU 0; 23.70 GiB total capacity; 8.00 KiB already allocated; 704.00 KiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
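(Note the numbers in the error: only 704.00 KiB of the card's 23.70 GiB is free while PyTorch itself has allocated just 8.00 KiB, so the memory is almost certainly held by other processes on GPU 0 (check with nvidia-smi) rather than lost to fragmentation within this run. If fragmentation were the culprit, the error's own suggestion applies; a hedged sketch follows, where the 128 MiB split size is an illustrative value and the variable must be set before CUDA is initialized:)

import os

# Cap the allocator's split size to reduce fragmentation; this must be set
# before the first CUDA allocation (e.g., at the very top of main.py).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

# Confirm how much device memory is actually free before training starts.
free_bytes, total_bytes = torch.cuda.mem_get_info(0)
print(f"GPU 0: {free_bytes / 2**20:.0f} MiB free of {total_bytes / 2**20:.0f} MiB total")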