Created
February 9, 2023 03:41
2023-02-09 11:38:54,186 INFO:Run name: EUR-Lex_bert_20230209113854
2023-02-09 11:38:54,800 INFO:Created a temporary directory at /tmp/tmp5_bwr35i
2023-02-09 11:38:54,800 INFO:Writing /tmp/tmp5_bwr35i/_remote_module_non_scriptable.py
2023-02-09 11:38:55,241 INFO:Global seed set to 1337
2023-02-09 11:38:55,246 INFO:Using device: cuda
2023-02-09 11:38:56,089 INFO:Load data from data/EUR-Lex/train.txt.
2023-02-09 11:38:56,930 INFO:Load data from data/EUR-Lex/test.txt.
2023-02-09 11:38:57,126 INFO:Finish loading dataset (train: 12359 / val: 3090 / test: 3865)
2023-02-09 11:38:57,128 INFO:Initialize model from scratch.
2023-02-09 11:38:57,136 INFO:Read 3956 labels.
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/utilities/parsing.py:268: UserWarning: Attribute 'network' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['network'])`.
  rank_zero_warn(
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:447: LightningDeprecationWarning: Setting `Trainer(gpus=1)` is deprecated in v1.7 and will be removed in v2.0. Please use `Trainer(accelerator='gpu', devices=1)` instead.
  rank_zero_deprecation(
2023-02-09 11:38:58,756 INFO:GPU available: True (cuda), used: True
2023-02-09 11:38:58,757 INFO:TPU available: False, using: 0 TPU cores
2023-02-09 11:38:58,757 INFO:IPU available: False, using: 0 IPUs
2023-02-09 11:38:58,757 INFO:HPU available: False, using: 0 HPUs
2023-02-09 11:38:58,800 INFO:`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
2023-02-09 11:38:58,800 INFO:`Trainer(limit_val_batches=1.0)` was configured so 100% of the batches will be used..
2023-02-09 11:38:58,800 INFO:`Trainer(limit_test_batches=1.0)` was configured so 100% of the batches will be used..
2023-02-09 11:38:58,800 INFO:Finish writing log to ./runs/EUR-Lex_bert_20230209113854/logs.json.
/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:616: UserWarning: Checkpoint directory /home/tmp/LibMultiLabel/runs/EUR-Lex_bert_20230209113854 exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
Traceback (most recent call last):
  File "main.py", line 222, in <module>
    main()
  File "main.py", line 209, in main
    trainer.train()
  File "/home/tmp/LibMultiLabel/torch_trainer.py", line 210, in train
    self.trainer.fit(self.model, train_loader, val_loader)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 696, in fit
    self._call_and_handle_interrupt(
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 735, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1147, in _run
    self.strategy.setup(self)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/strategies/single_device.py", line 73, in setup
    self.model_to_device()
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/strategies/single_device.py", line 70, in model_to_device
    self.model.to(self.root_device)
  File "/home/tmp/.local/lib/python3.8/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 113, in to
    return super().to(*args, **kwargs)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 989, in to
    return self._apply(convert)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 641, in _apply
    module._apply(fn)
  [Previous line repeated 2 more times]
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 664, in _apply
    param_applied = fn(param)
  File "/home/tmp/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 987, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 90.00 MiB (GPU 0; 23.70 GiB total capacity; 8.00 KiB already allocated; 704.00 KiB free; 2.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF