Created September 22, 2022 21:59
Loading intelmpi version 2021.4.0
hosts gpu-st-p4d-24xlarge-340
go 1
/opt/slurm/bin/srun: line 27: [: too many arguments
cpu-bind=MASK - gpu-st-p4d-24xlarge-340, task 0 0 [40118]: mask 0xffffffffffff set
/usr/lib64/python3.7/runpy.py:125: RuntimeWarning: 'clip_retrieval.clip_inference.worker' found in sys.modules after import of package 'clip_retrieval.clip_inference', but prior to execution of 'clip_retrieval.clip_inference.worker'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
/usr/lib64/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
wandb:
wandb: Run history:
wandb: average_inference_duration_per_sample ▁
wandb: average_read_duration_per_sample ▁
wandb: average_total_duration_per_sample ▁
wandb: average_write_duration_per_sample ▁
wandb: sample_count ▁
wandb: sample_per_sec ▁
wandb: total_job_duration ▁
wandb:
wandb: Run summary:
wandb: average_inference_duration_per_sample 0.0007
wandb: average_read_duration_per_sample 0.00194
wandb: average_total_duration_per_sample 0.00265
wandb: average_write_duration_per_sample 0.0
wandb: sample_count 19118
wandb: sample_per_sec 1809.67095
wandb: total_job_duration 10.56435
wandb:
wandb: Synced glad-aardvark-16: https://wandb.ai/nousr_laion/clip_retrieval/runs/2j8sv9xm
wandb: Synced 4 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20220922_213855-2j8sv9xm/logs
ERROR: Could not consume arg: s3
Usage: worker.py --input_dataset='['"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar '-'"'"',' ''"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar '-'"'"']' --output_folder=s3://s-laion/clip-h-embeddings-test --input_format=webdataset --cache_path=/fsx/nousr/.cache --batch_size=64 --num_prepro_workers=6 --enable_text=True --enable_image=True
For detailed information on this command, run:
worker.py --input_dataset='['"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar '-'"'"',' ''"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar '-'"'"']' --output_folder=s3://s-laion/clip-h-embeddings-test --input_format=webdataset --cache_path=/fsx/nousr/.cache --batch_size=64 --num_prepro_workers=6 --enable_text=True --enable_image=True --help
sample_per_sec 1809 ; sample_count 19118
srun: error: gpu-st-p4d-24xlarge-337: task 0: Exited with exit code 2
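The failure above is argument quoting, not the worker itself: Fire sees `s3` as a stray positional argument because the `--input_dataset` list, whose entries contain spaces (`pipe:aws s3 cp ...`), got re-split by the shell before reaching worker.py; the same unquoted expansion likely also trips the `[` test at line 27 of the srun wrapper. Below is a minimal sketch of one way to build the command so the list survives as a single argv token. The shard URLs, flags, and module path are read off the log; the launcher structure itself is an assumption, not the script actually used here.

```python
import shlex
import subprocess

# Shard pipes from the log; each entry contains spaces, so the whole
# list must reach worker.py as ONE argv element.
shards = [
    "pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar -",
    "pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar -",
]

# repr() renders each entry as a quoted Python string literal, so the
# final value is a valid list literal that Fire's value parser accepts.
input_dataset = "[" + ",".join(repr(s) for s in shards) + "]"

cmd = [
    "python", "-m", "clip_retrieval.clip_inference.worker",
    f"--input_dataset={input_dataset}",
    "--output_folder=s3://s-laion/clip-h-embeddings-test",
    "--input_format=webdataset",
    "--cache_path=/fsx/nousr/.cache",
    "--batch_size=64",
    "--num_prepro_workers=6",
    "--enable_text=True",
    "--enable_image=True",
]

# Passing a list to subprocess avoids any shell re-splitting; if the
# command must instead be embedded in a shell script (e.g. the srun
# wrapper), shlex.quote each element so inner spaces survive.
print(" ".join(shlex.quote(c) for c in cmd))
subprocess.run(cmd, check=True)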
[I debug.cpp:47] [c10d] The debug level is set to INFO.
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
0it [00:00, ?it/s]
There was a problem when trying to write in your cache folder (/home/zion/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
/usr/lib64/python3.7/runpy.py:125: RuntimeWarning: 'clip_retrieval.clip_inference.worker' found in sys.modules after import of package 'clip_retrieval.clip_inference', but prior to execution of 'clip_retrieval.clip_inference.worker'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
[I debug.cpp:47] [c10d] The debug level is set to INFO.
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
0it [00:00, ?it/s]
There was a problem when trying to write in your cache folder (/home/zion/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
/usr/lib64/python3.7/runpy.py:125: RuntimeWarning: 'clip_retrieval.clip_inference.worker' found in sys.modules after import of package 'clip_retrieval.clip_inference', but prior to execution of 'clip_retrieval.clip_inference.worker'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
wandb: Currently logged in as: nousr_laion. Use `wandb login --relogin` to force relogin
[I debug.cpp:47] [c10d] The debug level is set to INFO.
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
0it [00:00, ?it/s]
There was a problem when trying to write in your cache folder (/home/zion/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
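The repeated cache warning means the compute nodes cannot write to /home/zion/.cache. One way to address it, sketched below, is to point the Hugging Face cache at a writable shared path before transformers is first imported; the /fsx path is borrowed from the `--cache_path` flag in the log and is an assumption, not a verified mount.

```python
import os

# Assumed writable shared location, reusing the FSx path already passed
# as --cache_path in the failing command above.
os.environ["TRANSFORMERS_CACHE"] = "/fsx/nousr/.cache/huggingface"

# Must be set before the first `import transformers`: the cache
# directory is resolved at import time.
import transformers
```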
wandb: wandb version 0.13.3 is available! To upgrade, please run:
wandb: $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.21
wandb: Run data is saved locally in /fsx/nousr/clip-retrieval/wandb/run-20220922_213925-35qnxpnk
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run eager-lake-17
wandb: ⭐️ View project at https://wandb.ai/nousr_laion/clip_retrieval
wandb: 🚀 View run at https://wandb.ai/nousr_laion/clip_retrieval/runs/35qnxpnk
wandb: Waiting for W&B process to finish... (success).
wandb: 0.004 MB of 0.004 MB uploaded (0.000 MB deduped)
wandb:
/usr/lib64/python3.7/runpy.py:125: RuntimeWarning: 'clip_retrieval.clip_inference.worker' found in sys.modules after import of package 'clip_retrieval.clip_inference', but prior to execution of 'clip_retrieval.clip_inference.worker'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
/usr/lib64/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
wandb:
wandb: Run history:
wandb: average_inference_duration_per_sample ▁
wandb: average_read_duration_per_sample ▁
wandb: average_total_duration_per_sample ▁
wandb: average_write_duration_per_sample ▁
wandb: sample_count ▁
wandb: sample_per_sec ▁
wandb: total_job_duration ▁
wandb:
wandb: Run summary:
wandb: average_inference_duration_per_sample 0.0007
wandb: average_read_duration_per_sample 0.00194
wandb: average_total_duration_per_sample 0.00265
wandb: average_write_duration_per_sample 0.0
wandb: sample_count 19118
wandb: sample_per_sec 1854.90355
wandb: total_job_duration 10.30674
wandb:
wandb: Synced eager-lake-17: https://wandb.ai/nousr_laion/clip_retrieval/runs/35qnxpnk
wandb: Synced 4 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20220922_213925-35qnxpnk/logs
ERROR: Could not consume arg: s3
Usage: worker.py --input_dataset='['"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar '-'"'"',' ''"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar '-'"'"']' --output_folder=s3://s-laion/clip-h-embeddings-test --input_format=webdataset --cache_path=/fsx/nousr/.cache --batch_size=64 --num_prepro_workers=6 --enable_text=True --enable_image=True
For detailed information on this command, run:
worker.py --input_dataset='['"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar '-'"'"',' ''"'"'pipe:aws' s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar '-'"'"']' --output_folder=s3://s-laion/clip-h-embeddings-test --input_format=webdataset --cache_path=/fsx/nousr/.cache --batch_size=64 --num_prepro_workers=6 --enable_text=True --enable_image=True --help
sample_per_sec 1854 ; sample_count 19118
srun: error: gpu-st-p4d-24xlarge-340: task 0: Exited with exit code 2
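Both nodes exit with code 2 for the same quoting reason. Since Fire generates the Usage string above from the entry point's signature, another option is to skip the shell entirely and call the worker from Python with a real list. A sketch under assumptions: the parameter names are read off the Usage line, but whether `worker` is importable at this path is an assumption about clip-retrieval, not a documented API.

```python
# Parameter names taken from the Usage string in the log; the import
# path is assumed, matching the module run via `python -m` above.
from clip_retrieval.clip_inference.worker import worker

worker(
    input_dataset=[
        "pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000000.tar -",
        "pipe:aws s3 cp --quiet s3://s-datasets/laion5b/laion2B-data/000001.tar -",
    ],
    output_folder="s3://s-laion/clip-h-embeddings-test",
    input_format="webdataset",
    cache_path="/fsx/nousr/.cache",
    batch_size=64,
    num_prepro_workers=6,
    enable_text=True,
    enable_image=True,
)
```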