Created
July 18, 2023 15:43
(cuda118) [RedmondAI] ubuntu@oa-server-4:~/OA/model/model_training$ CUDA_VISIBLE_DEVICES=1 OMP_NUM_THREADS=1 accelerate launch --main_process_port 29501 --config_file configs/accelerate_config.yaml --num_processes 1 trainer_rl.py --configs defaults defaults_rlhf pythia_rlhf oa_df
[2023-07-18 15:22:12,144] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
[2023-07-18 15:22:15,570] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect) | |
RNG seed: 2703368087 | |
Found cached dataset json (/home/ubuntu/.cache/huggingface/datasets/toanbku___json/toanbku--oa-df-811abf2c8473a2c5/0.0.0/8bb11242116d547c741b2e8a1f18598ffdd40a1d4f2a2872c7a28b697434bc96) | |
OASST HF dataset toanbku/oa-df: len(train)=19, len(val)=4 | |
len self.tokenizer 50282 | |
[2023-07-18 15:22:19,161] [WARNING] [comm.py:152:init_deepspeed_backend] NCCL backend in DeepSpeed not yet implemented | |
[2023-07-18 15:22:19,161] [INFO] [comm.py:594:init_distributed] cdb=None | |
[2023-07-18 15:22:19,161] [INFO] [comm.py:625:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl | |
[RANK 0] Initializing model: toanbku/oa-pythia-12b-sft-df | |
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████| 3/3 [00:17<00:00, 5.70s/it] | |
Resizing embeddings to 50282 | |
Number of trainable parameters: 11841M | |
wandb: Currently logged in as: toanbku. Use `wandb login --relogin` to force relogin | |
wandb: Tracking run with wandb version 0.15.5 | |
wandb: Run data is saved locally in /home/ubuntu/OA/model/model_training/wandb/run-20230718_152646-l4eu07k3 | |
wandb: Run `wandb offline` to turn off syncing. | |
wandb: Syncing run trainer_rl/oa-pythia-12b-sft-df/1gpu:main | |
wandb: ⭐️ View project at https://wandb.ai/toanbku/rlhf | |
wandb: 🚀 View run at https://wandb.ai/toanbku/rlhf/runs/l4eu07k3 | |
[2023-07-18 15:26:54,809] [WARNING] [cpu_adam.py:84:__init__] FP16 params for CPUAdam may not work on AMD CPUs | |
Using /home/ubuntu/.cache/torch_extensions/py310_cu118 as PyTorch extensions root... | |
Detected CUDA files, patching ldflags | |
Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py310_cu118/cpu_adam/build.ninja... | |
Building extension module cpu_adam... | |
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N) | |
ninja: no work to do. | |
Loading extension module cpu_adam... | |
Time to load cpu_adam op: 2.5278260707855225 seconds | |
Adam Optimizer #0 is created with AVX2 arithmetic capability. | |
Config: alpha=0.000001, betas=(0.900000, 0.999000), weight_decay=0.000001, adam_w=1 | |
[2023-07-18 15:26:58,261] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed info: version=0.9.5, git-hash=unknown, git-branch=unknown | |
[2023-07-18 15:27:07,634] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False | |
[2023-07-18 15:27:07,636] [INFO] [logging.py:96:log_dist] [Rank 0] Removing param_group that has no 'params' in the client Optimizer | |
[2023-07-18 15:27:07,637] [INFO] [logging.py:96:log_dist] [Rank 0] Using client Optimizer as basic optimizer | |
[2023-07-18 15:27:07,676] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Basic Optimizer = DeepSpeedCPUAdam | |
[2023-07-18 15:27:07,676] [INFO] [utils.py:54:is_zero_supported_optimizer] Checking ZeRO support for optimizer=DeepSpeedCPUAdam type=<class 'deepspeed.ops.adam.cpu_adam.DeepSpeedCPUAdam'> | |
[2023-07-18 15:27:07,676] [INFO] [logging.py:96:log_dist] [Rank 0] Creating torch.float16 ZeRO stage 2 optimizer | |
[2023-07-18 15:27:07,676] [INFO] [stage_1_and_2.py:133:__init__] Reduce bucket size 200000000 | |
[2023-07-18 15:27:07,676] [INFO] [stage_1_and_2.py:134:__init__] Allgather bucket size 200000000 | |
[2023-07-18 15:27:07,676] [INFO] [stage_1_and_2.py:135:__init__] CPU Offload: True | |
[2023-07-18 15:27:07,677] [INFO] [stage_1_and_2.py:136:__init__] Round robin gradient partitioning: False | |
Rank: 0 partition count [1] and sizes[(11894364162, False)] | |
[2023-07-18 15:27:55,937] [INFO] [utils.py:785:see_memory_usage] Before initializing optimizer states | |
[2023-07-18 15:27:55,938] [INFO] [utils.py:786:see_memory_usage] MA 22.78 GB Max_MA 22.78 GB CA 22.8 GB Max_CA 23 GB | |
[2023-07-18 15:27:55,939] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 53.97 GB, percent = 14.3% | |
[2023-07-18 15:29:17,258] [INFO] [utils.py:785:see_memory_usage] After initializing optimizer states | |
[2023-07-18 15:29:17,259] [INFO] [utils.py:786:see_memory_usage] MA 22.78 GB Max_MA 22.78 GB CA 22.8 GB Max_CA 23 GB | |
[2023-07-18 15:29:17,260] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 208.36 GB, percent = 55.1% | |
[2023-07-18 15:29:17,260] [INFO] [stage_1_and_2.py:488:__init__] optimizer state initialized | |
[2023-07-18 15:29:17,370] [INFO] [utils.py:785:see_memory_usage] After initializing ZeRO optimizer | |
[2023-07-18 15:29:17,371] [INFO] [utils.py:786:see_memory_usage] MA 22.78 GB Max_MA 22.78 GB CA 22.8 GB Max_CA 23 GB | |
[2023-07-18 15:29:17,371] [INFO] [utils.py:793:see_memory_usage] CPU Virtual Memory: used = 208.36 GB, percent = 55.1% | |
[2023-07-18 15:29:17,384] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed Final Optimizer = DeepSpeedCPUAdam | |
[2023-07-18 15:29:17,384] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed using client LR scheduler | |
[2023-07-18 15:29:17,384] [INFO] [logging.py:96:log_dist] [Rank 0] DeepSpeed LR Scheduler = None | |
[2023-07-18 15:29:17,384] [INFO] [logging.py:96:log_dist] [Rank 0] step=0, skipped=0, lr=[1e-06], mom=[[0.9, 0.95]] | |
[2023-07-18 15:29:17,385] [INFO] [config.py:960:print] DeepSpeedEngine configuration: | |
[2023-07-18 15:29:17,385] [INFO] [config.py:964:print] activation_checkpointing_config { | |
"partition_activations": false, | |
"contiguous_memory_optimization": false, | |
"cpu_checkpointing": false, | |
"number_checkpoints": null, | |
"synchronize_checkpoint_boundary": false, | |
"profile": false | |
} | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] amp_enabled .................. False | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] amp_params ................... False | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] autotuning_config ............ { | |
"enabled": false, | |
"start_step": null, | |
"end_step": null, | |
"metric_path": null, | |
"arg_mappings": null, | |
"metric": "throughput", | |
"model_info": null, | |
"results_dir": "autotuning_results", | |
"exps_dir": "autotuning_exps", | |
"overwrite": true, | |
"fast": true, | |
"start_profile_step": 3, | |
"end_profile_step": 5, | |
"tuner_type": "gridsearch", | |
"tuner_early_stopping": 5, | |
"tuner_num_trials": 50, | |
"model_info_path": null, | |
"mp_size": 1, | |
"max_train_batch_size": null, | |
"min_train_batch_size": 1, | |
"max_train_micro_batch_size_per_gpu": 1.024000e+03, | |
"min_train_micro_batch_size_per_gpu": 1, | |
"num_tuning_micro_batch_sizes": 3 | |
} | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] bfloat16_enabled ............. False | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] checkpoint_parallel_write_pipeline False | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] checkpoint_tag_validation_enabled True | |
[2023-07-18 15:29:17,386] [INFO] [config.py:964:print] checkpoint_tag_validation_fail False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] comms_config ................. <deepspeed.comm.config.DeepSpeedCommsConfig object at 0x7febe82fed10> | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] communication_data_type ...... None | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] compression_config ........... {'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] curriculum_enabled_legacy .... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] curriculum_params_legacy ..... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] data_efficiency_config ....... {'enabled': False, 'seed': 1234, 'data_sampling': {'enabled': False, 'num_epochs': 1000, 'num_workers': 0, 'curriculum_learning': {'enabled': False}}, 'data_routing': {'enabled': False, 'random_ltd': {'enabled': False, 'layer_token_lr_schedule': {'enabled': False}}}} | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] data_efficiency_enabled ...... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] dataloader_drop_last ......... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] disable_allgather ............ False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] dump_state ................... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] dynamic_loss_scale_args ...... {'init_scale': 4096, 'scale_window': 1000, 'delayed_shift': 2, 'consecutive_hysteresis': False, 'min_scale': 1} | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] eigenvalue_enabled ........... False | |
[2023-07-18 15:29:17,387] [INFO] [config.py:964:print] eigenvalue_gas_boundary_resolution 1 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_layer_name ........ bert.encoder.layer | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_layer_num ......... 0 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_max_iter .......... 100 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_stability ......... 1e-06 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_tol ............... 0.01 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] eigenvalue_verbose ........... False | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] elasticity_enabled ........... False | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] flops_profiler_config ........ { | |
"enabled": false, | |
"recompute_fwd_factor": 0.0, | |
"profile_step": 1, | |
"module_depth": -1, | |
"top_modules": 1, | |
"detailed": true, | |
"output_file": null | |
} | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] fp16_auto_cast ............... False | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] fp16_enabled ................. true | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] fp16_master_weights_and_gradients False | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] global_rank .................. 0 | |
[2023-07-18 15:29:17,388] [INFO] [config.py:964:print] grad_accum_dtype ............. None | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] gradient_accumulation_steps .. 1 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] gradient_clipping ............ 1.0 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] gradient_predivide_factor .... 1.0 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] hybrid_engine ................ enabled=False max_out_tokens=512 inference_tp_size=1 release_inference_cache=False pin_parameters=True tp_gather_partition_size=8 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] initial_dynamic_scale ........ 4096 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] load_universal_checkpoint .... False | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] loss_scale ................... 0 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] memory_breakdown ............. False | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] mics_hierarchial_params_gather False | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] mics_shard_size .............. -1 | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] monitor_config ............... tensorboard=TensorBoardConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') wandb=WandbConfig(enabled=False, group=None, team=None, project='deepspeed') csv_monitor=CSVConfig(enabled=False, output_path='', job_name='DeepSpeedJobName') enabled=False | |
[2023-07-18 15:29:17,389] [INFO] [config.py:964:print] nebula_config ................ { | |
"enabled": false, | |
"persistent_storage_path": null, | |
"persistent_time_interval": 100, | |
"num_of_version_in_retention": 2, | |
"enable_nebula_load": true, | |
"load_path": null | |
} | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] optimizer_legacy_fusion ...... False | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] optimizer_name ............... None | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] optimizer_params ............. None | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] pld_enabled .................. False | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] pld_params ................... False | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] prescale_gradients ........... False | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] scheduler_name ............... None | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] scheduler_params ............. None | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] sparse_attention ............. None | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] sparse_gradients_enabled ..... False | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] steps_per_print .............. inf | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] train_batch_size ............. 1 | |
[2023-07-18 15:29:17,390] [INFO] [config.py:964:print] train_micro_batch_size_per_gpu 1 | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] use_node_local_storage ....... False | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] wall_clock_breakdown ......... False | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] world_size ................... 1 | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] zero_allow_untested_optimizer True | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] zero_config .................. stage=2 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=200000000 allgather_partitions=True allgather_bucket_size=200000000 overlap_comm=False load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=DeepSpeedZeroOffloadOptimizerConfig(device='cpu', nvme_path=None, buffer_count=4, pin_memory=True, pipeline=False, pipeline_read=False, pipeline_write=False, fast_init=False) sub_group_size=1,000,000,000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50,000,000 param_persistence_threshold=100,000 model_persistence_threshold=sys.maxsize max_live_parameters=1,000,000,000 max_reuse_distance=1,000,000,000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False mics_shard_size=-1 mics_hierarchical_params_gather=False memory_efficient_linear=True | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] zero_enabled ................. True | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] zero_force_ds_cpu_optimizer .. True | |
[2023-07-18 15:29:17,391] [INFO] [config.py:964:print] zero_optimization_stage ...... 2 | |
[2023-07-18 15:29:17,392] [INFO] [config.py:950:print_user_config] json = { | |
"fp16": { | |
"enabled": "true", | |
"loss_scale": 0, | |
"loss_scale_window": 1000, | |
"initial_scale_power": 12, | |
"hysteresis": 2, | |
"min_loss_scale": 1, | |
"auto_cast": false | |
}, | |
"zero_optimization": { | |
"stage": 2, | |
"allgather_partitions": true, | |
"allgather_bucket_size": 2.000000e+08, | |
"overlap_comm": false, | |
"reduce_scatter": true, | |
"reduce_bucket_size": 2.000000e+08, | |
"contiguous_gradients": true, | |
"offload_optimizer": { | |
"device": "cpu", | |
"pin_memory": true | |
} | |
}, | |
"gradient_accumulation_steps": 1, | |
"train_micro_batch_size_per_gpu": 1, | |
"train_batch_size": 1, | |
"gradient_clipping": 1.0, | |
"steps_per_print": inf, | |
"wall_clock_breakdown": false, | |
"bf16": { | |
"enabled": false | |
}, | |
"zero_allow_untested_optimizer": true | |
} | |
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:17<00:00, 5.87s/it] | |
Resizing embeddings to 50282 | |
Number of trainable parameters: 11841M | |
[RANK 0] Starting training | |
[rollout 0 / 16]: 0%| | 0/16 [00:00<?, ?it/s]
You're using a GPTNeoXTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[rollout 16 / 16]: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [02:34<00:00, 9.65s/it] | |
[RANK 0] Evaluating model | |
[generation sweep 1/1 | eval batch 7/7]: 100%|██████████████████████████████████████████████████████████████████████████████| 7/7 [00:20<00:00, 2.90s/it] | |
[RANK 0] Computing rewards | |
[RANK 0] Summarizing evaluation | |
Evaluation #0 reward/mean: 5.67 | |
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓ | |
┃ prompt ┃ output ┃ reward ┃ | |
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩ | |
│ <|prompter|>Can you tell me about GLaDOS?<|endoftext|><|assistant|> │ Sure, I'd be happy to! │ 5.75 │ | |
│ │ │ │ | |
│ │ GLaDOS is a computer program that was created by the game developer, │ │ | |
│ │ Valve. The purpose of GLaDOS is to serve as a villain in the video │ │ | |
│ │ game Portal, and she appears as a voice in the game's narrator. │ │ | |
│ │ │ │ | |
│ │ GLaDOS is a hostile, malicious computer program that seeks to test │ │ | |
│ │ the player's intelligence and test their willpower. She is often │ │ | |
│ │ sarcastic, taunting the player, and will try to trick or manipulate │ │ | |
│ │ the player into doing her bidding. │ │ | |
│ │ │ │ | |
│ │ GLaDOS is a complex character with a backstory and motivations, and │ │ | |
│ │ her interactions with the player have a significant impact on the │ │ | |
│ │ game's narrative and themes. │ │ | |
│ │ │ │ | |
│ │ Overall, GLaDOS is a memorable character in the Portal universe, and │ │ | |
│ │ her role in the game has made her a popular and beloved figure among │ │ | |
│ │ fans.<|endoftext|> │ │ | |
├───────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼────────┤ | |
│ <|prompter|>What is the chemical symbol for │ The chemical symbol for gold is Au.<|endoftext|> │ 6.74 │ | |
│ gold?<|endoftext|><|assistant|> │ │ │ | |
├───────────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────┼────────┤ | |
│ <|prompter|>If you were the President of the United States, what │ As an AI language model, I do not have personal opinions or beliefs. │ 4.57 │ | |
│ would you do?<|endoftext|><|assistant|> │ As the President of the United States, I would work to ensure the │ │ | |
│ │ safety and well-being of the American people by promoting policies │ │ | |
│ │ that create jobs, improve education and health care, and promote │ │ | |
│ │ international cooperation. I would also work to reduce the impact of │ │ | |
│ │ climate change and promote sustainable development.<|endoftext|> │ │ | |
└───────────────────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────┴────────┘ | |
[losses/total_loss: 0.11 | losses/policy_loss: -0.08 | losses/value_loss: 0.19]: 1%|▏ | 4/640 [04:06<10:39:43, 60.35s/it]
Traceback (most recent call last):
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc | |
return bound(*args, **kwds) | |
TypeError: clip() received an invalid combination of arguments - got (float, float, out=NoneType), but expected one of: | |
* (Tensor min, Tensor max) | |
* (Number min, Number max) | |
During handling of the above exception, another exception occurred: | |
Traceback (most recent call last): | |
File "/home/ubuntu/OA/model/model_training/trainer_rl.py", line 199, in <module> | |
main() | |
File "/home/ubuntu/OA/model/model_training/trainer_rl.py", line 184, in main | |
trainer = trlx.train( | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/trlx/trlx.py", line 126, in train | |
trainer.learn() | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/trlx/trainer/accelerate_base_trainer.py", line 631, in learn | |
self.post_backward_callback() | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/trlx/trainer/accelerate_ppo_trainer.py", line 221, in post_backward_callback | |
self.kl_ctl.update(self.mean_kl, n_steps=self.config.train.batch_size) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/trlx/models/modeling_ppo.py", line 51, in update | |
proportional_error = np.clip(current / self.target - 1, -0.2, 0.2) # ϵₜ | |
File "<__array_function__ internals>", line 200, in clip | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 2180, in clip | |
return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 66, in _wrapfunc | |
return _wrapit(obj, method, *args, **kwds) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/numpy/core/fromnumeric.py", line 43, in _wrapit | |
result = getattr(asarray(obj), method)(*args, **kwds) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/torch/_tensor.py", line 970, in __array__ | |
return self.numpy() | |
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. | |
wandb: Waiting for W&B process to finish... (failed 1). Press Control-C to abort syncing. | |
wandb: \ 0.020 MB of 0.020 MB uploaded (0.000 MB deduped) | |
wandb: Run history: | |
wandb: exp_scores/mean ▁ | |
wandb: exp_scores/running_mean ▁ | |
wandb: exp_scores/running_std ▁ | |
wandb: kl_ctl_value ▁ | |
wandb: learning_rate_group_0 ▁▁▁▁ | |
wandb: losses/policy_loss █▇▄▁ | |
wandb: losses/total_loss █▇▄▁ | |
wandb: losses/value_loss █▇▅▁ | |
wandb: old_values/max ▁▁▁▁ | |
wandb: old_values/mean ▁▁▁▁ | |
wandb: old_values/min ▁▁▁▁ | |
wandb: old_values/std ▁▁▁▁ | |
wandb: padding_percentage ▁▁▁▁ | |
wandb: policy/approx_kl ▁▁▂█ | |
wandb: policy/clipfrac ▁▁▂█ | |
wandb: policy/sqrt_kl ▁ | |
wandb: ratio █▇▆▁ | |
wandb: returns/max ▁▁▁▁ | |
wandb: returns/mean ▁▁▁▁ | |
wandb: returns/min ▁▁▁▁ | |
wandb: returns/std ▁▁▁▁ | |
wandb: reward/mean ▁ | |
wandb: time/backward █▁▁▁ | |
wandb: time/exp ▁ | |
wandb: time/exp_generate ▁ | |
wandb: time/exp_score ▁ | |
wandb: time/forward █▁▁▁ | |
wandb: time/generate ▁ | |
wandb: values/clipfrac ▁▁▁█ | |
wandb: values/max ▁▁▃█ | |
wandb: values/mean ▁▂▃█ | |
wandb: values/min ▁▂▃█ | |
wandb: values/std ▁▁▂█ | |
wandb: values/values_error █▇▅▁ | |
wandb: values/values_mape_error ▂▁▁█ | |
wandb: | |
wandb: Run summary: | |
wandb: exp_scores/mean 5.29421 | |
wandb: exp_scores/running_mean 7.13103 | |
wandb: exp_scores/running_std 2.26968 | |
wandb: exp_scores/std nan | |
wandb: kl_ctl_value 0.1 | |
wandb: learning_rate_group_0 0.0 | |
wandb: losses/policy_loss -0.08002 | |
wandb: losses/total_loss 0.10822 | |
wandb: losses/value_loss 0.18823 | |
wandb: old_values/max 1.08789 | |
wandb: old_values/mean 0.45728 | |
wandb: old_values/min -0.17883 | |
wandb: old_values/std 0.24377 | |
wandb: padding_percentage 0.0 | |
wandb: policy/approx_kl 0.06049 | |
wandb: policy/clipfrac 0.15267 | |
wandb: policy/sqrt_kl 0.01389 | |
wandb: ratio 0.95557 | |
wandb: returns/max 4.0 | |
wandb: returns/mean 0.73096 | |
wandb: returns/min 0.32275 | |
wandb: returns/std 0.67627 | |
wandb: reward/mean 5.67346 | |
wandb: time/backward 58.6278 | |
wandb: time/exp 6.51247 | |
wandb: time/exp_generate 4.38555 | |
wandb: time/exp_score 1.11744 | |
wandb: time/forward 0.1586 | |
wandb: time/generate 20.26932 | |
wandb: values/clipfrac 0.0687 | |
wandb: values/max 1.38965 | |
wandb: values/mean 0.71729 | |
wandb: values/min 0.05487 | |
wandb: values/std 0.26367 | |
wandb: values/values_error 0.35596 | |
wandb: values/values_mape_error 0.5249 | |
wandb: | |
wandb: 🚀 View run trainer_rl/oa-pythia-12b-sft-df/1gpu:main at: https://wandb.ai/toanbku/rlhf/runs/l4eu07k3 | |
wandb: Synced 6 W&B file(s), 1 media file(s), 3 artifact file(s) and 0 other file(s) | |
wandb: Find logs at: ./wandb/run-20230718_152646-l4eu07k3/logs | |
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 56207) of binary: /home/ubuntu/mambaforge/envs/cuda118/bin/python3.10 | |
Traceback (most recent call last): | |
File "/home/ubuntu/mambaforge/envs/cuda118/bin/accelerate", line 8, in <module> | |
sys.exit(main()) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main | |
args.func(args) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/accelerate/commands/launch.py", line 964, in launch_command | |
deepspeed_launcher(args) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/accelerate/commands/launch.py", line 687, in deepspeed_launcher | |
distrib_run.run(args) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run | |
elastic_launch( | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in __call__ | |
return launch_agent(self._config, self._entrypoint, list(args)) | |
File "/home/ubuntu/mambaforge/envs/cuda118/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent | |
raise ChildFailedError( | |
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: | |
============================================================ | |
trainer_rl.py FAILED | |
------------------------------------------------------------ | |
Failures: | |
<NO_OTHER_FAILURES> | |
------------------------------------------------------------ | |
Root Cause (first observed failure): | |
[0]: | |
time : 2023-07-18_15:41:40 | |
host : oa-server-4 | |
rank : 0 (local_rank: 0) | |
exitcode : 1 (pid: 56207) | |
error_file: <N/A> | |
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html | |
============================================================ |
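
The root cause in the traceback above is that `np.clip` receives a `cuda:0` tensor: NumPy tries to convert it with `Tensor.numpy()`, which only works for tensors in host memory. A minimal sketch of one possible workaround follows, assuming `self.mean_kl` is a 0-dim tensor. The class `AdaptiveKLControllerSketch`, the helper `as_float`, and the constructor arguments are hypothetical stand-ins, not the trlx source; only the clipped `current / self.target - 1` expression and its `-0.2, 0.2` bounds come from the log. The clip is written with `min`/`max` so the sketch runs without NumPy or PyTorch installed.

```python
def as_float(value):
    """Return a plain Python float from a number or a tensor-like scalar.

    Anything exposing .item() (for example a 0-dim torch.Tensor, on CPU or
    GPU) is converted through it; .item() copies the scalar to host memory.
    """
    item = getattr(value, "item", None)
    return float(item()) if callable(item) else float(value)


class AdaptiveKLControllerSketch:
    """Hypothetical stand-in for the controller whose update() failed."""

    def __init__(self, init_kl_coef, target, horizon):
        # Assumed attributes; only `target` is visible in the traceback.
        self.value = init_kl_coef
        self.target = target
        self.horizon = horizon

    def update(self, current, n_steps):
        # Convert first, so no CUDA tensor ever reaches the clip.
        current = as_float(current)
        error = current / self.target - 1
        # Same bounds as np.clip(..., -0.2, 0.2) in the traceback.
        proportional_error = max(-0.2, min(0.2, error))
        self.value *= 1 + proportional_error * n_steps / self.horizon
```

An alternative with the same effect would be to convert at the call site, for example by passing `self.mean_kl.item()` to `self.kl_ctl.update(...)`. Whether that matches how the installed trlx version was later fixed is not something this log shows.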