Created March 15, 2022 17:53
(PT-nightly-FSDP) ubuntu@ip-172-31-37-85:~/serve/examples/Huggingface_Transformers$ curl -X POST http://127.0.0.1:8080/predictions/Textgeneration -T Seq_classification_artifacts/sample_text_captum_input.txt | |
Bloomberg has decided to publish a new report on the global economy.From(PT-nightly-FSDP) | |
(PT-nightly-FSDP) ubuntu@ip-172-31-37-85:~/serve/examples/Huggingface_Transformers$ WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance. | |
2022-03-15T17:48:09,043 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin... | |
2022-03-15T17:48:09,098 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager... | |
2022-03-15T17:48:09,233 [INFO ] main org.pytorch.serve.ModelServer - | |
Torchserve version: 0.5.3 | |
TS Home: /home/ubuntu/anaconda3/envs/PT-nightly-FSDP/lib/python3.8/site-packages | |
Current directory: /home/ubuntu/serve/examples/Huggingface_Transformers | |
Temp directory: /tmp | |
Number of GPUs: 4 | |
Number of CPUs: 48 | |
Max heap size: 30688 M | |
Python executable: /home/ubuntu/anaconda3/envs/PT-nightly-FSDP/bin/python3.8 | |
Config file: N/A | |
Inference address: http://127.0.0.1:8080 | |
Management address: http://127.0.0.1:8081 | |
Metrics address: http://127.0.0.1:8082 | |
Model Store: /home/ubuntu/serve/examples/Huggingface_Transformers/model_store | |
Initial Models: N/A | |
Log dir: /home/ubuntu/serve/examples/Huggingface_Transformers/logs | |
Metrics dir: /home/ubuntu/serve/examples/Huggingface_Transformers/logs | |
Netty threads: 0 | |
Netty client threads: 0 | |
Default workers per model: 4 | |
Blacklist Regex: N/A | |
Maximum Response Size: 6553500 | |
Maximum Request Size: 6553500 | |
Limit Maximum Image Pixels: true | |
Prefer direct buffer: false | |
Allowed Urls: [file://.*|http(s)?://.*] | |
Custom python dependency for model allowed: false | |
Metrics report format: prometheus | |
Enable metrics API: true | |
Workflow Store: /home/ubuntu/serve/examples/Huggingface_Transformers/model_store | |
Model config: N/A | |
2022-03-15T17:48:09,242 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel. | |
2022-03-15T17:48:09,297 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080 | |
2022-03-15T17:48:09,298 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel. | |
2022-03-15T17:48:09,298 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081 | |
2022-03-15T17:48:09,299 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel. | |
2022-03-15T17:48:09,299 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082 | |
Model server started. | |
2022-03-15T17:48:10,157 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,158 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.67612075805664|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,158 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.0098114013672|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,158 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,159 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,159 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,159 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,159 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,159 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,160 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,161 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:187895.44921875|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,161 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1493.96875|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:48:10,161 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:1.7|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366490 | |
2022-03-15T17:49:10,126 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:16.7|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.6761131286621|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.00981903076172|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,127 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,128 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,129 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:187889.33984375|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,129 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1500.078125|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:49:10,129 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:1.7|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366550 | |
2022-03-15T17:50:10,123 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.6761131286621|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.00981903076172|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,124 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,125 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:187893.00390625|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,126 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1496.40625|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:50:10,126 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:1.7|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366610 | |
2022-03-15T17:51:10,118 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,118 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.67610549926758|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,118 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.00982666015625|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,118 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:0.0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,119 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:187896.70703125|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:1492.7109375|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:10,120 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:1.7|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366670 | |
2022-03-15T17:51:27,951 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model Textgeneration | |
2022-03-15T17:51:27,952 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model Textgeneration | |
2022-03-15T17:51:27,952 [INFO ] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - Model Textgeneration loaded. | |
2022-03-15T17:51:27,953 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - updateModel: Textgeneration, count: 1 | |
2022-03-15T17:51:27,958 [DEBUG] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/ubuntu/anaconda3/envs/PT-nightly-FSDP/bin/python3.8, /home/ubuntu/anaconda3/envs/PT-nightly-FSDP/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /tmp/.ts.sock.9000] | |
2022-03-15T17:51:28,628 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Listening on port: /tmp/.ts.sock.9000 | |
2022-03-15T17:51:28,628 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - [PID]48672 | |
2022-03-15T17:51:28,629 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Torch worker started. | |
2022-03-15T17:51:28,629 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Python runtime: 3.8.12 | |
2022-03-15T17:51:28,629 [DEBUG] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-Textgeneration_1.0 State change null -> WORKER_STARTED | |
2022-03-15T17:51:28,632 [INFO ] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /tmp/.ts.sock.9000 | |
2022-03-15T17:51:28,637 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Connection accepted: /tmp/.ts.sock.9000. | |
2022-03-15T17:51:28,639 [INFO ] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1647366688639 | |
2022-03-15T17:51:28,659 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - model_name: Textgeneration, batchSize: 1 | |
2022-03-15T17:51:29,363 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Transformers version 4.11.0 | |
2022-03-15T17:51:46,799 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Transformer model from path /tmp/models/950aedf0a0bd4e5293c9bee6640e5199 loaded successfully | |
2022-03-15T17:51:46,800 [INFO ] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 18141 | |
2022-03-15T17:51:46,801 [DEBUG] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-Textgeneration_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED | |
2022-03-15T17:51:46,801 [INFO ] W-9000-Textgeneration_1.0 TS_METRICS - W-9000-Textgeneration_1.0.ms:18845|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366706 | |
2022-03-15T17:51:46,801 [INFO ] W-9000-Textgeneration_1.0 TS_METRICS - WorkerThreadTime.ms:21|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:51:46,804 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /127.0.0.1:56052 "POST /models?model_name=Textgeneration&url=Textgeneration.mar&batch_size=1&max_batch_delay=5000&initial_workers=1&synchronous=true HTTP/1.1" 200 26065 | |
2022-03-15T17:51:46,805 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.20067977905273|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.4852523803711|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.2|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:9.927857568336753|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,111 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1500|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:8.948308954927526|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1352|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:8.948308954927526|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1352|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:8.948308954927526|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1352|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,112 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:181765.68359375|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,113 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:7575.734375|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:10,113 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:4.9|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366730 | |
2022-03-15T17:52:25,234 [INFO ] epollEventLoopGroup-3-2 ACCESS_LOG - /127.0.0.1:43170 "POST /predictions/my_tc HTTP/1.1" 404 4 | |
2022-03-15T17:52:25,234 [INFO ] epollEventLoopGroup-3-2 TS_METRICS - Requests4XX.Count:1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:52:57,290 [INFO ] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - Flushing req. to backend at: 1647366777290 | |
2022-03-15T17:52:57,291 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Backend received inference at: 1647366777 | |
2022-03-15T17:52:57,291 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Received text: 'Bloomberg has decided to publish a new report on the global economy.' | |
2022-03-15T17:52:57,291 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - /home/ubuntu/anaconda3/envs/PT-nightly-FSDP/lib/python3.8/site-packages/transformers/configuration_utils.py:336: UserWarning: Passing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`. | |
2022-03-15T17:52:57,292 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - warnings.warn( | |
2022-03-15T17:52:57,292 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`. | |
2022-03-15T17:52:57,295 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - /home/ubuntu/anaconda3/envs/PT-nightly-FSDP/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:2211: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert). | |
2022-03-15T17:52:57,295 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - warnings.warn( | |
2022-03-15T17:52:57,296 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation. | |
2022-03-15T17:52:57,296 [WARN ] W-9000-Textgeneration_1.0-stderr MODEL_LOG - Input length of input_ids is 150, but ``max_length`` is set to 50.This can lead to unexpected behavior. You should consider increasing ``config.max_length`` or ``max_length``. | |
2022-03-15T17:52:57,325 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Generated text: '['Bloomberg has decided to publish a new report on the global economy.From']' | |
2022-03-15T17:52:57,325 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_LOG - Generated text ['Bloomberg has decided to publish a new report on the global economy.From'] | |
2022-03-15T17:52:57,325 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_METRICS - HandlerTime.Milliseconds:33.68|#ModelName:Textgeneration,Level:Model|#hostname:ip-172-31-37-85,requestID:3fbd2201-ce14-4063-a12d-5aa076394135,timestamp:1647366777 | |
2022-03-15T17:52:57,325 [INFO ] W-9000-Textgeneration_1.0-stdout MODEL_METRICS - PredictionTime.Milliseconds:33.74|#ModelName:Textgeneration,Level:Model|#hostname:ip-172-31-37-85,requestID:3fbd2201-ce14-4063-a12d-5aa076394135,timestamp:1647366777 | |
2022-03-15T17:52:57,325 [INFO ] W-9000-Textgeneration_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 34 | |
2022-03-15T17:52:57,326 [INFO ] W-9000-Textgeneration_1.0 ACCESS_LOG - /127.0.0.1:43172 "POST /predictions/Textgeneration HTTP/1.1" 200 37 | |
2022-03-15T17:52:57,326 [INFO ] W-9000-Textgeneration_1.0 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:52:57,326 [DEBUG] W-9000-Textgeneration_1.0 org.pytorch.serve.job.Job - Waiting time ns: 147579, Backend time ns: 36311344 | |
2022-03-15T17:52:57,326 [INFO ] W-9000-Textgeneration_1.0 TS_METRICS - QueueTime.ms:0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:52:57,326 [INFO ] W-9000-Textgeneration_1.0 TS_METRICS - WorkerThreadTime.ms:2|#Level:Host|#hostname:ip-172-31-37-85,timestamp:null | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - DiskAvailable.Gigabytes:247.20066833496094|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - DiskUsage.Gigabytes:140.4852638244629|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - DiskUtilization.Percent:36.2|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:10.21907472367463|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1544|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:9.02773181547422|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1364|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:9.02773181547422|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1364|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUtilization.Percent:9.04096895889867|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUMemoryUsed.Megabytes:1366|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:0|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:1|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:2|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:0|#Level:Host,device_id:3|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - MemoryAvailable.Megabytes:181728.546875|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUsed.Megabytes:7612.8671875|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
2022-03-15T17:53:10,114 [INFO ] pool-3-thread-1 TS_METRICS - MemoryUtilization.Percent:4.9|#Level:Host|#hostname:ip-172-31-37-85,timestamp:1647366790 | |
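The TS_METRICS lines above all follow the same shape: `Name.Unit:value|#dimensions|#hostname:...,timestamp:...`. A minimal sketch of pulling them apart, for example to compare GPUMemoryUsed before and after the model load. The function name `parse_ts_metric` is hypothetical and not part of TorchServe; the format is inferred only from the lines in this log, and the trailing ` | |` stripped below is residue of how this page was captured, not part of the real log.

```python
from typing import Optional


def parse_ts_metric(line: str) -> Optional[dict]:
    """Parse one TS_METRICS log line into a flat dict (hypothetical helper)."""
    marker = "TS_METRICS - "
    if marker not in line:
        return None  # not a metrics line (e.g. MODEL_LOG or ACCESS_LOG)
    # Drop the log prefix, then any trailing page-capture residue (" | |").
    payload = line.split(marker, 1)[1].rstrip(" |\r\n")
    parts = payload.split("|#")
    if len(parts) != 3:
        return None
    metric, dims, tail = parts
    # "GPUMemoryUsed.Megabytes:1544" -> name, unit, value.
    # rpartition keeps dots inside names such as "W-9000-Textgeneration_1.0".
    name_unit, _, raw_value = metric.partition(":")
    name, _, unit = name_unit.rpartition(".")
    # "Level:Host,device_id:0" and "hostname:...,timestamp:..." -> key/value pairs.
    fields = dict(
        item.split(":", 1) for item in (dims + "," + tail).split(",") if ":" in item
    )
    return {"name": name, "unit": unit, "value": float(raw_value), **fields}


SAMPLE = (
    "2022-03-15T17:53:10,113 [INFO ] pool-3-thread-1 TS_METRICS - "
    "GPUMemoryUsed.Megabytes:1544|#Level:Host,device_id:0|"
    "#hostname:ip-172-31-37-85,timestamp:1647366790 | |"
)
parsed = parse_ts_metric(SAMPLE)
```

Values stay as strings except `value`, because some lines in this log carry `timestamp:null`.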
Every 0.1s: nvidia-smi ip-172-31-37-85: Tue Mar 15 17:53:32 2022

Tue Mar 15 17:53:32 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.142.00   Driver Version: 450.142.00   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1B.0 Off |                    0 |
| N/A   38C    P0    26W /  70W |   1544MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            On   | 00000000:00:1C.0 Off |                    0 |
| N/A   37C    P0    27W /  70W |   1364MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            On   | 00000000:00:1D.0 Off |                    0 |
| N/A   39C    P0    27W /  70W |   1364MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   37C    P0    27W /  70W |   1366MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
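For reference, the two requests that succeeded in the log (the model registration on the management port 8081 and the prediction on the inference port 8080) can be reproduced from Python. This is a rough sketch using only the standard library; the helper names `build_register_url`, `build_prediction_url` and `post` are hypothetical, and the parameter values are copied from the ACCESS_LOG lines above rather than from any recommended configuration.

```python
import urllib.parse
import urllib.request

# Addresses as reported in the startup banner of this log.
MANAGEMENT = "http://127.0.0.1:8081"
INFERENCE = "http://127.0.0.1:8080"


def build_register_url(model_name: str = "Textgeneration",
                       mar_file: str = "Textgeneration.mar") -> str:
    """URL for the registration request seen in the ACCESS_LOG (hypothetical helper)."""
    query = urllib.parse.urlencode({
        "model_name": model_name,
        "url": mar_file,
        "batch_size": 1,
        "max_batch_delay": 5000,
        "initial_workers": 1,
        "synchronous": "true",
    })
    return f"{MANAGEMENT}/models?{query}"


def build_prediction_url(model_name: str) -> str:
    """URL for an inference request; the name must match the registered model."""
    return f"{INFERENCE}/predictions/{model_name}"


def post(url: str, body: bytes = b"", timeout: float = 60.0):
    """Send a POST and return (status, response bytes).

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read()
```

With a server running, `post(build_register_url())` followed by `post(build_prediction_url("Textgeneration"), open("Seq_classification_artifacts/sample_text_captum_input.txt", "rb").read())` would mirror the curl call at the top. The 404 in the log came from a request to `/predictions/my_tc`, a name that was never registered.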