@davidxia

davidxia/events Secret

Last active November 24, 2022 17:40
kubectl describe rayclusters dxia-test
Name: dxia-test
Namespace: default
Labels: controller-tools.k8s.io=1.0
API Version: ray.io/v1alpha1
Kind: RayCluster
Spec:
  Enable In Tree Autoscaling: false
  Head Group Spec:
    Ray Start Params:
      Block: true
      Dashboard - Host: 0.0.0.0
      Num - Cpus: 0
    Replicas: 1
    Service Type: ClusterIP
    Template:
      Spec:
        Containers:
          Image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          Lifecycle:
            Post Start:
              Exec:
                Command:
                  /bin/sh
                  -c
                  mkdir -p /home/ray/notebooks && jupyter notebook --no-browser --ip 0.0.0.0 --port 8081 --MappingKernelManager.cull_idle_timeout=3600 --NotebookApp.disable_check_xsrf=True --NotebookApp.notebook_dir='/home/ray/notebooks' --NotebookApp.token='' > /dev/null 2>&1 &
            Pre Stop:
              Exec:
                Command:
                  /bin/sh
                  -c
                  ray stop
          Name: ray-head
          Ports:
            Container Port: 3000
            Name: vscode
            Protocol: TCP
            Container Port: 6379
            Name: gcs
            Protocol: TCP
            Container Port: 8265
            Name: dashboard
            Protocol: TCP
            Container Port: 8080
            Name: metrics
            Protocol: TCP
            Container Port: 10001
            Name: client
            Protocol: TCP
          Resources:
            Limits:
              Cpu: 15
              Memory: 48Gi
            Requests:
              Cpu: 15
              Memory: 48Gi
          Volume Mounts:
            Mount Path: /tmp/ray
            Name: ray-logs
        Service Account Name: default-editor
        Volumes:
          Empty Dir:
          Name: ray-logs
  Ray Version: 2.0.0
  Worker Group Specs:
    Group Name: worker
    Max Replicas: 100
    Min Replicas: 1
    Ray Start Params:
      Block: true
    Replicas: 1
    Template:
      Spec:
        Containers:
          Image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          Lifecycle:
            Pre Stop:
              Exec:
                Command:
                  /bin/sh
                  -c
                  ray stop
          Name: main
          Ports:
            Container Port: 8080
            Name: metrics
            Protocol: TCP
          Resources:
            Limits:
              Cpu: 15
              Memory: 48Gi
            Requests:
              Cpu: 15
              Memory: 48Gi
          Volume Mounts:
            Mount Path: /tmp/ray
            Name: ray-logs
        Init Containers:
          Command:
            sh
            -c
            until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done
          Image: busybox:1.28
          Name: init
        Service Account Name: default-editor
        Volumes:
          Empty Dir:
          Name: ray-logs
Status:
  Reason: pods "dxia-test-head-md2k2" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
  State: failed
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning PodReconciliationError 117s raycluster-controller pods "dxia-test-head-cxs5k" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-p6wsw" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-l9gtv" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-8dfgz" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-5lr8q" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-4tj59" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 116s raycluster-controller pods "dxia-test-head-6pn4m" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 115s raycluster-controller pods "dxia-test-head-658ww" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 115s raycluster-controller pods "dxia-test-head-rwjjf" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
Warning PodReconciliationError 20s (x7 over 113s) raycluster-controller (combined from similar events): pods "dxia-test-head-md2k2" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
❯ kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
2m45s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-cxs5k" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-p6wsw" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-l9gtv" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-8dfgz" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-5lr8q" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-4tj59" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m44s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-6pn4m" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m43s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-658ww" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
2m43s Warning PodReconciliationError raycluster/dxia-test pods "dxia-test-head-rwjjf" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
15s Warning PodReconciliationError raycluster/dxia-test (combined from similar events): pods "dxia-test-head-6rrxq" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
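
Every event above carries the same ResourceQuota rejection: the head pod asks for 15 CPUs while the namespace quota allows 1. As a minimal sketch, the message can be parsed to list which resources would exceed the quota. The helper names `parse_quota_error` and `over_quota` are hypothetical (not part of kubectl or KubeRay), and the parser assumes whole-number quantities as they appear in this output.

```python
import re

# One message copied verbatim from the events above.
MESSAGE = (
    'pods "dxia-test-head-cxs5k" is forbidden: exceeded quota: quota, '
    "requested: limits.cpu=15,requests.cpu=15, "
    "used: limits.cpu=0,requests.cpu=0, "
    "limited: limits.cpu=1,requests.cpu=1"
)


def parse_quota_error(message):
    """Return {"requested"|"used"|"limited": {resource: int}}.

    Assumes integer quantities (no "500m" or "48Gi" suffixes).
    """
    parsed = {}
    for section in ("requested", "used", "limited"):
        # Each section is "name: k=v,k=v" terminated by ", " or end of string.
        match = re.search(section + r": (\S+?)(?:,\s|$)", message)
        if match is None:
            raise ValueError("missing section: " + section)
        pairs = (item.split("=", 1) for item in match.group(1).split(","))
        parsed[section] = {key: int(value) for key, value in pairs}
    return parsed


def over_quota(message):
    """List resources where used + requested would exceed the hard limit."""
    q = parse_quota_error(message)
    return sorted(
        name
        for name, wanted in q["requested"].items()
        if q["used"].get(name, 0) + wanted > q["limited"][name]
    )


print(over_quota(MESSAGE))
```

Both `limits.cpu` and `requests.cpu` are reported here, so the pod can only be admitted if the quota is raised or the head group's CPU requests and limits are lowered to fit.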
2022-11-23T18:34:00.765Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.775Z INFO controllers.RayCluster Pod Service created successfully {"service name": "dxia-test-head-svc"}
2022-11-23T18:34:00.775Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.775Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.775Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.775Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.775Z DEBUG events Normal {"object": {"kind":"RayCluster","namespace":"dxia-test","name":"dxia-test","uid":"161844f3-806f-451b-9c3a-e4195a699a50","apiVersion":"ray.io/v1alpha1","resourceVersion":"409255872"}, "reason": "Created", "message": "Created service dxia-test-head-svc"}
2022-11-23T18:34:00.775Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.808Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-4grhl\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.808Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.809Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:00.809Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.809Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.809Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.809Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.809Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.822Z ERROR controllers.RayCluster RayCluster update state error {"cluster name": "dxia-test", "error": "Operation cannot be fulfilled on rayclusters.ray.io \"dxia-test\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayClusterReconciler).Reconcile
/workspace/controllers/ray/raycluster_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.828Z ERROR controllers.RayCluster RayCluster update reason error {"cluster name": "dxia-test", "error": "Operation cannot be fulfilled on rayclusters.ray.io \"dxia-test\": the object has been modified; please apply your changes to the latest version and try again"}
github.com/ray-project/kuberay/ray-operator/controllers/ray.(*RayClusterReconciler).Reconcile
/workspace/controllers/ray/raycluster_controller.go:100
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:114
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:311
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.828Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-k2gk9\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.828Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.828Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:00.829Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.829Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.829Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.829Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.829Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.852Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-9jrg5\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.852Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.852Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:00.853Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.853Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.853Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.853Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.853Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.875Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-dcjp4\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.875Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.875Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:00.875Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.875Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.875Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.875Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.875Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.899Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-7f8p7\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:00.916Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:00.916Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:00.916Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:00.916Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:00.916Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:00.916Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:00.916Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:00.941Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-2z4f5\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:01.102Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:01.102Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:01.102Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:01.102Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:01.102Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:01.102Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:01.102Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:01.125Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-t8h9q\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:01.445Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:01.445Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:01.445Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:01.445Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:01.445Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:01.445Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:01.445Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:01.469Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-429h9\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:02.110Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:02.110Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:02.110Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:02.110Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:02.110Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:02.110Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:02.110Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:02.134Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-kdn5w\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:03.415Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:03.415Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:03.415Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:03.415Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:03.415Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:03.415Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:03.415Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:03.441Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-gnrc2\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:06.002Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:06.003Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:06.003Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:06.003Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:06.003Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:06.003Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:06.003Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:06.030Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-zqnj7\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:11.151Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:11.151Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:11.151Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:11.151Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:11.151Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:11.151Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:11.151Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:11.174Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-ls62g\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:21.415Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:21.415Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:21.415Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:21.415Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:21.415Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:21.415Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:21.415Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:21.441Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-hzg9w\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:34:41.922Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:34:41.922Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:34:41.922Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:34:41.922Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:34:41.922Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:34:41.922Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:34:41.922Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:34:41.950Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-955z8\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:35:22.910Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:35:22.910Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:35:22.911Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:35:22.911Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:35:22.911Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:35:22.911Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:35:22.911Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:35:22.937Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-w4shj\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
2022-11-23T18:36:20.340Z INFO controllers.RayCluster Read request instance not found error! {"name": "default/gke-ml-compute-1-cpu-2-d3b58cbe-k26f.172a473f15a335b7"}
2022-11-23T18:36:44.858Z INFO controllers.RayCluster reconciling RayCluster {"cluster name": "dxia-test"}
2022-11-23T18:36:44.858Z INFO controllers.RayCluster reconcileServices {"headService service found": "dxia-test-head-svc"}
2022-11-23T18:36:44.858Z INFO controllers.RayCluster reconcilePods {"creating head pod for cluster": "dxia-test"}
2022-11-23T18:36:44.859Z INFO RayCluster-Controller Setting pod namespaces {"namespace": "dxia-test"}
2022-11-23T18:36:44.859Z INFO controllers.RayCluster head pod labels {"labels": {"app.kubernetes.io/created-by":"kuberay-operator","app.kubernetes.io/name":"kuberay","ray.io/cluster":"dxia-test","ray.io/cluster-dashboard":"dxia-test-dashboard","ray.io/group":"headgroup","ray.io/identifier":"dxia-test-head","ray.io/is-ray-node":"yes","ray.io/node-type":"head"}}
2022-11-23T18:36:44.859Z INFO RayCluster-Controller Head pod container with index 0 identified as Ray container.
2022-11-23T18:36:44.859Z INFO controllers.RayCluster createHeadPod {"head pod with name": "dxia-test-head-"}
2022-11-23T18:36:44.884Z ERROR controller.raycluster-controller Reconciler error {"reconciler group": "ray.io", "reconciler kind": "RayCluster", "name": "dxia-test", "namespace": "dxia-test", "error": "pods \"dxia-test-head-bs5fn\" is forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.1/pkg/internal/controller/controller.go:227
kubectl get rayclusters dxia-test -o yaml
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: dxia-test
  namespace: dxia-test
spec:
  enableInTreeAutoscaling: false
  headGroupSpec:
    rayStartParams:
      block: "true"
      dashboard-host: 0.0.0.0
      num-cpus: "0"
    replicas: 1
    serviceType: ClusterIP
    template:
      spec:
        containers:
        - image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          lifecycle:
            postStart:
              exec:
                command:
                - /bin/sh
                - -c
                - mkdir -p /home/ray/notebooks && jupyter notebook --no-browser --ip
                  0.0.0.0 --port 8081 --MappingKernelManager.cull_idle_timeout=3600
                  --NotebookApp.disable_check_xsrf=True --NotebookApp.notebook_dir='/home/ray/notebooks'
                  --NotebookApp.token='' > /dev/null 2>&1 &
            preStop:
              exec:
                command:
                - /bin/sh
                - -c
                - ray stop
          name: ray-head
          ports:
          - containerPort: 3000
            name: vscode
            protocol: TCP
          - containerPort: 6379
            name: gcs
            protocol: TCP
          - containerPort: 8265
            name: dashboard
            protocol: TCP
          - containerPort: 8080
            name: metrics
            protocol: TCP
          - containerPort: 10001
            name: client
            protocol: TCP
          resources:
            limits:
              cpu: "15"
              memory: 48Gi
            requests:
              cpu: "15"
              memory: 48Gi
          volumeMounts:
          - mountPath: /tmp/ray
            name: ray-logs
        serviceAccountName: default-editor
        volumes:
        - emptyDir: {}
          name: ray-logs
  rayVersion: 2.0.0
  workerGroupSpecs:
  - groupName: worker
    maxReplicas: 100
    minReplicas: 1
    rayStartParams:
      block: "true"
    replicas: 1
    template:
      spec:
        containers:
        - image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          lifecycle:
            preStop:
              exec:
                command:
                - /bin/sh
                - -c
                - ray stop
          name: main
          ports:
          - containerPort: 8080
            name: metrics
            protocol: TCP
          resources:
            limits:
              cpu: "15"
              memory: 48Gi
            requests:
              cpu: "15"
              memory: 48Gi
          volumeMounts:
          - mountPath: /tmp/ray
            name: ray-logs
        initContainers:
        - command:
          - sh
          - -c
          - until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local;
            do echo waiting for myservice; sleep 2; done
          image: busybox:1.28
          name: init
        serviceAccountName: default-editor
        volumes:
        - emptyDir: {}
          name: ray-logs
status:
  reason: 'pods forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15,
    used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1'
  state: failed
kubectl --context gke_kubeflow-platform_europe-west4-b_ml-compute-1 -n dxia-test describe rayclusters dxia-test
Name: dxia-test
Namespace: dxia-test
API Version: ray.io/v1alpha1
Kind: RayCluster
Spec:
  Enable In Tree Autoscaling: false
  Head Group Spec:
    Ray Start Params:
      Block: true
      Dashboard - Host: 0.0.0.0
      Num - Cpus: 0
    Replicas: 1
    Service Type: ClusterIP
    Template:
      Spec:
        Containers:
          Image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          Lifecycle:
            Post Start:
              Exec:
                Command:
                  /bin/sh
                  -c
                  mkdir -p /home/ray/notebooks && jupyter notebook --no-browser --ip 0.0.0.0 --port 8081 --MappingKernelManager.cull_idle_timeout=3600 --NotebookApp.disable_check_xsrf=True --NotebookApp.notebook_dir='/home/ray/notebooks' --NotebookApp.token='' > /dev/null 2>&1 &
            Pre Stop:
              Exec:
                Command:
                  /bin/sh
                  -c
                  ray stop
          Name: ray-head
          Ports:
            Container Port: 3000
            Name: vscode
            Protocol: TCP
            Container Port: 6379
            Name: gcs
            Protocol: TCP
            Container Port: 8265
            Name: dashboard
            Protocol: TCP
            Container Port: 8080
            Name: metrics
            Protocol: TCP
            Container Port: 10001
            Name: client
            Protocol: TCP
          Resources:
            Limits:
              Cpu: 15
              Memory: 48Gi
            Requests:
              Cpu: 15
              Memory: 48Gi
          Volume Mounts:
            Mount Path: /tmp/ray
            Name: ray-logs
        Service Account Name: default-editor
        Volumes:
          Empty Dir:
          Name: ray-logs
  Ray Version: 2.0.0
  Worker Group Specs:
    Group Name: worker
    Max Replicas: 100
    Min Replicas: 1
    Ray Start Params:
      Block: true
    Replicas: 1
    Template:
      Spec:
        Containers:
          Image: eu.gcr.io/kubeflow-platform/hendrix-ray:0.2.3.dev1-py38-gpu
          Lifecycle:
            Pre Stop:
              Exec:
                Command:
                  /bin/sh
                  -c
                  ray stop
          Name: main
          Ports:
            Container Port: 8080
            Name: metrics
            Protocol: TCP
          Resources:
            Limits:
              Cpu: 15
              Memory: 48Gi
            Requests:
              Cpu: 15
              Memory: 48Gi
          Volume Mounts:
            Mount Path: /tmp/ray
            Name: ray-logs
        Init Containers:
          Command:
            sh
            -c
            until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done
          Image: busybox:1.28
          Name: init
        Service Account Name: default-editor
        Volumes:
          Empty Dir:
          Name: ray-logs
Status:
  Reason: pods forbidden: exceeded quota: quota, requested: limits.cpu=15,requests.cpu=15, used: limits.cpu=0,requests.cpu=0, limited: limits.cpu=1,requests.cpu=1
  State: failed
Events:
  Type    Reason   Age    From                   Message
  ----    ------   ----   ----                   -------
  Normal  Created  2m16s  raycluster-controller  Created service dxia-test-head-svc
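
Note: the Status.Reason string above packs the quota arithmetic into one line. The following is a minimal sketch, not part of KubeRay or kubectl, of how that string could be split to see which quota entries the head pod exceeds. The helper names (parse_quota_reason, cpu_cores, exceeded) are hypothetical, and the quantity parsing assumes CPU values only (plain numbers or an "m" millicore suffix).

```python
import re

# Reason string copied from the RayCluster status shown above.
REASON = (
    "pods forbidden: exceeded quota: quota, "
    "requested: limits.cpu=15,requests.cpu=15, "
    "used: limits.cpu=0,requests.cpu=0, "
    "limited: limits.cpu=1,requests.cpu=1"
)


def parse_quota_reason(reason):
    """Split the message into requested/used/limited maps of name -> quantity."""
    parsed = {}
    for section in ("requested", "used", "limited"):
        match = re.search(rf"\b{section}: (\S+)", reason)
        pairs = match.group(1).strip(",").split(",") if match else []
        parsed[section] = dict(pair.split("=", 1) for pair in pairs)
    return parsed


def cpu_cores(quantity):
    """Convert a CPU quantity such as "15" or "500m" to cores (assumption: CPU only)."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def exceeded(parsed):
    """Return the quota entries where used + requested is above the limit."""
    over = []
    for name, wanted in parsed["requested"].items():
        limit = parsed["limited"].get(name)
        if limit is None:
            continue
        used = parsed["used"].get(name, "0")
        if cpu_cores(used) + cpu_cores(wanted) > cpu_cores(limit):
            over.append(name)
    return over


if __name__ == "__main__":
    info = parse_quota_reason(REASON)
    for name in exceeded(info):
        print(name, "requested", info["requested"][name], "limit", info["limited"][name])
```

For this status, both limits.cpu and requests.cpu are over: the head pod asks for 15 CPUs while the namespace quota allows 1.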