Jiaxin Shan Jeffwan

  • Bytedance
  • Seattle, WA
@Jeffwan
Jeffwan / dcp-metrics-included.csv
Created March 22, 2024 21:29
dcgm metric configuration
# Format
# If line starts with a '#' it is considered a comment
# DCGM FIELD, Prometheus metric type, help message
# Clocks
DCGM_FI_DEV_SM_CLOCK, gauge, SM clock frequency (in MHz).
DCGM_FI_DEV_MEM_CLOCK, gauge, Memory clock frequency (in MHz).
# Temperature
DCGM_FI_DEV_MEMORY_TEMP, gauge, Memory temperature (in C).
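The format described in the comments above (one `DCGM FIELD, Prometheus metric type, help message` triple per line, with `#` lines treated as comments) can be read with a few lines of Python. This is only a minimal sketch for checking such a file by hand; `MetricSpec` and `parse_metrics_config` are hypothetical names introduced here, not part of DCGM or its exporter, and the sample text simply repeats the lines shown in the preview.

```python
from typing import List, NamedTuple


class MetricSpec(NamedTuple):
    field: str        # DCGM field identifier, e.g. DCGM_FI_DEV_SM_CLOCK
    metric_type: str  # Prometheus metric type, e.g. gauge
    help_text: str    # free-form help message


def parse_metrics_config(text: str) -> List[MetricSpec]:
    """Parse 'FIELD, type, help' lines, skipping blanks and '#' comments."""
    specs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first two commas only, so a help message may contain commas.
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) != 3:
            raise ValueError(f"expected 3 comma-separated columns, got: {raw!r}")
        specs.append(MetricSpec(*parts))
    return specs


# Sample input copied from the preview above.
SAMPLE = """\
# Clocks
DCGM_FI_DEV_SM_CLOCK, gauge, SM clock frequency (in MHz).
DCGM_FI_DEV_MEM_CLOCK, gauge, Memory clock frequency (in MHz).
# Temperature
DCGM_FI_DEV_MEMORY_TEMP, gauge, Memory temperature (in C).
"""

specs = parse_metrics_config(SAMPLE)
for spec in specs:
    print(spec.field, spec.metric_type)
```

Raising on a line that does not have three columns makes a malformed entry visible immediately instead of silently dropping the metric.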
Jeffwan / guidance.md
Last active June 16, 2022 06:55
workspace-operator example

Steps to try the workspace operator

  1. Make sure the ray-system namespace exists. If it does not, run kubectl create ns ray-system
  2. kubectl create -f ray.io_workspaces.yaml
  3. kubectl apply -f workspace-operator.yaml
  4. Create a Jupyter notebook: kubectl apply -f ray.io_v1alpha1_workspace.yaml
  5. Use the NodePort or port-forward the service, then open nodeip:nodeport/kuberay/workspace in a browser.

Note: the operator image and the Jupyter image can be used directly; I have uploaded both to my personal Docker Hub. I will try to finish the OSS process soon.

Jeffwan / spark-ray-redis.py
Last active April 11, 2021 05:03
spark-ray-redis.py
import os
import ray
import raydp
HEAD_SERVICE_IP_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_HOST"
head_service_ip = os.environ[HEAD_SERVICE_IP_ENV]
ray.init(address=f"{head_service_ip}:6379")
Jeffwan / ray-xgboost-auto.py
Created April 8, 2021 21:10
ray-xgboost-auto.py
import os
import ray
from xgboost_ray import RayDMatrix, RayParams, train
from sklearn.datasets import load_breast_cancer
ray.init(address="auto")
train_x, train_y = load_breast_cancer(return_X_y=True)
train_set = RayDMatrix(train_x, train_y)
Jeffwan / raydp-spark-remote.py
Last active April 8, 2021 22:38
raydp-spark-remote.py
import os
import ray
import raydp
HEAD_SERVICE_IP_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_HOST"
HEAD_SERVICE_CLIENT_PORT_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_PORT_CLIENT"
head_service_ip = os.environ[HEAD_SERVICE_IP_ENV]
client_port = os.environ[HEAD_SERVICE_CLIENT_PORT_ENV]
import argparse
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
import os
import ray
from ray import tune
from ray.util.sgd.tf.tf_trainer import TFTrainer, TFTrainable
Jeffwan / ray-xgboost.py
Last active April 7, 2021 23:13
ray-xgboost.py
import os
import ray
from xgboost_ray import RayDMatrix, RayParams, train
from sklearn.datasets import load_breast_cancer
HEAD_SERVICE_IP_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_HOST"
HEAD_SERVICE_CLIENT_PORT_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_PORT_CLIENT"
head_service_ip = os.environ[HEAD_SERVICE_IP_ENV]
Jeffwan / raydp-spark.py
Last active April 8, 2021 22:34
raydp-spark.py
import os
import ray
import raydp
HEAD_SERVICE_IP_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_HOST"
HEAD_SERVICE_CLIENT_PORT_ENV = "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_PORT_CLIENT"
head_service_ip = os.environ[HEAD_SERVICE_IP_ENV]
client_port = os.environ[HEAD_SERVICE_CLIENT_PORT_ENV]
ray.util.connect(f"{head_service_ip}:{client_port}")
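The snippets above read the head service host and client port from the environment variables that Kubernetes injects for a Service (`<NAME>_SERVICE_HOST` and `<NAME>_SERVICE_PORT_<PORTNAME>`). A minimal sketch of that lookup as a reusable helper follows, assuming the same `EXAMPLE_CLUSTER_RAY_HEAD` prefix; `ray_client_address` is a hypothetical name, and the IP and port in `fake_env` are made-up sample values so the block runs outside a cluster.

```python
import os
from typing import Mapping


def ray_client_address(
    service_prefix: str,
    port_name: str = "CLIENT",
    env: Mapping[str, str] = os.environ,
) -> str:
    """Build 'host:port' from Kubernetes-style service environment variables."""
    host_var = f"{service_prefix}_SERVICE_HOST"
    port_var = f"{service_prefix}_SERVICE_PORT_{port_name}"
    # Report every missing variable at once rather than failing on the first KeyError.
    missing = [name for name in (host_var, port_var) if name not in env]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return f"{env[host_var]}:{env[port_var]}"


# Made-up values standing in for what the cluster would inject.
fake_env = {
    "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_HOST": "10.0.0.12",
    "EXAMPLE_CLUSTER_RAY_HEAD_SERVICE_PORT_CLIENT": "10001",
}

address = ray_client_address("EXAMPLE_CLUSTER_RAY_HEAD", env=fake_env)
print(address)
```

Inside a pod, the same call with the default `env` would return a string that can be passed to `ray.util.connect(...)` as in the snippet above. Outside the cluster the variables are absent, which is why the helper raises a descriptive error instead of a bare `KeyError`.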
Jeffwan / gist:5f11afe8593950ed444c959160eb2822
Created August 28, 2020 17:00
Your access to this site has been restricted
```
curl -v https://github.com/kubeflow/manifests/archive/v1.0.2.tar.gz
* Trying 192.30.255.113...
* TCP_NODELAY set
* Connected to github.com (192.30.255.113) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
```

```
WARN[0007] Encountered error applying application application: (kubeflow.error): Code 500 with message: Apply.Run : unable to recognize "/tmp/kout799748204": no matches for kind "Application" in version "app.k8s.io/v1beta1" filename="kustomize/kustomize.go:266"
WARN[0007] Will retry in 3 seconds. filename="kustomize/kustomize.go:267"
```