K8s Setup

install minikube, then start it with enough CPUs and 2 extra disks (for the 2 OSDs):

$ minikube start --cpus 6 --extra-disks=2 --driver=kvm2

install kubectl on the host, and point the local docker client at the minikube docker daemon:

$ eval $(minikube docker-env)
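as a quick sanity check (not part of the original flow), verify that kubectl can reach the cluster:

$ kubectl get nodes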

Install Jaeger

  • install cert-manager:
$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.11.0/cert-manager.yaml
  • install jaeger in the observability namespace:
$ kubectl create namespace observability
$ kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.42.0/jaeger-operator.yaml -n observability
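  • before moving on, it is worth verifying that the operator pod reaches the Running state:
$ kubectl get pods -n observability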

Create Jaeger Instance

  • create a simple all-in-one instance:
$ cat << EOF | kubectl apply -f -
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simplest
  namespace: observability
EOF
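  • verify that the instance was created and its all-in-one pod is running (the label value follows from the instance name above):
$ kubectl get jaegers -n observability
$ kubectl get pods -n observability -l app.kubernetes.io/instance=simplest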
  • expose the query api as a NodePort service:
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: simplest-query-external
  namespace: observability
  labels:
    app: jaeger
    app.kubernetes.io/component: service-query
    app.kubernetes.io/instance: simplest
    app.kubernetes.io/managed-by: jaeger-operator
    app.kubernetes.io/name: simplest-query
    app.kubernetes.io/part-of: jaeger
spec:
  ports:
  - name: http-query
    port: 16686
    protocol: TCP
    targetPort: 16686
  selector:
    app.kubernetes.io/component: all-in-one
    app.kubernetes.io/instance: simplest
    app.kubernetes.io/managed-by: jaeger-operator
    app.kubernetes.io/name: simplest
    app.kubernetes.io/part-of: jaeger
    app: jaeger
  sessionAffinity: None
  type: NodePort
EOF
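  • verify that the service was created and got a node port assigned:
$ kubectl get svc simplest-query-external -n observability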

Install Rook

  • make sure there are disks without a filesystem:
$ minikube ssh lsblk
  • download and install the rook operator (use v1.10):
$ git clone -b release-1.10 https://github.com/rook/rook.git
$ cd rook/deploy/examples
$ kubectl create -f crds.yaml -f common.yaml

in operator.yaml, increase the log level:

data:
  # The logging level for the operator: ERROR | WARNING | INFO | DEBUG
  ROOK_LOG_LEVEL: "DEBUG"

then apply the operator:

$ kubectl create -f operator.yaml
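verify that the operator pod is running (assuming the standard rook labels):

$ kubectl -n rook-ceph get pods -l app=rook-ceph-operator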

Start a Ceph Cluster with Object Store

use a developer build of Ceph that supports tracing. To do that, edit cluster-test.yaml and replace the line:

image: quay.io/ceph/ceph:v17

with:

image: quay.ceph.io/ceph-ci/ceph:wip-yuval-full-putobj-trace

add the following Jaeger arguments to the ConfigMap in cluster-test.yaml, under the [global] section:

jaeger_tracing_enable = true
jaeger_agent_port = 6831
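for reference, the relevant part of the ConfigMap should then look roughly like this (a sketch, assuming the rook-config-override ConfigMap that ships in cluster-test.yaml):

kind: ConfigMap
apiVersion: v1
metadata:
  name: rook-config-override
  namespace: rook-ceph
data:
  config: |
    [global]
    # ... existing settings from cluster-test.yaml ...
    jaeger_tracing_enable = true
    jaeger_agent_port = 6831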

add annotations to the cluster spec, so that Jaeger will inject an agent sidecar into the OSD pods:

spec:
  annotations:
    osd:
      sidecar.jaegertracing.io/inject: "true"

and apply the cluster:

$ kubectl create -f cluster-test.yaml
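to watch the cluster come up, check the CephCluster status and confirm that the OSD pods got the extra agent container; the rook toolbox (toolbox.yaml in the same examples directory) can also be deployed to run ceph status directly:

$ kubectl -n rook-ceph get cephcluster
$ kubectl -n rook-ceph get pods -l app=rook-ceph-osd
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status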

add annotations to the object store spec in object-test.yaml, so that Jaeger will inject an agent sidecar into the RGW pods:

gateway:
  annotations:
    sidecar.jaegertracing.io/inject: "true"

then start the object store:

$ kubectl create -f object-test.yaml
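if the sidecar was injected, the RGW pod should show two ready containers (2/2):

$ kubectl -n rook-ceph get pods -l app=rook-ceph-rgw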

Test

  • create a storage class and a bucket:
$ kubectl create -f storageclass-bucket-delete.yaml
$ kubectl create -f object-bucket-claim-delete.yaml
  • create a service so that the RGW can be accessed from outside of k8s:
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-rgw-my-store-external
  namespace: rook-ceph
  labels:
    app: rook-ceph-rgw
    rook_cluster: rook-ceph
    rook_object_store: my-store
spec:
  ports:
  - name: rgw
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: rook-ceph-rgw
    rook_cluster: rook-ceph
    rook_object_store: my-store
  sessionAffinity: None
  type: NodePort
EOF
  • fetch the URL that allows access to the RGW service from the host running the minikube VM:
$ export AWS_URL=$(minikube service --url rook-ceph-rgw-my-store-external -n rook-ceph)
  • fetch the user credentials and the bucket name:
$ export AWS_ACCESS_KEY_ID=$(kubectl -n default get secret ceph-delete-bucket -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 --decode)
$ export AWS_SECRET_ACCESS_KEY=$(kubectl -n default get secret ceph-delete-bucket -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 --decode)
$ export BUCKET_NAME=$(kubectl get objectbucketclaim ceph-delete-bucket -o jsonpath='{.spec.bucketName}')
  • now use them to upload an object:
$ echo "hello world" > hello.txt
$ aws --endpoint-url "$AWS_URL" s3 cp hello.txt s3://"$BUCKET_NAME"
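  • optionally, verify that the object landed in the bucket:
$ aws --endpoint-url "$AWS_URL" s3 ls s3://"$BUCKET_NAME"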
  • fetch the URL that allows access to the jaeger query service from the host running the minikube VM:
$ export JAEGER_URL=$(minikube service --url simplest-query-external -n observability)
  • query traces:
$ curl "$JAEGER_URL/api/traces?service=rgw&limit=20&lookback=1h" | jq
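  • a sample response (pretty-printed) is shown in the Sample Trace Output section below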

Cleanup

  • delete the objects uploaded to the bucket:
$ aws --endpoint-url "$AWS_URL" s3 rm s3://"$BUCKET_NAME"/hello.txt
  • delete the OBC:
$ kubectl delete obc ceph-delete-bucket
  • delete the object store:
$ kubectl -n rook-ceph delete cephobjectstore my-store 
  • delete the cluster:
$ kubectl -n rook-ceph delete CephBlockPool builtin-mgr 
$ kubectl -n rook-ceph delete cephcluster my-cluster

if this does not work, kill the k8s cluster :-)

$ minikube stop
$ minikube delete
Sample Trace Output

a sample response from the traces query above, pretty-printed and truncated to three spans (the bucket and user names come from the load simulation script below):

{
  "data": [
    {
      "traceID": "95088d6beba7bbd0fdc6aff1c2806dd0",
      "spanID": "87c4f24b9c20a852",
      "operationName": "put_obj tx00000c3e5625681a409ee-00643532b3-11a2-my-store",
      "references": [],
      "startTime": 1681207987650319,
      "duration": 178800,
      "tags": [
        { "key": "otel.library.name", "type": "string", "value": "rgw" },
        { "key": "otel.library.version", "type": "string", "value": "1.4.0" },
        { "key": "op", "type": "string", "value": "put_obj" },
        { "key": "type", "type": "string", "value": "request" },
        { "key": "return", "type": "int64", "value": 0 },
        { "key": "user_id", "type": "string", "value": "obc-default-ceph-bucket-heavy" },
        { "key": "bucket_name", "type": "string", "value": "ceph-bkt-heavy-add7d77f-e70e-4ed8-81fc-743e590e9058" },
        { "key": "object_name", "type": "string", "value": "tmp5.txt" },
        { "key": "internal.span.format", "type": "string", "value": "proto" }
      ],
      "logs": [],
      "processID": "p1",
      "warnings": null
    },
    {
      "traceID": "95088d6beba7bbd0fdc6aff1c2806dd0",
      "spanID": "99b64a84d8a49063",
      "operationName": "enqueue_op",
      "references": [
        { "refType": "CHILD_OF", "traceID": "95088d6beba7bbd0fdc6aff1c2806dd0", "spanID": "22d10c172b0593c6" }
      ],
      "startTime": 1681207987734297,
      "duration": 216,
      "tags": [
        { "key": "otel.library.name", "type": "string", "value": "osd" },
        { "key": "otel.library.version", "type": "string", "value": "1.4.0" },
        { "key": "internal.span.format", "type": "string", "value": "proto" }
      ],
      "logs": [
        {
          "timestamp": 1681207987734304,
          "fields": [
            { "key": "event", "type": "string", "value": "enqueue_op" },
            { "key": "cost", "type": "int64", "value": 4195430 },
            { "key": "epoch", "type": "int64", "value": 34 },
            { "key": "priority", "type": "int64", "value": 63 },
            { "key": "type", "type": "int64", "value": 42 }
          ]
        }
      ],
      "processID": "p2",
      "warnings": null
    },
    {
      "traceID": "95088d6beba7bbd0fdc6aff1c2806dd0",
      "spanID": "22d10c172b0593c6",
      "operationName": "op-request-created",
      "references": [
        { "refType": "CHILD_OF", "traceID": "95088d6beba7bbd0fdc6aff1c2806dd0", "spanID": "74bed88bc71801dc" }
      ],
      "startTime": 1681207987734274,
      "duration": 602910689,
      "tags": [
        { "key": "otel.library.name", "type": "string", "value": "osd" },
        { "key": "otel.library.version", "type": "string", "value": "1.4.0" },
        { "key": "internal.span.format", "type": "string", "value": "proto" }
      ],
      "logs": [],
      "processID": "p2",
      "warnings": null
    }
  ],
  "processes": {
    "p1": {
      "serviceName": "rgw",
      "tags": [
        { "key": "cluster", "type": "string", "value": "undefined" },
        { "key": "container.name", "type": "string", "value": "rgw" },
        { "key": "deployment.name", "type": "string", "value": "rook-ceph-rgw-my-store-a" },
        { "key": "host.ip", "type": "string", "value": "192.168.39.143" },
        { "key": "pod.name", "type": "string", "value": "rook-ceph-rgw-my-store-a-95695cf8f-j6clp" },
        { "key": "pod.namespace", "type": "string", "value": "rook-ceph" },
        { "key": "telemetry.sdk.language", "type": "string", "value": "cpp" },
        { "key": "telemetry.sdk.name", "type": "string", "value": "opentelemetry" },
        { "key": "telemetry.sdk.version", "type": "string", "value": "1.4.0" }
      ]
    },
    "p2": {
      "serviceName": "osd",
      "tags": [
        { "key": "cluster", "type": "string", "value": "undefined" },
        { "key": "container.name", "type": "string", "value": "osd" },
        { "key": "deployment.name", "type": "string", "value": "rook-ceph-osd-0" },
        { "key": "host.ip", "type": "string", "value": "192.168.39.143" },
        { "key": "pod.name", "type": "string", "value": "rook-ceph-osd-0-79985c4dbd-nldpk" },
        { "key": "pod.namespace", "type": "string", "value": "rook-ceph" },
        { "key": "telemetry.sdk.language", "type": "string", "value": "cpp" },
        { "key": "telemetry.sdk.name", "type": "string", "value": "opentelemetry" },
        { "key": "telemetry.sdk.version", "type": "string", "value": "1.4.0" }
      ]
    }
  },
  "warnings": null
}
Load Simulation Script

#!/bin/bash
# script to simulate different load on different buckets/users

# generate 3 buckets
cat << EOF | kubectl apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket-heavy
spec:
  generateBucketName: ceph-bkt-heavy
  storageClassName: rook-ceph-delete-bucket
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket-medium
spec:
  generateBucketName: ceph-bkt-medium
  storageClassName: rook-ceph-delete-bucket
---
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket-light
spec:
  generateBucketName: ceph-bkt-light
  storageClassName: rook-ceph-delete-bucket
EOF

i=0
AWS_URL=$(minikube service --url rook-ceph-rgw-my-store-external -n rook-ceph)
echo "uploading objects to: $AWS_URL"
while :
do
  # object size between 1MB and 10MB
  obj_size=$((1 + RANDOM % 10))
  head -c "$obj_size"M /dev/urandom > tmp.txt
  obc_id=$((1 + RANDOM % 100))
  if [ "$obc_id" -lt 10 ]; then
    # 10% of objects go to the "light" bucket/user
    obc_name="ceph-bucket-light"
  elif [ "$obc_id" -lt 40 ]; then
    # 30% of objects go to the "medium" bucket/user
    obc_name="ceph-bucket-medium"
  else
    # 60% of objects go to the "heavy" bucket/user
    obc_name="ceph-bucket-heavy"
  fi
  # fetch the bucket name and credentials for the chosen OBC
  bucket_name=$(kubectl get objectbucketclaim "$obc_name" -o jsonpath='{.spec.bucketName}')
  access_key=$(kubectl -n default get secret "$obc_name" -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 --decode)
  secret_key=$(kubectl -n default get secret "$obc_name" -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 --decode)
  ((i++))
  AWS_ACCESS_KEY_ID=$access_key AWS_SECRET_ACCESS_KEY=$secret_key aws --endpoint-url "$AWS_URL" s3 cp tmp.txt s3://"$bucket_name"/tmp"$i".txt
done
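the script runs until interrupted; for a rough view of the traffic it generated, the traces returned by the query API can be counted (assuming jq and the JAEGER_URL variable from the Test section):

$ curl -s "$JAEGER_URL/api/traces?service=rgw&limit=100&lookback=1h" | jq '.data | length'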
@rootfs commented Apr 5, 2023

I created a bigger VM:

minikube start --cpus 10 --memory 32GB --disk-size=400g --extra-disks=2 --driver=kvm2 --force --container-runtime cri-o

Docker runtime looks buggy

@rootfs commented Apr 5, 2023

To see the Prometheus metrics from the local computer:

Create an ssh tunnel:

ssh -L9091:localhost:9090 root@52.116.206.82

On the kube host, port-forward to local port 9090:

# kubectl port-forward service/prometheus-k8s -n monitoring 9090:9090

Then on the local computer, point the browser to localhost:9091

@rootfs commented Apr 6, 2023

minikube requires lots of storage in the root filesystem, so I moved it to a data partition on /dev/vdd:

@rootfs commented Apr 6, 2023

export MINIKUBE_HOME=/data/minikube
mount /dev/vdd /data

@rootfs commented Apr 7, 2023

minikube addons disable storage-provisioner

@rootfs commented Apr 11, 2023

Get the latest per-pod energy reading in the rook-ceph namespace:

kubectl exec -ti -n monitoring prometheus-k8s-0 -- sh -c 'wget -O- -q "localhost:9090/api/v1/query?query=kepler_container_joules_total{container_namespace=~\"rook-ceph\",mode=~\"dynamic\"}"[3s]'  |jq -r '.data.result[] | [.metric.pod_name, .metric.container_name,.metric.container_namespace, .metric.mode, .values[0][1] ]'
