@etheleon
Last active December 25, 2021 18:50
installing kubeflow 1.2

Introduction

Installing kubeflow on a local machine is not a simple task, and the documentation on the official website might be outdated. At the time of writing, the suggested solutions include miniKF and microk8s. The latter sets up GPU passthrough effortlessly.

This gist shows how to install kubeflow on a single Linux (Ubuntu) workstation using microk8s.

Install microk8s

$ sudo snap install microk8s --classic --channel=1.19/stable

Enable microk8s features

$ microk8s enable dns dashboard storage gpu

Update kube-apiserver flags

Append the following to /var/snap/microk8s/current/args/kube-apiserver:

--service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key
--service-account-issuer=kubernetes.default.svc
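Editing the file by hand works, but re-running a naive append leaves duplicate flags behind. Below is a minimal sketch of an idempotent append. `append_flag_once` is a hypothetical helper name (not part of microk8s), it assumes the file already ends with a newline, and writing to the real args file typically needs root.

```shell
# append_flag_once FILE FLAG
# Hypothetical helper: appends FLAG as its own line to FILE unless an
# identical line is already present, so repeated runs are harmless.
append_flag_once() {
  file=$1
  flag=$2
  # -x: match whole lines, -F: fixed string, -q: quiet, --: end of options
  if ! grep -qxF -- "$flag" "$file" 2>/dev/null; then
    printf '%s\n' "$flag" >> "$file"
  fi
}

# Example (not run here). Single quotes keep ${SNAP_DATA} literal, which is
# how it should appear in the args file:
#   f=/var/snap/microk8s/current/args/kube-apiserver
#   append_flag_once "$f" '--service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key'
#   append_flag_once "$f" '--service-account-issuer=kubernetes.default.svc'
```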

istio configuration is outside of kubeflow.

Reason

These flags allow the use of trustworthy JWTs (see kubeflow's readme). Without them, the istio pods hang with the error described in the linked git issue.

Additional Background

Newer versions of kubeflow (>= 1.0) have abandoned Ambassador, a K8S API gateway, in favour of istio's ingress gateway (source).

Ambassador

Ingress -->        Envoy        --> Ambassador --> Other services
              (JWT validation)

Istio ingress gateway

Ingress  --> istio-ingressgateway --> Other services

Restart microk8s

$ microk8s stop
$ microk8s start
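`microk8s start` can return before every service is up again. As a sketch, the restart can be wrapped so that it blocks until the cluster reports ready. `microk8s status --wait-ready` is the stock readiness check; the function name `restart_microk8s` is only illustrative.

```shell
# Illustrative wrapper: restart microk8s and block until it reports ready.
# Each step runs only if the previous one succeeded.
restart_microk8s() {
  microk8s stop &&
  microk8s start &&
  microk8s status --wait-ready
}
```

Prefix the calls with sudo if your user is not in the microk8s group.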

Create kubeconfig

sudo microk8s.kubectl config view --raw > $HOME/.kube/config
export KUBECONFIG=$HOME/.kube/config
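The redirect above fails if $HOME/.kube does not exist yet. The sketch below creates the directory first and keeps a newly created file private to its owner. The function name `write_kubeconfig` and its optional destination argument are assumptions made for illustration, and it calls `microk8s kubectl` without sudo, so the user needs to be in the microk8s group.

```shell
# write_kubeconfig [DEST]
# Dump the microk8s kubeconfig to DEST (default: $HOME/.kube/config).
write_kubeconfig() {
  dest=${1:-$HOME/.kube/config}
  mkdir -p "$(dirname "$dest")" || return 1
  # umask 077 inside a subshell: a newly created file is owner-readable only,
  # and the caller's umask is left untouched.
  ( umask 077 && microk8s kubectl config view --raw > "$dest" )
}

# Example (not run here):
#   write_kubeconfig && export KUBECONFIG=$HOME/.kube/config
```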

Manually install kubeflow

At the point of writing, microk8s enable kubeflow is not working for me.

Download kfctl

$ wget https://github.com/kubeflow/kfctl/releases/download/v1.2.0/kfctl_v1.2.0-0-gbc038f9_linux.tar.gz
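The release asset is a tarball, so it has to be unpacked before the binary can go on $PATH. A sketch, assuming the archive holds a single `kfctl` executable at its top level (worth confirming with `tar -tzf` first); `unpack_kfctl` and the default $HOME/bin target are illustrative choices, not something the kfctl project prescribes.

```shell
# unpack_kfctl TARBALL [DEST_DIR]
# Extract TARBALL into DEST_DIR (default: $HOME/bin) and confirm that an
# executable named kfctl ended up there.
unpack_kfctl() {
  tarball=$1
  dest=${2:-$HOME/bin}
  mkdir -p "$dest" || return 1
  tar -xzf "$tarball" -C "$dest" || return 1
  [ -x "$dest/kfctl" ]
}

# Example (not run here):
#   unpack_kfctl kfctl_v1.2.0-0-gbc038f9_linux.tar.gz "$HOME/bin"
#   export PATH="$PATH:$HOME/bin"
```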

Prepare your environment

  1. add path to kfctl binary to your $PATH
  2. export the following env variables
export BASE_DIR=/opt/
export KF_NAME=<your_kubeflow_deployment_name>
# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Then set the Kubeflow application directory for this deployment.
export KF_DIR=${BASE_DIR}/${KF_NAME}

# Set the configuration file to use when deploying Kubeflow.
# The following configuration installs Istio by default. Comment out
# the Istio components in the config file to skip Istio installation.
# See https://github.com/kubeflow/kubeflow/pull/3663
export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"

change to https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_istio_dex.v1.2.0.yaml if you want to have admin access (kudos to @kosehy)
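With BASE_DIR=/opt/ as written, ${BASE_DIR}/${KF_NAME} expands to a path with a doubled slash, and to just /opt// if KF_NAME was never set. The doubled slash is harmless; the empty name is the one worth catching. A small sketch of a guard (the helper name `kf_dir` is made up for this example):

```shell
# kf_dir BASE_DIR KF_NAME
# Print BASE_DIR/KF_NAME with a single separator; fail if KF_NAME is empty.
kf_dir() {
  base=$1
  name=$2
  if [ -z "$name" ]; then
    echo "KF_NAME must not be empty" >&2
    return 1
  fi
  # ${base%/} strips one trailing slash, if present
  printf '%s/%s\n' "${base%/}" "$name"
}

# Example (not run here):
#   export KF_DIR="$(kf_dir "$BASE_DIR" "$KF_NAME")"
```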

Create empty directory

$ mkdir $HOME/kf_installation_temp && cd $HOME/kf_installation_temp
$ kfctl apply -V -f $CONFIG_URI

Access kubeflow dashboard

$ kubectl port-forward svc/istio-ingressgateway 8081:80 -n istio-system
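Note that port-forward stays in the foreground: it prints its "Forwarding from ..." lines and then waits, which is expected rather than a hang. From a second terminal, a quick reachability check could look like the sketch below. `dashboard_status` is a hypothetical helper, and 8081 is simply the local port chosen above.

```shell
# dashboard_status [LOCAL_PORT]
# Print the HTTP status code returned on the forwarded port (default: 8081).
dashboard_status() {
  port=${1:-8081}
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:${port}/"
}
```

Any HTTP status code in the output means the gateway answered; an empty result or a curl error suggests the forward is not active.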

Pulling large images

If you're pulling large images into microk8s, you might want to extend the pull-progress duration. Add the following to /var/snap/microk8s/current/args/kubelet

--image-pull-progress-deadline="10m"

Restarting microk8s might not be enough; in that case, reboot the entire machine.

Kubeflow pipelines

Kubelet runtime

Pods from runs created by argo's workflow controller (the underlying engine powering kfp) cannot be created unless the container runtime is switched from remote (the microk8s default) to docker (see github thread).

Edit the flags in /var/snap/microk8s/current/args/kubelet:

# --container-runtime=remote
# --container-runtime-endpoint=${SNAP_COMMON}/run/containerd.sock
--container-runtime=docker
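The same edit can be scripted. This is a sketch, assuming the args file has one flag per line as in the snippet above. `use_docker_runtime` is a hypothetical helper; it keeps a .bak copy, does nothing if the switch was already made, and needs root when pointed at the real file. Docker itself must already be installed on the host.

```shell
# use_docker_runtime FILE
# Comment out the active --container-runtime / --container-runtime-endpoint
# flags in FILE and append --container-runtime=docker.
use_docker_runtime() {
  file=$1
  if grep -qxF -- '--container-runtime=docker' "$file"; then
    return 0   # already switched, leave the file alone
  fi
  cp "$file" "$file.bak" || return 1
  # "&" re-inserts the matched flag prefix, so the line is kept but commented
  sed 's/^--container-runtime\(-endpoint\)\{0,1\}=/# &/' "$file.bak" > "$file" || return 1
  printf '%s\n' '--container-runtime=docker' >> "$file"
}

# Example (not run here):
#   sudo sh -c '. ./helpers.sh; use_docker_runtime /var/snap/microk8s/current/args/kubelet'
#   (helpers.sh is a placeholder for wherever the function is saved)
```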

Creating runs for kubeflow pipelines

After writing your pipeline, there are two ways to create a run from it, depending on whether the pipeline is going to be reused, i.e. whether it is for production or for experimentation:

  • Production: (1) Submit pipeline and (2) create a (recurring or one off) run based on this pipeline.
  • Directly submitting a run via the kfp python SDK from a notebook (mostly for experimentation)

Direct runs creation from notebook via kfp SDK without submitting a pipeline first.

For creating runs directly from notebooks to kubeflow pipelines, one needs to authenticate as an authorized user to submit jobs.

Bind notebook workloads with the ml-pipeline service role

The following binds notebook-server workloads launched in the wesley namespace (the namespace where the jupyter notebooks are deployed) and using the default-editor service account to the ServiceRole ml-pipeline-services, which belongs to the kubeflow namespace.

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-wesley-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/wesley/sa/default-editor

To create this ServiceRole Binding, run kubectl apply -f <filename.yaml>
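For a namespace other than wesley, the manifest can be generated rather than edited by hand. The sketch below is one way to do that; `render_binding` is a hypothetical helper, and default-editor is assumed as the service account unless another is given. The indentation matters: source.principal has to sit under properties, otherwise istio's validation webhook rejects the binding with "empty subjects are not allowed".

```shell
# render_binding NAMESPACE [SERVICE_ACCOUNT]
# Print a ServiceRoleBinding manifest for notebooks running in NAMESPACE.
render_binding() {
  ns=$1
  sa=${2:-default-editor}
  cat <<EOF
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-${ns}-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/${ns}/sa/${sa}
EOF
}

# Example (not run here):
#   render_binding wesley | kubectl apply -f -
```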

Attach email address to header of HTTP requests from this workload

See the github thread; this is still a workaround until the contributors address it properly.

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: wesley
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: anonymous@kubeflow.org
  workloadSelector:
    labels:
      notebook-name: wesley
  • kubeflow-userid should be the email address of the owner for the notebook server's namespace.

     $ kubectl get ns wesley -o yaml
  • notebook-name, check the labels for the notebook pod:

     $ kubectl get pods --show-labels
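The two lookups above can be narrowed so that only the relevant value is printed. This sketch assumes the owner's email is stored in an annotation named owner on the namespace, which is worth confirming against the full YAML output first; `ns_owner` and `notebook_pods` are hypothetical helper names.

```shell
# ns_owner NAMESPACE
# Print the "owner" annotation of NAMESPACE (assumed to hold the owner's email).
ns_owner() {
  kubectl get ns "$1" -o jsonpath='{.metadata.annotations.owner}'
}

# notebook_pods NAMESPACE
# List pods in NAMESPACE with their notebook-name label as an extra column.
notebook_pods() {
  kubectl get pods -n "$1" -L notebook-name
}

# Example (not run here):
#   ns_owner wesley
#   notebook_pods wesley
```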
@alessandroferrari

Thanks for the writeup. When adding <<--image-pull-progress-deadline="10m">> to the kubelet args, the deploy does not work on my end (I followed the tutorial from a fresh install): only a subset of the pods get declared in the declaration step.

@alessandroferrari

Also, the port forward instruction hangs on the second print:
$ kubectl port-forward svc/istio-ingressgateway 8081:80 -n istio-system
Forwarding from 127.0.0.1:8081 -> 80
Forwarding from [::1]:8081 -> 80
In order to expose the ingress-gateway in istio, I had to follow:
https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/

@etheleon
Author

etheleon commented Jan 3, 2021 via email

@etheleon
Author

etheleon commented Jan 3, 2021 via email

@alessandroferrari

Btw, can you run without issues the mnist example for tf jobs after this procedure? I have filed an issue about it kubeflow/kubeflow#5492 , and I suspect the problem is the fact that i need to use the NodePort instead of the port forwarding, and maybe the issue is that the NodePort procedure does not work properly with the egress.

@alessandroferrari

U just posted a website What exactly did u do?

Here the exact procedure:
kubectl get svc istio-ingressgateway -n istio-system

INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
SECURE_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="https")].nodePort}')
TCP_INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="tcp")].nodePort}')
INGRESS_HOST=$(kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].status.hostIP}')

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gateway
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "httpbin.example.com"
EOF

kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin.example.com"
  gateways:
  - httpbin-gateway
  http:
  - match:
    - uri:
        prefix: /status
    - uri:
        prefix: /delay
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
EOF

curl -s -I -HHost:httpbin.example.com "http://$INGRESS_HOST:$INGRESS_PORT/status/200"

echo "http://$INGRESS_HOST:$INGRESS_PORT"

@etheleon
Author

etheleon commented Jan 3, 2021

Ah, awesome (you solved it for external LB)

I didn't mean for this to be exposed through a load balancer, only on a local workstation (so only localhost) 👏 thanks!

@kosehy

kosehy commented Jan 4, 2021

https://gist.github.com/etheleon/80414516c7fbc7147a5718b9897b1518#update-kube-apiserver-flags
To edit the kube-apiserver file, I use printf to add two codes to the end of the kube-apiserver file.

# nano /var/snap/microk8s/current/args/kube-apiserver
printf "\n# Allow the use of trustworthy JWTs
--service-account-signing-key-file=\${SNAP_DATA}/certs/serviceaccount.key
--service-account-issuer=kubernetes.default.svc
#~Allow the use of trustworthy JWTs\n" >> /var/snap/microk8s/current/args/kube-apiserver

@kosehy

kosehy commented Jan 4, 2021

And if you are using microk8s the first time, you need to try with sudo or add the user to 'microk8s' group.

gpu@gpu:~$ microk8s enable dns dashboard storage gpu
Insufficient permissions to access MicroK8s.
You can either try again with sudo or add the user gpu to the 'microk8s' group:

    sudo usermod -a -G microk8s gpu
    sudo chown -f -R gpu ~/.kube

The new group will be available on the user's next login.

@ynott

ynott commented Jan 5, 2021

Thank you for the very good instruction. 👍

export CONFIG_URI="https://github.com/kubeflow/manifests/blob/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"

This looks better as follows

export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"

@alessandroferrari

When installing this way, can you actually make use of tf-jobs without additional instructions? kubeflow/kubeflow#5492

@tritran-cotai

tritran-cotai commented Jan 7, 2021

Thanks for your guideline @etheleon. I am able to install kubeflow but how can we have admin account to access Kubeflow Dashboard.
I cannot find that information in this guideline.
I used CONFIG_URI="https://github.com/kubeflow/manifests/blob/v1.2-branch/kfdef/kfctl_istio_dex.v1.2.0.yaml"
[UPDATED] I got it
https://www.kubeflow.org/docs/started/k8s/kfctl-istio-dex/#add-static-users-for-basic-auth

@kosehy

kosehy commented Jan 13, 2021

During applying the servicerolebinding.yaml,
I have the error message below:

gpu@gpu1:~/Downloads$ microk8s kubectl apply -f servicerolebinding.yaml
Error from server: error when creating "servicerolebinding.yaml": admission webhook "pilot.validation.istio.io" denied the request: configuration is invalid: empty subjects are not allowed. Found an empty subject at index 0

After add the double space at the beginning of the source.principal part in servicerolebinding.yaml I can apply yaml file!

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-wesley-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/wesley/sa/default-editor

@etheleon
Author

Thank you for the very good instruction. 👍

export CONFIG_URI="https://github.com/kubeflow/manifests/blob/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"

This looks better as follows

export CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.2-branch/kfdef/kfctl_k8s_istio.v1.2.0.yaml"

Updated! You're welcome

@etheleon
Author

During applying the servicerolebinding.yaml,
I have the error message below:

gpu@gpu1:~/Downloads$ microk8s kubectl apply -f servicerolebinding.yaml
Error from server: error when creating "servicerolebinding.yaml": admission webhook "pilot.validation.istio.io" denied the request: configuration is invalid: empty subjects are not allowed. Found an empty subject at index 0

After add the double space at the beginning of the source.principal part in servicerolebinding.yaml I can apply yaml file!

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-wesley-namespace
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/wesley/sa/default-editor

Updated!! Thank you 🙇
