@spiffxp
Last active June 26, 2019 22:38

The intent is to create my own prow instance to service my own GitHub org, as a testbed / staging area for changes I would want to make to prow.k8s.io.

Made GitHub things

Followed Getting Started Guide

I went to the Next Steps section, and did things a bit out of order.

I created a bare repo at https://github.com/bashfire/prow-config

Added a Makefile

# GKE cluster variables
PROJECT ?= spiffxp-gke-dev
ZONE ?= us-central1-f
CLUSTER ?= prow

# prow docker image variables
REPO ?= gcr.io/k8s-prow
TAG ?= v20190620-f4e6553b2

CWD = $(shell pwd)

get-cluster-credentials:
	gcloud container clusters get-credentials "$(CLUSTER)" --project="$(PROJECT)" --zone="$(ZONE)"

.PHONY: get-cluster-credentials

update-config: get-cluster-credentials check-config
	kubectl create configmap config --from-file=config.yaml=config.yaml --dry-run -o yaml | kubectl replace configmap config -f -

update-plugins: get-cluster-credentials check-config
	kubectl create configmap plugins --from-file=plugins.yaml=plugins.yaml --dry-run -o yaml | kubectl replace configmap plugins -f -

.PHONY: update-config update-plugins

check-config:
	docker run -v $(CWD):/prow $(REPO)/checkconfig:$(TAG) --config-path /prow/config.yaml --plugin-config /prow/plugins.yaml

.PHONY: check-config

Verified I could update the cluster's plugins and config

# plugins.yaml
plugins:
  bashfire:
  - cat
  - size

# config.yaml
prowjob_namespace: default
pod_namespace: test-pods

# from bashfire/prow-config
make check-config
make update-plugins
make update-config
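
The check-then-update flow of these make targets can also be sketched as a small shell script. This is only an illustration, not something that lives in bashfire/prow-config: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here so the commands can be previewed without touching the cluster. The docker and kubectl invocations are copied from the Makefile.

```shell
# Sketch only: mirrors the Makefile's check-config and update-* targets.
# DRY_RUN and run() are hypothetical helpers, not part of prow.

REPO="${REPO:-gcr.io/k8s-prow}"
TAG="${TAG:-v20190620-f4e6553b2}"

# Print the command instead of executing it when DRY_RUN is non-empty.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Validate config.yaml and plugins.yaml from the current directory.
check_config() {
  run docker run -v "$(pwd):/prow" "${REPO}/checkconfig:${TAG}" \
    --config-path /prow/config.yaml --plugin-config /prow/plugins.yaml
}

# update_configmap NAME FILE: regenerate the configmap and replace it in place.
update_configmap() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "kubectl create configmap $1 --from-file=$2=$2 --dry-run -o yaml | kubectl replace configmap $1 -f -"
  else
    kubectl create configmap "$1" --from-file="$2=$2" --dry-run -o yaml \
      | kubectl replace configmap "$1" -f -
  fi
}
```

With `DRY_RUN=1` set, `check_config && update_configmap plugins plugins.yaml` prints both commands instead of running them.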

Followed the steps in Configure Cloud Storage

gcloud iam service-accounts create spiffxp-gke-dev-prow # step 1
identifier="$(  gcloud iam service-accounts list --filter 'name:spiffxp-gke-dev-prow' --format 'value(email)' )"
gsutil mb gs://bashfire-prow # step 2
gsutil iam ch allUsers:objectViewer gs://bashfire-prow # step 3
gsutil iam ch "serviceAccount:${identifier}:objectCreator" gs://bashfire-prow # step 4
gcloud iam service-accounts keys create --iam-account "${identifier}" service-account.json # step 5
kubectl create secret generic gcs-credentials --from-file=service-account.json # step 6
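
The value that `${identifier}` resolves to follows a fixed naming scheme: the account name from step 1, the project ID, and the `iam.gserviceaccount.com` suffix (the initupload log later in these notes shows the same address). A minimal sketch of that scheme, handy for sanity-checking the value before handing it to gsutil; `sa_email` and `bucket_binding` are hypothetical helper names, not gcloud or gsutil features.

```shell
# Sketch only: user-created service accounts get an email of the form
# NAME@PROJECT.iam.gserviceaccount.com.
sa_email() {
  # $1 = service account name, $2 = GCP project ID
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Build the member:role argument passed to `gsutil iam ch` in step 4.
bucket_binding() {
  # $1 = service account email, $2 = role shorthand such as objectCreator
  printf 'serviceAccount:%s:%s\n' "$1" "$2"
}

identifier="$(sa_email spiffxp-gke-dev-prow spiffxp-gke-dev)"
echo "${identifier}"
bucket_binding "${identifier}" objectCreator
```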

Next tried the Add More Jobs step

I discovered I had to add more to config.yaml and plugins.yaml than is documented. Also, because I had told pods to run in a different namespace, the gcs-credentials secret needed to live in that namespace as well.

# config.yaml
plank:
  default_decoration_config:
    utility_images: # TODO: "no default decoration image pull specs provided for plank" could be a clearer error message
      clonerefs: "gcr.io/k8s-prow/clonerefs:v20190619-25afbb545"
      initupload: "gcr.io/k8s-prow/initupload:v20190619-25afbb545"
      entrypoint: "gcr.io/k8s-prow/entrypoint:v20190619-25afbb545"
      sidecar: "gcr.io/k8s-prow/sidecar:v20190619-25afbb545"
    gcs_configuration:
      bucket: bashfire-prow
      path_strategy: explicit
    gcs_credentials_secret: gcs-credentials

# plugins.yaml
plugins:
  bashfire:
  - cat
  - size
  - trigger

I then discovered the gcs-credentials secret wasn't being found, because it was in the default namespace, and the Pods created by ProwJobs were now being created in the test-pods namespace

kubectl delete secret gcs-credentials
kubectl create secret generic gcs-credentials --namespace=test-pods --from-file=service-account.json
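
One way to avoid this kind of mismatch is to read the namespace from config.yaml rather than typing it by hand. The sketch below assumes `pod_namespace` appears as a top-level key on its own line with a plain unquoted value, as in the config.yaml shown earlier; `pod_namespace_from_config` is a hypothetical helper, not a prow tool.

```shell
# Sketch only: print the value of the top-level pod_namespace key.
# Assumes a simple "pod_namespace: value" line with no quoting or anchors.
pod_namespace_from_config() {
  # $1 = path to config.yaml
  sed -n 's/^pod_namespace:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1" | head -n 1
}

# Example (requires cluster access, so it is left commented out):
# ns="$(pod_namespace_from_config config.yaml)"
# kubectl create secret generic gcs-credentials --namespace="${ns}" --from-file=service-account.json
```

The helper prints nothing when the key is missing, so a caller should check for an empty result before creating the secret.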

When I did this, the echo-test periodic job that had been working stopped working

kubectl get pods --namespace=test-pods --field-selector=status.phase!=Succeeded
# pick a pod
kubectl describe -n test-pods pod/3465dd1c-96d1-11e9-b19b-aa00477a1c7c
# look for the Init Containers section, see initupload has State: Terminated
kubectl logs -n test-pods 3465dd1c-96d1-11e9-b19b-aa00477a1c7c -c initupload
# {"component":"initupload","error":"failed to upload to GCS: failed to upload to GCS: encountered errors during upload: [[googleapi: Error 403: spiffxp-gke-dev-prow@spiffxp-gke-dev.iam.gserviceaccount.com does not have storage.objects.delete access to bashfire-prow/logs/echo-test/latest-build.txt., forbidden]]","file":"prow/cmd/initupload/main.go:45","func":"main.main","level":"fatal","msg":"Failed to initialize job","time":"2019-06-24T22:41:54Z"}
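
The JSON log line is dense, and the useful part is the permission name buried in the error string. A small sketch for pulling it out with grep and sed; `missing_gcs_permission` is a hypothetical helper, and the pattern assumes the `does not have storage.X.Y access` wording seen in the log line above.

```shell
# Sketch only: extract the permission named in a GCS 403 message.
# Reads log text on stdin; prints each distinct storage.* permission found.
missing_gcs_permission() {
  grep -o 'does not have storage\.[A-Za-z]*\.[A-Za-z]* access' \
    | sed 's/^does not have //; s/ access$//' \
    | sort -u
}

# Example (requires cluster access, so it is left commented out):
# kubectl logs -n test-pods 3465dd1c-96d1-11e9-b19b-aa00477a1c7c -c initupload | missing_gcs_permission
```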

It turns out initupload's Run calls gcsupload's Run, which will ultimately try overwriting latest-build.txt. It seems weird that we'd want to do that before a build has actually finished. But let's assume the legacy behavior is correct before we fall too deep down the rabbit hole.

So, TODO: objectCreator doesn't grant sufficient permissions for periodics to take advantage of podutils, because overwriting an existing object also requires storage.objects.delete.

gsutil iam ch "serviceAccount:${identifier}:objectAdmin" gs://bashfire-prow

At this point, I had a more or less functional prow cluster. Next, I wanted to allow prow to update itself.

# added to plugins.yaml
config_updater:
  config_file: config.yaml
  plugin_file: plugins.yaml
plugins:
  #...
  bashfire/prow-config:
  - config-updater

make update-plugins

Next I wanted to set up the prow.bashfire.dev domain name. I own bashfire.dev through namecheap.com, so I followed the instructions at https://cloud.google.com/dns/docs/quickstart and https://cloud.google.com/dns/docs/update-name-servers to set up a bashfire.dev zone in Google Cloud DNS.

I could get to prow.bashfire.dev, but it was redirecting to https, which was not happy because I hadn't set up ingress yet.

I followed instructions at https://docs.cert-manager.io/en/latest/getting-started/install/kubernetes.html to install cert-manager and verify that it worked

kubectl create namespace cert-manager
kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true
# using kubectl 1.12 so need the --validate=false flag
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.8.1/cert-manager.yaml

I combined the instructions at https://docs.cert-manager.io/en/latest/tasks/issuers/setup-acme/index.html with a look at test-infra's setup to get https://prow.bashfire.dev to work. There was some flailing involved because of deltas from all the ideal/example setups:

  • I'm using a newer version of cert-manager than test-infra and trying to avoid deprecated flags
  • I'm using the gce-ingress setup that comes by default with GKE rather than trying to set up nginx-ingress
  • I had this running so long that a self-signed cert was picked up by gce-ingress first; once I had a proper cert it took a while for it to be picked up (https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#setting_up_https_tls_between_client_and_load_balancer claims 10 minutes)
  • I was pointed at the staging ACME server for a while, so I was getting a cert, it just wasn't signed by a valid CA
  • I verified what the actual cert was in-cluster via kubectl get secret tls-ingress-secret -o=json | jq -r '.data["tls.crt"]' | base64 --decode | openssl x509 -text -noout

gcloud compute addresses create prow-bashfire-dev-ip --global
kubectl delete ingress ing
kubectl apply -f tls-ingress.yaml
kubectl apply -f cluster-issuer.yaml

# tls-ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: default
  name: tls-ingress
  annotations:
    kubernetes.io/ingress.class: gce
    kubernetes.io/ingress.global-static-ip-name: prow-bashfire-dev-ip
    certmanager.k8s.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - secretName: tls-ingress-secret
    hosts:
    - prow.bashfire.dev
  rules:
  - host: prow.bashfire.dev
    http:
      paths:
      - path: /*
        backend:
          serviceName: deck
          servicePort: 80
      - path: /hook
        backend:
          serviceName: hook
          servicePort: 8888

# cluster-issuer.yaml
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: spiffxp@google.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-private-key
    solvers:
    - http01:
        ingress:
          name: tls-ingress
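
Since pointing at the staging ACME server was one of the pitfalls listed above, it can help to make that choice explicit when generating cluster-issuer.yaml. The sketch below is an illustration rather than the setup actually used: `acme_server` and `render_cluster_issuer` are hypothetical helpers, and the `letsencrypt-staging` names are placeholders. The production URL is the one from cluster-issuer.yaml; the staging URL is Let's Encrypt's published staging endpoint.

```shell
# Sketch only: map an environment name to a Let's Encrypt ACME v2 directory URL.
acme_server() {
  case "$1" in
    prod)    echo "https://acme-v02.api.letsencrypt.org/directory" ;;
    staging) echo "https://acme-staging-v02.api.letsencrypt.org/directory" ;;
    *)       echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Render a ClusterIssuer manifest shaped like cluster-issuer.yaml above.
render_cluster_issuer() {
  # $1 = prod or staging, $2 = contact email
  server="$(acme_server "$1")" || return 1
  cat <<EOF
apiVersion: certmanager.k8s.io/v1alpha1
kind: ClusterIssuer
metadata:
  name: letsencrypt-$1
spec:
  acme:
    email: $2
    server: ${server}
    privateKeySecretRef:
      name: letsencrypt-$1-private-key
    solvers:
    - http01:
        ingress:
          name: tls-ingress
EOF
}
```

For example, `render_cluster_issuer staging you@example.com > cluster-issuer-staging.yaml` would produce a staging issuer to test against before switching the ingress annotation to the production one.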