@moretea
Last active July 14, 2017 10:46
portworx + kops
Executing with arguments: -k etcd://100.96.6.3:2380 -c mycluster -a -f -x kubernetes
Fri Jul 14 10:40:51 UTC 2017 : Running on Linux ip-172-20-58-196 4.4.65-k8s #1 SMP Tue May 2 15:48:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
size for /dev/shm is 67100672, less than required 293601280
PXD version: 33cafba6c39c862340a8c30b2677849f67bd2d6a
Key Value Store: etcd://100.96.6.3:2380
Using cluster: mycluster
Using scheduler: kubernetes
/docker-entry-point.sh: line 763: /sys/fs/cgroup/cpu/cpu.rt_runtime_us: Permission denied
Failed to enable rt scheduler
Checking sysfs mount...
2017-07-14 10:40:51,902 CRIT Supervisor running as root (no user in config file)
2017-07-14 10:40:51,904 INFO supervisord started with pid 7733
2017-07-14 10:40:52,906 INFO spawned: 'relayd' with pid 7827
2017-07-14 10:40:52,907 INFO spawned: 'lttng' with pid 7828
2017-07-14 10:40:52,909 INFO spawned: 'exec' with pid 7829
2017-07-14 10:40:52,910 INFO spawned: 'pxdaemon' with pid 7830
2017-07-14 10:40:52,912 INFO spawned: 'px-ns' with pid 7831
2017-07-14 10:40:52,917 INFO spawned: 'px_event_listener' with pid 7832
Fri Jul 14 10:40:52 UTC 2017 size 147776 is within limits of maxsize 436207616
PXPROCS: lttng not started yet...sleeping
time="2017-07-14T10:40:52Z" level=info msg="px-ns Starting.."
NS client starting fuse module
Starting NS server
2017-07-14 10:40:53,960 INFO success: relayd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-14 10:40:53,961 INFO success: lttng entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-14 10:40:53,961 INFO success: exec entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-14 10:40:53,961 INFO success: pxdaemon entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-14 10:40:53,961 INFO success: px-ns entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-07-14 10:40:53,961 INFO success: px_event_listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
PXPROCS: lttng not started yet...sleeping
Spawning a session daemon
Session pxd created.
Traces will be written in net://localhost
Live timer set to 2000000 usec
Session pxd set to shm_path: /var/lib/osd/lttng/pxd-20170714-104058.
UST channel pxd_channel enabled for session pxd
All UST events are enabled in channel pxd_channel
Tracing started for session pxd
PXPROCS: Started px-storage with pid 7893
bash: connect: Connection refused
bash: /dev/tcp/localhost/9009: Connection refused
PXPROCS: px-storage not started yet...sleeping
C++ grpc server listening on 0.0.0.0:9009
PXPROCS: Started px with pid 7903
PXPROCS: Started watchdog with pid 7904
2017-07-14_10:41:01: PX-Watchdog: Starting watcher
2017-07-14_10:41:01: PX-Watchdog: Waiting for px process to start
root 7903 7830 0 10:41 ? 00:00:00 /usr/local/bin/px -daemon
2017-07-14_10:41:02: PX-Watchdog: (pid 7903): Begin monitoring
time="2017-07-14T10:41:02Z" level=info msg="Registering [kernel] as a volume driver"
time="2017-07-14T10:41:02Z" level=info msg="Starting PX Version: 1.2.8-e70082e - Build Version e70082e281be8b71872b09a3304926438466fc5b"
time="2017-07-14T10:41:02Z" level=error msg="Failed to unmarshal cloudcfg yaml: yaml: line 27: did not find expected <document start>"
time="2017-07-14T10:41:02Z" level=error msg="Incorrect cloud init data provided for Portworxor no cloud init found. Not using cloud init. [Error: yaml: line 27: did not find expected <document start>]"
time="2017-07-14T10:41:02Z" level=info msg="Node is not yet initialized"
time="2017-07-14T10:41:12Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 0\n"
time="2017-07-14T10:41:22Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 1\n"
time="2017-07-14T10:41:33Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 2\n"
time="2017-07-14T10:41:43Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 3\n"
time="2017-07-14T10:41:54Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 4\n"
time="2017-07-14T10:42:04Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 5\n"
2017-07-14_10:42:05: PX-Watchdog: (pid 7903): PX REST server died or did not started. return code 7. Timeout 3600
time="2017-07-14T10:42:15Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 6\n"
time="2017-07-14T10:42:25Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 7\n"
2017-07-14_10:42:36: PX-Watchdog: Waiting for px process to start
root 7903 7830 99 10:41 ? 00:01:40 /usr/local/bin/px -daemon
2017-07-14_10:42:36: PX-Watchdog: (pid 7903): Begin monitoring
time="2017-07-14T10:42:36Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 8\n"
time="2017-07-14T10:42:46Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 9\n"
time="2017-07-14T10:42:57Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 10\n"
time="2017-07-14T10:43:07Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 11\n"
time="2017-07-14T10:43:18Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 12\n"
time="2017-07-14T10:43:28Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 13\n"
2017-07-14_10:43:39: PX-Watchdog: (pid 7903): PX REST server died or did not started. return code 7. Timeout 3600
time="2017-07-14T10:43:39Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 14\n"
time="2017-07-14T10:43:49Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 15\n"
time="2017-07-14T10:44:00Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 16\n"
2017-07-14_10:44:10: PX-Watchdog: Waiting for px process to start
root 7903 7830 99 10:41 ? 00:03:24 /usr/local/bin/px -daemon
2017-07-14_10:44:10: PX-Watchdog: (pid 7903): Begin monitoring
time="2017-07-14T10:44:10Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 17\n"
time="2017-07-14T10:44:21Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 18\n"
time="2017-07-14T10:44:31Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 19\n"
time="2017-07-14T10:44:42Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 20\n"
time="2017-07-14T10:44:52Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 21\n"
time="2017-07-14T10:45:03Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 22\n"
2017-07-14_10:45:13: PX-Watchdog: (pid 7903): PX REST server died or did not started. return code 7. Timeout 3600
time="2017-07-14T10:45:13Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 23\n"
time="2017-07-14T10:45:24Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 24\n"
time="2017-07-14T10:45:34Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 25\n"
2017-07-14_10:45:44: PX-Watchdog: Waiting for px process to start
root 7903 7830 99 10:41 ? 00:05:10 /usr/local/bin/px -daemon
2017-07-14_10:45:44: PX-Watchdog: (pid 7903): Begin monitoring
time="2017-07-14T10:45:45Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 26\n"
time="2017-07-14T10:45:55Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 27\n"
time="2017-07-14T10:46:06Z" level=error msg="kvdb deadline exceeded error: context deadline exceeded, retry count: 28\n"
apiVersion: v1
kind: List
items:
- apiVersion: v1
  kind: Namespace
  metadata:
    name: etcd-operator
- apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: etcd-operator
    namespace: etcd-operator
- apiVersion: rbac.authorization.k8s.io/v1beta1
  kind: ClusterRole
  metadata:
    name: etcd-operator
  rules:
  - apiGroups:
    - etcd.coreos.com
    resources:
    - clusters
    verbs:
    - "*"
  - apiGroups:
    - extensions
    resources:
    - thirdpartyresources
    verbs:
    - "*"
  - apiGroups:
    - storage.k8s.io
    resources:
    - storageclasses
    verbs:
    - "*"
  - apiGroups:
    - ""
    resources:
    - pods
    - services
    - endpoints
    - persistentvolumeclaims
    - events
    verbs:
    - "*"
  - apiGroups:
    - apps
    resources:
    - deployments
    verbs:
    - "*"
  - apiGroups:
    - ""
    resources:
    - secrets
    - configmaps
    verbs:
    - get
- apiVersion: rbac.authorization.k8s.io/v1beta1
  kind: ClusterRoleBinding
  metadata:
    name: etcd-operator
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: etcd-operator
  subjects:
  - kind: ServiceAccount
    name: etcd-operator
    namespace: etcd-operator
- apiVersion: extensions/v1beta1
  kind: Deployment
  metadata:
    name: etcd-operator
    namespace: etcd-operator
  spec:
    replicas: 1
    template:
      metadata:
        labels:
          name: etcd-operator
      spec:
        serviceAccountName: etcd-operator
        containers:
        - name: etcd-operator
          image: quay.io/coreos/etcd-operator:v0.3.3
          env:
          - name: MY_POD_NAMESPACE
            value: etcd-operator
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
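A quick way to check that the operator came up after step 6 below (a minimal kubectl sketch, using only the names defined in the manifest above):

    kubectl apply -f etcd-operator.yaml
    # the Deployment and its single replica live in the etcd-operator namespace created above
    kubectl -n etcd-operator get deploy,pods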
  1. Create DNS (a Route 53 hosted zone that kops can use for the cluster domain)
  2. Create the S3 bucket for the kops state store
    aws s3api create-bucket --bucket container-solutions-maarten-portworx-kops
  3. Edit & source settings.sh (see the bottom of this gist)
  4. Create the cluster
    export NODE_SIZE=m4.large
    export MASTER_SIZE=m4.large
    export ZONES="eu-west-1a,eu-west-1b,eu-west-1c"
    kops create cluster "$NAME" --node-count 9 --zones $ZONES --node-size $NODE_SIZE --master-size $MASTER_SIZE --master-zones $ZONES --authorization RBAC
  5. Validate that the cluster is up and running
  6. Create the etcd operator: kubectl apply -f etcd-operator.yaml
  7. Create the etcd cluster: kubectl apply -f portworx-etcd.yaml
  8. Get the IP address of one of the etcd nodes:
    a. kubectl run --restart Never --rm -ti dns-lookup-etc-cluster --image busybox
    b. nslookup portworx.etcd-operator.svc.cluster.local | grep portworx-0001 | awk -F' ' '{print $3}'
    c. exit
  9. Edit portworx.yml (CTRL+F for etcd://) and fill in the IP from step 8 (steps 8-10 are combined in the sketch after this list)
  10. kubectl apply -f portworx.yml
  11. Wait some time, then collect the error messages (shown at the top of this gist): kubectl get pods -n kube-system | grep portworx | awk -F' ' '{print $1}' | xargs -I'{}' kubectl logs {} -n kube-system
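As referenced in step 9, here is a non-interactive sketch of steps 8-10. It assumes busybox's nslookup prints the address in the third field (as the awk in step 8 relies on) and that the only etcd:// string in portworx.yml is the -k argument; the pod name dns-lookup-etcd is just an illustrative choice.

    #!/usr/bin/env bash
    set -euo pipefail

    # Step 8: resolve the portworx-0001 etcd member from inside the cluster.
    ETCD_IP=$(kubectl run --restart=Never --rm -i dns-lookup-etcd --image=busybox -- \
        nslookup portworx.etcd-operator.svc.cluster.local \
      | grep portworx-0001 | awk '{print $3}')

    # Step 9: point the -k argument of the DaemonSet at that member.
    sed -i "s|etcd://[^\"]*|etcd://${ETCD_IP}:2380|" portworx.yml

    # Step 10: deploy Portworx.
    kubectl apply -f portworx.yml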
apiVersion: "etcd.coreos.com/v1beta1"
kind: "Cluster"
metadata:
name: "portworx"
namespace: "etcd-operator"
spec:
size: 3
version: "3.1.8"
apiVersion: v1
kind: ServiceAccount
metadata:
  name: px-account
  namespace: kube-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: node-get-put-list-role
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "update", "list"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: node-role-binding
subjects:
- apiVersion: v1
  kind: ServiceAccount
  name: px-account
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: node-get-put-list-role
  apiGroup: rbac.authorization.k8s.io
---
kind: Service
apiVersion: v1
metadata:
  name: portworx-service
  namespace: kube-system
spec:
  selector:
    name: portworx
  ports:
  - protocol: TCP
    port: 9001
    targetPort: 9001
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: portworx
  namespace: kube-system
spec:
  minReadySeconds: 0
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        name: portworx
    spec:
      hostNetwork: true
      hostPID: true
      containers:
      - name: portworx
        image: portworx/px-enterprise:1.2.8
        terminationMessagePath: "/tmp/px-termination-log"
        imagePullPolicy: Always
        args:
          ["-k etcd://100.96.6.3:2380",
           "-c mycluster",
           "",
           "",
           "-a -f",
           "",
           "",
           "",
           "",
           "",
           "",
           "-x", "kubernetes"]
        livenessProbe:
          initialDelaySeconds: 840 # allow image pull in slow networks
          httpGet:
            host: 127.0.0.1
            path: /status
            port: 9001
        readinessProbe:
          periodSeconds: 10
          httpGet:
            host: 127.0.0.1
            path: /status
            port: 9001
        securityContext:
          privileged: true
        volumeMounts:
        - name: dockersock
          mountPath: /var/run/docker.sock
        - name: libosd
          mountPath: /var/lib/osd:shared
        - name: dev
          mountPath: /dev
        - name: etcpwx
          mountPath: /etc/pwx/
        - name: optpwx
          mountPath: /export_bin:shared
        - name: cores
          mountPath: /var/cores
        - name: kubelet
          mountPath: /var/lib/kubelet:shared
        - name: src
          mountPath: /usr/src
        - name: dockerplugins
          mountPath: /run/docker/plugins
      initContainers:
      - name: px-init
        image: portworx/px-init
        terminationMessagePath: "/tmp/px-init-termination-log"
        securityContext:
          privileged: true
        volumeMounts:
        - name: hostproc
          mountPath: /media/host/proc
      restartPolicy: Always
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      serviceAccountName: px-account
      volumes:
      - name: libosd
        hostPath:
          path: /var/lib/osd
      - name: dev
        hostPath:
          path: /dev
      - name: etcpwx
        hostPath:
          path: /etc/pwx
      - name: optpwx
        hostPath:
          path: /opt/pwx/bin
      - name: cores
        hostPath:
          path: /var/cores
      - name: kubelet
        hostPath:
          path: /var/lib/kubelet
      - name: src
        hostPath:
          path: /usr/src
      - name: dockerplugins
        hostPath:
          path: /run/docker/plugins
      - name: dockersock
        hostPath:
          path: /var/run/docker.sock
      - name: hostproc
        hostPath:
          path: /proc
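After step 10, the DaemonSet status and the per-node pods can be checked with the name=portworx label from the template above; this is just a convenience next to the log-collection command in step 11:

    kubectl -n kube-system get ds portworx
    kubectl -n kube-system get pods -l name=portworx -o wide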
export NAME=portworx-blog.cs.maarten-hoogendoorn.nl
export KOPS_STATE_STORE=s3://container-solutions-maarten-portworx-kops
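These two exports are what step 3 sources before any kops command; a minimal usage sketch:

    # edit the values first, then load them into the current shell
    source settings.sh
    echo "cluster: $NAME, state store: $KOPS_STATE_STORE"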