@polvi
Created May 3, 2017 23:53
HDFS of Kubernetes

Easiest HDFS cluster in the world, with Kubernetes.

Inspired by kimoonkim/kubernetes-HDFS.

kubectl create -f namenode.yaml
kubectl create -f datanode.yaml
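
If you want to confirm both pods came up before poking at the UI, plain kubectl should do it (the app labels come from the YAML files at the bottom of this gist):

kubectl get statefulsets
kubectl get pods -l app=hdfs-namenode
kubectl get pods -l app=hdfs-datanode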

Set up a port-forward so you can see it is alive:

kubectl port-forward hdfs-namenode-0 50070:50070

Then hit this URL in your browser to check out the datanodes:

http://localhost:50070/dfshealth.html#tab-datanode

You should see a datanode list with one node in it.
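
If you'd rather check from the terminal, the namenode's JMX endpoint (part of the stock Hadoop 2.7 web server, reachable through the same port-forward) should report the live datanode count; look for NumLiveDataNodes in the JSON it returns:

curl -s 'http://localhost:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState'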

Back in your console, scale it up!!

kubectl scale statefulset hdfs-datanode --replicas=3

Refresh your browser. Bada boom!

http://localhost:50070/dfshealth.html#tab-datanode

You should see a datanode list with three nodes in it.
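
The cluster-side view should agree: dfsadmin run inside the namenode pod prints a "Live datanodes" summary, which should now show three.

kubectl exec hdfs-namenode-0 -- hdfs dfsadmin -report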

Now start hadoop'n

kubectl exec -ti hdfs-namenode-0 -- /bin/bash
root@hdfs-namenode-0:/# hadoop fs -mkdir /tmp
root@hdfs-namenode-0:/# hadoop fs -put /bin/systemd /tmp/ # just upload the systemd binary into HDFS to see if it is working (could be any file)
root@hdfs-namenode-0:/# hadoop fs -ls /tmp
Found 1 items
-rw-r--r--   3 root supergroup    1313160 2017-05-03 21:15 /tmp/systemd
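
To double-check the round trip, you can pull the file back out of HDFS and compare checksums in the same shell (the copy's path is arbitrary, and this assumes md5sum is present in the image):

root@hdfs-namenode-0:/# hadoop fs -get /tmp/systemd /tmp/systemd-from-hdfs
root@hdfs-namenode-0:/# md5sum /bin/systemd /tmp/systemd-from-hdfs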

datanode.yaml:

# A headless service to create DNS records.
apiVersion: v1
kind: Service
metadata:
  name: hdfs-datanode
  labels:
    app: hdfs-datanode
spec:
  ports:
  - port: 50010
    name: fs
  clusterIP: None
  selector:
    app: hdfs-datanode
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: hdfs-datanode
spec:
  serviceName: "hdfs-datanode"
  replicas: 1
  template:
    metadata:
      labels:
        app: hdfs-datanode
    spec:
      containers:
      - name: datanode
        image: uhopper/hadoop-datanode:2.7.2
        env:
        - name: CORE_CONF_fs_defaultFS
          value: hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
        ports:
        - containerPort: 50010
          name: fs
      restartPolicy: Always
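
Because the service is headless, every datanode pod gets its own DNS record of the form <pod>.<service>.<namespace>.svc.cluster.local (the default namespace is assumed here, matching the namenode address above). A throwaway busybox pod is one quick way to see it resolve:

kubectl run -it --rm dns-check --image=busybox --restart=Never -- nslookup hdfs-datanode-0.hdfs-datanode.default.svc.cluster.local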

namenode.yaml:

# A headless service to create DNS records.
apiVersion: v1
kind: Service
metadata:
  name: hdfs-namenode
  labels:
    app: hdfs-namenode
spec:
  ports:
  - port: 8020
    name: fs
  clusterIP: None
  selector:
    app: hdfs-namenode
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: hdfs-namenode
spec:
  serviceName: "hdfs-namenode"
  replicas: 1
  template:
    metadata:
      labels:
        app: hdfs-namenode
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: hdfs-namenode
        image: uhopper/hadoop-namenode:2.7.2
        env:
        - name: CLUSTER_NAME
          value: hdfs-k8s
        ports:
        - containerPort: 8020
          name: fs
      restartPolicy: Always

@fspaniol

Hi,

Once I access the portal, I keep on getting this message:

NameNode is still loading. Redirecting to the Startup Progress page.

Do you know how to solve it?

Best,
Fernando

@Pl4tiNuM

You forgot to expose the port 50070 on the namenode!

Thanks a lot for your support :)
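
For reference, one way to do that, sketched against the namenode.yaml above, is to add 50070 as a second port on the Service (and a matching containerPort on the pod); note that kubectl port-forward talks to the pod directly, so the walkthrough above works even without this:

spec:
  ports:
  - port: 8020
    name: fs
  - port: 50070
    name: web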
