Creating subclusters and node groups within YARN queues using node labels.
- Create directories in HDFS for node labels
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn
hadoop fs -mkdir -p /user/yarn
hadoop fs -chown -R yarn:yarn /user/yarn
hadoop fs -chmod -R 700 /user/yarn
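A quick sanity check that the directories exist with the expected owner (yarn:yarn) and 700 permissions:
hadoop fs -ls -d /yarn/node-labels /user/yarn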
- Enable node labels and point the label store at an HDFS location in the YARN config: yarn.node-labels.fs-store.root-dir = hdfs://mycluster:8020/yarn/node-labels
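In yarn-site.xml that amounts to the two standard node-label properties (the enable flag is what "enable node labels" above refers to):
yarn.node-labels.enabled=true
yarn.node-labels.fs-store.root-dir=hdfs://mycluster:8020/yarn/node-labels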
- Create the labels (these could just as well be named after machines or machine groups; I called them exclusive/shared)
yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
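To confirm the labels registered, recent Hadoop versions can list them from the CLI:
yarn cluster --list-node-labels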
- Assign labels to nodes
yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
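To check that each node picked up its label, look at the Nodes page in the ResourceManager UI, or query a node from the CLI (node IDs are host:port, as shown by yarn node -list):
yarn node -list
yarn node -status <node-id>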
- Configure capacity-scheduler.xml (tweak these settings to tune the cluster, especially the per-user limits). Note that within each label the child-queue label capacities must sum to 100, just like the plain queue capacities: shared is 50 + 25 + 25, exclusive is 0 + 100 + 0, and the unlabeled capacities are 50 + 25 + 25.
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.queue-mappings-override.enable=false
yarn.scheduler.capacity.root.accessible-node-labels=shared,exclusive
yarn.scheduler.capacity.root.accessible-node-labels.exclusive.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.shared.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels=*
yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.capacity=0
yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels.shared.capacity=50
yarn.scheduler.capacity.root.default.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.default.default-node-label-expression=shared
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=2
yarn.scheduler.capacity.root.hive1.accessible-node-labels=*
yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.capacity=100
yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.capacity=25
yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.acl_administer_queue=*
yarn.scheduler.capacity.root.hive1.acl_submit_applications=*
yarn.scheduler.capacity.root.hive1.capacity=25
yarn.scheduler.capacity.root.hive1.default-node-label-expression=exclusive
yarn.scheduler.capacity.root.hive1.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.hive1.ordering-policy=fair
yarn.scheduler.capacity.root.hive1.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.hive1.state=RUNNING
yarn.scheduler.capacity.root.hive1.user-limit-factor=4
yarn.scheduler.capacity.root.hive2.accessible-node-labels=*
yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.capacity=0
yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.capacity=25
yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.hive2.acl_administer_queue=*
yarn.scheduler.capacity.root.hive2.acl_submit_applications=*
yarn.scheduler.capacity.root.hive2.capacity=25
yarn.scheduler.capacity.root.hive2.default-node-label-expression=shared
yarn.scheduler.capacity.root.hive2.maximum-capacity=25
yarn.scheduler.capacity.root.hive2.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.hive2.ordering-policy=fair
yarn.scheduler.capacity.root.hive2.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.hive2.state=RUNNING
yarn.scheduler.capacity.root.hive2.user-limit-factor=4
yarn.scheduler.capacity.root.queues=default,hive1,hive2
- Refresh the YARN queues / restart YARN / restart Hive
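If only capacity-scheduler.xml changed, a queue refresh on the ResourceManager is normally enough:
yarn rmadmin -refreshQueues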
- Set the Hive default queues to hive1,hive2 and the default Tez sessions to 2, then restart Hive.
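In HiveServer2 terms this most likely maps to the properties below; reading "default tez sessions = 2" as sessions-per-queue is an assumption (it may instead mean one session per queue):
hive.server2.tez.default.queues=hive1,hive2
hive.server2.tez.sessions.per.default.queue=2
hive.server2.tez.initialize.default.sessions=true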
- Test the labels by submitting jobs to each queue and watching which nodes the containers land on.
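A simple smoke test is the bundled Pi example submitted to each queue in turn (the jar path below is the HDP layout and will vary by distribution). Given the default-node-label-expression settings above, hive1 jobs should run on the exclusive node(s) and hive2/default jobs on the shared node(s):
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi -Dmapreduce.job.queuename=hive1 10 1000
yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi -Dmapreduce.job.queuename=hive2 10 1000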
Spark on YARN also supports node-label expressions; see "Spark Properties" in http://spark.apache.org/docs/latest/running-on-yarn.html:
spark.yarn.am.nodeLabelExpression
spark.yarn.executor.nodeLabelExpression
spark.yarn.tags
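These are passed as ordinary --conf options on spark-submit; a sketch using the stock SparkPi example (jar path varies by Spark version and distribution):
spark-submit --master yarn --queue hive1 \
  --conf spark.yarn.am.nodeLabelExpression=exclusive \
  --conf spark.yarn.executor.nodeLabelExpression=exclusive \
  --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples*.jar 100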
https://community.hortonworks.com/articles/11434/yarn-node-labels-1.html
http://www.slideshare.net/Hadoop_Summit/node-labels-in-yarn-49792443