Creating subclusters and node groups within YARN queues using node labels.

  1. Create directories in HDFS for node labels
hadoop fs -mkdir -p /yarn/node-labels
hadoop fs -chown -R yarn:yarn /yarn
hadoop fs -chmod -R 700 /yarn
hadoop fs -mkdir -p /user/yarn
hadoop fs -chown -R yarn:yarn /user/yarn
hadoop fs -chmod -R 700 /user/yarn
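A quick sanity check that the directories exist with the intended ownership (optional):
hadoop fs -ls /yarn /user/yarn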
  2. Enable node labels and set the label store's HDFS location in the YARN config:
yarn.node-labels.fs-store.root-dir=hdfs://mycluster:8020/yarn/node-labels
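The store path only takes effect once node labels are themselves enabled; a minimal pair of yarn-site.xml settings, reusing the same path as above:
yarn.node-labels.enabled=true
yarn.node-labels.fs-store.root-dir=hdfs://mycluster:8020/yarn/node-labels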
  3. Create labels (these should probably be machine names or machine groups; I called them exclusive/shared)
yarn rmadmin -addToClusterNodeLabels "exclusive(exclusive=true),shared(exclusive=false)"
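The registered labels can be read back from the ResourceManager to confirm the step:
yarn cluster --list-node-labels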

  4. Assign labels to nodes
yarn rmadmin -replaceLabelsOnNode "node3.hadoop.local=exclusive node4.hadoop.local=shared"
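To verify the mapping, the node report includes a Node-Labels field (the NodeManager port shown here, 45454, is an assumption and varies by distribution):
yarn node -status node3.hadoop.local:45454 | grep -i label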
  5. Configure the capacity scheduler XML (tweak these settings to tune the cluster, especially the per-user limits):
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.queue-mappings-override.enable=false
yarn.scheduler.capacity.root.accessible-node-labels=shared,exclusive
yarn.scheduler.capacity.root.accessible-node-labels.exclusive.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.shared.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels=*
yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.capacity=0
yarn.scheduler.capacity.root.default.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels.shared.capacity=50
yarn.scheduler.capacity.root.default.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.default.default-node-label-expression=shared
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=2
yarn.scheduler.capacity.root.hive1.accessible-node-labels=*
yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.capacity=100
yarn.scheduler.capacity.root.hive1.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.capacity=25
yarn.scheduler.capacity.root.hive1.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.acl_administer_queue=*
yarn.scheduler.capacity.root.hive1.acl_submit_applications=*
yarn.scheduler.capacity.root.hive1.capacity=25
yarn.scheduler.capacity.root.hive1.default-node-label-expression=exclusive
yarn.scheduler.capacity.root.hive1.maximum-capacity=100
yarn.scheduler.capacity.root.hive1.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.hive1.ordering-policy=fair
yarn.scheduler.capacity.root.hive1.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.hive1.state=RUNNING
yarn.scheduler.capacity.root.hive1.user-limit-factor=4
yarn.scheduler.capacity.root.hive2.accessible-node-labels=*
yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.capacity=0
yarn.scheduler.capacity.root.hive2.accessible-node-labels.exclusive.maximum-capacity=100
yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.capacity=25
yarn.scheduler.capacity.root.hive2.accessible-node-labels.shared.maximum-capacity=100
yarn.scheduler.capacity.root.hive2.acl_administer_queue=*
yarn.scheduler.capacity.root.hive2.acl_submit_applications=*
yarn.scheduler.capacity.root.hive2.capacity=25
yarn.scheduler.capacity.root.hive2.default-node-label-expression=shared
yarn.scheduler.capacity.root.hive2.maximum-capacity=25
yarn.scheduler.capacity.root.hive2.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.hive2.ordering-policy=fair
yarn.scheduler.capacity.root.hive2.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.hive2.state=RUNNING
yarn.scheduler.capacity.root.hive2.user-limit-factor=4
yarn.scheduler.capacity.root.queues=default,hive1,hive2
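When editing capacity-scheduler.xml by hand rather than through a management UI, each key=value pair above becomes a <property> element; one entry as a sketch, the rest follow the same pattern:
<property>
  <name>yarn.scheduler.capacity.root.hive1.default-node-label-expression</name>
  <value>exclusive</value>
</property>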
  6. Refresh the YARN queues / restart YARN / restart Hive.
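If only the scheduler settings changed, a queue refresh is usually enough on its own:
yarn rmadmin -refreshQueues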
  7. Set the Hive default queues to hive1,hive2, set the number of default Tez sessions to 2, and restart Hive.
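These are most likely the HiveServer2/Tez properties involved (my reading of the step; adjust to your Hive version):
hive.server2.tez.default.queues=hive1,hive2
hive.server2.tez.sessions.per.default.queue=2
hive.server2.tez.initialize.default.sessions=true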
  8. Test the node assignments by submitting jobs to each queue (a sample run follows).
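A simple smoke test is the bundled MapReduce pi example submitted to each queue in turn, then checking in the ResourceManager UI which hosts the containers landed on (the examples-jar path is an assumption and depends on the install):
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi -Dmapreduce.job.queuename=hive1 10 100
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi -Dmapreduce.job.queuename=hive2 10 100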

http://spark.apache.org/docs/latest/running-on-yarn.html

Spark Properties

spark.yarn.am.nodeLabelExpression
spark.yarn.executor.nodeLabelExpression
spark.yarn.tags
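For example, to pin the Spark AM to the shared label and the executors to the exclusive label (the application class and jar are placeholders), something like:
spark-submit \
  --master yarn \
  --conf spark.yarn.am.nodeLabelExpression=shared \
  --conf spark.yarn.executor.nodeLabelExpression=exclusive \
  --class com.example.MyApp myapp.jar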

https://community.hortonworks.com/articles/11434/yarn-node-labels-1.html
http://www.slideshare.net/Hadoop_Summit/node-labels-in-yarn-49792443
