YARN Capacity Scheduler: Queue Configuration

Author: admin

June 11, 2024

Structure and example
Example configuration:
yarn.scheduler.capacity.maximum-am-resource-percent=0.2
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.acl_submit_applications=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.minimum-user-limit-percent=1
yarn.scheduler.capacity.root.production.acl_administer_jobs=*
yarn.scheduler.capacity.root.production.acl_submit_applications=*
yarn.scheduler.capacity.root.production.capacity=80
yarn.scheduler.capacity.root.production.maximum-capacity=100
yarn.scheduler.capacity.root.production.state=RUNNING
yarn.scheduler.capacity.root.production.user-limit-factor=5
yarn.scheduler.capacity.root.queues=production,workflow
yarn.scheduler.capacity.root.workflow.acl_administer_jobs=*
yarn.scheduler.capacity.root.workflow.acl_submit_applications=*
yarn.scheduler.capacity.root.workflow.capacity=20
yarn.scheduler.capacity.root.workflow.hive.acl_administer_jobs=*
yarn.scheduler.capacity.root.workflow.hive.acl_submit_applications=admin
yarn.scheduler.capacity.root.workflow.hive.capacity=50
yarn.scheduler.capacity.root.workflow.hive.maximum-capacity=100
yarn.scheduler.capacity.root.workflow.hive.state=RUNNING
yarn.scheduler.capacity.root.workflow.hive.user-limit-factor=5
yarn.scheduler.capacity.root.workflow.maximum-capacity=100
yarn.scheduler.capacity.root.workflow.queues=hive,spark
yarn.scheduler.capacity.root.workflow.spark.acl_administer_jobs=*
yarn.scheduler.capacity.root.workflow.spark.acl_submit_applications=*
yarn.scheduler.capacity.root.workflow.spark.capacity=50
yarn.scheduler.capacity.root.workflow.spark.maximum-capacity=100
yarn.scheduler.capacity.root.workflow.spark.state=RUNNING
yarn.scheduler.capacity.root.workflow.spark.user-limit-factor=5
yarn.scheduler.capacity.root.workflow.state=RUNNING
yarn.scheduler.capacity.root.workflow.user-limit-factor=5
yarn.scheduler.capacity.schedule-asynchronously.enable=true
yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=1
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=10
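One rule that is easy to break when editing a listing like the one above is that the capacities of the queues directly under one parent have to add up to 100. Here is a minimal sketch of a check for that, using only the queue lists and capacities from the example. The helper names `parse_properties` and `check_capacity_sums` are made up for illustration and are not part of any Hadoop tool.

```python
PREFIX = "yarn.scheduler.capacity."

# Subset of the example configuration above (queue lists and capacities only).
CONFIG = """
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.queues=production,workflow
yarn.scheduler.capacity.root.production.capacity=80
yarn.scheduler.capacity.root.workflow.capacity=20
yarn.scheduler.capacity.root.workflow.queues=hive,spark
yarn.scheduler.capacity.root.workflow.hive.capacity=50
yarn.scheduler.capacity.root.workflow.spark.capacity=50
"""


def parse_properties(text):
    """Parse key=value lines into a dict, skipping blank lines and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_capacity_sums(props):
    """Return {parent queue path: sum of its children's capacities}."""
    sums = {}
    for key, value in props.items():
        if key.startswith(PREFIX) and key.endswith(".queues"):
            parent = key[len(PREFIX):-len(".queues")]
            children = [c.strip() for c in value.split(",") if c.strip()]
            sums[parent] = sum(
                float(props[f"{PREFIX}{parent}.{child}.capacity"])
                for child in children
            )
    return sums


sums = check_capacity_sums(parse_properties(CONFIG))
for parent, total in sums.items():
    status = "ok" if abs(total - 100.0) < 1e-6 else "MISMATCH"
    print(f"{parent}: children sum to {total:g} ({status})")
```

For the example this reports that both `root` (80 + 20) and `root.workflow` (50 + 50) are consistent.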
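Usage notes:
- By default, YARN jobs are assigned to the production queue.
- During development you can specify the queue a job runs in:
  - set mapred.job.queue.name=hive;
Queue parameter settings for cluster components
- Default queue setting for Hive
- Default queue setting for Spark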
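The post does not list the actual component settings, so the following is only a sketch of what they might look like. The property names are the usual ones for each engine: `mapreduce.job.queuename` (the current name for the older `mapred.job.queue.name`), `tez.queue.name` for Hive on Tez, and `spark.yarn.queue` for Spark on YARN. The queue values `hive` and `spark` are assumptions taken from the example queue tree, and where each line belongs (hive-site, spark-defaults, or a management UI) depends on the distribution.

```python
# Property names are the standard ones for each engine; the queue values
# are assumptions based on the example queue tree in this post.
DEFAULT_QUEUE_SETTINGS = {
    "Hive on MapReduce": ("mapreduce.job.queuename", "hive"),
    "Hive on Tez": ("tez.queue.name", "hive"),
    "Spark on YARN": ("spark.yarn.queue", "spark"),
}


def render_settings(settings):
    """Render each component's default-queue property as a key=value line."""
    return [
        f"{prop}={queue}  # {component}"
        for component, (prop, queue) in settings.items()
    ]


lines = render_settings(DEFAULT_QUEUE_SETTINGS)
print("\n".join(lines))
```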
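YARN Capacity Scheduler parameter descriptions and examples
yarn.scheduler.capacity.default.minimum-user-limit-percent=1 Minimum share of the queue's resources, as a percentage, guaranteed to each user when there is demand. This is a per-user floor, not a per-job one.
yarn.scheduler.capacity.maximum-am-resource-percent=0.2 Maximum fraction of resources that can be used to run ApplicationMasters, which in turn limits the number of concurrently active applications. The default is 0.1 (10%).
yarn.scheduler.capacity.maximum-applications=10000 Maximum number of applications that can be running or pending in the system at the same time. The default is 10000.
yarn.scheduler.capacity.node-locality-delay=40 Number of missed scheduling opportunities after which the scheduler stops waiting for a node-local container and tries a rack-local one. It is usually tied to the number of nodes in the cluster; the default is 40 (roughly the number of nodes in one rack).
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator Method used to compare resources. The value shown, DefaultResourceCalculator, is the default and considers memory only.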
yarn.scheduler.capacity.root.accessible-node-labels=*
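yarn.scheduler.capacity.root.acl_administer_queue=* Which users or groups can administer the queue.
yarn.scheduler.capacity.root.acl_submit_applications=* ACL that controls who can submit applications to the queue. A user who can submit to a queue can also submit to its child queues.
yarn.scheduler.capacity.root.capacity=100 Queue capacity as a percentage of the cluster's resources. The capacities of all queues directly under the same parent must sum to 100.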
yarn.scheduler.capacity.root.queues=default,workflow
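yarn.scheduler.capacity.root.default.acl_administer_jobs=* Which users or groups can administer the queue.
yarn.scheduler.capacity.root.default.acl_submit_applications=* Which users or groups can submit applications.
yarn.scheduler.capacity.root.default.capacity=20 Queue capacity as a percentage of the cluster's resources. The capacities of all queues directly under the same parent must sum to 100.
yarn.scheduler.capacity.root.default.maximum-capacity=100 Elasticity setting: the maximum percentage of resources the queue may grow to.
yarn.scheduler.capacity.root.default.state=RUNNING Queue state; either RUNNING or STOPPED.
yarn.scheduler.capacity.root.default.user-limit-factor=1 Multiple of the queue's capacity that a single user may consume. With a value of 1, one user can never use more than the queue's configured capacity, however idle the cluster is. The per-user minimum guarantee is set by minimum-user-limit-percent, not by this property.
yarn.scheduler.capacity.root.workflow.acl_administer_jobs=* Which users or groups can administer the queue.
yarn.scheduler.capacity.root.workflow.acl_submit_applications=* Which users or groups can submit applications.
yarn.scheduler.capacity.root.workflow.capacity=80 Queue capacity as a percentage of the cluster's resources. The capacities of all queues directly under the same parent must sum to 100.
yarn.scheduler.capacity.root.workflow.maximum-capacity=100 Elasticity setting: the maximum percentage of resources the queue may grow to.
yarn.scheduler.capacity.root.workflow.state=RUNNING Queue state; either RUNNING or STOPPED.
yarn.scheduler.capacity.root.workflow.user-limit-factor=1 Multiple of the queue's capacity that a single user may consume. With a value of 1, one user can never use more than the queue's configured capacity, however idle the cluster is. The per-user minimum guarantee is set by minimum-user-limit-percent, not by this property.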
yarn.scheduler.capacity.root.workflow.queues=hive,impala,spark,flink
yarn.scheduler.capacity.schedule-asynchronously.enable=true
yarn.scheduler.capacity.schedule-asynchronously.maximum-threads=1
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms=10
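The two per-user settings, user-limit-factor and minimum-user-limit-percent, are easy to mix up, so here is a rough arithmetic sketch of how they interact. It is a simplified model, not the scheduler's actual code: it ignores rounding to container sizes, preemption, and node labels, and both function names are made up for illustration.

```python
def max_user_share_of_cluster(capacity_pct, max_capacity_pct, user_limit_factor):
    """Ceiling on one user's share of the cluster, in percent.

    A single user may take up to capacity * user-limit-factor,
    but never more than the queue's maximum-capacity.
    """
    return min(capacity_pct * user_limit_factor, max_capacity_pct)


def per_user_share_of_queue(active_users, minimum_user_limit_percent):
    """Share of the queue each active user is held to when all have demand.

    The queue is split evenly among active users, but the per-user
    share is not pushed below minimum-user-limit-percent.
    """
    return max(100.0 / active_users, minimum_user_limit_percent)


# root.default above: capacity=20, maximum-capacity=100, user-limit-factor=1
print(max_user_share_of_cluster(20, 100, 1))   # one user is capped at 20% of the cluster

# A queue with capacity=80, maximum-capacity=100, user-limit-factor=5
print(max_user_share_of_cluster(80, 100, 5))   # capped by maximum-capacity: 100%

# minimum-user-limit-percent=25 with 2 and then 5 active users
print(per_user_share_of_queue(2, 25))          # 50.0% of the queue each
print(per_user_share_of_queue(5, 25))          # floor applies: 25% each
```

With user-limit-factor=1 a lone user cannot make use of the queue's elasticity even when the rest of the cluster is idle, which is why busy single-tenant queues often set a larger factor.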
root
|-- workflow 0.8
|   |-- hive 0.4
|   |-- impala 0.3
|   |-- spark 0.2
|   |-- flink 0.1
|-- default 0.2
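Each number in the tree is a fraction of the queue's parent, so a leaf queue's share of the whole cluster is the product of the fractions along its path. Here is a small sketch of that calculation, assuming the child fractions under workflow are relative to workflow (they sum to 1.0). The nested-dict layout and the function name are illustrative choices.

```python
# Queue tree from the post: name -> (fraction of parent, children).
QUEUE_TREE = {
    "workflow": (0.8, {
        "hive": (0.4, {}),
        "impala": (0.3, {}),
        "spark": (0.2, {}),
        "flink": (0.1, {}),
    }),
    "default": (0.2, {}),
}


def absolute_capacities(tree, parent_path="root", parent_share=1.0):
    """Return {queue path: share of the whole cluster} for every queue."""
    result = {}
    for name, (fraction, children) in tree.items():
        path = f"{parent_path}.{name}"
        share = parent_share * fraction
        result[path] = share
        result.update(absolute_capacities(children, path, share))
    return result


caps = absolute_capacities(QUEUE_TREE)
for path, share in caps.items():
    print(f"{path}: {share:.0%} of the cluster")
```

Under that assumption hive ends up with 32% of the cluster, impala 24%, spark 16%, flink 8%, and default 20%.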
YARN queue information for the dev environment
http://139.159.150.130:31013/cluster/scheduler?openQueues=default

Author: admin (张宴银, big data development engineer)
