source: openkb.info
The Oozie Launcher job is a map-only job that starts the jobs which do the real work, e.g. Hive, MR, Pig, etc.
(Oozie Launcher MR job [AM/Mapper Container(Hive CLI)])
. -> (MR job-1 spawned by Hive query(stage0) [AM/Mapper/Reducer Container])
. -> (MR job-2 spawned by Hive query(stage1) [AM/Mapper/Reducer Container])
The container size of the Oozie Launcher job is controlled by the following 4 parameters, set in workflow.xml for each Oozie job:
oozie.launcher.mapreduce.map.memory.mb
oozie.launcher.mapreduce.map.java.opts
oozie.launcher.yarn.app.mapreduce.am.resource.mb
oozie.launcher.yarn.app.mapreduce.am.command-opts
The algorithm is in the Oozie source code:
core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
// memory.mb
int launcherMapMemoryMB = launcherConf.getInt(HADOOP_MAP_MEMORY_MB, 1536);
int amMemoryMB = launcherConf.getInt(YARN_AM_RESOURCE_MB, 1536);
// YARN_MEMORY_MB_MIN to provide buffer.
// suppose launcher map aggressively use high memory, need some
// headroom for AM
int memoryMB = Math.max(launcherMapMemoryMB, amMemoryMB) + YARN_MEMORY_MB_MIN;
// limit to 4096 in case of 32 bit
if (launcherMapMemoryMB < 4096 && amMemoryMB < 4096 && memoryMB > 4096) {
    memoryMB = 4096;
}
launcherConf.setInt(YARN_AM_RESOURCE_MB, memoryMB);
// We already made mapred.child.java.opts and
// mapreduce.map.java.opts equal, so just start with one of them
String launcherMapOpts = launcherConf.get(HADOOP_MAP_JAVA_OPTS, "");
String amChildOpts = launcherConf.get(YARN_AM_COMMAND_OPTS);
StringBuilder optsStr = new StringBuilder();
int heapSizeForMap = extractHeapSizeMB(launcherMapOpts);
int heapSizeForAm = extractHeapSizeMB(amChildOpts);
int heapSize = Math.max(heapSizeForMap, heapSizeForAm) + YARN_MEMORY_MB_MIN;
// limit to 3584 in case of 32 bit
if (heapSizeForMap < 4096 && heapSizeForAm < 4096 && heapSize > 3584) {
    heapSize = 3584;
}
if (amChildOpts != null) {
    optsStr.append(amChildOpts);
}
optsStr.append(" ").append(launcherMapOpts.trim());
if (heapSize > 0) {
    // append calculated total heap size to the end
    optsStr.append(" ").append("-Xmx").append(heapSize).append("m");
}
launcherConf.set(YARN_AM_COMMAND_OPTS, optsStr.toString().trim());
In the above code, YARN_MEMORY_MB_MIN=512.
For memory.mb:
max(oozie.launcher.mapreduce.map.memory.mb,oozie.launcher.yarn.app.mapreduce.am.resource.mb)+512
For Java opts (the -Xmx heap size):
max(-Xmx of oozie.launcher.mapreduce.map.java.opts,-Xmx of oozie.launcher.yarn.app.mapreduce.am.command-opts)+512
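The two formulas, including the 4096 and 3584 caps from the source excerpt, can be sketched as a small standalone Java class. This is only an illustration: the class name LauncherAmSizing and the helper heapMB are hypothetical, and heapMB is a simplified stand-in for Oozie's extractHeapSizeMB that assumes the heap is given as -Xmx<number>m or -Xmx<number>g.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the sizing logic quoted from JavaActionExecutor.java.
class LauncherAmSizing {
    static final int YARN_MEMORY_MB_MIN = 512;
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([mMgG])");

    // Simplified stand-in for extractHeapSizeMB (assumption): returns the
    // last -Xmx value found in the option string, in MB, or 0 if none.
    static int heapMB(String opts) {
        if (opts == null) {
            return 0;
        }
        int mb = 0;
        Matcher m = XMX.matcher(opts);
        while (m.find()) {
            int n = Integer.parseInt(m.group(1));
            mb = m.group(2).equalsIgnoreCase("g") ? n * 1024 : n;
        }
        return mb;
    }

    // memory.mb requested for the launcher AM container, before YARN rounding.
    static int amResourceMB(int mapMemoryMB, int amMemoryMB) {
        int memoryMB = Math.max(mapMemoryMB, amMemoryMB) + YARN_MEMORY_MB_MIN;
        if (mapMemoryMB < 4096 && amMemoryMB < 4096 && memoryMB > 4096) {
            memoryMB = 4096;
        }
        return memoryMB;
    }

    // Heap size (MB) appended as the final -Xmx of the AM command opts.
    static int amHeapMB(String mapOpts, String amOpts) {
        int heapForMap = heapMB(mapOpts);
        int heapForAm = heapMB(amOpts);
        int heap = Math.max(heapForMap, heapForAm) + YARN_MEMORY_MB_MIN;
        if (heapForMap < 4096 && heapForAm < 4096 && heap > 3584) {
            heap = 3584;
        }
        return heap;
    }

    public static void main(String[] args) {
        System.out.println(amResourceMB(1024, 2048));           // 2560
        System.out.println(amHeapMB("-Xmx777m", "-Xmx1111m"));  // 1623
    }
}
```

Note that amResourceMB returns the value Oozie requests; the container size actually granted may be larger once YARN normalizes the request.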
Examples:
1. Set the following in workflow.xml:
<property>
<name>oozie.launcher.mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>oozie.launcher.mapreduce.map.java.opts</name>
<value>-Xmx777m</value>
</property>
<property>
<name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
<value>2048</value>
</property>
<property>
<name>oozie.launcher.yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx1111m</value>
</property>
The actual container size for the Oozie Launcher job is: (3072mb,-Xmx1623m).
Here memory.mb=3072 because max(1024,2048)+512=2560, which YARN rounds up to 3072, the next multiple of yarn.scheduler.minimum-allocation-mb=1024. The heap is -Xmx1623m because max(777,1111)+512=1623.
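The rounding step can be checked with a few lines of Java. This is a minimal sketch that assumes the scheduler rounds each request up to the next multiple of yarn.scheduler.minimum-allocation-mb, as in the example above; the class and method names are hypothetical, and other scheduler configurations may normalize differently.

```java
// Hypothetical helper: round a memory request up to the next multiple
// of the scheduler's minimum allocation (assumed normalization rule).
class YarnRounding {
    static int roundUp(int requestedMB, int minAllocationMB) {
        return ((requestedMB + minAllocationMB - 1) / minAllocationMB) * minAllocationMB;
    }

    public static void main(String[] args) {
        System.out.println(roundUp(2560, 1024)); // 3072
        System.out.println(roundUp(4096, 1024)); // 4096 (already a multiple)
    }
}
```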
2. Set the following in workflow.xml:
<property>
<name>oozie.launcher.mapreduce.map.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>oozie.launcher.mapreduce.map.java.opts</name>
<value>-Xmx777m</value>
</property>
<property>
<name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
<value>2048</value>
</property>
<property>
<name>oozie.launcher.yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx1111m</value>
</property>
The actual container size for the Oozie Launcher job is: (4096mb,-Xmx1623m).
Here memory.mb=4096 because max(3072,2048)+512=3584, which YARN rounds up to 4096.
Do not blindly trust the configuration page, because multiple sources can control the same setting.
Taking example 2 above:
To check the actual memory.mb, start with the RM log:
Assigned container container_e04_1468279966583_0020_01_000001 of capacity <memory:4096, vCores:1, disks:0.0>
To check the actual Java opts, run "ps -ef" on the NM while the Oozie Launcher job is running:
v7: mapr 18959 18948 99 19:36 ? 00:00:04 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.91-1.b14.el6.x86_64/jre/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/opt/mapr/hadoop/hadoop-2.7.0/logs/userlogs/application_1468279966583_0020/container_e04_1468279966583_0020_01_000001 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Xmx1024m -Xmx200m -Xmx1111m -Xmx1623m -Djava.io.tmpdir=./tmp org.apache.hadoop.mapreduce.v2.app.MRAppMaster
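The command line above carries several -Xmx flags. When reading such output, it helps to list them all and look at the last one: HotSpot JVMs generally honor the last occurrence of a repeated option, though that is an assumption worth confirming on your own JVM. A minimal sketch, with the hypothetical class name XmxFlags:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a java command line taken from "ps -ef".
class XmxFlags {
    private static final Pattern XMX = Pattern.compile("-Xmx\\d+[kKmMgG]?");

    // Every -Xmx flag in the command line, in order of appearance.
    static List<String> all(String commandLine) {
        List<String> flags = new ArrayList<>();
        Matcher m = XMX.matcher(commandLine);
        while (m.find()) {
            flags.add(m.group());
        }
        return flags;
    }

    // The last -Xmx flag (assumed to be the one the JVM applies), or null.
    static String effective(String commandLine) {
        List<String> flags = all(commandLine);
        return flags.isEmpty() ? null : flags.get(flags.size() - 1);
    }

    public static void main(String[] args) {
        String cmd = "java -Xmx1024m -Xmx200m -Xmx1111m -Xmx1623m -Djava.io.tmpdir=./tmp "
                + "org.apache.hadoop.mapreduce.v2.app.MRAppMaster";
        System.out.println(all(cmd));       // [-Xmx1024m, -Xmx200m, -Xmx1111m, -Xmx1623m]
        System.out.println(effective(cmd)); // -Xmx1623m
    }
}
```

For the process shown above, the last flag is -Xmx1623m, which matches the heap size computed by Oozie.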
- When an Oozie job fails with "OutOfMemory", first determine whether the failure is in the Oozie Launcher job or in an MR job spawned by the Hadoop component.
- Know how to verify the memory.mb and Java opts of the Oozie Launcher job at runtime.
Actually, it worked for me after removing the oozie.launcher prefix; for example, I used 'mapreduce.map.java.opts' instead of 'oozie.launcher.mapreduce.map.java.opts'.