@killerwhile
Last active February 1, 2016 20:33
Listing the Hadoop jars that are part of the classpath
# Before Hadoop 2.6, hadoop classpath returns wildcarded paths (starting with Hadoop 2.6, hadoop classpath --glob expands them by itself).
# It's sometimes convenient to have a nicely printed list of the jars that are part of the classpath when running hadoop jar.
for i in $(hadoop classpath | sed -e "s/:/ /g");
do
# print(...) with parentheses runs under both Python 2 and Python 3
echo "$i" | grep -E -q "\.jar$" && python -c "import os,sys; print(os.path.realpath(sys.argv[1]))" "$i"
done | sort -u
# If you want to generate the list of ArtifactIds to exclude from your packaging, use this version
for i in $(hadoop classpath | sed -e "s/:/ /g");
do
# Strip the version suffix (everything from the first "-<digit>") to keep only the artifactId
echo "$i" | grep -E -q "\.jar$" && basename "$(python -c "import os,sys; print(os.path.realpath(sys.argv[1]))" "$i")" | sed 's/-[0-9].*//' | grep -v "\.jar";
done | sort -u | paste -sd, -
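
On Hadoop 2.6 and later, where hadoop classpath --glob already expands the wildcards, the first loop can probably be reduced to a small filter. The following is a minimal sketch and not part of the original gist: the function name jars_from_classpath is hypothetical, and it assumes a readlink that supports -f (as in GNU coreutils) in place of the Python one-liner.

```shell
# Hypothetical helper: reads a colon-separated classpath on stdin and prints
# the resolved path of every entry ending in .jar, sorted and de-duplicated.
jars_from_classpath() {
  tr ':' '\n' |
    grep -E '\.jar$' |
    while IFS= read -r jar; do
      # Assumption: readlink -f is available (GNU coreutils); it resolves symlinks.
      readlink -f "$jar"
    done |
    sort -u
}

# Usage on Hadoop 2.6+:
#   hadoop classpath --glob | jars_from_classpath
```

Reading the classpath from stdin keeps the helper independent of the hadoop command, so it can be tried against any colon-separated path list.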
@killerwhile
Author

For CDH4 (CDH 4.7.1), the maven-dependency-plugin can use this configuration option:
<excludeArtifactIds>activation,ant-contrib,aopalliance,asm,avro,avro-compiler,cloudera-jets3t,commons-beanutils,commons-beanutils-core,commons-cli,commons-codec,commons-collections,commons-compress,commons-configuration,commons-daemon,commons-digester,commons-el,commons-httpclient,commons-io,commons-lang,commons-logging,commons-math,commons-net,datafu,guava,guice,guice-servlet,hadoop-annotations,hadoop-ant,hadoop-auth,hadoop-common,hadoop-core,hadoop-examples,hadoop-fairscheduler,hadoop-hdfs,hadoop-lzo-cdh4,hadoop-test,hadoop-tools,hadoop-yarn-api,hadoop-yarn-applications-distributedshell,hadoop-yarn-applications-unmanaged-am-launcher,hadoop-yarn-client,hadoop-yarn-common,hadoop-yarn-server-common,hadoop-yarn-server-nodemanager,hadoop-yarn-server-resourcemanager,hadoop-yarn-server-tests,hadoop-yarn-server-web-proxy,hadoop-yarn-site,hsqldb,hue-plugins,jackson-core-asl,jackson-jaxrs,jackson-mapper-asl,jackson-xc,jasper-compiler,jasper-runtime,javax.inject,jaxb-api,jaxb-impl,jersey-core,jersey-guice,jersey-json,jersey-server,jets3t,jettison,jetty,jetty-util,jline,jsch,jsp-api,jsr305,junit,kfs,log4j,mockito-all,netty,paranamer,parquet-avro,parquet-cascading,parquet-column,parquet-common,parquet-encoding,parquet-format,parquet-generator,parquet-hadoop,parquet-hive,parquet-pig,parquet-pig-bundle,parquet-scrooge,parquet-test-hadoop2,parquet-thrift,pig,protobuf-java,servlet-api,slf4j-api,slf4j-log4j12,snappy-java,stax-api,xmlenc,xz,zookeeper
</excludeArtifactIds>

@killerwhile
Author

For HDP 2.3.4, the packages (and versions) included in the classpath are:

package=activation,version=1.1
package=aopalliance,version=1.0
package=asm,version=3.2
package=avro,version=1.7.4
package=aws-java-sdk,version=1.7.4
package=azure-storage,version=2.2.0
package=commons-beanutils-core,version=1.8.0
package=commons-beanutils,version=1.7.0
package=commons-cli,version=1.2
package=commons-codec,version=1.4
package=commons-collections4,version=4.1
package=commons-collections,version=3.2.2
package=commons-compress,version=1.4.1
package=commons-configuration,version=1.6
package=commons-daemon,version=1.0.13
package=commons-digester,version=1.8
package=commons-httpclient,version=3.1
package=commons-io,version=2.4
package=commons-lang3,version=3.3.2
package=commons-lang,version=2.6
package=commons-logging,version=1.1.3
package=commons-math3,version=3.1.1
package=commons-net,version=3.1
package=curator-client,version=2.7.1
package=curator-framework,version=2.7.1
package=curator-recipes,version=2.7.1
package=fst,version=2.24
package=gson,version=2.2.4
package=guava,version=11.0.2
package=guice-servlet,version=3.0
package=guice,version=3.0
package=hadoop-annotations,version=2.7.1.2.3.4.0-3485
package=hadoop-ant,version=2.7.1.2.3.4.0-3485
package=hadoop-archives,version=2.7.1.2.3.4.0-3485
package=hadoop-auth,version=2.7.1.2.3.4.0-3485
package=hadoop-aws,version=2.7.1.2.3.4.0-3485
package=hadoop-azure,version=2.7.1.2.3.4.0-3485
package=hadoop-common,version=2.7.1.2.3.4.0-3485
package=hadoop-datajoin,version=2.7.1.2.3.4.0-3485
package=hadoop-distcp,version=2.7.1.2.3.4.0-3485
package=hadoop-extras,version=2.7.1.2.3.4.0-3485
package=hadoop-gridmix,version=2.7.1.2.3.4.0-3485
package=hadoop-hdfs-nfs,version=2.7.1.2.3.4.0-3485
package=hadoop-hdfs,version=2.7.1.2.3.4.0-3485
package=hadoop-lzo,version=0.6.0.2.3.4.0-3485
package=hadoop-mapreduce-client-app,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-common,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-core,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-hs-plugins,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-hs,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-jobclient,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-client-shuffle,version=2.7.1.2.3.4.0-3485
package=hadoop-mapreduce-examples,version=2.7.1.2.3.4.0-3485
package=hadoop-nfs,version=2.7.1.2.3.4.0-3485
package=hadoop-openstack,version=2.7.1.2.3.4.0-3485
package=hadoop-rumen,version=2.7.1.2.3.4.0-3485
package=hadoop-sls,version=2.7.1.2.3.4.0-3485
package=hadoop-streaming,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-api,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-applications-distributedshell,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-applications-unmanaged-am-launcher,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-client,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-common,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-registry,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-applicationhistoryservice,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-common,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-nodemanager,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-resourcemanager,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-sharedcachemanager,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-tests,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-timeline-plugins,version=2.7.1.2.3.4.0-3485
package=hadoop-yarn-server-web-proxy,version=2.7.1.2.3.4.0-3485
package=hamcrest-core,version=1.3
package=httpclient,version=4.2.5
package=httpcore,version=4.2.5
package=jackson-annotations,version=2.2.3
package=jackson-core-asl,version=1.9.13
package=jackson-core,version=2.2.3
package=jackson-databind,version=2.2.3
package=jackson-jaxrs,version=1.9.13
package=jackson-mapper-asl,version=1.9.13
package=jackson-xc,version=1.9.13
package=java-xmlbuilder,version=0.4
package=jaxb-api,version=2.2.2
package=jaxb-impl,version=2.2.3-1
package=jersey-client,version=1.9
package=jersey-core,version=1.9
package=jersey-guice,version=1.9
package=jersey-json,version=1.9
package=jersey-server,version=1.9
package=jets3t,version=0.9.0
package=jettison,version=1.1
package=jettison,version=1.3.4
package=joda-time,version=2.9.1
package=jsch,version=0.1.42
package=jsp-api,version=2.1
package=jsr305,version=2.0.3
package=jsr305,version=3.0.0
package=junit,version=4.11
package=leveldbjni-all,version=1.8
package=log4j,version=1.2.17
package=metrics-core,version=3.0.1
package=microsoft-windowsazure-storage-sdk,version=0.6.0
package=mockito-all,version=1.8.5
package=mysql-connector-java,version=5.1.37
package=objenesis,version=2.1
package=okhttp,version=2.4.0
package=okio,version=1.4.0
package=paranamer,version=2.3
package=protobuf-java,version=2.5.0
package=ranger-hdfs-plugin-shim,version=0.5.0.2.3.4.0-3485
package=ranger-plugin-classloader,version=0.5.0.2.3.4.0-3485
package=ranger-yarn-plugin-shim,version=0.5.0.2.3.4.0-3485
package=servlet-api,version=2.5
package=slf4j-api,version=1.7.10
package=slf4j-api,version=1.7.5
package=slf4j-log4j12,version=1.7.10
package=snappy-java,version=1.0.4.1
package=stax-api,version=1.0-2
package=tez-api,version=0.7.0.2.3.4.0-3485
package=tez-common,version=0.7.0.2.3.4.0-3485
package=tez-dag,version=0.7.0.2.3.4.0-3485
package=tez-examples,version=0.7.0.2.3.4.0-3485
package=tez-history-parser,version=0.7.0.2.3.4.0-3485
package=tez-mapreduce,version=0.7.0.2.3.4.0-3485
package=tez-runtime-internals,version=0.7.0.2.3.4.0-3485
package=tez-runtime-library,version=0.7.0.2.3.4.0-3485
package=tez-tests,version=0.7.0.2.3.4.0-3485
package=tez-yarn-timeline-cache-plugin,version=0.7.0.2.3.4.0-3485
package=tez-yarn-timeline-history,version=0.7.0.2.3.4.0-3485
package=tez-yarn-timeline-history-with-acls,version=0.7.0.2.3.4.0-3485
package=tez-yarn-timeline-history-with-fs,version=0.7.0.2.3.4.0-3485
package=xml-apis,version=1.3.04
package=xmlenc,version=0.52
package=xz,version=1.0
package=zookeeper,version=3.4.6.2.3.4.0-3485
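
The gist does not show how this package=...,version=... listing was produced. One plausible way, sketched here as an assumption rather than the author's actual command, is to split each jar's file name at the first "-" followed by a digit, the same boundary the sed 's/-[0-9].*//' step in the gist relies on. The function name jar_to_package_version is hypothetical.

```shell
# Hypothetical helper: reads jar paths on stdin (one per line) and prints
# "package=NAME,version=VERSION" for each file name of the form NAME-VERSION.jar.
# File names without a "-<digit>" boundary are skipped.
jar_to_package_version() {
  awk -F/ '{
    name = $NF                      # keep only the file name
    sub(/\.jar$/, "", name)         # drop the .jar suffix
    if (match(name, /-[0-9]/))      # first "-" followed by a digit
      print "package=" substr(name, 1, RSTART - 1) ",version=" substr(name, RSTART + 1)
  }' | sort -u
}

# Usage: pipe the jar list printed by the first loop of the gist into it, e.g.
#   ... | sort -u | jar_to_package_version
```

Splitting at the first match rather than the last keeps vendor-suffixed versions such as 2.7.1.2.3.4.0-3485 in one piece.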
