@zmarcantel
Last active January 4, 2016 13:09
Trying to run a simple Hadoop job, but Hadoop throws a NoClassDefFoundError on "org/w3c/dom/Document"
I'm trying to run the basic examples from the "Mahout in Action" book (https://github.com/tdunning/MiA).
I use nearly the same Maven setup, but tooled for Cassandra rather than a file data model.
But when I try to run the *-job.jar, it throws a NoClassDefFoundError from the DataStax/Hadoop end.
I'm using version 1.0.5-dse of the driver, as that's the only one that supports the current DSE version of Cassandra (1.2.1), if that helps at all, though the issue seems to be deeper.
vagrant@brain-0:/srv/pod$ dse mahout hadoop jar target/pod-1.0-SNAPSHOT-job.jar
Running: /usr/share/dse/mahout/bin/mahout hadoop jar target/pod-1.0-SNAPSHOT-job.jar
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/usr/share/dse/hadoop
HADOOP_CONF_DIR=/etc/dse/hadoop
RunJar jarFile [mainClass] args...
vagrant@brain-0:/srv/pod$ dse mahout hadoop jar target/pod-1.0-SNAPSHOT-job.jar com.canwe.InterestRecommender
Running: /usr/share/dse/mahout/bin/mahout hadoop jar target/pod-1.0-SNAPSHOT-job.jar com.canwe.InterestRecommender
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/usr/share/dse/hadoop
HADOOP_CONF_DIR=/etc/dse/hadoop
Exception in thread "main" Exception in thread "DSE SystemClassLoader Background Thread 3" java.lang.NoClassDefFoundError: org/w3c/dom/Document
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1225)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1141)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1076)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:429)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:109)
Caused by: java.lang.ClassNotFoundException: Class org.w3c.dom.Document not found in modules [ModuleClassLoader:Hadoop, ModuleClassLoader:Dse, SystemClassLoader]
    at com.datastax.bdp.loader.SystemClassLoader.loadClass(SystemClassLoader.java:120)
    at com.datastax.bdp.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:38)
    at com.datastax.bdp.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:32)
    ... 5 more
java.lang.LinkageError: loader constraint violation: loader (instance of com/datastax/bdp/loader/ModuleClassLoader) previously initiated loading for a different type with name "org/w3c/dom/Document"
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:792)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at com.datastax.bdp.loader.ModuleClassLoader.loadClassDirectly(ModuleClassLoader.java:46)
    at com.datastax.bdp.loader.SystemClassLoader.tryLoadClass(SystemClassLoader.java:195)
    at com.datastax.bdp.loader.SystemClassLoader.tryLoadClass(SystemClassLoader.java:175)
    at com.datastax.bdp.loader.SystemClassLoader.access$100(SystemClassLoader.java:29)
    at com.datastax.bdp.loader.SystemClassLoader$ClassLoadingTask.run(SystemClassLoader.java:237)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.canwe</groupId>
  <artifactId>pod</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>pod</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <mahout.version>0.8</mahout.version>
    <mahout.groupid>org.apache.mahout</mahout.groupid>
  </properties>
  <build>
    <pluginManagement>
      <plugins>
        <!--This plugin's configuration is used to store Eclipse m2e settings only. It has
        no influence on the Maven build itself. -->
        <plugin>
          <groupId>org.eclipse.m2e</groupId>
          <artifactId>lifecycle-mapping</artifactId>
          <version>1.0.0</version>
          <configuration>
            <lifecycleMappingMetadata>
              <pluginExecutions>
                <pluginExecution>
                  <pluginExecutionFilter>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-antrun-plugin</artifactId>
                    <versionRange>[1.5,)</versionRange>
                    <goals>
                      <goal>run</goal>
                    </goals>
                  </pluginExecutionFilter>
                  <action>
                    <ignore />
                  </action>
                </pluginExecution>
              </pluginExecutions>
            </lifecycleMappingMetadata>
          </configuration>
        </plugin>
      </plugins>
    </pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
        <configuration>
          <encoding>UTF-8</encoding>
          <source>1.6</source>
          <target>1.6</target>
          <optimize>true</optimize>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-antrun-plugin</artifactId>
        <version>1.6</version>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-resources-plugin</artifactId>
        <version>2.4.3</version>
        <configuration>
          <encoding>UTF-8</encoding>
        </configuration>
      </plugin>
      <!-- create hadoop job jar -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <executions>
          <execution>
            <id>job</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptors>
                <descriptor>src/main/assembly/job.xml</descriptor>
              </descriptors>
            </configuration>
          </execution>
          <execution>
            <id>my-jar-with-dependencies</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.9</version>
        <configuration>
          <useFile>false</useFile>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>${mahout.groupid}</groupId>
      <artifactId>mahout-core</artifactId>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>${mahout.groupid}</groupId>
      <artifactId>mahout-core</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>${mahout.groupid}</groupId>
      <artifactId>mahout-math</artifactId>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>${mahout.groupid}</groupId>
      <artifactId>mahout-math</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>${mahout.groupid}</groupId>
      <artifactId>mahout-examples</artifactId>
      <version>${mahout.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>r03</version>
    </dependency>
    <dependency>
      <groupId>org.apache.thrift</groupId>
      <artifactId>libthrift</artifactId>
      <version>0.6.1</version>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>1.5.11</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>zookeeper</artifactId>
      <version>3.3.1</version>
    </dependency>
    <dependency>
      <groupId>org.twitter4j</groupId>
      <artifactId>twitter4j-stream</artifactId>
      <version>2.2.3</version>
    </dependency>
    <dependency>
      <groupId>commons-io</groupId>
      <artifactId>commons-io</artifactId>
      <version>2.0.1</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>com.datastax.cassandra</groupId>
      <artifactId>cassandra-driver-core</artifactId>
      <version>1.0.5-dse</version>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <version>15.0-rc1</version>
    </dependency>
    <dependency>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
      <version>1.4.01</version>
    </dependency>
  </dependencies>
</project>
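One thing that may be worth ruling out: the LinkageError in the trace says a second definition of "org/w3c/dom/Document" was offered to a loader that had already started loading that name, and the pom declares xml-apis as an ordinary compile-scope dependency, so the assembled jars probably carry their own copy of org.w3c.dom.*. The fragment below is only a sketch of one way to test that guess, not a confirmed fix. It assumes the bundled xml-apis classes are the conflicting copy, and that the assembly descriptors package runtime-scope dependencies only.

```xml
<!-- Sketch only: assumes the org.w3c.dom.* classes bundled from xml-apis are
     what collide with the copy DSE's class loaders already see. -->
<dependency>
  <groupId>xml-apis</groupId>
  <artifactId>xml-apis</artifactId>
  <version>1.4.01</version>
  <!-- provided: on the compile classpath, but left out of assemblies
       that package only runtime-scope dependencies -->
  <scope>provided</scope>
</dependency>
```

Whether the class actually ends up in the jar can be checked with `unzip -l target/pod-1.0-SNAPSHOT-job.jar | grep org/w3c/dom/Document`, and `mvn dependency:tree -Dincludes=xml-apis` shows whether another dependency also pulls xml-apis in transitively. The custom descriptor at src/main/assembly/job.xml is not shown here, so how it treats provided scope is an assumption as well.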