@djo
Last active August 29, 2015 14:14
Resolve conflicts in Apache Spark with Parquet format

Adding parquet-format to the sbt dependencies makes the assembly task fail with a deduplicate conflict:

java.lang.RuntimeException: deduplicate: different file contents found in the following:
path-to-cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.9.jar:META-INF/maven/org.slf4j/slf4j-api/pom.properties
path-to-cache/com.twitter/parquet-format/jars/parquet-format-2.2.0-rc1.jar:META-INF/maven/org.slf4j/slf4j-api/pom.properties

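To resolve it, discard the slf4j-api Maven metadata files (those whose names start with "pom", such as pom.properties) when merging the assembly, so the copy bundled in parquet-format no longer clashes with the one in slf4j-api: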

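// Extend the existing merge strategy rather than replacing it.
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    // Drop the slf4j-api pom files that the two jars ship with different contents.
    case PathList("META-INF", "maven", "org.slf4j", "slf4j-api", ps) if ps.startsWith("pom") => MergeStrategy.discard
    // Everything else falls through to the previous strategy.
    case x => old(x)
  }
}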
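
The snippet above uses the `<<=` operator and the `mergeStrategy` key from older sbt and sbt-assembly releases. On newer setups the same rule might look roughly like the sketch below. It assumes sbt 1.x with slash syntax and a recent sbt-assembly, where the key is named `assemblyMergeStrategy`. Check it against the plugin version you actually use.

```scala
// build.sbt (sketch, assuming sbt 1.x and a recent sbt-assembly plugin)
assembly / assemblyMergeStrategy := {
  // Discard the slf4j-api pom files that are duplicated inside parquet-format.
  case PathList("META-INF", "maven", "org.slf4j", "slf4j-api", ps) if ps.startsWith("pom") =>
    MergeStrategy.discard
  // Delegate every other path to the previously configured strategy.
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}
```

The pattern is unchanged: only the five-segment path under META-INF/maven/org.slf4j/slf4j-api is discarded, and all other files keep their default handling.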