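First, a few records from the input BAM (my.bam), viewed as SAM; sequence and base-quality strings are masked with X. All three reads map to group0001.sample001 at position 1 with MAPQ 0 and list alternate hits in their XA tags: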
M02032:95:000000000-AL16U:1:1101:28074:11970 0 group0001.sample001 1 0 30M1D9M * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX NM:i:1 MD:Z:30^C9 AS:i:32 XS:i:32 XA:Z:group0292.sample0001,+1,30M1D9M,1;group0003.sample0001,+1,30M1D9M,1;group0194.sample0001,+1,30M1D9M,1;
M02032:95:000000000-AL16U:1:1101:2028:14335 0 group0001.sample001 1 0 40M * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX NM:i:0 MD:Z:40 AS:i:40 XS:i:40 XA:Z:group0194.sample0001,+1,40M,0;group0003.sample0001,+1,40M,0;group0292.sample0001,+1,40M,0;
M02032:95:000000000-AL16U:1:1101:11717:14813 0 group0001.sample001 1 0 40M * 0 0 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX NM:i:0 MD:Z:40 AS:i:40 XS:i:40 XA:Z:group0292.sample0001,+1,40M,0;group0003.sample0001,+1,40M,0;group0194.sample0001,+1,40M,0;
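
Converting this BAM to ADAM's Parquet format with the transform command then fails in Hadoop-BAM's split calculation:
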
$ adam-submit transform ./my.bam ./my.adam
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/opt/spark-1.5.2-bin-hadoop2.6/bin/spark-submit
2015-12-07 09:40:29 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-12-07 09:40:30 WARN MetricsSystem:71 - Using default name DAGScheduler for source because spark.app.id is not set.
2015-12-07 09:40:33 WARN ThreadLocalRandom:136 - Failed to generate a seed from SecureRandom within 3 seconds. Not enough entrophy?
Command body threw exception:
java.io.IOException: 'file:my.bam': no reads in first split: bad BAM file or tiny split size?
Exception in thread "main" java.io.IOException: 'file:my.bam': no reads in first split: bad BAM file or tiny split size?
    at org.seqdoop.hadoop_bam.BAMInputFormat.addProbabilisticSplits(BAMInputFormat.java:197)
    at org.seqdoop.hadoop_bam.BAMInputFormat.getSplits(BAMInputFormat.java:99)
    at org.seqdoop.hadoop_bam.AnySAMInputFormat.getSplits(AnySAMInputFormat.java:240)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:115)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1914)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1055)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:998)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:938)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:930)
    at org.apache.spark.rdd.InstrumentedPairRDDFunctions.saveAsNewAPIHadoopFile(InstrumentedPairRDDFunctions.scala:487)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply$mcV$sp(ADAMRDDFunctions.scala:75)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply(ADAMRDDFunctions.scala:60)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions$$anonfun$adamParquetSave$1.apply(ADAMRDDFunctions.scala:60)
    at org.apache.spark.rdd.Timer.time(Timer.scala:57)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions.adamParquetSave(ADAMRDDFunctions.scala:60)
    at org.bdgenomics.adam.rdd.ADAMRDDFunctions.adamParquetSave(ADAMRDDFunctions.scala:46)
    at org.bdgenomics.adam.rdd.read.AlignmentRecordRDDFunctions.adamSave(AlignmentRecordRDDFunctions.scala:96)
    at org.bdgenomics.adam.cli.Transform.run(Transform.scala:269)
    at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:54)
    at org.bdgenomics.adam.cli.Transform.run(Transform.scala:110)
    at org.bdgenomics.adam.cli.ADAMMain.apply(ADAMMain.scala:121)
    at org.bdgenomics.adam.cli.ADAMMain$.main(ADAMMain.scala:77)
    at org.bdgenomics.adam.cli.ADAMMain.main(ADAMMain.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
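
The error message itself suggests two things to check: whether the BAM is malformed (truncated or mis-compressed), and whether the input split is so small that the first split holds no reads. Assuming samtools is available, a quick sanity check covers the first possibility:

$ samtools quickcheck my.bam && echo "BAM intact"   # non-zero exit on a corrupt/truncated BAM
$ samtools view -c my.bam                           # read count; 0 means a header-only file

If the file checks out, one possible workaround is to raise the minimum Hadoop split size so the first split is guaranteed to span actual alignment records. A sketch, invoking spark-submit directly (the adam-cli jar path is a placeholder; adjust for your install):

$ spark-submit --class org.bdgenomics.adam.cli.ADAMMain \
      --conf spark.hadoop.mapreduce.input.fileinputformat.split.minsize=134217728 \
      /path/to/adam-cli.jar transform ./my.bam ./my.adam

Here 134217728 bytes (128 MiB) is an arbitrary but generous minimum: any BAM smaller than that is read as a single split, which sidesteps the "tiny split size" case.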