@deajan
Created September 4, 2015 11:57
regain-server-crawler
13:43:25: Retrying preparation of: file:///storage/papiers/OZY/VEHICULE/VEHICULE.2015-08-23T14-56-16Z_ocr.pdf
13:43:25: Marking old entry for a later deletion: file:///storage/papiers/OZY/VEHICULE/VEHICULE.2015-08-23T14-56-16Z_ocr.pdf from 20150823
13:43:26: Preparing file:///storage/papiers/OZY/VEHICULE/VEHICULE.2015-08-23T14-56-16Z_ocr.pdf with preparator net.sf.regain.crawler.preparator.PdfBoxPreparator failed
net.sf.regain.RegainException: Preparing file:///storage/papiers/OZY/VEHICULE/VEHICULE.2015-08-23T14-56-16Z_ocr.pdf with preparator net.sf.regain.crawler.preparator.PdfBoxPreparator failed
    at net.sf.regain.crawler.document.DocumentFactory.createDocument(DocumentFactory.java:336)
    at net.sf.regain.crawler.document.DocumentFactory.createDocument(DocumentFactory.java:251)
    at net.sf.regain.crawler.IndexWriterManager.createNewIndexEntry(IndexWriterManager.java:749)
    at net.sf.regain.crawler.IndexWriterManager.addToIndex(IndexWriterManager.java:732)
    at net.sf.regain.crawler.Crawler.run(Crawler.java:575)
    at net.sf.regain.crawler.Main.main(Main.java:128)
Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at java.util.TimSort.mergeHi(TimSort.java:895)
    at java.util.TimSort.mergeAt(TimSort.java:512)
    at java.util.TimSort.mergeCollapse(TimSort.java:435)
    at java.util.TimSort.sort(TimSort.java:241)
    at java.util.Arrays.sort(Arrays.java:1512)
    at java.util.ArrayList.sort(ArrayList.java:1454)
    at java.util.Collections.sort(Collections.java:175)
    at org.apache.pdfbox.util.PDFTextStripper.writePage(PDFTextStripper.java:558)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:449)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:372)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:328)
    at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:247)
    at net.sf.regain.crawler.preparator.PdfBoxPreparator.prepare(PdfBoxPreparator.java:107)
    at net.sf.regain.crawler.document.DocumentFactory.createDocument(DocumentFactory.java:319)
    ... 5 more
13:43:26: Preparation with EmptyPreparator done: file:///storage/papiers/OZY/VEHICULE/VEHICULE.2015-08-23T14-56-16Z_ocr.pdf