I just tried my example from the tinycdxserver README and realised that curl is messing up the line endings because of a conversion it does by default. I haven't checked exactly what curl is doing yet, but tinycdxserver ends up interpreting the file as if all of its lines had been concatenated together (you can see that by running tinycdxserver in verbose mode with the -v option).
Using curl's --data-binary option instead of --data fixes that, and I've updated the README accordingly.
That could be what's tripping you up. Here's a more complete example that I just tested. If it worked properly you should get an "Added N records" response back, where N is the number of lines in the CDX file.
records.cdx below has a blank ("-") first column: tinycdxserver ignores that column and does its own canonicalisation, so our usual indexing process doesn't bother filling it in. You can use standard CDX files as well; I've included a second file, records2.cdx, with SURT-style URLs generated using IA tools, just to demonstrate that.
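On the --data problem: curl's man page says that when --data reads its body from a file, carriage returns and newlines are stripped, which would match the concatenated lines seen in verbose mode. Here is a rough local emulation of that using tr, with no server involved. The sample lines are made up and are not real CDX records.

```shell
# Build a two-line sample file (made-up content, not real CDX records).
sample=$(mktemp)
printf 'line one\nline two\n' > "$sample"

# Emulate what curl documents for --data @file:
# carriage returns and newlines are removed from the body.
stripped=$(tr -d '\r\n' < "$sample")

lines_in_file=$(grep -c '' "$sample")
lines_after_strip=$(printf '%s\n' "$stripped" | grep -c '')

echo "lines in file:         $lines_in_file"
echo "lines after stripping: $lines_after_strip"
echo "body as it would be sent: $stripped"
rm -f "$sample"
```

With --data-binary the file is sent as is, so the newlines that separate the records survive.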
Compile tinycdxserver:
$ git clone git@github.com:nla/tinycdxserver.git
$ cd tinycdxserver
$ mvn package
Start tinycdxserver:
$ mkdir /tmp/data
$ java -jar target/tinycdxserver-0.1-SNAPSHOT.jar -d /tmp/data
Grab an example CDX:
$ curl -LO https://gist.github.com/ato/b2ad8e65b35afe690921/raw/4e663c44c74c585ac0d5226780465d2281177958/records.cdx
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 1203 100 1203 0 0 1297 0 --:--:-- --:--:-- --:--:-- 1297
Load it:
$ curl -XPOST --data-binary @records.cdx http://localhost:8080/myindex
Added 6 records
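If you want to check the response automatically, something like the following might do. check_added is a hypothetical helper name; it assumes the response is exactly "Added N records" as in the example above, and that N counts every line in the file.

```shell
# check_added FILE RESPONSE
# Hypothetical helper: succeeds when RESPONSE is exactly "Added N records",
# where N is the number of lines in FILE.
check_added() {
  # grep -c '' counts lines, including a final line without a trailing newline
  expected=$(grep -c '' "$1")
  [ "$2" = "Added $expected records" ]
}

# Usage against a running server (assumes the localhost:8080/myindex setup above):
#   response=$(curl -s -XPOST --data-binary @records.cdx http://localhost:8080/myindex)
#   check_added records.cdx "$response" && echo "all lines were added"
```

A mismatch (for example "Added 1 records" for a multi-line file) is a hint that the newlines were lost on the way in.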
Get a record back:
$ curl -s 'http://localhost:8080/myindex?url=http://minister.infrastructure.gov.au/'
au,gov,infrastructure,minister)/ 20150914222035 http://www.minister.infrastructure.gov.au/ text/html 301 ZH3ZBTFT5T6VC4BHO3MC6MLFECBEKDYN 389
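Note that the query without "www." still finds the www record, because both canonicalise to the same key. As a very simplified illustration of how such a SURT-style key is formed (this is not tinycdxserver's actual canonicaliser, which handles many more cases; simple_surt is a hypothetical name):

```shell
# simple_surt URL
# Simplified sketch only: drop the scheme, lowercase the host, strip a leading
# "www.", reverse the host labels, then append ")" and the path.
simple_surt() {
  rest=${1#*://}                 # drop the scheme if present
  host=${rest%%/*}               # everything before the first slash
  case $rest in
    */*) path=/${rest#*/} ;;     # keep the path after the host
    *)   path=/ ;;               # no path given
  esac
  host=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
  host=${host#www.}
  # reverse the dot-separated labels and join them with commas
  reversed=$(printf '%s\n' "$host" |
    awk -F. '{ out = $NF; for (i = NF - 1; i >= 1; i--) out = out "," $i; print out }')
  printf '%s)%s\n' "$reversed" "$path"
}

simple_surt http://www.minister.infrastructure.gov.au/
# → au,gov,infrastructure,minister)/
simple_surt http://minister.infrastructure.gov.au/
# → au,gov,infrastructure,minister)/
```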
Query using Wayback's XML protocol:
$ curl -s 'http://localhost:8080/myindex?q=type:urlquery+url:http://minister.infrastructure.gov.au/' | xml_pp
<?xml version="1.0" encoding="UTF-8"?>
<wayback>
  <request>
    <startdate>19960101000000</startdate>
    <enddate>20151015072406</enddate>
    <type>urlquery</type>
    <firstreturned>0</firstreturned>
    <url>au,gov,infrastructure,minister)/</url>
    <resultsrequested>10000</resultsrequested>
    <resultstype>resultstypecapture</resultstype>
  </request>
  <results>
    <result>
      <compressedoffset>152443</compressedoffset>
      <mimetype>text/html</mimetype>
      <file>WEB-20150914222031256-00000-43190~heritrix.nla.gov.au~8443.warc.gz</file>
      <redirecturl>http://minister.infrastructure.gov.au/</redirecturl>
      <urlkey>au,gov,infrastructure,minister)/</urlkey>
      <digest>ZH3ZBTFT5T6VC4BHO3MC6MLFECBEKDYN</digest>
      <httpresponsecode>301</httpresponsecode>
      <robotflags>-</robotflags>
      <url>http://www.minister.infrastructure.gov.au/</url>
      <capturedate>20150914222035</capturedate>
    </result>
  </results>
</wayback>
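For a quick look at where a capture lives, the file and compressedoffset elements can be pulled out of that response with sed. This is only a sketch: xml_field is a hypothetical helper, it assumes each element sits on a single line (as in the xml_pp output), and a proper XML parser is the safer choice for anything beyond eyeballing.

```shell
# xml_field TAG
# Hypothetical helper: reads XML on stdin and prints the text of every
# <TAG>...</TAG> element that opens and closes on the same line.
xml_field() {
  sed -n "s:.*<$1>\(.*\)</$1>.*:\1:p"
}

# Trimmed copy of the <result> element from the response above.
response='<result>
  <compressedoffset>152443</compressedoffset>
  <file>WEB-20150914222031256-00000-43190~heritrix.nla.gov.au~8443.warc.gz</file>
</result>'

warc=$(printf '%s\n' "$response" | xml_field file)
offset=$(printf '%s\n' "$response" | xml_field compressedoffset)
echo "record is at offset $offset in $warc"
```

Against a live server the same helper could be fed from the curl command above instead of the sample string.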
Okay, that helps a lot, thanks. I installed RocksDB with compression enabled and used the Maven dependency that doesn't bundle a platform binary, and it seems to work. I only have 77 records in it, so the compressed size was actually larger than the uncompressed size, but it's a lot less readable!
FWIW, here's the docker build:
https://github.com/anjackson/wauldock/tree/master/tinycdxserver
and the minor changes I made to the tinycdxserver itself are in this repo:
https://github.com/anjackson/tinycdxserver