Created May 15, 2018
Deleting batches of records with SPARQL

I learned this while trying to clear our records out of AWS Neptune. Dropping the entire graph in one request kept hitting the query timeout. If you can't (or don't want to) raise the timeout, you can delete a smaller part of the graph in each transaction instead.

curl -sX POST http://<cluster-prefix> --data-urlencode 'update=
DELETE {
  GRAPH <> { ?s ?p ?o }
}
WHERE {
  GRAPH <> {
    {
      SELECT ?s ?p ?o
      WHERE { ?s ?p ?o . }
      LIMIT 10
    }
  }
}'

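This deletes 10 triples, specifically the first 10 returned by a SELECT * WHERE { ?s ?p ?o } query. Without an ORDER BY, which 10 come back first is up to the server, which does not matter when the goal is to empty the graph. Adjust the LIMIT value to find a batch size that keeps each request under the timeout.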

Yeah, this is a dirty hack but there was a bit of pain to learn this so I want to store the knowledge.

Also, be sure to use --data-urlencode, not --data-binary; otherwise the server may ignore your input without giving any indication of an error.
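
To empty a whole graph this way, the batched delete has to be repeated until nothing is left. Here is a rough sketch of that loop, not something I ran against Neptune in this exact form. The function names, the ENDPOINT, GRAPH_IRI and BATCH_SIZE variables and their default values are all made up for illustration, and the ASK probe assumes the endpoint answers with SPARQL JSON results.

```shell
# Placeholders (assumptions): point these at your own endpoint and graph.
ENDPOINT="${ENDPOINT:-http://localhost:8182/sparql}"
GRAPH_IRI="${GRAPH_IRI:-http://example.org/graph}"
BATCH_SIZE="${BATCH_SIZE:-10000}"

# Print the update that deletes at most $2 triples from graph $1.
build_delete() {
  printf 'DELETE { GRAPH <%s> { ?s ?p ?o } } ' "$1"
  printf 'WHERE { GRAPH <%s> { { SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT %s } } }\n' "$1" "$2"
}

# Send one update; --fail makes curl exit non-zero on an HTTP error status.
run_update() {
  curl -sS --fail -X POST "$ENDPOINT" --data-urlencode "update=$1" >/dev/null
}

# Exit 0 if graph $1 still has a triple, 1 if not, 2 if the request failed.
# Assumes a JSON answer containing "boolean": true or "boolean": false.
graph_has_triples() {
  ght_body=$(curl -sS --fail -X POST "$ENDPOINT" \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode "query=ASK { GRAPH <$1> { ?s ?p ?o } }") || return 2
  printf '%s' "$ght_body" | grep -Eq '"boolean"[[:space:]]*:[[:space:]]*true'
}

# Delete batch after batch until the ASK probe reports an empty graph.
delete_in_batches() {
  dib_batches=0
  while :; do
    dib_rc=0
    graph_has_triples "$GRAPH_IRI" || dib_rc=$?
    case $dib_rc in
      0) ;;                                   # triples remain: delete another batch
      1) break ;;                             # graph is empty: done
      *) echo "probe failed" >&2; return 1 ;; # do not mistake an error for "empty"
    esac
    run_update "$(build_delete "$GRAPH_IRI" "$BATCH_SIZE")" || return 1
    dib_batches=$((dib_batches + 1))
  done
  echo "deleted in $dib_batches batch(es)"
}
```

After sourcing this, something like ENDPOINT=... GRAPH_IRI=... BATCH_SIZE=50000 delete_in_batches would run the loop. The ASK probe is there so the loop can stop without counting the whole graph, since a full count could hit the same timeout as the original drop.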

agcunha commented Oct 7, 2019

Thank you. You helped me a lot! My RDF dataset has 20,000,000 triples, and I adjusted the limit to 1,000,000.

GeniJaho commented Oct 17, 2020

Thanks man, made my day better.

nzewail commented Jan 5, 2022

Super helpful!