The goal is simple: infer class membership (using the rdfs:subClassOf and rdf:type predicates). Don't do it with a property path or a similar workaround; the reasoner must do it.
I've tried this with a few reasoners, all unsuccessfully:
- Apache Jena wasn't able to do it with 12GB of RAM.
- Stardog wasn't able to do it with 12GB of RAM.
- REQUIEM wasn't able to do it with 12GB of RAM.
In this zip file you'll find tbox.ttl and abox.ttl.
This is the query that should return 79 results:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ex: <http://example.com/>
SELECT *
WHERE
{ ex:condition0 a ?type
}
Without reasoning it yields 1 result:
| type |
|---|
| http://www.wikidata.org/entity/Q32552 |
But with RDFS reasoning enabled there should be 79 results.
For example, the property-path version of the query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT *
WHERE
{ ex:condition0 rdf:type/(rdfs:subClassOf)* ?type }
Yields:
I have a slightly more powerful machine than @VladimirAlexiev, so I tried it out with GraphDB as well. Nothing special about the setup, just somewhat better hardware. I was also running other memory-intensive operations at the time, so it's hardly a good benchmark, but perhaps somewhat illustrative of what could happen on a production server.
Note that GraphDB is forward chaining, so it does materialization at import time.
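To make the forward-chaining idea concrete, here is a minimal Python sketch of materialization on a toy graph. It is not how GraphDB implements it; it only applies the two RDFS rules relevant to this question (rdfs9 for type propagation, rdfs11 for subclass transitivity) until nothing new is derived. The class names `ex:A`, `ex:B`, `ex:C` and the helper names `materialize` and `types_of` are made up for illustration.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def materialize(triples):
    """Forward-chain rdfs9 and rdfs11 to a fixpoint; return explicit + inferred triples."""
    closure = set(triples)
    while True:
        # Index the subclass edges known so far: class -> direct/derived superclasses.
        supers = defaultdict(set)
        for s, p, o in closure:
            if p == SUBCLASS:
                supers[s].add(o)
        # rdfs9:  (x type C),       (C subClassOf D) => (x type D)
        # rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
        new = set()
        for s, p, o in closure:
            if p in (RDF_TYPE, SUBCLASS):
                for sup in supers.get(o, ()):
                    new.add((s, p, sup))
        new -= closure
        if not new:          # fixpoint reached: nothing left to infer
            return closure
        closure |= new

def types_of(triples, node):
    """Plain lookup, no traversal: this is all a query has to do after materialization."""
    return {o for s, p, o in triples if s == node and p == RDF_TYPE}

# Toy data (hypothetical): one instance and a three-level class chain.
explicit = {
    ("ex:condition0", RDF_TYPE, "ex:A"),
    ("ex:A", SUBCLASS, "ex:B"),
    ("ex:B", SUBCLASS, "ex:C"),
}

closure = materialize(explicit)
print(sorted(types_of(closure, "ex:condition0")))  # ['ex:A', 'ex:B', 'ex:C']
print(len(closure) / len(explicit))                # 2.0
```

The cost is paid once, at load time, and the stored graph grows (here from 3 to 6 triples); afterwards the query is a simple index lookup.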
The data loaded in 36 minutes. The repository holds 71,938,374 statements in total, with a substantial expansion ratio of 23.18. Once that is done, the query executes in a few milliseconds.
The benefit is query speed: if you are querying not just one node but a few million of them, this will perform better than a backward-chaining DB.
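As a rough back-of-the-envelope check on what that ratio means, the snippet below splits the total into explicit and inferred statements. It assumes the expansion ratio is defined as total statements divided by explicit statements; since the ratio is rounded to two decimals, the results are estimates only.

```python
total_statements = 71_938_374   # total reported after import
expansion_ratio = 23.18         # reported ratio (rounded)

# Assumption: expansion ratio = total statements / explicit statements.
explicit_estimate = total_statements / expansion_ratio
inferred_estimate = total_statements - explicit_estimate

print(f"explicit ~ {explicit_estimate / 1e6:.2f} million statements")
print(f"inferred ~ {inferred_estimate / 1e6:.2f} million statements")
```

Under that assumption, the overwhelming majority of the stored statements are inferred rather than loaded.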
@pchampin, if inferrust can do all of that forward-chaining inference in under a minute, it's really impressive.