
@rescribet
Last active November 11, 2018 22:00
RDFLib PR parsing benchmark

Source files:

node --max-old-space-size=4096 --expose-gc ./bench.js base infobox_property_definitions_en.ttl

testing base with file infobox_property_definitions_en.ttl
Garbage collection enabled
imported 118596 statements using 103.37MB (78.06MB after clean) in 8.36708s
imported 118596 statements using 95.16MB (77.37MB after clean) in 8.31138s
imported 118596 statements using 107.71MB (77.31MB after clean) in 8.28040s
imported 118596 statements using 106.86MB (77.32MB after clean) in 8.42875s
imported 118596 statements using 106.80MB (77.30MB after clean) in 8.34049s
imported 118596 statements using 107.84MB (77.34MB after clean) in 8.35689s
imported 118596 statements using 107.86MB (77.32MB after clean) in 8.34772s
imported 118596 statements using 106.62MB (77.33MB after clean) in 8.44471s
imported 118596 statements using 105.84MB (77.32MB after clean) in 8.34903s
imported 118596 statements using 105.23MB (77.30MB after clean) in 8.32810s
RDFLib base x 0.12 ops/sec ±0.85% (5 runs sampled)
Version base avaraged on 105.33MB (27.93MB garbage) in 8.35546s

node --max-old-space-size=4096 --expose-gc ./bench.js mem infobox_property_definitions_en.ttl

testing mem with file infobox_property_definitions_en.ttl
Garbage collection enabled
imported 118596 statements using 94.80MB (61.67MB after clean) in 8.32204s
imported 118596 statements using 54.56MB (33.99MB after clean) in 7.82057s
imported 118596 statements using 59.17MB (33.93MB after clean) in 7.83794s
imported 118596 statements using 58.57MB (33.96MB after clean) in 7.81863s
imported 118596 statements using 61.84MB (33.90MB after clean) in 7.82801s
imported 118596 statements using 60.66MB (33.92MB after clean) in 7.82081s
imported 118596 statements using 57.95MB (33.87MB after clean) in 7.79161s
imported 118596 statements using 61.45MB (33.95MB after clean) in 7.90005s
imported 118596 statements using 58.36MB (33.93MB after clean) in 7.83745s
imported 118596 statements using 58.36MB (33.94MB after clean) in 7.82779s
RDFLib mem x 0.13 ops/sec ±0.53% (5 runs sampled)
Version mem avaraged on 62.57MB (25.87MB garbage) in 7.88049s
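The `Version mem avaraged on ...` line can be reproduced from the ten `imported` lines above it. The sketch below is not part of the gist: `parseRun`, `mean` and `summarize` are hypothetical helper names, the regular expression assumes the exact log format shown here, and "garbage" is assumed to be the average usage minus the average usage after clean.

```javascript
// Hypothetical helpers, not part of bench.js: parse the "imported ..." log
// lines and recompute the averages printed in the "Version ..." summary line.
const LINE = /^imported (\d+) statements using ([\d.]+)MB \(([\d.]+)MB after clean\) in ([\d.]+)s$/

function parseRun(line) {
  const m = LINE.exec(line.trim())
  if (!m) return null // skip anything that is not an "imported ..." line
  return {
    statements: Number(m[1]),
    usedMB: Number(m[2]),
    afterCleanMB: Number(m[3]),
    seconds: Number(m[4]),
  }
}

function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length
}

function summarize(log) {
  const runs = log.split('\n').map(parseRun).filter(Boolean)
  const usedMB = mean(runs.map(r => r.usedMB))
  const afterCleanMB = mean(runs.map(r => r.afterCleanMB))
  return {
    runs: runs.length,
    usedMB,
    // Assumption: "garbage" is what the collector reclaimed on average.
    garbageMB: usedMB - afterCleanMB,
    seconds: mean(runs.map(r => r.seconds)),
  }
}

// The ten "mem" runs for infobox_property_definitions_en.ttl, copied from the log above.
const memLog = `
imported 118596 statements using 94.80MB (61.67MB after clean) in 8.32204s
imported 118596 statements using 54.56MB (33.99MB after clean) in 7.82057s
imported 118596 statements using 59.17MB (33.93MB after clean) in 7.83794s
imported 118596 statements using 58.57MB (33.96MB after clean) in 7.81863s
imported 118596 statements using 61.84MB (33.90MB after clean) in 7.82801s
imported 118596 statements using 60.66MB (33.92MB after clean) in 7.82081s
imported 118596 statements using 57.95MB (33.87MB after clean) in 7.79161s
imported 118596 statements using 61.45MB (33.95MB after clean) in 7.90005s
imported 118596 statements using 58.36MB (33.93MB after clean) in 7.83745s
imported 118596 statements using 58.36MB (33.94MB after clean) in 7.82779s
`

const s = summarize(memLog)
console.log(`${s.runs} runs: ${s.usedMB.toFixed(2)}MB (${s.garbageMB.toFixed(2)}MB garbage) in ${s.seconds.toFixed(5)}s`)
// prints "10 runs: 62.57MB (25.87MB garbage) in 7.88049s"
```

The first run is included in the average even though it uses noticeably more memory than the nine that follow.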

118k:

  • mem: ~40% reduction (~52% reduction after GC)
  • time: ~7% reduction

node --max-old-space-size=4096 --expose-gc ./bench.js base geo_coordinates_mappingbased_en.ttl

testing base with file geo_coordinates_mappingbased_en.ttl
Garbage collection enabled
imported 2450527 statements using 1514.54MB (1503.33MB after clean) in 188.25559s
imported 2450527 statements using 1489.74MB (1472.57MB after clean) in 190.34794s
imported 2450527 statements using 1479.50MB (1472.61MB after clean) in 189.11244s
imported 2450527 statements using 1491.87MB (1472.54MB after clean) in 189.18544s
imported 2450527 statements using 1478.00MB (1472.54MB after clean) in 189.14758s
imported 2450527 statements using 1477.46MB (1472.53MB after clean) in 189.34550s
imported 2450527 statements using 1478.64MB (1472.56MB after clean) in 190.11499s
imported 2450527 statements using 1481.26MB (1472.61MB after clean) in 190.01659s
imported 2450527 statements using 1491.72MB (1472.53MB after clean) in 189.43322s
imported 2450527 statements using 1480.64MB (1472.61MB after clean) in 189.34023s
RDFLib base x 0.01 ops/sec ±0.28% (5 runs sampled)
Version base avaraged on 1486.34MB (10.69MB garbage) in 189.42995s

node --max-old-space-size=4096 --expose-gc ./bench.js mem geo_coordinates_mappingbased_en.ttl

testing mem with file geo_coordinates_mappingbased_en.ttl
Garbage collection enabled
imported 2450527 statements using 1311.32MB (932.85MB after clean) in 174.02617s
imported 2450527 statements using 824.62MB (570.64MB after clean) in 176.40745s
imported 2450527 statements using 811.86MB (570.72MB after clean) in 176.56365s
imported 2450527 statements using 817.83MB (570.71MB after clean) in 175.60193s
imported 2450527 statements using 820.28MB (570.73MB after clean) in 175.98204s
imported 2450527 statements using 817.60MB (570.63MB after clean) in 175.07669s
imported 2450527 statements using 819.16MB (570.73MB after clean) in 175.51566s
imported 2450527 statements using 824.25MB (570.70MB after clean) in 175.86463s
imported 2450527 statements using 820.32MB (570.72MB after clean) in 175.38096s
imported 2450527 statements using 824.90MB (570.66MB after clean) in 174.56705s
RDFLib mem x 0.01 ops/sec ±0.51% (5 runs sampled)
Version mem avaraged on 869.21MB (262.31MB garbage) in 175.49862s

2.4M:

  • mem: ~41% reduction (~58% reduction after GC)
  • time: ~8% reduction
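
The percentages for both files can be rechecked from the averaged figures in the `Version ...` lines. This is a minimal sketch of that arithmetic, assuming "after GC" means the averaged usage minus the reported garbage; `reduction` and `compare` are hypothetical helpers, not part of the gist.

```javascript
// Hypothetical helper: percentage by which `after` is smaller than `before`.
function reduction(before, after) {
  return (1 - after / before) * 100
}

// Averages copied from the "Version ... avaraged" lines in the logs above.
const results = {
  '118k': {
    base: { usedMB: 105.33, garbageMB: 27.93, seconds: 8.35546 },
    mem: { usedMB: 62.57, garbageMB: 25.87, seconds: 7.88049 },
  },
  '2.4M': {
    base: { usedMB: 1486.34, garbageMB: 10.69, seconds: 189.42995 },
    mem: { usedMB: 869.21, garbageMB: 262.31, seconds: 175.49862 },
  },
}

function compare({ base, mem }) {
  return {
    mem: reduction(base.usedMB, mem.usedMB),
    // Assumption: memory retained after GC = average usage - average garbage.
    memAfterGC: reduction(base.usedMB - base.garbageMB, mem.usedMB - mem.garbageMB),
    time: reduction(base.seconds, mem.seconds),
  }
}

for (const [name, pair] of Object.entries(results)) {
  const r = compare(pair)
  console.log(`${name}: mem ${r.mem.toFixed(1)}%, after GC ${r.memAfterGC.toFixed(1)}%, time ${r.time.toFixed(1)}%`)
}
```

Computed this way, the memory reductions agree with the rounded figures quoted above, while the time reductions come out slightly lower, at roughly 6% and 7%.
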
// bench.js: RDFLib.js parsing benchmark
// Usage: node --max-old-space-size=4096 --expose-gc ./bench.js <base|mem> <file.ttl>
const Benchmark = require('benchmark')
const fs = require('fs')

const version = process.argv[2]
const file = process.argv[3]
console.log(`testing ${version} with file ${file}`)

// 'base' loads the linkeddata checkout; any other value loads the fletcher91 checkout.
const rdflib = require(`../../${version === 'base' ? 'linkeddata' : 'fletcher91'}/rdflib.js/lib/index`)

const NS_PER_SEC = 1e9
const suite = new Benchmark.Suite()
const mem = []  // heap growth per run, in MB
const time = [] // wall-clock time per run, in seconds

// global.gc only exists when node is started with --expose-gc.
const gcExposed = typeof global.gc === 'function'
if (!gcExposed) {
  console.log('Garbage collection not exposed')
} else {
  console.log('Garbage collection enabled')
  global.gc()
}

// Parse the whole file into a fresh store and record heap growth and elapsed time.
function run() {
  const doc = fs.readFileSync(file, 'utf8')
  const memStart = process.memoryUsage().heapUsed
  const timeStart = process.hrtime()

  const kb = new rdflib.Store()
  const p = rdflib.N3Parser(kb, kb, 'https://example.com/', 'https://example.com', null, null, '', null)
  p.loadBuf(doc)

  const diff = process.hrtime(timeStart) // [seconds, nanoseconds]
  const timeTaken = ((diff[0] * NS_PER_SEC + diff[1]) / NS_PER_SEC).toFixed(5)
  const memEnd = process.memoryUsage().heapUsed
  const megsUsed = ((memEnd - memStart) / 1024 / 1024).toFixed(2)

  console.log(`imported ${kb.statements.length} statements using ${megsUsed}MB in ${timeTaken}s`)
  time.push(Number.parseFloat(timeTaken))
  mem.push(Number.parseFloat(megsUsed))
}

suite
  .add(`RDFLib ${version}`, () => {
    // Collect before each run so earlier runs do not skew the heap baseline.
    if (gcExposed) global.gc()
    run()
  })
  // add listeners
  .on('cycle', function (event) {
    console.log(String(event.target))
  })
  .on('complete', function () {
    const avgTime = (time.reduce((a, b) => a + b, 0) / time.length).toFixed(5)
    const avgMem = (mem.reduce((a, b) => a + b, 0) / mem.length).toFixed(2)
    console.log(`Version ${version} averaged on ${avgMem}MB in ${avgTime}s`)
  })
  .run({ 'async': false })
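
The logged output reports an `after clean` figure that the script as shown does not print. One way such a figure could be obtained is to force a collection after the import and read `heapUsed` again. The sketch below is an assumption about how that could be done, not the code that produced the logs: `measure` is a hypothetical helper, a plain array allocation stands in for the RDFLib import, and without `--expose-gc` the collection step silently becomes a no-op.

```javascript
// Hypothetical helper, not part of bench.js.
const MB = 1024 * 1024
// global.gc only exists under --expose-gc; fall back to a no-op otherwise.
const gc = typeof global.gc === 'function' ? global.gc : () => {}

function measure(work) {
  gc() // start from a clean heap
  const memStart = process.memoryUsage().heapUsed
  const timeStart = process.hrtime.bigint()

  const result = work() // keep a reference so the result survives the next gc()

  const seconds = Number(process.hrtime.bigint() - timeStart) / 1e9
  const usedMB = (process.memoryUsage().heapUsed - memStart) / MB

  gc() // drop temporary objects created while working
  const afterCleanMB = (process.memoryUsage().heapUsed - memStart) / MB

  return { result, seconds, usedMB, afterCleanMB }
}

// Stand-in workload; in the benchmark this would be the parse into the store.
const m = measure(() => Array.from({ length: 1e5 }, (_, i) => ({ i })))
console.log(
  `kept ${m.result.length} items using ${m.usedMB.toFixed(2)}MB ` +
  `(${m.afterCleanMB.toFixed(2)}MB after clean) in ${m.seconds.toFixed(5)}s`
)
```

The difference between `usedMB` and `afterCleanMB` would then correspond to the per-run garbage.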