@hyrsky
Created July 1, 2023 01:02
PHP XML reader benchmarking

Benchmarking PHP XML readers

Compares the runtime and peak memory usage of PHP XML readers when reading a large XML file.

Usage

make test

Results

Script                      Runtime (s)       Memory usage (MiB)   Memory usage, real (MiB)
test-xmlreader.php          3.3585588932037   0.62660217285156     2
test-simplexml.php          2.3316519260406   0.61759948730469     2
test-simplexml-string.php   2.3231160640717   24.296783447266      25.90625
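The two memory columns come from `memory_get_peak_usage()`, called with `false` and with `true`. The sketch below shows the difference on a small, self-contained allocation. The helper name `bytesToMiB` and the 8 MiB buffer are assumptions made for illustration and are not part of the benchmark.

```php
<?php
// Hypothetical helper: convert a byte count to MiB, the unit used in the results table.
function bytesToMiB(int $bytes): float
{
    return $bytes / 1024 / 1024;
}

// Allocate roughly 8 MiB so the peak counters move (arbitrary size, for illustration only).
$buffer = str_repeat('x', 8 * 1024 * 1024);

// false: peak memory requested by PHP through its own allocator (emalloc).
$used = memory_get_peak_usage(false);
// true: peak memory the allocator actually reserved from the system.
$real = memory_get_peak_usage(true);

printf("memory usage: %.2f MiB\n", bytesToMiB($used));
printf("memory usage (real): %.2f MiB\n", bytesToMiB($real));
```

With the default Zend allocator, memory is reserved from the system in large chunks, which is probably why the two streaming tests report a round 2 MiB in the "real" column while their allocator-level peak stays well below 1 MiB.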
.PHONY: test

# Download the sample data set used by all tests.
testdata.xml:
	curl -sSL -o testdata.xml http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/data/nasa/nasa.xml

# Run each test script for 10 iterations and collect the output.
# The target name matches the file the recipe writes.
results.log: testdata.xml
	php runner.php test-xmlreader.php 10 > results.log
	php runner.php test-simplexml.php 10 >> results.log
	php runner.php test-simplexml-string.php 10 >> results.log

test: results.log
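<?php
// Usage: php runner.php <test-script> <iterations>
// The test script must define a test() function.
require $argv[1];
$iterations = (int) $argv[2];

echo "Testing: " . $argv[1] . "\n";

$start_time = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    test();
}
$end_time = microtime(true);

echo "-------------\n";
// Parenthesize the subtraction so the result does not depend on operator precedence.
echo "runtime: " . ($end_time - $start_time) . "s\n";
// Peak memory used by PHP allocations, then peak memory reserved from the system.
echo "memory usage: " . (memory_get_peak_usage(false) / 1024 / 1024) . " MiB\n";
echo "memory usage (real): " . (memory_get_peak_usage(true) / 1024 / 1024) . " MiB\n";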
<?php
/**
 * Control test - this should have high memory usage, because the whole
 * file is loaded into a string before it is parsed.
 */
function test(): void
{
    $reader = new SimpleXMLElement(file_get_contents('testdata.xml'), dataIsURL: false);
    $fd = fopen('/dev/null', 'w');
    foreach ($reader->dataset as $dataset) {
        fwrite($fd, $dataset->asXML());
    }
    fclose($fd);
}
<?php
/**
 * Let SimpleXML read the file itself instead of passing its contents as a string.
 */
function test(): void
{
    $reader = new SimpleXMLElement('file://' . __DIR__ . '/testdata.xml', dataIsURL: true);
    $fd = fopen('/dev/null', 'w');
    foreach ($reader->dataset as $dataset) {
        fwrite($fd, $dataset->asXML());
    }
    fclose($fd);
}
<?php
function test(): void
{
    $reader = XMLReader::open('testdata.xml');
    if ($reader === false) {
        throw new RuntimeException('Unable to open testdata.xml');
    }
    $fd = fopen('/dev/null', 'w');
    try {
        // Read the XML document node by node. When a <dataset> element is
        // reached, read only sibling nodes until the end of the document.
        while ($reader->read()) {
            // Skip everything that is not an element.
            if ($reader->nodeType !== XMLReader::ELEMENT) {
                continue;
            }
            switch ($reader->name) {
                case 'dataset':
                    // Write out this <dataset> and all of its <dataset> siblings.
                    do {
                        fwrite($fd, $reader->readOuterXml());
                    } while ($reader->next('dataset'));
                    break;
                default:
                    break;
            }
        }
    } finally {
        $reader->close();
        fclose($fd);
    }
}
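The XMLReader test above only copies each `<dataset>` to the output. If the elements need to be inspected, one common approach is to hand each one to SimpleXML individually, so that only a single element is held as a tree at a time. The following is a minimal sketch of that idea, not part of the benchmark: the function name `streamTitles`, the inline sample document, and the `<title>` child are assumptions made for illustration.

```php
<?php
// Hypothetical sample document standing in for testdata.xml.
$xml = '<datasets>'
     . '<dataset><title>A</title></dataset>'
     . '<dataset><title>B</title></dataset>'
     . '</datasets>';

// Stream over <dataset> elements and collect the text of each <title> child.
function streamTitles(string $xml): array
{
    $reader = XMLReader::XML($xml);
    if ($reader === false) {
        throw new RuntimeException('Unable to parse XML input');
    }

    $titles = [];
    try {
        // Advance to the first <dataset> element.
        while ($reader->read()) {
            if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'dataset') {
                break;
            }
        }

        // Visit that element and each following <dataset> sibling.
        while ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'dataset') {
            // Only the current element is expanded into a SimpleXML tree.
            $element = new SimpleXMLElement($reader->readOuterXml());
            $titles[] = (string) $element->title;

            if (!$reader->next('dataset')) {
                break;
            }
        }
    } finally {
        $reader->close();
    }

    return $titles;
}

print_r(streamTitles($xml));
```

This trades some speed for convenience, since every element is serialized by `readOuterXml()` and parsed a second time by SimpleXML.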