Johan van der Knijff, 3 July 2019
This document describes some proposed changes to the jpylyzer output format for the upcoming jpylyzer 2.0 release (foreseen for November 2019). The main reason for these changes is the addition of raw codestream validation functionality. Since this functionality will lead to a small (but nevertheless breaking) change to jpylyzer's output format, this is a good moment to fix a few other inconsistencies as well.
Related GitHub issues are:
- Add option to validate raw codestreams (through API, possibly also CLI) #113
- Suggested output improvements (jpylyzer 2.x) #55
The modifications described below have already been implemented in the testcodestream development branch of jpylyzer.
Output for 1 single file:
<?xml version='1.0' encoding='UTF-8'?>
<jpylyzer xmlns="http://openpreservation.org/ns/jpylyzer/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://openpreservation.org/ns/jpylyzer/ http://jpylyzer.openpreservation.org/jpylyzer-v-1-1.xsd">
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>1.18.0</toolVersion>
</toolInfo>
<fileInfo>
<fileName>aware.jp2</fileName>
<filePath>/home/johan/jpylyzer-test-files/aware.jp2</filePath>
<fileSizeInBytes>662735</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValidJP2>True</isValidJP2>
<tests/>
<properties>
::
</properties>
</jpylyzer>
Output for 2 files with the --wrapper option enabled (this wraps multiple jpylyzer elements inside a results element, which is the root element in this case):
<?xml version='1.0' encoding='UTF-8'?>
<results xmlns="http://openpreservation.org/ns/jpylyzer/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://openpreservation.org/ns/jpylyzer/ http://jpylyzer.openpreservation.org/jpylyzer-v-1-1.xsd">
<jpylyzer>
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>1.18.0</toolVersion>
</toolInfo>
<fileInfo>
<fileName>aware.jp2</fileName>
<filePath>/home/johan/jpylyzer-test-files/aware.jp2</filePath>
<fileSizeInBytes>662735</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValidJP2>True</isValidJP2>
<tests/>
<properties>
::
</properties>
</jpylyzer>
<jpylyzer>
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>1.18.0</toolVersion>
</toolInfo>
<fileInfo>
<fileName>rubbish.jp2</fileName>
<filePath>/home/johan/jpylyzer-test-files/rubbish.jp2</filePath>
<fileSizeInBytes>662735</fileSizeInBytes>
<fileLastModified>Wed Dec 5 09:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValidJP2>True</isValidJP2>
<tests/>
<properties>
::
</properties>
</jpylyzer>
</results>
- The name of the isValidJP2 element would not really be appropriate for raw codestream validation (because in this case jpylyzer only validates against the codestream specification, not against the JP2 specification).
- The use of the results element if --wrapper or --recurse is activated is confusing, because it results in slightly different variations of the output format (it's also a bit ugly).
- If --wrapper is not used in case of multiple files, jpylyzer's output is not even well-formed XML!
- The information inside toolInfo is repeated for each file.
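The well-formedness problem in the third point can be illustrated with a short Python check. This is only a sketch: the two XML strings are heavily simplified stand-ins for real jpylyzer 1.x output, and is_well_formed is a hypothetical helper, not part of jpylyzer.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the 1.x result of one file (illustrative data only).
single = "<jpylyzer><isValidJP2>True</isValidJP2></jpylyzer>"

# Two results written back to back, as when several files are
# processed without --wrapper: the document now has two root elements.
concatenated = single + "\n" + single


def is_well_formed(xml_text):
    """Return True if xml_text parses as one well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(single))        # single root element
print(is_well_formed(concatenated))  # second root element is rejected
```

An XML document may have only one root element, which is why a parser rejects the concatenated form.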
- The root element is always jpylyzer.
- The jpylyzer element contains 1 toolInfo element and 1 or more file elements.
- Each file element contains the output for one individual file/image. Inside it are the usual sub-elements (fileInfo, statusInfo, etc.).
- The isValidJP2 element is replaced by the new isValid element. A format attribute defines the validation format (allowed values: jp2 for JP2 validation, and j2c for raw codestream validation). The validation format is defined by the new --format command-line option (if this option is not set, jpylyzer validates against JP2 by default).
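As a rough illustration of how a consumer might read the structure listed above, the following Python sketch walks the file elements and collects the file name, validation format and validity of each. The sample document is abbreviated, the file name example.j2c is made up, and summarise is a hypothetical helper rather than anything jpylyzer provides.

```python
import xml.etree.ElementTree as ET

# Namespace used by jpylyzer output; the "j" prefix is an arbitrary local alias.
NS = {"j": "http://openpreservation.org/ns/jpylyzer/"}

# Abbreviated document in the proposed 2.0 layout (illustrative data only).
SAMPLE = """
<jpylyzer xmlns="http://openpreservation.org/ns/jpylyzer/">
  <toolInfo>
    <toolName>jpylyzer</toolName>
    <toolVersion>2.0.0a1</toolVersion>
  </toolInfo>
  <file>
    <fileInfo><fileName>aware.jp2</fileName></fileInfo>
    <isValid format="jp2">True</isValid>
  </file>
  <file>
    <fileInfo><fileName>example.j2c</fileName></fileInfo>
    <isValid format="j2c">False</isValid>
  </file>
</jpylyzer>
""".strip()


def summarise(xml_text):
    """Return (toolVersion, [(fileName, format, is_valid), ...])."""
    root = ET.fromstring(xml_text)
    version = root.findtext("j:toolInfo/j:toolVersion", namespaces=NS)
    rows = []
    for file_el in root.findall("j:file", NS):
        name = file_el.findtext("j:fileInfo/j:fileName", namespaces=NS)
        is_valid = file_el.find("j:isValid", NS)
        # The element text is the string "True" or "False".
        rows.append((name, is_valid.get("format"), is_valid.text == "True"))
    return version, rows


version, rows = summarise(SAMPLE)
print(version)
for row in rows:
    print(row)
```

Because toolInfo appears once at the top level, the tool version is read a single time rather than once per file.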
The Figure below gives an overview of the revised format:
Output for 1 single JP2 using JP2 validation (complete output available here):
<?xml version='1.0' encoding='UTF-8'?>
<jpylyzer xmlns="http://openpreservation.org/ns/jpylyzer/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://openpreservation.org/ns/jpylyzer/ http://jpylyzer.openpreservation.org/jpylyzer-v-2-0.xsd">
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>2.0.0a1</toolVersion>
</toolInfo>
<file>
<fileInfo>
<fileName>aware.jp2</fileName>
<filePath>/home/johan/jpylyzer-test-files/aware.jp2</filePath>
<fileSizeInBytes>662735</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValid format="jp2">True</isValid>
<tests/>
<properties>
::
</properties>
</file>
</jpylyzer>
Output for 1 single codestream using codestream validation (complete output available here):
<?xml version='1.0' encoding='UTF-8'?>
<jpylyzer xmlns="http://openpreservation.org/ns/jpylyzer/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://openpreservation.org/ns/jpylyzer/ http://jpylyzer.openpreservation.org/jpylyzer-v-2-0.xsd">
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>2.0.0a1</toolVersion>
</toolInfo>
<file>
<fileInfo>
<fileName>is_codestream.jp2</fileName>
<filePath>/home/johan/jpylyzer-test-files/is_codestream.j2c</filePath>
<fileSizeInBytes>628385</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValid format="j2c">True</isValid>
<tests/>
<properties>
::
</properties>
</file>
</jpylyzer>
Output for multiple files (complete output available here):
<?xml version='1.0' encoding='UTF-8'?>
<jpylyzer xmlns="http://openpreservation.org/ns/jpylyzer/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://openpreservation.org/ns/jpylyzer/ http://jpylyzer.openpreservation.org/jpylyzer-v-2-0.xsd">
<toolInfo>
<toolName>jpylyzer</toolName>
<toolVersion>2.0.0a1</toolVersion>
</toolInfo>
<file>
<fileInfo>
<fileName>openJPEG15.jp2</fileName>
<filePath>/home/johan/test/openJPEG15.jp2</filePath>
<fileSizeInBytes>670372</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValid format="jp2">True</isValid>
<tests/>
<properties>
::
</properties>
</file>
<file>
<fileInfo>
<fileName>palettedImage.jp2</fileName>
<filePath>/home/johan/test/palettedImage.jp2</filePath>
<fileSizeInBytes>317550</fileSizeInBytes>
<fileLastModified>Wed Dec 2 08:28:52 2015</fileLastModified>
</fileInfo>
<statusInfo>
<success>True</success>
</statusInfo>
<isValid format="jp2">True</isValid>
<tests/>
<properties>
::
</properties>
</file>
</jpylyzer>
Since these changes will break existing workflows, jpylyzer 2 will have a new --legacyout option. When it is activated, output is reported in jpylyzer 1.x format. Codestream validation cannot be used if --legacyout is activated.
As jpylyzer 2 wraps the output of all analysed files into well-formed XML, the --wrapper option will be ignored by default. The option will remain available for use with the --legacyout option. However, it will be marked as deprecated in the documentation and helper text.
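For workflows that must cope with both layouts during a transition, a consumer could detect which one it has been given. The Python sketch below is one possible approach under the assumption that both formats share the namespace shown in the examples; read_validity and the abbreviated sample strings are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET

NS = {"j": "http://openpreservation.org/ns/jpylyzer/"}
XMLNS = 'xmlns="http://openpreservation.org/ns/jpylyzer/"'

# Abbreviated samples (illustrative data only).
LEGACY = (
    "<jpylyzer " + XMLNS + "><fileInfo><fileName>aware.jp2</fileName></fileInfo>"
    "<isValidJP2>True</isValidJP2></jpylyzer>"
)
NEW = (
    "<jpylyzer " + XMLNS + "><file><fileInfo><fileName>aware.jp2</fileName>"
    '</fileInfo><isValid format="jp2">True</isValid></file></jpylyzer>'
)


def read_validity(xml_text):
    """Return [(fileName, format, is_valid)] for 1.x or 2.0 style output."""
    root = ET.fromstring(xml_text)
    local_name = root.tag.split("}")[-1]
    if local_name == "results":
        records = root.findall("j:jpylyzer", NS)  # 1.x with --wrapper
    elif root.find("j:file", NS) is not None:
        records = root.findall("j:file", NS)      # proposed 2.0 layout
    else:
        records = [root]                          # 1.x, single file
    rows = []
    for rec in records:
        name = rec.findtext("j:fileInfo/j:fileName", namespaces=NS)
        new_el = rec.find("j:isValid", NS)
        if new_el is not None:
            rows.append((name, new_el.get("format"), new_el.text == "True"))
        else:
            # 1.x output only ever validates against JP2.
            old = rec.findtext("j:isValidJP2", namespaces=NS)
            rows.append((name, "jp2", old == "True"))
    return rows


print(read_validity(LEGACY))
print(read_validity(NEW))
```

Both calls yield the same record for aware.jp2, so downstream code does not need to know which output format produced it.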
Hi Johan, some comments about your proposal.
1/ You should put a version in your namespace if the new format is not backward compatible.
2/ As proposed in my PR, fileLastModified should be in xs:dateTime format (otherwise it depends on the current locale...).
3/ It would be nice to have a propertiesExtension tag (in the same manner as PREMIS, and of type extensionComplexType) to enable the output of other information, like the MIX addition I proposed in my second PR.
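To make point 2/ concrete, here is a minimal Python sketch of formatting a file modification time as an xs:dateTime value. Reporting the time in UTC is an assumption made for this example, and to_xs_datetime is a hypothetical helper, not existing jpylyzer code.

```python
from datetime import datetime, timezone


def to_xs_datetime(mtime_seconds):
    """Format a POSIX timestamp (e.g. from os.path.getmtime) as xs:dateTime.

    The value is expressed in UTC (an assumption for this sketch), so the
    result does not depend on the locale or time zone of the machine.
    """
    return datetime.fromtimestamp(mtime_seconds, tz=timezone.utc).isoformat()


print(to_xs_datetime(0))  # 1970-01-01T00:00:00+00:00
```

Unlike a string such as "Wed Dec 2 08:28:52 2015", this form contains no day or month names and carries an explicit time zone offset.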
Thanks for making this tool so complete and useful.