Created December 7, 2020 01:52

Schemarama CORE

Node.js project for validating structured data with ShEx and SHACL. Main tasks: parsing structured data, ShEx validation and SHACL validation.

How to use it?


Parsing

Schemarama can parse four data formats: JSON-LD, Microdata, RDFa and N-Quads. There are four corresponding parsing functions with the same interface: parseJsonLd, parseMicrodata, parseRdfa and parseNQuads. Each of them takes the data as a string and a baseUrl:

const store = await parseJsonLd(data, baseUrl);
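Because the four parsers share one interface, choosing between them can be reduced to a lookup. The sketch below is only an illustration: the format keys (jsonld, microdata, rdfa, nquads) and the parserFor helper are hypothetical, and the parsers object is assumed to hold the four functions named above, however you import them from your checkout or bundle.

```javascript
// Hypothetical format keys mapped to the parser names from the text.
const PARSER_BY_FORMAT = new Map([
  ['jsonld', 'parseJsonLd'],
  ['microdata', 'parseMicrodata'],
  ['rdfa', 'parseRdfa'],
  ['nquads', 'parseNQuads'],
]);

// Returns the parser function for a format key.
// `parsers` is any object exposing the four parse* functions.
function parserFor(parsers, format) {
  const name = PARSER_BY_FORMAT.get(format);
  const parse = name ? parsers[name] : undefined;
  if (typeof parse !== 'function') {
    throw new Error(`No parser available for format: ${format}`);
  }
  return parse;
}

// Usage (inside an async function), assuming `parsers` is available:
//   const store = await parserFor(parsers, 'jsonld')(data, baseUrl);
```

Passing the parsers in as an argument keeps the helper independent of how the module is loaded (a require in Node.js or a bundle in the browser).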


ShEx validation

To validate structured data with ShEx, you need ShEx shapes in the ShExJ format to construct the ShexValidator object. You can also pass annotations as a map of {label in the target report} -> {predicate in the annotation}; this is optional and only useful if the shapes have annotations. The data to validate can be a string or a parsed n3.Store. ShEx validation also requires a start shape, given as an IRI. To run the validation, call shexValidator.validate(data, startShape). You can optionally pass a baseUrl to the validation.

const ShexValidator = require('./schemarama/shexValidator').Validator;
// annotations - map of {label in the report} -> {annotation predicate}
const annotations = {
    description: '',
    severity: ''
};
// shapes - JSON object with shapes in ShExJ format
const validator = new ShexValidator(shapes, {annotations: annotations});
// startShape - IRI of the start shape
const startShape = '';
// data - string or n3.js Store that should be validated
validator.validate(data, startShape, {baseUrl: ''})
    .then(report => report.failures.forEach(failure => console.log(failure)));
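To make the annotations map more concrete, here is a minimal sketch of how a map of {label in the report} -> {annotation predicate} can be read against a ShExJ triple constraint. It is not Schemarama's implementation; the annotationFields helper, the rdfs:comment predicate and the sample constraint are all assumptions chosen for illustration.

```javascript
// Hypothetical annotation predicate, used only for this example.
const RDFS_COMMENT = 'http://www.w3.org/2000/01/rdf-schema#comment';

// A ShExJ triple constraint carrying one annotation.
const constraint = {
  type: 'TripleConstraint',
  predicate: 'http://schema.org/name',
  annotations: [
    { type: 'Annotation', predicate: RDFS_COMMENT, object: { value: 'A name is required.' } },
  ],
};

// Map of {label in the report} -> {annotation predicate}.
const annotations = { description: RDFS_COMMENT };

// Collects, for every label in the map, the value of the matching annotation.
function annotationFields(tripleConstraint, annotationMap) {
  const fields = {};
  for (const [label, predicate] of Object.entries(annotationMap)) {
    const hit = (tripleConstraint.annotations || []).find(a => a.predicate === predicate);
    if (hit) {
      // In ShExJ an annotation object is either an IRI string or a literal object.
      fields[label] = typeof hit.object === 'string' ? hit.object : hit.object.value;
    }
  }
  return fields;
}

console.log(annotationFields(constraint, annotations));
// prints { description: 'A name is required.' }
```

Fields collected this way correspond to the "properties from annotations" that appear in a failure alongside the fixed fields.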

SHACL validation

ShaclValidator is very similar to ShexValidator. Shapes should be provided as a string in Turtle format. SHACL uses targets to determine which shapes apply to each validated piece of structured data, so it doesn't need an explicitly specified start shape (as ShEx does).

const ShaclValidator = require('./schemarama/shaclValidator').Validator;
// annotations - map of {label in the report} -> {annotation predicate}
const annotations = {
    description: '',
    severity: ''
};
// shapes - string with SHACL shapes in Turtle format
const validator = new ShaclValidator(shapes, {annotations: annotations});
// data - string or n3.js Store that should be validated
validator.validate(data, {baseUrl: ''})
    .then(report => report.failures.forEach(failure => console.log(failure)));
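The role of targets is easier to see with a concrete shape. The sketch below uses a made-up shape (ex:RecipeShape in a hypothetical http://example.org/shapes# namespace) whose sh:targetClass selects every schema:Recipe node, which is why no start shape is passed. The shaclFailures helper is also hypothetical; it takes the Validator class as an argument and assumes the constructor and validate signatures shown above.

```javascript
// Hypothetical SHACL shapes in Turtle. sh:targetClass is the target
// declaration that tells a SHACL validator which nodes the shape applies to.
const shapes = `
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix schema: <http://schema.org/> .
@prefix ex: <http://example.org/shapes#> .

ex:RecipeShape a sh:NodeShape ;
  sh:targetClass schema:Recipe ;
  sh:property [
    sh:path schema:name ;
    sh:minCount 1 ;
    sh:message "A recipe needs a name." ;
  ] .
`;

// Runs a validation and resolves to the failures array of the report.
// `Validator` is assumed to behave like the ShaclValidator described above.
async function shaclFailures(Validator, data, baseUrl) {
  const validator = new Validator(shapes, { annotations: {} });
  const report = await validator.validate(data, { baseUrl });
  return report.failures;
}
```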

Validation report

A failure (StructuredDataFailure) has the same structure for both ShEx and SHACL:

{
    property: string, // property that caused the failure
    message: string, // short message with information about the failure
    severity: 'error'|'warning'|'info', // severity; three types are used by default, but more can be added if needed
    shape: string, // shape containing the failing constraint
    node: string, // node in the structured data to which the property belongs
    // properties from annotations
}

The validate method returns the validation report as an object with three fields:

{
    baseUrl: string,
    store: Store, // n3.Store with parsed triples
    failures: StructuredDataFailure[] // array with failures
}
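Since the report is a plain object, it is easy to post-process. As a small sketch, the hypothetical helpers below count failures per severity and decide whether a report should be treated as failing; they rely only on the failures and severity fields described above.

```javascript
// Counts failures per severity. The three default severities are always
// present in the result; any custom severities are added as they appear.
function countBySeverity(report) {
  const counts = { error: 0, warning: 0, info: 0 };
  for (const failure of report.failures) {
    counts[failure.severity] = (counts[failure.severity] || 0) + 1;
  }
  return counts;
}

// True if at least one failure has severity 'error'.
function hasErrors(report) {
  return report.failures.some(failure => failure.severity === 'error');
}

// Usage with a report returned by validate():
//   validator.validate(data, {baseUrl: ''}).then(report => {
//     console.log(countBySeverity(report), hasErrors(report));
//   });
```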


To use this module in the browser, you need to create a bundle with webpack.
All commands below should be executed in the /core directory of the project:

  1. If you don't have node and npm installed, download an installer from the Node.js website or, on Ubuntu, run sudo apt update, sudo apt install nodejs and sudo apt install npm.
  2. Run npm install
  3. Run npx webpack --config webpack.config.js
  4. Four bundles will appear in the dist folder. Choose the one that best suits your needs:
    • An uncompressed bundle for the full package - schemarama.bundle.js;
    • A minimized bundle for the full package - schemarama.bundle.min.js;
    • An uncompressed bundle for the parsing-only package - schemarama-parsing.bundle.js;
    • A minimized bundle for the parsing-only package - schemarama-parsing.bundle.min.js.

CLI mode

To use this project as a CLI, first run npm install in the /core folder.

Parsing:

node cli --parse
Required arguments:
--input <file path> - path to the input file.
Optional arguments:
--output <file path> - path to the output file. If not specified, output will be printed to the console.
--format <format> - one of the output formats. Available formats: nquads|ntriples|turtle|trig. 'nquads' is used by default.
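When the CLI is driven from a script, the arguments can be assembled programmatically. The following sketch builds the argument list for the parse command from the flags listed above; the buildParseArgs helper and the file names in the usage comment are hypothetical.

```javascript
// Output formats listed for --format.
const OUTPUT_FORMATS = ['nquads', 'ntriples', 'turtle', 'trig'];

// Builds the argument array for `node cli --parse`.
function buildParseArgs({ input, output, format }) {
  if (!input) {
    throw new Error('--input is required');
  }
  if (format !== undefined && !OUTPUT_FORMATS.includes(format)) {
    throw new Error(`Unknown format: ${format}`);
  }
  const args = ['cli', '--parse', '--input', input];
  if (output) args.push('--output', output);
  if (format) args.push('--format', format);
  return args;
}

// Usage, run from the directory that contains /core (file names are made up):
//   const { execFileSync } = require('child_process');
//   execFileSync('node', buildParseArgs({ input: 'page.html', format: 'turtle' }),
//                { cwd: 'core', stdio: 'inherit' });
```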

ShEx validation:

node cli --validate
Required arguments:
--shex <file path> - path to ShEx shapes. Can be either a local path or a URL.
--input <file path> - path to the input file.
--target <shapeURI> - target shape, e.g.
Optional arguments:
--output <file path> - path to the output file. If not specified, output will be printed to the console.
--base <URI> - base data URI (@id). If not specified, a random URI will be used.
--annotations <file path> - path to the annotation correspondence file, in JSON format, where keys are keys in the failure report object and values are annotation predicates. If not specified, annotations will be ignored.

SHACL validation:

node cli --validate
Required arguments:
--shacl <file path> - path to SHACL shapes. Can be either a local path or a URL.
--input <file path> - path to the input file.
Optional arguments:
--output <file path> - path to the output file. If not specified, output will be printed to the console.
--base <URI> - base data URI (@id). If not specified, a random URI will be used.
--annotations <file path> - path to the annotation correspondence file, in JSON format, where keys are keys in the failure report object and values are annotation predicates. If not specified, annotations will be ignored.
--subclasses <file path> - additional triples that should be added to the data for every validation. Originally used for adding the subclass structure to the data.
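The two validate modes differ only in a few flags, which the sketch below makes explicit. The buildValidateArgs helper is hypothetical, and it assumes that each call uses exactly one shape language; the source does not say whether --shex and --shacl may be combined.

```javascript
// Builds the argument array for `node cli --validate` in ShEx or SHACL mode.
function buildValidateArgs(options) {
  const { shex, shacl, input, target, output, base, annotations, subclasses } = options;
  if (!input) {
    throw new Error('--input is required');
  }
  // Assumption: exactly one of --shex / --shacl per invocation.
  if (Boolean(shex) === Boolean(shacl)) {
    throw new Error('Pass exactly one of shex or shacl');
  }
  if (shex && !target) {
    throw new Error('--target is required for ShEx validation');
  }
  const args = ['cli', '--validate', '--input', input];
  if (shex) {
    args.push('--shex', shex, '--target', target);
  } else {
    args.push('--shacl', shacl);
  }
  if (output) args.push('--output', output);
  if (base) args.push('--base', base);
  if (annotations) args.push('--annotations', annotations);
  // --subclasses is listed for SHACL validation only.
  if (shacl && subclasses) args.push('--subclasses', subclasses);
  return args;
}
```

The resulting array can be passed to child_process.execFileSync('node', args, { cwd: 'core' }) in the same way as for any other Node.js script.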
