@datadavev
Created March 30, 2021 13:07
Demonstrate forcing https for schema.org context
{
"cells": [
{
"cell_type": "markdown",
"id": "spiritual-canadian",
"metadata": {},
"source": [
"# Forcing https domain for schema.org\n",
"\n",
"Demonstrates forcing the `https://schema.org/` namespace when expanding JSON-LD with [pyld](https://github.com/digitalbazaar/pyld).\n",
"\n",
"A custom document loader intercepts requests for the `schema.org` context and replaces the requested URL with the URL of a `schema.org` context document that uses the `https` namespace variant."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "aquatic-thailand",
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"import json\n",
"import pyld.jsonld\n",
"\n",
"# Toggle for redirecting schema.org context requests to the https variant\n",
"SO_FORCE_HTTPS = True\n",
"# Matches http(s)://schema.org, optionally followed by a path\n",
"SO_MATCH = re.compile(r\"https?://schema\\.org(/|$)\")\n",
"SO_CONTEXT_LOCATION = \"https://raw.githubusercontent.com/schemaorg/schemaorg/836cae785cfcb09fe69d0a611be9b8c73b67a0d4/data/releases/12.0/schemaorgcontext.jsonld\"\n",
"\n",
"def schemaOrgDocumentLoader(url, options={}):\n",
"    # Swap any schema.org URL for the pinned https context document\n",
"    if SO_FORCE_HTTPS and SO_MATCH.match(url) is not None:\n",
"        url = SO_CONTEXT_LOCATION\n",
"    loader = pyld.jsonld.requests_document_loader()\n",
"    return loader(url, options=options)\n",
"\n",
"pyld.jsonld.set_document_loader(schemaOrgDocumentLoader)"
]
},
{
"cell_type": "markdown",
"id": "computational-month",
"metadata": {},
"source": [
"This test case expands a simple JSON-LD document with https forcing disabled, which triggers retrieval of the remote context document from the URL `http://schema.org/`. After redirects and Link header traversal, the context document actually retrieved is `https://schema.org/docs/jsonldcontext.jsonld`, and the expanded terms use the `http://schema.org/` namespace."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "everyday-aquatic",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[\n",
"  {\n",
"    \"@id\": \"https://example.net/test_1\",\n",
"    \"@type\": [\n",
"      \"http://schema.org/Thing\"\n",
"    ],\n",
"    \"http://schema.org/description\": [\n",
"      {\n",
"        \"@value\": \"Simple test document for schema.org\"\n",
"      }\n",
"    ]\n",
"  }\n",
"]\n"
]
}
],
"source": [
"test1 = '''{\n",
" \"@context\": \"http://schema.org/\",\n",
" \"@id\":\"test_1\",\n",
" \"@type\": \"Thing\",\n",
" \"description\":\"Simple test document for schema.org\"\n",
"}\n",
"'''\n",
"\n",
"# Disable schema.org https forcing\n",
"SO_FORCE_HTTPS = False\n",
"\n",
"jld = json.loads(test1)\n",
"expanded = pyld.jsonld.expand(jld, options={\"base\":\"https://example.net/\"})\n",
"print(json.dumps(expanded, indent=2))"
]
},
{
"cell_type": "markdown",
"id": "periodic-medium",
"metadata": {},
"source": [
"Expand the same test document, this time forcing the use of a `schema.org` [context document that uses the `https` variant of the namespace](https://raw.githubusercontent.com/schemaorg/schemaorg/836cae785cfcb09fe69d0a611be9b8c73b67a0d4/data/releases/12.0/schemaorgcontext.jsonld). The expanded terms now use the `https://schema.org/` namespace."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "public-tablet",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[\n",
"  {\n",
"    \"@id\": \"https://example.net/test_1\",\n",
"    \"@type\": [\n",
"      \"https://schema.org/Thing\"\n",
"    ],\n",
"    \"https://schema.org/description\": [\n",
"      {\n",
"        \"@value\": \"Simple test document for schema.org\"\n",
"      }\n",
"    ]\n",
"  }\n",
"]\n"
]
}
],
"source": [
"# Enable schema.org https forcing\n",
"SO_FORCE_HTTPS = True\n",
"\n",
"jld = json.loads(test1)\n",
"expanded = pyld.jsonld.expand(jld, options={\"base\":\"https://example.net/\"})\n",
"print(json.dumps(expanded, indent=2))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "harmful-going",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
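
The URL interception in the notebook can be checked without network access. The following is a minimal sketch, not part of the original notebook: `make_https_forcing_loader` and `recording_loader` are hypothetical names introduced here, and the stand-in loader only records the URLs it is asked for instead of fetching them. The returned dictionary is a simplified placeholder, not a claim about the exact document structure pyld expects.

```python
import re

# Same pinned context document the notebook uses.
SO_CONTEXT_LOCATION = (
    "https://raw.githubusercontent.com/schemaorg/schemaorg/"
    "836cae785cfcb09fe69d0a611be9b8c73b67a0d4/"
    "data/releases/12.0/schemaorgcontext.jsonld"
)
# Matches http(s)://schema.org, optionally followed by a path.
SO_MATCH = re.compile(r"https?://schema\.org(/|$)")


def make_https_forcing_loader(inner_loader, force_https=True):
    """Wrap inner_loader so schema.org URLs resolve to SO_CONTEXT_LOCATION.

    Hypothetical helper: inner_loader is any callable taking (url, options).
    """
    def loader(url, options=None):
        if force_https and SO_MATCH.match(url) is not None:
            url = SO_CONTEXT_LOCATION
        return inner_loader(url, options=options or {})
    return loader


# Hypothetical stand-in loader: records requested URLs, fetches nothing.
requested = []


def recording_loader(url, options=None):
    requested.append(url)
    return {"contextUrl": None, "documentUrl": url, "document": {}}


loader = make_https_forcing_loader(recording_loader)
loader("http://schema.org/")                  # rewritten to the pinned context
loader("https://example.net/context.jsonld")  # passed through unchanged
print(requested)
```

With pyld installed, the same wrapper could presumably be registered as `pyld.jsonld.set_document_loader(make_https_forcing_loader(pyld.jsonld.requests_document_loader()))`, which also avoids constructing a new requests loader on every call.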