Skip to content

Instantly share code, notes, and snippets.

Created March 18, 2017 16:55
Show Gist options
  • Save anonymous/0a3a8ec292a4a480a0c01b89ef3a297e to your computer and use it in GitHub Desktop.
Save anonymous/0a3a8ec292a4a480a0c01b89ef3a297e to your computer and use it in GitHub Desktop.
notebooks_demos/notebooks/csw_unidata.ipynb
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"metadata": {
"collapsed": false
},
"cell_type": "markdown",
"source": "# How to search the IOOS CSW catalog with Python tools\n\n\nThis notebook demonstrates a how to query a [Catalog Service for the Web (CSW)](https://en.wikipedia.org/wiki/Catalog_Service_for_the_Web), like the IOOS Catalog, and to parse its results into endpoints that can be used to access the data."
},
{
"metadata": {
"collapsed": true,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:45.379944",
"end_time": "2017-03-18T12:54:45.392957"
}
},
"cell_type": "code",
"source": "import os\nimport sys\n\nioos_tools = os.path.join(os.path.pardir)\nsys.path.append(ioos_tools)",
"execution_count": 1,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's start by creating the search filters.\nThe filter used here constraints the search on a certain geographical region (bounding box), a time span (last week), and some [CF](http://cfconventions.org/Data/cf-standard-names/37/build/cf-standard-name-table.html) variable standard names that represent sea surface temperature."
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:45.413978",
"end_time": "2017-03-18T12:54:45.487051"
}
},
"cell_type": "code",
"source": "from datetime import datetime, timedelta\nimport dateutil.parser\n\nservice_type = 'WMS'\n\nmin_lon, min_lat = -90.0, 30.0 \nmax_lon, max_lat = -80.0, 40.0 \n\nbbox = [min_lon, min_lat, max_lon, max_lat]\ncrs = 'urn:ogc:def:crs:OGC:1.3:CRS84'\n\n# Temporal range: Last week.\nnow = datetime.utcnow()\nstart, stop = now - timedelta(days=(7)), now\n\nstart = dateutil.parser.parse('2017-03-01T00:00:00Z')\nstop = dateutil.parser.parse('2017-04-01T00:00:00Z')\n\n\n# Ocean Model Names\nmodel_names = ['NAM', 'GFS']",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "With these 3 elements it is possible to assemble a [OGC Filter Encoding (FE)](http://www.opengeospatial.org/standards/filter) using the `owslib.fes`\\* module.\n\n\\* OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models."
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:45.493057",
"end_time": "2017-03-18T12:54:50.660219"
}
},
"cell_type": "code",
"source": "from owslib import fes\nfrom ioos_tools.ioos import fes_date_filter\n\nkw = dict(wildCard='*', escapeChar='\\\\',\n singleChar='?', propertyname='apiso:AnyText')\n\nor_filt = fes.Or([fes.PropertyIsLike(literal=('*%s*' % val), **kw)\n for val in model_names])\n\nkw = dict(wildCard='*', escapeChar='\\\\',\n singleChar='?', propertyname='apiso:ServiceType')\n\nserviceType = fes.PropertyIsLike(literal=('*%s*' % service_type), **kw)\n\n\nbegin, end = fes_date_filter(start, stop)\nbbox_crs = fes.BBox(bbox, crs=crs)\n\nfilter_list = [\n fes.And(\n [\n bbox_crs, # bounding box\n begin, end, # start and end date\n or_filt, # or conditions (CF variable names)\n serviceType # search only for datasets that have WMS services\n ]\n )\n]",
"execution_count": 3,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:50.668227",
"end_time": "2017-03-18T12:54:51.159718"
}
},
"cell_type": "code",
"source": "from owslib.csw import CatalogueServiceWeb\n\n\nendpoint = 'https://data.ioos.us/csw'\n\ncsw = CatalogueServiceWeb(endpoint, timeout=60)",
"execution_count": 4,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "The `csw` object created from `CatalogueServiceWeb` did not fetched anything yet.\nIt is the method `getrecords2` that uses the filter for the search. However, even though there is a `maxrecords` option, the search is always limited by the server side and there is the need to iterate over multiple calls of `getrecords2` to actually retrieve all records.\nThe `get_csw_records` does exactly that."
},
{
"metadata": {
"collapsed": true,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:51.166725",
"end_time": "2017-03-18T12:54:51.207766"
}
},
"cell_type": "code",
"source": "def get_csw_records(csw, filter_list, pagesize=10, maxrecords=1000):\n \"\"\"Iterate `maxrecords`/`pagesize` times until the requested value in\n `maxrecords` is reached.\n \"\"\"\n from owslib.fes import SortBy, SortProperty\n # Iterate over sorted results.\n sortby = SortBy([SortProperty('dc:title', 'ASC')])\n csw_records = {}\n startposition = 0\n nextrecord = getattr(csw, 'results', 1)\n while nextrecord != 0:\n csw.getrecords2(constraints=filter_list, startposition=startposition,\n maxrecords=pagesize, sortby=sortby)\n csw_records.update(csw.records)\n if csw.results['nextrecord'] == 0:\n break\n startposition += pagesize + 1 # Last one is included.\n if startposition >= maxrecords:\n break\n csw.records.update(csw_records)",
"execution_count": 5,
"outputs": []
},
{
"metadata": {
"collapsed": false,
"scrolled": true,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:51.216775",
"end_time": "2017-03-18T12:54:53.838394"
}
},
"cell_type": "code",
"source": "get_csw_records(csw, filter_list, pagesize=10, maxrecords=1000)\n\nrecords = '\\n'.join(csw.records.keys())\nprint('Found {} records.\\n'.format(len(csw.records.keys())))\nfor key, value in list(csw.records.items()):\n print('[{}]\\n{}\\n'.format(value.title, key))",
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": "Found 17 records.\n\n[NAM CONUS 40km/Best NAM CONUS 40km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/CONUS_40km/conduit/Best\n\n[NAM CONUS 80km/Best NAM CONUS 80km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/CONUS_80km/Best\n\n[NAM Fireweather Nested/Best NAM Fireweather Nested Time Series/LambertConformal_622X510 (Center 38.53N 78.03W)]\nedu.ucar.unidata:grib/NCEP/NAM/Firewxnest/Best/LambertConformal_622X510-38p53N-78p03W\n\n[NAM Polar 90km/Best NAM Polar 90km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/Polar_90km/Best\n\n[NOAA/NCEP Global Forecast System (GFS) Atmospheric Model]\nncep_global\n\n[NOAA/NCEP Global Forecast System (GFS) Atmospheric Model: Pacific]\nncep_pac\n\n[WaveWatch III (WW3) Global Wave Model]\nww3_global\n\n[NAM CONUS 12km from NOAAPORT/Best NAM CONUS 12km from NOAAPORT Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/CONUS_12km/Best\n\n[NAM CONUS 12km from CONDUIT/Best NAM CONUS 12km from CONDUIT Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/CONUS_12km/conduit/Best\n\n[NAM Alaska 45km from CONDUIT/Best NAM Alaska 45km from CONDUIT Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/Alaska_45km/conduit/Best\n\n[GFS CONUS 20km/Best GFS CONUS 20km Time Series]\nedu.ucar.unidata:grib/NCEP/GFS/CONUS_20km/Best\n\n[NAM Alaska 11km/Best NAM Alaska 11km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/Alaska_11km/Best\n\n[GFS CONUS 80km/Best GFS CONUS 80km Time Series]\nedu.ucar.unidata:grib/NCEP/GFS/CONUS_80km/Best\n\n[NAM CONUS 20km/Best NAM CONUS 20km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/CONUS_20km/noaaport/Best\n\n[NAM Alaska 22km/Best NAM Alaska 22km Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/Alaska_22km/Best\n\n[NAM Alaska 45km from NOAAPORT/Best NAM Alaska 45km from NOAAPORT Time Series]\nedu.ucar.unidata:grib/NCEP/NAM/Alaska_45km/noaaport/Best\n\n[GFS CONUS 95km/Best GFS CONUS 95km Time Series]\nedu.ucar.unidata:grib/NCEP/GFS/CONUS_95km/Best\n\n",
"name": "stdout"
}
]
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:53.845401",
"end_time": "2017-03-18T12:54:53.894450"
}
},
"cell_type": "code",
"source": "csw.request",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "b'<csw:GetRecords xmlns:csw=\"http://www.opengis.net/cat/csw/2.0.2\" xmlns:gml=\"http://www.opengis.net/gml\" xmlns:ogc=\"http://www.opengis.net/ogc\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" outputSchema=\"http://www.opengis.net/cat/csw/2.0.2\" outputFormat=\"application/xml\" version=\"2.0.2\" service=\"CSW\" resultType=\"results\" startPosition=\"11\" maxRecords=\"10\" xsi:schemaLocation=\"http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd\"><csw:Query typeNames=\"csw:Record\"><csw:ElementSetName>summary</csw:ElementSetName><csw:Constraint version=\"1.1.0\"><ogc:Filter><ogc:And><ogc:BBOX><ogc:PropertyName>ows:BoundingBox</ogc:PropertyName><gml:Envelope srsName=\"urn:ogc:def:crs:OGC:1.3:CRS84\"><gml:lowerCorner>-90.0 30.0</gml:lowerCorner><gml:upperCorner>-80.0 40.0</gml:upperCorner></gml:Envelope></ogc:BBOX><ogc:PropertyIsLessThanOrEqualTo><ogc:PropertyName>apiso:TempExtent_begin</ogc:PropertyName><ogc:Literal>2017-04-01 00:00</ogc:Literal></ogc:PropertyIsLessThanOrEqualTo><ogc:PropertyIsGreaterThanOrEqualTo><ogc:PropertyName>apiso:TempExtent_end</ogc:PropertyName><ogc:Literal>2017-03-01 00:00</ogc:Literal></ogc:PropertyIsGreaterThanOrEqualTo><ogc:Or><ogc:PropertyIsLike wildCard=\"*\" singleChar=\"?\" escapeChar=\"\\\\\"><ogc:PropertyName>apiso:AnyText</ogc:PropertyName><ogc:Literal>*NAM*</ogc:Literal></ogc:PropertyIsLike><ogc:PropertyIsLike wildCard=\"*\" singleChar=\"?\" escapeChar=\"\\\\\"><ogc:PropertyName>apiso:AnyText</ogc:PropertyName><ogc:Literal>*GFS*</ogc:Literal></ogc:PropertyIsLike></ogc:Or><ogc:PropertyIsLike wildCard=\"*\" singleChar=\"?\" escapeChar=\"\\\\\"><ogc:PropertyName>apiso:ServiceType</ogc:PropertyName><ogc:Literal>*WMS*</ogc:Literal></ogc:PropertyIsLike></ogc:And></ogc:Filter></csw:Constraint><ogc:SortBy><ogc:SortProperty><ogc:PropertyName>dc:title</ogc:PropertyName><ogc:SortOrder>ASC</ogc:SortOrder></ogc:SortProperty></ogc:SortBy></csw:Query></csw:GetRecords>'"
},
"metadata": {},
"execution_count": 7
}
]
},
{
"metadata": {
"collapsed": false,
"trusted": true,
"ExecuteTime": {
"start_time": "2017-03-18T12:54:53.902458",
"end_time": "2017-03-18T12:54:53.916472"
}
},
"cell_type": "code",
"source": "#write to JSON for use in TerriaJS\ncsw_request = '\"{}\": {}\"'.format('getRecordsTemplate',str(csw.request,'utf-8'))\n\nimport io\nimport json\nwith io.open('query.json', 'a', encoding='utf-8') as f:\n f.write(json.dumps(csw_request, ensure_ascii=False))\n f.write('\\n')",
"execution_count": 8,
"outputs": []
}
],
"metadata": {
"_draft": {
"nbviewer_url": "https://gist.github.com/8ca00c0386618a87062f9b8b490b3161"
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"gist": {
"id": "8ca00c0386618a87062f9b8b490b3161",
"data": {
"description": "notebooks_demos/notebooks/csw_unidata.ipynb",
"public": true
}
},
"language_info": {
"version": "3.5.3",
"file_extension": ".py",
"name": "python",
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"codemirror_mode": {
"version": 3,
"name": "ipython"
},
"mimetype": "text/x-python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment