@psychemedia
Last active June 26, 2022 19:30
demo - nomis API python wrapper
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"activity": false
},
"source": [
"# Demo Python Wrapper for nomis API\n",
"\n",
"*An example Python wrapper for the nonis API.*\n",
"\n",
"Something I started exploring some time ago was the ability to [generate textual reports around monthly JSA figures](http://nbviewer.ipython.org/gist/psychemedia/86d436aa2a3a6914f618/nomis_textualisation_test.ipynb). At the time, I found it really tricky to navigate the [nomis API](https://www.nomisweb.co.uk/api/v01/help?uid=0xd57c2bd58aa382ddb5cae1383cbe476f36609e57) in an efficient way, as for example when trying to generate URLs like the following to get the total JSA claimant count for the Isle of Wight:"
]
},
{
"cell_type": "code",
"execution_count": 457,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>SEX_NAME</th>\n",
" <th>GEOGRAPHY_NAME</th>\n",
" <th>MEASURES_NAME</th>\n",
" <th>DATE_CODE</th>\n",
" <th>DATE_NAME</th>\n",
" <th>OBS_VALUE</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Male</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-06</td>\n",
" <td> June 1983</td>\n",
" <td> 3504</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Female</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-06</td>\n",
" <td> June 1983</td>\n",
" <td> 1336</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Total</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-06</td>\n",
" <td> June 1983</td>\n",
" <td> 4840</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Male</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-07</td>\n",
" <td> July 1983</td>\n",
" <td> 3458</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> Female</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-07</td>\n",
" <td> July 1983</td>\n",
" <td> 1302</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Total</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-07</td>\n",
" <td> July 1983</td>\n",
" <td> 4760</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> Male</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-08</td>\n",
" <td> August 1983</td>\n",
" <td> 3409</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> Female</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-08</td>\n",
" <td> August 1983</td>\n",
" <td> 1264</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> Total</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 1983-08</td>\n",
" <td> August 1983</td>\n",
" <td> 4673</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" SEX_NAME GEOGRAPHY_NAME MEASURES_NAME DATE_CODE DATE_NAME \\\n",
"0 Male Isle of Wight Persons claiming JSA 1983-06 June 1983 \n",
"1 Female Isle of Wight Persons claiming JSA 1983-06 June 1983 \n",
"2 Total Isle of Wight Persons claiming JSA 1983-06 June 1983 \n",
"3 Male Isle of Wight Persons claiming JSA 1983-07 July 1983 \n",
"4 Female Isle of Wight Persons claiming JSA 1983-07 July 1983 \n",
"5 Total Isle of Wight Persons claiming JSA 1983-07 July 1983 \n",
"6 Male Isle of Wight Persons claiming JSA 1983-08 August 1983 \n",
"7 Female Isle of Wight Persons claiming JSA 1983-08 August 1983 \n",
"8 Total Isle of Wight Persons claiming JSA 1983-08 August 1983 \n",
"\n",
" OBS_VALUE \n",
"0 3504 \n",
"1 1336 \n",
"2 4840 \n",
"3 3458 \n",
"4 1302 \n",
"5 4760 \n",
"6 3409 \n",
"7 1264 \n",
"8 4673 "
]
},
"execution_count": 457,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"baseURL='http://www.nomisweb.co.uk/api/v01/dataset/NM_1_1.data.csv?'\n",
"url=baseURL+'geography=2038431803&sex=5,6,7&item=1&measures=20100'\n",
"#Projection\n",
"url+='&select=sex_name,geography_name,measures_name,date_code,date_name,obs_value'\n",
"tmp=pd.read_csv(url)\n",
"tmp[:9]"
]
},
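{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sketch (not part of the original wrapper), the same URL can be assembled from a list of parameter pairs rather than by string concatenation. The parameter values are copied from the cell above; `query` is an illustrative name. Note that `urlencode` percent-encodes the commas as `%2C`, which is assumed here to be equivalent as far as the API is concerned."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"try:\n",
"    from urllib.parse import urlencode  # Python 3\n",
"except ImportError:\n",
"    from urllib import urlencode  # Python 2\n",
"\n",
"baseURL='http://www.nomisweb.co.uk/api/v01/dataset/NM_1_1.data.csv'\n",
"\n",
"#Filter and projection values copied from the hand-built URL above\n",
"query=[('geography','2038431803'),\n",
"       ('sex','5,6,7'),\n",
"       ('item','1'),\n",
"       ('measures','20100'),\n",
"       ('select','sex_name,geography_name,measures_name,date_code,date_name,obs_value')]\n",
"\n",
"#Join the pairs into a query string and append it to the dataset URL\n",
"url='{}?{}'.format(baseURL, urlencode(query))\n",
"print(url)"
]
},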
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So I've started working on the follow to try to make it easier to work with this API in code...\n",
"\n",
"The idea is that we should be able to have a conversation with the API to generate URLs that will bring bring back desired datasets or slices/filtered versions of particular datasets, and then retrive those datasets into a *pandas* dataframe so we can start to work with it."
]
},
{
"cell_type": "code",
"execution_count": 458,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import urllib\n",
"import re\n",
"\n",
"class NOMIS_CONFIG:\n",
" #TO DO implement cache to cache list of datasets and dimensions associated with datasets (except time/date?)\n",
" \n",
" def __init__(self):\n",
" NOMIS_STUB='https://www.nomisweb.co.uk/api/v01/dataset/'\n",
" \n",
" self.url=NOMIS_STUB\n",
" self.codes=None\n",
" self.metadata={}\n",
"\n",
" def _url_encode(self,params=None):\n",
" if params is not None and params!='' and params != {}:\n",
" #params='?{}'.format( '&'.join( ['{}={}'.format(p,params[p]) for p in params] ) )\n",
" params='?{}'.format(urllib.urlencode(params))\n",
" else:\n",
" params=''\n",
" return params\n",
"\n",
"\n",
" def _describe_dataset(self,df):\n",
" for row in df.iterrows():\n",
" dfr=row[1]\n",
" print('{idx} - {name}: {description}\\n'.format(idx=dfr['idx'],\n",
" name=dfr['name'],\n",
" description=dfr['description']) )\n",
" \n",
" def _describe_metadata(self,idx,df,keys,pretty=True):\n",
" if not pretty:\n",
" for key in keys:\n",
" print( '---- {} ----'.format(key) )\n",
" for row in df[key].iterrows():\n",
" dfr=row[1]\n",
" print('{dimension} - {description}: {value}'.format(dimension=dfr['dimension'],\n",
" description=dfr['description'],\n",
" value=dfr['value']) )\n",
" else:\n",
" print('The following dimensions are available for {idx} ({name}):\\n'.format(\n",
" idx=idx, \n",
" name=self.dataset_lookup_property(idx,'name')))\n",
" for key in keys:\n",
" items =['{} ({})'.format(row[1]['description'],row[1]['value']) for row in df[key].iterrows()]\n",
" print( ' - {key}: {items}'.format(key=key,items=', '.join(items)) )\n",
" \n",
" def help_url(self,idx='NM_7_1'):\n",
" metadata=self.nomis_code_metadata(idx)\n",
" keys=metadata.keys()\n",
" keys.remove('core')\n",
" print('Dataset {idx} ({name}) supports the following dimensions: {dims}.'.format(\n",
" idx=idx,\n",
" dims=', '.join(keys),\n",
" name=self.dataset_lookup_property(idx,'name')))\n",
"\n",
" def dataset_lookup_property(self,idx=None,prop=None):\n",
" if idx is None or prop is None: return ''\n",
" df=self.dataset_lookup(idx)\n",
"\n",
" if prop in df.columns: return str(df[prop][0])\n",
" else: return ''\n",
" \n",
" def dataset_lookup(self,idx=None,dimensions=False,describe=False):\n",
" ##dimensions used in sense of do we display them or not\n",
" if self.codes is None:\n",
" self.codes=self.nomis_codes_datasets(dimensions=True)\n",
" \n",
" if idx is not None:\n",
" #Test if idx is a list or single string\n",
" if isinstance(idx, str): idx=[idx]\n",
" df=self.codes[self.codes['idx'].isin(idx)]\n",
" else:\n",
" df=self.codes[:]\n",
" \n",
" cols=df.columns.tolist() \n",
" if not dimensions:\n",
" for col in ['dimension','concept']:\n",
" cols.remove(col)\n",
" df=df[cols].drop_duplicates().reset_index(drop=True)\n",
" if describe: self._describe_dataset(df)\n",
" else: return df\n",
" \n",
" def _get_geo_from_postcode(self, postcode, areacode=None):\n",
" #Set a default\n",
" if areacode is None:\n",
" areacode='district'\n",
" \n",
" codemap={ 'district':486 }\n",
"\n",
" if areacode in codemap:\n",
" areacode=codemap[areacode]\n",
" \n",
" return 'POSTCODE|{postcode};{code}'.format(postcode=postcode,code=areacode)\n",
"\n",
" \n",
" def _dimension_mapper(self,idx,dim,dims):\n",
" ''' dims is a string of comma separated values for a particular dimension ''' \n",
" if dim is not None:\n",
" sc=self._nomis_codes_dimension_grab(dim,idx,params=None)\n",
" dimmap=dict(zip(sc['description'].astype(str),sc['value']))\n",
" keys=dimmap.keys()\n",
" keys.sort(key=len, reverse=True)\n",
" for s in keys:\n",
" pattern = re.compile(s, re.IGNORECASE)\n",
" dims=pattern.sub(str(dimmap[s]), str(dims))\n",
" return dims\n",
" \n",
" def _sex_map(self,idx,sex):\n",
" return self._dimension_mapper(idx,'sex',sex)\n",
" \n",
" def _get_geo_code_helper(self,helper):\n",
" value=None\n",
" desc=None\n",
"\n",
" #I am baking values in, but maybe they should be searched for and retrieved that way?\n",
" if helper=='UK_WPC_2010':\n",
" #UK Westminster Parliamentary Constituency\n",
" value='2092957697TYPE460'\n",
" elif helper=='LA_district':\n",
" value='2092957697TYPE464'\n",
"\n",
" return value,desc\n",
"\n",
" def get_geo_code(self,value=None,desc=None, search=None, helper=None, chase=False):\n",
" #The semantics of this are quite tricky\n",
" #value is a code for a geography, the thing searched within\n",
" #desc identifies a description within a geography - on a match it takes you to this lower geography\n",
" #search is term to search (free text search) with the descriptions of areas returned\n",
" #helper is in place for shortcuts\n",
"\n",
" #Given a local authority code, eg 1946157281, a report can be previewed at:\n",
" ##https://www.nomisweb.co.uk/reports/lmp/la/1946157281/report.aspx\n",
" #default\n",
" if helper is not None:\n",
" value,desc=self._get_geo_code_helper(helper)\n",
" if chase:\n",
" chaser= self.nomis_codes_geog(geography=value)\n",
" if search is not None:\n",
" chasecands=chaser[ chaser['description'].str.contains(search) ][['description','value']].values\n",
" else:\n",
" chasecands=chaser[['description','value']].values\n",
" locs=[]\n",
" for chasecand in chasecands:\n",
" locs.append(chasecand[1])\n",
" if len(locs): value=','.join(map(str,locs))\n",
"\n",
" geog=self.nomis_codes_geog(geography=value)\n",
" if desc is not None:\n",
" candidates=geog[['description','value']].values\n",
" for candidate in candidates:\n",
" if candidate[0]==desc:\n",
" geog=self.nomis_codes_geog(geography=candidate[1])\n",
"\n",
" if search is not None:\n",
" retval=geog[ geog['description'].str.contains(search) ][['description','value']].values\n",
" else:\n",
" retval=geog[['description','value']].values\n",
"\n",
" return pd.DataFrame(retval,columns=['description','geog'])\n",
"\n",
" def _get_datasets(self,search=None):\n",
" url='http://www.nomisweb.co.uk/api/v01/dataset/def.sdmx.json'\n",
" if search is not None:\n",
" url='{url}{params}'.format(url=url,params=self._url_encode({'search':search}))\n",
" data=pd.read_json(url)\n",
" return data\n",
"\n",
" def nomis_code_metadata(self,idx='NM_1_1',describe=None):\n",
" if idx in self.metadata:\n",
" metadata=self.metadata[idx]\n",
" else:\n",
" core=self.dataset_lookup(idx,dimensions=True)\n",
" metadata={'core':core}\n",
" for dim in core['concept'].str.lower():\n",
" metadata[dim]=self._nomis_codes_dimension_grab(dim,idx,params=None)\n",
" self.metadata[idx]=metadata \n",
" if describe=='all':\n",
" keys= metadata.keys()\n",
" keys.remove('core')\n",
" self._describe_metadata(idx,metadata,keys)\n",
" elif isinstance(describe, str) and describe in metadata.keys():\n",
" self._describe_metadata(idx,metadata,[describe])\n",
" elif isinstance(describe, list):\n",
" self._describe_metadata(idx,metadata,describe)\n",
" else:\n",
" return metadata\n",
" \n",
" \n",
" def nomis_codes_datasets(self,search=None,dimensions=False):\n",
" #TO DO - by default, use local dataset list and search in specified cols;\n",
" # add additional parameter to force a search on API\n",
" \n",
" df=self._get_datasets(search)\n",
"\n",
" keyfamilies=df.loc['keyfamilies']['structure']\n",
" if keyfamilies is None: return pd.DataFrame()\n",
" \n",
" datasets=[]\n",
" for keyfamily in keyfamilies['keyfamily']:\n",
" kf={'agency':keyfamily['agencyid'],\n",
" 'idx':keyfamily['id'],\n",
" 'name':keyfamily['name']['value'],\n",
" 'description': keyfamily['description']['value'] if 'description' in keyfamily else ''\n",
" #'dimensions':[dimensions['codelist'] for dimensions in keyfamily['components']['dimension']]\n",
" }\n",
"\n",
" if dimensions:\n",
" for _dimensions in keyfamily['components']['dimension']:\n",
" kf['dimension']= _dimensions['codelist']\n",
" kf['concept']= _dimensions['conceptref']\n",
" datasets.append(kf.copy())\n",
" else:\n",
" datasets.append(kf.copy())\n",
" \n",
" return pd.DataFrame(datasets)\n",
"\n",
" def _nomis_codes_parser(self,url):\n",
" jdata=pd.read_json(url)\n",
" cl=jdata.loc['codelists']['structure']\n",
" if cl is None: return pd.DataFrame()\n",
" \n",
" codes_data=[]\n",
" for codelist in cl['codelist']:\n",
" code_data={'agencyid':codelist['agencyid'],\n",
" 'dataset':jdata.loc['header']['structure']['id'],\n",
" 'dimension':codelist['id'],\n",
" 'name':codelist['name']['value']\n",
" }\n",
" for code in codelist['code']:\n",
" code_data['description']=code['description']['value']\n",
" code_data['value']=code['value']\n",
" codes_data.append(code_data.copy())\n",
" return pd.DataFrame(codes_data)\n",
"\n",
" #Generic mininal constructor\n",
" def _nomis_codes_url_constructor(self,dim,idx,params=None):\n",
" #This doesn't cope with geography properly that can insert an element into the path?\n",
" return '{nomis}{idx}/{dim}.def.sdmx.json{params}'.format(nomis=self.url,\n",
" idx=idx,\n",
" dim=dim.lower(),\n",
" params=self._url_encode(params))\n",
" def _nomis_codes_dimension_grab(self,dim,idx,params=None):\n",
" url=self._nomis_codes_url_constructor(dim,idx,params=None)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" #Set up shorthand functions to call particular dimensions\n",
" #Select appropriate datsets as default to demo the call\n",
" def nomis_codes_measures(self,idx='NM_1_1'):\n",
" url=self._nomis_codes_url_constructor('measures',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_time(self,idx='NM_1_1'):\n",
" url=self._nomis_codes_url_constructor('time',idx)\n",
" return self._nomis_codes_parser(url)\n",
"\n",
" def nomis_codes_industry(self,idx='NM_21_1'):\n",
" url=self._nomis_codes_url_constructor('industry',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_freq(self,idx='NM_1_1'):\n",
" url=url=self._nomis_codes_url_constructor('freq',idx)\n",
" return self._nomis_codes_parser(url)\n",
"\n",
" def nomis_codes_age_dur(self,idx='NM_7_1'):\n",
" url=url=self._nomis_codes_url_constructor('age_dur',idx)\n",
" return self._nomis_codes_parser(url)\n",
"\n",
" def nomis_codes_ethnicity(self,idx='NM_118_1'):\n",
" url=url=self._nomis_codes_url_constructor('ethnicity',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_occupation(self,idx='NM_7_1'):\n",
" url=url=self._nomis_codes_url_constructor('occupation',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_age(self,idx='NM_18_1'):\n",
" url=url=self._nomis_codes_url_constructor('age',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_duration(self,idx='NM_18_1'):\n",
" url=url=self._nomis_codes_url_constructor('duration',idx)\n",
" return self._nomis_codes_parser(url)\n",
" \n",
"\n",
" def nomis_codes_sex(self,idx='NM_1_1',geography=None):\n",
" params={}\n",
" if geography is not None:\n",
" params['geography']=geography\n",
"\n",
" url='{nomis}{idx}/sex.def.sdmx.json{params}'.format(nomis=self.url,\n",
" idx=idx,\n",
" params=self._url_encode(params))\n",
"\n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_geog(self,idx='NM_1_1',geography=None,search=None):\n",
" params={}\n",
" if geography is not None:\n",
" geog='/{geog}'.format(geog=geography)\n",
" else:\n",
" geog=''\n",
"\n",
" if search is not None:\n",
" params['search']=search\n",
" \n",
" url='{nomis}{idx}/geography{geog}.def.sdmx.json{params}'.format(nomis=self.url,\n",
" idx=idx,geog=geog,\n",
" params=self._url_encode(params))\n",
" \n",
" return self._nomis_codes_parser(url)\n",
" \n",
" def nomis_codes_items(self,idx='NM_1_1',geography=None,sex=None):\n",
" sex=self._sex_map(idx,sex)\n",
" params={}\n",
"\n",
" if geography is not None:\n",
" params['geography']=geography\n",
" if sex is not None:\n",
" params['sex']=sex\n",
"\n",
" url='{nomis}{idx}/item.def.sdmx.json{params}'.format(nomis=self.url,\n",
" idx=idx,\n",
" params=self._url_encode(params))\n",
"\n",
" return self._nomis_codes_parser(url)\n",
"\n",
" #TO DO have a dataset_explain(idx) function that will print a description of a dataset,\n",
" #summarise what dimensions are available, and the value they can take,\n",
" #and provide a stub function usage example (with eligible parameters) to call it\n",
"\n",
" def _nomis_data_url(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):\n",
"\n",
" #TO DO\n",
" #Add an explain=True parameter that will print a natural language summary of what the command is calling\n",
" \n",
" \n",
" ###---Time/date info from nomis API docs---\n",
" #Useful time options:\n",
" ##\"latest\" - the latest available data for this dataset\n",
" ##\"previous\" - the date prior to \"latest\"\n",
" ##\"prevyear\" - the date one year prior to \"latest\"\n",
" ##\"first\" - the oldest available data for this dataset\n",
" ##Using the \"time\" concept you are limited to entering two dates, \n",
" ##a start and end. All dates between these are returned.\n",
" \n",
" #date is more flexible for ranges\n",
" ##With the \"date\" parameter you can specify relative dates, \n",
" ##so for example if you wanted the latest date, three months and six months prior to that\n",
" ##you could specify \"date=latest,latestMINUS3,latestMINUS6\". \n",
" ##You can use ranges with the \"date\" parameter, \n",
" ##e.g. if you wanted data for 12 months ago, together with all dates in the last six month\n",
" ##up to latest you could specify \"date=prevyear,latestMINUS5-latest\".\n",
" \n",
" ##To illustrate the difference between using \"date\" and \"time\";\n",
" ##if you specified \"time=first,latest\" in your URI you would get all dates from first to latest inclusive,\n",
" ##whereas with \"date=first,latest\" your output would contain only the first and latest dates.\n",
" \n",
" metadata=self.nomis_code_metadata(idx)\n",
" \n",
" #HELPERS\n",
" \n",
" #Find geography from postcode\n",
" if 'geography' not in kwargs and postcode is not None:\n",
" kwargs['geography']=self._get_geo_from_postcode(postcode, areacode)\n",
"\n",
" #Map natural language dimension values to corresponding codes\n",
" for dim in set( metadata.keys() ).intersection( kwargs.keys() ):\n",
" kwargs[dim]=self._dimension_mapper(idx,dim,kwargs[dim])\n",
" \n",
" #Set a default time period to be latest\n",
" if 'date' not in kwargs and 'time' not in kwargs:\n",
" kwargs['time']='latest'\n",
"\n",
" \n",
" #Set up a default projection for the returned columns\n",
" cols=['geography_code','geography_name','measures_name','measures','date_code','date_name','obs_value']\n",
"\n",
" for k in ['sex','age','item']:\n",
" if k in kwargs: cols.insert(len(cols)-1,'{}_name'.format(k))\n",
" \n",
" if 'select' not in kwargs:\n",
" kwargs['select']=','.join(cols)\n",
" \n",
" url='{nomis}{idx}.data.csv{params}'.format(nomis=self.url,\n",
" idx=idx,\n",
" params=self._url_encode(kwargs))\n",
" return url\n",
" \n",
" def _nomis_data(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):\n",
" url=self._nomis_data_url(idx,postcode, areacode, **kwargs)\n",
"\n",
" df=pd.read_csv(url)\n",
" df['_Code']=idx\n",
" return df"
]
},
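{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `_dimension_mapper()` method is what lets us write dimension values as labels rather than numeric codes. The cell below is a minimal standalone sketch of the substitution it performs, not the wrapper itself: `map_labels` is an illustrative name, and `dimmap` is a hypothetical hand-written lookup standing in for the codelist the wrapper fetches from the API (the codes 5, 6 and 7 are assumed from the `sex=5,6,7` filter used earlier).\n",
"\n",
"Substituting the longest labels first matters here: *female* contains *male*, so replacing *male* first would corrupt the longer label."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import re\n",
"\n",
"#Hypothetical label-to-code lookup; the wrapper builds the real one from the API codelist\n",
"dimmap={'Male':5, 'Female':6, 'Total':7}\n",
"\n",
"def map_labels(dims, dimmap):\n",
"    ''' dims is a string of comma separated labels and/or codes '''\n",
"    #Work through the labels from longest to shortest\n",
"    for label in sorted(dimmap, key=len, reverse=True):\n",
"        #Match the label literally, ignoring case\n",
"        pattern=re.compile(re.escape(label), re.IGNORECASE)\n",
"        dims=pattern.sub(str(dimmap[label]), str(dims))\n",
"    return dims\n",
"\n",
"#Labels are replaced by their codes; values that are already codes pass through unchanged\n",
"print(map_labels('male,female', dimmap))\n",
"print(map_labels('7', dimmap))"
]
},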
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Usage\n",
"\n",
"To start with, we create a `NOMIS_CONFIG()` object. This doesn't really contain anything to start with except for a bunch of methods...\n",
"\n",
"The first time we properly call on it, however, there is likely to be a delay as various bits get seeded into it..."
]
},
{
"cell_type": "code",
"execution_count": 459,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [],
"source": [
"nomis=NOMIS_CONFIG()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### dataset\\_lookup( *idx | [idx, ... ], describe = False* )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One of the first things we might want to do is to look up some basic information about a particular dataset, such as its name and a brief description of it. The first time we call this, the object grabs a list of *all* the datasets, so it may take some time."
]
},
{
"cell_type": "code",
"execution_count": 460,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>description</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency description idx \\\n",
"0 NOMIS JSA claimant count records the number of peopl... NM_1_1 \n",
"\n",
" name \n",
"0 claimant count with rates and proportions "
]
},
"execution_count": 460,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nm_1_1_info=nomis.dataset_lookup('NM_1_1')\n",
"nm_1_1_info"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many of the function calls take a dataset identifier as the first parameter (*idx*). Rather than break functions that donlt receive a dataset identifier, I have tried to put in a dummy value that returns an example result. It should be clear from the returned data which dataset it relates to (the dataset identifier value should be clearly visible in the response).\n",
"\n",
"One exception is in the `nomis.dataset_lookup()` function - if we don't query a particular dataset here, we see the whole listing."
]
},
{
"cell_type": "code",
"execution_count": 461,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>description</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> A quartery count of claimants who were claimin...</td>\n",
" <td> NM_2_1</td>\n",
" <td> claimant count - age and duration</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> A monthly count of job seekers allowance (JSA)...</td>\n",
" <td> NM_4_1</td>\n",
" <td> claimant count - age and duration</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> A midyear estimate of the workforce (the denom...</td>\n",
" <td> NM_5_1</td>\n",
" <td> claimant count denominators - historical workf...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> A quarterly count of job seekers allowance cl...</td>\n",
" <td> NM_6_1</td>\n",
" <td> claimant count - occupation</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency description idx \\\n",
"0 NOMIS JSA claimant count records the number of peopl... NM_1_1 \n",
"1 NOMIS A quartery count of claimants who were claimin... NM_2_1 \n",
"2 NOMIS A monthly count of job seekers allowance (JSA)... NM_4_1 \n",
"3 NOMIS A midyear estimate of the workforce (the denom... NM_5_1 \n",
"4 NOMIS A quarterly count of job seekers allowance cl... NM_6_1 \n",
"\n",
" name \n",
"0 claimant count with rates and proportions \n",
"1 claimant count - age and duration \n",
"2 claimant count - age and duration \n",
"3 claimant count denominators - historical workf... \n",
"4 claimant count - occupation "
]
},
"execution_count": 461,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.dataset_lookup().head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also view the a table summarising a single dataset, passed as a string, or several datasets, passed as a list, as in this example:"
]
},
{
"cell_type": "code",
"execution_count": 462,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>description</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> A quarterly count of job seekers allowance cl...</td>\n",
" <td> NM_7_1</td>\n",
" <td> claimant count - occupation, age and duration</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> A monthly count of job seekers allowance (JSA)...</td>\n",
" <td> NM_18_1</td>\n",
" <td> claimant count - age duration with proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> The midyear (30 June) estimates of population ...</td>\n",
" <td> NM_31_1</td>\n",
" <td> mid-year population estimates</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency description idx \\\n",
"0 NOMIS JSA claimant count records the number of peopl... NM_1_1 \n",
"1 NOMIS A quarterly count of job seekers allowance cl... NM_7_1 \n",
"2 NOMIS A monthly count of job seekers allowance (JSA)... NM_18_1 \n",
"3 NOMIS The midyear (30 June) estimates of population ... NM_31_1 \n",
"\n",
" name \n",
"0 claimant count with rates and proportions \n",
"1 claimant count - occupation, age and duration \n",
"2 claimant count - age duration with proportions \n",
"3 mid-year population estimates "
]
},
"execution_count": 462,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.dataset_lookup(['NM_1_1','NM_7_1','NM_18_1','NM_31_1'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, we can choose to print out the whole description by setting `describe=True`."
]
},
{
"cell_type": "code",
"execution_count": 463,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"NM_1_1 - claimant count with rates and proportions: JSA claimant count records the number of people claiming Jobseekers Allowance (JSA) and National Insurance credits at Jobcentre Plus local offices. This is not an official measure of unemployment, but is the only indicative statistic available for areas smaller than Local Authorities.\n",
"\n",
"NM_7_1 - claimant count - occupation, age and duration: A quarterly count of job seekers allowance claimants analysed by their sought and usual occupation, their age and the duration of their claim.\n",
"\n",
"NM_18_1 - claimant count - age duration with proportions: A monthly count of job seekers allowance (JSA) claimants broken down by age and duration of claim together with age based proportions. Totals exclude non-computerised clerical claims (approx. 1%). Available for Local Authorities.\n",
"\n",
"NM_31_1 - mid-year population estimates: The midyear (30 June) estimates of population are based on results from the latest Census of Population with allowance for under-enumeration. Available at Local Authority level and above.\n",
"\n"
]
}
],
"source": [
"nomis.dataset_lookup(['NM_1_1','NM_7_1','NM_18_1','NM_31_1'],describe=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### help\\_url( idx, *dimensions = False*)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you need help with what parameters to add to a URL:"
]
},
{
"cell_type": "code",
"execution_count": 464,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dataset NM_7_1 (claimant count - occupation, age and duration) supports the following dimensions: measures, sex, item, age_dur, freq, geography, occupation.\n"
]
}
],
"source": [
"nomis.help_url(idx='NM_7_1')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### dataset_lookup( *idx, dimensions = False* )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is actually a more comprehensive view of a datasets available that contains some metadata columns (*concept* and *dimension*) that describe what filter dimensions are available over the dataset."
]
},
{
"cell_type": "code",
"execution_count": 465,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>concept</th>\n",
" <th>description</th>\n",
" <th>dimension</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> GEOGRAPHY</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> SEX</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_SEX</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> ITEM</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_ITEM</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> MEASURES</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_MEASURES</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> FREQ</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_FREQ</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency concept description \\\n",
"0 NOMIS GEOGRAPHY JSA claimant count records the number of peopl... \n",
"1 NOMIS SEX JSA claimant count records the number of peopl... \n",
"2 NOMIS ITEM JSA claimant count records the number of peopl... \n",
"3 NOMIS MEASURES JSA claimant count records the number of peopl... \n",
"4 NOMIS FREQ JSA claimant count records the number of peopl... \n",
"\n",
" dimension idx name \n",
"0 CL_1_1_GEOGRAPHY NM_1_1 claimant count with rates and proportions \n",
"1 CL_1_1_SEX NM_1_1 claimant count with rates and proportions \n",
"2 CL_1_1_ITEM NM_1_1 claimant count with rates and proportions \n",
"3 CL_1_1_MEASURES NM_1_1 claimant count with rates and proportions \n",
"4 CL_1_1_FREQ NM_1_1 claimant count with rates and proportions "
]
},
"execution_count": 465,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.dataset_lookup('NM_1_1',dimensions=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### nomis_code_metadata( *idx, describe = 'all' | dimension | [dimension, ...]* )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can pull down a complete description of the levels available within each concept for a single selected dataset."
]
},
{
"cell_type": "code",
"execution_count": 466,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"{'core': agency concept description \\\n",
" 0 NOMIS GEOGRAPHY JSA claimant count records the number of peopl... \n",
" 1 NOMIS SEX JSA claimant count records the number of peopl... \n",
" 2 NOMIS ITEM JSA claimant count records the number of peopl... \n",
" 3 NOMIS MEASURES JSA claimant count records the number of peopl... \n",
" 4 NOMIS FREQ JSA claimant count records the number of peopl... \n",
" \n",
" dimension idx name \n",
" 0 CL_1_1_GEOGRAPHY NM_1_1 claimant count with rates and proportions \n",
" 1 CL_1_1_SEX NM_1_1 claimant count with rates and proportions \n",
" 2 CL_1_1_ITEM NM_1_1 claimant count with rates and proportions \n",
" 3 CL_1_1_MEASURES NM_1_1 claimant count with rates and proportions \n",
" 4 CL_1_1_FREQ NM_1_1 claimant count with rates and proportions ,\n",
" u'freq': agencyid dataset description dimension name \\\n",
" 0 NOMIS NM_1_1 Monthly CL_1_1_FREQ Frequency code list \n",
" 1 NOMIS NM_1_1 Quarterly CL_1_1_FREQ Frequency code list \n",
" 2 NOMIS NM_1_1 Half-yearly, semester CL_1_1_FREQ Frequency code list \n",
" 3 NOMIS NM_1_1 Annually CL_1_1_FREQ Frequency code list \n",
" \n",
" value \n",
" 0 M \n",
" 1 Q \n",
" 2 S \n",
" 3 A ,\n",
" u'geography': agencyid dataset description dimension name value\n",
" 0 NOMIS NM_1_1 United Kingdom CL_1_1_GEOGRAPHY geography 2092957697\n",
" 1 NOMIS NM_1_1 Great Britain CL_1_1_GEOGRAPHY geography 2092957698\n",
" 2 NOMIS NM_1_1 England CL_1_1_GEOGRAPHY geography 2092957699\n",
" 3 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700\n",
" 4 NOMIS NM_1_1 Scotland CL_1_1_GEOGRAPHY geography 2092957701\n",
" 5 NOMIS NM_1_1 Northern Ireland CL_1_1_GEOGRAPHY geography 2092957702\n",
" 6 NOMIS NM_1_1 England and Wales CL_1_1_GEOGRAPHY geography 2092957703,\n",
" u'item': agencyid dataset description dimension name value\n",
" 0 NOMIS NM_1_1 Total claimants CL_1_1_ITEM item 1\n",
" 1 NOMIS NM_1_1 Students on vacation CL_1_1_ITEM item 2\n",
" 2 NOMIS NM_1_1 Temporarily stopped CL_1_1_ITEM item 3\n",
" 3 NOMIS NM_1_1 Claimants under 18 years CL_1_1_ITEM item 4\n",
" 4 NOMIS NM_1_1 Married females CL_1_1_ITEM item 9,\n",
" u'measures': agencyid dataset description dimension name value\n",
" 0 NOMIS NM_1_1 claimants CL_1_1_MEASURES measures 20100\n",
" 1 NOMIS NM_1_1 workforce CL_1_1_MEASURES measures 20201\n",
" 2 NOMIS NM_1_1 active CL_1_1_MEASURES measures 20202\n",
" 3 NOMIS NM_1_1 residence CL_1_1_MEASURES measures 20203,\n",
" u'sex': agencyid dataset description dimension name value\n",
" 0 NOMIS NM_1_1 Male CL_1_1_SEX sex 5\n",
" 1 NOMIS NM_1_1 Female CL_1_1_SEX sex 6\n",
" 2 NOMIS NM_1_1 Total CL_1_1_SEX sex 7}"
]
},
"execution_count": 466,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p=nomis.nomis_code_metadata() # a default dataset id is used for demo purposes; pass e.g. idx='NM_7_1' to choose another\n",
"p"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's easy enough to view the levels for a particular dimension in its own table."
]
},
{
"cell_type": "code",
"execution_count": 467,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agencyid</th>\n",
" <th>dataset</th>\n",
" <th>description</th>\n",
" <th>dimension</th>\n",
" <th>name</th>\n",
" <th>value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> United Kingdom</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957697</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Great Britain</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957698</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> England</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957699</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Scotland</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957701</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Northern Ireland</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957702</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> England and Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957703</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agencyid dataset description dimension name value\n",
"0 NOMIS NM_1_1 United Kingdom CL_1_1_GEOGRAPHY geography 2092957697\n",
"1 NOMIS NM_1_1 Great Britain CL_1_1_GEOGRAPHY geography 2092957698\n",
"2 NOMIS NM_1_1 England CL_1_1_GEOGRAPHY geography 2092957699\n",
"3 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700\n",
"4 NOMIS NM_1_1 Scotland CL_1_1_GEOGRAPHY geography 2092957701\n",
"5 NOMIS NM_1_1 Northern Ireland CL_1_1_GEOGRAPHY geography 2092957702\n",
"6 NOMIS NM_1_1 England and Wales CL_1_1_GEOGRAPHY geography 2092957703"
]
},
"execution_count": 467,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p['geography']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also **describe** *one*, *several*, or *all* of the metadata elements associated with a dataset."
]
},
{
"cell_type": "code",
"execution_count": 468,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The following dimensions are available for NM_1_1 (claimant count with rates and proportions):\n",
"\n",
" - measures: claimants (20100), workforce (20201), active (20202), residence (20203)\n",
" - sex: Male (5), Female (6), Total (7)\n",
" - item: Total claimants (1), Students on vacation (2), Temporarily stopped (3), Claimants under 18 years (4), Married females (9)\n",
" - freq: Monthly (M), Quarterly (Q), Half-yearly, semester (S), Annually (A)\n",
" - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)\n"
]
}
],
"source": [
"nomis.nomis_code_metadata(describe='all')"
]
},
{
"cell_type": "code",
"execution_count": 469,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The following dimensions are available for NM_1_1 (claimant count with rates and proportions):\n",
"\n",
" - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)\n"
]
}
],
"source": [
"nomis.nomis_code_metadata(describe='geography')"
]
},
{
"cell_type": "code",
"execution_count": 470,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The following dimensions are available for NM_1_1 (claimant count with rates and proportions):\n",
"\n",
" - geography: United Kingdom (2092957697), Great Britain (2092957698), England (2092957699), Wales (2092957700), Scotland (2092957701), Northern Ireland (2092957702), England and Wales (2092957703)\n",
" - freq: Monthly (M), Quarterly (Q), Half-yearly, semester (S), Annually (A)\n"
]
}
],
"source": [
"nomis.nomis_code_metadata(describe=['geography','freq'])"
]
},
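{
"cell_type": "markdown",
"metadata": {},
"source": [
"The one-line summaries printed by `describe` can be approximated with a few lines of plain Python. This is a minimal sketch rather than the wrapper's actual implementation: `describe_dimension` is a hypothetical helper, and the level values are copied from the `sex` table for `NM_1_1` shown earlier.\n",
"\n",
"```python\n",
"def describe_dimension(name, levels):\n",
"    # levels: list of (description, code) pairs for a single dimension\n",
"    parts = ['{} ({})'.format(desc, code) for desc, code in levels]\n",
"    return ' - {}: {}'.format(name, ', '.join(parts))\n",
"\n",
"# Values copied from the 'sex' metadata table for NM_1_1\n",
"sex_levels = [('Male', 5), ('Female', 6), ('Total', 7)]\n",
"print(describe_dimension('sex', sex_levels))\n",
"```\n",
"\n",
"Keeping the levels as (description, code) pairs makes it straightforward to reuse the same data for look-ups in either direction."
]
},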
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also pull down example tables (or actual tables if we pass in a dataset id) for particular dimensions.\n",
"\n",
"For example:\n",
"\n",
"* `nomis.nomis_codes_age_dur()`\n",
"* `nomis.nomis_codes_occupation()`\n",
"* `nomis.nomis_codes_ethnicity()`\n",
"* `nomis.nomis_codes_geog()`\n",
"* `nomis.nomis_codes_items()`\n",
"* `nomis.nomis_codes_measures()`\n",
"* `nomis.nomis_codes_age()`\n",
"* `nomis.nomis_codes_sex()`\n",
"* `nomis.nomis_codes_duration()`\n",
"* `nomis.nomis_codes_time()`\n",
"* `nomis.nomis_codes_freq()`"
]
},
{
"cell_type": "code",
"execution_count": 471,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agencyid</th>\n",
" <th>dataset</th>\n",
" <th>description</th>\n",
" <th>dimension</th>\n",
" <th>name</th>\n",
" <th>value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> United Kingdom</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957697</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Great Britain</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957698</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> England</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957699</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Scotland</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957701</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Northern Ireland</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957702</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> England and Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957703</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agencyid dataset description dimension name value\n",
"0 NOMIS NM_1_1 United Kingdom CL_1_1_GEOGRAPHY geography 2092957697\n",
"1 NOMIS NM_1_1 Great Britain CL_1_1_GEOGRAPHY geography 2092957698\n",
"2 NOMIS NM_1_1 England CL_1_1_GEOGRAPHY geography 2092957699\n",
"3 NOMIS NM_1_1 Wales CL_1_1_GEOGRAPHY geography 2092957700\n",
"4 NOMIS NM_1_1 Scotland CL_1_1_GEOGRAPHY geography 2092957701\n",
"5 NOMIS NM_1_1 Northern Ireland CL_1_1_GEOGRAPHY geography 2092957702\n",
"6 NOMIS NM_1_1 England and Wales CL_1_1_GEOGRAPHY geography 2092957703"
]
},
"execution_count": 471,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.nomis_codes_geog()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `geography` element also allows us to identify the geographies contained within a particular geography."
]
},
{
"cell_type": "code",
"execution_count": 472,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agencyid</th>\n",
" <th>dataset</th>\n",
" <th>description</th>\n",
" <th>dimension</th>\n",
" <th>name</th>\n",
" <th>value</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> 1991 frozen wards within Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700TYPE1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> parliamentary constituencies 1983 revision wit...</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700TYPE8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> tecs / lecs as of 1989 within Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700TYPE18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> NM_1_1</td>\n",
" <td> 1981 frozen wards within Wales</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> geography</td>\n",
" <td> 2092957700TYPE33</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agencyid dataset description \\\n",
"0 NOMIS NM_1_1 Wales \n",
"1 NOMIS NM_1_1 1991 frozen wards within Wales \n",
"2 NOMIS NM_1_1 parliamentary constituencies 1983 revision wit... \n",
"3 NOMIS NM_1_1 tecs / lecs as of 1989 within Wales \n",
"4 NOMIS NM_1_1 1981 frozen wards within Wales \n",
"\n",
" dimension name value \n",
"0 CL_1_1_GEOGRAPHY geography 2092957700 \n",
"1 CL_1_1_GEOGRAPHY geography 2092957700TYPE1 \n",
"2 CL_1_1_GEOGRAPHY geography 2092957700TYPE8 \n",
"3 CL_1_1_GEOGRAPHY geography 2092957700TYPE18 \n",
"4 CL_1_1_GEOGRAPHY geography 2092957700TYPE33 "
]
},
"execution_count": 472,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.nomis_codes_geog(geography='2092957700').head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can force a search of the *nomis* API in a non-caching way by calling `nomis_codes_datasets()`. Calling it without an argument returns a table identifying all the datasets:"
]
},
{
"cell_type": "code",
"execution_count": 473,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>description</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> A quartery count of claimants who were claimin...</td>\n",
" <td> NM_2_1</td>\n",
" <td> claimant count - age and duration</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> A monthly count of job seekers allowance (JSA)...</td>\n",
" <td> NM_4_1</td>\n",
" <td> claimant count - age and duration</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> A midyear estimate of the workforce (the denom...</td>\n",
" <td> NM_5_1</td>\n",
" <td> claimant count denominators - historical workf...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> A quarterly count of job seekers allowance cl...</td>\n",
" <td> NM_6_1</td>\n",
" <td> claimant count - occupation</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency description idx \\\n",
"0 NOMIS JSA claimant count records the number of peopl... NM_1_1 \n",
"1 NOMIS A quartery count of claimants who were claimin... NM_2_1 \n",
"2 NOMIS A monthly count of job seekers allowance (JSA)... NM_4_1 \n",
"3 NOMIS A midyear estimate of the workforce (the denom... NM_5_1 \n",
"4 NOMIS A quarterly count of job seekers allowance cl... NM_6_1 \n",
"\n",
" name \n",
"0 claimant count with rates and proportions \n",
"1 claimant count - age and duration \n",
"2 claimant count - age and duration \n",
"3 claimant count denominators - historical workf... \n",
"4 claimant count - occupation "
]
},
"execution_count": 473,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"datasets=nomis.nomis_codes_datasets()\n",
"datasets.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also search the *nomis* API directly for keywords or keyphrases contained within the `name` and `description` columns."
]
},
{
"cell_type": "code",
"execution_count": 474,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>concept</th>\n",
" <th>description</th>\n",
" <th>dimension</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> GEOGRAPHY</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_GEOGRAPHY</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> SEX</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_SEX</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> ITEM</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_ITEM</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> MEASURES</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_MEASURES</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> FREQ</td>\n",
" <td> JSA claimant count records the number of peopl...</td>\n",
" <td> CL_1_1_FREQ</td>\n",
" <td> NM_1_1</td>\n",
" <td> claimant count with rates and proportions</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency concept description \\\n",
"0 NOMIS GEOGRAPHY JSA claimant count records the number of peopl... \n",
"1 NOMIS SEX JSA claimant count records the number of peopl... \n",
"2 NOMIS ITEM JSA claimant count records the number of peopl... \n",
"3 NOMIS MEASURES JSA claimant count records the number of peopl... \n",
"4 NOMIS FREQ JSA claimant count records the number of peopl... \n",
"\n",
" dimension idx name \n",
"0 CL_1_1_GEOGRAPHY NM_1_1 claimant count with rates and proportions \n",
"1 CL_1_1_SEX NM_1_1 claimant count with rates and proportions \n",
"2 CL_1_1_ITEM NM_1_1 claimant count with rates and proportions \n",
"3 CL_1_1_MEASURES NM_1_1 claimant count with rates and proportions \n",
"4 CL_1_1_FREQ NM_1_1 claimant count with rates and proportions "
]
},
"execution_count": 474,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"datasets_dim=nomis.nomis_codes_datasets(search='claimant count with rates and proportions',dimensions=True)\n",
"datasets_dim.head(5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Wildcard operators can also be used in the search:"
]
},
{
"cell_type": "code",
"execution_count": 475,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>agency</th>\n",
" <th>description</th>\n",
" <th>idx</th>\n",
" <th>name</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> NOMIS</td>\n",
" <td> The seasonally adjusted series takes into acco...</td>\n",
" <td> NM_11_1</td>\n",
" <td> claimant count - seasonally adjusted</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> NOMIS</td>\n",
" <td> an analysis of seasonally adjusted jobcentre i...</td>\n",
" <td> NM_19_1</td>\n",
" <td> vacancies - seasonally adjusted series</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> NOMIS</td>\n",
" <td> this data set provides quarterly estimates of ...</td>\n",
" <td> NM_26_1</td>\n",
" <td> employee job estimates - seasonally adjusted</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> NOMIS</td>\n",
" <td> </td>\n",
" <td> NM_39_1</td>\n",
" <td> claimant flows - seasonally adjusted</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> NOMIS</td>\n",
" <td> The labour force survey (LFS) is a quarterly s...</td>\n",
" <td> NM_87_1</td>\n",
" <td> labour force survey - quarterly: four quarter ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> NOMIS</td>\n",
" <td> This dataset provides quarterly estimates of w...</td>\n",
" <td> NM_130_1</td>\n",
" <td> workforce jobs by industry (SIC 2007) - season...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> NOMIS</td>\n",
" <td> This dataset provides quarterly estimates of w...</td>\n",
" <td> NM_131_1</td>\n",
" <td> workforce jobs by industry (SIC 2007) and sex ...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" agency description idx \\\n",
"0 NOMIS The seasonally adjusted series takes into acco... NM_11_1 \n",
"1 NOMIS an analysis of seasonally adjusted jobcentre i... NM_19_1 \n",
"2 NOMIS this data set provides quarterly estimates of ... NM_26_1 \n",
"3 NOMIS NM_39_1 \n",
"4 NOMIS The labour force survey (LFS) is a quarterly s... NM_87_1 \n",
"5 NOMIS This dataset provides quarterly estimates of w... NM_130_1 \n",
"6 NOMIS This dataset provides quarterly estimates of w... NM_131_1 \n",
"\n",
" name \n",
"0 claimant count - seasonally adjusted \n",
"1 vacancies - seasonally adjusted series \n",
"2 employee job estimates - seasonally adjusted \n",
"3 claimant flows - seasonally adjusted \n",
"4 labour force survey - quarterly: four quarter ... \n",
"5 workforce jobs by industry (SIC 2007) - season... \n",
"6 workforce jobs by industry (SIC 2007) and sex ... "
]
},
"execution_count": 475,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.nomis_codes_datasets(search='*seasonally adjusted*')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that at some point `nomis_codes_datasets()` is likely to be pushed further inside the class, and `dataset_lookup` will become the preferred way of inspecting this information."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Grabbing Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As well as grabbing information and metadata about datasets from the *nomis* API, we can also retrieve items from those datasets."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### \\_nomis\\_data( _idx, postcode, areacode, \\**kwargs_ )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can retrieve data from a dataset using `nomis._nomis_data(idx, ...)`. The dimension arguments that are available depend on the dataset selected (they can be identified using `nomis.help_url(idx)`, for example)."
]
},
{
"cell_type": "code",
"execution_count": 476,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOGRAPHY_CODE</th>\n",
" <th>GEOGRAPHY_NAME</th>\n",
" <th>MEASURES_NAME</th>\n",
" <th>MEASURES</th>\n",
" <th>DATE_CODE</th>\n",
" <th>DATE_NAME</th>\n",
" <th>SEX_NAME</th>\n",
" <th>ITEM_NAME</th>\n",
" <th>OBS_VALUE</th>\n",
" <th>_Code</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Male</td>\n",
" <td> Total claimants</td>\n",
" <td> 1386</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Female</td>\n",
" <td> Total claimants</td>\n",
" <td> 686</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Total</td>\n",
" <td> Total claimants</td>\n",
" <td> 2072</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE \\\n",
"0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"2 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"\n",
" DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code \n",
"0 January 2015 Male Total claimants 1386 NM_1_1 \n",
"1 January 2015 Female Total claimants 686 NM_1_1 \n",
"2 January 2015 Total Total claimants 2072 NM_1_1 "
]
},
"execution_count": 476,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"testdata=nomis._nomis_data(geography='2038431803',sex='5,6,7',item=1,measures=20100)\n",
"testdata.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A postcode helper is available for identifying a geography from a postcode. By default, the postcode is resolved to its district-level geography. (See the code for other alternatives.)"
]
},
{
"cell_type": "code",
"execution_count": 477,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOGRAPHY_CODE</th>\n",
" <th>GEOGRAPHY_NAME</th>\n",
" <th>MEASURES_NAME</th>\n",
" <th>MEASURES</th>\n",
" <th>DATE_CODE</th>\n",
" <th>DATE_NAME</th>\n",
" <th>SEX_NAME</th>\n",
" <th>ITEM_NAME</th>\n",
" <th>OBS_VALUE</th>\n",
" <th>_Code</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> E06000042</td>\n",
" <td> Milton Keynes</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Male</td>\n",
" <td> Total claimants</td>\n",
" <td> 1773</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> E06000042</td>\n",
" <td> Milton Keynes</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Female</td>\n",
" <td> Total claimants</td>\n",
" <td> 1044</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> E06000042</td>\n",
" <td> Milton Keynes</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Total</td>\n",
" <td> Total claimants</td>\n",
" <td> 2817</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE \\\n",
"0 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 \n",
"1 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 \n",
"2 E06000042 Milton Keynes Persons claiming JSA 20100 2015-01 \n",
"\n",
" DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code \n",
"0 January 2015 Male Total claimants 1773 NM_1_1 \n",
"1 January 2015 Female Total claimants 1044 NM_1_1 \n",
"2 January 2015 Total Total claimants 2817 NM_1_1 "
]
},
"execution_count": 477,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis._nomis_data(postcode='mk7 6AA',sex='5,6,7',item=1,measures=20100).head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As well as passing dimensions and their associated values into the _`**kwargs`_, we can also pass in a `select` parameter that identifies which columns to return from the *nomis* API.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### \\_nomis\\_data\\_url( *idx, \\**kwargs* )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can inspect the URL that gets generated from a particular set of parameters."
]
},
{
"cell_type": "code",
"execution_count": 478,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'https://www.nomisweb.co.uk/api/v01/dataset/NM_31_1.data.csv?measures=20100&time=latest&select=geography_code%2Cgeography_name%2Cmeasures_name%2Cmeasures%2Cdate_code%2Cdate_name%2Cobs_value&geography=1946157281'"
]
},
"execution_count": 478,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis._nomis_data_url(idx='NM_31_1',geography='1946157281',measures=20100)"
]
},
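{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of how such a URL might be assembled (this is *not* the wrapper's own code; `build_data_url`, `BASE` and `SELECT` are names assumed for illustration), the URL shown above can be reproduced with the standard library. Note how the comma-separated `select` list ends up percent-encoded as `%2C`:\n",
"\n",
"```python\n",
"try:\n",
"    from urllib.parse import urlencode  # Python 3\n",
"except ImportError:\n",
"    from urllib import urlencode  # Python 2\n",
"\n",
"BASE = 'https://www.nomisweb.co.uk/api/v01/dataset/'\n",
"\n",
"# Column selection copied from the URL shown above\n",
"SELECT = ['geography_code', 'geography_name', 'measures_name',\n",
"          'measures', 'date_code', 'date_name', 'obs_value']\n",
"\n",
"def build_data_url(idx, params):\n",
"    # params: list of (name, value) pairs, so the parameter order is kept\n",
"    return BASE + idx + '.data.csv?' + urlencode(params)\n",
"\n",
"url = build_data_url('NM_31_1', [\n",
"    ('measures', 20100),\n",
"    ('time', 'latest'),\n",
"    ('select', ','.join(SELECT)),\n",
"    ('geography', '1946157281'),\n",
"])\n",
"print(url)\n",
"```\n",
"\n",
"Passing the parameters as a list of pairs rather than a dict keeps their order stable across Python versions, which makes generated URLs easier to compare."
]
},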
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Automatic Conversion of Dimension Parameter Values to Dimension Parameter Codes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A helper function is provided to convert dimension values to dimension codes."
]
},
{
"cell_type": "code",
"execution_count": 479,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOGRAPHY_CODE</th>\n",
" <th>GEOGRAPHY_NAME</th>\n",
" <th>MEASURES_NAME</th>\n",
" <th>MEASURES</th>\n",
" <th>DATE_CODE</th>\n",
" <th>DATE_NAME</th>\n",
" <th>SEX_NAME</th>\n",
" <th>ITEM_NAME</th>\n",
" <th>OBS_VALUE</th>\n",
" <th>_Code</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Male</td>\n",
" <td> Total claimants</td>\n",
" <td> 1386</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Female</td>\n",
" <td> Total claimants</td>\n",
" <td> 686</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE \\\n",
"0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"\n",
" DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code \n",
"0 January 2015 Male Total claimants 1386 NM_1_1 \n",
"1 January 2015 Female Total claimants 686 NM_1_1 "
]
},
"execution_count": 479,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"testdata=nomis._nomis_data(geography='2038431803',sex='5,Female',item=1,measures=20100)\n",
"testdata.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
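"Conversions are based on case-insensitive look-ups of dimension descriptions in the metadata for the dataset, so a name such as `Total Claimants` can be passed in place of the numeric item code."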
]
},
{
"cell_type": "code",
"execution_count": 480,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GEOGRAPHY_CODE</th>\n",
" <th>GEOGRAPHY_NAME</th>\n",
" <th>MEASURES_NAME</th>\n",
" <th>MEASURES</th>\n",
" <th>DATE_CODE</th>\n",
" <th>DATE_NAME</th>\n",
" <th>SEX_NAME</th>\n",
" <th>ITEM_NAME</th>\n",
" <th>OBS_VALUE</th>\n",
" <th>_Code</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Male</td>\n",
" <td> Total claimants</td>\n",
" <td> 1386</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> E06000046</td>\n",
" <td> Isle of Wight</td>\n",
" <td> Persons claiming JSA</td>\n",
" <td> 20100</td>\n",
" <td> 2015-01</td>\n",
" <td> January 2015</td>\n",
" <td> Female</td>\n",
" <td> Total claimants</td>\n",
" <td> 686</td>\n",
" <td> NM_1_1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GEOGRAPHY_CODE GEOGRAPHY_NAME MEASURES_NAME MEASURES DATE_CODE \\\n",
"0 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"1 E06000046 Isle of Wight Persons claiming JSA 20100 2015-01 \n",
"\n",
" DATE_NAME SEX_NAME ITEM_NAME OBS_VALUE _Code \n",
"0 January 2015 Male Total claimants 1386 NM_1_1 \n",
"1 January 2015 Female Total claimants 686 NM_1_1 "
]
},
"execution_count": 480,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"testdata=nomis._nomis_data(geography='2038431803',sex='5,Female',item='Total Claimants',measures=20100)\n",
"testdata.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can identify the dimension values or codes by using the `nomis.nomis_code_metadata(idx, describe='all'|DIMENSION|DIMENSIONLIST)` lookup."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Help With Finding Geography Codes\n",
"\n",
"The following tools are provided in addition to the automatic discovery of a geography code from a postcode lookup."
]
},
{
"cell_type": "code",
"execution_count": 481,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> United Kingdom</td>\n",
" <td> 2092957697</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Great Britain</td>\n",
" <td> 2092957698</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> England</td>\n",
" <td> 2092957699</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Wales</td>\n",
" <td> 2092957700</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> Scotland</td>\n",
" <td> 2092957701</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> Northern Ireland</td>\n",
" <td> 2092957702</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> England and Wales</td>\n",
" <td> 2092957703</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 United Kingdom 2092957697\n",
"1 Great Britain 2092957698\n",
"2 England 2092957699\n",
"3 Wales 2092957700\n",
"4 Scotland 2092957701\n",
"5 Northern Ireland 2092957702\n",
"6 England and Wales 2092957703"
]
},
"execution_count": 481,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code()"
]
},
{
"cell_type": "code",
"execution_count": 482,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> England</td>\n",
" <td> 2092957699</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Scotland</td>\n",
" <td> 2092957701</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> Northern Ireland</td>\n",
" <td> 2092957702</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> England and Wales</td>\n",
" <td> 2092957703</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 England 2092957699\n",
"1 Scotland 2092957701\n",
"2 Northern Ireland 2092957702\n",
"3 England and Wales 2092957703"
]
},
"execution_count": 482,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(search='land')"
]
},
{
"cell_type": "code",
"execution_count": 483,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Isle of Wight</td>\n",
" <td> 1946157281</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> 2011 census frozen wards within Isle of Wight</td>\n",
" <td> 1946157281TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157281TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> 2011 super output areas - lower layer within I...</td>\n",
" <td> 1946157281TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> super output areas - lower layer within Isle o...</td>\n",
" <td> 1946157281TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> super output areas - middle layer within Isle ...</td>\n",
" <td> 1946157281TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> 2003 CAS wards within Isle of Wight</td>\n",
" <td> 1946157281TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td> 2009 statistical wards within Isle of Wight</td>\n",
" <td> 1946157281TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td> 2013 electoral ward within Isle of Wight</td>\n",
" <td> 1946157281TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157281TYPE486</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 Isle of Wight 1946157281\n",
"1 2011 census frozen wards within Isle of Wight 1946157281TYPE236\n",
"2 2011 super output areas - middle layer within ... 1946157281TYPE297\n",
"3 2011 super output areas - lower layer within I... 1946157281TYPE298\n",
"4 super output areas - lower layer within Isle o... 1946157281TYPE304\n",
"5 super output areas - middle layer within Isle ... 1946157281TYPE305\n",
"6 2003 CAS wards within Isle of Wight 1946157281TYPE312\n",
"7 2009 statistical wards within Isle of Wight 1946157281TYPE337\n",
"8 2013 electoral ward within Isle of Wight 1946157281TYPE401\n",
"9 pre-2009 local authorities: district / unitary... 1946157281TYPE486"
]
},
"execution_count": 483,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(value='1946157281')"
]
},
{
"cell_type": "code",
"execution_count": 484,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Isle of Wight</td>\n",
" <td> 2038431803</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> 1991 frozen wards within Isle of Wight</td>\n",
" <td> 2038431803TYPE1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> 1981 frozen wards within Isle of Wight</td>\n",
" <td> 2038431803TYPE33</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> super output areas - lower layer within Isle o...</td>\n",
" <td> 2038431803TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> super output areas - middle layer within Isle ...</td>\n",
" <td> 2038431803TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> 2003 CAS wards within Isle of Wight</td>\n",
" <td> 2038431803TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td> local authorities: district / unitary within I...</td>\n",
" <td> 2038431803TYPE464</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 Isle of Wight 2038431803\n",
"1 1991 frozen wards within Isle of Wight 2038431803TYPE1\n",
"2 1981 frozen wards within Isle of Wight 2038431803TYPE33\n",
"3 super output areas - lower layer within Isle o... 2038431803TYPE304\n",
"4 super output areas - middle layer within Isle ... 2038431803TYPE305\n",
"5 2003 CAS wards within Isle of Wight 2038431803TYPE312\n",
"6 local authorities: district / unitary within I... 2038431803TYPE464"
]
},
"execution_count": 484,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(value='2038431803')"
]
},
{
"cell_type": "code",
"execution_count": 485,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> parliamentary constituencies 1983 revision wit...</td>\n",
" <td> 2092957700TYPE8</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> parliamentary constituencies 1983 revision wit...</td>\n",
" <td> 2092957700TYPE45</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> parliamentary constituencies 2010 within Wales</td>\n",
" <td> 2092957700TYPE460</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> national assembly for wales constituencies wit...</td>\n",
" <td> 2092957700TYPE466</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> parliamentary constituencies 2005 revision wit...</td>\n",
" <td> 2092957700TYPE468</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td> parliamentary constituencies 1995 revision wit...</td>\n",
" <td> 2092957700TYPE484</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 parliamentary constituencies 1983 revision wit... 2092957700TYPE8\n",
"1 parliamentary constituencies 1983 revision wit... 2092957700TYPE45\n",
"2 parliamentary constituencies 2010 within Wales 2092957700TYPE460\n",
"3 national assembly for wales constituencies wit... 2092957700TYPE466\n",
"4 parliamentary constituencies 2005 revision wit... 2092957700TYPE468\n",
"5 parliamentary constituencies 1995 revision wit... 2092957700TYPE484"
]
},
"execution_count": 485,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(desc='Wales',search='const')"
]
},
{
"cell_type": "code",
"execution_count": 486,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> East Northamptonshire</td>\n",
" <td> 1946157157</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td> Northampton</td>\n",
" <td> 1946157159</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td> South Northamptonshire</td>\n",
" <td> 1946157160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td> Wolverhampton</td>\n",
" <td> 1946157192</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td> Southampton</td>\n",
" <td> 1946157287</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 East Northamptonshire 1946157157\n",
"1 Northampton 1946157159\n",
"2 South Northamptonshire 1946157160\n",
"3 Wolverhampton 1946157192\n",
"4 Southampton 1946157287"
]
},
"execution_count": 486,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(helper='LA_district',search='hampton')"
]
},
{
"cell_type": "code",
"execution_count": 487,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td> Isle of Wight</td>\n",
" <td> 1946157281</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 Isle of Wight 1946157281"
]
},
"execution_count": 487,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(helper='LA_district',search='Isle of Wight')"
]
},
{
"cell_type": "code",
"execution_count": 488,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>description</th>\n",
" <th>geog</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0 </th>\n",
" <td> East Northamptonshire</td>\n",
" <td> 1946157157</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1 </th>\n",
" <td> 2011 census frozen wards within East Northampt...</td>\n",
" <td> 1946157157TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2 </th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157157TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3 </th>\n",
" <td> 2011 super output areas - lower layer within E...</td>\n",
" <td> 1946157157TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4 </th>\n",
" <td> super output areas - lower layer within East N...</td>\n",
" <td> 1946157157TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5 </th>\n",
" <td> super output areas - middle layer within East ...</td>\n",
" <td> 1946157157TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6 </th>\n",
" <td> 2003 CAS wards within East Northamptonshire</td>\n",
" <td> 1946157157TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7 </th>\n",
" <td> 2009 statistical wards within East Northampton...</td>\n",
" <td> 1946157157TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8 </th>\n",
" <td> 2013 electoral ward within East Northamptonshire</td>\n",
" <td> 1946157157TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9 </th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157157TYPE486</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td> Northampton</td>\n",
" <td> 1946157159</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td> 2011 census frozen wards within Northampton</td>\n",
" <td> 1946157159TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157159TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td> 2011 super output areas - lower layer within N...</td>\n",
" <td> 1946157159TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td> super output areas - lower layer within Northa...</td>\n",
" <td> 1946157159TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td> super output areas - middle layer within North...</td>\n",
" <td> 1946157159TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td> 2003 CAS wards within Northampton</td>\n",
" <td> 1946157159TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td> 2009 statistical wards within Northampton</td>\n",
" <td> 1946157159TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td> 2013 electoral ward within Northampton</td>\n",
" <td> 1946157159TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>19</th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157159TYPE486</td>\n",
" </tr>\n",
" <tr>\n",
" <th>20</th>\n",
" <td> South Northamptonshire</td>\n",
" <td> 1946157160</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td> 2011 census frozen wards within South Northamp...</td>\n",
" <td> 1946157160TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>22</th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157160TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>23</th>\n",
" <td> 2011 super output areas - lower layer within S...</td>\n",
" <td> 1946157160TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>24</th>\n",
" <td> super output areas - lower layer within South ...</td>\n",
" <td> 1946157160TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>25</th>\n",
" <td> super output areas - middle layer within South...</td>\n",
" <td> 1946157160TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26</th>\n",
" <td> 2003 CAS wards within South Northamptonshire</td>\n",
" <td> 1946157160TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27</th>\n",
" <td> 2009 statistical wards within South Northampto...</td>\n",
" <td> 1946157160TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28</th>\n",
" <td> 2013 electoral ward within South Northamptonshire</td>\n",
" <td> 1946157160TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29</th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157160TYPE486</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30</th>\n",
" <td> Wolverhampton</td>\n",
" <td> 1946157192</td>\n",
" </tr>\n",
" <tr>\n",
" <th>31</th>\n",
" <td> 2011 census frozen wards within Wolverhampton</td>\n",
" <td> 1946157192TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>32</th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157192TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>33</th>\n",
" <td> 2011 super output areas - lower layer within W...</td>\n",
" <td> 1946157192TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>34</th>\n",
" <td> super output areas - lower layer within Wolver...</td>\n",
" <td> 1946157192TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>35</th>\n",
" <td> super output areas - middle layer within Wolve...</td>\n",
" <td> 1946157192TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36</th>\n",
" <td> 2003 CAS wards within Wolverhampton</td>\n",
" <td> 1946157192TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>37</th>\n",
" <td> 2009 statistical wards within Wolverhampton</td>\n",
" <td> 1946157192TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>38</th>\n",
" <td> 2013 electoral ward within Wolverhampton</td>\n",
" <td> 1946157192TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>39</th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157192TYPE486</td>\n",
" </tr>\n",
" <tr>\n",
" <th>40</th>\n",
" <td> Southampton</td>\n",
" <td> 1946157287</td>\n",
" </tr>\n",
" <tr>\n",
" <th>41</th>\n",
" <td> 2011 census frozen wards within Southampton</td>\n",
" <td> 1946157287TYPE236</td>\n",
" </tr>\n",
" <tr>\n",
" <th>42</th>\n",
" <td> 2011 super output areas - middle layer within ...</td>\n",
" <td> 1946157287TYPE297</td>\n",
" </tr>\n",
" <tr>\n",
" <th>43</th>\n",
" <td> 2011 super output areas - lower layer within S...</td>\n",
" <td> 1946157287TYPE298</td>\n",
" </tr>\n",
" <tr>\n",
" <th>44</th>\n",
" <td> super output areas - lower layer within Southa...</td>\n",
" <td> 1946157287TYPE304</td>\n",
" </tr>\n",
" <tr>\n",
" <th>45</th>\n",
" <td> super output areas - middle layer within South...</td>\n",
" <td> 1946157287TYPE305</td>\n",
" </tr>\n",
" <tr>\n",
" <th>46</th>\n",
" <td> 2003 CAS wards within Southampton</td>\n",
" <td> 1946157287TYPE312</td>\n",
" </tr>\n",
" <tr>\n",
" <th>47</th>\n",
" <td> 2009 statistical wards within Southampton</td>\n",
" <td> 1946157287TYPE337</td>\n",
" </tr>\n",
" <tr>\n",
" <th>48</th>\n",
" <td> 2013 electoral ward within Southampton</td>\n",
" <td> 1946157287TYPE401</td>\n",
" </tr>\n",
" <tr>\n",
" <th>49</th>\n",
" <td> pre-2009 local authorities: district / unitary...</td>\n",
" <td> 1946157287TYPE486</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" description geog\n",
"0 East Northamptonshire 1946157157\n",
"1 2011 census frozen wards within East Northampt... 1946157157TYPE236\n",
"2 2011 super output areas - middle layer within ... 1946157157TYPE297\n",
"3 2011 super output areas - lower layer within E... 1946157157TYPE298\n",
"4 super output areas - lower layer within East N... 1946157157TYPE304\n",
"5 super output areas - middle layer within East ... 1946157157TYPE305\n",
"6 2003 CAS wards within East Northamptonshire 1946157157TYPE312\n",
"7 2009 statistical wards within East Northampton... 1946157157TYPE337\n",
"8 2013 electoral ward within East Northamptonshire 1946157157TYPE401\n",
"9 pre-2009 local authorities: district / unitary... 1946157157TYPE486\n",
"10 Northampton 1946157159\n",
"11 2011 census frozen wards within Northampton 1946157159TYPE236\n",
"12 2011 super output areas - middle layer within ... 1946157159TYPE297\n",
"13 2011 super output areas - lower layer within N... 1946157159TYPE298\n",
"14 super output areas - lower layer within Northa... 1946157159TYPE304\n",
"15 super output areas - middle layer within North... 1946157159TYPE305\n",
"16 2003 CAS wards within Northampton 1946157159TYPE312\n",
"17 2009 statistical wards within Northampton 1946157159TYPE337\n",
"18 2013 electoral ward within Northampton 1946157159TYPE401\n",
"19 pre-2009 local authorities: district / unitary... 1946157159TYPE486\n",
"20 South Northamptonshire 1946157160\n",
"21 2011 census frozen wards within South Northamp... 1946157160TYPE236\n",
"22 2011 super output areas - middle layer within ... 1946157160TYPE297\n",
"23 2011 super output areas - lower layer within S... 1946157160TYPE298\n",
"24 super output areas - lower layer within South ... 1946157160TYPE304\n",
"25 super output areas - middle layer within South... 1946157160TYPE305\n",
"26 2003 CAS wards within South Northamptonshire 1946157160TYPE312\n",
"27 2009 statistical wards within South Northampto... 1946157160TYPE337\n",
"28 2013 electoral ward within South Northamptonshire 1946157160TYPE401\n",
"29 pre-2009 local authorities: district / unitary... 1946157160TYPE486\n",
"30 Wolverhampton 1946157192\n",
"31 2011 census frozen wards within Wolverhampton 1946157192TYPE236\n",
"32 2011 super output areas - middle layer within ... 1946157192TYPE297\n",
"33 2011 super output areas - lower layer within W... 1946157192TYPE298\n",
"34 super output areas - lower layer within Wolver... 1946157192TYPE304\n",
"35 super output areas - middle layer within Wolve... 1946157192TYPE305\n",
"36 2003 CAS wards within Wolverhampton 1946157192TYPE312\n",
"37 2009 statistical wards within Wolverhampton 1946157192TYPE337\n",
"38 2013 electoral ward within Wolverhampton 1946157192TYPE401\n",
"39 pre-2009 local authorities: district / unitary... 1946157192TYPE486\n",
"40 Southampton 1946157287\n",
"41 2011 census frozen wards within Southampton 1946157287TYPE236\n",
"42 2011 super output areas - middle layer within ... 1946157287TYPE297\n",
"43 2011 super output areas - lower layer within S... 1946157287TYPE298\n",
"44 super output areas - lower layer within Southa... 1946157287TYPE304\n",
"45 super output areas - middle layer within South... 1946157287TYPE305\n",
"46 2003 CAS wards within Southampton 1946157287TYPE312\n",
"47 2009 statistical wards within Southampton 1946157287TYPE337\n",
"48 2013 electoral ward within Southampton 1946157287TYPE401\n",
"49 pre-2009 local authorities: district / unitary... 1946157287TYPE486"
]
},
"execution_count": 488,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"nomis.get_geo_code(helper='LA_district',search='hampton',chase=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Further Scrappy Notes..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exploring the Isle of Wight JSA Figures\n",
"\n",
"Explore a local authority profile:\n",
"https://www.nomisweb.co.uk/reports/lmp/la/1946157281/report.aspx"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"activity": false,
"collapsed": false
},
"outputs": [],
"source": [
"nomis.get_geo_code(value='1946157281')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Steps\n",
"\n",
"Look up local authority profile: `nomis.get_geo_code(helper='LA_district',search='Isle of Wight')`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Need to identify useful datasets. For example, JSA by age and duration, with proportions, is NM_18_1:\n",
"\n",
"`http://www.nomisweb.co.uk/api/v01/dataset/NM_18_1.data.csv?geography=1946157281,2013265928,2092957698&date=latest&age=0&duration=0&sex=7&measures=20100,20206&select=date_name,geography_name,geography_code,sex_name,age_name,duration_name,measures_name,obs_value,obs_status_name`\n",
"\n",
"??NM_7_1"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
import pandas as pd
import urllib
import re
class NOMIS_CONFIG:
    #TO DO implement cache to cache list of datasets and dimensions associated with datasets (except time/date?)
    def __init__(self):
        NOMIS_STUB='https://www.nomisweb.co.uk/api/v01/dataset/'
        self.url=NOMIS_STUB
        self.codes=None
        self.metadata={}
    def _url_encode(self,params=None):
        if params is not None and params!='' and params != {}:
            #params='?{}'.format( '&'.join( ['{}={}'.format(p,params[p]) for p in params] ) )
            params='?{}'.format(urllib.urlencode(params))
        else:
            params=''
        return params
    def _describe_dataset(self,df):
        for row in df.iterrows():
            dfr=row[1]
            print('{idx} - {name}: {description}\n'.format(idx=dfr['idx'],
                                                          name=dfr['name'],
                                                          description=dfr['description']) )
    def _describe_metadata(self,idx,df,keys,pretty=True):
        if not pretty:
            for key in keys:
                print( '---- {} ----'.format(key) )
                for row in df[key].iterrows():
                    dfr=row[1]
                    print('{dimension} - {description}: {value}'.format(dimension=dfr['dimension'],
                                                                        description=dfr['description'],
                                                                        value=dfr['value']) )
        else:
            print('The following dimensions are available for {idx} ({name}):\n'.format(
                idx=idx,
                name=self.dataset_lookup_property(idx,'name')))
            for key in keys:
                items =['{} ({})'.format(row[1]['description'],row[1]['value']) for row in df[key].iterrows()]
                print( ' - {key}: {items}'.format(key=key,items=', '.join(items)) )
    def help_url(self,idx='NM_7_1'):
        metadata=self.nomis_code_metadata(idx)
        keys=metadata.keys()
        keys.remove('core')
        print('Dataset {idx} ({name}) supports the following dimensions: {dims}.'.format(
            idx=idx,
            dims=', '.join(keys),
            name=self.dataset_lookup_property(idx,'name')))
    def dataset_lookup_property(self,idx=None,prop=None):
        if idx is None or prop is None: return ''
        df=self.dataset_lookup(idx)
        if prop in df.columns: return str(df[prop][0])
        else: return ''
    def dataset_lookup(self,idx=None,dimensions=False,describe=False):
        ##dimensions used in sense of do we display them or not
        if self.codes is None:
            self.codes=self.nomis_codes_datasets(dimensions=True)
        if idx is not None:
            #Test if idx is a list or single string
            if isinstance(idx, str): idx=[idx]
            df=self.codes[self.codes['idx'].isin(idx)]
        else:
            df=self.codes[:]
        cols=df.columns.tolist()
        if not dimensions:
            for col in ['dimension','concept']:
                cols.remove(col)
        df=df[cols].drop_duplicates().reset_index(drop=True)
        if describe: self._describe_dataset(df)
        else: return df
    def _get_geo_from_postcode(self, postcode, areacode=None):
        #Set a default
        if areacode is None:
            areacode='district'
        codemap={ 'district':486 }
        if areacode in codemap:
            areacode=codemap[areacode]
        return 'POSTCODE|{postcode};{code}'.format(postcode=postcode,code=areacode)
    def _dimension_mapper(self,idx,dim,dims):
        ''' dims is a string of comma separated values for a particular dimension '''
        if dim is not None:
            sc=self._nomis_codes_dimension_grab(dim,idx,params=None)
            dimmap=dict(zip(sc['description'].astype(str),sc['value']))
            keys=dimmap.keys()
            keys.sort(key=len, reverse=True)
            for s in keys:
                pattern = re.compile(s, re.IGNORECASE)
                dims=pattern.sub(str(dimmap[s]), str(dims))
        return dims
    def _sex_map(self,idx,sex):
        return self._dimension_mapper(idx,'sex',sex)
    def _get_geo_code_helper(self,helper):
        value=None
        desc=None
        #I am baking values in, but maybe they should be searched for and retrieved that way?
        if helper=='UK_WPC_2010':
            #UK Westminster Parliamentary Constituency
            value='2092957697TYPE460'
        elif helper=='LA_district':
            value='2092957697TYPE464'
        return value,desc
    def get_geo_code(self,value=None,desc=None, search=None, helper=None, chase=False):
        #The semantics of this are quite tricky
        #value is a code for a geography, the thing searched within
        #desc identifies a description within a geography - on a match it takes you to this lower geography
        #search is term to search (free text search) with the descriptions of areas returned
        #helper is in place for shortcuts
        #Given a local authority code, eg 1946157281, a report can be previewed at:
        ##https://www.nomisweb.co.uk/reports/lmp/la/1946157281/report.aspx
        #default
        if helper is not None:
            value,desc=self._get_geo_code_helper(helper)
        if chase:
            chaser= self.nomis_codes_geog(geography=value)
            if search is not None:
                chasecands=chaser[ chaser['description'].str.contains(search) ][['description','value']].values
            else:
                chasecands=chaser[['description','value']].values
            locs=[]
            for chasecand in chasecands:
                locs.append(chasecand[1])
            if len(locs): value=','.join(map(str,locs))
        geog=self.nomis_codes_geog(geography=value)
        if desc is not None:
            candidates=geog[['description','value']].values
            for candidate in candidates:
                if candidate[0]==desc:
                    geog=self.nomis_codes_geog(geography=candidate[1])
        if search is not None:
            retval=geog[ geog['description'].str.contains(search) ][['description','value']].values
        else:
            retval=geog[['description','value']].values
        return pd.DataFrame(retval,columns=['description','geog'])
    def _get_datasets(self,search=None):
        url='http://www.nomisweb.co.uk/api/v01/dataset/def.sdmx.json'
        if search is not None:
            url='{url}{params}'.format(url=url,params=self._url_encode({'search':search}))
        data=pd.read_json(url)
        return data
    def nomis_code_metadata(self,idx='NM_1_1',describe=None):
        #Metadata is cached per dataset, so the API is only queried the first time
        if idx in self.metadata:
            metadata=self.metadata[idx]
        else:
            core=self.dataset_lookup(idx,dimensions=True)
            metadata={'core':core}
            for dim in core['concept'].str.lower():
                metadata[dim]=self._nomis_codes_dimension_grab(dim,idx,params=None)
            self.metadata[idx]=metadata
        if describe=='all':
            #Build a list rather than calling .remove() on keys(), which fails on Python 3 dict views
            keys=[k for k in metadata if k!='core']
            self._describe_metadata(idx,metadata,keys)
        elif isinstance(describe, str) and describe in metadata.keys():
            self._describe_metadata(idx,metadata,[describe])
        elif isinstance(describe, list):
            self._describe_metadata(idx,metadata,describe)
        else:
            return metadata
    def nomis_codes_datasets(self,search=None,dimensions=False):
        #TO DO - by default, use local dataset list and search in specified cols;
        # add additional parameter to force a search on API
        df=self._get_datasets(search)
        keyfamilies=df.loc['keyfamilies']['structure']
        if keyfamilies is None: return pd.DataFrame()
        datasets=[]
        for keyfamily in keyfamilies['keyfamily']:
            kf={'agency':keyfamily['agencyid'],
                'idx':keyfamily['id'],
                'name':keyfamily['name']['value'],
                'description': keyfamily['description']['value'] if 'description' in keyfamily else ''
                #'dimensions':[dimensions['codelist'] for dimensions in keyfamily['components']['dimension']]
               }
            if dimensions:
                for _dimensions in keyfamily['components']['dimension']:
                    kf['dimension']= _dimensions['codelist']
                    kf['concept']= _dimensions['conceptref']
                    datasets.append(kf.copy())
            else:
                datasets.append(kf.copy())
        return pd.DataFrame(datasets)
    def _nomis_codes_parser(self,url):
        jdata=pd.read_json(url)
        cl=jdata.loc['codelists']['structure']
        if cl is None: return pd.DataFrame()
        codes_data=[]
        for codelist in cl['codelist']:
            code_data={'agencyid':codelist['agencyid'],
                       'dataset':jdata.loc['header']['structure']['id'],
                       'dimension':codelist['id'],
                       'name':codelist['name']['value']
                      }
            for code in codelist['code']:
                code_data['description']=code['description']['value']
                code_data['value']=code['value']
                codes_data.append(code_data.copy())
        return pd.DataFrame(codes_data)
    #Generic minimal constructor
    def _nomis_codes_url_constructor(self,dim,idx,params=None):
        #This doesn't cope properly with geography, which can insert an extra element into the path
        return '{nomis}{idx}/{dim}.def.sdmx.json{params}'.format(nomis=self.url,
                                                                 idx=idx,
                                                                 dim=dim.lower(),
                                                                 params=self._url_encode(params))
    def _nomis_codes_dimension_grab(self,dim,idx,params=None):
        #Pass the caller's params through (previously they were discarded in favour of None)
        url=self._nomis_codes_url_constructor(dim,idx,params=params)
        return self._nomis_codes_parser(url)
    #Set up shorthand functions to call particular dimensions
    #Select appropriate datasets as default to demo the call
    def nomis_codes_measures(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('measures',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_time(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('time',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_industry(self,idx='NM_21_1'):
        url=self._nomis_codes_url_constructor('industry',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_freq(self,idx='NM_1_1'):
        url=self._nomis_codes_url_constructor('freq',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_age_dur(self,idx='NM_7_1'):
        url=self._nomis_codes_url_constructor('age_dur',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_ethnicity(self,idx='NM_118_1'):
        url=self._nomis_codes_url_constructor('ethnicity',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_occupation(self,idx='NM_7_1'):
        url=self._nomis_codes_url_constructor('occupation',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_age(self,idx='NM_18_1'):
        url=self._nomis_codes_url_constructor('age',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_duration(self,idx='NM_18_1'):
        url=self._nomis_codes_url_constructor('duration',idx)
        return self._nomis_codes_parser(url)
    def nomis_codes_sex(self,idx='NM_1_1',geography=None):
        params={}
        if geography is not None:
            params['geography']=geography
        url='{nomis}{idx}/sex.def.sdmx.json{params}'.format(nomis=self.url,
                                                            idx=idx,
                                                            params=self._url_encode(params))
        return self._nomis_codes_parser(url)
    def nomis_codes_geog(self,idx='NM_1_1',geography=None,search=None):
        params={}
        if geography is not None:
            geog='/{geog}'.format(geog=geography)
        else:
            geog=''
        if search is not None:
            params['search']=search
        url='{nomis}{idx}/geography{geog}.def.sdmx.json{params}'.format(nomis=self.url,
                                                                        idx=idx,geog=geog,
                                                                        params=self._url_encode(params))
        return self._nomis_codes_parser(url)
    def nomis_codes_items(self,idx='NM_1_1',geography=None,sex=None):
        params={}
        if geography is not None:
            params['geography']=geography
        if sex is not None:
            #Only map sex when one is given; mapping None would yield the string 'None'
            params['sex']=self._sex_map(idx,sex)
        url='{nomis}{idx}/item.def.sdmx.json{params}'.format(nomis=self.url,
                                                             idx=idx,
                                                             params=self._url_encode(params))
        return self._nomis_codes_parser(url)
    #TO DO have a dataset_explain(idx) function that will print a description of a dataset,
    #summarise what dimensions are available, and the values they can take,
    #and provide a stub function usage example (with eligible parameters) to call it
    def _nomis_data_url(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):
        #TO DO
        #Add an explain=True parameter that will print a natural language summary of what the command is calling
        ###---Time/date info from nomis API docs---
        #Useful time options:
        ##"latest" - the latest available data for this dataset
        ##"previous" - the date prior to "latest"
        ##"prevyear" - the date one year prior to "latest"
        ##"first" - the oldest available data for this dataset
        ##Using the "time" concept you are limited to entering two dates,
        ##a start and end. All dates between these are returned.
        #date is more flexible for ranges
        ##With the "date" parameter you can specify relative dates,
        ##so for example if you wanted the latest date, three months and six months prior to that
        ##you could specify "date=latest,latestMINUS3,latestMINUS6".
        ##You can use ranges with the "date" parameter,
        ##e.g. if you wanted data for 12 months ago, together with all dates in the last six months
        ##up to latest you could specify "date=prevyear,latestMINUS5-latest".
        ##To illustrate the difference between using "date" and "time":
        ##if you specified "time=first,latest" in your URI you would get all dates from first to latest inclusive,
        ##whereas with "date=first,latest" your output would contain only the first and latest dates.
        metadata=self.nomis_code_metadata(idx)
        #HELPERS
        #Find geography from postcode
        if 'geography' not in kwargs and postcode is not None:
            kwargs['geography']=self._get_geo_from_postcode(postcode, areacode)
        #Map natural language dimension values to corresponding codes
        for dim in set( metadata.keys() ).intersection( kwargs.keys() ):
            kwargs[dim]=self._dimension_mapper(idx,dim,kwargs[dim])
        #Set a default time period to be latest
        if 'date' not in kwargs and 'time' not in kwargs:
            kwargs['time']='latest'
        #Set up a default projection for the returned columns
        cols=['geography_code','geography_name','measures_name','measures','date_code','date_name','obs_value']
        #Insert any optional dimension name columns just before obs_value
        for k in ['sex','age','item']:
            if k in kwargs: cols.insert(len(cols)-1,'{}_name'.format(k))
        if 'select' not in kwargs:
            kwargs['select']=','.join(cols)
        url='{nomis}{idx}.data.csv{params}'.format(nomis=self.url,
                                                   idx=idx,
                                                   params=self._url_encode(kwargs))
        return url
    def _nomis_data(self,idx='NM_1_1',postcode=None, areacode=None, **kwargs):
        url=self._nomis_data_url(idx,postcode, areacode, **kwargs)
        df=pd.read_csv(url)
        df['_Code']=idx
        return df
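
As a rough, standalone illustration of what `_nomis_data_url` does (map natural language dimension values to codes, default the period to `latest`, then encode the query), here is a minimal sketch. It does not use the wrapper class. `map_labels_to_codes`, `build_data_url` and `label_maps` are hypothetical names introduced for the example, the base URL is inferred from the dataset listing URL used in `_get_datasets`, the sex codes are illustrative only, and the `?key=value&...` layout is an assumption about what `_url_encode` produces.

```python
import re
from urllib.parse import urlencode

# Assumed base URL, inferred from the dataset listing URL used in _get_datasets
NOMIS_DATASET_URL = 'http://www.nomisweb.co.uk/api/v01/dataset/'


def map_labels_to_codes(value, label_to_code):
    """Replace each known label in a comma separated string with its code."""
    # Longest labels first, so 'Female' is replaced before 'Male' can match inside it
    for label in sorted(label_to_code, key=len, reverse=True):
        # Escape the label so it is matched literally, ignoring case
        pattern = re.compile(re.escape(label), re.IGNORECASE)
        value = pattern.sub(str(label_to_code[label]), str(value))
    return value


def build_data_url(idx, label_maps=None, **params):
    """Build a CSV data URL for dataset idx (hypothetical helper, not the wrapper's API)."""
    label_maps = label_maps or {}
    # Map natural language dimension values to codes where a mapping is supplied
    for dim, mapping in label_maps.items():
        if dim in params:
            params[dim] = map_labels_to_codes(params[dim], mapping)
    # Default to the latest period when neither 'date' nor 'time' is given
    if 'date' not in params and 'time' not in params:
        params['time'] = 'latest'
    # Sort for a stable parameter order; keep commas unescaped for readability
    query = urlencode(sorted(params.items()), safe=',')
    return '{base}{idx}.data.csv?{query}'.format(base=NOMIS_DATASET_URL,
                                                 idx=idx, query=query)


# Illustrative codes only; the real ones come from the dataset's sex codelist
sex_codes = {'Male': 5, 'Female': 6}

url_latest = build_data_url('NM_1_1', label_maps={'sex': sex_codes},
                            sex='male,female', geography='2092957697TYPE464')
url_relative = build_data_url('NM_1_1', date='latest,latestMINUS3,latestMINUS6')
print(url_latest)
print(url_relative)
```

Building the URL separately from fetching it makes it easy to inspect a query before handing it to `pd.read_csv`.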