@mbarton
Created February 17, 2021 09:31
webscraper.io scraper for search results from the Hansard website
{
  "_id": "hansard",
  "startUrl": [
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22affected%20by%20the%20bill%22&partial=False&page=[1-123]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22consented%20to%20place%22&partial=False&page=[1-33]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22have%20it%20in%20command%20from%22&partial=False&page=[1-35]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22Queen%27s%20consent%20signified%22&partial=False&page=[1-16]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22Prince%20of%20Wales%27s%20consent%20signified%22&partial=False&page=[1-4]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22Queen%27s%20Consent%2C%20on%20behalf%20of%20the%20Crown%2C%20signified%22&partial=False&page=[1-23]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22Prince%20of%20Wales%27s%20consent%2C%20on%20behalf%20of%20the%20Duchy%20of%20Cornwall%2C%20signified%22&partial=False&page=[1-3]",
    "https://hansard.parliament.uk/search/Contributions?startDate=1952-02-06&endDate=2020-11-27&searchTerm=%22consent%20having%20been%20signified%22&partial=False&page=1"
  ],
  "selectors": [
    {
      "id": "row",
      "type": "SelectorElement",
      "parentSelectors": [
        "_root"
      ],
      "selector": ".search-results .card",
      "multiple": true,
      "delay": 0
    },
    {
      "id": "title",
      "type": "SelectorText",
      "parentSelectors": [
        "row"
      ],
      "selector": ".primary-info",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "date",
      "type": "SelectorText",
      "parentSelectors": [
        "row"
      ],
      "selector": ".tertiary-info",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "text",
      "type": "SelectorText",
      "parentSelectors": [
        "row"
      ],
      "selector": ".text",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "link",
      "type": "SelectorElementAttribute",
      "parentSelectors": [
        "row"
      ],
      "selector": "_parent_",
      "multiple": false,
      "extractAttribute": "href",
      "delay": 0
    }
  ]
}
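Most of the start URLs in the sitemap end in a bracketed page range such as page=[1-123], which the scraper expands into one request per results page. The sketch below illustrates that expansion in Python so the resulting page URLs can be checked by hand. It is not part of webscraper.io: the function name expand_start_url is hypothetical, and it only handles a plain [start-end] range, not any other range forms the tool may accept.

```python
import re
from typing import List

# Matches a plain numeric range such as "[1-123]" inside a URL.
# Assumption: only this simple "[start-end]" form is handled here.
RANGE_PATTERN = re.compile(r"\[(\d+)-(\d+)\]")


def expand_start_url(url: str) -> List[str]:
    """Return one URL per value in the first "[start-end]" range found in url.

    A URL without a range is returned unchanged as a single-item list.
    """
    match = RANGE_PATTERN.search(url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = url[:match.start()], url[match.end():]
    return [f"{prefix}{page}{suffix}" for page in range(start, end + 1)]


if __name__ == "__main__":
    # One of the start URLs from the sitemap above (4 result pages).
    start_url = (
        "https://hansard.parliament.uk/search/Contributions"
        "?startDate=1952-02-06&endDate=2020-11-27"
        "&searchTerm=%22Prince%20of%20Wales%27s%20consent%20signified%22"
        "&partial=False&page=[1-4]"
    )
    for page_url in expand_start_url(start_url):
        print(page_url)
```

The page counts in the ranges were fixed when the sitemap was written, so they may no longer match the number of result pages the search returns today.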