bbakerman/gist:52c54a2242adc4d2f03adc08ccadf1ba

## gistfile1.txt
Design a service which receives as input a list of URLs, scrapes those URLs for links to other pages and references to images, then returns a mapping of each submitted page URL to a list of image URLs.

Your service does not need to download and store the images.

Your service should follow links to other pages from the originally submitted pages, and return the images on
those 2nd/3rd/nth-level pages as if they were on the first-level page.
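
One way to read the crawling requirement above is as a breadth-first traversal that starts at a submitted URL and pools every image it finds. The following is a minimal sketch under stated assumptions, not a prescribed implementation: `LinkAndImageParser`, `crawl`, the caller-supplied `fetch` function and the `max_depth` limit are all hypothetical names introduced here, and the brief itself does not specify a depth limit or whether to stay on the same domain.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkAndImageParser(HTMLParser):
    """Collects absolute <a href> and <img src> URLs from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Resolve relative links and drop #fragments so pages dedupe cleanly.
            url, _fragment = urldefrag(urljoin(self.base_url, attrs["href"]))
            self.links.append(url)
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl from start_url; returns the sorted image URLs found.

    fetch(url) is a caller-supplied function returning HTML text, or None on
    failure. max_depth is an assumed safety limit, not part of the brief.
    """
    seen = {start_url}
    images = set()
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndImageParser(url)
        parser.feed(html)
        images.update(parser.images)
        if depth < max_depth:
            for link in parser.links:
                # Skip mailto:, javascript: and similar non-page links.
                if link.startswith(("http://", "https://")) and link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return sorted(images)
```

Passing `fetch` in as a parameter keeps the traversal testable without network access; a real service would plug in an HTTP client there and would probably add timeouts and politeness rules.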

The API contract is defined as:

POSTing to /jobs with a body of a JSON array of URLs to start scraping from
(e.g. ["https://google.com", "https://www.statuspage.io"]) should return a job identifier of some kind.

GETting /jobs/:job_id/status with the returned job identifier should return a JSON object in the
format {"completed": x, "inprogress": y}, where x is the number of original URLs which have been completely
crawled and y is the number of original URLs which are still being crawled.
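
The job identifier and the status counts could be backed by something as small as the following. This is a sketch only: `JobStore` and its methods are hypothetical names, the in-memory dictionary is an assumption (the brief does not say how jobs are stored), and the class is not thread-safe as written.

```python
import uuid


class JobStore:
    """In-memory job registry (hypothetical; a real service would likely persist this)."""

    def __init__(self):
        # job_id -> {original_url: finished?}
        self._jobs = {}

    def create_job(self, urls):
        """Register a job for the submitted URLs and return its identifier."""
        job_id = uuid.uuid4().hex
        # Each original URL starts out in progress; duplicates collapse to one entry.
        self._jobs[job_id] = {url: False for url in urls}
        return job_id

    def mark_completed(self, job_id, url):
        """Record that the crawl rooted at an original URL has finished."""
        self._jobs[job_id][url] = True

    def status(self, job_id):
        """Return the body for GET /jobs/:job_id/status."""
        done = sum(self._jobs[job_id].values())
        return {"completed": done, "inprogress": len(self._jobs[job_id]) - done}
```

Because both counts are derived from the same per-URL flags, `completed + inprogress` always equals the number of distinct URLs submitted.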

GETting /jobs/:job_id/results with the returned job identifier should return a JSON object that
maps each original URL to all images reachable from that URL, in the format:

{
  "https://google.com": [
    "https://google.com/images/logo_sm_2.gif",
    "https://google.com/images/warning.gif"
  ],
  "https://www.statuspage.io": [
    "https://dka575ofm4ao0.cloudfront.net/assets/base/favicon-b756db379a57687bdfa58f6bac32bec2.png",
    "https://dka575ofm4ao0.cloudfront.net/assets/base/apple-touch-icon-144x144-precomposed-293c39b0635ae7523612fe7488be9244.png"
  ]
}

NB: the system does not have to track which images came from 2nd/3rd/nth-level linked pages,
but it does need to track which of the originally submitted URLs led to each image URL.
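
The note above suggests that results only need to be keyed by the originally submitted URL. A minimal sketch of that bookkeeping follows; `ResultCollector` and its methods are hypothetical names, and sorting plus de-duplicating the image list is an assumption made here for stable output, not something the brief requires.

```python
from collections import defaultdict


class ResultCollector:
    """Accumulates image URLs per originally submitted URL."""

    def __init__(self):
        self._images = defaultdict(set)

    def add_page(self, origin_url, image_urls):
        """Record images found on any page reached from origin_url.

        The linked page's own URL and depth are deliberately not stored;
        only the original URL that led to the images is kept.
        """
        self._images[origin_url].update(image_urls)

    def results(self, origin_urls):
        """Return the body for GET /jobs/:job_id/results."""
        return {url: sorted(self._images.get(url, ())) for url in origin_urls}
```

Using a set per original URL means an image that appears on several linked pages is reported once.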