
Jordan Vani jvani

import parsel
import requests
import datetime as dt

if __name__ == "__main__":
    # Fetch the NYC traffic camera multiview page and parse its HTML.
    url = 'https://webcams.nyctmc.org/multiview2.php'
    resp = requests.get(url)
    selector = parsel.Selector(resp.text)

Sayari Data Task

Context

Sayari collects public data from around the globe, including corporate registries, civil litigation registries, customs and import/export data, land and real property ownership records, official gazettes, and more. This data powers our products and is used for due diligence, risk management, financial intelligence, and compliance.

To make the data useful, Sayari often runs entity resolution on what we collect. This lets us detect when a single company or person is mentioned on two different web pages. For this task you will collect some public data and perform some simple entity resolution on it.
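As a rough illustration of what "simple entity resolution" could mean here, the sketch below groups records whose company names match after normalization. It is only one possible approach, not Sayari's method: the suffix list, the `normalize` and `resolve` helpers, and the sample records are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "ltd", "corp", "co",
            "corporation", "incorporated", "company"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve(records):
    """Group (source, name) pairs whose normalized names are identical."""
    groups = defaultdict(list)
    for source, name in records:
        groups[normalize(name)].append((source, name))
    return dict(groups)


# Made-up sample data: two spellings of one company plus an unrelated one.
records = [
    ("page_a", "Acme Holdings, Inc."),
    ("page_b", "ACME HOLDINGS INC"),
    ("page_c", "Beta Trading LLC"),
]
resolved = resolve(records)
```

Exact matching on a normalized key is cheap and easy to audit, but it misses typos and abbreviations; a fuzzier comparison would be needed if those matter for the data collected.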

Task

The Secretary of State of North Dakota provides a business search web app that allows users to search for businesses by name. Your task:

  1. Play around with the site and figure out how to query companies by name.