@alissapajer
Last active April 22, 2020 22:18
Precog: Uniform Access to Any API

The number of web APIs has grown exponentially over the past decade. Today, nearly every product in every sector exposes a public API, and organizations commonly deploy private APIs to share data internally. The industry has settled for ad hoc approaches to accessing API data, commonly using a combination of Python and command line tools. And so we tacitly assume that the only way to access API data is using custom code. Precog changes that.

To motivate Precog's solution to the API problem, we need to understand where the true complexities in API access lie. We ask "Is there an API?", and then feel satisfied when the answer is "Yes". But the existence of an API does not mean that we can easily access the data in a meaningful way. This brings us to the three primary complexities of using web APIs: setting up the API requests, reshaping the data into an analytics-ready format, and making the reshaped data accessible.

Setting Up API Requests

Every API requires a different setup. Some APIs use basic authentication while others use OAuth 2.0. Many APIs are paginated, but pagination is not standardized: some APIs return the next page in the response header, while others return a pagination token or offset nested in the response data. Sometimes you have to make an initial request to gather some metadata, which is then used in subsequent requests. It's also common to make requests to multiple endpoints.
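To make the pagination point concrete, here is a minimal Python sketch of the kind of custom code the paragraph above describes. It is not Precog's implementation. The header name X-Next-Page, the meta.next_token field, and the two in-memory "APIs" are all hypothetical stand-ins, used so the example runs without a network.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Tuple

# A page is modelled as (response headers, parsed JSON body).
Page = Tuple[Dict[str, str], Dict[str, Any]]


def next_from_header(headers: Dict[str, str], body: Dict[str, Any]) -> Optional[str]:
    """Style 1: the next page is named in a response header (hypothetical name)."""
    return headers.get("X-Next-Page")


def next_from_body(headers: Dict[str, str], body: Dict[str, Any]) -> Optional[str]:
    """Style 2: a pagination token is nested in the response data (hypothetical field)."""
    return body.get("meta", {}).get("next_token")


def fetch_all(
    fetch: Callable[[Optional[str]], Page],
    next_cursor: Callable[[Dict[str, str], Dict[str, Any]], Optional[str]],
    items_key: str = "items",
) -> Iterator[Any]:
    """Request pages until the API stops advertising a next page."""
    cursor: Optional[str] = None  # None means "first request"
    while True:
        headers, body = fetch(cursor)
        yield from body.get(items_key, [])
        cursor = next_cursor(headers, body)
        if cursor is None:
            break


# Two fake APIs standing in for real HTTP calls.
def fake_header_api(cursor: Optional[str]) -> Page:
    pages = {
        None: ({"X-Next-Page": "p2"}, {"items": [1, 2]}),
        "p2": ({}, {"items": [3]}),
    }
    return pages[cursor]


def fake_token_api(cursor: Optional[str]) -> Page:
    pages = {
        None: ({}, {"items": ["a"], "meta": {"next_token": "t1"}}),
        "t1": ({}, {"items": ["b"], "meta": {}}),
    }
    return pages[cursor]


header_items = list(fetch_all(fake_header_api, next_from_header))
token_items = list(fetch_all(fake_token_api, next_from_body))
```

Even in this toy form, each pagination style needs its own cursor logic, and a real script would add authentication, retries, and rate limiting on top.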

Without Precog, accessing an API means cobbling together all of the above components in a custom script. But writing custom scripts is not a sustainable business solution, as it consumes developer resources that could otherwise be used to directly impact the business. Precog solves this problem by providing a uniform user interface to access API endpoints.

[SCREENSHOT]

No matter the API, the interface looks the same. Precog authenticates using the authentication type of your choice. Precog makes multiple calls to the API to seamlessly handle pagination. The end result: Precog streams API data directly from the API to the data store of your choosing, allowing you to reshape it along the way. This brings us to the next complexity.

Making API Data Analytics-Ready

Our ultimate goal is to understand API data through analysis and visualization. But analysis and visualization tools speak the language of two-dimensional homogeneous data (like a spreadsheet), which is categorically opposed to the multi-dimensional heterogeneous structure of API data (think nested spreadsheets with missing entries). The industry currently solves this problem, once again, with custom code. Data engineers are tasked with deciding the correct way to reshape the complex data into two dimensions while preserving the meaning of each row. Precog has solved this problem in the general case.

Nearly all APIs return JSON data, due to its flexibility and readability. But this comes at a hidden cost: the relationships described in a JSON document are deeply complex in non-obvious ways. Furthermore, the structure of JSON responses varies wildly from API to API. Some APIs return an array of objects, where each object represents a semantic row. Others return one or more objects that encode metadata alongside the data. Others use array indices to encode semantically meaningful information. Also, the data are frequently heterogeneous, meaning that each nested array or object looks different from its neighbors. Precog innately understands unstructured data. No matter how deeply nested. No matter how many holes. Precog understands API data.
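As an illustration of the reshaping problem, the following Python sketch flattens a small, heterogeneous JSON response into a table. It is a deliberately naive approach under stated assumptions, not Precog's algorithm: nested objects become dotted column names, missing values become None, and arrays are left in place rather than expanded into rows. The sample records and the function names flatten and to_table are made up for the example.

```python
from typing import Any, Dict, List, Tuple


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested objects into a single-level dict with dotted column names."""
    row: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=column + "."))
        else:
            # Scalars and arrays are kept as-is; arrays are NOT expanded here.
            row[column] = value
    return row


def to_table(records: List[Dict[str, Any]]) -> Tuple[List[str], List[List[Any]]]:
    """Build a rectangular table, filling the holes with None."""
    rows = [flatten(r) for r in records]
    columns = sorted({column for row in rows for column in row})
    return columns, [[row.get(column) for column in columns] for row in rows]


# Hypothetical API response: the two objects do not share the same shape.
records = [
    {"id": 1, "user": {"name": "Ada", "geo": {"city": "London"}}},
    {"id": 2, "user": {"name": "Lin"}, "tags": ["x"]},
]

columns, table = to_table(records)
# columns: ['id', 'tags', 'user.geo.city', 'user.name']
# table:   [[1, None, 'London', 'Ada'], [2, ['x'], None, 'Lin']]
```

The sketch already has to make judgment calls (what to do with arrays, how to name columns, how to represent holes), which is exactly the decision-making the section says is normally left to data engineers.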

[SCREENSHOT]

Precog's data browser presents a uniform view of all API data, enabling you to immediately see your data in a browsable format and to pick the components that are relevant to you. Precog then transforms your data into the two-dimensional format that data analytics tools understand, while preserving the semantic relationships encoded in the JSON structure. In this way, Precog empowers everyone to transform complex API data into tabular, analytics-ready data.

Loading API Data into a Destination

The third complexity of API access is loading the reshaped data into a destination. Each destination provides a different way to load data, often requiring an engineer's expertise. Precog connects to your database, pushing in the reshaped API data with the click of a button. With Precog, there is no need to maintain your database in parallel with your loading tool.
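For comparison, here is a minimal Python sketch of the hand-written loading step, using an in-memory SQLite database from the standard library as a stand-in destination. The table name api_rows, the column names, and the helper load_rows are assumptions chosen for illustration; other destinations have their own bulk-loading mechanisms and typing rules.

```python
import sqlite3
from typing import Any, Iterable, List, Sequence


def load_rows(
    conn: sqlite3.Connection,
    table: str,
    columns: List[str],
    rows: Iterable[Sequence[Any]],
) -> int:
    """Create the table if needed, insert the rows, and return the row count."""
    # Identifiers are interpolated directly, so they must come from trusted code;
    # only the values are passed as bound parameters.
    quoted = ", ".join(f'"{column}"' for column in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
    conn.executemany(
        f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})', rows
    )
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


conn = sqlite3.connect(":memory:")
count = load_rows(conn, "api_rows", ["id", "user.name"], [(1, "Ada"), (2, "Lin")])
names = conn.execute('SELECT "user.name" FROM "api_rows" ORDER BY "id"').fetchall()
```

A production loader would also need to handle column types, schema changes between runs, and deduplication, each of which differs by destination.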

[SCREENSHOT]

Precog provides a single solution for streaming API data from the source, transforming it into analytics-ready tables, and loading it into your target destination. Precog gives everyone direct and instant access to any API.

Web API access remained an unsolved problem until Precog.

Download trial. Buy it here.
