Elliot Block elliot42

  • Form Energy
  • Berkeley, CA

I did a poor job explaining the value proposition of Delta Lake, including what it means for customers. Here is a better version:

  • Delta Lake lets you build a dataset incrementally over time in S3, storing only the incremental changes (e.g. per day), and then look up the state of the dataset at a past point in time (“time travel”).
  • In our case, with a X00GB graph (e.g. GraphFrames of nodes + edges), we would not need to duplicate the entire dataset every day; we would only need to store each day's changes, which is a large space/cost savings over making a full copy every day.
  • This works for storing any dataset incrementally, so in our case it would cover both graph representations and feature value representations (from Delta Lake's perspective, it is simply storing a data table in Parquet in S3, with metadata).
  • Several of our processes require historical state, e.g. graph traversals at a previous point in time, and looking up feature values at previous points in ti
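
The idea in the bullets above (store only per-day changes, then rebuild any past state on demand) can be illustrated with a toy in-memory model. This is only a sketch of the concept, not Delta Lake's API or storage format: the names `Commit`, `VersionedTable`, `as_of`, and `stored_rows` are hypothetical, and the edge keys and values are made-up illustration data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Commit:
    """One incremental change set, e.g. a single day's worth of edits."""
    upserts: Dict[str, float] = field(default_factory=dict)  # key -> new value
    deletes: List[str] = field(default_factory=list)         # keys removed


class VersionedTable:
    """Toy append-only log (hypothetical): keeps per-commit changes only."""

    def __init__(self) -> None:
        self._log: List[Commit] = []

    def commit(self,
               upserts: Optional[Dict[str, float]] = None,
               deletes: Optional[List[str]] = None) -> int:
        """Append a change set and return its 0-based version number."""
        self._log.append(Commit(dict(upserts or {}), list(deletes or [])))
        return len(self._log) - 1

    def as_of(self, version: int) -> Dict[str, float]:
        """Rebuild the table as it looked right after `version` was committed."""
        if not 0 <= version < len(self._log):
            raise ValueError(f"unknown version {version}")
        state: Dict[str, float] = {}
        # Replay every change set up to and including the requested version.
        for change in self._log[: version + 1]:
            state.update(change.upserts)
            for key in change.deletes:
                state.pop(key, None)
        return state

    def stored_rows(self) -> int:
        """Count rows physically kept: changes only, never full copies."""
        return sum(len(c.upserts) + len(c.deletes) for c in self._log)


if __name__ == "__main__":
    edges = VersionedTable()
    day0 = edges.commit(upserts={"a->b": 1.0, "b->c": 2.0})
    day1 = edges.commit(upserts={"c->d": 3.0})
    day2 = edges.commit(upserts={"a->b": 1.5}, deletes=["b->c"])

    print(edges.as_of(day0))    # state after the first day
    print(edges.as_of(day2))    # state after the third day
    print(edges.stored_rows())  # change rows stored across all days
```

Here the three days store five change rows in total, whereas writing a full copy of the table each day would store seven (2 + 3 + 2); the gap grows with the size of the dataset relative to its daily churn. In Delta Lake itself, a read at a past version is requested through reader options such as `versionAsOf` or `timestampAsOf` rather than by replaying changes by hand.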
all:
	curl -L https://gist.github.com/elliot42/dd081d3587068f9684512ee6fb20c65f/raw/d47be4e31f754de10287745dc8469ea8aecce6a8/gistfile1.txt > package.json
	curl -OL https://gist.github.com/danevans/759d8c8c4d340a9a034716a7c9a53fe0/raw/690ee359f71d0162d33cbadba1dd89b21921325c/webpack.config.js
	curl -OL https://gist.github.com/danevans/759d8c8c4d340a9a034716a7c9a53fe0/raw/690ee359f71d0162d33cbadba1dd89b21921325c/index.html
	curl -OL https://gist.github.com/danevans/759d8c8c4d340a9a034716a7c9a53fe0/raw/690ee359f71d0162d33cbadba1dd89b21921325c/index.js
	yarn
{
  "scripts": {
    "build": "webpack --progress -p",
    "watch": "webpack --progress --watch",
    "server": "webpack-dev-server --open"
  },
  "name": "dir",
  "version": "1.0.0",
  "main": "index.js",
  "license": "MIT",