@stuartlynn
Created March 8, 2017 16:36
Thoughts on running Analysis Functions on AWS Lambda

Analysis on AWS Lambda

What is AWS Lambda

Lambda is a serverless code execution environment on AWS that can run arbitrary Python or Node code. When a function is triggered it spins up, runs, and then exits.
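As a rough illustration of what such a function looks like in Python: Lambda calls a handler with an `event` (the payload from the caller) and a `context` object. The payload shape used here (a `values` list) is made up for the example.

```python
def handler(event, context):
    """Entry point that Lambda calls once per invocation."""
    # `event` carries the JSON payload sent by the caller, parsed into a dict.
    # The "values" key is a hypothetical payload field for this example.
    values = event.get("values", [])
    # The return value is serialized back to the caller as JSON.
    return {"count": len(values), "total": sum(values)}
```

Nothing persists between the function starting and exiting, so anything the analysis needs has to arrive in the event or be fetched during the run.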

Issues with the current setup for Analysis

Analysis is currently run at the database level through plpythonu functions. This has several drawbacks:

  • It makes it hard to distribute the code execution.
  • Libraries for analysis need to be installed on every CARTO instance, and product has to install and maintain these libraries.
  • Adding new functions is hard because the deploy process requires us to touch every CARTO instance.
  • The function definitions are inflexible because we need to stick to PostgreSQL's strict limits on table-returning functions.

How could running analysis on AWS Lambda help?

  • Any libraries (Python modules or C libraries) that are required for an analysis are bundled with the analysis, and so can be tailored to that analysis.
  • Processing happens as needed and not on the same machine as the database, so the performance of the database, tiler, etc. isn't affected.
  • Deployment is much easier, as deploying a new analysis method is just bundling it up as a Lambda function and pushing it to AWS.
  • Testing is much easier.

How would it work?

  • Crankshaft (or camshaft) triggers an analysis function. Something like cdb_run_analysis(analysis_name, input_query, {params})
  • This triggers the AWS Lambda function with the appropriate arguments.
  • Research develops a Lambda function for a given analysis, which takes as arguments an input SQL query, the user's API key, their username, any parameters for the analysis and a target table name for the results.
  • The Lambda function calls CARTO using the SQL API or a direct database connection to grab the data it needs.
  • It processes the data and pushes the results to the target table using either the SQL API, the Import API or a database connection.
  • CARTO registers the new table, which indicates that the analysis has completed.
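The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the event field names (`input_query`, `api_key`, `username`, `params`, `target_table`), the SQL API URL pattern, and the placeholder "mean of a column" analysis are all made up for the example, and the results are written with a plain CREATE TABLE through the SQL API rather than the Import API or a database connection.

```python
import json
import urllib.parse
import urllib.request


def run_sql(username, api_key, query):
    """Run a query through the CARTO SQL API and return the result rows.

    The URL pattern is an assumption about the account's SQL API endpoint.
    """
    url = "https://{}.carto.com/api/v2/sql".format(username)
    body = urllib.parse.urlencode({"q": query, "api_key": api_key}).encode()
    with urllib.request.urlopen(url, data=body) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload.get("rows", [])


def mean_of_column(rows, column):
    """Placeholder analysis: the average of one numeric column."""
    values = [row[column] for row in rows]
    return sum(values) / len(values)


def handler(event, context, sql=run_sql):
    """Hypothetical analysis function; `sql` is injectable for testing."""
    # 1. Grab the input data using the query the caller passed in.
    rows = sql(event["username"], event["api_key"], event["input_query"])
    # 2. Run the analysis with the supplied parameters.
    result = mean_of_column(rows, event["params"]["column"])
    # 3. Push the result to the target table. A real function would need
    #    to validate the table name instead of interpolating it directly.
    write = "CREATE TABLE {} AS SELECT {} AS mean".format(
        event["target_table"], result
    )
    sql(event["username"], event["api_key"], write)
    return {"target_table": event["target_table"], "rows_read": len(rows)}
```

Passing the SQL runner in as an argument means the function can be exercised locally with a fake runner, without touching CARTO or AWS.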

Possible issues with using Lambda

  • Hard timeout of 5 minutes.
  • Libraries plus code payloads need to be < 50 MB.
  • On-prem install is not possible.
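One possible way to live with the hard timeout is to split the work into chunks and stop cleanly before time runs out. The sketch below assumes the analysis can be chunked at all; `process_in_chunks`, `work` and the 10 second margin are hypothetical. `get_remaining_time_in_millis()` is the method the Lambda context object provides in Python.

```python
SAFETY_MARGIN_MS = 10000  # assumed margin: stop when under 10 seconds remain


def process_in_chunks(chunks, context, work):
    """Run `work` on each chunk until done or the time budget is nearly spent."""
    done = 0
    for chunk in chunks:
        # Ask Lambda how long is left before the hard timeout kills the run.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        work(chunk)
        done += 1
    # The caller can use this to re-trigger the function for the remainder.
    return {"chunks_done": done, "finished": done == len(chunks)}
```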