Lambda is a serverless code execution environment on AWS that can run arbitrary Python or Node.js code. When a function is triggered, it spins up, runs, and then exits.
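As a minimal sketch of that lifecycle, a Python Lambda function is just a handler invoked once per trigger event; the handler name and event keys below are illustrative, not taken from any CARTO code:

```python
import json

def handler(event, context):
    """Entry point that AWS Lambda invokes for each trigger.

    `event` carries the invocation arguments; the keys used here
    (analysis_name, params) are hypothetical examples.
    """
    analysis_name = event.get("analysis_name", "unknown")
    params = event.get("params", {})
    # ... run the analysis here; the function exits when it returns ...
    return {
        "statusCode": 200,
        "body": json.dumps({"analysis": analysis_name, "params": params}),
    }
```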
Analysis is currently run at the database level through plpythonu functions.
- This makes it hard to distribute code execution.
- Libraries for analysis need to be installed on every CARTO instance, which requires Product to install and maintain these libraries.
- Adding new functions is hard because...
  - the deploy process requires us to touch every CARTO instance, and
  - the function definitions are inflexible because we need to stick to PostgreSQL's strict limits on table-returning functions.
- Any libraries (Python modules or C libraries) that an analysis requires are bundled with the analysis itself, so they can be tailored to that analysis.
- Processing happens on demand and not on the same machine as the database, so the performance of the database, tiler, etc. isn't affected.
- Deployment is much easier, as deploying a new analysis method is just bundling it up as a Lambda function and pushing it to AWS.
- Testing is much easier.
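The bundling point above can be sketched with the standard library alone: a Lambda deployment package is just a zip containing the handler plus whatever libraries that one analysis needs. The file names here are illustrative:

```python
import io
import zipfile

def build_lambda_bundle(files):
    """Build an in-memory zip of {path: source} suitable for upload
    as a Lambda deployment package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, source in files.items():
            zf.writestr(path, source)
    return buf.getvalue()

# Each bundle pairs the analysis code with exactly the libraries it needs,
# so nothing has to be installed on the CARTO instances themselves.
bundle = build_lambda_bundle({
    "handler.py": "def handler(event, context):\n    return 'ok'\n",
    "mylib/__init__.py": "",  # hypothetical vendored dependency
})
```

The resulting bytes could then be pushed to AWS (for example with boto3's `create_function` / `update_function_code`), which is what makes per-analysis deployment a single step.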
- Crankshaft (or Camshaft) triggers an analysis function, something like cdb_run_analysis(analysis_name, input_query, {params}).
- This triggers the AWS Lambda function with the appropriate arguments.
- Research develops a Lambda function for a given analysis which takes as arguments an input SQL query, the user's API key, their username, any parameters for the analysis, and a target table name for the results.
- The Lambda function calls CARTO using the SQL API or a direct database connection to grab the data it needs.
- It processes the data and pushes the results to the target table using either the SQL API, the Import API, or a database connection.
- CARTO registers the new table, which indicates that the analysis has completed.
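The steps above can be sketched end to end in Python. The SQL API URL shape and all event keys are assumptions to be checked against the CARTO docs, and the network calls are left as comments so the skeleton stands on its own:

```python
from urllib.parse import urlencode

def build_sql_api_url(username, query, api_key):
    """Build a CARTO SQL API request URL (v2-style endpoint; the exact
    shape should be verified against the CARTO documentation)."""
    qs = urlencode({"q": query, "api_key": api_key})
    return "https://{0}.carto.com/api/v2/sql?{1}".format(username, qs)

def handler(event, context):
    """Hypothetical analysis Lambda: read input rows via the SQL API,
    process them, and write results into the target table."""
    fetch_url = build_sql_api_url(
        event["username"], event["input_query"], event["api_key"])
    # rows = json.load(urlopen(fetch_url))["rows"]        # grab input data
    # result_rows = run_analysis(rows, event["params"])   # process it
    insert_sql = "CREATE TABLE {0} AS ...".format(event["target_table"])
    write_url = build_sql_api_url(
        event["username"], insert_sql, event["api_key"])
    # urlopen(write_url)  # push results back; CARTO then registers the table
    return {"target_table": event["target_table"]}
```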
- Hard timeout of 5 minutes.
- Library + code payloads need to be < 50 MB.
- On-premises installs are not possible.
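Given the payload cap above, a pre-deploy size check is cheap insurance; this helper and the constant name are ours, not part of any AWS tooling:

```python
MAX_BUNDLE_BYTES = 50 * 1024 * 1024  # the 50 MB payload cap noted above

def check_bundle_size(bundle_bytes):
    """Return True when a zipped deployment package fits under the cap."""
    return len(bundle_bytes) <= MAX_BUNDLE_BYTES
```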