Slogan:
Data Pipelines as a Service.
Subtitle:
Delivering turn-key data pipelines to process data from source to APIs.
Vision:
Data pipelines are complex. They are often tough to set up and maintain, yet a data pipeline is a means to an end, not a goal in itself. Most companies are stuck in endless engineering cycles with very limited output as far as the business is concerned. Our vision is to let you concentrate on the golden nuggets hidden in your data rather than spending time and effort setting up complex toolchains. Select and instantiate our data pipeline templates and jump-start your data-driven journey!
Offering:
We have a number of templates for most business domains.
Select the flow and the output best suited to your business.
Advantages:
Well-tested, business-proven libraries.
Templates for ingestion, ETL, ML, reporting, and APIs.
Easy to configure, easy to understand.
Loads of visualizations and interactive tools to understand your data.
Preconfigured machine learning and predictive models.
SRE and data quality flows and dashboards included.
Canned ACLs and security, easy to extend and customize.
For C-suite:
Get business results: the pipeline works like one big calculator.
No need to get lost in engineering techno-babble.
For Project Manager:
Boost your project by using well-tested and highly configurable templates.
Deliver results fast, in a reliable and reproducible manner.
For ETL DevOps:
Simplify your ETL modeling with smart data model rewrites.
Automate ingestion, fact tables, star schemas, and report generation (see the sketch below).
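To make the star-schema automation concrete, here is a minimal sketch of the fact/dimension split using pandas; the table and column names are hypothetical placeholders for whatever the real ingestion step produces.

    import pandas as pd

    # Flat ingested events: one row per order line (hypothetical sample data).
    events = pd.DataFrame({
        "order_id": [1, 2, 3],
        "customer": ["alice", "bob", "alice"],
        "product": ["widget", "gadget", "widget"],
        "amount": [9.99, 24.50, 9.99],
    })

    # Dimension tables: one surrogate key per distinct entity.
    dim_customer = (events[["customer"]].drop_duplicates()
                    .reset_index(drop=True).rename_axis("customer_key").reset_index())
    dim_product = (events[["product"]].drop_duplicates()
                   .reset_index(drop=True).rename_axis("product_key").reset_index())

    # Fact table: the measures plus foreign keys into the dimensions.
    fact_orders = (events
                   .merge(dim_customer, on="customer")
                   .merge(dim_product, on="product")
                   [["order_id", "customer_key", "product_key", "amount"]])

    print(fact_orders)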
For Data Scientists:
Jump-start your experiments with a set of solid baseline models
for your classification, regression, active learning, and reinforcement learning tasks.
Easily set up model monitoring and model validation (a minimal sketch follows).
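As an illustration of what a baseline-plus-validation setup could look like, here is a minimal sketch using scikit-learn; the dataset and metric are placeholders for your own, not part of the product.

    from sklearn.datasets import load_breast_cancer
    from sklearn.dummy import DummyClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    # Score a trivial baseline first: any real model has to beat it.
    candidates = [
        ("majority-class baseline", DummyClassifier(strategy="most_frequent")),
        ("logistic regression",
         make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ]
    for name, model in candidates:
        scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
        print(f"{name}: AUC {scores.mean():.3f} +/- {scores.std():.3f}")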
For DataOps:
Easily set up the cluster, with cherry-picked components, libraries, and configurations that are the result of hours of work on existing production projects. Expand, select, and configure the existing data architecture blueprints as necessary. Speed up your provisioning and deployment cycles. Gain more control over your data processes (see the blueprint sketch below).
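A minimal sketch of how blueprint-style configuration could work in practice; the blueprint names and every key in them are hypothetical, for illustration only, and do not reflect an actual product API.

    import copy

    # Hypothetical tested blueprints, each a starting-point configuration.
    BLUEPRINTS = {
        "small-batch": {
            "workers": 2, "worker_memory_gb": 8,
            "components": ["ingestion", "warehouse", "reporting"],
        },
        "ml-serving": {
            "workers": 4, "worker_memory_gb": 16,
            "components": ["ingestion", "feature-store", "model-serving"],
        },
    }

    def configure(blueprint: str, **overrides) -> dict:
        """Start from a tested blueprint, then apply project-specific overrides."""
        config = copy.deepcopy(BLUEPRINTS[blueprint])
        config.update(overrides)
        return config

    print(configure("ml-serving", workers=8))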
Data Pipelines in 10 steps
0) Provisioning
1) ETL: Ingestion
2) ETL: BI
3) ETL: Reporting/Publish
4) ML: Feature Engineering
5) AI: Model Training
6) AI: Model Serving
7) AI: Model Monitoring
8) Apps/Dashboards
9) Security/Access
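To show how these steps compose, here is a minimal sketch of the flow as a linear pipeline of stages; each stage function is a hypothetical placeholder for the real step listed above, with steps 4 through 9 extending the list in the same shape.

    def ingest(ctx):
        # 1) ETL: Ingestion -- pull raw events from a source (stubbed here).
        ctx["raw"] = ["event-1", "event-2", "event-3"]
        return ctx

    def build_facts(ctx):
        # 2) ETL: BI -- reshape raw events into analysis-ready fact rows.
        ctx["facts"] = [{"event": e} for e in ctx["raw"]]
        return ctx

    def publish_report(ctx):
        # 3) ETL: Reporting/Publish -- summarize the facts for consumers.
        ctx["report"] = f"{len(ctx['facts'])} events processed"
        return ctx

    # Steps 4-9 (feature engineering, training, serving, monitoring,
    # dashboards, security) would add further stages to this list.
    PIPELINE = [ingest, build_facts, publish_report]

    def run(pipeline, ctx=None):
        """Run each stage in order, threading a shared context dict through."""
        ctx = ctx if ctx is not None else {}
        for stage in pipeline:
            ctx = stage(ctx)
        return ctx

    print(run(PIPELINE)["report"])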