d6tdev / top5-mistakes-vendors.md
Last active Mar 7, 2019

5 Mistakes Data Vendors Commonly Make

Avoid these common traps and keep your clients happy.

  • Great Sales, Bad Delivery: The sales team closed the deal and now it's time to deliver the actual data. Poorly organized files with no metadata and little documentation are the biggest cause of a frustrating onboarding process.
  • No Quickstart Instructions: You presented a great story and case study, but the client cannot easily verify your claims. They spin their wheels just to recreate what you already did.
  • Reinvent the Wheel: To deliver data to clients, you need to build APIs, FTP servers, S3 buckets, etc., complete with authentication and security. Vendors typically build their own infrastructure, which is not only expensive to build and maintain but also means clients have to build custom pipes.
  • Not Cloud Ready: Is FTP still the only thing you have to offer? Your clients are moving to the cloud and so should you.
  • No Usage Analytics: You sell data so clients can make better decisions yet you have l
d6tdev / effective-datasci-workflows.rst
Last active Mar 18, 2019

How to Build Highly Effective Data Science Workflows

Your current workflow probably chains several functions together like in the example below. While quick, it likely has many problems:

  • it doesn't scale well as you add complexity
  • you have to manually track which functions were run with which parameters
  • you have to manually track where data is saved
  • it's difficult for others to read
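
The example the text refers to is not shown in this listing. As a rough illustration only, a chained workflow of this kind might look like the sketch below. All function names (`get_data`, `preprocess`, `train`, `score`) and the toy data are hypothetical stand-ins, not taken from the original gist.

```python
# Hypothetical sketch of a "chain of functions" workflow.
# None of these names come from the original gist.

def get_data():
    # Stand-in for loading a real dataset: y = 2x + 1.
    return [{"x": float(i), "y": 2.0 * i + 1.0} for i in range(10)]

def preprocess(rows, do_scale=True):
    # Optionally rescale x to the range [0, 1].
    if not do_scale:
        return [dict(r) for r in rows]
    max_x = max(r["x"] for r in rows) or 1.0
    return [{"x": r["x"] / max_x, "y": r["y"]} for r in rows]

def train(rows):
    # Ordinary least squares fit of y = slope * x + intercept.
    n = len(rows)
    mean_x = sum(r["x"] for r in rows) / n
    mean_y = sum(r["y"] for r in rows) / n
    var_x = sum((r["x"] - mean_x) ** 2 for r in rows)
    cov_xy = sum((r["x"] - mean_x) * (r["y"] - mean_y) for r in rows)
    slope = cov_xy / var_x
    return {"slope": slope, "intercept": mean_y - slope * mean_x}

def score(model, rows):
    # Largest absolute prediction error on the given rows.
    return max(
        abs(r["y"] - (model["slope"] * r["x"] + model["intercept"]))
        for r in rows
    )

# The chain itself: quick to write, but nothing records that
# do_scale=True was used, and no intermediate result is saved.
rows = get_data()
clean = preprocess(rows, do_scale=True)
model = train(clean)
error = score(model, clean)
```

Each step depends on the previous one only through variables in the running session, which is why parameters and outputs have to be tracked by hand as the chain grows.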
d6tdev / reasons-why-bad-ml-code.rst
Last active Mar 14, 2019

4 Reasons Why Your Machine Learning Code is Probably Bad

Your current workflow probably chains several functions together like in the example below. While quick, it likely has many problems:

  • it doesn't scale well as you add complexity
  • you have to manually keep track of which functions were run with which parameters
  • you have to manually keep track of where data is saved
  • it's difficult for others to read
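
To make the bookkeeping burden concrete, here is a minimal sketch of what tracking parameters and save locations by hand can look like. The helper names (`output_path`, `run_step`, `scale`) and the file naming convention are assumptions made up for illustration; they are not from the original gist or any library.

```python
import json
import tempfile
from pathlib import Path

def output_path(base_dir, step, params):
    # Hand-rolled convention: encode every parameter in the file name.
    # Every caller has to remember to use it consistently.
    tag = "_".join(f"{key}-{value}" for key, value in sorted(params.items()))
    return Path(base_dir) / f"{step}_{tag}.json"

def run_step(base_dir, step, params, func):
    # Reuse a saved result if this exact parameter set was run before.
    path = output_path(base_dir, step, params)
    if path.exists():
        return json.loads(path.read_text())
    result = func(**params)
    path.write_text(json.dumps(result))
    return result

def scale(factor, n):
    # Hypothetical processing step.
    return [factor * i for i in range(n)]

with tempfile.TemporaryDirectory() as tmp:
    first = run_step(tmp, "scale", {"factor": 2, "n": 3}, scale)
    cached = run_step(tmp, "scale", {"factor": 2, "n": 3}, scale)
    other = run_step(tmp, "scale", {"factor": 3, "n": 3}, scale)
    saved_files = sorted(p.name for p in Path(tmp).iterdir())
```

Even in this small case, one file per parameter combination accumulates, and any step that bypasses `run_step` silently breaks the record of what was run.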
d6tdev / d6tpipe-preview.py
Last active Sep 23, 2018
d6tpipe preview
#****************************************
# d6tpipe preview - client
#****************************************
import d6tpipe.api
import d6tpipe.pipe
import pandas as pd
import dask.dataframe as dd
d6tapi = d6tpipe.api.APIClient(key='9PCCZP5q9eN9abvm',secret='MLpTftafuH3bRAfX')