d6tdev

@d6tdev
d6tdev / d6tpipe-preview.py
Last active September 23, 2018 13:24
d6tpipe preview
#****************************************
# d6tpipe preview - client
#****************************************
import d6tpipe.api
import d6tpipe.pipe
import pandas as pd
import dask.dataframe as dd
d6tapi = d6tpipe.api.APIClient(key='9PCCZP5q9eN9abvm',secret='MLpTftafuH3bRAfX')
d6tdev / reasons-why-bad-ml-code.rst
Last active July 2, 2021 18:28
4 Reasons Why Your Machine Learning Code is Probably Bad

Your current workflow probably chains several functions together like in the example below. While quick, it likely has many problems:

  • it doesn't scale well as you add complexity
  • you have to manually keep track of which functions were run with which parameters
  • you have to manually keep track of where data is saved
  • it's difficult for others to read
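
The original example is not included in this preview. As a rough stand-in, a chained workflow of the kind described might look like the sketch below. The function names (`get_data`, `preprocess`, `fit_model`) and their parameters are hypothetical, chosen only for illustration.

```python
import random


def get_data(n_rows=100, seed=0):
    # Hypothetical data loader: returns (x, y) pairs with y roughly 2 * x.
    rng = random.Random(seed)
    return [(x, 2.0 * x + rng.gauss(0, 0.1)) for x in range(n_rows)]


def preprocess(rows, do_scale=True):
    # Optionally rescale x to the range [0, 1].
    if not do_scale:
        return rows
    x_max = max(x for x, _ in rows) or 1
    return [(x / x_max, y) for x, y in rows]


def fit_model(rows):
    # Least-squares slope through the origin: sum(x * y) / sum(x * x).
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    return sxy / sxx


# The chained workflow: quick to write, but nothing records which
# parameters produced `slope`, and no intermediate output is saved.
slope = fit_model(preprocess(get_data(n_rows=50, seed=1), do_scale=False))
```

Changing `do_scale` or `seed` here means rerunning everything and remembering by hand which settings produced which result, which is the bookkeeping problem the list above describes.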
d6tdev / effective-datasci-workflows.rst
Last active March 18, 2019 17:16
How to Build Highly Effective Data Science Workflows

Your current workflow probably chains several functions together like in the example below. While quick, it likely has many problems:

  • it doesn't scale well as you add complexity
  • you have to manually track which functions were run with which parameters
  • you have to manually track where data is saved
  • it's difficult for others to read
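
One way to reduce the manual tracking listed above is to derive the output location from the function name and its parameters, so that the same parameters always map to the same file. The sketch below uses only the Python standard library. `run_cached`, the cache directory, and the file naming scheme are assumptions made for illustration, not an API from any library mentioned here, and the parameters are assumed to be JSON-serializable.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # hypothetical cache location


def run_cached(func, **params):
    # Build a stable key from the function name and its sorted parameters.
    key = json.dumps({"func": func.__name__, "params": params}, sort_keys=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    path = CACHE_DIR / f"{func.__name__}-{digest}.pkl"
    if path.exists():
        # Same function and parameters were run before: reuse the saved output.
        return pickle.loads(path.read_bytes()), path
    result = func(**params)
    path.write_bytes(pickle.dumps(result))
    return result, path


def make_squares(n=3):
    # Hypothetical stand-in for an expensive processing step.
    return [i * i for i in range(n)]


squares, saved_at = run_cached(make_squares, n=4)
squares_again, saved_again = run_cached(make_squares, n=4)  # loaded from disk
other, other_path = run_cached(make_squares, n=5)  # new parameters, new file
```

Because the file name encodes the parameters, there is no separate record to maintain of which run produced which file.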
d6tdev / top5-mistakes-vendors.md
Last active March 7, 2019 22:31
5 Mistakes Data Vendors Commonly Make

Avoid these common traps and keep your clients happy.

  • Great Sales, Bad Delivery: The sales team closed the deal and it's time to deliver the actual data. Poorly organized files with no metadata and little documentation are the biggest cause of a frustrating onboarding process.
  • No Quickstart Instructions: You presented a great story and case study, but the client cannot easily verify your claims. They spin their wheels just to recreate what you already did.
  • Reinvent the Wheel: To deliver data to clients, you need to build APIs, FTP servers, S3 buckets, etc., complete with authentication and security. Vendors typically build their own infrastructure, which is not only expensive to build and maintain but also means clients have to build custom pipes.
  • Not Cloud Ready: Is FTP still the only thing you have to offer? Your clients are moving to the cloud, and so should you.
  • No Usage Analytics: You sell data so clients can make better decisions, yet you have l