Last active May 24, 2023 05:16
Issues with Data Science
Inappropriate HW/SW stack
Mismatched driver versions
Crash-looping deployment
Data/model versioning [Nick Walsh]
Non-standard images/OS version
Pre-processing code doesn’t match production pre-processing
Production data doesn’t match training/test data
Output of the model doesn’t match application expectations
Hand-coded heuristics perform better than the model [Adam Laiacano]
Model freshness (trained on out-of-date data/input shape changed)
Skew between test and production statistics/population shape
Overfitting on training/test data
Bias introduced (or not tested for)
Over/under-provisioning of HW
Latency issues
Permissions/certs
Failure to obey health checks
Killed production model before rolling out the new one/rolled out in the wrong order
Thundering herd for new model
Logging to the wrong location
Storage for model not allocated properly/not accessible by deployment tooling
Route to artifacts not available for download
API signature changes not propagated/expected
Cross-data-center latency
Expected benefit doesn’t materialize (e.g. multiple components in the app change simultaneously)
Got wrong/no traffic because A/B config didn’t roll out
No CI/CD; manual changes untracked [Jon Peck]
Got too much traffic too soon (expected a canary/exponential rollout)
Outliers not predicted [MikeBSilverman]
Change was a good change, but wasn’t communicated to the rest of the team (so you must roll back)
No dates! (date to measure impact/improvement against a pre-agreed measure; date scheduled to assess data changes) [Mary Branscombe]
LACK OF DOCUMENTATION!! (the problem, the testing, the solution, lots more) [Terry Christiani]
Successful model causes pain elsewhere in the organization (e.g. detecting faults previously missed) [Mark Round]
Lack of visibility into real-time model behavior (detecting data drift, live data distribution vs. training data, etc.) [Nick Walsh]
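
Several items above (production data not matching training data, population skew, data drift) come down to comparing a live feature distribution against the training one. Here is a minimal sketch of one common way to do that, the population stability index (PSI), using only the Python standard library. The function name, the equal-width binning, and the 1e-6 floor are illustrative assumptions rather than anything prescribed by this list.

```python
import math
from typing import Sequence


def population_stability_index(
    train: Sequence[float], live: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the training range."""
    lo, hi = min(train), max(train)
    # Fall back to a width of 1.0 if every training value is identical.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range values into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty (assumed value).
        return [max(c / len(values), 1e-6) for c in counts]

    p = bucket_shares(train)
    q = bucket_shares(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: the live sample is the training sample shifted by 0.5.
train_sample = [i / 100 for i in range(100)]
live_sample = [0.5 + i / 100 for i in range(100)]
drift_score = population_stability_index(train_sample, live_sample)
```

A PSI of 0 means the binned distributions are identical. Teams often alert above a rule-of-thumb threshold such as 0.2, but the right cutoff depends on the feature and should be agreed on in advance.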
Before You Move
----
Bandwidth costs
Speed of insights
De/compression time
Ingestion time/cost
Removing PII
Sanitizing data (from Attacks)
Recording Metadata about Capture
Overloading Network
Changing Security Criteria
Defining a Long-term Schema in Advance (Ewan Leith)
Data Ordering
Distributed Caching Problems
Consistent Deletion / Duplication
Data Residency or Compliance Requirements (Andre)

Owning a Lake
----
Frequency of Loads (Ewan Leith)
Deleting Data on Demand (Ewan Leith)
Export, Integrations including Modeling and Search (Rob M)
Ongoing Maintenance and Pruning (Rob M)
Incremental Weight of Queries
File/Compression Formats
Partition Sizes
Fulfilling a DSR Request (Helena Jackson)
Authentication and Granularity of Permissions (Tymac)
Facilitating Queries (Tim McNamara)
Managing Long Term Responsibility (Jacob O'Farrell)
Centralized Funding Model (Jacob O'Farrell)
Team Ownership of a Central Resource (Jacob O'Farrell)
Orchestrator and Deletion Workers (Torfinn Olsen)
Data Clawbacks (Randall Hunt)
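
Deleting data on demand, fulfilling a DSR request, and running deletion workers all get harder when nobody knows which partitions hold a given person's records. The toy sketch below shows one possible approach: keep an index from subject to partition so a deletion only rewrites the partitions it has to. The in-memory dict of partitions, the `subject_id` field, and both function names are hypothetical stand-ins for a real lake's storage and catalog.

```python
from collections import defaultdict


def build_subject_index(partitions):
    """Map each subject id to the partition keys that hold its records."""
    index = defaultdict(set)
    for key, rows in partitions.items():
        for row in rows:
            index[row["subject_id"]].add(key)
    return index


def delete_subject(partitions, index, subject_id):
    """Rewrite only the partitions containing the subject; return the keys touched."""
    touched = sorted(index.pop(subject_id, set()))
    for key in touched:
        partitions[key] = [
            row for row in partitions[key] if row["subject_id"] != subject_id
        ]
    return touched


# Hypothetical lake with two date partitions.
lake = {
    "2023-05-01": [{"subject_id": "a", "v": 1}, {"subject_id": "b", "v": 2}],
    "2023-05-02": [{"subject_id": "b", "v": 3}],
}
subject_index = build_subject_index(lake)
rewritten = delete_subject(lake, subject_index, "a")
```

In a real lake the rewrite step would be a job over files in a columnar or compressed format, which is where partition sizes and file formats start to matter for deletion cost.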