#datajawn on Twitter
“A story called potatoes”
- The end of human decision making(!)
- Outsourcing of jobs and technology
- Hype
- “We should stop training radiologists” - Geoffrey Hinton
- Reality
- Radiology is still evolving. It’s not a solved problem.
- Self-driving cars
- Hype
- What do we do with the trash that accumulates when people leave the car?
- Reality
- Training a self-driving car model takes a lot of data. Takes a long time.
- So we use CAPTCHA. Humans are in the loop.
- Robots
- Hype
- Robots will kill us
- Reality
- Artificial general intelligence is a long way away
- People on the ground are worried about provisioning S3 buckets, not robot ethics, or linked data, or killer robots
- Problem
- You are a large brick-and-mortar book retailer trying to compete against Amazon
- You have a website built on premises and it doesn’t do recommendations
- Solution
- You heard Google uses deep learning to recommend YouTube videos. So use AI!
- Hari Botter sidebar recommends books you may also enjoy on your website
- A scalable cloud-based application that interfaces with your existing webapp …
- New problems
- Moving data to the cloud
- Study cloud-native architectures
- AWS Glue
- Security and data privacy
- Data breaches and GDPR
- The less data you store, the easier it can be
- Keep windows of history then send to cold storage
- Differential privacy
- Add noise to a sample of user data, for example by tweaking the age or adding more parameters. Your model will still read through the noise.
- Synthetic data creation: GANs
- Model interpretability – Are your recommendations good?
- The Barnes Foundation's model recommended paintings that were too similar
- Self-driving cars tried to classify an object several times before deciding to stop, and hit a person they thought was a bike or a stop sign
- Use varied training data, get users to test and validate your data
- Use simpler models that are easier to interpret
- Moving data to the cloud
- Migrating to the cloud will take at least 15 months
- Citation needed
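The differential privacy note above (add noise to user data and the model still reads through it) can be sketched roughly as follows. This is a minimal illustration and not the speaker's method: the function names, the age bounds of 18 to 90, and the choice of Laplace noise applied to each record are all assumptions made for the example. A real deployment would need a carefully chosen privacy budget (`epsilon`) and a vetted library.

```python
import random
import statistics


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_ages(ages, epsilon, lower=18, upper=90, seed=None):
    """Return a noisy copy of `ages` (hypothetical helper).

    Each age is clamped to [lower, upper] (assumed bounds) and then
    perturbed with Laplace noise of scale (upper - lower) / epsilon.
    Smaller epsilon means more noise and more privacy.
    """
    rng = random.Random(seed)
    scale = (upper - lower) / epsilon
    noisy = []
    for age in ages:
        clamped = min(max(age, lower), upper)
        noisy.append(clamped + laplace_noise(scale, rng))
    return noisy


# Synthetic ages, invented purely for the demonstration.
demo_rng = random.Random(0)
demo_ages = [demo_rng.randint(18, 90) for _ in range(20000)]
demo_noisy = privatize_ages(demo_ages, epsilon=1.0, seed=1)

# Individual records are heavily distorted, but aggregates stay close.
print("true mean: ", round(statistics.mean(demo_ages), 2))
print("noisy mean:", round(statistics.mean(demo_noisy), 2))
```

Any single noisy age is close to useless on its own, while the mean over many users stays near the true value, which is the sense in which a model can "read through the noise".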
Sam Chenkin TechImpact
- Rich datasets
- Need to know how to apply limited resources
- larger companies can afford to have data science on staff
- Prove value of technology
- Understand impact
- Avert crises
- Documentation is only for compliance
- No one trusts technology
- Funders may require data collection but it’s not driven by nonprofit’s needs
Bonnie Kruft GlaxoSmithKline
- Data Build
- Data Use
- Data Strategy
- Index
- Catalog where all the data is
- Build
- Infrastructure for storing, searching, and computing that data
- Genomics data is predicted to match or overtake data from any other domain by 2025
- Easy to access data
- Is the data usable?
- Can you make decisions?
- Do you save time and money?
Corey Chivers Penn Medicine @cjbayesian
Don’t tell you whether you should use a model.
Hard to test true negatives, false positives, etc.
Predict the probability of an outcome, then try to maximize the estimated goodness of outcomes
Evaluate the cost of potential outcomes
- Treat everyone
- Treat no one
- Some other strategy
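The comparison above (treat everyone, treat no one, or some other strategy) can be sketched as an expected-cost calculation. This is only an illustration under simplifying assumptions that are not from the talk: treatment has a fixed cost and fully averts the bad outcome, an untreated patient incurs the outcome cost with their predicted probability, and all names and numbers are hypothetical.

```python
def expected_cost(probs, treat_flags, cost_treat, cost_outcome):
    """Expected total cost of a treatment plan.

    A treated patient costs `cost_treat`; an untreated patient costs
    `cost_outcome` with probability p (assumed model, see lead-in).
    """
    total = 0.0
    for p, treat in zip(probs, treat_flags):
        total += cost_treat if treat else p * cost_outcome
    return total


def compare_strategies(probs, cost_treat, cost_outcome):
    """Return the expected cost of three candidate strategies."""
    # Under the assumed model, treating is worthwhile only when
    # p * cost_outcome exceeds cost_treat.
    threshold = cost_treat / cost_outcome
    plans = {
        "treat_everyone": [True] * len(probs),
        "treat_no_one": [False] * len(probs),
        "treat_above_threshold": [p > threshold for p in probs],
    }
    return {
        name: expected_cost(probs, flags, cost_treat, cost_outcome)
        for name, flags in plans.items()
    }


# Hypothetical predicted probabilities and costs, for illustration only.
predicted = [0.05, 0.2, 0.6, 0.9]
for name, cost in compare_strategies(predicted, cost_treat=1.0, cost_outcome=4.0).items():
    print(f"{name}: {cost:.2f}")
```

The point of the sketch is that the model's value depends on the costs attached to each outcome, not only on its accuracy: with different costs, "treat everyone" or "treat no one" can come out just as cheap as the model-driven strategy.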
Andrew Pawloski Element 84 small engineering firm
Billions of dollars in public funding
- Destroyed in Philadelphia
Your clothes have thousands of threads; your analytics should too
Bruce Marable EmployeeCycle
- Recruiting and Sales follow the same lifecycle
This is how companies track how much customers love them
Track the pulse of how your customer/talent is feeling