- Understand and reduce manageable risks
- Prepare for problems and quickly recover from harm
- Adapt our practices based on the changing context
- Obfuscation & opacity
- Scales poorly
- Causes harm
- Dangerous feedback loops
- Clarity & accountability
- Designed for sustainability
- Mitigates harm & offers redress
- Adapts based on results & context
Security Questions to Ask of Your Big Data Project
Assess the Risk
- What are our objectives, priorities, and assumptions?
- What data & metadata do we have?
- How sensitive is the data? How dangerous is the data? Is it replaceable?
- How valuable is the data? To whom?
- What legal standards apply?
- What rate of false positives is acceptable? Of false negatives?
- What can’t we measure? What are we leaving out?
- What tradeoffs are we making?
- Have we codified biases, injustices, or faulty assumptions into our model?
- Is there noise, malicious activity, or false information in our training data?
- Are we using obfuscation to hide sloppy data, processes, or proxies?
- Who will want access to the data? What will they do to try to get it?
- How does our system fail? Safely? Securely? Fairly?
- Who or what makes the final decision in the model? Is there a safety lever?
- Are the incentives we are creating aligned with our objectives?
- Will abuse be possible for those who gain insider knowledge?
- How are we providing routine maintenance, updates, and cleanup?
- How are we separating, storing, and transmitting information?
- How do we determine permissions? Are they granular enough?
- Are we generating logs & alerts to detect failures and misuse?
- Are we routinely testing our models and systems for reliability and safety?
- Do we have an end-of-life process for our systems and data?
- Do we have a challenge, redress, and/or opt-out process?
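The questions above about acceptable false positive and false negative rates, and about codified bias, are easier to answer once the rates are actually measured. The following is a minimal sketch, assuming a binary classifier whose outcomes can be compared against known ground truth. The function names `error_rates` and `error_rates_by_group` are hypothetical and exist only for this illustration. Which rates are acceptable remains a judgment call for the project, not something the code decides.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence


def error_rates(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Dict[str, Optional[float]]:
    """Return false positive and false negative rates for binary outcomes.

    A rate is None when it is undefined (no negatives or no positives).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = len(actual) - negatives

    return {
        "false_positive_rate": false_pos / negatives if negatives else None,
        "false_negative_rate": false_neg / positives if positives else None,
    }


def error_rates_by_group(
    actual: Sequence[bool],
    predicted: Sequence[bool],
    groups: Sequence[Hashable],
) -> Dict[Hashable, Dict[str, Optional[float]]]:
    """Compute the same rates separately for each group label.

    Large gaps between groups are one possible signal of codified bias.
    """
    if not (len(actual) == len(predicted) == len(groups)):
        raise ValueError("all inputs must have the same length")

    buckets = defaultdict(lambda: ([], []))
    for a, p, g in zip(actual, predicted, groups):
        buckets[g][0].append(a)
        buckets[g][1].append(p)

    return {g: error_rates(a, p) for g, (a, p) in buckets.items()}
```

Comparing the per-group output against the overall rates gives a concrete starting point for the tradeoff discussion, though it cannot show what was left unmeasured.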
Auditing Your Project
- Can we validate how a specific decision was made?
- Can we validate groupings, classifications, segmentation, etc.?
- Can we validate our assumptions? Expected outcomes? Predictions?
- What unintended consequences have we observed?
- Has this created any perverse incentives? Feedback loops? Echo chambers?
- How are people attempting to game the system? Are we catching them?
- Have we used any proxies to get around legal challenges?
- Is this funded at an appropriate level to keep it safe?
Reminder: Adapt based on the auditing results, and repeat the cycle for changes.
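One way to make "Can we validate how a specific decision was made?" answerable is to log enough about each decision to replay it later. The sketch below is an illustration under stated assumptions, not a prescribed design. It assumes a score-and-threshold model with "approve"/"deny" outcomes, and the names `DecisionRecord`, `fingerprint`, and `replay_matches` are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionRecord:
    """Everything needed to explain and replay one automated decision."""
    decision_id: str
    model_version: str
    inputs: dict          # assumed to be JSON-serializable
    score: float
    threshold: float
    outcome: str          # "approve" or "deny" in this sketch
    timestamp: str        # supplied by the caller, e.g. ISO 8601


def fingerprint(record: DecisionRecord) -> str:
    """Return a SHA-256 digest of the record, to help detect later edits."""
    payload = json.dumps(asdict(record), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def replay_matches(
    record: DecisionRecord, score_fn: Callable[[dict], float]
) -> bool:
    """Re-score the logged inputs and check that the outcome is unchanged."""
    score = score_fn(record.inputs)
    outcome = "approve" if score >= record.threshold else "deny"
    return outcome == record.outcome
```

A replay that no longer matches suggests that the model, the threshold, or the logged data has changed since the decision was made, which is itself worth auditing.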
- Algorithmic Justice League
- "Your Data Is Being Manipulated"
- Threat Modeling: Designing for Security
- Haunted by Data
- Weapons of Math Destruction
- The Financial Modelers' Manifesto/Modelers' Hippocratic Oath
- I Am the Cavalry
- AI Security Resources
- Stop Data Mining Me: Opt-Out List
- How Adversarial Attacks Work
- The Problem with Building a "Fair" System
- Q: Why Do Keynote Speakers Keep Suggesting That Improving Security Is Possible? A: Because Keynote Speakers Make Bad Life Decisions and Are Poor Role Models
- An Ethics Checklist for Data Scientists