
@stephlocke
Created September 20, 2018 16:08
Some links and resources around ethical data science

Ethical data science

Intro

Who I am

Purpose / Agenda

Problems caused

Facial recognition

Justice system

Access to finance

Ethical obligation

Personal

Corporate

Professional bodies

Planning ethical data science

Research and impact assessments

Defining fairness

Technical challenges

Data

Simpson's paradox

Testing for bias

Conclusion

Summary

Next steps

http://research.google.com/bigpicture/attacking-discrimination-in-ml/ https://arxiv.org/abs/1610.02413

https://anatomyof.ai/

responsibilities

human subject research

The three principles of ethical human subjects research are:

  • Respect for Persons: People should be treated as autonomous individuals and persons with diminished autonomy (like children or prisoners) are entitled to protection.
  • Beneficence: 1) Do not harm and 2) maximize possible benefits and minimize possible harms.
  • Justice: Both the risks and benefits of research should be distributed equally.

https://makingnoiseandhearingthings.com/2018/08/31/what-you-can-cant-and-shouldnt-do-with-social-media-data/ https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html

https://medium.com/s/story/data-violence-and-how-bad-engineering-choices-can-damage-society-39e44150e1d4

Diverse experts will result in diverse models: http://www.humansforai.com/

https://www.bloomberg.com/view/articles/2018-06-27/here-s-how-not-to-improve-public-schools

Ethics in tech workshop: https://www.scu.edu/ethics-in-technology-practice/

Planning: https://ainowinstitute.org/aiareport2018.pdf https://www.gov.uk/government/publications/data-ethics-framework/data-ethics-framework

data

  • contracts
  • legislation
  • reasonable persons
  • anonymisation
  • sharing

evernote - data scientists see your data

google juicer - https://storage.googleapis.com/pub-tools-public-publication-data/pdf/f7ca97121ebbf35dafbcd1acbde12ff5a2b51134.pdf

mental models of privacy https://storage.googleapis.com/pub-tools-public-publication-data/pdf/44643.pdf

https://makingnoiseandhearingthings.files.wordpress.com/2018/08/screenshot-from-2018-07-25-16-10-21.png?w=1024 Fiesler, C., & Proferes, N. (2018). “Participant” Perceptions of Twitter Research Ethics. Social Media+ Society, 4(1), 2056305118763366. Table 4.

https://ai.googleblog.com/2018/09/introducing-inclusive-images-competition.html

models

  • adversaries
  • reverse engineering
  • interpretability
  • privacy

PATE Private Aggregation of Teacher Ensembles https://storage.googleapis.com/pub-tools-public-publication-data/pdf/0e08bda44d22e076d15edc45afcb2e1a7a231a84.pdf

https://ai-global.org/ ? https://github.com/ropenscilabs/proxy-bias-vignette/blob/master/EthicalMachineLearning.ipynb

https://www.districtdatalabs.com/fairness-and-bias-in-algorithms/

blue team and red team concepts?

https://arxiv.org/abs/1805.02400

stereotypes and biases

https://www.entrepreneur.com/article/319228

https://github.com/pymetrics/audit-ai https://unfiltered.news/about.html https://www.ajlunited.org/

Simpson's paradox

  • mix effects: "the fact that aggregate numbers can be affected by changes in the relative size of the subpopulations as well as the relative values within those subpopulations" https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42901.pdf

Armstrong, Z., & Wattenberg, M. (2014). Visualizing Statistical Mix Effects and Simpson's Paradox. Proceedings of IEEE InfoVis 2014.
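A mix effect is easiest to see with numbers. The sketch below is a minimal illustration, not taken from the paper: the groups `A` and `B`, the two segments, and every count are invented. Group A has the higher approval rate inside each segment, yet the lower rate overall, because most of A's applicants sit in the segment where approvals are rare.

```python
from collections import defaultdict

# Hypothetical (approved, applicants) counts, invented for illustration only.
counts = {
    ("A", "segment_1"): (90, 100),
    ("B", "segment_1"): (800, 1000),
    ("A", "segment_2"): (300, 1000),
    ("B", "segment_2"): (20, 100),
}


def segment_rates(counts):
    """Approval rate for every (group, segment) pair."""
    return {key: approved / total for key, (approved, total) in counts.items()}


def aggregate_rates(counts):
    """Approval rate per group after pooling all segments together."""
    totals = defaultdict(lambda: [0, 0])
    for (group, _segment), (approved, total) in counts.items():
        totals[group][0] += approved
        totals[group][1] += total
    return {group: approved / total for group, (approved, total) in totals.items()}


by_segment = segment_rates(counts)
overall = aggregate_rates(counts)

for (group, segment), value in sorted(by_segment.items()):
    print(f"{segment} group {group}: {value:.1%}")
for group, value in sorted(overall.items()):
    print(f"overall group {group}: {value:.1%}")
```

Comparing the per-segment lines with the overall lines shows the reversal. Which view is the "right" one depends on whether the segment split is itself part of the problem being investigated.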

"fairness"

  • group unaware - same cutoff points / decision boundary. Will encounter problems if protected characteristics or proxies are included
  • group thresholds - different cutoff points to allow more of a class in. Open to positive discrimination complaints.
  • demographic parity - acceptance rate determined to end up with a portfolio makeup that resembles overall demographic proportions. Can result in more rejections in majority cases
  • equal opportunity - the same true positive rate holds across groups
  • equal accuracy - the same overall accuracy rate holds across groups
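The definitions above can be compared on the same set of scores. This is a rough sketch under stated assumptions: `group_metrics` and `gap` are hypothetical helpers written for this note (they are not the API of fairlearn, audit-ai, or the What-If Tool), and the scores, labels, and groups `x` and `y` are made up. A single shared cutoff corresponds to "group unaware"; a cutoff per group corresponds to "group thresholds"; the three reported gaps correspond to demographic parity, equal opportunity, and equal accuracy.

```python
def group_metrics(scores, labels, groups, thresholds):
    """Per-group acceptance rate, true positive rate, and accuracy.

    thresholds maps each group to its cutoff. Using the same cutoff for
    every group is the "group unaware" setting.
    """
    results = {}
    for g in sorted(set(groups)):
        rows = [(s, y) for s, y, gg in zip(scores, labels, groups) if gg == g]
        preds = [s >= thresholds[g] for s, _ in rows]
        actual = [y == 1 for _, y in rows]
        positives = sum(actual)
        true_pos = sum(p and a for p, a in zip(preds, actual))
        results[g] = {
            "acceptance_rate": sum(preds) / len(rows),
            "true_positive_rate": true_pos / positives if positives else float("nan"),
            "accuracy": sum(p == a for p, a in zip(preds, actual)) / len(rows),
        }
    return results


def gap(metrics, name):
    """Largest difference between groups for one metric (0 means parity)."""
    values = [m[name] for m in metrics.values()]
    return max(values) - min(values)


# Made-up model scores, true outcomes, and group membership.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.7, 0.55, 0.45, 0.35, 0.2]
labels = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0]
groups = ["x"] * 5 + ["y"] * 5

unaware = group_metrics(scores, labels, groups, {"x": 0.6, "y": 0.6})
per_group = group_metrics(scores, labels, groups, {"x": 0.6, "y": 0.45})

for name in ("acceptance_rate", "true_positive_rate", "accuracy"):
    print(name, round(gap(unaware, name), 3), round(gap(per_group, name), 3))
```

In this toy data a lower cutoff for group `y` happens to close all three gaps at once. That is a property of the invented numbers; on real data the criteria often pull in different directions, which is why the definition of fairness has to be chosen up front.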

http://research.google.com/bigpicture/attacking-discrimination-in-ml/ https://pair-code.github.io/what-if-tool/uci.html

https://github.com/Microsoft/fairlearn

general next readings

  • https://docs.google.com/spreadsheets/d/1deoFXnuTZ4RNlToxXoLxZMZXrYNISRtTjpYJP-WLAAM/edit#gid=369486359
  • http://geni.us/autoinequality
  • http://geni.us/mathdestruction
  • http://dmgreene.net/wp-content/uploads/2018/09/Greene-Hoffman-Stark-Better-Nicer-Clearer-Fairer-HICSS-Final-Submission.pdf
  • https://www.aitruth.org/reading-list
  • https://arxiv.org/ftp/arxiv/papers/1802/1802.07228.pdf
  • Coming soon: https://medium.com/ai4allorg/ai4all-open-learning-brings-free-and-accessible-ai-education-online-with-the-support-of-google-org-3a6360c135c9
  • MOOC: https://www.coursera.org/lecture/data-science-ethics/data-science-needs-ethics-Ozf9b
