stephlocke/ethicaldatascience.md

## ethicaldatascience.md

      
Ethical data science

- Intro
  - Who I am
  - Purpose / Agenda
- Problems caused
  - Facial recognition
  - Justice system
  - Access to finance
- Ethical obligation
  - Personal
  - Corporate
  - Professional bodies
- Planning ethical data science
  - Research and impact assessments
  - Defining fairness
- Technical challenges
  - Data
  - Simpson's paradox
  - Testing for bias
- Conclusion
  - Summary
  - Next steps

http://research.google.com/bigpicture/attacking-discrimination-in-ml/
https://arxiv.org/abs/1610.02413
https://anatomyof.ai/
responsibilities

human subject research
The three principles of ethical human subjects research are:
Respect for Persons: People should be treated as autonomous individuals and persons with diminished autonomy (like children or prisoners) are entitled to protection.
Beneficence: 1) Do not harm and 2) maximize possible benefits and minimize possible harms.
Justice: Both the risks and benefits of research should be distributed equally.
https://makingnoiseandhearingthings.com/2018/08/31/what-you-can-cant-and-shouldnt-do-with-social-media-data/
https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html
https://medium.com/s/story/data-violence-and-how-bad-engineering-choices-can-damage-society-39e44150e1d4
Diverse experts will result in diverse models: http://www.humansforai.com/
https://www.bloomberg.com/view/articles/2018-06-27/here-s-how-not-to-improve-public-schools
Ethics in tech workshop: https://www.scu.edu/ethics-in-technology-practice/
Planning:
https://ainowinstitute.org/aiareport2018.pdf
https://www.gov.uk/government/publications/data-ethics-framework/data-ethics-framework
data


contracts
legislation
reasonable persons
anonymisation
sharing

Evernote - data scientists see your data
Google Juicer - https://storage.googleapis.com/pub-tools-public-publication-data/pdf/f7ca97121ebbf35dafbcd1acbde12ff5a2b51134.pdf
mental models of privacy https://storage.googleapis.com/pub-tools-public-publication-data/pdf/44643.pdf
https://makingnoiseandhearingthings.files.wordpress.com/2018/08/screenshot-from-2018-07-25-16-10-21.png?w=1024
Fiesler, C., & Proferes, N. (2018). “Participant” Perceptions of Twitter Research Ethics. Social Media+ Society, 4(1), 2056305118763366. Table 4.
https://ai.googleblog.com/2018/09/introducing-inclusive-images-competition.html
models


adversaries
reverse engineering
interpretability
privacy

PATE Private Aggregation of Teacher Ensembles https://storage.googleapis.com/pub-tools-public-publication-data/pdf/0e08bda44d22e076d15edc45afcb2e1a7a231a84.pdf
https://ai-global.org/ ?
https://github.com/ropenscilabs/proxy-bias-vignette/blob/master/EthicalMachineLearning.ipynb
https://www.districtdatalabs.com/fairness-and-bias-in-algorithms/
blue team and red team concepts?

https://arxiv.org/abs/1805.02400
stereotypes and biases


Watson and Einstein vs Siri and Cortana
linguistic - speech recognition amongst Black Americans, for instance
usage patterns - shared devices in South Asia
https://ai.google/research/pubs/pub47247
facial recognition https://blogs.microsoft.com/ai/gender-skin-tone-facial-recognition-improvement/
https://www.media.mit.edu/projects/gender-shades/overview/

https://www.entrepreneur.com/article/319228
https://github.com/pymetrics/audit-ai
https://unfiltered.news/about.html
https://www.ajlunited.org/
Simpson's paradox


mix effects: "the fact that aggregate numbers can be affected by changes in the relative size of the subpopulations as well as the relative values within those subpopulations"
https://storage.googleapis.com/pub-tools-public-publication-data/pdf/42901.pdf
@inproceedings{42901,
title	= {Visualizing Statistical Mix Effects and Simpson's Paradox},
author	= {Zan Armstrong and Martin Wattenberg},
year	= {2014},
booktitle	= {Proceedings of IEEE InfoVis 2014}
}
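
A minimal sketch of the mix effect quoted above, using made-up approval counts (the group names, segment names and numbers are all assumptions for illustration, not data from the paper). Group A has the higher approval rate inside every segment, yet the lower rate overall, because most of its applications fall in the segment where approvals are rare.

```python
# Hypothetical counts: (approved, applicants) per group and segment.
counts = {
    "A": {"small_loans": (90, 100), "large_loans": (300, 1000)},
    "B": {"small_loans": (800, 1000), "large_loans": (20, 100)},
}


def segment_rates(counts):
    """Approval rate for each group within each segment."""
    return {
        group: {seg: approved / total for seg, (approved, total) in segs.items()}
        for group, segs in counts.items()
    }


def aggregate_rates(counts):
    """Approval rate for each group with the segments pooled together."""
    rates = {}
    for group, segs in counts.items():
        approved = sum(a for a, _ in segs.values())
        total = sum(t for _, t in segs.values())
        rates[group] = approved / total
    return rates


by_segment = segment_rates(counts)
overall = aggregate_rates(counts)

for group in counts:
    print(group, by_segment[group], f"overall={overall[group]:.3f}")
```

Checking only the aggregate rate here would suggest group A is disadvantaged, while checking only the segments would suggest the opposite, so a bias test probably needs to report both views.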

"fairness"


group unaware - same cutoff points / decision boundary. Will encounter problems if protected characteristics or proxies are included
group thresholds - different cutoff points to allow more of a class in. Open to positive discrimination complaints.
demographic parity - acceptance rate determined to end up with a portfolio makeup that resembles overall demographic proportions. Can result in more rejections in majority cases
equal opportunity - the same true positive rate holds across groups
equal accuracy - the same overall accuracy rate holds across groups
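
The definitions above can be compared on a toy example. This is a rough sketch under stated assumptions: the group labels, scores and outcomes are invented, and `group_metrics` is a hypothetical helper, not part of any of the linked tools. It reports, per group, the acceptance rate (what demographic parity compares), the true positive rate (equal opportunity) and the accuracy (equal accuracy) for a given set of cutoffs.

```python
from collections import defaultdict

# Hypothetical scored applicants: (group, model score, actually repaid?)
records = [
    ("blue", 0.9, True), ("blue", 0.7, True),
    ("blue", 0.6, False), ("blue", 0.3, False),
    ("orange", 0.8, True), ("orange", 0.4, True),
    ("orange", 0.55, False), ("orange", 0.2, False),
]


def group_metrics(records, thresholds):
    """Per-group acceptance rate, true positive rate and accuracy."""
    tallies = defaultdict(
        lambda: {"n": 0, "accepted": 0, "positives": 0, "tp": 0, "correct": 0}
    )
    for group, score, repaid in records:
        accept = score >= thresholds[group]
        t = tallies[group]
        t["n"] += 1
        t["accepted"] += int(accept)
        t["positives"] += int(repaid)
        t["tp"] += int(accept and repaid)
        t["correct"] += int(accept == repaid)
    return {
        group: {
            "acceptance_rate": t["accepted"] / t["n"],
            # Undefined when a group has no positive outcomes at all.
            "true_positive_rate": (
                t["tp"] / t["positives"] if t["positives"] else float("nan")
            ),
            "accuracy": t["correct"] / t["n"],
        }
        for group, t in tallies.items()
    }


# Group unaware: one cutoff for everyone.
unaware = group_metrics(records, {"blue": 0.5, "orange": 0.5})
# Group thresholds: a lower cutoff for the orange group.
adjusted = group_metrics(records, {"blue": 0.5, "orange": 0.4})

for name, result in [("group unaware", unaware), ("group thresholds", adjusted)]:
    for group, metrics in result.items():
        print(name, group, metrics)
```

The toy data is chosen so that lowering the orange cutoff happens to equalise all three measures at once. Real data will rarely line up this neatly, which is why the choice of definition has to be made explicitly.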

http://research.google.com/bigpicture/attacking-discrimination-in-ml/
https://pair-code.github.io/what-if-tool/uci.html
https://github.com/Microsoft/fairlearn
general next readings

https://docs.google.com/spreadsheets/d/1deoFXnuTZ4RNlToxXoLxZMZXrYNISRtTjpYJP-WLAAM/edit#gid=369486359
http://geni.us/autoinequality
http://geni.us/mathdestruction
http://dmgreene.net/wp-content/uploads/2018/09/Greene-Hoffman-Stark-Better-Nicer-Clearer-Fairer-HICSS-Final-Submission.pdf
https://www.aitruth.org/reading-list
https://arxiv.org/ftp/arxiv/papers/1802/1802.07228.pdf
Coming soon: https://medium.com/ai4allorg/ai4all-open-learning-brings-free-and-accessible-ai-education-online-with-the-support-of-google-org-3a6360c135c9
MOOC: https://www.coursera.org/lecture/data-science-ethics/data-science-needs-ethics-Ozf9b