# Distrust measures

## Data Quality

| Metric | Description |
| --- | --- |
| Completeness | Measures the proportion of missing data in a dataset. A dataset with a low percentage of missing data is considered to be of higher quality (see the sketch after this table). |
| Validity | Measures whether the data in a dataset is accurate and conforms to a set of predefined rules or constraints. |
| Consistency | Measures whether the data in a dataset is consistent with other data sources. |
| Timeliness | Measures how recent the data in a dataset is. A dataset with more recent data is considered to be of higher quality. |
| Uniqueness | Measures whether the data in a dataset is unique or duplicated. |
| Accuracy | Measures the degree to which the data in a dataset is free from errors or inaccuracies. |
| Precision and Recall | Evaluate the performance of a model. Precision measures the proportion of true positive predictions out of all positive predictions, and recall measures the proportion of true positive predictions out of all actual positive cases. |
| F1-Score | A weighted harmonic mean of precision and recall, where the best value is 1.0 and the worst value is 0.0. |
| Gini Coefficient | A measure of inequality in a dataset, where 0 represents total equality and 1 represents total inequality. |
| Entropy | A measure of the disorder or randomness in a dataset. |
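
A minimal sketch of two of the metrics above (completeness and F1-score), assuming pandas and scikit-learn are available; the column names and toy values are purely illustrative, not from the original table.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy dataset with some missing cells (illustrative values only).
df = pd.DataFrame({
    "age":    [34, 29, None, 41, 38],
    "income": [52000, None, 61000, None, 45000],
})

# Completeness: share of cells that are not missing (1.0 = no gaps).
completeness = 1.0 - df.isna().sum().sum() / df.size
print(f"completeness: {completeness:.2f}")

# Precision, recall, and F1 for a toy set of binary predictions.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1])
print(f"precision: {precision_score(y_true, y_pred):.2f}")
print(f"recall:    {recall_score(y_true, y_pred):.2f}")
print(f"f1:        {f1_score(y_true, y_pred):.2f}")
```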

## Data Bias

| Metric | Description |
| --- | --- |
| Disparate Impact | Measures the difference in outcomes between groups in a dataset. For example, if a dataset is used to make a decision and a certain group is disproportionately affected by that decision, the dataset may be considered biased. |
| Group Fairness | Measures whether a model's performance is similar across different groups in a dataset. For example, if a model's accuracy is significantly lower for one group than for others, it may be considered biased (see the sketch after this table). |
| Equal Opportunity | Measures whether a model's true positive rate is similar across different groups in a dataset. For example, if a model's true positive rate is significantly lower for one group than for others, it may be considered biased. |
| Individual Fairness | Measures whether similar individuals are treated similarly by a model. |
| Parity | Measures whether a model has similar performance for different groups in a dataset. |
| Theil Index | A measure of the economic inequality in a dataset. |
| Mutual Information | A measure of the association between variables in a dataset. |
| Gini Coefficient | A measure of the inequality in a dataset. |
| Covariate Shift | A measure of the difference in the distribution of input features between the training and test datasets. |
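
A rough illustration of the group fairness metric above, assuming pandas; the groups, labels, and predictions are toy values invented for the example.

```python
import pandas as pd

# Toy predictions for two groups defined by a sensitive attribute.
df = pd.DataFrame({
    "group":  ["A", "A", "A", "B", "B", "B"],
    "y_true": [1, 0, 1, 1, 0, 0],
    "y_pred": [1, 0, 1, 0, 0, 1],
})

# Per-group accuracy; a large gap between groups suggests biased behaviour.
accuracy_by_group = (
    df.assign(correct=lambda d: d["y_true"] == d["y_pred"])
      .groupby("group")["correct"]
      .mean()
)
print(accuracy_by_group)
print("accuracy gap:", accuracy_by_group.max() - accuracy_by_group.min())
```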

## Data Privacy

| Metric | Description |
| --- | --- |
| k-anonymity | Measures the degree to which individuals in a dataset are indistinguishable from at least k-1 other individuals in the same dataset, in terms of the quasi-identifiers (see the sketch after this table). |
| l-diversity | Measures the degree to which the sensitive attribute values are diverse within each equivalence class formed by k-anonymity. |
| t-closeness | Measures the degree to which the distribution of the sensitive attribute values in a dataset is similar to the distribution of the sensitive attribute values in the entire population. |
| Differential privacy | Measures the degree to which a dataset preserves the privacy of individuals by adding noise to the data in such a way that individual-level information is not revealed. |
| Re-identification risk | Measures the probability that an individual in a dataset can be re-identified by a malicious attacker using external information. |
| Information loss | A measure of the amount of information lost in a dataset after applying privacy-preserving techniques such as anonymization or generalization. |
| Entropy | A measure of the randomness or uncertainty in a dataset. |
| Mutual information | A measure of the association between variables in a dataset. |
| Traceability | A measure of the degree to which an individual can be traced back in a dataset. |
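
A small sketch of checking k-anonymity and l-diversity with pandas; the quasi-identifier columns and the toy records are assumptions made for the example, not part of the original table.

```python
import pandas as pd

# Toy records: zip_code and age_band are quasi-identifiers,
# diagnosis is the sensitive attribute.
df = pd.DataFrame({
    "zip_code":  ["12345", "12345", "12345", "67890", "67890"],
    "age_band":  ["30-39", "30-39", "30-39", "40-49", "40-49"],
    "diagnosis": ["flu", "cold", "flu", "asthma", "flu"],
})

quasi_identifiers = ["zip_code", "age_band"]

# k is the size of the smallest equivalence class over the quasi-identifiers.
k = df.groupby(quasi_identifiers).size().min()
print(f"dataset is {k}-anonymous over {quasi_identifiers}")

# l is the number of distinct sensitive values in the least diverse class.
l = df.groupby(quasi_identifiers)["diagnosis"].nunique().min()
print(f"dataset is {l}-diverse for 'diagnosis'")
```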

## Data Fairness

| Metric | Description |
| --- | --- |
| Disparate impact | Measures the ratio of the positive rate of a sensitive attribute between the privileged and unprivileged groups. A value of 1 indicates no disparity, while values less than 1 indicate that the unprivileged group is more likely to be negatively impacted (see the sketch after this table). |
| Demographic parity | Measures the degree to which the distribution of the positive outcome of a sensitive attribute is the same across different groups. |
| Equal opportunity | Measures the degree to which the true positive rate of the sensitive attribute is the same across different groups. |
| Theil index | Measures the degree to which the distribution of the sensitive attribute is unequal across different groups. |
| Gini coefficient | Measures the degree of inequality in the distribution of the sensitive attribute across different groups. |
| False Positive Rate Difference (FPRD) | Measures the difference in false positive rates across different groups. |
| False Negative Rate Difference (FNRD) | Measures the difference in false negative rates across different groups. |
| Statistical Parity Difference (SPD) | Measures the difference in the rate of favorable outcomes across different groups. |
| Equalized odds | Measures the degree to which the true positive rate and the false positive rate are the same across different groups. |
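
A hedged sketch of two of the metrics above, disparate impact and statistical parity difference, assuming pandas; the group labels and outcomes are toy values rather than anything from the original gist.

```python
import pandas as pd

# Toy binary outcomes (1 = favorable) for a privileged and an unprivileged group.
df = pd.DataFrame({
    "group":   ["privileged"] * 4 + ["unprivileged"] * 4,
    "outcome": [1, 1, 1, 0, 1, 0, 0, 0],
})

rates = df.groupby("group")["outcome"].mean()
p_priv, p_unpriv = rates["privileged"], rates["unprivileged"]

# Disparate impact: 1.0 means parity; values well below 1 indicate disparity.
print("disparate impact:", p_unpriv / p_priv)

# Statistical parity difference: 0.0 means parity.
print("statistical parity difference:", p_unpriv - p_priv)
```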

## Data Robustness

| Metric | Description |
| --- | --- |
| Adversarial robustness | Measures the ability of a model to resist adversarial attacks, such as adding small perturbations to the input data in an attempt to mislead the model. |
| Distribution shift robustness | Measures the ability of a model to perform well when the distribution of the data it is tested on is different from the distribution of the data it was trained on. |
| Out-of-distribution robustness | Measures the ability of a model to detect and handle data that is outside the distribution of the data it was trained on. |
| Test-time augmentation robustness | Measures the ability of a model to perform well when additional data augmentations are applied at test time. |
| Generalization error | Measures the gap between the training and test error; the smaller the gap, the more robust the model (see the sketch after this table). |
| Confidence-calibrated error | Measures the difference between the predicted probability and the true positive rate; the smaller the difference, the better calibrated and more robust the model. |
| Model robustness score | A numerical robustness score generated by some toolkits, such as IBM's AI Fairness 360. |
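
A minimal illustration of the generalization-error gap above, assuming scikit-learn; the synthetic dataset and the choice of classifier are arbitrary stand-ins, and only the pattern matters.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, split into train and test sets.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

train_error = 1.0 - model.score(X_train, y_train)
test_error = 1.0 - model.score(X_test, y_test)

# A small gap suggests the model generalizes well; a large gap suggests
# it is less robust to data it has not seen before.
print(f"generalization gap: {test_error - train_error:.3f}")
```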

## Data Transparency

| Metric | Description |
| --- | --- |
| Model interpretability | Measures the ability of the model to provide clear explanations for its predictions, such as through the use of feature importance or attribution methods. |
| Data lineage | Measures the ability to trace the origins of the data and the transformations applied to it. |
| Data provenance | Measures the ability to track the entire lifecycle of the data, including data sourcing, data preparation, data usage, and data archiving. |
| Data governance | Measures the ability to control, manage, and protect the data. |
| Data quality | Measures the ability of the data to meet the needs of its intended use, such as completeness, accuracy, and consistency. |
| Data documentation | Measures the ability to provide clear and concise documentation about the data, such as data dictionaries, data definitions, and data quality reports (see the sketch after this table). |
| Model auditability | Measures the ability to provide insights into the internal workings of the model and its decision-making process. |
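
One way to make the data documentation idea above concrete is to derive a minimal data dictionary (column name, type, missing share, example value) directly from a DataFrame; this is a sketch with an invented dataset, not a prescribed format.

```python
import pandas as pd

# Toy dataset standing in for a real table that needs documenting.
df = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "signup_date": pd.to_datetime(["2023-01-05", "2023-02-11", None]),
    "plan":        ["basic", "pro", "basic"],
})

# A minimal data dictionary: one row per column of the source data.
data_dictionary = pd.DataFrame({
    "dtype":         df.dtypes.astype(str),
    "missing_share": df.isna().mean(),
    "example":       df.iloc[0],
})
print(data_dictionary)
```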

## Data Explainability

| Metric | Description |
| --- | --- |
| Feature importance | Measures the importance of each feature in the dataset in relation to the target variable. It helps to understand which features are driving the predictions of a model (see the sketch after this table). |
| Model interpretability | Measures the ability of the model to provide clear explanations for its predictions, such as through the use of feature importance or attribution methods. |
| Attribution methods | Measure the ability to understand the contribution of each feature to the model's predictions, such as through techniques like LIME, SHAP, and Integrated Gradients. |
| Model auditability | Measures the ability to provide insights into the internal workings of the model and its decision-making process, such as through techniques like decision trees, rule sets, and decision lists. |
| Proximity measures | Measure the similarity of a sample to the samples of a particular class, such as through techniques like k-nearest neighbors and decision boundary visualization. |
| Global sensitivity measures | Measure how strongly a model's predictions respond to small changes in the input data, such as through techniques like adversarial examples and sensitivity analysis. |
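
A hedged example of the feature importance metric above using permutation importance from scikit-learn; the dataset and model are toy stand-ins, and attribution tools such as SHAP or LIME would be applied in a similar spirit.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Public toy dataset and a simple model, used only to illustrate the workflow.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much test accuracy drops when a feature's
# values are shuffled, giving a model-agnostic view of what drives predictions.
result = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
ranking = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranking[:5]:
    print(f"{name}: {score:.3f}")
```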