Mantej Singh Gill MantejGill

## distrust_measures.csv
Distrust Measure,Metric,Description
Data Quality,Completeness,Measures the proportion of missing data in a dataset. A dataset with a low percentage of missing data is considered to be of higher quality.
,Validity,Measures whether the data in a dataset conforms to a set of predefined rules or constraints.
,Consistency,Measures whether the data in a dataset is consistent with other data sources.
,Timeliness,Measures how recent the data in a dataset is. A dataset with more recent data is considered to be of higher quality.
,Uniqueness,Measures whether the data in a dataset is unique or duplicated.
,Accuracy,Measures the degree to which the data in a dataset is free from errors or inaccuracies.
,Precision and Recall,"Evaluates the performance of a model. Precision measures the proportion of true positive predictions out of all positive predictions, and recall measures the proportion of true positive predictions out of all actual positive cases."
,F1-Score,"A weighted harmonic mean of precision and
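
The completeness, uniqueness, and precision/recall rows above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the function names `completeness`, `uniqueness`, and `precision_recall` are hypothetical, missing values are assumed to be represented by `None`, and labels are assumed to be binary (1 = positive, 0 = negative). None of this comes from the original files.

```python
from typing import Hashable, Optional, Sequence, Tuple


def completeness(values: Sequence[Optional[object]]) -> float:
    """Fraction of entries that are present (assumes None marks a missing value)."""
    if not values:
        return 1.0  # an empty column has nothing missing
    present = sum(v is not None for v in values)
    return present / len(values)


def uniqueness(values: Sequence[Hashable]) -> float:
    """Fraction of entries that are distinct; 1.0 means no duplicates."""
    if not values:
        return 1.0
    return len(set(values)) / len(values)


def precision_recall(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN), for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    # Return 0.0 when a denominator is zero (a convention chosen for this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    print(completeness([1, None, 3, 4]))
    print(uniqueness(["a", "b", "a", "c"]))
    print(precision_recall([1, 0, 1, 1, 0], [1, 1, 0, 1, 0]))
```

In practice a library such as pandas or scikit-learn would likely be used instead; the plain-Python version is only meant to show what each metric counts.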

## decentralized_datasets.csv
Tools,URL
Ocean Protocol,https://oceanprotocol.com/
Datum,https://datum.org/
Enigma,https://enigma.com/our-data
DataBrokerDAO,https://www.databroker.global/
DeBlock,https://deblock.io/

## smpc.csv
Tools,URL
PySyft,https://github.com/OpenMined/PySyft
Enigma,https://www.enigma.com/
Microsoft SEAL,https://github.com/Microsoft/SEAL
TF Encrypted,https://www.tf-encrypted.org/
CrypTen,https://crypten.org/
MPyC,https://mpyc.org/
FairScale,https://fairscale.ai/

## data_management_and_governance_platforms.csv
Tools,URL
Collibra,https://www.collibra.com/us/en
Informatica MDM,https://www.informatica.com/in/products/master-data-management.html
SAP Master Data Governance,https://www.sap.com/products/technology-platform/master-data-governance.html
Alation,https://www.alation.com/
Talend MDM,https://www.talend.com/resources/what-is-master-data-management/

## dataset_anonymization.csv
Tools,URL
DataWig,https://datawig.readthedocs.io/en/latest/
Faker,https://faker.readthedocs.io/en/master/
ARX,https://arx.deidentifier.org/anonymization-tool/
DataSunrise Data Masking,https://www.datasunrise.com/data-masking/
Informatica Data Masking,https://www.informatica.com/blogs/informatica-data-masking-solution-a-data-security-product-dynamic-data-masking-for-structured-data-masking.html
Delphix Dynamic Data Platform,https://www.delphix.com/platform/masking
Solix Data Masking,https://www.solix.com/data-management-solutions/data-masking/

## federated_learning.csv
Tools,URL
TensorFlow Federated (TFF),https://www.tensorflow.org/federated
PySyft,https://github.com/OpenMined/PySyft
FATE,https://fate.fedai.org/

## differential_privacy.csv
Tools,URL
DataFly,https://datafly.online/
DP-Lib,https://www.microsoft.com/en-us/ai/ai-lab-differential-privacy
TensorFlow Privacy,https://github.com/tensorflow/privacy
OpenDP,https://opendp.org/
Rdp,https://cran.r-project.org/web/packages/RDP/index.html
PyDP,https://github.com/OpenMined/PyDP
PyTorch’s Opacus,https://opacus.ai/
SecretFlow,https://github.com/secretflow/secretflow
IBM’s Differential Privacy Library,https://github.com/IBM/differential-privacy-library

## data_bias.csv
Tool,Description
AI Fairness 360,An open-source toolkit from IBM for detecting and mitigating bias in machine learning models.
What-If Tool,Lets users test different scenarios within their data to see how changes affect the output of a machine learning model.
TCAV (Testing with Concept Activation Vectors),"An interpretability method developed by Google that measures how strongly a user-defined concept, such as race or gender, influences a model's predictions, which can help surface bias."
FairML,An open-source Python toolbox for auditing predictive machine learning models to detect bias.

## dataprofilingtools.csv
Name,Description
Dataedo,"A data profiling tool with a data catalog feature, allowing users to browse minimum, maximum, average, and median values, as well as see top values and other statistics."
Atlan,"A data management platform that provides data profiling capabilities, including data types, length, recurring patterns, and data quality assessment."
Boltic,"A free data profiling tool that offers features like data validation, data transformation, and data cleaning."
Aggregate Profiler,"An open-source data profiling tool that provides data profiling, filtering, and governance, similarity checks, data enrichment, and real-time alerting for data issues or changes."
IBM InfoSphere Information Analyzer,"A data analysis tool that helps organizations discover relationships, patterns, and trends in their data."
Informatica Data Explorer,"A data exploration tool that allows users to visualize, analyze, and clean data."
Melissa,A data quality tool that helps organizations identify and correct data quality issues.
Qua

## benchmark_datasets.csv
Dataset,Type,Description,Link
BenchMD,Medical Modalities,"The BenchMD benchmark combines 19 publicly available medical datasets across 7 medical modalities, spanning 1D sensor data, 2D images, and 3D volumetric scans",https://www.rajpurkarlab.hms.harvard.edu/benchmd
ImageNet,Image Classification,"The ImageNet dataset is a large-scale image classification dataset with over 1.2 million images in 1,000 categories",https://www.image-net.org/
COCO,Object Detection,"The COCO dataset is a large-scale object detection, segmentation, and captioning dataset with over 330,000 images and 2.5 million object instances labeled across 80 object categories",https://cocodataset.org/#home
GLUE,Natural Language Processing,"The GLUE benchmark is a collection of nine natural language understanding tasks, including sentiment analysis, question answering, and textual entailment",https://gluebenchmark.com/
Tencent-MVSE,Video Similarity,"The Tencent-MVSE dataset is a large-scale benchmark dataset for multi-modal video similarity evalu