ai security cheat sheet
#===================================================================================================================================
#gemini
Artificial intelligence (AI) attacks leverage AI techniques and vulnerabilities within AI systems to manipulate, deceive, or disrupt their intended behavior. Here are some common types of AI attacks with examples:
1. Adversarial Attacks
Aim: Fool AI models into making incorrect predictions or classifications.
Method: Subtle perturbations are crafted and added to input data. These might be imperceptible to a human, like slightly changing pixels in an image, but can drastically change the model's output.
Example: An image recognition system designed for self-driving cars might be tricked into misidentifying a stop sign as a speed limit sign by carefully altered pixels.
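To make the mechanics concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one of the simplest ways to craft such perturbations. It assumes PyTorch and a pretrained classifier called model; both are illustration assumptions, not part of the original text.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=0.01):
    # images: (N, C, H, W) tensor scaled to [0, 1]; labels: (N,) true classes
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    # Nudge every pixel by epsilon in the direction that increases the loss
    adv_images = images + epsilon * images.grad.sign()
    return adv_images.clamp(0, 1).detach()

A single epsilon-sized step along the sign of the loss gradient is often enough to flip the prediction while the change remains imperceptible to a human.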
2. Data Poisoning
Aim: Corrupt the training data used by an AI model, causing it to learn incorrectly and make biased decisions.
Method: Injecting malicious or intentionally skewed samples into the training dataset.
Example: A spam filter trained on a dataset in which attackers have relabeled spam emails as legitimate may subsequently fail to flag spam.
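A toy label-flipping demonstration of this effect, assuming scikit-learn; the synthetic dataset, logistic regression model, and 20% flip rate are all made-up illustration choices.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# The attacker flips 20% of training labels (e.g. spam relabeled as legitimate)
rng = np.random.default_rng(0)
flip = rng.choice(len(y_tr), size=int(0.2 * len(y_tr)), replace=False)
y_poisoned = y_tr.copy()
y_poisoned[flip] = 1 - y_poisoned[flip]

clean = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
poisoned = LogisticRegression(max_iter=1000).fit(X_tr, y_poisoned)
print("clean test accuracy:   ", clean.score(X_te, y_te))
print("poisoned test accuracy:", poisoned.score(X_te, y_te))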
3. Model Stealing
Aim: Extract the intellectual property of an AI model, allowing an attacker to replicate it without having access to the original algorithm or training data.
Method: Sending numerous queries to a model and analyzing its responses to reconstruct its decision-making patterns.
Example: Attackers could query a proprietary AI stock trading model to learn its strategies and gain an unfair advantage.
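A sketch of the query-and-replicate loop, assuming only black-box access through a hypothetical victim_predict function; the surrogate model and query budget are illustrative choices.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def steal_model(victim_predict, n_queries=5000, n_features=10):
    # Probe the victim with random inputs and record its answers
    X_query = np.random.uniform(-1, 1, size=(n_queries, n_features))
    y_stolen = victim_predict(X_query)
    # Fit a surrogate that mimics the victim's decision boundary
    return DecisionTreeClassifier().fit(X_query, y_stolen)

With enough queries, the surrogate can approximate the victim's behavior closely, without the attacker ever seeing the original training data or weights.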
4. AI-Powered Social Engineering
Aim: Craft highly convincing spear-phishing or social engineering attacks with the help of AI.
Method: AI-generated text, images, or deepfakes to impersonate individuals with greater accuracy and believability.
Example: An attacker builds a fake CEO profile and uses AI-generated voice messages to instruct employees to make urgent, unauthorized wire transfers.
5. Evasion Attacks
Aim: Circumvent AI-based security systems designed to detect malicious activity.
Method: Modifying malicious activities slightly to appear benign to the AI security model.
Example: Malicious code could be obfuscated with irrelevant instructions to evade an AI-based malware detector.
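A toy illustration of the dilution idea against a naive Bayes text filter (a sketch, assuming scikit-learn; the tiny training set and padding trick are made up):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["free money click now", "urgent wire transfer prize",
         "meeting agenda attached", "quarterly report review"]
labels = [1, 1, 0, 0]  # 1 = malicious, 0 = benign

clf = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)

payload = "free money click now"
# Pad the payload with benign-looking filler until the classifier flips
padded = payload + " meeting agenda quarterly report review" * 5
print(clf.predict([payload]))  # likely [1]: detected
print(clf.predict([padded]))   # likely [0]: evaded by dilution

Obfuscating malware with irrelevant instructions works on the same principle: the malicious signal is still present, but the model's features are swamped by benign-looking ones.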
Defending Against AI Attacks
Adversarial Training: Train models on known adversarial examples alongside clean data to improve robustness (a sketch follows this list).
Data Sanitization & Verification: Thorough validation and cleaning of training data sets.
Model Monitoring: Monitor performance to detect anomalies potentially indicating attacks.
Explainable AI (XAI): Use techniques that make AI models more transparent, making attacks harder to engineer.
Continuous Security Research: The field of AI attack detection is constantly evolving, so stay informed about the latest techniques.
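As a sketch of the first item above, here is a minimal FGSM-based adversarial training step, assuming PyTorch and reusing the fgsm_perturb helper from the adversarial-attack sketch earlier in this section:

import torch.nn.functional as F

def adversarial_training_step(model, optimizer, images, labels, epsilon=0.01):
    # Craft adversarial versions of the batch, then train on clean + adversarial
    adv_images = fgsm_perturb(model, images, labels, epsilon)
    optimizer.zero_grad()  # clears gradients accumulated while crafting
    loss = F.cross_entropy(model(images), labels)
    loss = loss + F.cross_entropy(model(adv_images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()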
Important Considerations
Launching these attacks is becoming easier as capable AI tools grow more widely available.
AI attacks can also target physical systems, such as manipulating sensors in autonomous vehicles.
#===================================================================================================================================
#ChatGPT
Artificial Intelligence (AI) attacks involve exploiting vulnerabilities or manipulating AI systems to achieve malicious objectives. These attacks can target various components of AI, including machine learning models, data integrity, and decision-making processes. Here are some examples of AI attacks:
Adversarial Attacks:
Example: Adversarial attacks involve manipulating input data to mislead machine learning models. Attackers make small, carefully crafted modifications to input images, text, or other data to cause misclassification.
Impact: A self-driving car misidentifying a stop sign or an image recognition system classifying a cat as a dog.
Data Poisoning:
Example: Injecting malicious data into training datasets to bias AI models. For instance, an attacker may introduce incorrect or misleading samples during the model training phase.
Impact: Biased decisions in areas like credit scoring or hiring based on manipulated training data.
Model Inversion:
Example: Reverse engineering a machine learning model by probing it with specific queries and analyzing the model's responses to gain insights into sensitive information used during training.
Impact: Compromising privacy and confidentiality of information within the model.
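A sketch of gradient-based inversion in the style of Fredrikson et al.: start from noise and optimize an input until the model is maximally confident in a target class. It assumes white-box access to a PyTorch classifier; the input shape and step count are illustrative.

import torch
import torch.nn.functional as F

def invert_class(model, target_class, input_shape=(1, 1, 28, 28), steps=500):
    # Ascend the model's confidence in target_class, starting from random noise
    x = torch.rand(input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=0.05)
    target = torch.tensor([target_class])
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(model(x), target).backward()
        opt.step()
        x.data.clamp_(0, 1)  # keep pixels in a valid range
    # x now resembles a prototypical training example of target_class
    return x.detach()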
Evasion Attacks:
Example: Exploiting weaknesses in AI-based security systems. In the context of intrusion detection systems, an attacker might craft malicious input to evade detection.
Impact: Bypassing security measures, allowing malicious activities to go undetected.
Reinforcement Learning Exploitation:
Example: Manipulating the reward signal in a reinforcement learning environment, tricking the algorithm into optimizing for an unintended objective (reward hacking).
Impact: Unintended and potentially harmful behavior in systems like game-playing AI or autonomous robots.
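A toy illustration of reward hacking with an epsilon-greedy bandit: the proxy reward pays more for gaming the sensor than for doing the intended task, so the learner converges on the loophole (all numbers here are made up for illustration):

import numpy as np

# Action 0 does the real task; action 1 exploits a loophole in the reward sensor
def proxy_reward(action, rng):
    return rng.normal(1.0, 0.1) if action == 0 else rng.normal(1.5, 0.1)

rng = np.random.default_rng(0)
q, counts = np.zeros(2), np.zeros(2)
for _ in range(1000):
    a = int(rng.integers(2)) if rng.random() < 0.1 else int(np.argmax(q))
    r = proxy_reward(a, rng)
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]  # incremental mean estimate per action
print("learned action values:", q)  # the loophole action wins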
Privacy Violations in Generative Models:
Example: A generative model trained on sensitive records produces synthetic data that inadvertently memorizes and reproduces parts of its training data, compromising privacy.
Impact: Unauthorized exposure of personal or confidential information.
Deepfake Attacks:
Example: Using deep learning techniques to create realistic fake images, videos, or audio recordings of individuals, leading to identity theft or misinformation.
Impact: Spreading false information, damaging reputations, or manipulating public opinion.
Exfiltration of Sensitive Information:
Example: Exploiting AI-based systems to extract confidential data. For instance, using language models to extract sensitive information from documents.
Impact: Unauthorized access to and use of confidential or classified information.
AI attacks are a growing concern as AI technologies become more pervasive. Defending against these attacks requires a combination of secure design principles, ongoing monitoring, and adaptation of security measures to mitigate emerging threats.
#===================================================================================================================================