zero-shot example

Zero-shot examples are scenarios in which a model performs a task without having been explicitly trained on labeled data for that task. Instead, the model leverages what it learned during pre-training on other data to make predictions or inferences about the new task. This ability is particularly valuable for handling tasks or categories that were never seen during training.

Zero-Shot Examples in Various Domains

1. Text Classification:

A model trained on general text data can classify text into categories it has never seen before by understanding the general concept of the categories (a code sketch follows the example below).

  • Example:
    • Input Text: "The weather today is sunny with a chance of showers in the evening."
    • New Categories: "Weather Report" vs. "Financial News"
    • Zero-Shot Classification: The model correctly classifies the text as "Weather Report" even though it has never seen examples explicitly labeled as such during training.
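
The classification above can be reproduced with the Hugging Face transformers zero-shot pipeline (the same tool used in the fuller example later on this page). This is a minimal sketch: the model name facebook/bart-large-mnli and the hypothesis template are illustrative choices, not requirements.

from transformers import pipeline

# NLI-based zero-shot classifier; facebook/bart-large-mnli is one common choice.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

text = "The weather today is sunny with a chance of showers in the evening."
labels = ["Weather Report", "Financial News"]

# Each label is scored by how well "This text is about <label>." follows from the input.
result = classifier(text, labels, hypothesis_template="This text is about {}.")
print(result["labels"][0])  # expected: "Weather Report"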

2. Entity Recognition:

A model can identify and classify entities in a text without having seen examples of those specific entities during training; see the code sketch after the example.

  • Example:
    • Input Text: "Elon Musk announced the latest Tesla model."
    • New Entity Types: "CEO" vs. "Product"
    • Zero-Shot Recognition: The model identifies "Elon Musk" as "CEO" and "Tesla model" as "Product," leveraging its understanding of context and language.
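
A lightweight way to approximate zero-shot entity typing is to reuse the same NLI-based zero-shot classifier and score entity-type labels for each mention in context. This is a sketch under the assumption that candidate mentions are already available (here they are hard-coded); the model choice is illustrative.

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

sentence = "Elon Musk announced the latest Tesla model."
mentions = ["Elon Musk", "Tesla model"]  # hand-picked spans for illustration
entity_types = ["CEO", "Product"]

for mention in mentions:
    # The sentence is the premise; "<mention> is a <label>." is the hypothesis.
    result = classifier(
        sentence,
        entity_types,
        hypothesis_template=f"{mention} is a {{}}.",
    )
    print(mention, "->", result["labels"][0])  # expected: CEO, then Product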

3. Machine Translation:

A model trained on multiple languages can translate between a pair of languages it has never seen paired directly, as the sketch after the example illustrates.

  • Example:
    • Input Sentence: "Bonjour tout le monde."
    • Target Language: German
    • Zero-Shot Translation: The model translates the French sentence to "Hallo zusammen," even if it was never explicitly trained on French-German pairs.
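
A sketch of zero-shot translation with a many-to-many multilingual model is shown below. facebook/m2m100_418M is an illustrative choice; whether a particular direction such as French-to-German was truly absent from a given model's training data depends on that model, so treat this as a demonstration of the API rather than a guarantee.

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

# Mark the source language, then force the decoder to start with the German language token.
tokenizer.src_lang = "fr"
encoded = tokenizer("Bonjour tout le monde.", return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("de"))

print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])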

4. Image Classification:

A model trained on images with general categories can classify images into new categories by understanding visual features; a short sketch follows the example.

  • Example:
    • Input Image: A picture of a panda.
    • New Categories: "Animal" vs. "Vehicle"
    • Zero-Shot Classification: The model correctly classifies the image as "Animal."
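
The image case can be sketched with the transformers zero-shot image classification pipeline, which scores an image against arbitrary text labels using a CLIP-style model. The model name is an illustrative choice and "panda.jpg" is a placeholder path to a local image.

from transformers import pipeline

classifier = pipeline("zero-shot-image-classification", model="openai/clip-vit-base-patch32")

# The pipeline accepts a local path, URL, or PIL image.
result = classifier("panda.jpg", candidate_labels=["animal", "vehicle"])
print(result[0]["label"])  # expected: "animal"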

Zero-Shot Learning in Practice

Using Hugging Face Transformers for Zero-Shot Classification

The transformers library by Hugging Face provides a pipeline for zero-shot classification. It uses models fine-tuned for natural language inference (NLI), such as BART or RoBERTa variants trained on MNLI, to score how well each candidate label fits the input text.

Example Code:

from transformers import pipeline

# Initialize the zero-shot classification pipeline; pinning the model keeps the example reproducible
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Input text
text = "This new movie is the best film of the year."

# Candidate labels
candidate_labels = ["movie review", "sports news", "technology update"]

# Perform zero-shot classification
result = classifier(text, candidate_labels)

print(result)

Output:

{'sequence': 'This new movie is the best film of the year.',
 'labels': ['movie review', 'technology update', 'sports news'],
 'scores': [0.95, 0.03, 0.02]}
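
Under the hood, the pipeline frames classification as natural language inference: the input text is the premise and each candidate label is slotted into a hypothesis such as "This example is movie review." The sketch below shows that scoring directly, assuming facebook/bart-large-mnli; the pipeline's exact score normalization differs slightly, so the numbers will not match the output above exactly.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "facebook/bart-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

text = "This new movie is the best film of the year."
labels = ["movie review", "sports news", "technology update"]

entail_id = model.config.label2id["entailment"]
scores = []
for label in labels:
    # Premise = input text, hypothesis = "This example is <label>."
    inputs = tokenizer(text, f"This example is {label}.", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    scores.append(logits.softmax(dim=-1)[0, entail_id].item())

for label, score in sorted(zip(labels, scores), key=lambda pair: -pair[1]):
    print(label, round(score, 3))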

Zero-Shot Example in Code Similarity

Pre-trained code models such as CodeBERT or GraphCodeBERT can embed code snippets and compare the embeddings directly, giving a zero-shot measure of code similarity without any task-specific fine-tuning:

Example Code:

from transformers import RobertaTokenizer, RobertaModel
import torch
from sklearn.metrics.pairwise import cosine_similarity

# Load pre-trained CodeBERT model and tokenizer
tokenizer = RobertaTokenizer.from_pretrained("microsoft/codebert-base")
model = RobertaModel.from_pretrained("microsoft/codebert-base")

# Tokenize input code snippets
code_snippet_1 = "def add(a, b): return a + b"
code_snippet_2 = "def sum(x, y): return x + y"

tokens_1 = tokenizer(code_snippet_1, return_tensors='pt')
tokens_2 = tokenizer(code_snippet_2, return_tensors='pt')

# Mean-pool the token embeddings into a single vector per snippet
with torch.no_grad():
    embeddings_1 = model(**tokens_1).last_hidden_state.mean(dim=1).numpy()
    embeddings_2 = model(**tokens_2).last_hidden_state.mean(dim=1).numpy()

# Calculate cosine similarity
similarity = cosine_similarity(embeddings_1, embeddings_2)
print(f"Similarity: {similarity[0][0]}")

Example output (the exact value depends on the model and pooling):

Similarity: 0.95

In this example, the model compares the two code snippets semantically even though it was never explicitly trained on a code-similarity objective or on these particular snippets, which is what makes it a zero-shot use of the model.
