@vuiseng9
Last active March 5, 2024 20:07
import warnings
from transformers import AutoTokenizer

class PromptCreator:
    def __init__(self, model_id):
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        # Number of token ids produced for the BOS string on its own (not used elsewhere in this script).
        self.offset = len(self.tokenizer(self.tokenizer.special_tokens_map['bos_token'])['input_ids'])
        self.samples = [
                                {
                                "prompt": "Create a recipe for a fruit salad:",
                                "response": "Let's make a delicious fruit salad together! We'll call it Summer Sunshine Fruit Salad. First, gather all the yummy fruits we need - strawberries, pineapple, watermelon, blueberries, grapes, mango, and banana. Wash and slice them up nicely. In a big bowl, mix them all together gently. If you like it sweeter, you can add a little honey or maple syrup. Squeeze some fresh lime juice over the fruits for a tangy flavor. Pop the bowl into the fridge for a little while to chill. When it's nice and cold, take it out and garnish with some fresh mint leaves if you want to make it look extra special. Now, you're ready to enjoy your colorful and refreshing Summer Sunshine Fruit Salad!"
                                },
                                {
                                "prompt": "Tell a story about a small cat:",
                                "response": "Once upon a time, in a quaint little village nestled among rolling hills, there lived a small cat named Whiskers. Whiskers wasn't like the other cats in the village; he was tiny, with fur as soft as silk and eyes that sparkled like emeralds. Despite his size, Whiskers had the heart of a lion. He roamed the cobblestone streets fearlessly, exploring every nook and cranny with boundless curiosity. One day, while chasing a butterfly through the garden, Whiskers stumbled upon a hidden path that led deep into the forest. Excited by the prospect of adventure, he followed the trail until he reached a clearing where a family of rabbits was in distress. Without hesitation, Whiskers sprang into action, using his wit and agility to save the day. From that moment on, Whiskers became a hero in the village, admired by all for his bravery and kindness."
                                },
                                {
                                "prompt": "How Rainbows Appear?",
                                "response": "Have you ever seen a rainbow after it rains? Let me tell you why they appear! When the sun shines through raindrops in the sky, it's like a magic trick happening. Sunlight has lots of colors in it, but usually, we can't see them all at once. When the sunlight goes through the raindrops, it bends and spreads out into all those colors like a big colorful fan. Each raindrop acts like a tiny mirror, splitting the sunlight into the colors of the rainbow - red, orange, yellow, green, blue, indigo, and violet. That's why when the sun comes out after it rains, you can see a beautiful rainbow in the sky!"
                                },
                                {
                                "prompt": "Explaining the Color Blue to a Child:",
                                "response": "Okay, let's imagine the color blue together! Close your eyes and think about a bright sunny day. Look up at the sky, it's so big and wide, right? That's the color blue! It's like the ocean, where fishes swim and waves crash. You might have seen blue in your toys, like the toy car or the crayon you use to draw water. Blue can remind you of sad things, like when you're feeling a little down, but it's also a color of calm and peace, like the soft blanket you cuddle up with at bedtime. Blue is cool and makes you feel relaxed, just like a gentle breeze on a hot day."
                                },
                                {
                                "prompt": "Meet Some Famous Historical Figures:",
                                "response": "Let's talk about some very important people from a long time ago! Imagine you could meet Leonardo da Vinci, who was not only an amazing artist but also a scientist who loved to invent things. There's also Cleopatra, a powerful queen of ancient Egypt known for her intelligence and beauty. Mahatma Gandhi was a peaceful leader who helped India gain independence from British rule through nonviolent protest. Then there's Joan of Arc, a brave young woman from France who led her country's army to victory during the Hundred Years' War. And finally, we have Albert Einstein, a brilliant scientist whose ideas about space, time, and energy changed the way we understand the universe!"
                                }
                            ]
        self.sample_texts = [' '.join(d.values()) for d in self.samples]

        self.sample_texts.append(
        "A Hare was making fun of the Tortoise one day for being so slow. "
        + "Do you ever get anywhere? he asked with a mocking laugh. Yes, replied the Tortoise, "
        + "and I get there sooner than you think. I'll run you a race and prove it. "
        + "The Hare was much amused at the idea of running a race with the Tortoise, "
        + "but for the fun of the thing he agreed. So the Fox, who had consented to act as judge, "
        + "marked the distance and started the runners off. The Hare was soon far out of sight, "
        + "and to make the Tortoise feel very deeply how ridiculous it was for him to try a race with a Hare, "
        + "he lay down beside the course to take a nap until the Tortoise should catch up. "
        + "The Tortoise meanwhile kept going slowly but steadily, and, after a time, "
        + "passed the place where the Hare was sleeping. But the Hare slept on very peacefully; "
        + "and when at last he did wake up, the Tortoise was near the goal. "
        + "The Hare now ran his swiftest, but he could not overtake the Tortoise in time."
    )
        
    def create_prompt_of_length(self, ctx_len):
        def per_prompt(length, sample_token_ids):
            if length > len(sample_token_ids):
                warnings.warn("Requested context length exceeds the number of tokens in the sample text; returning None", Warning)
                return None
            else:
                prompt_token_ids = sample_token_ids[:length]
                return self.tokenizer.decode(prompt_token_ids)

        tokenized = self.tokenizer(self.sample_texts)
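        # The tokenizer returns one list of token ids per input text.
        if len(tokenized['input_ids']) > 1:
            decoded = [per_prompt(ctx_len, ids) for ids in tokenized['input_ids']]
        else:
            # Single text: unwrap the only id list before truncating.
            decoded = per_prompt(ctx_len, tokenized['input_ids'][0])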
        return decoded


llama_prompt_creator = PromptCreator(model_id="meta-llama/Llama-2-7b-hf")
llama_prompts = llama_prompt_creator.create_prompt_of_length(ctx_len=32)

mistral_prompt_creator = PromptCreator(model_id="mistralai/Mistral-7B-v0.1")
mistral_prompts = mistral_prompt_creator.create_prompt_of_length(ctx_len=32)

# check
# Note: by default the tokenizer prepends the BOS token (<s>) to each sequence,
# so the re-tokenized length printed below can exceed ctx_len.
for i, each in enumerate(llama_prompts):
    print(f"\n{each}\nlen of llama prompt{i}: {len(llama_prompt_creator.tokenizer(each)['input_ids'])}")

print("-"*100)

for i, each in enumerate(mistral_prompts):
    print(f"\n{each}\nlen of mistral prompt{i}: {len(mistral_prompt_creator.tokenizer(each)['input_ids'])}")

print("end.")

vuiseng9 commented Mar 5, 2024

Sample outputs: (remove the <s> before use)

<s> Create a recipe for a fruit salad: Let's make a delicious fruit salad together! We'll call it Summer Sunshine
len of llama prompt0: 34

<s> Tell a story about a small cat: Once upon a time, in a quaint little village nestled among rolling hills, there lived a small cat named
len of llama prompt1: 34

<s> How Rainbows Appear? Have you ever seen a rainbow after it rains? Let me tell you why they appear! When the sun sh
len of llama prompt2: 34

<s> Explaining the Color Blue to a Child: Okay, let's imagine the color blue together! Close your eyes and think about a bright sunny
len of llama prompt3: 34

<s> Meet Some Famous Historical Figures: Let's talk about some very important people from a long time ago! Imagine you could meet Leonardo
len of llama prompt4: 34

<s> A Hare was making fun of the Tortoise one day for being so slow. Do you ever get anywhere? he asked with a mocking
len of llama prompt5: 34
----------------------------------------------------------------------------------------------------

<s> Create a recipe for a fruit salad: Let's make a delicious fruit salad together! We'll call it Summer Sunshine Fruit Salad.
len of mistral prompt0: 34

<s> Tell a story about a small cat: Once upon a time, in a quaint little village nestled among rolling hills, there lived a small cat named
len of mistral prompt1: 34

<s> How Rainbows Appear? Have you ever seen a rainbow after it rains? Let me tell you why they appear! When the sun sh
len of mistral prompt2: 34

<s> Explaining the Color Blue to a Child: Okay, let's imagine the color blue together! Close your eyes and think about a bright sunny day
len of mistral prompt3: 34

<s> Meet Some Famous Historical Figures: Let's talk about some very important people from a long time ago! Imagine you could meet Leonardo da Vin
len of mistral prompt4: 34

<s> A Hare was making fun of the Tortoise one day for being so slow. Do you ever get anywhere? he asked with a mocking laugh
len of mistral prompt5: 34
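
The sample outputs all begin with `<s>`, which should be removed before use. A minimal sketch of one way to do that, assuming the prompts are the strings (or `None`) returned by `create_prompt_of_length` and that the BOS marker is the literal text `<s>`. The helper name `strip_bos` is hypothetical and not part of the gist.

```python
def strip_bos(prompt, bos_token="<s>"):
    """Drop one leading BOS marker (and the whitespace around it), if present."""
    if prompt is None:  # create_prompt_of_length returns None for too-short samples
        return None
    text = prompt.lstrip()
    if text.startswith(bos_token):
        text = text[len(bos_token):]
    return text.lstrip()


# Example using the start of one of the sample outputs.
example = "<s> Create a recipe for a fruit salad: Let's make a delicious fruit salad together!"
print(strip_bos(example))
```

Applied to the lists from the script, this would look like `[strip_bos(p) for p in llama_prompts]`. Only a single leading marker is removed, so any `<s>` that appears later in the text is left untouched.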
