for Gio's slugbot project

Image Classification Telegram Bot

This script runs a Telegram bot that classifies images using a pre-trained model. The bot handles /start and /help commands, as well as photo messages. When a photo message is received, the bot downloads the photo, classifies it, and sends a message with the prediction.

The original intended use case is to classify if an image contains a slug or not:

[example image: "is it a slug"]

why slugs?

In an era where data is the new oil, the ability to accurately classify and understand this data is paramount. Our revolutionary Telegram bot, powered by state-of-the-art deep learning algorithms, is a leap forward in this direction. It's not just about classifying slugs, snakes, and snails - it's about harnessing the power of artificial intelligence to make sense of the world around us.

Imagine a world where anyone, anywhere, can simply snap a photo and instantly gain insights about the biodiversity in their backyard. A world where researchers can quickly identify and catalog species, accelerating our understanding of ecosystems. A world where educators can bring the power of AI into their classrooms, sparking curiosity and fostering a new generation of scientists.

By making this technology accessible to everyone, we're not just building a bot, we're democratizing knowledge. We're empowering individuals, communities, and organizations to learn, explore, and make informed decisions. We're fostering a culture of curiosity and lifelong learning.

This is the power of deep learning. This is the promise of our Telegram bot. By classifying slugs, we're not just identifying a small creature - we're taking a giant leap towards a future where everyone has the power of AI at their fingertips. We're building a future where technology serves humanity, helping us understand and care for our planet. This is more than a bot - it's a tool for change, a catalyst for progress, and a beacon of hope for a brighter, more informed future.

Requirements

  • Python 3.6 or later
  • python-telegram-bot library (the script uses the v13.x API)
  • transformers library
  • huggingface_hub library
  • fastai library (required by from_pretrained_fastai)
  • tqdm library
  • fire library

Installation

  1. Install the required Python libraries with pip (pin python-telegram-bot to v13, since the script uses the pre-v20 API):
pip install "python-telegram-bot==13.*" transformers huggingface_hub fastai tqdm fire
  2. Clone the repository or download the script.

  3. Replace the TOKEN placeholder in the script with your actual Telegram bot token.

Usage

You can run the script from the command line with the following command:

python bot.py

Once the bot is running, you can interact with it on Telegram. Send the /start command to get a welcome message, the /help command to get a help message, or send a photo to get a prediction.

The bot classifies images into categories based on a pre-trained model. The categories and the model can be customized by modifying the repo_id variable and the categories variable in the script.
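As a minimal sketch, pointing the bot at a different fastai classifier on the Hugging Face Hub only requires changing repo_id; the repo name below is a placeholder, and the class labels are read from the learner's own vocab:

from huggingface_hub import from_pretrained_fastai

# hypothetical repo id -- replace with any fastai Learner pushed to the Hub
repo_id = "your-username/your-image-classifier"
learn = from_pretrained_fastai(repo_id)
categories = learn.dls.vocab  # class labels come from the learner itself
print(categories)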

Customization

You can customize the bot's behavior by modifying the script. For example, you can change the welcome and help messages in the start and help_command functions, respectively. You can also change the way the bot handles photo messages in the handle_photo function.

Logging

The script logs informational messages as well as warnings and errors. The logging level can be changed by modifying the level parameter in the logging.basicConfig function. The log messages are printed to the console, but they can be redirected to a file or another output stream by modifying the stream parameter in the logging.basicConfig function.
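For example, sending the logs to a file instead of the console only takes a different basicConfig call; the filename below is arbitrary, and the filename argument can be used in place of stream:

import logging

logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    level=logging.DEBUG,      # raise or lower verbosity here
    filename="bot.log",       # write log records to a file instead of the console
)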

Extending the Image Classification Telegram Bot

The current implementation of the bot classifies images into predefined categories. Here are some ideas on how to extend and improve the bot:

1. Image Captioning

The bot could be extended to not only classify images, but also provide a caption for them. This could be done using the BLIP (Bootstrapping Language-Image Pre-training) model from Hugging Face, which is capable of both conditional and unconditional image captioning.

Here is an example of how the BLIP model could be integrated into the bot:

from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Update and CallbackContext are already imported by the bot script (python-telegram-bot v13)
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def handle_photo(update: Update, context: CallbackContext) -> None:
    """Download the photo, generate a caption with BLIP, and reply with it."""
    file = context.bot.getFile(update.message.photo[-1].file_id)
    file.download("image.jpg")
    raw_image = Image.open("image.jpg").convert("RGB")

    # unconditional image captioning
    inputs = processor(raw_image, return_tensors="pt")
    out = model.generate(**inputs)
    caption = processor.decode(out[0], skip_special_tokens=True)

    update.message.reply_text(f"Caption: {caption}")

With this extension, when a photo message is received, the bot downloads the photo, generates a caption for it using the BLIP model, and sends a message with the caption.

2. Interactive Captioning

The bot could be made interactive by allowing users to provide a starting phrase for the caption. This could be done by handling text messages in addition to photo messages, and using the text as the starting phrase for the caption.
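A minimal sketch of this idea, assuming the BLIP processor and model from the previous section are loaded and that handle_photo stores the downloaded path in context.user_data["last_photo"] (that caching step is an illustrative choice, not part of the original script):

def handle_text(update: Update, context: CallbackContext) -> None:
    """Use the user's text as a conditional prompt for captioning the last photo."""
    image_path = context.user_data.get("last_photo")
    if image_path is None:
        update.message.reply_text("Send a photo first, then a starting phrase.")
        return
    raw_image = Image.open(image_path).convert("RGB")
    # conditional captioning: the text message becomes the start of the caption
    inputs = processor(raw_image, update.message.text, return_tensors="pt")
    out = model.generate(**inputs)
    update.message.reply_text(processor.decode(out[0], skip_special_tokens=True))

# in main(), register the handler alongside the photo handler:
# dispatcher.add_handler(MessageHandler(Filters.text & ~Filters.command, handle_text))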

3. Multiple Models

The bot could be extended to support multiple models for image classification and captioning. Users could select the model they want to use through a command or a button in the bot's interface.
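One hedged way to do this with the python-telegram-bot v13 API is a /model command that records the user's choice in context.user_data; the MODELS registry below is hypothetical, and handle_photo would need to look the learner up from it instead of using the global learn:

# hypothetical registry of Hub repo ids -> loaded fastai learners
MODELS = {
    "slugs": from_pretrained_fastai("MasleK/snails_snakes_slugs"),
    # "other": from_pretrained_fastai("your-username/another-classifier"),
}

def set_model(update: Update, context: CallbackContext) -> None:
    """Handle /model <name> by remembering the user's choice."""
    choice = context.args[0] if context.args else ""
    if choice not in MODELS:
        update.message.reply_text(f"Unknown model. Options: {', '.join(MODELS)}")
        return
    context.user_data["model"] = choice
    update.message.reply_text(f"Using model: {choice}")

# in main():
# dispatcher.add_handler(CommandHandler("model", set_model))
# and in handle_photo(), pick the learner from the stored choice:
# learn = MODELS[context.user_data.get("model", "slugs")]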

4. Image Enhancements

The bot could be extended to perform image enhancements before classification or captioning. This could include resizing, cropping, rotating, adjusting brightness and contrast, and other image processing operations.
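For example, a small preprocessing step with Pillow could run before predict; the resize target and contrast factor below are arbitrary choices, and the image is saved back to disk so the rest of the script stays unchanged:

from PIL import Image, ImageEnhance, ImageOps

def preprocess(path: str, size: int = 512) -> str:
    """Fix orientation, shrink large photos, and bump contrast slightly, in place."""
    img = Image.open(path).convert("RGB")
    img = ImageOps.exif_transpose(img)            # respect EXIF rotation
    img.thumbnail((size, size))                   # downscale, keeping aspect ratio
    img = ImageEnhance.Contrast(img).enhance(1.1) # mild contrast boost
    img.save(path)
    return path

# in handle_photo():
# prediction = predict(preprocess("image.jpg"))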

5. Multilingual Support

The bot could be extended to support multiple languages. This could be done by translating the captions and classification results into the user's language.
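A minimal sketch using a translation pipeline from transformers; the Helsinki-NLP model below is one public English-to-German example, and which languages to support is left open:

from transformers import pipeline

# English -> German as an illustrative target language
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")

def translate(text: str) -> str:
    """Translate bot replies before sending them."""
    return translator(text, max_length=128)[0]["translation_text"]

# e.g. in handle_photo():
# update.message.reply_text(translate(f"Prediction: {prediction}"))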

6. User Feedback

The bot could be extended to allow users to provide feedback on the classification and captioning results. This feedback could be used to improve the models and the bot's performance.
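One possible sketch using python-telegram-bot v13 inline keyboards; where the feedback ends up (here only the log) is an open design choice:

from telegram import InlineKeyboardButton, InlineKeyboardMarkup
from telegram.ext import CallbackQueryHandler

FEEDBACK_KEYBOARD = InlineKeyboardMarkup(
    [[InlineKeyboardButton("Correct", callback_data="good"),
      InlineKeyboardButton("Wrong", callback_data="bad")]]
)

# in handle_photo(), attach the keyboard to the reply:
# update.message.reply_text(f"Prediction: {prediction}", reply_markup=FEEDBACK_KEYBOARD)

def handle_feedback(update: Update, context: CallbackContext) -> None:
    """Log the user's verdict on the last prediction."""
    query = update.callback_query
    query.answer()
    logger.info("Feedback from %s: %s", query.from_user.id, query.data)
    query.edit_message_reply_markup(reply_markup=None)  # remove the buttons

# in main():
# dispatcher.add_handler(CallbackQueryHandler(handle_feedback))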

import logging
from pathlib import Path

from telegram import Update, ForceReply
from telegram.ext import (
    Updater,
    CommandHandler,
    MessageHandler,
    Filters,
    CallbackContext,
)
from transformers import pipeline
from huggingface_hub import from_pretrained_fastai
from tqdm.auto import tqdm
import fire

# Enable logging
logging.basicConfig(
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", level=logging.INFO
)
logger = logging.getLogger(__name__)

# repo_id = "YOUR_USERNAME/YOUR_LEARNER_NAME"
repo_id = "MasleK/snails_snakes_slugs"
learn = from_pretrained_fastai(repo_id)
categories = learn.dls.vocab


def predict(image):
    """Classify an image and return a dict of category -> probability."""
    label, index, probs = learn.predict(image)
    return dict(zip(categories, map(float, probs)))


def start(update: Update, context: CallbackContext) -> None:
    """Send a message when the command /start is issued."""
    user = update.effective_user
    update.message.reply_markdown_v2(
        rf"Hi {user.mention_markdown_v2()}\!",
        reply_markup=ForceReply(selective=True),
    )


def help_command(update: Update, context: CallbackContext) -> None:
    """Send a message when the command /help is issued."""
    update.message.reply_text("Help!")


def handle_photo(update: Update, context: CallbackContext) -> None:
    """Handle photo messages, classify the image and send a message with the prediction."""
    file = context.bot.getFile(update.message.photo[-1].file_id)
    file.download("image.jpg")
    prediction = predict("image.jpg")
    update.message.reply_text(f"Prediction: {prediction}")


def main() -> None:
    """Start the bot."""
    updater = Updater("TOKEN", use_context=True)
    dispatcher = updater.dispatcher
    dispatcher.add_handler(CommandHandler("start", start))
    dispatcher.add_handler(CommandHandler("help", help_command))
    dispatcher.add_handler(
        MessageHandler(Filters.photo & ~Filters.command, handle_photo)
    )
    updater.start_polling()
    updater.idle()


if __name__ == "__main__":
    fire.Fire(main)