@kevinmcaleer
Last active July 26, 2023 22:03
Image Recognition for Googley Eyes

Setting up the image recognition

This demo uses LangChain's ImageCaptionLoader to caption an image automatically, then asks an OpenAI model to describe it. You can get it up and running as follows:

  1. Create a virtual environment:

    python3 -m venv venv
  2. Activate the environment:

    source venv/bin/activate
  3. Install the dependencies:

    pip install -r requirements.txt
  4. Run the demo program caption_this.py:

    python3 caption_this.py
  5. You can also add your own image and update the file name that the code loads (the list returned by collect_image_urls()).
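For step 5, one option is to gather every image in a folder rather than editing a hard-coded file name. The sketch below is an assumption, not part of the original script: the helper name `collect_image_paths`, its `folder` argument, and the extension list are all hypothetical.

```python
from pathlib import Path

# Hypothetical set of extensions to treat as images; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def collect_image_paths(folder="."):
    """Return the sorted paths of image files found directly in `folder`."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    # List the images in the current directory.
    print(collect_image_paths("."))
```

The returned list could be passed to `ImageCaptionLoader(path_images=...)` in place of the hard-coded list.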


# caption_this.py
import logging
import os

from dotenv import load_dotenv
from langchain.document_loaders import ImageCaptionLoader
from langchain.indexes import VectorstoreIndexCreator

# Silence noisy warning messages in the terminal
logging.getLogger("transformers.generation_utils").setLevel(logging.ERROR)
logging.getLogger("tokenizers").setLevel(logging.ERROR)
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

# Read OPENAI_API_KEY from the environment or a .env file.
# Generate a key on https://platform.openai.com/
load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')


def collect_image_urls():
    """Return the list of image files to caption."""
    # image_urls = ['archie_and_trixie.jpg']
    image_urls = ['kev2.jpg']
    return image_urls


list_image_urls = collect_image_urls()

# Caption the images, then index the captions so they can be queried
loader = ImageCaptionLoader(path_images=list_image_urls)
list_docs = loader.load()
index = VectorstoreIndexCreator().from_loaders([loader])

result = index.query('describe what is in the image, be as descriptive as possible using poetic language')
# result = index.query('describe what is in the image, be nonchalant and snarky')
print(result)
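The script reads `OPENAI_API_KEY` from the environment after calling `load_dotenv()`, so the key is assumed to live in a `.env` file or to be exported in the shell already. The original does not check for a missing key. A minimal sketch of such a check follows; `require_api_key` is a hypothetical helper and the error wording is illustrative.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from `env` (defaults to os.environ) or raise a clear error."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add a line like {name}=<your key> to a .env file "
            "or export it in your shell before running the script."
        )
    return key
```

In caption_this.py this could be called right after `load_dotenv()`, for example `OPENAI_API_KEY = require_api_key()`.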

# requirements.txt
aiohttp==3.8.5
aiosignal==1.3.1
anyio==3.7.1
async-timeout==4.0.2
attrs==23.1.0
backoff==2.2.1
blinker==1.6.2
certifi==2023.7.22
charset-normalizer==3.2.0
chroma-hnswlib==0.7.1
chromadb==0.4.3
click==8.1.6
coloredlogs==15.0.1
dataclasses-json==0.5.13
exceptiongroup==1.1.2
fastapi==0.99.1
filelock==3.12.2
Flask==2.3.2
flatbuffers==23.5.26
frozenlist==1.4.0
fsspec==2023.6.0
gunicorn==21.2.0
h11==0.14.0
httptools==0.6.0
huggingface-hub==0.16.4
humanfriendly==10.0
idna==3.4
importlib-resources==6.0.0
itsdangerous==2.1.2
Jinja2==3.1.2
langchain==0.0.244
langsmith==0.0.14
MarkupSafe==2.1.3
marshmallow==3.20.1
monotonic==1.6
mpmath==1.3.0
multidict==6.0.4
mypy-extensions==1.0.0
networkx==3.1
numexpr==2.8.4
numpy==1.25.1
onnxruntime==1.15.1
openai==0.27.8
openapi-schema-pydantic==1.2.4
overrides==7.3.1
packaging==23.1
pandas==2.0.3
Pillow==10.0.0
posthog==3.0.1
protobuf==4.23.4
pulsar-client==3.2.0
pydantic==1.10.12
PyPika==0.48.9
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3
PyYAML==6.0.1
regex==2023.6.3
replicate==0.9.0
requests==2.31.0
safetensors==0.3.1
six==1.16.0
sniffio==1.3.0
SQLAlchemy==2.0.19
starlette==0.27.0
sympy==1.12
tenacity==8.2.2
tiktoken==0.4.0
tokenizers==0.13.3
torch==2.0.1
tqdm==4.65.0
transformers==4.31.0
typing-inspect==0.9.0
typing_extensions==4.7.1
tzdata==2023.3
urllib3==2.0.4
uvicorn==0.23.1
uvloop==0.17.0
watchfiles==0.19.0
websockets==11.0.3
Werkzeug==2.3.6
yarl==1.9.2