
@jrknox1977
jrknox1977 / gist:b411fe214c544dc0937957a773dcb270
Created March 12, 2024 12:32
YouTube Video Analysis with Claude 3
# YouTube Video Analysis with Claude 3
# By KNOX @jr_knox1977
#
# Prompt "inspired" by several prompts from Daniel Miessler's Fabric project:
# https://github.com/danielmiessler/fabric
#
# This script expects a .env file with the following content:
# CLAUDE_API_KEY=your_api_key
#
# Quick pip: python3 -m pip install anthropic youtube_transcript_api python-dotenv
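A minimal sketch of what a script like this looks like, assuming the standard `youtube_transcript_api` and `anthropic` APIs; the prompt text, model name, and helper names are illustrative, not the author's exact code:

```python
# Hedged sketch: fetch a YouTube transcript and send it to Claude 3.
# The prompt, model name, and function names are assumptions.
import os


def extract_video_id(url: str) -> str:
    """Pull the video ID out of a standard or short YouTube URL."""
    from urllib.parse import urlparse, parse_qs

    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    return parse_qs(parsed.query)["v"][0]


def analyze(url: str) -> str:
    # Third-party imports are deferred so the helper above stays importable
    # without these packages installed.
    from dotenv import load_dotenv
    from youtube_transcript_api import YouTubeTranscriptApi
    import anthropic

    load_dotenv()  # reads CLAUDE_API_KEY from the .env file described above
    video_id = extract_video_id(url)
    transcript = " ".join(
        chunk["text"] for chunk in YouTubeTranscriptApi.get_transcript(video_id)
    )
    client = anthropic.Anthropic(api_key=os.environ["CLAUDE_API_KEY"])
    message = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": f"Summarize this video transcript:\n\n{transcript}",
            }
        ],
    )
    return message.content[0].text
```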
@jrknox1977
jrknox1977 / multi_ollama_containers.md
Last active February 21, 2024 21:07
Running Multiple ollama containers on a single host.

Multiple Ollama Containers on a single host (with multiple GPUs)

I don't want model RELOADS

  • I have a large machine with 2 GPUs and a considerable amount of RAM.
  • I was trying to use ollama to serve llava and mistral, BUT it would reload the models every time I switched between them.
  • So this is the solution that appears to be working: multiple containers, each serving a different model on a different port.

Ollama model working dir:

  • I have many models already downloaded on my machine so I mount the host ollama working dir to the containers.
  • Linux (At least on my linux machine) - /usr/share/ollama/.ollama
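The setup above can be sketched as two `docker run` invocations; the container names, GPU indices, and the second host port (11435) are assumptions, not the author's exact commands:

```shell
# One Ollama container per model, each pinned to its own GPU and host port.
# The host model directory is mounted so already-downloaded models are reused.
docker run -d --gpus '"device=0"' \
  -v /usr/share/ollama/.ollama:/root/.ollama \
  -p 11434:11434 --name ollama-mistral ollama/ollama

docker run -d --gpus '"device=1"' \
  -v /usr/share/ollama/.ollama:/root/.ollama \
  -p 11435:11434 --name ollama-llava ollama/ollama

# Pull a different model into each container; each model then stays resident
# in its own container, so switching between them never triggers a reload.
docker exec ollama-mistral ollama pull mistral
docker exec ollama-llava ollama pull llava
```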
@jrknox1977
jrknox1977 / ollama_dspy.py
Created February 9, 2024 18:06
ollama+DSPy using OpenAI APIs.
# install DSPy: pip install dspy
import dspy
# Ollama is now compatible with OpenAI APIs
#
# To get this to work you must include `model_type='chat'` in the `dspy.OpenAI` call.
# If you do not include this you will get an error.
#
# I have also found that `stop='\n\n'` is required to get the model to stop generating text after the answer is complete.
# At least with mistral.
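A hedged sketch of the configuration these notes describe: DSPy's OpenAI client pointed at a local Ollama server. The port, model name, and `api_base` are assumptions, not taken from the gist:

```python
# Settings for pointing DSPy's OpenAI client at Ollama's
# OpenAI-compatible endpoint. All values here are assumptions.
OLLAMA_OPENAI_SETTINGS = dict(
    model="mistral",                        # any model Ollama has pulled
    api_base="http://localhost:11434/v1/",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                       # Ollama ignores the key, but the client wants one
    model_type="chat",                      # required, per the note above
    stop="\n\n",                            # keeps mistral from generating past the answer
)


def configure_dspy_for_ollama():
    import dspy  # deferred so this sketch imports without dspy installed

    lm = dspy.OpenAI(**OLLAMA_OPENAI_SETTINGS)
    dspy.settings.configure(lm=lm)
    return lm
```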
@jrknox1977
jrknox1977 / dspy_ollama_1-24.py
Last active February 14, 2024 00:16
Using ollama 1.24 OpenAI API with DSPY - WORKAROUND
# install DSPy: pip install dspy
import dspy
# Ollama 1.24 is now compatible with OpenAI APIs,
# but DSPy has hard-coded some logic around the names of the OpenAI models.
# This is a workaround for now:
# you have to create a custom Ollama model named 'gpt-3.5-turbo' (or at least 'gpt-3.5').
#
# Here is the DSPy code that is causing the issue:
#
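A hedged sketch of the workaround described above. One way to give a local model an OpenAI name is to alias it first (`ollama cp mistral gpt-3.5-turbo`), so DSPy's hard-coded model-name checks pass; the `api_base` and port are assumptions:

```python
# Workaround sketch: DSPy's name checks expect an OpenAI model, so the
# local Ollama model is aliased as 'gpt-3.5-turbo' beforehand, e.g.:
#   ollama cp mistral gpt-3.5-turbo
# Endpoint details below are assumptions.
def configure_dspy_workaround():
    import dspy  # deferred so this sketch imports without dspy installed

    lm = dspy.OpenAI(
        model="gpt-3.5-turbo",                  # the aliased local model
        api_base="http://localhost:11434/v1/",  # Ollama's OpenAI-compatible endpoint
        api_key="ollama",                       # ignored by Ollama
        model_type="chat",
    )
    dspy.settings.configure(lm=lm)
    return lm
```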
@jrknox1977
jrknox1977 / dspy_tgi_mistral.py
Last active April 14, 2024 08:42
DSPy - using TGI for local model
# install DSPy: pip install dspy
import dspy
# This sets up the language model for DSPy; in this case we are using Mistral 7B through TGI (Text Generation Inference from Hugging Face).
mistral = dspy.HFClientTGI(model='mistralai/Mistral-7B-v0.1', port=8080, url='http://localhost')
# This sets the language model for DSPy.
dspy.settings.configure(lm=mistral)
# This is not required but it helps to understand what is happening
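A hedged guess at the kind of optional sanity check the last comment refers to: calling the LM directly and printing the last prompt/completion pair, which does help to see what is happening under the hood. The prompt text is illustrative:

```python
# Optional sanity check against the TGI-backed model configured above.
# DSPy LMs are directly callable, and `inspect_history` prints the most
# recent prompt and completion. The prompt below is an assumption.
def sanity_check(lm):
    reply = lm("Briefly: what is TGI?")  # raw completion, no DSPy module needed
    lm.inspect_history(n=1)              # show the exact prompt that was sent
    return reply
```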
@jrknox1977
jrknox1977 / dspy_chain_of_thought_example.py
Created February 5, 2024 18:56
DSPy example with chain of thought.
# install DSPy: pip install dspy
import dspy
# This sets up the language model for DSPy; in this case we are using GPT-3.5-turbo.
turbo = dspy.OpenAI(model='gpt-3.5-turbo')
# This sets the language model for DSPy. This must be set or you get an error that is not helpful:
# --> temperature = lm.kwargs['temperature'] if temperature is None else temperature
# --> AttributeError: 'NoneType' object has no attribute 'kwargs'
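A hedged sketch of where a chain-of-thought example like this typically goes: a small signature wrapped in `dspy.ChainOfThought`, which makes the model produce a reasoning step before the answer. The signature and field names are illustrative:

```python
# Chain-of-thought sketch; signature and field names are assumptions.
def build_cot_qa():
    import dspy  # deferred so this sketch imports without dspy installed

    turbo = dspy.OpenAI(model="gpt-3.5-turbo")
    dspy.settings.configure(lm=turbo)  # must be set, per the note above

    class BasicQA(dspy.Signature):
        """Answer questions with short factoid answers."""

        question = dspy.InputField()
        answer = dspy.OutputField(desc="often between 1 and 5 words")

    # ChainOfThought adds a rationale step before the answer field.
    return dspy.ChainOfThought(BasicQA)
```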
@jrknox1977
jrknox1977 / basic_qa.py
Last active April 14, 2024 08:43
Simple DSPY example of BasicQA
# install DSPy: pip install dspy
import dspy
# This sets up the language model for DSPy; in this case we are using GPT-3.5-turbo.
turbo = dspy.OpenAI(model='gpt-3.5-turbo')
# This sets the language model for DSPy. This must be set or you get an error that is not helpful:
# --> temperature = lm.kwargs['temperature'] if temperature is None else temperature
# --> AttributeError: 'NoneType' object has no attribute 'kwargs'
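A hedged sketch of how a BasicQA example like this usually continues: a signature passed to `dspy.Predict`, then called with a question. The signature, field names, and example question are illustrative:

```python
# BasicQA sketch; signature, field names, and question are assumptions.
def build_basic_qa():
    import dspy  # deferred so this sketch imports without dspy installed

    turbo = dspy.OpenAI(model="gpt-3.5-turbo")
    dspy.settings.configure(lm=turbo)  # must be set, per the note above

    class BasicQA(dspy.Signature):
        """Answer questions with short factoid answers."""

        question = dspy.InputField()
        answer = dspy.OutputField(desc="often between 1 and 5 words")

    return dspy.Predict(BasicQA)


# Usage (requires a valid OPENAI_API_KEY):
#   qa = build_basic_qa()
#   response = qa(question="What is the capital of France?")
#   print(response.answer)
```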