-
-
Save mberman84/ea207e7d9e5f8c5f6a3252883ef16df3 to your computer and use it in GitHub Desktop.
1. # create new .py file with code found below | |
2. # install ollama | |
3. # install model you want “ollama run mistral” | |
4. conda create -n autogen python=3.11 | |
5. conda activate autogen | |
6. which python | |
7. python -m pip install pyautogen | |
7. ollama run mistral | |
8. ollama run codellama | |
9. # open new terminal | |
10. conda activate autogen | |
11. python -m pip install litellm | |
12. litellm --model ollama/mistral | |
13. # open new terminal | |
14. conda activate autogen | |
15. litellm --model ollama/codellama | |
### Code used: | |
import autogen | |
config_list_mistral = [ | |
{ | |
'base_url': "http://0.0.0.0:8000", | |
'api_key': "NULL" | |
} | |
] | |
config_list_codellama = [ | |
{ | |
'base_url': "http://0.0.0.0:25257", | |
'api_key': "NULL" | |
} | |
] | |
llm_config_mistral={ | |
"config_list": config_list_mistral, | |
} | |
llm_config_codellama={ | |
"config_list": config_list_codellama, | |
} | |
coder = autogen.AssistantAgent( | |
name="Coder", | |
llm_config=llm_config_codellama | |
) | |
user_proxy = autogen.UserProxyAgent( | |
name="user_proxy", | |
human_input_mode="NEVER", | |
max_consecutive_auto_reply=10, | |
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"), | |
code_execution_config={"work_dir": "web"}, | |
llm_config=llm_config_mistral, | |
system_message="""Reply TERMINATE if the task has been solved at full satisfaction. | |
Otherwise, reply CONTINUE, or the reason why the task is not solved yet.""" | |
) | |
task=""" | |
Write a python script to output numbers 1 to 100 and then the user_proxy agent should run the script | |
""" | |
user_proxy.initiate_chat(coder, message=task) |
Thanks @leolivier - the issue si @GitterDoneScott had thebloke/cuda11.8.0-ubuntu22.04-oneclick
, which indicates he got it all working with that version. I am having issues locating a different version of cuda from TheBloke.
Having said all that, isn't this model supposed to be working without CUDA or GPU? Isn't this a CPU only model and why is there a dependency on cuda in the first place?
I didn't try @GitterDoneScott gist but mine which runs on my laptop definitely uses Cuda
@leolivier ok, yes it looks like I need to expose the GPU to the docker container in order for any of this to work. Thanks!
user_proxy.initiate_chat(
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 556, in initiate_chat
self.send(self.generate_init_message(**context), recipient, silent=silent)
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 354, in send
recipient.receive(message, self, request_reply, silent)
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 489, in receive
self.send(reply, sender, silent=silent)
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 352, in send
valid = self._append_oai_message(message, "assistant", recipient)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 307, in _append_oai_message
oai_message["function_call"] = dict(oai_message["function_call"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable
I'm getting this error.
@leolivier ok, yes it looks like I need to expose the GPU to the docker container in order for any of this to work. Thanks!
Which OS are you using? I'm on Windows with WSL2 for Ollama and didn't have to install anything special except the latest NVIDIA drivers of my (low end) graphic card.
The detailed install documentation for Linux says you should Download and install CUDA and then run nvidia-smi
to check the install...
Nice tutorial!
Unfortunately I am getting this error:
Traceback (most recent call last):
File "/Users/victor/Desktop/Projects/WISHBAR/run.py", line 45, in
user_proxy.initiate_chat(coder, message=task)
File "/usr/local/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 672, in initiate_chat
self.send(self.generate_init_message(**context), recipient, silent=silent)
File "/usr/local/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 420, in send
recipient.receive(message, self, request_reply, silent)
File "/usr/local/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 578, in receive
reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1241, in generate_reply
final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 761, in generate_oai_reply
response = client.create(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/autogen/oai/client.py", line 266, in create
response = self._completions_create(client, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/autogen/oai/client.py", line 531, in _completions_create
response = completions.create(**params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/openai/_utils/_utils.py", line 298, in wrapper
raise TypeError(msg)
TypeError: Missing required arguments; Expected either ('messages' and 'model') or ('messages', 'model' and 'stream') arguments to be given
Does anyone know how to fix it?
Thanks in advance.
@N0kay You're supposed to put a model parameter in the config list, followed with the name.
Example:
config_list_dolphinmixtral = [
{
'base_url': "http://0.0.0.0:7577",
'api_key': "NULL",
'model': "ollama/dolphin-mixtral"
}
]
$ install ollma in my linux ubantu 22.04 machine
im getting
ollama list
NAME ID SIZE MODIFIED
codellama:latest 8fdf8f752f6e 3.8 GB 13 minutes ago
mistral:7b-instruct-v0.2-q4_K_S ba00d3a5239e 4.1 GB 2 hours ago
mistral:latest 61e88e884507 4.1 GB 45 minutes ago
mistral_akash:latest 73dc103bd530 4.1 GB 4 hours ago
(base) akash@Precision-3580:~/LLMstudio$
setup i followed
- ollama run mistral
- ollama run codellama
then using litellm
litellm --model ollama/mistral - http://0.0.0.0:8000
litellm --model ollama/codellama - http://0.0.0.0:46602
my code is :
## Testing out the main version
import autogen
config_list_1 = [
{
'base_url': "http://0.0.0.0:8000",
'api_key': "NULL",
'model': "ollama/mistral",
}
]
config_list_2 = [
{
'base_url': "http://0.0.0.0:46602",
'api_key': "NULL",
'model': "ollama/codellama",
}
]
llm_config_1={
"config_list": config_list_1,
}
llm_config_2={
"config_list": config_list_2,
}
coder = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config_2
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={"work_dir": "web"},
llm_config=llm_config_1,
system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
# task="""
# Write a python script to output numbers 1 to 100 and then the user_proxy agent should run the script
# """
task="""
Tell me a joke!
"""
user_proxy.initiate_chat(coder, message=task)
im getting error
user_proxy (to Coder):
Tell me a joke!
--------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/akash/LLMstudio/ollama/AutogenLangchainPDFchat-main/app_basic_autogen.py", line 53, in <module>
user_proxy.initiate_chat(coder, message=task)
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 698, in initiate_chat
self.send(self.generate_init_message(**context), recipient, silent=silent)
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 441, in send
recipient.receive(message, self, request_reply, silent)
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 599, in receive
reply = self.generate_reply(messages=self.chat_messages[sender], sender=sender)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 1298, in generate_reply
final, reply = reply_func(self, messages=messages, sender=sender, config=reply_func_tuple["config"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 813, in generate_oai_reply
response = client.create(
^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/oai/client.py", line 283, in create
response = self._completions_create(client, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/autogen/oai/client.py", line 548, in _completions_create
response = completions.create(**params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/openai/_utils/_utils.py", line 271, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 659, in create
return self._post(
^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/openai/_base_client.py", line 1180, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/openai/_base_client.py", line 869, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/akash/anaconda3/envs/autogen/lib/python3.11/site-packages/openai/_base_client.py", line 960, in _request
raise self._make_status_error_from_response(err.response) from None
openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Authentication Error, ', 'type': 'auth_error', 'param': 'None', 'code': 401}}
@akashAD98 You can notice that call is being made to openai (which should not happen because you're usingg local models).
import autogen
config_list_mistral = [
{
'base_url': "http://0.0.0.0:8000",
'api_key': "NULL",
'model': "ollama/mixtral"
}
]
config_list_codellama = [
{
'base_url': "http://0.0.0.0:3873",
'api_key': "NULL",
'model': "ollama/stable-code"
}
]
llm_config_mistral={
"config_list": config_list_mistral,
}
llm_config_codellama={
"config_list": config_list_codellama,
}
coder = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config_codellama
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={"work_dir": "web"},
llm_config=llm_config_mistral,
system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
task="""
Write a python script to output numbers 1 to 100 and then the user_proxy agent should run the script
"""
user_proxy.initiate_chat(coder, message=task)
This works, @xdLawless2 was right, the only thing that needed to be changed was adding the model
name in each config_list
import autogen config_list_mistral = [ { 'base_url': "http://0.0.0.0:8000", 'api_key': "NULL", 'model': "ollama/mixtral" } ] config_list_codellama = [ { 'base_url': "http://0.0.0.0:3873", 'api_key': "NULL", 'model': "ollama/stable-code" } ] llm_config_mistral={ "config_list": config_list_mistral, } llm_config_codellama={ "config_list": config_list_codellama, } coder = autogen.AssistantAgent( name="Coder", llm_config=llm_config_codellama ) user_proxy = autogen.UserProxyAgent( name="user_proxy", human_input_mode="NEVER", max_consecutive_auto_reply=10, is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"), code_execution_config={"work_dir": "web"}, llm_config=llm_config_mistral, system_message="""Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.""" ) task=""" Write a python script to output numbers 1 to 100 and then the user_proxy agent should run the script """ user_proxy.initiate_chat(coder, message=task)This works, @xdLawless2 was right, the only thing that needed to be changed was adding the
model
name in eachconfig_list
Thanks for taking the time to share this. I was about to post something similar with the complete code.
Much easier to troubleshoot a python file than AutoGen Studio... Good to have a working example without the UI overhead.
Since version 0.1.24, Ollama is compatible with OpenAI API and you don't even need litellm anymore.
You know just need to install Ollama and run ollama serve
then, in another terminal, pull the models you want to use eg ollama pull codellama
and ollama pull mistral
, then install autogen as before :
$ conda create -n autogen python=3.11
$ conda activate autogen
$ pip install pyautogen
Finally, the python code has to be slightly changed:
- The base URL must be changed from http://0.0.0.0:8000 to http://localhost:11434/v1 (which replace the litellm URL by the OpenAI compatible Ollama one)
- The code_execution_config in the autogen.UserProxyAgent() call must be changed to
code_execution_config={"work_dir": "web", "use_docker": False},
as use_docker has been changed to True by default in recent versions (otherwise you must install autogen as a docker container) - Also, the name of the model in the config has changed for me: if you pull eg codellama:7b-code-q4_K_M (ie a specific tag) or mistral (no tag is implicitely tag latest), then model must be "codellama:7b-code-q4_K_M" and "mistral:latest" in the config lists eg:
config_list_codellama = [
{
'base_url': "http://localhost:11434/v1",
'api_key': "fakekey",
'model': "codellama:7b-code-q4_K_M",
}
]
The whole code for me is:
import autogen
# direct access to Ollama since 0.1.24, compatible with OpenAI /chat/completions
BASE_URL="http://localhost:11434/v1"
config_list_mistral = [
{
'base_url': BASE_URL,
'api_key': "fakekey",
'model': "mistral:latest",
}
]
config_list_codellama = [
{
'base_url': BASE_URL,
'api_key': "fakekey",
'model': "codellama:7b-code-q4_K_M",
}
]
llm_config_mistral={
"config_list": config_list_mistral,
}
llm_config_codellama={
"config_list": config_list_codellama,
}
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={"work_dir": "web", "use_docker": False},
llm_config=llm_config_mistral,
system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
coder = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config_codellama
)
task="""
Write a python script that lists the number from 1 to 100
"""
user_proxy.initiate_chat(coder, message=task)
@leolivier This works on my mac but not on my ubuntu 22.04 server. Don't know why there would be any difference. I kept getting the same error @akashAD98 was getting. It kept trying to use openai credentials. I have ollama serve running in one terminal and calling the the python script in another terminal. I can run the models and get a list of the models, but I can't get autogen to accept the ollama endpoint without the openai creds. thoughts?
@Tedfulk are you sure you're using the same version of Ollama on your Mac and your Ubuntu ? I'm using it on WSL Ubuntu so not real Ubuntu but I don't think it would change this behavior...
Here's the groupchat version of the script
import autogen
from autogen.agentchat.contrib.retrieve_assistant_agent import RetrieveAssistantAgent
from autogen.agentchat.contrib.retrieve_user_proxy_agent import RetrieveUserProxyAgent
from autogen import AssistantAgent, UserProxyAgent, config_list_from_json, GroupChat, GroupChatManager
import chromadb
import os
from autogen import GroupChat
import json
from autogen.retrieve_utils import TEXT_FORMATS
######################################################################
config_list_mistral = [
{
"base_url": "http://localhost:35666/v1",
"api_key": "sk-111111111111",
"model": "TheBloke/Llama-2-7B-32K-Instruct-GGUF"
}
]
config_list_codellama = [
{
"base_url": "http://localhost:8000/v1",
"api_key": "sk-111111111111",
"model": "TheBloke/Llama-2-7B-32K-Instruct-GGUF"
}
]
######################################################################
llm_config_mistral={
"config_list": config_list_mistral,
}
llm_config_codellama={
"config_list": config_list_codellama,
}
######################################################################
llm_config_mistral = llm_config_mistral
llm_config_codellama = llm_config_codellama
######################################################################
assistant = autogen.AssistantAgent(
name="Assistant",
llm_config=llm_config_mistral,
# code_execution=False # Disable code execution entirely
code_execution_config={"work_dir":"coding", "use_docker":False}
)
coder = autogen.AssistantAgent(
name="Coder",
llm_config=llm_config_codellama,
# code_execution=False # Disable code execution entirely
code_execution_config={"work_dir":"coding", "use_docker":False}
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="NEVER",
#human_input_mode="TERMINATE",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={"work_dir": "coding", "use_docker":False},
llm_config=llm_config_mistral,
system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
task="""
Write a python script to output numbers 1 to 100 and then the user_proxy agent should run the script
"""
#task="""
#Write a script to output numbers 1 to X where X is a random number generated by the user_proxy agent
#"""
#user_proxy.initiate_chat(coder, message=task) # Simple chat with coder
groupchat = autogen.GroupChat(agents=[user_proxy, coder, assistant], messages=[], max_round=12)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config_mistral)
user_proxy.initiate_chat(manager, message=task)
self.send(self.generate_init_message(**context), recipient, silent=silent)
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 354, in send
recipient.receive(message, self, request_reply, silent)
File "/root/miniconda3/lib/python3.11/site-packages/autogen/agentchat/conversable_agent.py", line 489, in receive
self.send(reply, sender, silent=silent)
@N0kay You're supposed to put a model parameter in the config list, followed with the name.
Example:
config_list_dolphinmixtral = [ { 'base_url': "http://0.0.0.0:7577", 'api_key': "NULL", 'model': "ollama/dolphin-mixtral" } ]
Since the URL actually controls the model, this value can be "NULL" similar to the api_key value. Expected but not used.
`import autogen
config_list = [
{
'base_url': "http://0.0.0.0:4000",
'api_key' : "NULL"
}
]
llm_config = {
'config_list': config_list,
}
assistant = autogen.AssistantAgent(
name = "Assistant",
llm_config = llm_config
)
user_proxy = autogen.UserProxyAgent(
name="user_proxy",
human_input_mode="TERMINATE",
max_consecutive_auto_reply=10,
is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
code_execution_config={"work_dir": "web", "use_docker" : False},
llm_config=llm_config,
system_message="""Reply TERMINATE if the task has been solved at full satisfaction.
Otherwise, reply CONTINUE, or the reason why the task is not solved yet."""
)
task="""
Tell me a joke
"""
user_proxy.initiate_chat(assistant, message=task)`
I have this as a one assistant and proxy code and I get this error if anyone could help
TypeError: Missing required arguments; Expected either ('messages' and 'model') or ('messages', 'model' and 'stream') arguments to be given
import autogen # direct access to Ollama since 0.1.24, compatible with OpenAI /chat/completions BASE_URL="http://localhost:11434/v1" config_list_mistral = [ { 'base_url': BASE_URL, 'api_key': "fakekey", 'model': "mistral:latest", } ] config_list_codellama = [ { 'base_url': BASE_URL, 'api_key': "fakekey", 'model': "codellama:7b-code-q4_K_M", } ] llm_config_mistral={ "config_list": config_list_mistral, } llm_config_codellama={ "config_list": config_list_codellama, } user_proxy = autogen.UserProxyAgent( name="user_proxy", human_input_mode="NEVER", max_consecutive_auto_reply=10, is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"), code_execution_config={"work_dir": "web", "use_docker": False}, llm_config=llm_config_mistral, system_message="""Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.""" ) coder = autogen.AssistantAgent( name="Coder", llm_config=llm_config_codellama ) task=""" Write a python script that lists the number from 1 to 100 """ user_proxy.initiate_chat(coder, message=task)
I found this code online that seems to be correct and directly addresses my needs. Thank you so much to @leolivier for sharing it.
It says: CUDA driver version is insufficient for CUDA runtime version
Did you try upgrading your driver?