human (Aug 29, 2024, 08:07 PM)
I hit Ctrl+A in Firefox on https://platform.openai.com/docs/assistants/tools/file-search and copied and pasted the raw text into Claude - you can see the text I pasted here: https://gist.github.com/simonw/f9775727dcde2edc0f9f15bbda0b4d42
human (Aug 29, 2024, 08:09 PM)
Based on this documentation, write me a Python CLI app (using the Click CLi library) with the following features:
openai-file-chat add-files name-of-vector-store *.pdf *.txt
This creates a new vector store called name-of-vector-store
and adds all the files passed to the command to that store.
openai-file-chat name-of-vector-store1 name-of-vector-store2 ...
This starts an interactive chat with the user, where any time they hit enter the question is answered by a chat assistant using the specified vector stores.
assistant (Aug 29, 2024, 08:09 PM)
Certainly! I'll create a Python CLI app using the Click library that implements the features you've described. Here's the implementation:
import click
import os
from openai import OpenAI
from typing import List
client = OpenAI()
@click.group()
def cli():
"""OpenAI File Chat CLI"""
pass
@cli.command()
@click.argument('vector_store_name')
@click.argument('files', nargs=-1, type=click.Path(exists=True))
def add_files(vector_store_name: str, files: List[str]):
"""Create a new vector store and add files to it."""
try:
# Create a new vector store
vector_store = client.beta.vector_stores.create(name=vector_store_name)
click.echo(f"Created vector store: {vector_store_name}")
# Upload files and add them to the vector store
file_ids = []
for file_path in files:
with open(file_path, "rb") as file:
uploaded_file = client.files.create(file=file, purpose="assistants")
file_ids.append(uploaded_file.id)
# Add files to the vector store
file_batch = client.beta.vector_stores.file_batches.create_and_poll(
vector_store_id=vector_store.id,
file_ids=file_ids
)
click.echo(f"Added {len(file_ids)} files to the vector store.")
click.echo(f"File batch status: {file_batch.status}")
click.echo(f"File counts: {file_batch.file_counts}")
except Exception as e:
click.echo(f"An error occurred: {str(e)}", err=True)
@cli.command()
@click.argument('vector_store_names', nargs=-1)
def chat(vector_store_names: List[str]):
"""Start an interactive chat using specified vector stores."""
try:
# Create an assistant with file search enabled
assistant = client.beta.assistants.create(
name="File Chat Assistant",
instructions="You are a helpful assistant. Use the provided vector stores to answer user questions.",
model="gpt-4o",
tools=[{"type": "file_search"}],
tool_resources={
"file_search": {
"vector_store_ids": vector_store_names
}
}
)
# Create a thread
thread = client.beta.threads.create()
click.echo("Chat started. Type your questions and press Enter. Type 'exit' to quit.")
while True:
user_input = click.prompt("You")
if user_input.lower() == 'exit':
break
# Add the user's message to the thread
client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content=user_input
)
# Create a run
run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant.id
)
# Wait for the run to complete
while run.status != "completed":
run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
# Retrieve and display the assistant's response
messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
if message.role == "assistant":
click.echo(f"Assistant: {message.content[0].text.value}")
break
except Exception as e:
click.echo(f"An error occurred: {str(e)}", err=True)
if __name__ == '__main__':
cli()
This Python CLI app implements the features you requested using the Click library and the OpenAI API. Here's a breakdown of the main components:
- The
add_files
command creates a new vector store with the given name and adds the specified files to it. - The
chat
command starts an interactive chat session using the specified vector stores.
To use this CLI app, you would save it as a Python file (e.g., openai_file_chat.py
) and run it from the command line. Here are some example commands:
-
To add files to a new vector store:
python openai_file_chat.py add-files my-vector-store *.pdf *.txt
-
To start a chat session using one or more vector stores:
python openai_file_chat.py chat vector-store1 vector-store2
Note that you'll need to have the OpenAI Python library installed (pip install openai
) and set up your OpenAI API key as an environment variable (OPENAI_API_KEY
) for this script to work.
This implementation assumes that the vector store IDs are the same as the names you provide when creating them. If this is not the case in the actual API, you may need to modify the chat
function to retrieve the correct vector store IDs.
Also, error handling is basic in this implementation. You might want to add more robust error handling and user feedback for a production-ready application.
Would you like me to explain any part of the code in more detail or make any modifications?
human (Aug 29, 2024, 08:12 PM)
output just the Python code to add a list-stores command
assistant (Aug 29, 2024, 08:12 PM)
Certainly! Here's the Python code to add a list-stores
command to the existing CLI app:
import click
import os
from openai import OpenAI
from typing import List
client = OpenAI()
@click.group()
def cli():
"""OpenAI File Chat CLI"""
pass
# ... (previous code remains the same)
@cli.command()
def list_stores():
"""List all vector stores."""
try:
vector_stores = client.beta.vector_stores.list()
if vector_stores.data:
click.echo("Vector Stores:")
for store in vector_stores.data:
click.echo(f"- ID: {store.id}, Name: {store.name}")
else:
click.echo("No vector stores found.")
except Exception as e:
click.echo(f"An error occurred: {str(e)}", err=True)
if __name__ == '__main__':
cli()
This code adds a new list_stores
command to the CLI app. Here's what it does:
- It defines a new function
list_stores()
decorated with@cli.command()
. - Inside the function, it uses the OpenAI client to list all vector stores.
- If vector stores are found, it prints each store's ID and name.
- If no vector stores are found, it prints a message saying so.
- If an error occurs, it prints an error message.
To use this new command, you would run:
python openai_file_chat.py list-stores
This will list all the vector stores associated with your OpenAI account.
Remember to place this new function alongside the other command functions in your CLI app, before the if __name__ == '__main__':
line.
human (Aug 29, 2024, 08:16 PM)
Modify the chat command such that after every response it does this thing:
run_step = client.beta.threads.runs.steps.retrieve(
thread_id="thread_abc123",
run_id="run_abc123",
step_id="step_abc123",
include=["step_details.tool_calls[*].file_search.results[*].content"]
)
And prints out the details
assistant (Aug 29, 2024, 08:16 PM)
Certainly! I'll modify the chat
command to retrieve and print out the details of the run step after every response. Here's the updated version of the chat
function:
import click
import os
from openai import OpenAI
from typing import List
import json
client = OpenAI()
# ... (previous code remains the same)
@cli.command()
@click.argument('vector_store_names', nargs=-1)
def chat(vector_store_names: List[str]):
"""Start an interactive chat using specified vector stores."""
try:
# Create an assistant with file search enabled
assistant = client.beta.assistants.create(
name="File Chat Assistant",
instructions="You are a helpful assistant. Use the provided vector stores to answer user questions.",
model="gpt-4o",
tools=[{"type": "file_search"}],
tool_resources={
"file_search": {
"vector_store_ids": vector_store_names
}
}
)
# Create a thread
thread = client.beta.threads.create()
click.echo("Chat started. Type your questions and press Enter. Type 'exit' to quit.")
while True:
user_input = click.prompt("You")
if user_input.lower() == 'exit':
break
# Add the user's message to the thread
client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content=user_input
)
# Create a run
run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant.id
)
# Wait for the run to complete
while run.status != "completed":
run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
# Retrieve and display the assistant's response
messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
if message.role == "assistant":
click.echo(f"Assistant: {message.content[0].text.value}")
break
# Retrieve and display the run step details
run_steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)
for step in run_steps.data:
if step.type == "tool_calls":
for tool_call in step.step_details.tool_calls:
if tool_call.type == "file_search":
run_step = client.beta.threads.runs.steps.retrieve(
thread_id=thread.id,
run_id=run.id,
step_id=step.id,
include=["step_details.tool_calls[*].file_search.results[*].content"]
)
click.echo("\nFile Search Results:")
click.echo(json.dumps(run_step.step_details.tool_calls[0].file_search.results, indent=2))
except Exception as e:
click.echo(f"An error occurred: {str(e)}", err=True)
# ... (rest of the code remains the same)
Here's a breakdown of the changes made to the chat
function:
-
We've added
import json
at the top of the file to help with pretty-printing the results. -
After retrieving and displaying the assistant's response, we now iterate through the run steps:
run_steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)
-
For each step that involves tool calls (specifically file searches), we retrieve the detailed step information:
run_step = client.beta.threads.runs.steps.retrieve( thread_id=thread.id, run_id=run.id, step_id=step.id, include=["step_details.tool_calls[*].file_search.results[*].content"] )
-
We then print out the file search results:
click.echo("\nFile Search Results:") click.echo(json.dumps(run_step.step_details.tool_calls[0].file_search.results, indent=2))
This modification will now display the file search results after each assistant response, giving you insight into which parts of the vector stores were used to generate the response.
Note that this will increase the verbosity of the output significantly. If you find it too verbose, you might want to add an option to toggle this detailed output on and off.
Also, keep in mind that not all runs may involve file searches, so the code checks for the presence of tool calls and file searches before attempting to retrieve and display this information.
human (Aug 29, 2024, 08:19 PM)
click.echo(json.dumps(run_step.step_details.tool_calls[0].file_search.results, indent=2, default=repr))
Those things are Pydantic objects, how can I make them print as pretty as possible?
assistant (Aug 29, 2024, 08:19 PM)
You're right, and I apologize for the oversight. Pydantic objects aren't directly JSON serializable, so we need a different approach to print them prettily. Here's an improved version that should work better with Pydantic objects:
import click
import os
from openai import OpenAI
from typing import List
from pydantic import BaseModel
from pprint import pformat
client = OpenAI()
def pretty_print_pydantic(obj):
if isinstance(obj, BaseModel):
return pformat(obj.dict(), indent=2, width=120)
elif isinstance(obj, list):
return pformat([pretty_print_pydantic(item) for item in obj], indent=2, width=120)
elif isinstance(obj, dict):
return pformat({k: pretty_print_pydantic(v) for k, v in obj.items()}, indent=2, width=120)
else:
return repr(obj)
# ... (previous code remains the same)
@cli.command()
@click.argument('vector_store_names', nargs=-1)
def chat(vector_store_names: List[str]):
"""Start an interactive chat using specified vector stores."""
try:
# Create an assistant with file search enabled
assistant = client.beta.assistants.create(
name="File Chat Assistant",
instructions="You are a helpful assistant. Use the provided vector stores to answer user questions.",
model="gpt-4o",
tools=[{"type": "file_search"}],
tool_resources={
"file_search": {
"vector_store_ids": vector_store_names
}
}
)
# Create a thread
thread = client.beta.threads.create()
click.echo("Chat started. Type your questions and press Enter. Type 'exit' to quit.")
while True:
user_input = click.prompt("You")
if user_input.lower() == 'exit':
break
# Add the user's message to the thread
client.beta.threads.messages.create(
thread_id=thread.id,
role="user",
content=user_input
)
# Create a run
run = client.beta.threads.runs.create(
thread_id=thread.id,
assistant_id=assistant.id
)
# Wait for the run to complete
while run.status != "completed":
run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
# Retrieve and display the assistant's response
messages = client.beta.threads.messages.list(thread_id=thread.id)
for message in messages.data:
if message.role == "assistant":
click.echo(f"Assistant: {message.content[0].text.value}")
break
# Retrieve and display the run step details
run_steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)
for step in run_steps.data:
if step.type == "tool_calls":
for tool_call in step.step_details.tool_calls:
if tool_call.type == "file_search":
run_step = client.beta.threads.runs.steps.retrieve(
thread_id=thread.id,
run_id=run.id,
step_id=step.id,
include=["step_details.tool_calls[*].file_search.results[*].content"]
)
click.echo("\nFile Search Results:")
click.echo(pretty_print_pydantic(run_step.step_details.tool_calls[0].file_search.results))
except Exception as e:
click.echo(f"An error occurred: {str(e)}", err=True)
# ... (rest of the code remains the same)
Here are the key changes:
-
We've added imports for
BaseModel
from Pydantic andpformat
from thepprint
module:from pydantic import BaseModel from pprint import pformat
-
We've defined a new
pretty_print_pydantic
function that recursively handles Pydantic objects, lists, and dictionaries:def pretty_print_pydantic(obj): if isinstance(obj, BaseModel): return pformat(obj.dict(), indent=2, width=120) elif isinstance(obj, list): return pformat([pretty_print_pydantic(item) for item in obj], indent=2, width=120) elif isinstance(obj, dict): return pformat({k: pretty_print_pydantic(v) for k, v in obj.items()}, indent=2, width=120) else: return repr(obj)
-
In the
chat
function, we've replaced thejson.dumps
call with our newpretty_print_pydantic
function:click.echo(pretty_print_pydantic(run_step.step_details.tool_calls[0].file_search.results))
This approach should provide a much prettier and more readable output for the Pydantic objects returned by the OpenAI API. It will handle nested Pydantic models, lists, and dictionaries, and fall back to the repr
for any other types of objects.
The pformat
function from the pprint
module is used to create a formatted string representation of the data, which should be more readable than a simple str
or repr
output.
If you find that certain fields are still not displaying as you'd like, you might need to adjust the pretty_print_pydantic
function to handle those specific cases. But this should provide a good starting point for prettily printing the Pydantic objects from the OpenAI API.