{
"cells": [
{
"cell_type": "markdown",
"id": "331f5d8c-3041-4d69-a957-93a79b335562",
"metadata": {
"id": "331f5d8c-3041-4d69-a957-93a79b335562"
},
"source": [
"# Query Transformation Techniques - llamaindex\n",
"\n",
"A user query can be transformed and decomposed in many ways before being executed as part of a RAG query engine, agent, or any other pipeline.\n",
"\n",
"In this guide we show you different ways to transform, decompose queries, and find the set of relevant tools. Each technique might be applicable for different use cases!\n",
"\n",
"Here are the different query transformations:\n",
"\n",
"1. **Query-Rewriting**: Keep the tools, but rewrite the query in a variety of different ways to execute against the same tools.\n",
"2. **Sub-Questions**: Decompose queries into multiple sub-questions over different tools (identified by their metadata).\n",
"3. **Multi-Step Query Decomposition** : A multi-step query engine that's able to decompose a complex query into sequential subquestions\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "8c100403",
"metadata": {
"id": "8c100403",
"outputId": "456fa381-a3a2-40ba-82ac-d84d82c51ec5",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Requirement already satisfied: llama-index-question-gen-openai in /usr/local/lib/python3.10/dist-packages (0.1.3)\n",
"Requirement already satisfied: llama-index-core<0.11.0,>=0.10.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-question-gen-openai) (0.10.13)\n",
"Requirement already satisfied: llama-index-llms-openai<0.2.0,>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-question-gen-openai) (0.1.6)\n",
"Requirement already satisfied: llama-index-program-openai<0.2.0,>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-question-gen-openai) (0.1.4)\n",
"Requirement already satisfied: PyYAML>=6.0.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (6.0.1)\n",
"Requirement already satisfied: SQLAlchemy[asyncio]>=1.4.49 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.0.27)\n",
"Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.9.3)\n",
"Requirement already satisfied: dataclasses-json in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.6.4)\n",
"Requirement already satisfied: deprecated>=1.2.9.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.2.14)\n",
"Requirement already satisfied: dirtyjson<2.0.0,>=1.0.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.0.8)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2023.6.0)\n",
"Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.27.0)\n",
"Requirement already satisfied: llamaindex-py-client<0.2.0,>=0.1.13 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.1.13)\n",
"Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.6.0)\n",
"Requirement already satisfied: networkx>=3.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.2.1)\n",
"Requirement already satisfied: nltk<4.0.0,>=3.8.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.8.1)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.25.2)\n",
"Requirement already satisfied: openai>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.12.0)\n",
"Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.5.3)\n",
"Requirement already satisfied: pillow>=9.0.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (9.4.0)\n",
"Requirement already satisfied: requests>=2.31.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.31.0)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (8.2.3)\n",
"Requirement already satisfied: tiktoken>=0.3.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.6.0)\n",
"Requirement already satisfied: tqdm<5.0.0,>=4.66.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (4.66.2)\n",
"Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (4.9.0)\n",
"Requirement already satisfied: typing-inspect>=0.8.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.9.0)\n",
"Requirement already satisfied: llama-index-agent-openai<0.2.0,>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-program-openai<0.2.0,>=0.1.1->llama-index-question-gen-openai) (0.1.5)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.3.1)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (23.2.0)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.4.1)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (6.0.5)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.9.4)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (4.0.3)\n",
"Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.10/dist-packages (from deprecated>=1.2.9.3->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.14.1)\n",
"Requirement already satisfied: pydantic>=1.10 in /usr/local/lib/python3.10/dist-packages (from llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.6.1)\n",
"Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.7.1)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2024.2.2)\n",
"Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.0.4)\n",
"Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.6)\n",
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.3.0)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.14.0)\n",
"Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (8.1.7)\n",
"Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.3.2)\n",
"Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2023.12.25)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai>=1.1.0->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.7.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.3.2)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.0.7)\n",
"Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.0.3)\n",
"Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect>=0.8.0->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.0.0)\n",
"Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (3.21.0)\n",
"Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.8.2)\n",
"Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2023.4)\n",
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.2.0)\n",
"Requirement already satisfied: packaging>=17.0 in /usr/local/lib/python3.10/dist-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (23.2)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.16.2 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (2.16.2)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-question-gen-openai) (1.16.0)\n",
"Requirement already satisfied: llama-index-llms-openai in /usr/local/lib/python3.10/dist-packages (0.1.6)\n",
"Requirement already satisfied: llama-index-core<0.11.0,>=0.10.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-llms-openai) (0.10.13)\n",
"Requirement already satisfied: PyYAML>=6.0.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (6.0.1)\n",
"Requirement already satisfied: SQLAlchemy[asyncio]>=1.4.49 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.0.27)\n",
"Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.9.3)\n",
"Requirement already satisfied: dataclasses-json in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.6.4)\n",
"Requirement already satisfied: deprecated>=1.2.9.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.2.14)\n",
"Requirement already satisfied: dirtyjson<2.0.0,>=1.0.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.0.8)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2023.6.0)\n",
"Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.27.0)\n",
"Requirement already satisfied: llamaindex-py-client<0.2.0,>=0.1.13 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.1.13)\n",
"Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.6.0)\n",
"Requirement already satisfied: networkx>=3.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.2.1)\n",
"Requirement already satisfied: nltk<4.0.0,>=3.8.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.8.1)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.25.2)\n",
"Requirement already satisfied: openai>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.12.0)\n",
"Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.5.3)\n",
"Requirement already satisfied: pillow>=9.0.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (9.4.0)\n",
"Requirement already satisfied: requests>=2.31.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.31.0)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (8.2.3)\n",
"Requirement already satisfied: tiktoken>=0.3.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.6.0)\n",
"Requirement already satisfied: tqdm<5.0.0,>=4.66.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (4.66.2)\n",
"Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (4.9.0)\n",
"Requirement already satisfied: typing-inspect>=0.8.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.9.0)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.3.1)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (23.2.0)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.4.1)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (6.0.5)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.9.4)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (4.0.3)\n",
"Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.10/dist-packages (from deprecated>=1.2.9.3->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.14.1)\n",
"Requirement already satisfied: pydantic>=1.10 in /usr/local/lib/python3.10/dist-packages (from llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.6.1)\n",
"Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.7.1)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2024.2.2)\n",
"Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.0.4)\n",
"Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.6)\n",
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.3.0)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.14.0)\n",
"Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (8.1.7)\n",
"Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.3.2)\n",
"Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2023.12.25)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai>=1.1.0->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.7.0)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.3.2)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.0.7)\n",
"Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.0.3)\n",
"Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect>=0.8.0->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.0.0)\n",
"Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (3.21.0)\n",
"Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.8.2)\n",
"Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2023.4)\n",
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.2.0)\n",
"Requirement already satisfied: packaging>=17.0 in /usr/local/lib/python3.10/dist-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (23.2)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.16.2 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (2.16.2)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->llama-index-core<0.11.0,>=0.10.1->llama-index-llms-openai) (1.16.0)\n",
"Requirement already satisfied: llama-index in /usr/local/lib/python3.10/dist-packages (0.10.13.post1)\n",
"Requirement already satisfied: llama-index-agent-openai<0.2.0,>=0.1.4 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.5)\n",
"Requirement already satisfied: llama-index-cli<0.2.0,>=0.1.2 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.5)\n",
"Requirement already satisfied: llama-index-core<0.11.0,>=0.10.13 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.10.13)\n",
"Requirement already satisfied: llama-index-embeddings-openai<0.2.0,>=0.1.5 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.6)\n",
"Requirement already satisfied: llama-index-indices-managed-llama-cloud<0.2.0,>=0.1.2 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.3)\n",
"Requirement already satisfied: llama-index-legacy<0.10.0,>=0.9.48 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.9.48)\n",
"Requirement already satisfied: llama-index-llms-openai<0.2.0,>=0.1.5 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.6)\n",
"Requirement already satisfied: llama-index-multi-modal-llms-openai<0.2.0,>=0.1.3 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.4)\n",
"Requirement already satisfied: llama-index-program-openai<0.2.0,>=0.1.3 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.4)\n",
"Requirement already satisfied: llama-index-question-gen-openai<0.2.0,>=0.1.2 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.3)\n",
"Requirement already satisfied: llama-index-readers-file<0.2.0,>=0.1.4 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.6)\n",
"Requirement already satisfied: llama-index-readers-llama-parse<0.2.0,>=0.1.2 in /usr/local/lib/python3.10/dist-packages (from llama-index) (0.1.3)\n",
"Requirement already satisfied: llama-index-vector-stores-chroma<0.2.0,>=0.1.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.1.4)\n",
"Requirement already satisfied: PyYAML>=6.0.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (6.0.1)\n",
"Requirement already satisfied: SQLAlchemy[asyncio]>=1.4.49 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (2.0.27)\n",
"Requirement already satisfied: aiohttp<4.0.0,>=3.8.6 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (3.9.3)\n",
"Requirement already satisfied: dataclasses-json in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (0.6.4)\n",
"Requirement already satisfied: deprecated>=1.2.9.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.2.14)\n",
"Requirement already satisfied: dirtyjson<2.0.0,>=1.0.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.0.8)\n",
"Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (2023.6.0)\n",
"Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (0.27.0)\n",
"Requirement already satisfied: llamaindex-py-client<0.2.0,>=0.1.13 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (0.1.13)\n",
"Requirement already satisfied: nest-asyncio<2.0.0,>=1.5.8 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.6.0)\n",
"Requirement already satisfied: networkx>=3.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (3.2.1)\n",
"Requirement already satisfied: nltk<4.0.0,>=3.8.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (3.8.1)\n",
"Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.25.2)\n",
"Requirement already satisfied: openai>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.12.0)\n",
"Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (1.5.3)\n",
"Requirement already satisfied: pillow>=9.0.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (9.4.0)\n",
"Requirement already satisfied: requests>=2.31.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (2.31.0)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.2.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (8.2.3)\n",
"Requirement already satisfied: tiktoken>=0.3.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (0.6.0)\n",
"Requirement already satisfied: tqdm<5.0.0,>=4.66.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (4.66.2)\n",
"Requirement already satisfied: typing-extensions>=4.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (4.9.0)\n",
"Requirement already satisfied: typing-inspect>=0.8.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-core<0.11.0,>=0.10.13->llama-index) (0.9.0)\n",
"Requirement already satisfied: beautifulsoup4<5.0.0,>=4.12.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (4.12.3)\n",
"Requirement already satisfied: bs4<0.0.3,>=0.0.2 in /usr/local/lib/python3.10/dist-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (0.0.2)\n",
"Requirement already satisfied: pymupdf<2.0.0,>=1.23.21 in /usr/local/lib/python3.10/dist-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (1.23.25)\n",
"Requirement already satisfied: pypdf<5.0.0,>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (4.0.2)\n",
"Requirement already satisfied: llama-parse<0.4.0,>=0.3.3 in /usr/local/lib/python3.10/dist-packages (from llama-index-readers-llama-parse<0.2.0,>=0.1.2->llama-index) (0.3.4)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.3.1)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (23.2.0)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.4.1)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (6.0.5)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.9.4)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.6->llama-index-core<0.11.0,>=0.10.13->llama-index) (4.0.3)\n",
"Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-packages (from beautifulsoup4<5.0.0,>=4.12.3->llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (2.5)\n",
"Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.10/dist-packages (from deprecated>=1.2.9.3->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.14.1)\n",
"Requirement already satisfied: chromadb<0.5.0,>=0.4.22 in /usr/local/lib/python3.10/dist-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.4.23)\n",
"Requirement already satisfied: onnxruntime<2.0.0,>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.17.1)\n",
"Requirement already satisfied: tokenizers<0.16.0,>=0.15.1 in /usr/local/lib/python3.10/dist-packages (from llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.15.2)\n",
"Requirement already satisfied: pydantic>=1.10 in /usr/local/lib/python3.10/dist-packages (from llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.13->llama-index) (2.6.1)\n",
"Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (3.7.1)\n",
"Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (2024.2.2)\n",
"Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.0.4)\n",
"Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (3.6)\n",
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.3.0)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (0.14.0)\n",
"Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.13->llama-index) (8.1.7)\n",
"Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.3.2)\n",
"Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk<4.0.0,>=3.8.1->llama-index-core<0.11.0,>=0.10.13->llama-index) (2023.12.25)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai>=1.1.0->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.7.0)\n",
"Requirement already satisfied: PyMuPDFb==1.23.22 in /usr/local/lib/python3.10/dist-packages (from pymupdf<2.0.0,>=1.23.21->llama-index-readers-file<0.2.0,>=0.1.4->llama-index) (1.23.22)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.13->llama-index) (3.3.2)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.31.0->llama-index-core<0.11.0,>=0.10.13->llama-index) (2.0.7)\n",
"Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy[asyncio]>=1.4.49->llama-index-core<0.11.0,>=0.10.13->llama-index) (3.0.3)\n",
"Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect>=0.8.0->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.0.0)\n",
"Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json->llama-index-core<0.11.0,>=0.10.13->llama-index) (3.21.0)\n",
"Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.13->llama-index) (2.8.2)\n",
"Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-index-core<0.11.0,>=0.10.13->llama-index) (2023.4)\n",
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.2.0)\n",
"Requirement already satisfied: build>=1.0.3 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.0.3)\n",
"Requirement already satisfied: chroma-hnswlib==0.7.3 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.7.3)\n",
"Requirement already satisfied: fastapi>=0.95.2 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.110.0)\n",
"Requirement already satisfied: uvicorn[standard]>=0.18.3 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.27.1)\n",
"Requirement already satisfied: posthog>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.4.2)\n",
"Requirement already satisfied: pulsar-client>=3.1.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.4.0)\n",
"Requirement already satisfied: opentelemetry-api>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.23.0)\n",
"Requirement already satisfied: opentelemetry-exporter-otlp-proto-grpc>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.23.0)\n",
"Requirement already satisfied: opentelemetry-instrumentation-fastapi>=0.41b0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.44b0)\n",
"Requirement already satisfied: opentelemetry-sdk>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.23.0)\n",
"Requirement already satisfied: pypika>=0.48.9 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.48.9)\n",
"Requirement already satisfied: overrides>=7.3.1 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (7.7.0)\n",
"Requirement already satisfied: importlib-resources in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (6.1.1)\n",
"Requirement already satisfied: grpcio>=1.58.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.60.1)\n",
"Requirement already satisfied: bcrypt>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (4.1.2)\n",
"Requirement already satisfied: typer>=0.9.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.9.0)\n",
"Requirement already satisfied: kubernetes>=28.1.0 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (29.0.0)\n",
"Requirement already satisfied: mmh3>=4.0.1 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (4.1.0)\n",
"Requirement already satisfied: orjson>=3.9.12 in /usr/local/lib/python3.10/dist-packages (from chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.9.15)\n",
"Requirement already satisfied: packaging>=17.0 in /usr/local/lib/python3.10/dist-packages (from marshmallow<4.0.0,>=3.18.0->dataclasses-json->llama-index-core<0.11.0,>=0.10.13->llama-index) (23.2)\n",
"Requirement already satisfied: coloredlogs in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (15.0.1)\n",
"Requirement already satisfied: flatbuffers in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (23.5.26)\n",
"Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.20.3)\n",
"Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.12)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.13->llama-index) (0.6.0)\n",
"Requirement already satisfied: pydantic-core==2.16.2 in /usr/local/lib/python3.10/dist-packages (from pydantic>=1.10->llamaindex-py-client<0.2.0,>=0.1.13->llama-index-core<0.11.0,>=0.10.13->llama-index) (2.16.2)\n",
"Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->llama-index-core<0.11.0,>=0.10.13->llama-index) (1.16.0)\n",
"Requirement already satisfied: huggingface_hub<1.0,>=0.16.4 in /usr/local/lib/python3.10/dist-packages (from tokenizers<0.16.0,>=0.15.1->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.20.3)\n",
"Requirement already satisfied: pyproject_hooks in /usr/local/lib/python3.10/dist-packages (from build>=1.0.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.0.0)\n",
"Requirement already satisfied: tomli>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from build>=1.0.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (2.0.1)\n",
"Requirement already satisfied: starlette<0.37.0,>=0.36.3 in /usr/local/lib/python3.10/dist-packages (from fastapi>=0.95.2->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.36.3)\n",
"Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface_hub<1.0,>=0.16.4->tokenizers<0.16.0,>=0.15.1->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.13.1)\n",
"Requirement already satisfied: google-auth>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (2.27.0)\n",
"Requirement already satisfied: websocket-client!=0.40.0,!=0.41.*,!=0.42.*,>=0.32.0 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.7.0)\n",
"Requirement already satisfied: requests-oauthlib in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.3.1)\n",
"Requirement already satisfied: oauthlib>=3.2.2 in /usr/local/lib/python3.10/dist-packages (from kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.2.2)\n",
"Requirement already satisfied: importlib-metadata<7.0,>=6.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-api>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (6.11.0)\n",
"Requirement already satisfied: googleapis-common-protos~=1.52 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.62.0)\n",
"Requirement already satisfied: opentelemetry-exporter-otlp-proto-common==1.23.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.23.0)\n",
"Requirement already satisfied: opentelemetry-proto==1.23.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.23.0)\n",
"Requirement already satisfied: opentelemetry-instrumentation-asgi==0.44b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.44b0)\n",
"Requirement already satisfied: opentelemetry-instrumentation==0.44b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.44b0)\n",
"Requirement already satisfied: opentelemetry-semantic-conventions==0.44b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.44b0)\n",
"Requirement already satisfied: opentelemetry-util-http==0.44b0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.44b0)\n",
"Requirement already satisfied: setuptools>=16.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation==0.44b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (67.7.2)\n",
"Requirement already satisfied: asgiref~=3.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-instrumentation-asgi==0.44b0->opentelemetry-instrumentation-fastapi>=0.41b0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.7.2)\n",
"Requirement already satisfied: monotonic>=1.5 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.6)\n",
"Requirement already satisfied: backoff>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (2.2.1)\n",
"Requirement already satisfied: httptools>=0.5.0 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.6.1)\n",
"Requirement already satisfied: python-dotenv>=0.13 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.0.1)\n",
"Requirement already satisfied: uvloop!=0.15.0,!=0.15.1,>=0.14.0 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.19.0)\n",
"Requirement already satisfied: watchfiles>=0.13 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.21.0)\n",
"Requirement already satisfied: websockets>=10.4 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.18.3->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (12.0)\n",
"Requirement already satisfied: humanfriendly>=9.1 in /usr/local/lib/python3.10/dist-packages (from coloredlogs->onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (10.0)\n",
"Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->onnxruntime<2.0.0,>=1.17.0->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (1.3.0)\n",
"Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (5.3.2)\n",
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.3.0)\n",
"Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (4.9)\n",
"Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata<7.0,>=6.0->opentelemetry-api>=1.2.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (3.17.0)\n",
"Requirement already satisfied: pyasn1<0.6.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth>=1.0.1->kubernetes>=28.1.0->chromadb<0.5.0,>=0.4.22->llama-index-vector-stores-chroma<0.2.0,>=0.1.1->llama-index-cli<0.2.0,>=0.1.2->llama-index) (0.5.1)\n"
]
}
],
"source": [
"%pip install llama-index-question-gen-openai\n",
"%pip install llama-index-llms-openai\n",
"%pip install llama-index"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "30451d5d-91a8-4148-bdf8-a4b4fd8ab0fc",
"metadata": {
"id": "30451d5d-91a8-4148-bdf8-a4b4fd8ab0fc"
},
"outputs": [],
"source": [
"from IPython.display import Markdown, display\n",
"\n",
"\n",
"# define prompt viewing function\n",
"def display_prompt_dict(prompts_dict):\n",
" for k, p in prompts_dict.items():\n",
" text_md = f\"**Prompt Key**: {k}<br>\" f\"**Text:** <br>\"\n",
" display(Markdown(text_md))\n",
" print(p.get_template())\n",
" display(Markdown(\"<br><br>\"))"
]
},
{
"cell_type": "code",
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"# platform.openai.com\n",
"os.environ[\"OPENAI_API_KEY\"] = os.getenv(\"OPENAI_API_KEY\") or getpass(\n",
" \"Enter OpenAI API Key: \"\n",
")\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gi7XB21hGeWo",
"outputId": "c70d5193-e0c4-4ed1-f2b5-5d6782afaed8"
},
"id": "gi7XB21hGeWo",
"execution_count": 3,
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter OpenAI API Key: ··········\n"
]
}
]
},
{
"cell_type": "markdown",
"id": "cc787575-e60c-4c06-8b39-b33230438e00",
"metadata": {
"id": "cc787575-e60c-4c06-8b39-b33230438e00"
},
"source": [
"## Query Rewriting\n",
"\n",
"In this section, we show you how to rewrite queries into multiple queries. You can then execute all these queries against a retriever.\n",
"\n",
"This is a key step in advanced retrieval techniques. By doing query rewriting, you can generate multiple queries for [ensemble retrieval] and [fusion], leading to higher-quality retrieved results.\n",
"\n",
"Unlike the sub-question generator, this is just a prompt call, and exists independently of tools."
]
},
{
"cell_type": "markdown",
"id": "b38c5be2-fd99-4987-8412-2eda597da412",
"metadata": {
"id": "b38c5be2-fd99-4987-8412-2eda597da412"
},
"source": [
"### Query Rewriting (Custom)\n",
"\n",
"Here we show you how to use a prompt to generate multiple queries, using our LLM and prompt abstractions."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d969dfcb-0a2b-4178-87ac-1d0444981f49",
"metadata": {
"id": "d969dfcb-0a2b-4178-87ac-1d0444981f49"
},
"outputs": [],
"source": [
"from llama_index.core import PromptTemplate\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"query_gen_str = \"\"\"\\\n",
"You are a helpful assistant that generates multiple search queries based on a \\\n",
"single input query. Generate {num_queries} search queries, one on each line, \\\n",
"related to the following input query:\n",
"Query: {query}\n",
"Queries:\n",
"\"\"\"\n",
"query_gen_prompt = PromptTemplate(query_gen_str)\n",
"\n",
"llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
"\n",
"def generate_queries(query: str, llm, num_queries: int = 4):\n",
" response = llm.predict(\n",
" query_gen_prompt, num_queries=num_queries, query=query\n",
" )\n",
" # assume LLM proper put each query on a newline\n",
" queries = response.split(\"\\n\")\n",
" queries_str = \"\\n\".join(queries)\n",
" print(f\"Generated queries:\\n{queries_str}\")\n",
" return queries"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4a4f1ed7-cbd8-4177-9c29-a6ecb2e8a530",
"metadata": {
"id": "4a4f1ed7-cbd8-4177-9c29-a6ecb2e8a530",
"outputId": "17fd5ef3-857a-4a99-eea3-907dd761b79e",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Generated queries:\n",
"1. Latest news on Citrix incident\n",
"2. Citrix security breach details\n",
"3. Impact of Citrix data breach\n",
"4. Citrix response to recent events\n"
]
}
],
"source": [
"queries = generate_queries(\"What happened at Citrix\", llm)"
]
},
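{
"cell_type": "markdown",
"id": "query-fusion-sketch-md",
"metadata": {},
"source": [
"As noted above, the rewritten queries can then be executed against a retriever and the results fused. Below is a minimal sketch (not a specific LlamaIndex fusion class): it assumes a `VectorStoreIndex` named `index`, such as the one built in the HyDE section later in this notebook, and applies a simple reciprocal-rank-fusion score."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "query-fusion-sketch-code",
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch: run each rewritten query against a retriever and fuse the results.\n",
"# Assumes `index` is a VectorStoreIndex (e.g. the one built in the HyDE section below)\n",
"# and `queries` is the list returned by generate_queries() above.\n",
"def run_queries_and_fuse(queries, index, similarity_top_k=4):\n",
"    retriever = index.as_retriever(similarity_top_k=similarity_top_k)\n",
"    fused_scores = {}\n",
"    nodes_by_id = {}\n",
"    for query in queries:\n",
"        if not query.strip():\n",
"            continue  # skip any blank lines in the LLM output\n",
"        for rank, node_with_score in enumerate(retriever.retrieve(query)):\n",
"            node_id = node_with_score.node.node_id\n",
"            nodes_by_id[node_id] = node_with_score\n",
"            # reciprocal rank fusion: nodes ranked highly across queries score higher\n",
"            fused_scores[node_id] = fused_scores.get(node_id, 0.0) + 1.0 / (rank + 60)\n",
"    ranked_ids = sorted(fused_scores, key=fused_scores.get, reverse=True)\n",
"    return [nodes_by_id[node_id] for node_id in ranked_ids]\n",
"\n",
"# Example usage (uncomment once `index` has been built):\n",
"# fused_nodes = run_queries_and_fuse(queries, index)\n",
"# for n in fused_nodes[:3]:\n",
"#     print(n.node.get_content()[:200])"
]
},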
{
"cell_type": "markdown",
"id": "535b3f78-1036-4878-b0fe-44f5330fdb1e",
"metadata": {
"id": "535b3f78-1036-4878-b0fe-44f5330fdb1e"
},
"source": [
"### Query Rewriting (using HyDE Query Transform)\n",
"\n",
"In this section we show you how to do query transformations using our Hyde QueryTransform class."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3ac0639a-65ae-43cf-a68a-c9705cf7cb12",
"metadata": {
"id": "3ac0639a-65ae-43cf-a68a-c9705cf7cb12"
},
"outputs": [],
"source": [
"import logging\n",
"import sys\n",
"\n",
"logging.basicConfig(stream=sys.stdout, level=logging.INFO)\n",
"logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))\n",
"\n",
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
"from llama_index.core.indices.query.query_transform import HyDEQueryTransform\n",
"from llama_index.core.query_engine import TransformQueryEngine\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "code",
"source": [
"!mkdir -p 'data/paul_graham/'\n",
"!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
],
"metadata": {
"id": "pAvD9xJB85ze",
"outputId": "dfd3a309-a7d7-466a-9496-3a587459275e",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"id": "pAvD9xJB85ze",
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"--2024-02-27 11:01:31-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\n",
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 75042 (73K) [text/plain]\n",
"Saving to: ‘data/paul_graham/paul_graham_essay.txt’\n",
"\n",
"\r data/paul 0%[ ] 0 --.-KB/s \rdata/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.01s \n",
"\n",
"2024-02-27 11:01:31 (5.64 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]\n",
"\n"
]
}
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "480d58fc-b5c6-41d0-b497-d147a2d1e471",
"metadata": {
"id": "480d58fc-b5c6-41d0-b497-d147a2d1e471"
},
"outputs": [],
"source": [
"# load documents\n",
"documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()"
]
},
{
"cell_type": "code",
"source": [
"index = VectorStoreIndex.from_documents(documents)\n"
],
"metadata": {
"id": "C_3I5I4fBF6o"
},
"id": "C_3I5I4fBF6o",
"execution_count": 9,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"First, we query without transformation: The same query string is used for embedding lookup and also summarization."
],
"metadata": {
"id": "OG-l06bgBLrS"
},
"id": "OG-l06bgBLrS"
},
{
"cell_type": "code",
"source": [
"query_str = \"what did paul graham do after going to RISD\""
],
"metadata": {
"id": "Lf3UD56-BMRn"
},
"id": "Lf3UD56-BMRn",
"execution_count": 10,
"outputs": []
},
{
"cell_type": "code",
"source": [
"query_engine = index.as_query_engine()\n",
"response = query_engine.query(query_str)\n",
"display(Markdown(f\"<b>{response}</b>\"))"
],
"metadata": {
"id": "Bn9x5CamBT9a",
"outputId": "d22836b2-c02b-4ca1-d4e1-f66d6283faee",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 46
}
},
"id": "Bn9x5CamBT9a",
"execution_count": 11,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<IPython.core.display.Markdown object>"
],
"text/markdown": "<b>Paul Graham started painting still lives in his bedroom at night while he was a student at the Accademia di Belli Arti in Florence.</b>"
},
"metadata": {}
}
]
},
{
"cell_type": "markdown",
"source": [
"Now, we use HyDEQueryTransform to generate a hypothetical document and use it for embedding lookup."
],
"metadata": {
"id": "PD9udZsfBmwt"
},
"id": "PD9udZsfBmwt"
},
{
"cell_type": "code",
"execution_count": 12,
"id": "61994183-0b39-4d9a-ae74-b78f68b8dcfc",
"metadata": {
"id": "61994183-0b39-4d9a-ae74-b78f68b8dcfc",
"outputId": "9fe152f9-6c51-4e56-aec8-8cdea0fb671a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 81
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<IPython.core.display.Markdown object>"
],
"text/markdown": "<b>After going to RISD, Paul Graham resumed his old patterns, experimented with a new kind of still life painting technique, looked for an apartment to buy, and eventually had the idea to build a web app for making web apps. He got excited about this idea and decided to move to Cambridge to start a new company focused on this concept.</b>"
},
"metadata": {}
}
],
"source": [
"hyde = HyDEQueryTransform(include_original=True)\n",
"hyde_query_engine = TransformQueryEngine(query_engine, hyde)\n",
"response = hyde_query_engine.query(query_str)\n",
"display(Markdown(f\"<b>{response}</b>\"))"
]
},
{
"cell_type": "code",
"source": [
"query_bundle = hyde(query_str)\n",
"hyde_doc = query_bundle.embedding_strs[0]"
],
"metadata": {
"id": "F1z7natgDbce"
},
"id": "F1z7natgDbce",
"execution_count": 13,
"outputs": []
},
{
"cell_type": "code",
"source": [
"hyde_doc"
],
"metadata": {
"id": "ktNmMGeqDblc",
"outputId": "54dae83f-9b4d-40a6-86b7-ab8aadf2742a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 109
}
},
"id": "ktNmMGeqDblc",
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"\"After attending the Rhode Island School of Design (RISD), Paul Graham went on to co-found Viaweb, one of the first online stores that allowed users to build their own e-commerce websites. Viaweb was later acquired by Yahoo for $49 million, marking Graham's first successful venture in the tech industry. Following this success, Graham went on to co-found Y Combinator, a startup accelerator that has helped launch numerous successful companies such as Dropbox, Airbnb, and Reddit. Graham is also known for his influential essays on startups and entrepreneurship, as well as his work as an angel investor in various tech companies. Overall, Paul Graham's career after RISD has been marked by a series of successful ventures and contributions to the tech industry.\""
],
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
}
},
"metadata": {},
"execution_count": 14
}
]
},
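{
"cell_type": "markdown",
"source": [
"Since we passed `include_original=True`, the transformed `QueryBundle` keeps the original question as its `query_str` while the hypothetical document drives the embedding lookup. The cell below is a minimal inspection sketch; it assumes the transform appends the original query to `embedding_strs` after the hypothetical document, which is an assumption about the transform's behavior rather than something shown above."
],
"metadata": {
"id": "hydeInspectNote"
},
"id": "hydeInspectNote"
},
{
"cell_type": "code",
"source": [
"# Minimal inspection sketch.\n",
"# Assumption: with include_original=True the transform keeps the original\n",
"# question as query_str and appends it to embedding_strs after the\n",
"# hypothetical document, so retrieval can embed both strings.\n",
"print(query_bundle.query_str)  # original question\n",
"print(len(query_bundle.embedding_strs))  # expected: 2 under the assumption above\n",
"for i, s in enumerate(query_bundle.embedding_strs):\n",
"    print(f\"embedding str {i}: {s[:80]}...\")"
],
"metadata": {
"id": "hydeInspectCode"
},
"id": "hydeInspectCode",
"execution_count": null,
"outputs": []
},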
{
"cell_type": "markdown",
"id": "f4ee4457-e8dc-4527-b305-9b09fb89c1f7",
"metadata": {
"id": "f4ee4457-e8dc-4527-b305-9b09fb89c1f7"
},
"source": [
"## Sub-Questions\n",
"\n",
"Given a set of tools and a user query, decide both the 1) set of sub-questions to generate, and 2) the tools that each sub-question should run over.\n",
"\n",
"We run through an example using the `OpenAIQuestionGenerator`, which depends on function calling, and also the `LLMQuestionGenerator`, which depends on prompting."
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "423c9cc6-20eb-4ae3-bca0-c01e606f865a",
"metadata": {
"id": "423c9cc6-20eb-4ae3-bca0-c01e606f865a"
},
"outputs": [],
"source": [
"from llama_index.core.question_gen import LLMQuestionGenerator\n",
"from llama_index.question_gen.openai import OpenAIQuestionGenerator\n",
"from llama_index.llms.openai import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "59f47121-fc91-45cd-a071-c3c6208726b7",
"metadata": {
"id": "59f47121-fc91-45cd-a071-c3c6208726b7"
},
"outputs": [],
"source": [
"llm = OpenAI()\n",
"question_gen = OpenAIQuestionGenerator.from_defaults(llm=llm)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "effdbaca-4087-4199-bc85-ceea00d6edf4",
"metadata": {
"id": "effdbaca-4087-4199-bc85-ceea00d6edf4",
"outputId": "b6644761-e0bb-41d8-fb2a-eb58e645caef",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 565
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<IPython.core.display.Markdown object>"
],
"text/markdown": "**Prompt Key**: question_gen_prompt<br>**Text:** <br>"
},
"metadata": {}
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"You are a world class state of the art agent.\n",
"\n",
"You have access to multiple tools, each representing a different data source or API.\n",
"Each of the tools has a name and a description, formatted as a JSON dictionary.\n",
"The keys of the dictionary are the names of the tools and the values are the descriptions.\n",
"Your purpose is to help answer a complex user question by generating a list of sub questions that can be answered by the tools.\n",
"\n",
"These are the guidelines you consider when completing your task:\n",
"* Be as specific as possible\n",
"* The sub questions should be relevant to the user question\n",
"* The sub questions should be answerable by the tools provided\n",
"* You can generate multiple sub questions for each tool\n",
"* Tools must be specified by their name, not their description\n",
"* You don't need to use a tool if you don't think it's relevant\n",
"\n",
"Output the list of sub questions by calling the SubQuestionList function.\n",
"\n",
"## Tools\n",
"```json\n",
"{tools_str}\n",
"```\n",
"\n",
"## User Question\n",
"{query_str}\n",
"\n"
]
},
{
"output_type": "display_data",
"data": {
"text/plain": [
"<IPython.core.display.Markdown object>"
],
"text/markdown": "<br><br>"
},
"metadata": {}
}
],
"source": [
"display_prompt_dict(question_gen.get_prompts())"
]
},
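{
"cell_type": "markdown",
"source": [
"The prompt above can be customized. The sketch below assumes that `OpenAIQuestionGenerator.from_defaults` accepts a `prompt_template_str` argument and that a custom template must keep the `{tools_str}` and `{query_str}` placeholders shown in the default prompt; both are assumptions, not verified here."
],
"metadata": {
"id": "customPromptNote"
},
"id": "customPromptNote"
},
{
"cell_type": "code",
"source": [
"# Minimal sketch.\n",
"# Assumptions: from_defaults accepts prompt_template_str, and the custom\n",
"# template must keep the {tools_str} and {query_str} placeholders.\n",
"custom_prompt_tmpl = \"\"\"\n",
"Given the tools below, generate focused sub questions that together answer the user question.\n",
"Use at most two sub questions per tool, and output them by calling the SubQuestionList function.\n",
"\n",
"## Tools\n",
"```json\n",
"{tools_str}\n",
"```\n",
"\n",
"## User Question\n",
"{query_str}\n",
"\"\"\"\n",
"\n",
"custom_question_gen = OpenAIQuestionGenerator.from_defaults(\n",
"    llm=llm, prompt_template_str=custom_prompt_tmpl\n",
")"
],
"metadata": {
"id": "customPromptCode"
},
"id": "customPromptCode",
"execution_count": null,
"outputs": []
},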
{
"cell_type": "code",
"execution_count": 18,
"id": "586cf84d-276d-40f3-bb5e-2f5f774b84a1",
"metadata": {
"id": "586cf84d-276d-40f3-bb5e-2f5f774b84a1"
},
"outputs": [],
"source": [
"from llama_index.core.tools import ToolMetadata\n",
"\n",
"tool_choices = [\n",
" ToolMetadata(\n",
" name=\"uber_2021_10k\",\n",
" description=(\n",
" \"Provides information about Uber financials for year 2021\"\n",
" ),\n",
" ),\n",
" ToolMetadata(\n",
" name=\"lyft_2021_10k\",\n",
" description=(\n",
" \"Provides information about Lyft financials for year 2021\"\n",
" ),\n",
" ),\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "3bf94cf9-a869-420f-a6c2-3f2a0b11365e",
"metadata": {
"id": "3bf94cf9-a869-420f-a6c2-3f2a0b11365e"
},
"outputs": [],
"source": [
"from llama_index.core import QueryBundle\n",
"\n",
"query_str = \"Compare and contrast Uber and Lyft\"\n",
"choices = question_gen.generate(tool_choices, QueryBundle(query_str=query_str))"
]
},
{
"cell_type": "markdown",
"id": "11696baf-4767-4b16-bf41-ea4fe9a3722e",
"metadata": {
"id": "11696baf-4767-4b16-bf41-ea4fe9a3722e"
},
"source": [
"The outputs are `SubQuestion` Pydantic objects."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "3cc4baa8-db8e-4357-b3fe-d9975f8c7fa1",
"metadata": {
"id": "3cc4baa8-db8e-4357-b3fe-d9975f8c7fa1",
"outputId": "4f5895dc-e9fd-4afa-844c-7106b71ed24f",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[SubQuestion(sub_question='What were the total revenues for Uber in 2021?', tool_name='uber_2021_10k'),\n",
" SubQuestion(sub_question='What were the total revenues for Lyft in 2021?', tool_name='lyft_2021_10k'),\n",
" SubQuestion(sub_question='What were the net profits for Uber in 2021?', tool_name='uber_2021_10k'),\n",
" SubQuestion(sub_question='What were the net profits for Lyft in 2021?', tool_name='lyft_2021_10k')]"
]
},
"metadata": {},
"execution_count": 20
}
],
"source": [
"choices"
]
},
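{
"cell_type": "markdown",
"source": [
"The intro above also mentions the prompt-based `LLMQuestionGenerator`, which is imported but not exercised here. The sketch below shows how it could be swapped in for LLMs without function calling; it assumes the class exposes the same `from_defaults`/`generate` interface as `OpenAIQuestionGenerator`."
],
"metadata": {
"id": "llmQuestionGenNote"
},
"id": "llmQuestionGenNote"
},
{
"cell_type": "code",
"source": [
"# Minimal sketch.\n",
"# Assumption: LLMQuestionGenerator exposes the same from_defaults / generate\n",
"# interface, but relies on prompting and output parsing instead of\n",
"# function calling.\n",
"llm_question_gen = LLMQuestionGenerator.from_defaults(llm=llm)\n",
"llm_choices = llm_question_gen.generate(\n",
"    tool_choices, QueryBundle(query_str=query_str)\n",
")\n",
"llm_choices"
],
"metadata": {
"id": "llmQuestionGenCode"
},
"id": "llmQuestionGenCode",
"execution_count": null,
"outputs": []
},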
{
"cell_type": "markdown",
"metadata": {
"id": "9c48213d-6e6a-4c10-838a-2a7c710c3a05"
},
"source": [
"# Multi-Step Query Decomposition\n",
"\n",
"We have a multi-step query engine that's able to decompose a complex query into sequential subquestions. This\n",
"guide walks you through how to set it up!"
],
"id": "9c48213d-6e6a-4c10-838a-2a7c710c3a05"
},
{
"cell_type": "markdown",
"metadata": {
"id": "c5885d63"
},
"source": [
"#### Download Data"
],
"id": "c5885d63"
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"id": "ee4ace96",
"outputId": "b5a4574a-d9b7-4d4f-fec6-966d7f6938b4",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"--2024-02-27 11:01:47-- https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\n",
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.111.133, ...\n",
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 75042 (73K) [text/plain]\n",
"Saving to: ‘data/paul_graham/paul_graham_essay.txt’\n",
"\n",
"\r data/paul 0%[ ] 0 --.-KB/s \rdata/paul_graham/pa 100%[===================>] 73.28K --.-KB/s in 0.01s \n",
"\n",
"2024-02-27 11:01:47 (5.51 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]\n",
"\n"
]
}
],
"source": [
"!mkdir -p 'data/paul_graham/'\n",
"!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'"
],
"id": "ee4ace96"
},
{
"cell_type": "markdown",
"metadata": {
"id": "50d3b817-b70e-4667-be4f-d3a0fe4bd119"
},
"source": [
"#### Load documents, build the VectorStoreIndex"
],
"id": "50d3b817-b70e-4667-be4f-d3a0fe4bd119"
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"id": "690a6918-7c75-4f95-9ccc-d2c4a1fe00d7"
},
"outputs": [],
"source": [
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
"from llama_index.llms.openai import OpenAI\n",
"from IPython.display import Markdown, display"
],
"id": "690a6918-7c75-4f95-9ccc-d2c4a1fe00d7"
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"id": "c48da73f-aadb-480c-8db1-99c915b7cc1c"
},
"outputs": [],
"source": [
"# LLM (gpt-3.5)\n",
"gpt35 = OpenAI(temperature=0, model=\"gpt-3.5-turbo\")\n",
"\n"
],
"id": "c48da73f-aadb-480c-8db1-99c915b7cc1c"
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"id": "03d1691e-544b-454f-825b-5ee12f7faa8a"
},
"outputs": [],
"source": [
"# load documents\n",
"documents = SimpleDirectoryReader(\"./data/paul_graham/\").load_data()"
],
"id": "03d1691e-544b-454f-825b-5ee12f7faa8a"
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"id": "ad144ee7-96da-4dd6-be00-fd6cf0c78e58"
},
"outputs": [],
"source": [
"index = VectorStoreIndex.from_documents(documents)"
],
"id": "ad144ee7-96da-4dd6-be00-fd6cf0c78e58"
},
{
"cell_type": "markdown",
"metadata": {
"id": "b6caf93b-6345-4c65-a346-a95b0f1746c4"
},
"source": [
"#### Query Index"
],
"id": "b6caf93b-6345-4c65-a346-a95b0f1746c4"
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"id": "95d989ba-0c1d-43b6-a1d3-0ea7135f43a6"
},
"outputs": [],
"source": [
"from llama_index.core.indices.query.query_transform.base import (\n",
" StepDecomposeQueryTransform,\n",
")\n",
"\n",
"step_decompose_transform = StepDecomposeQueryTransform(llm=gpt35, verbose=True)\n",
"\n",
"# gpt-3\n",
"step_decompose_transform_gpt3 = StepDecomposeQueryTransform(\n",
" llm=gpt35, verbose=True\n",
")"
],
"id": "95d989ba-0c1d-43b6-a1d3-0ea7135f43a6"
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"id": "2a124db0-e2d7-4566-bcec-1d41cf669ff4"
},
"outputs": [],
"source": [
"index_summary = \"Used to answer questions about the author\""
],
"id": "2a124db0-e2d7-4566-bcec-1d41cf669ff4"
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"id": "85466fdf-93f3-4cb1-a5f9-0056a8245a6f",
"outputId": "74f9931a-fa4a-4a87-b5f1-9d75a8839d5f",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[1;3;33m> Current query: Who was in the first batch of the accelerator program the author started?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Who started the accelerator program?\n",
"\u001b[0m\u001b[1;3;33m> Current query: Who was in the first batch of the accelerator program the author started?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Who was in the first batch of the accelerator program started by Paul Graham?\n",
"\u001b[0m\u001b[1;3;33m> Current query: Who was in the first batch of the accelerator program the author started?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Who were some of the individuals in the first batch of the accelerator program started by Paul Graham?\n",
"\u001b[0m"
]
}
],
"source": [
"# set Logging to DEBUG for more detailed outputs\n",
"from llama_index.core.query_engine import MultiStepQueryEngine\n",
"\n",
"query_engine = index.as_query_engine(llm=gpt35)\n",
"query_engine = MultiStepQueryEngine(\n",
" query_engine=query_engine,\n",
" query_transform=step_decompose_transform,\n",
" index_summary=index_summary,\n",
")\n",
"response_gpt_1 = query_engine.query(\n",
" \"Who was in the first batch of the accelerator program the author\"\n",
" \" started?\",\n",
")"
],
"id": "85466fdf-93f3-4cb1-a5f9-0056a8245a6f"
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"id": "bdda1b2c-ae46-47cf-91d7-3153e8d0473b",
"outputId": "533cae5b-23bd-4dcf-e79b-77051fc368a0",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 46
}
},
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<IPython.core.display.Markdown object>"
],
"text/markdown": "<b>The first batch of the accelerator program the author started included reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman.</b>"
},
"metadata": {}
}
],
"source": [
"display(Markdown(f\"<b>{response_gpt_1}</b>\"))"
],
"id": "bdda1b2c-ae46-47cf-91d7-3153e8d0473b"
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"id": "1c9670bd-729d-478b-a77c-c6e13c282456",
"outputId": "b8d2b81e-12af-414c-b7bf-67fbdfad6cff",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"[('Who started the accelerator program?', 'Paul Graham started the accelerator program.'), ('Who was in the first batch of the accelerator program started by Paul Graham?', 'The first batch of the accelerator program started by Paul Graham included reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman.'), ('Who were some of the individuals in the first batch of the accelerator program started by Paul Graham?', 'The first batch of the accelerator program started by Paul Graham included reddit, Justin Kan and Emmett Shear, Aaron Swartz, and Sam Altman.')]\n"
]
}
],
"source": [
"sub_qa = response_gpt_1.metadata[\"sub_qa\"]\n",
"tuples = [(t[0], t[1].response) for t in sub_qa]\n",
"print(tuples)"
],
"id": "1c9670bd-729d-478b-a77c-c6e13c282456"
},
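{
"cell_type": "markdown",
"source": [
"The run above produced three sequential sub-questions. The sketch below assumes `MultiStepQueryEngine` accepts a `num_steps` argument to cap how many sub-questions are generated before the final answer is synthesized; treat the parameter name as an assumption rather than something exercised above."
],
"metadata": {
"id": "numStepsNote"
},
"id": "numStepsNote"
},
{
"cell_type": "code",
"source": [
"# Minimal sketch.\n",
"# Assumption: num_steps caps how many sequential sub-questions are generated\n",
"# before the final answer is synthesized.\n",
"capped_engine = MultiStepQueryEngine(\n",
"    query_engine=index.as_query_engine(llm=gpt35),\n",
"    query_transform=step_decompose_transform,\n",
"    index_summary=index_summary,\n",
"    num_steps=2,\n",
")"
],
"metadata": {
"id": "numStepsCode"
},
"id": "numStepsCode",
"execution_count": null,
"outputs": []
},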
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"id": "ec88df57",
"outputId": "050c2b01-ef75-4d00-95af-73c8f9a271a8",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where was the author working out of when he founded his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where was Robert's apartment located when the author founded his first company, Viaweb?\n",
"\u001b[0m"
]
}
],
"source": [
"response_gpt_1 = query_engine.query(\n",
" \"In which city did the author found his first company, Viaweb?\",\n",
")"
],
"id": "ec88df57"
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"id": "653508f1-b2b0-479a-85b3-113cda507231",
"outputId": "262086fe-48dd-4bd4-da89-cfa63a46713d",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Cambridge\n"
]
}
],
"source": [
"print(response_gpt_1)"
],
"id": "653508f1-b2b0-479a-85b3-113cda507231"
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"id": "9fa93cdb-7007-4664-853a-5c81c6c17560",
"outputId": "02ea8f4f-a862-4cd5-9ec9-cf2472511897",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where was the author working when he founded his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;33m> Current query: In which city did the author found his first company, Viaweb?\n",
"\u001b[0m\u001b[1;3;38;5;200m> New query: Where was the author's friend Robert's apartment located when he founded his first company, Viaweb?\n",
"\u001b[0m"
]
}
],
"source": [
"query_engine = index.as_query_engine(llm=gpt35)\n",
"query_engine = MultiStepQueryEngine(\n",
" query_engine=query_engine,\n",
" query_transform=step_decompose_transform_gpt3,\n",
" index_summary=index_summary,\n",
")\n",
"\n",
"response_gpt3 = query_engine.query(\n",
" \"In which city did the author found his first company, Viaweb?\",\n",
")"
],
"id": "9fa93cdb-7007-4664-853a-5c81c6c17560"
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"id": "05899fcf-7a04-4d21-9e6d-04983755d175",
"outputId": "eaa3c7c9-40d4-4be0-c4dc-7c615311afba",
"colab": {
"base_uri": "https://localhost:8080/"
}
},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"The author founded his first company, Viaweb, in Cambridge.\n"
]
}
],
"source": [
"print(response_gpt3)"
],
"id": "05899fcf-7a04-4d21-9e6d-04983755d175"
},
{
"cell_type": "code",
"source": [],
"metadata": {
"id": "sbPMcQy7tKR5"
},
"id": "sbPMcQy7tKR5",
"execution_count": 34,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
},
"colab": {
"provenance": []
}
},
"nbformat": 4,
"nbformat_minor": 5
}