{
"cells": [
{
"cell_type": "markdown",
"id": "9eaf7e95-0591-4c8a-95f5-3c9355b62807",
"metadata": {},
"source": [
"# Dynamic production of LCEL inputs\n",
"\n",
"In some situations, you might want to use LCEL to run multiple runnables in parallel to produce inputs for another runnable that follows it.\n",
"\n",
"When you know all of the inputs, you can just define a `dict[str, Runnable | Callable]`, which LCEL recognizes as a shorthand for `RunnableParallel`, and use that to produce them. Also, with a known set of inputs you can manually embed input variables for them in your prompt.\n",
"\n",
"But what about when we _don't_ know how many runnables we'll be parallelizing?\n",
"\n",
"This notebook demonstrates _a_ technique for handling this situation."
]
},
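{
"cell_type": "markdown",
"id": "1a2b3c4d-1111-4a2b-8c3d-0a1b2c3d4e5f",
"metadata": {},
"source": [
"A minimal sketch of the _known-inputs_ case, assuming a recent `langchain-core`; the keys, lambdas, and template here are made up for illustration:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2b3c4d5e-2222-4b3c-9d4e-1b2c3d4e5f6a",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_core.runnables import RunnableLambda\n",
"\n",
"# A plain dict of runnables -- LCEL shorthand for RunnableParallel.\n",
"known_inputs = {\n",
"    \"greeting\": RunnableLambda(lambda _: \"Hello\"),\n",
"    \"subject\": RunnableLambda(lambda _: \"world\"),\n",
"}\n",
"\n",
"# With a known set of inputs, the variables can be embedded in the prompt by hand.\n",
"prompt = PromptTemplate.from_template(\"{greeting}, {subject}!\")\n",
"\n",
"# The dict's outputs feed the prompt's input variables.\n",
"(known_inputs | prompt).invoke({})  # a prompt value containing 'Hello, world!'"
]
},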
{
"cell_type": "code",
"execution_count": 137,
"id": "0a0eb34a-1732-4271-b606-0e9220b67dad",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_openai import ChatOpenAI\n",
"from langchain_core.runnables import RunnableLambda\n",
"from pprint import pprint"
]
},
{
"cell_type": "markdown",
"id": "48a035d5-6d0d-47c1-ab5c-b665ecf8cadf",
"metadata": {},
"source": [
"##### A quick word about `PromptTemplate` and `ChatPromptTemplate`.\n",
"\n",
"We use `PromptTemplate` here because it's simpler due the implementation of the overridden `+` operator for `str`, which appends the string to the _single_ `HumanMessage` it will produce.\n",
"\n",
"`ChatPromptTemplate` also provides a `+ str` override, but since chat prompt templates are about a `list[BaseMessage]`, the added text becomes a new `HumanMessage` appended to the list of messages. To use one for dynamic inputs, you should probably build up the template string _before_ creating the prompt to get the same effect of adding a single message. "
]
},
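{
"cell_type": "markdown",
"id": "3c4d5e6f-3333-4c4d-8e5f-2c3d4e5f6a7b",
"metadata": {},
"source": [
"A quick sketch of that difference, assuming a recent `langchain-core` (the template strings are made up):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d5e6f7a-4444-4d5e-9f6a-3d4e5f6a7b8c",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate, PromptTemplate\n",
"\n",
"# PromptTemplate + str extends the single template string.\n",
"string_prompt = PromptTemplate.from_template(\"Base. \") + \"Added {x}.\"\n",
"print(string_prompt.template)    # 'Base. Added {x}.' -- still one template\n",
"\n",
"# ChatPromptTemplate + str appends a new HumanMessage to the message list.\n",
"chat_prompt = ChatPromptTemplate.from_messages([(\"human\", \"Base.\")]) + \"Added {x}.\"\n",
"print(len(chat_prompt.messages)) # 2 -- the str became a second message"
]
},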
{
"cell_type": "code",
"execution_count": 137,
"id": "a79d1776-eca2-4e67-ba4b-780dd5c60097",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import PromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 138,
"id": "0776c4fe-85d1-4ea4-8000-74a28922d70e",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"def _set_if_undefined(var: str) -> None:\n",
" if os.environ.get(var):\n",
" return\n",
" os.environ[var] = getpass.getpass(var)\n",
"\n",
"\n",
"# Optional: Configure tracing to visualize and debug the agent\n",
"_set_if_undefined(\"LANGCHAIN_API_KEY\")\n",
"# Set the following to \"true\" to enable LangSmith tracing\n",
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"false\" \n",
"os.environ[\"LANGCHAIN_PROJECT\"] = \"your-project-name\"\n",
"\n",
"_set_if_undefined(\"OPENAI_API_KEY\")"
]
},
{
"cell_type": "markdown",
"id": "9c285ff0-171b-4ea2-b368-ea9143cd8022",
"metadata": {},
"source": [
"First we make a little runnable that has one job: spit out a name.\n"
]
},
{
"cell_type": "code",
"execution_count": 139,
"id": "e91ab445-621d-4121-b471-c98f6c57bce8",
"metadata": {},
"outputs": [],
"source": [
"sys_mess = \"You make up full names. Please respond with one name, and nothing else\"\n",
"\n",
"sys_prompt = PromptTemplate.from_template(sys_mess)\n",
"\n",
"llm = ChatOpenAI(temperature=1)\n",
"namer = sys_prompt | llm | StrOutputParser()"
]
},
{
"cell_type": "markdown",
"id": "f90c4a6e-243e-4d07-8152-d3ecadc33c64",
"metadata": {},
"source": [
"Timing it, we can see that it's generally a sub-second response with `gpt-3.5-turbo`."
]
},
{
"cell_type": "code",
"execution_count": 140,
"id": "5cc1534b-cab5-42b4-8be4-6caca4d84bc8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 18.2 ms, sys: 1.01 ms, total: 19.2 ms\n",
"Wall time: 691 ms\n"
]
},
{
"data": {
"text/plain": [
"'Evangeline Winterbourne'"
]
},
"execution_count": 140,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"namer.invoke({})\n"
]
},
{
"cell_type": "markdown",
"id": "2461075b-7349-47a6-9af2-1b7f42f7ab48",
"metadata": {},
"source": [
"We're going to build up the output prompt, so its base message doesn't need any inputs defined."
]
},
{
"cell_type": "code",
"execution_count": 141,
"id": "55a1ae6f-7fdb-401e-b886-6ebbd3f77c47",
"metadata": {},
"outputs": [],
"source": [
"out_mess = \"\"\"\n",
"Here's a list of names.\n",
"\n",
"Please output it as is, and nothing else.\n",
"\n",
"Do not wrap the response in a code block.\n",
"\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "16c6ecc7-9781-4787-bc5c-dcb5014f3564",
"metadata": {},
"source": [
"And now to the dynamic part.\n",
"\n",
"Here we:\n",
"1. Instantiate our `out_prompt` with the base message.\n",
"2. Print out the prompt's attributes so you can see it has no inputs\n",
"3. Loop NUMBER_OF_NAMES_TO_GENERATE times and:\n",
" - Define a unique key each time and:\n",
" - Add the key to the inputs dict, with the value being the `namer` runnable.\n",
" - Add an input variable of the key to the output prompt\n",
"4. Print out the prompt after dynamic manipulation. Notice that it has input variables now.\n",
"5. Invoke a new runnable built from the dynamically created input dictionary and output prompt.\n",
"\n",
"We also time the execution of the cell. The results will print after the prompts, but before the results.\n",
"\n",
"If you increase NUMBER_OF_NAMES_TO_GENERATE by one repeatedly, you'll notice that the execution times stay the same for a bit, then basically double. \n",
"\n",
"That's when you've run out of thread capacity and it has to wait for one to free up.\n",
"\n",
"You should see another plateau here of the same \"width\" as the previous one, and even more \"jump then plateau\" periods as you increase the number of nmes further.\n",
"\n",
"**NOTE:** Just a heads up, this example uses `gpt-3.5-turbo`, and sometimes it might get the number of names wrong. If you're curious why, check out the Langsmith trace logs—it's usually because `namer` gets a bit overenthusiastic and makes extra names.\n"
]
},
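{
"cell_type": "markdown",
"id": "5e6f7a8b-5555-4e6f-8a7b-4e5f6a7b8c9d",
"metadata": {},
"source": [
"Before running it, here's a self-contained sketch of what the dict-to-`RunnableParallel` coercion does (the branch lambdas stand in for `namer` and are made up):\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6f7a8b9c-6666-4f7a-9b8c-5f6a7b8c9d0e",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnableLambda, RunnableParallel\n",
"\n",
"# Two stand-in branches built in a loop, like the inputs dict below.\n",
"branches = {f\"key_{i}\": RunnableLambda(lambda _, i=i: f\"name {i}\") for i in range(2)}\n",
"\n",
"# `branches | some_runnable` builds the same graph as this explicit form:\n",
"RunnableParallel(branches).invoke({})  # {'key_0': 'name 0', 'key_1': 'name 1'}"
]
},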
{
"cell_type": "code",
"execution_count": 142,
"id": "08c7aad4-cd41-4913-80cd-66b0e819af00",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"BEFORE_DYNAMIC_PROMPTING\n",
"{'_type': 'prompt',\n",
" 'input_types': {},\n",
" 'input_variables': [],\n",
" 'metadata': None,\n",
" 'name': None,\n",
" 'output_parser': None,\n",
" 'partial_variables': {},\n",
" 'tags': None,\n",
" 'template': '\\n'\n",
" \"Here's a list of names.\\n\"\n",
" '\\n'\n",
" 'Please output it as is, and nothing else.\\n'\n",
" '\\n'\n",
" 'Do not wrap the response in a code block.\\n'\n",
" '\\n',\n",
" 'template_format': 'f-string',\n",
" 'validate_template': False}\n",
"\n",
"AFTER DYNAMIC PROMPTING\n",
"{'_type': 'prompt',\n",
" 'input_types': {},\n",
" 'input_variables': ['key_0', 'key_1'],\n",
" 'metadata': None,\n",
" 'name': None,\n",
" 'output_parser': None,\n",
" 'partial_variables': {},\n",
" 'tags': None,\n",
" 'template': '\\n'\n",
" \"Here's a list of names.\\n\"\n",
" '\\n'\n",
" 'Please output it as is, and nothing else.\\n'\n",
" '\\n'\n",
" 'Do not wrap the response in a code block.\\n'\n",
" '\\n'\n",
" '{key_0}, {key_1}, ',\n",
" 'template_format': 'f-string',\n",
" 'validate_template': False}\n",
"CPU times: user 55.4 ms, sys: 4.42 ms, total: 59.8 ms\n",
"Wall time: 1.25 s\n"
]
},
{
"data": {
"text/plain": [
"'Cassandra Patel, Leonardo Rivera, Victoria Nguyen'"
]
},
"execution_count": 142,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"NUMBER_OF_NAMES_TO_GENERATE=2\n",
"\n",
"out_prompt = PromptTemplate.from_template(out_mess)\n",
"\n",
"# print the output prompt as it is when instantiated\n",
"print(\"BEFORE_DYNAMIC_PROMPTING\")\n",
"pprint(out_prompt.dict())\n",
"\n",
"inputs = {}\n",
"for i in range(NUMBER_OF_NAMES_TO_GENERATE):\n",
" key = f\"key_{i}\"\n",
" inputs[key] = namer\n",
" out_prompt = out_prompt + f\"{{{key}}}, \"\n",
"\n",
"print(\"\\nAFTER DYNAMIC PROMPTING\")\n",
"pprint(out_prompt.dict())\n",
"\n",
"multi = inputs | out_prompt | llm | StrOutputParser()\n",
"multi.invoke({})\n",
"\n",
"# to see the keys coming out of the parallel executions\n",
"# uncomment the next too lines.\n",
"# show_multi_inputs = inputs | RunnableLambda(lambda x: x)\n",
"# show_multi_inputs.invoke({})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
}
},
"nbformat": 4,
"nbformat_minor": 5
}