@jiamingkong
Created May 1, 2023 04:07
Using RWKV in langchain
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hacking the default templating of Langchain for better results\n",
"\n",
"I realized that the `langchain` package wasn't working well for RWKV because its default templates are geared toward OpenAI's ChatGPT. So I hacked the default templates to make them better suited to RWKV, using a Python format string `RWKV_TASK` as shown in the examples below.\n",
"\n",
"\n",
"## Load a default model and make sure it works"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RWKV_JIT_ON 1 RWKV_CUDA_ON 1 RESCALE_LAYER 6\n",
"\n",
"Loading D:/weights/rwkv/RWKV-4-Raven-3B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230429-ctx4096.pth ...\n",
"Strategy: (total 32+1=33 layers)\n",
"* cuda [float16, uint8], store 20 layers\n",
"* cuda [float16, float16], store 13 layers\n",
"0-cuda-float16-uint8 1-cuda-float16-uint8 2-cuda-float16-uint8 3-cuda-float16-uint8 4-cuda-float16-uint8 5-cuda-float16-uint8 6-cuda-float16-uint8 7-cuda-float16-uint8 8-cuda-float16-uint8 9-cuda-float16-uint8 10-cuda-float16-uint8 11-cuda-float16-uint8 12-cuda-float16-uint8 13-cuda-float16-uint8 14-cuda-float16-uint8 15-cuda-float16-uint8 16-cuda-float16-uint8 17-cuda-float16-uint8 18-cuda-float16-uint8 19-cuda-float16-uint8 20-cuda-float16-float16 21-cuda-float16-float16 22-cuda-float16-float16 23-cuda-float16-float16 24-cuda-float16-float16 25-cuda-float16-float16 26-cuda-float16-float16 27-cuda-float16-float16 28-cuda-float16-float16 29-cuda-float16-float16 30-cuda-float16-float16 31-cuda-float16-float16 32-cuda-float16-float16 \n",
"emb.weight f16 cpu 50277 2560 \n",
"blocks.0.ln1.weight f16 cuda:0 2560 \n",
"blocks.0.ln1.bias f16 cuda:0 2560 \n",
"blocks.0.ln2.weight f16 cuda:0 2560 \n",
"blocks.0.ln2.bias f16 cuda:0 2560 \n",
"blocks.0.att.time_decay f32 cuda:0 2560 \n",
"blocks.0.att.time_first f32 cuda:0 2560 \n",
"blocks.0.att.time_mix_k f16 cuda:0 2560 \n",
"blocks.0.att.time_mix_v f16 cuda:0 2560 \n",
"blocks.0.att.time_mix_r f16 cuda:0 2560 \n",
"blocks.0.att.key.weight i8 cuda:0 2560 2560 \n",
"blocks.0.att.value.weight i8 cuda:0 2560 2560 \n",
"blocks.0.att.receptance.weight i8 cuda:0 2560 2560 \n",
"blocks.0.att.output.weight i8 cuda:0 2560 2560 \n",
"blocks.0.ffn.time_mix_k f16 cuda:0 2560 \n",
"blocks.0.ffn.time_mix_r f16 cuda:0 2560 \n",
"blocks.0.ffn.key.weight i8 cuda:0 2560 10240 \n",
"blocks.0.ffn.receptance.weight i8 cuda:0 2560 2560 \n",
"blocks.0.ffn.value.weight i8 cuda:0 10240 2560 \n",
"............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................\n",
"blocks.31.ln1.weight f16 cuda:0 2560 \n",
"blocks.31.ln1.bias f16 cuda:0 2560 \n",
"blocks.31.ln2.weight f16 cuda:0 2560 \n",
"blocks.31.ln2.bias f16 cuda:0 2560 \n",
"blocks.31.att.time_decay f32 cuda:0 2560 \n",
"blocks.31.att.time_first f32 cuda:0 2560 \n",
"blocks.31.att.time_mix_k f16 cuda:0 2560 \n",
"blocks.31.att.time_mix_v f16 cuda:0 2560 \n",
"blocks.31.att.time_mix_r f16 cuda:0 2560 \n",
"blocks.31.att.key.weight f16 cuda:0 2560 2560 \n",
"blocks.31.att.value.weight f16 cuda:0 2560 2560 \n",
"blocks.31.att.receptance.weight f16 cuda:0 2560 2560 \n",
"blocks.31.att.output.weight f16 cuda:0 2560 2560 \n",
"blocks.31.ffn.time_mix_k f16 cuda:0 2560 \n",
"blocks.31.ffn.time_mix_r f16 cuda:0 2560 \n",
"blocks.31.ffn.key.weight f16 cuda:0 2560 10240 \n",
"blocks.31.ffn.receptance.weight f16 cuda:0 2560 2560 \n",
"blocks.31.ffn.value.weight f16 cuda:0 10240 2560 \n",
"ln_out.weight f16 cuda:0 2560 \n",
"ln_out.bias f16 cuda:0 2560 \n",
"head.weight f16 cuda:0 2560 50277 \n"
]
}
],
"source": [
"from langchain.llms import RWKV\n",
"import os\n",
"os.environ[\"RWKV_CUDA_ON\"] = '1' # if '1' then use CUDA kernel for seq mode (much faster)\n",
"\n",
"weight_path = \"D:/weights/rwkv/RWKV-4-Raven-3B-v11-Eng49%-Chn49%-Jpn1%-Other1%-20230429-ctx4096.pth\"\n",
"tokenizer_json = \"D:/weights/rwkv/20B_tokenizer.json\"\n",
"\n",
"\n",
"def generate_prompt(instruction, input=None):\n",
" if input:\n",
" return f\"\"\"Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n",
"\n",
"# Instruction:\n",
"{instruction}\n",
"\n",
"# Input:\n",
"{input}\n",
"\n",
"# Response:\n",
"\"\"\"\n",
" else:\n",
" return f\"\"\"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"\n",
"# Instruction:\n",
"{instruction}\n",
"\n",
"# Response:\n",
"\"\"\"\n",
"\n",
"model = RWKV(model=weight_path, strategy=\"cuda fp16i8 *20 -> cuda fp16\", tokens_path=tokenizer_json)\n",
"response = model(generate_prompt(\"Write a python code that prints the first 10 numbers of the fibonacci sequence.\"))"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Here's the code:\n",
"```python\n",
"a, b = 0, 1\n",
"for i in range(10):\n",
" print(a)\n",
" a, b = b, a + b\n",
"```\n"
]
}
],
"source": [
"print(response)"
]
},
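{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check of my own (not part of the original gist), we can print what `generate_prompt` produces for a bare instruction, to see the exact prompt shape the model receives:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# inspect the raw prompt text sent to the model\n",
"print(generate_prompt(\"What is the RWKV architecture?\"))"
]
},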
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## With a proper template, Langchain works better\n",
"\n",
"Below I used the same template as the official Hugging Face Space. Here we can see that Langchain works with RWKV. Without this template, the RWKV model (even the Raven flavor) would be more likely to continue a question than to answer it.\n",
"\n",
"The example below shows how to build a very basic chain that makes RWKV perform any basic task the user provides."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"One possible company name for a product of superglue could be \"SuperGlue\".\n"
]
}
],
"source": [
"from langchain.prompts import PromptTemplate\n",
"TEMPLATE = \"\"\"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"# Instruction:\n",
"{instruction}\n",
"\n",
"# Response:\n",
"\"\"\"\n",
"\n",
"rwkv_prompt = PromptTemplate(\n",
" input_variables=[\"instruction\"],\n",
" template=TEMPLATE,\n",
")\n",
"\n",
"from langchain.chains import LLMChain\n",
"chain = LLMChain(llm=model, prompt=rwkv_prompt)\n",
"\n",
"print(chain.run(\"What is a good company name for product of superglue?\"))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## The chat is already quite good, thanks to the template\n",
"\n",
"The catch is that RWKV tends to continue the conversation far beyond what we need. We need to pass a \"\\n\" token in the stop field so generation stops at the end of the first AI reply.\n",
"\n",
"\n",
"**Syntax**:\n",
"\n",
"`conversation.predict(input_text, stop=\"\\n\")`\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Current conversation:\n",
"\n",
"Human: Hi there!\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
" Hello! How can I help you today?\n",
"\n",
"\n",
"\u001b[1m> Entering new ConversationChain chain...\u001b[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThe following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.\n",
"\n",
"Current conversation:\n",
"Human: Hi there!\n",
"AI: Hello! How can I help you today?\n",
"Human: Recommend some movies for a Friday night?\n",
"AI:\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
" Sure, what genre are you looking for?\n"
]
}
],
"source": [
"\n",
"from langchain import ConversationChain\n",
"conversation = ConversationChain(llm=model, verbose=True)\n",
"output = conversation.predict(input=\"Hi there!\", stop=\"\\n\")\n",
"\n",
"# add stop words to conversationchain\n",
"# conversation.add_stop_words([\"\\n\"])\n",
"\n",
"def cut_output_for_first_sentence(output):\n",
" output = output.split(\"\\n\")\n",
" return output[0]\n",
"\n",
"output = cut_output_for_first_sentence(output)\n",
"print(output)\n",
"\n",
"# continue conversation\n",
"output = conversation.predict(input=\"Recommend some movies for a Friday night?\", stop=\"\\n\")\n",
"output = cut_output_for_first_sentence(output)\n",
"print(output)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Now use a partial prompt to hook up with other chains\n",
"\n",
"In the example below, I define `RWKV_TASK` as a Python format string and convert some examples from the official Langchain documentation over to the new template. The results are quite good: the RWKV model is able to perform the tasks the user provides.\n",
"\n",
"Here, the RWKV model is asked first to generate a play synopsis, and then to write a review of that play."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"RWKV_TASK = \"\"\"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n",
"\n",
"# Instruction:\n",
"{instruction}\n",
"\n",
"# Response:\n",
"{response}\n",
"\"\"\""
]
},
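{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick check of my own (not from the original gist): `str.format` substitutes only the placeholders that appear literally in `RWKV_TASK`, so a `{text}`-style placeholder inside the instruction survives the first pass and is left for `PromptTemplate` to fill later. This is what makes the nested-template hack work:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"nested = RWKV_TASK.format(instruction=\"Summarize: {text}\", response=\"\")\n",
"assert \"{text}\" in nested  # the inner placeholder is untouched after one format pass\n",
"print(nested)"
]
},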
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"model.max_tokens_per_generation = 1000\n",
"template = \"\"\"\n",
"You are a playwright. Given the title of a play and the era it is set in, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Era: {era}\n",
"\"\"\"\n",
"response = \"Playwright: This is a synopsis for the above play:\"\n",
"\n",
"# This is the hacky way to do it\n",
"template = RWKV_TASK.format(instruction=template, response=response)\n",
"prompt_template = PromptTemplate(template=template, input_variables=[\"title\", 'era'])\n",
"synopsis_chain = LLMChain(llm=model, prompt=prompt_template, output_key=\"synopsis\")\n",
"\n",
"template = \"\"\"You are a play critic from the New York Times. Given the synopsis of a play, it is your job to write a review for that play.\n",
"Play Synopsis:\n",
"{synopsis}\n",
"\"\"\"\n",
"response = \"Review from a New York Times play critic of the above play:\"\n",
"template = RWKV_TASK.format(instruction = template, response=response)\n",
"prompt_template = PromptTemplate(input_variables=[\"synopsis\"], template=template)\n",
"review_chain = LLMChain(llm=model, prompt=prompt_template, output_key=\"review\")\n",
"\n",
"from langchain.chains import SequentialChain\n",
"overall_chain = SequentialChain(\n",
" chains=[synopsis_chain, review_chain],\n",
" input_variables=[\"era\", \"title\"],\n",
" # Here we return multiple variables\n",
" output_variables=[\"synopsis\", \"review\"],\n",
" verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new SequentialChain chain...\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"result = overall_chain({\"title\":\"Tragedy at sunset on the beach\", \"era\": \"Victorian England\"})"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"The play is a haunting and suspenseful exploration of grief, loss, and the consequences of unchecked power. The characters are portrayed with depth and complexity, making it difficult to predict their actions or motivations. The play raises important questions about power, morality, and the consequences of one's actions. It is a powerful and thought-provoking work that will leave you thinking long after the final curtain falls.\""
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result[\"review\"]"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In the Victorian era, a tragic event occurs on the beach. The sun sets behind the cliffs and a group of people gather to watch as a lone figure stands at the edge of the water. As they watch, they hear a scream and see the figure fall into the water. The play ends with the characters standing in silence, waiting for something to happen.\n"
]
}
],
"source": [
"print(result[\"synopsis\"])"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Some more examples with summarization"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"\n",
"map_reduce_prompt = \"\"\"Write a concise summary of the following:\n",
"\n",
"\"{text}\"\n",
"\"\"\"\n",
"\n",
"map_reduce_pt = PromptTemplate(\n",
" input_variables=[\"text\"],\n",
" template=RWKV_TASK.format(instruction=map_reduce_prompt, response=\"\")  # RWKV_TASK also has a {response} slot; fill it with an empty string to avoid a KeyError\n",
")\n",
"summary_chain = LLMChain(llm=model, prompt=map_reduce_pt, output_key=\"summary\")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"text = \"\"\"\n",
"NASA’s James Webb Space Telescope team has been selected to receive the 2023 Michael Collins Trophy for Lifetime and Current Achievements. This annual award from the Smithsonian’s National Air and Space Museum honors outstanding achievements in the fields of aerospace science and technology, and their history.\n",
"\n",
"“The James Webb Space Telescope team’s dedication and ingenuity is an inspiration to the world,” said NASA Associate Administrator Bob Cabana. “The partnerships that make this mission possible represent the best of humanity and are critical to enabling us to use Webb to understand our universe better.”\n",
"\n",
"The award was presented during a ceremony at the museum’s Steven F. Udvar-Hazy Center in Chantilly, Virginia, on March 23.\n",
"\n",
"“The 2023 Collins Trophy recipients have helped humans understand their place on this Earth,” said Chris Browne, the John and Adrienne Mars Director of the museum. “The James Webb Telescope has likewise given us new perspectives on the universe.”\n",
"\n",
"Launched Dec. 25, 2021, Webb is the largest and most powerful space science telescope ever built. In July 2022, the Webb team officially began Webb’s mission to explore the infrared universe.\n",
"\n",
"“Congratulations to the James Webb Space Telescope team for pushing the boundaries to reveal our history through the earliest, most distant galaxies that shine in the cosmos,” said Nicola Fox, associate administrator for the Science Mission Directorate at NASA Headquarters. “The awe-inspiring images and spectra are already delivering on Webb’s promise to unlock a new era of science.”\n",
"\n",
"With its optics performing nearly twice as well as the mission required, Webb is discovering some of the earliest galaxies ever observed, peering through dusty clouds to see stars forming, and delivering a more detailed view of the atmospheres of planets outside our solar system than ever before. Webb has also captured new views of planets within our solar system, including the clearest look at Neptune’s rings in decades. The Collins Trophy award recognizes the extraordinary accomplishments and significant contributions of the team members who designed, developed, and now operate the Webb mission.\n",
"\n",
"A large, bright star shines from the center with smaller stars scattered throughout the image. A clumpy cloud of material surrounds the central star, with more material above and below than on the sides.\n",
"The luminous, hot star Wolf-Rayet 124 (WR 124) is prominent at the center of the James Webb Space Telescope’s composite image combining near-infrared and mid-infrared wavelengths of light from Webb’s Near-Infrared Camera and Mid-Infrared Instrument.\n",
"Credits: NASA, ESA, CSA, STScI, Webb ERO Production Team\n",
"“The James Webb Space Telescope is allowing us to study a time when the first stars and galaxies formed in the universe. This amazing achievement has been made possible over many years by the dedication of the thousands of people on the team, who have pushed the boundaries of technology to deliver this spectacular space telescope,” said Mark Clampin, director of the Astrophysics Division for the Science Mission Directorate at NASA Headquarters. Clampin delivered remarks after accepting the 2023 Collins Trophy on March 23, on behalf of the Webb team.\n",
"\n",
"Winners receive a trophy featuring a miniature version of the “Web of Space” sculpture, which was created by John Safer from Washington, D.C. The award was established in 1985 and was renamed in honor of Apollo 11 astronaut Michael Collins in 2020.\n",
"\n",
"Webb, an international mission led by NASA with its partners ESA (European Space Agency) and CSA (Canadian Space Agency), is the world’s premier space science observatory. Its design pushed the boundaries of space telescope capabilities to solve mysteries in our solar system, look beyond to distant worlds around other stars, and probe the mysterious structures and origins of our universe and our place in it.\n",
"\n",
"Recently, the Webb mission’s accomplishments also have been recognized by organizations including the Space Foundation, National Space Club and Foundation, Aviation Week, Bloomberg Businessweek, Popular Science, and TIME.\n",
"\"\"\""
]
},
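{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"For texts longer than the model's context window, one option (a sketch of my own, not part of the original gist) is to split the input into overlapping chunks, summarize each chunk, and then summarize the concatenated partial summaries. A naive character-based splitter is enough to show the idea:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def chunk_text(text, chunk_size=1500, overlap=100):\n",
"    # naive character-based splitter; langchain's text splitters are a more robust alternative\n",
"    chunks, start = [], 0\n",
"    while start < len(text):\n",
"        chunks.append(text[start:start + chunk_size])\n",
"        start += chunk_size - overlap\n",
"    return chunks\n",
"\n",
"# map step: summarize each chunk, then reduce: summarize the joined partial summaries\n",
"# partials = [summary_chain.run(c) for c in chunk_text(text)]\n",
"# print(summary_chain.run(\"\\n\".join(partials)))"
]
},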
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"The James Webb Space Telescope team's accomplishments have been recognized with a 2023 Collins Trophy Award from the National Air and Space Museum.\""
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"summary_chain.run(text)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}