@suneel-pi
Created September 16, 2025 06:27
Pi Scorer + LangSmith.ipynb
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/suneel-pi/ad9b961b0970218e3b3d9934f24887b3/pi-scorer-langsmith.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"<a href=\"https://withpi.ai\"><img src=\"https://play.withpi.ai/logo/logoFullBlack.svg\" width=\"240\"></a>\n",
"\n",
"<a href=\"https://code.withpi.ai\"><font size=\"4\">Documentation</font></a>\n",
"\n",
"<a href=\"https://build.withpi.ai\"><font size=\"4\">Copilot</font></a>"
],
"metadata": {
"id": "pi-masthead"
}
},
{
"cell_type": "markdown",
"source": [
"[Pi Scorer](https://build.withpi.ai) offers an alternative to LLM-as-a-judge with several advantages:\n",
"\n",
"* Significantly faster\n",
"\n",
"* Highly consistent — always returns the same score for the same inputs\n",
"\n",
"* Eliminates the need for prompt tuning or adjustments"
],
"metadata": {
"id": "eiR5tdXsVdNk"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "PONlf847Xp-A"
},
"outputs": [],
"source": [
"%%capture\n",
"%pip install -U langsmith openevals openai datasets"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "fAquPGo3X8hg"
},
"outputs": [],
"source": [
"# @title Setup API Keys\n",
"\n",
"import os\n",
"from google.colab import userdata\n",
"\n",
"# Get keys for your project from the project settings page\n",
"# https://smith.langchain.com/\n",
"os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n",
"os.environ[\"LANGSMITH_API_KEY\"] = userdata.get(\"LANGSMITH_API_KEY\")\n",
"\n",
"# Your OpenAI API key\n",
"os.environ[\"OPENAI_API_KEY\"] = userdata.get(\"OPENAI_API_KEY\")\n",
"\n",
"# Get your Pi API key: https://build.withpi.ai/account/keys\n",
"os.environ[\"WITHPI_API_KEY\"] = userdata.get('WITHPI_API_KEY')"
]
},
{
"cell_type": "code",
"source": [
"# @title Example Inputs\n",
"\n",
"from langsmith import Client\n",
"\n",
"inputs = [\n",
" \"\"\"{\"Clause\":\"The Recipient shall indemnify, defend, and hold harmless the Disclosing Party from and against any and all losses, damages, liabilities, deficiencies, claims, actions, judgments, settlements, interest, awards, penalties, fines, costs, or expenses of whatever kind, including reasonable attorneys' fees, arising out of or relating to any third-party claim alleging a breach by the Recipient of its obligations under this Agreement.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"The Recipient shall indemnify, defend, and hold harmless the Disclosing Party from and against any and all losses, damages, liabilities, deficiencies, claims, actions, judgments, settlements, interest, awards, penalties, fines, costs, or expenses of whatever kind, including reasonable attorneys' fees, arising out of or relating to any third-party claim alleging a breach by the Recipient of its obligations under this Agreement.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"If any term or provision of this Agreement is found by a court of competent jurisdiction to be invalid, illegal, or unenforceable, such invalidity, illegality, or unenforceability shall not affect any other term or provision of this Agreement.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"The relationship between the parties is that of independent contractors. Nothing contained in this Agreement shall be construed as creating any agency, partnership, joint venture, or other form of joint enterprise, employment, or fiduciary relationship between the parties.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"The relationship between the parties is that of independent contractors. Nothing contained in this Agreement shall be construed as creating any agency, partnership, joint venture, or other form of joint enterprise, employment, or fiduciary relationship between the parties.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"No waiver by any party of any of the provisions hereof shall be effective unless explicitly set forth in writing and signed by the party so waiving. No failure to exercise, or delay in exercising, any right, remedy, power, or privilege arising from this Agreement shall operate or be construed as a waiver thereof.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"This Agreement may be executed in counterparts, each of which is deemed an original, but all of which together are deemed to be one and the same agreement.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"The headings in this Agreement are for reference only and do not affect the interpretation of this Agreement.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"All notices or other communications required or permitted hereunder shall be in writing and shall be deemed to have been duly given when delivered in person, by nationally recognized overnight courier, or by registered or certified mail, return receipt requested, postage prepaid.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"All notices or other communications required or permitted hereunder shall be in writing and shall be deemed to have been duly given when delivered in person, by nationally recognized overnight courier, or by registered or certified mail, return receipt requested, postage prepaid.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"Neither party may use the other party's name or logo in any press release, advertising, or other promotional materials without the other party's prior written consent.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"Neither party may use the other party's name or logo in any press release, advertising, or other promotional materials without the other party's prior written consent.\"}\"\"\",\n",
" \"\"\"{\"Clause\":\"The parties agree to negotiate in good faith to resolve any dispute between them regarding this Agreement before resorting to arbitration.\"}\"\"\",\n",
"]\n",
"examples = [{\"inputs\": {\"topic\": i}} for i in inputs]\n",
"display(examples)\n",
"\n",
"# Upload to LangSmith\n",
"langsmith_client = Client()\n",
"dataset = langsmith_client.create_dataset(dataset_name=\"legal_docs\", description=\"Legal Document Summarization\")\n",
"langsmith_client.create_examples(dataset_id=dataset.id, examples=examples)"
],
"metadata": {
"id": "03cwaJ_SAeVJ"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# @title My custom application setup: legal clause summarizer\n",
"\n",
"from langsmith import wrappers\n",
"from openai import OpenAI\n",
"\n",
"# Wrap the OpenAI client for LangSmith tracing\n",
"openai_client = wrappers.wrap_openai(OpenAI())\n",
"\n",
"# Define the application logic you want to evaluate inside a target function\n",
"# The SDK will automatically send the inputs from the dataset to your target function\n",
"def target(inputs: dict) -> dict:\n",
"    response = openai_client.chat.completions.create(\n",
"        model=\"gpt-4o-mini\",\n",
"        messages=[\n",
"            {\"role\": \"system\", \"content\": \"\"\"\n",
"You are a helpful assistant that summarizes legal documents into\n",
"easily understandable language for laypeople. Your goal is to\n",
"provide clear and concise summaries that capture the key aspects\n",
"of complex legal texts, making legal information accessible to\n",
"individuals without legal expertise. The output should be a\n",
"simplified summary of the original legal document.\n",
"\"\"\"},\n",
"            {\"role\": \"user\", \"content\": inputs[\"topic\"]},\n",
"        ],\n",
"    )\n",
"    return {\"response\": response.choices[0].message.content.strip()}"
],
"metadata": {
"id": "lgLGstfrBxjC"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# @title Pi Scorer Setup\n",
"\n",
"import os\n",
"import requests\n",
"\n",
"PI_API_URL = \"https://api.withpi.ai/v1/scoring_system/score\"\n",
"HEADERS = {\n",
" \"Content-Type\": \"application/json\",\n",
" \"x-api-key\": os.environ.get(\"WITHPI_API_KEY\"),\n",
"}\n",
"\n",
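"def get_score(llm_input: str, llm_output: str, question: str):\n",
"    \"\"\"Score one input/output pair against a single question with Pi Scorer.\"\"\"\n",
"    payload = {\n",
"        \"llm_input\": llm_input,\n",
"        \"llm_output\": llm_output,\n",
"        \"scoring_spec\": [{\"question\": question}],\n",
"    }\n",
"    response = requests.post(PI_API_URL, headers=HEADERS, json=payload, timeout=60)\n",
"    # Fail loudly on HTTP errors instead of hitting a confusing KeyError below\n",
"    response.raise_for_status()\n",
"    return response.json()[\"total_score\"]\n",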
"\n",
"def make_evaluator(key, question):\n",
"    def evaluator(inputs, outputs):\n",
"        return {\n",
"            \"key\": key,\n",
"            \"score\": get_score(inputs[\"topic\"], outputs[\"response\"], question),\n",
"        }\n",
"    return evaluator\n",
"\n",
"evaluators = [\n",
" make_evaluator(\n",
" \"Clarity of Explanation\", \"Does the explanation provide a clear and detailed understanding of the clause for a layperson?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Logical Flow\", \"Does the explanation follow a logical flow that makes it easy to follow and understand?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Conciseness and Completeness\", \"Is the explanation concise while still being informative and complete?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Specificity of Examples\", \"Does the explanation include specific examples or scenarios to illustrate the clause&#x27;s application?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Use of Everyday Language\", \"Is the summarized content presented in a way that is clear and comprehensible to the general public?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Layperson Accessibility\", \"Is the explanation simplified enough to be understood by someone without specialized legal knowledge?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Conciseness and Sufficiency\", \"Is the explanation concise while still providing sufficient information to understand the clause?\"\n",
" ),\n",
" make_evaluator(\n",
" \"Error-Free Language\", \"Does the explanation avoid errors in grammar and spelling?\"\n",
" ),\n",
"]"
],
"metadata": {
"id": "szKYrvl8CT9k"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# @title Run LangSmith evaluation\n",
"\n",
"experiment_results = langsmith_client.evaluate(\n",
" target,\n",
" data=\"legal_docs\",\n",
" evaluators=evaluators,\n",
" experiment_prefix=\"\",\n",
" max_concurrency=1,\n",
")"
],
"metadata": {
"id": "1jH5xDNnDbha"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"See the results at https://smith.langchain.com/public/ff72f973-5b38-4d91-a6fe-11960eb17448/d"
],
"metadata": {
"id": "KrtEhmBJDNpO"
}
}
],
"metadata": {
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
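
The evaluator factory from the "Pi Scorer Setup" cell can be tried offline before spending any API calls. This is a minimal sketch, not the notebook's own code: `fake_score` is a hypothetical stand-in for `get_score` (it is not part of the Pi API), and the `score_fn` parameter is added here only so the stand-in can be swapped in. It assumes, as the notebook does, that an evaluator returns a dict with `key` and `score`.

```python
from typing import Callable


def fake_score(llm_input: str, llm_output: str, question: str) -> float:
    """Hypothetical stand-in for get_score: deterministic, no network.

    Returns the ratio of output length to input length, capped at 1.0.
    """
    if not llm_input:
        return 0.0
    return min(len(llm_output) / len(llm_input), 1.0)


def make_evaluator(
    key: str,
    question: str,
    score_fn: Callable[[str, str, str], float] = fake_score,
):
    """Build an evaluator that reports score_fn's result under the given key."""

    def evaluator(inputs: dict, outputs: dict) -> dict:
        return {
            "key": key,
            "score": score_fn(inputs["topic"], outputs["response"], question),
        }

    return evaluator


# Each evaluator closes over its own key and question.
evaluator = make_evaluator("Conciseness", "Is the explanation concise?")
result = evaluator({"topic": "abcdefghij"}, {"response": "abcde"})
print(result)
```

Passing the real `get_score` as `score_fn` would give the same dict shape with scores from the Pi API instead of the length ratio.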