@lukestanley
Last active February 13, 2024 07:20
LangChain spicy comment detox loop using Anthropic
import json
import time
from langchain.chat_models import ChatAnthropic
from langchain.schema import HumanMessage
FAST = "claude-instant-1"
GOOD = "claude-1.3"
model = GOOD
llm = ChatAnthropic(verbose=True, streaming=True, model=model)
instant_llm = ChatAnthropic(verbose=True, streaming=True, model=FAST)
start_time = time.time()
improve_prompt = """
Your task is to rephrase inflammatory text, so it is more calm and constructive, without changing the intended meaning.
The improved text should have a softened tone, avoiding judgemental and extreme words.
Make sure the refined text is a good reflection of the original text, without adding new ideas.
1. Rather than accusations, share perspective.
2. Remove or soften judgemental language.
3. Focus on specific actions rather than character.
4. Rephrase extreme words like "always", "never" or "everyone" to be more moderate.
5. Focus on softening the tone, rather than changing the substance or meaning.
6. Use gentler alternatives to express similar points.
7. Don't add completely new ideas, ONLY build upon what's already there.
8. For example, you might reframe an existing point to be more balanced. Never introduce unrelated concepts.
9. Make everyone happy! Make them INFORMED and not *offended*. Make the original author *content* that their points were *honoured* by your edit, by refining their text without losing the original intent.
Example:
Example input text: "You're always annoying me. You never listen to me."
Example improved text output: {"text":"I am frustrated by your behaviour. Could you listen to me better?"}
End of example.
Here is the real input text to improve:
`{original_text}`
Previous rephrasing attempts:
{previous_suggestions}
Provide your improved version in this format:
{"text":"STRING"}
To get a good answer, make the original text non-inflammatory, while being as faithful as possible to the ideas in the original text. Use valid JSON then stop; do not add any remarks before or after the JSON.
"""
critique_prompt = """
Critique the text. We prefer that the edit avoids inflaming discussions!
We also prefer concise text, and a similar semantic intent to the original.
Here is the original text:
`{original_text}`
Here is the text to critique:
`{last_edit}`
Output your response as valid JSON in this format:
{
"critique":"STRING"
}
E.g:
{
"critique":"This is too fluffy and different from the original intent."
}
Please critique the text."""
spicy_scorer_prompt = """
Score the text.
A calm spicy_score of 0 is ideal. A spicy_score of 1 is the worst, very inflammatory text that makes the reader feel attacked.
Here is the original text:
`{original_text}`
Here is the text to score:
`{last_edit}`
The float variable is scored from 0 to 1.
Output your response as valid JSON in this format, then stop:
{
"spicy_score":FLOAT
}
"""
faith_scorer_prompt = """
Score the text.
A score of 1 would have the same semantic intent as the original text. A score of 0 would mean the text has lost all semantic similarity.
Here is the original text:
`{original_text}`
Here is the new text to score:
`{last_edit}`
The float variable is scored from 0 to 1.
Output your response as valid JSON in this format, then stop:
{
"faithfulness_score":FLOAT
}
"""
original_text = """Stop chasing dreams instead. Life is not a Hollywood movie. Not everyone is going to get a famous billionaire. Adjust your expectations to reality, and stop thinking so highly of yourself, stop judging others. Assume the responsibility for the things that happen in your life. It is kind of annoying to read your text, it is always some external thing that "happened" to you, and it is always other people who are not up to your standards. At some moment you even declare with despair. And guess what? This is true and false at the same time, in a fundamental level most people are not remarkable, and you probably aren't too. But at the same time, nobody is the same, you have worth just by being, and other people have too. The impression I get is that you must be someone incredibly annoying to work with, and that your performance is not even nearly close to what you think it is, and that you really need to come down to earth. Stop looking outside, work on yourself instead. You'll never be satisfied just by changing jobs. Do therapy if you wish, become acquainted with stoicism, be a volunteer in some poor country, whatever, but do something to regain control of your life, to get some perspective, and to adjust your expectations to reality."""
# From elzbardico on https://news.ycombinator.com/item?id=36119858
def fix_and_parse_json(text):
    """Escape stray double quotes inside JSON string values, then parse."""
    result = []
    state = "normal"
    escape_next = False
    for i, ch in enumerate(text):
        if ch == "\\":
            escape_next = not escape_next
            result.append(ch)
        elif ch == '"' and not escape_next:
            if state == "normal":
                # A quote right after ':' or ',' (optionally one space later) opens a value.
                if (
                    i == 0
                    or text[i - 1] in ":,"
                    or text[i - 1].isspace()
                    and text[i - 2] in ":,"
                ):
                    state = "in_string"
                result.append(ch)
            elif state == "in_string":
                # A quote followed by a structural character closes the value.
                if (
                    i + 1 == len(text)
                    or text[i + 1] in ":},{"
                    or text[i + 1].isspace()
                    and i + 2 < len(text)  # Guard against reading past the end
                    and text[i + 2] in ":},{"
                ):
                    state = "normal"
                else:
                    result.append("\\")  # Add escape character
                result.append(ch)
        else:
            result.append(ch)
            escape_next = False
    fixed_text = "".join(result)
    try:
        return json.loads(fixed_text)
    except json.JSONDecodeError:
        print("Error: Failed to parse JSON.")
        return None
def find_json_start(text: str) -> int:
    # Return the index of the first "{" or "[", or -1 if there is none.
    for i, char in enumerate(text):
        if char in ("{", "["):
            return i
    return -1
def suggest_json_fix(text: str) -> str:
    prompt = f"Fix the JSON below, only output JSON, and nothing else: \n```{text}```"
    response = instant_llm([HumanMessage(content=prompt)])
    return response.content.strip()
def json_loads(text: str, required_keys: list = None) -> dict:
    required_keys = required_keys or []
    try:
        result = json.loads(text, strict=False)
    except json.JSONDecodeError:
        # Skip any remarks the model wrote before the JSON starts.
        start_idx = find_json_start(text)
        if start_idx == -1:
            print(text)
            raise ValueError("No valid JSON object found in text")
        cleaned_text = text[start_idx:]
        result = fix_and_parse_json(cleaned_text)
        if not result:
            # The heuristic repair failed, so try using the LLM model to fix the JSON.
            fixed_json_text = suggest_json_fix(cleaned_text)
            try:
                result = json.loads(fixed_json_text)
            except json.JSONDecodeError:
                print(text)
                raise ValueError("Unable to parse AI-fixed JSON text")
    for key in required_keys:
        if key not in result:
            print(text)
            raise ValueError(f"Required key '{key}' not found in JSON object")
    return result
def replace_text(template: str, replacements: dict) -> str:
    # Plain string replacement, because the prompts contain literal JSON braces.
    for key, value in replacements.items():
        template = template.replace(f"{{{key}}}", value)
    return template
def process_ai_response(prompt: str, required_keys: list = None, llm=llm) -> dict:
    response = llm([HumanMessage(content=prompt)])
    return json_loads(response.content, required_keys=required_keys)
def query_ai_prompt(prompt_template, replacements, required_keys, llm=llm):
    new_prompt = replace_text(prompt_template, replacements)
    return process_ai_response(new_prompt, required_keys=required_keys, llm=llm)
# Best suggestions so far; update_suggestions() keeps only the top two.
suggestions = []
# Step 2: Customize the functions `improve_text`() and `critique_text`()
def improve_text():
    global suggestions
    replacements = {
        "original_text": json.dumps(original_text),
        "previous_suggestions": json.dumps(suggestions, indent=2),
    }
    resp_json = query_ai_prompt(improve_prompt, replacements, required_keys=["text"], llm=llm)
    return resp_json["text"]
def critique_text(last_edit):
    replacements = {"original_text": original_text, "last_edit": last_edit}
    # Query the AI for each of the new prompts separately
    critique_resp = query_ai_prompt(
        critique_prompt, replacements, required_keys=["critique"], llm=instant_llm
    )
    faithfulness_resp = query_ai_prompt(
        faith_scorer_prompt, replacements, required_keys=["faithfulness_score"], llm=instant_llm
    )
    spiciness_resp = query_ai_prompt(
        spicy_scorer_prompt, replacements, required_keys=["spicy_score"], llm=instant_llm
    )
    # Combine the results from the three queries into a single dictionary
    combined_resp = {
        "critique": critique_resp["critique"],
        "faithfulness_score": faithfulness_resp["faithfulness_score"],
        "spicy_score": spiciness_resp["spicy_score"],
    }
    return combined_resp
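def calculate_overall_score(faithfulness, spiciness):
    # Faithfulness is the baseline; the spiciness term adds at most
    # (1 - baseline_weight) * faithfulness on top of it.
    # NOTE: as written, a higher spicy_score raises the overall score, even though
    # the scorer prompt treats a spicy_score of 0 as ideal.
    baseline_weight = 0.8
    overall = faithfulness + (1 - baseline_weight) * spiciness * faithfulness
    return overall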
def should_stop(
    iteration,
    overall_score,
    time_used,
    min_iterations=2,
    min_overall_score=0.85,
    max_seconds=60,
):
    # Stop on a good enough score, or on an acceptable one once time has run out.
    good_attempt = iteration >= min_iterations and overall_score >= min_overall_score
    too_long = time_used > max_seconds and overall_score >= 0.7
    return good_attempt or too_long
def update_suggestions(critique_dict):
    global suggestions
    critique_dict["overall_score"] = round(
        calculate_overall_score(
            critique_dict["faithfulness_score"], critique_dict["spicy_score"]
        ),
        2,
    )
    # last_edit is the module-level variable set in the main loop.
    critique_dict["edit"] = last_edit
    suggestions.append(critique_dict)
    # Keep only the two highest-scoring suggestions.
    suggestions = sorted(suggestions, key=lambda x: x["overall_score"], reverse=True)[
        :2
    ]
def print_iteration_result(iteration, overall_score, time_used):
    global suggestions
    print(
        f"Iteration {iteration}: overall_score={overall_score:.2f}, time_used={time_used:.2f} seconds."
    )
    print("suggestions:")
    print(json.dumps(suggestions, indent=2))
max_iterations = 20
last_edit = None  # Set by the first successful improve_text() call
for iteration in range(1, max_iterations + 1):
    try:
        if iteration % 2 == 1:
            # Odd iterations propose a new rewrite.
            last_edit = improve_text()
        elif last_edit is not None:
            # Even iterations critique and score the latest rewrite.
            critique_dict = critique_text(last_edit)
            update_suggestions(critique_dict)
            overall_score = critique_dict["overall_score"]
            time_used = time.time() - start_time
            print_iteration_result(iteration, overall_score, time_used)
            if should_stop(iteration, overall_score, time_used):
                print(
                    "Stopping\nTop suggestion:\n", json.dumps(suggestions[0], indent=4)
                )
                break
    except ValueError as e:
        # A reply that could not be parsed costs this iteration only.
        print("ValueError:", e)
        continue
"""
Outputs something like this:
{
"critique": "The summarized text captures some of the original text's major themes around internal focus and adjusting expectations, but loses many details and the original text's direct and assertive tone. The text comes across as gentler and more tentative. ",
"faithfulness_score": 0.8,
"spicy_score": 0.2,
"overall_score": 0.83,
"edit": "While your concerns may be valid, reframing them in a more constructive manner could help get your points across. \n \n Focusing internally instead of purely on external factors can provide valuable insight. While dreams do not always align with reality, adjusting expectations accordingly may alleviate frustration. Describing events as purely due to external factors implies they hold undue influence.\n \n Though others' behavior may fall short of your standards, we all possess inherent worth by virtue of our shared humanity. A gap may exist between how you perceive your performance and how others see it.\n \n Taking time for introspection and cultivating a more pragmatic outlook through activities like study, reflection and varied experiences could help develop perspective and gain more control over one's life."
}
"""
@irgmedeiros

Hey, good job on that! Yes, it's a raw piece of code, but it highlights precisely what software (or technology) is meant to be: a positive force for improving humanity.

@lukestanley
Author

Thanks. It is a bit less raw and a bit more robust now! @irgmedeiros @NomiJ
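
The `overall_score` of 0.83 in the gist's sample output can be reproduced by hand from its `faithfulness_score` (0.8) and `spicy_score` (0.2). The sketch below restates the arithmetic of `calculate_overall_score` as a standalone function with no LLM calls. The names `overall` and `penalised_overall` are illustrative assumptions, and `penalised_overall` is a hypothetical variant that is not part of the gist: it shows what the score would be if calmer text were rewarded, since the formula as written adds a small bonus for a higher `spicy_score`.

```python
def overall(faithfulness: float, spiciness: float, baseline_weight: float = 0.8) -> float:
    """Same arithmetic as calculate_overall_score in the gist."""
    return faithfulness + (1 - baseline_weight) * spiciness * faithfulness


def penalised_overall(
    faithfulness: float, spiciness: float, baseline_weight: float = 0.8
) -> float:
    """Hypothetical variant (not in the gist): the bonus grows as spiciness falls."""
    return faithfulness + (1 - baseline_weight) * (1 - spiciness) * faithfulness


# Values taken from the sample output: faithfulness 0.8, spiciness 0.2.
# 0.8 + 0.2 * 0.2 * 0.8 = 0.832, which rounds to 0.83.
print(round(overall(0.8, 0.2), 2))

# With the hypothetical variant: 0.8 + 0.2 * 0.8 * 0.8 = 0.928, which rounds to 0.93.
print(round(penalised_overall(0.8, 0.2), 2))
```

With the gist's default thresholds, 0.83 is below `min_overall_score=0.85`, so `should_stop` would only accept this suggestion once `max_seconds` has elapsed (it clears the 0.7 fallback threshold).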