This gists has sections on
Why study prompting? LLM systems have a training process. Prompting is not based on magical thinking, they try to use peculiarities of the training process. For example 'lets think step by step' was/is the prefix for many training example, so the presence of these words in the prompt appeared be triggering chain-of-thought reasoning, at some stage. see wiki
by Isa Fulford - she also wrote OpenAI cookbook and Andrew Ng
- Base LLM trained on massive and diverse corpus of training text
- Instruction tuned LLM a base LLM that has further been trained on instructions on how to achieve goals (like web searching). trining technique is reinforcement
- course shows examples on how to be 'clear and specific', and give LLM time to think
- using
pip install openaihere. (done that before)
They always try out a prompt on a large variety of input cases - for better testing! Developing a prompt is an iterative process, so that evaluating the effects of a given change is quite important.
Be 'clear and specific'
- use delimiters to indicate distinct parts of the input vs prompt text (Summarize the text delimited by triple backticks. ``` text to summarize ```) this also helps to avoid prompt injection, as the input is treated differently (? but an attack can put in ``` into his text and attempt to escape right after that ?)
- prompt should ask for structured output, to make it easier to the output in next stage (like putting output between xml tags <open/> ... <close> or 'provide json output')
- provide an example for he expected output, otherwise the result may differ from what a next stage would expect ('use the following format')
- prompt should ask to . 'check if conditions are satisfied' or 'check assumptions required'
- few shot prompting: provide examples for successful execution of task (your task is to answer in a consistent style: ...in example: dialog that starts with who is speaking ...)
Give the model time to think by breaking the task up into steps
- example: instead of asking 'check if the students solution is correct' ask for
- first, work out your own solution to the problem
- then compare your solution to the students solution end evaluate if the students solution is correct'
- being more specific about the process steps also helps to reduce model hallucinations, like 'find relevant information about the text then answer questions based on the relevant information'
(? my question: what if there are multiple ways to solve a given problem ?)
on the process of: formulating prompt / trying it ot / assessing result + error analysis
example 0f iterative process: task of summarizing a fact sheet. Make first attempt and then add details:
- first attempt "summarize a fact sheet for retail store...." - resulting review is too long
- adds "use at least 50 words". (tries other variants like "use at most 280 characters" - funny, but the result is of same length :-)
- then doesn't like the content of the summary so adds "the description is intended for furniture retailers, so should be technical in nature and focus on the materials the product is constructed from"
- then adds "at the end of the review add the product id in the technical specification.
Important! once it works, use the prompt on larger set of examples (like 50-100) Find average / worst case of performance, etc.
pretty similar to the previous section.
- Also says: prompt should state the purpose of the review, as clarification (example: 'to give feedback to the XXX department' )
- or tell it to 'extract information relevant to the XXX department' instead of 'summarizing'. Gives a more concise result.
Tasks: sentiment analysis, info extractions, etc. Sentiment analysis is one of the strengths of LLMs (as it is much easier to use/requires much less work, compared to ML without LLM !) Andrew NG is going through some example prompts in this area.
- first prompt f"What is the sentiment of the following product review, which is delimited with triple backticks? ```{review}```"
- instead ask for a single word answer: f"""What is the sentiment of the following product review, which is delimited with triple backticks? Give an answer in a single word, "positive" or "negative" ```{review}```""" - this makes it answer to process the the result!
- ... "format your answer as a list of lower case words, separated by comma" - this makes it easier to understand how the customer perceives the product
- ... "is the review expressing anger?" - valuable question for customer support!
- use prompt to extract info from review """ ... Identify the following items from the review text: - item purchased by review - company that made the item. Format your review as a JSON with "Item" and "Brand" keys ... """
- says you can combine different tasks of sentiment analysis & info extraction in a single prompt.
Summarization:
- """determine the five topics that are being discussed in the following text""""
- """Determine whether each item in the following list is a topics is a topic in the text below. Give your answer as a list with 0 or one for each topic.""" this can be used to build a news alert.
- better to ask for json output and specify the keys...
translating/proofreading/conversion between formats.
- f"""Translate the following text to Spanish. Messages included in backticks '''{msg}'''"""
- or identify which language, or to translate into several languages at once.
- Tone transformation: """translate the following from slang into business letter..."
- or translate between json to html
- spellcheck or grammar checking """Proofread and correct the following text. If you find no errors just say 'No error' '''{msg}'''""" (still: sometimes puts quotes around answer, sometimes doesn't - as output format is not specified in the prompt)
- tip: there is the redline package to show differences between two texts. (redlines.Redlines(in_msg))
- also can be very specific about output: """Proofread and correct this review, Make it more compelling. Ensure it follows APA style guide and targets an advanced reader. Output in markdown format '''{msg}'''""" (I didn't even know that 'APA style guide is developed by the American Psychological Association for clear, precise, and inclusive scholarly communication, commonly used in the social sciences')
Take short text and expand on it - write an essay, write reply to an inquiry, etc. (says: don't use it to create spam)
- f""""You are a customer service AI assistant. Your task is to send an email reply to a valued customer. If the sentiment is positive of neutral, thank them for their review. If the sentiment is negative, apologize and suggest that they can reach out to customer service. Make sure to use specific details from the review. Write in a concise and professional tone. Sign the email as `AI customer agent`. Customer review: '''{msg}''' Customer sentiment: '''{sentiment}'''""""""
- use temperature (0 - only use most likely next token prediction, 30 - can use a less likely prediction too)
says use zero for task that require certain outcome (like json creation) and a bit higher temperature for chat agents, for example.
You call a chat bot with the conversation history in json format (the context, in AI speak)
[ { "role": "<conv-role>", "content": "<this turn of the converesation>" } ]
conv-role = assistant|user|system
system - the prompt for chatgtp, that governs the conversation.
assistant - chatgtp said something
user - the user said something
The system prompt for a bot that receives pizza orders:
{ "system": "You are OderBot, an automated service to collect orders for a pizza restaurant. You first greet the customer, then collect the order, then summarize it and check for a final time if the customer wants to add anything else. If it's a delivery, you ask for an address. Finally you collect the payment. Make sure to clarify all options, extras and sizes to uniquely identify the items in the menu. You respond in a short, friendly and conversational style. The menu includes \`\`\`{menu items\`\`\`}"}
- A second prompt for taking this conversation history and to create a json formatted order (Agentic processing!)
What they don't talk about: jailbreak prompts are using some of the same ideas as stated in this prompting guide:
- dan jailbreak prompt
- jailbreak prompt gist see discussions on this page!
These jailbreak prompts are building some kind of alternative narrative, also the extensive use of delimiters. These guys went through many iterations (iterative development) Obviously there is an arms race between the jailbreak prompters and the AI firms...
It makes sense to learn from the jailbreak people, when constructing your prompts.
The 3-rule prompt that stops the LLM from guessing - by Dylan Davis
The tips are specific for searching info in documents, but they have some general merit:
- choose the most advanced model with advanced reasoning capability (check the default setting, usually it is a less advanced model)
- for document extraction: "If information is not found, say 'not found in document'" - the system will rather guess, then admitting to not knowing the answer, which is a frequent source of hallucinations.
- for document extraction: 'for each claim, cite the specific location and relevant quote' - asking for grounded answers also reduced hallucinations.
- 'if uncertain mark as unverified'
- 'Re-scan the document, for each claim, give me the exact quote that supports it. If you can't find a quote, take the claim back' - ask the system to re-check the answer.
link to course by Elie Schoppik - he is also head of technical education at Anthropic - says Linkedin
-
Anthropic has done skills as a kind of open standard defined here - other platforms like gemini / etc also use it.
-
And they have published a book on writing skills
-
Skills can be used to just package a prompt, you can refer to such a repeated prompt via the skill name - instead of repeating the same prompt again and again.
-
Skills can be used to packaging a kind of workflows - which steps to do. Such a workflow can include usage of external tools (based on MCP protocol), or use other sub-workflows (packaged as different skills) or python scripts to run as part of the workflow.
- Asked Claude here: Claude says that skills tend more to be general-purpose, reusable workflows, rather than specific workflows. The idea is to not over-specify the approach, it is assumed that the language model already knows how to do things. When there are multiple possible approaches, then the skill should say which one to use. So it is kind of trying to create a repeatable generalized approach. That's complicated, skills are a kind of open standard, however language models are very different in their pre-training and post-training Claude says 'Skills blur the line between "capability specification" and "workflow automation"'.
Nowadays skills are also supported by other tools, such as Copilot in VSCode - it's standard format for packaging & referencing workflows.
The course material has it's own github repo here
They use free version of claude in this course
The example SKILL.md for this lesson defines a process for analyzing marketing campaigns, the input requirements, data quality check, output format, etc
each skill is a SKILL.md markdown file in its own directory - named after name of skill (see later) - that's the 'entry point'.
(Claude Opus also wants it all to packaged in a zip file with the same name)
Each SKILL.md has initial YAML-like section (like this one):
- the
nameis important, the skill is referenced by it's name. (must be all lower case, - sign as word separator, no spaces, no reserved words like 'claude' or 'anthropic') description- used for by the LL<>, to understand when this skill needs to be used - there is a limit of 1024 characters on the skill description.
---
name: analyzing-marketing-campaign
description: Analyzing weekly marketing campaign data, etc. etc. etc.
---
- SKILL.md can reference additional files (like additional markdown file) - these are in
referencessubdir, under the skills directory. - it can use python scripts (these are under scripts/ directory)
- it can reference additional files in assets/ subdirector (like picture / document templates etc )
- references subdir - you can split up a big skill files into scenarios - they call that progressive disclosure : if it decide that the sub-scenario does not apply to the current task, then it can avoid reading the referenced file ino the context window, so as to waste fewer tokens. Same with scripts that are part of the skill.
You can zip this directory and upload the zip file to the claude chatbot.
|
\---- analyzing-marketing-campaign
|
|--- SKILL.md
\-- references
|
\--budget_reallocation_rules.md # file invoked by SKILL.md in some particular scenario - (Note underscore separator for file files within skill, as opposed to skill name, that has - as separator)
How is this all used? The LLM engine reads all the skills name and descriptions, when it starts up - on session init (t) When the LLM reads a user prompt: it decides if any of the skills are relevant, based on the description of the skill. Only in this case is the skill 'read' into the context window - so that the whole mechanism is supposed not to be a waste of tokens for the user.
(still, all the descriptions of all skills must read into context window, so there is some bloat of context window)
- Mentions that claude chatbot comes with built-in skills for reading excel files (? not sure about that after talking to claude, it seems to have a skill builder skill, and that's it. ?)
How do skills relate to things like Mcp?
- Says mcp can be like tools that are used by the skill. Sort of the skill describing the workflow and context of leveraging the mcp api - upon demand (I think Claude uses 'Connectors' and/or 'plugins'' as the entity of encapsulating these)
- Same for subagents (subagents have their own context window). Only use them for specialized tasks.
Anthropic prebuilt skills
- They have a github repo for prebuilt skills
- these skills run scripts with dependent packages, but there are no instructions for for installing them. That would require a virtual env with all dependencies installed.