@cedrickchee
Last active May 19, 2024 15:45

AI Agents

In response to Dr. Andrew Ng's letter, "Four AI agent strategies that improve GPT-4 and GPT-3.5 performance".

When I read Andrew's letter, I imagined him as Steve Ballmer, shouting "Agentic, agentic, agentic workflows!". Haha, we can hear you. No need for that.

Before we move on, let's be clear about what an agent is in this context. The context is that we're now in 2024, and LLMs such as GPT-4 and Llama 3 are the state of the art. In early 2022, everybody in the field knew about agents from RL, but the general public had no conception of what they were. The public narrative was still that everything is a chatbot. All sorts of different things are being called agents. Chatbots are being called agents. Things that make a function call are being called agents. Now when people think agent, they actually think the right thing.

An agent is something that you can give a goal and get an end-to-end workflow done correctly in the minimum number of steps.
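
To make that definition a bit more concrete, here's a toy sketch of the loop it implies: you hand over a goal, a policy picks the next action, and the run stops as soon as the policy says it's done or a step budget runs out. Everything here (`AgentRun`, `run_agent`, `toy_policy`, the hard-coded plan) is a hypothetical name made up for illustration; in a real agent the policy would be an LLM call, not a lookup in a list.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AgentRun:
    """Record of one agent run: the goal, the actions taken, and whether it finished."""
    goal: str
    steps: List[str] = field(default_factory=list)
    done: bool = False


def run_agent(
    goal: str,
    choose_action: Callable[[str, List[str]], Optional[str]],
    max_steps: int = 10,
) -> AgentRun:
    """Ask the policy for the next action until it returns None or the budget is spent.

    `done` is True only if the policy signalled completion within the budget.
    """
    run = AgentRun(goal=goal)
    while len(run.steps) < max_steps:
        action = choose_action(goal, run.steps)
        if action is None:  # the policy considers the goal reached
            run.done = True
            break
        run.steps.append(action)
    return run


def toy_policy(goal: str, history: List[str]) -> Optional[str]:
    """Hypothetical stand-in for an LLM: walks through a fixed three-step plan."""
    plan = ["open editor", "write fix", "run tests"]
    return plan[len(history)] if len(history) < len(plan) else None


result = run_agent("fix the failing test", toy_policy)
print(result.done, len(result.steps), result.steps)
```

The step count is the interesting number: two agents that both reach the goal can still differ a lot in how many actions (and model calls) they needed to get there.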

Agents have become more a part of the public narrative. Bill Gates, in his Nov 2023 blog post "AI is about to completely change how you use computers", claims that agents are the future [1]:

... in many ways, software is still pretty dumb.

In the next five years, this will change completely. You won’t have to use different apps for different tasks. You’ll simply tell your device, in everyday language, what you want to do.

This type of software—something that responds to natural language and can accomplish many different tasks based on its knowledge of the user—is called an agent.

Agents are not only going to change how everyone interacts with computers. They’re also going to upend the software industry, bringing about the biggest revolution in computing since we went from typing commands to tapping on icons.

(People still joke about Clippy) ... Clippy was a bot, not an agent. ... Agents are smarter. They’re proactive—capable of making suggestions before you ask for them. They accomplish tasks across applications.

In the computing industry, we talk about platforms—the technologies that apps and services are built on. Android, iOS, and Windows are all platforms. Agents will be the next platform.

Nobody has figured out yet what the data structure for an agent will look like. ... We are already seeing new ways of storing information, such as vector databases.

There isn't yet a standard protocol that will allow agents to talk to each other. The cost needs to come down so agents are affordable for everyone.

But we’re a long way from that point. In the meantime, agents are coming.

AI agent competition is rising

MetaGPT -> AgentCoder -> Devin/OpenDevin/Devika -> SWE-Agent -> AutoCodeRover

LLM-based agents are still in their infancy, and there’s a lot of room for improvement. Single-agent and multi-agent systems are still in the very early research/prototype stage.

AutoCodeRover is the agent king born in Singapore. Devin was announced 3 weeks ago and it's turning the spotlight on AI like it's the latest celebrity in town. Devin is generally useful but very slow and costly: production-level work requires a far larger number of model calls. AutoCodeRover is a research prototype. AgentCoder's performance (relative to GPT-4) in the graph is astounding, but the benchmark leaves no room for improvement beyond 100%.

What's Next for AI Agents

I believe that AI agents will significantly improve in the near future, but the majority of companies and their workers are still figuring out how to integrate the first layer of AI into their workflows and processes.

Agentic workflows have the potential to unlock capabilities beyond what is possible with the current approach of prompting models for one-shot/zero-shot/CoT generations. The tools to create agents are improving rapidly. The architecture/pattern is improving with ideas such as Karpathy's LLM Operating System design. The comparison between traditional LLM use and the iterative, agentic approach is interesting, whether or not it leads to a pivotal shift in AI applications.
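
Here's a rough sketch of that comparison, assuming the simplest possible iterative setup: draft, get feedback, revise. `stub_model` and `stub_critic` are hypothetical stand-ins I made up so the example runs without any API; the "model" deliberately gets it wrong on the first try and only fixes it once feedback shows up in the prompt. It's not how any particular agent framework works, just the shape of the idea.

```python
from typing import Callable, Optional

TASK = "Write add(a, b)"


def one_shot(model: Callable[[str], str], task: str) -> str:
    """Traditional use: a single generation, no chance to revise."""
    return model(task)


def iterative(
    model: Callable[[str], str],
    critic: Callable[[str, str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Agentic-style use: draft, collect feedback, revise until the critic is satisfied."""
    draft = model(task)
    for _ in range(max_rounds):
        feedback = critic(task, draft)
        if feedback is None:  # critic has no complaints, stop early
            break
        prompt = f"{task}\n\nPrevious attempt:\n{draft}\n\nFeedback:\n{feedback}"
        draft = model(prompt)
    return draft


def stub_model(prompt: str) -> str:
    """Hypothetical model: buggy first draft, correct once it sees feedback."""
    if "Feedback:" in prompt:
        return "def add(a, b): return a + b"
    return "def add(a, b): return a - b"


def stub_critic(task: str, draft: str) -> Optional[str]:
    """Hypothetical critic: executes the draft and checks a single test case."""
    namespace: dict = {}
    exec(draft, namespace)  # fine for a toy; never exec untrusted model output for real
    if namespace["add"](2, 3) == 5:
        return None
    return "add(2, 3) should return 5"


print("one-shot: ", one_shot(stub_model, TASK))
print("iterative:", iterative(stub_model, stub_critic, TASK))
```

The catch is in the call count: the iterative version costs at least one extra model call per revision round, which is the same slow-and-costly trade-off that shows up with the bigger agents.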

What's next for AI agentic workflows ft. Andrew Ng

(Andrew Ng speaks about what's next for AI agentic workflows: planning and multi-agent collaboration. Planning is like the "ChatGPT moment" for AI agents.)

The field is quickly pivoting in a world where foundation models are looking more and more like a commodity. A huge amount of gain is going to come from how you use foundation models as a well-learned behavioral cloner to go solve agents.

I'm excited to see progress on SWE-bench and new benchmarks for even more complex/bigger tasks. The performance leap with iterative workflows is compelling.

References

Footnotes

  1. https://www.gatesnotes.com/AI-agents
