AI engineers operate at a higher level of abstraction than machine learning (ML) engineers or large language model (LLM) engineers, and don't necessarily need to know how to build an LLM or an ML model.
AI engineering builds upon ML systems1, but focuses on large-scale, ready-made models (a.k.a. base models).
The distinct skills that AI engineers need to know include prompt engineering, working with data, LLM infrastructure, and evaluating LLMs.
According to Andrej Karpathy's AI Engineering framing:
LLMs created a whole new layer of abstraction and profession.
I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone, there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad.
ML people train algorithms/networks, usually from scratch, usually at lower capability.
LLM training is becoming sufficiently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large scale training of transformers on supercomputers.
In numbers, there's probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers.
One can be quite successful in this role without ever training anything.
I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (IMO ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵💫
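The "Software 3.0" idea above can be made concrete with a small sketch: the "program" is an English prompt template, ordinary (Software 1.0) code fills in its variables, and the result would be interpreted by an LLM (itself a Software 2.0 artifact). The template and helper below are purely illustrative; the model call is left out.

```python
# A minimal illustration of "Software 3.0": the "program" is an English
# prompt template, assembled by ordinary Software 1.0 glue code, and
# ultimately interpreted by an LLM (a Software 2.0 artifact).

PROMPT_TEMPLATE = """You are a customer-support assistant.
Summarize the following ticket in one sentence, then label its
urgency as LOW, MEDIUM, or HIGH.

Ticket:
{ticket}
"""

def build_prompt(ticket: str) -> str:
    """Software 1.0 glue code that assembles the Software 3.0 'program'."""
    return PROMPT_TEMPLATE.format(ticket=ticket.strip())

prompt = build_prompt("My invoice was charged twice this month!")
# In a real application this string would be sent to an LLM API;
# which provider and client library you use is up to you.
print(prompt)
```

Note that all three paradigms appear even in this tiny example: the template is hand-written English "code", the glue is plain Python, and the interpreter is a learned model.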
Want to get started with LLMs? Simon Willison wrote a blog post, "Making Large Language Models work for you". It provides a good high-level explanation of LLMs in an easy-to-digest way.
As generative AI becomes more widely used, there is a growing need for AI engineers who can build robust and reliable AI-powered applications. This has led to the "Rise of the AI Engineer".2
Charlie Guo, in "The Emergence of AI Engineering", maps out a possible syllabus for someone training to become an AI Engineer.
Matt Rickard on "Where AI Fits in Engineering Organizations".
I'm writing a practical guide specifically designed for software engineers interested in building applications with foundation models such as LLMs, Stable Diffusion models, etc.
It provides a systematic introduction to building applications using LLMs such as ChatGPT, GPT-4, Anthropic Claude, Llama 3, and other open models.
The guide covers the principles of AI engineering:
- What is the new AI stack? What is it, and how does it differ from traditional ML engineering?
- A practical guide to using LLMs effectively
- Performance considerations: techniques to mitigate model latency and cost
- Advanced tutorials:
  - Applying agentic behaviors in LLM applications without using frameworks/libraries
  - Determining when (and when not) to fine-tune an LLM. Most software engineers talk about LLMs, but few have the hands-on knowledge to train, validate, and deploy fine-tuned LLMs for specific (narrow-domain) problems. You will learn to run an end-to-end LLM fine-tuning project with the latest tools and best practices.
- Lastly, the guide touches on Retrieval-Augmented Generation (RAG) systems.
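To give a flavor of the "agentic behaviors without frameworks" topic above, here is a minimal sketch of an agent loop in plain Python: the model either requests a tool call or returns a final answer, and ordinary code dispatches the tools. Everything here is a hypothetical stand-in — in particular, `fake_model` is a stub that a real application would replace with a call to an actual LLM API.

```python
import json

# A minimal agentic loop with no frameworks: the model (stubbed out as
# `fake_model`) either requests a tool call or returns a final answer,
# and plain Python dispatches the tools.

TOOLS = {
    "add": lambda a, b: a + b,
}

def fake_model(messages):
    """Stand-in for an LLM: requests one tool call, then gives a final answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "name": "add", "args": {"a": 2, "b": 3}}
    result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"type": "final", "content": f"The answer is {result}."}

def run_agent(user_input, max_steps=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        if reply["type"] == "final":
            return reply["content"]
        # Dispatch the requested tool and feed the result back to the model.
        result = TOOLS[reply["name"]](**reply["args"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("What is 2 + 3?"))  # → The answer is 5.
```

The loop is the whole "framework": a message list, a dispatch table, and a step budget. Swapping the stub for a real model client is the only change a production version would need conceptually.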