Are LLM frameworks the new JavaScript frameworks? Have we learnt nothing?

LangChain considered harmful?

WARN: proof-reading this I realized the good folks involved in LangChain may take offence. If it comes off rude or offensive, I'm sorry. FWIW it was all written in good spirit. I remain unapologetically Homo Sovieticus in my upbringing and therefore tone deaf, especially in English. Musings of a tired old engineer. Don't read too much into it.


A few days have passed since Launching solo ep0, when I decided to embark on this LLM / TAB-driven product development experiment, and I guess I now have thoughts to get off my chest. Several things happened. Firstly, I changed the product architecture and delivery mechanism quite substantially - for the better, I think. Secondly, I can now boast a bit of hands-on exposure to actually driving GitHub Copilot, ChatGPT, et al, as well as talking to a few APIs and playing with a few prominent entries in the larger LLM ecosystem. Today's hopefully short hoot concerns my impressions after the excursion into the latter - the "hyped" frameworks and libs that kept showing up on my Twitter timeline.

I am severely sleep deprived here, so don't expect coherence, but I'll give it a go. I'm not really one to dunk on software projects, but, I fear, what follows may be construed as an attack on LangChain specifically. TBH I've nothing against it in particular, and I certainly can't offer an opinion on how well it is written. The amount of effort it takes to ship something like it, with docs and integrations and all, I can only imagine - it is to be celebrated. If you find it speeds up your development - more power to you (or them, I guess). What I want to offer today is an observation - a cautionary "tale" - about the approach it takes and where it may lead if you don't exercise restraint and merrily jump on the LLM bandwagon without thinking it through.

Allow me to borrow a few quotes from LangChain's own website. Firstly, it is a framework, alright.

LangChain is a framework for developing applications powered by language models

I mean, duh. There are now a few of those, with varying amounts of ambition. LangChain is nothing if not ambitious. Kudos to them for shipping both Python (which I don't know) and TypeScript / JavaScript (which I do). Not only that! They also graciously provide docs for both. At the time of this writing their documentation website exhibits a bit of an issue, so be warned when you use it. I find the side table of contents omits a lot of the subsections, which you can only access by clicking "next" to navigate pages in order. Not a huge problem, but it makes it difficult to know what's available. That aside, it is easy enough to start with, and it claims it can do a lot. I don't doubt it. My issue is with ... well, should it? Does it really benefit the user? Does it solve non-trivial problems for me? Let's see.

The main value props of LangChain are:

  • Components: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
  • Off-the-shelf chains: a structured assembly of components for accomplishing specific higher-level tasks

Having read this and played with it a bit, another way of asking the same question would be this. Are the abstractions it builds on top of LLM provider APIs and tools worth it? Abstractions in my experience always come at a cost. It is up to us software engineers to exercise good judgement whether to opt in or go to the source - drop down a level or two. Benefits had better outweigh the costs by a lot, at least as far as I'm concerned. More often than I would like to admit I've had to reverse engineer the libraries and frameworks I was using (usually handed down to me) and go to the original lower-level implementation or API. It's now happened so many times I've become paranoid and view every framework claiming to simplify my life with extreme suspicion. You know what's coming, don't you? In no time flat I was jumping around inside the LangChain codebase, cause documentation was no help, trying to figure out what the output objects for some calls contained. I reversed it alright. What can I say. It is rather typical of frameworks that try to abstract a few API providers with a unifying API - essentially MITMing your calls - to have base or abstract types or interfaces or classes, then have specific implementations inherit or extend as needed, possibly mixing in a few things. All in good spirit, except it is a freaking nightmare to figure out where things come from or where they end up. Even with types - I was looking at the TypeScript implementation here. Unless you take on extra work to narrow your types, it may not be obvious, and with TypeScript it is quite tempting to do the equivalent of typecasting to any.
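
To illustrate the shape of the problem - this is a contrived sketch, not LangChain's actual types - consider what a "unified" result type can actually promise you:

```typescript
// Contrived sketch of a "unified" LLM result - NOT LangChain's real classes.
// Each provider's response carries different extras, so the shared base type
// can only promise the common denominator.
interface BaseLLMResult {
  text: string;
}

interface OpenAIResult extends BaseLLMResult {
  finishReason: "stop" | "length" | "function_call";
  usage: { promptTokens: number; completionTokens: number };
}

interface OtherProviderResult extends BaseLLMResult {
  stopSequenceHit?: string;
}

function handle(result: BaseLLMResult) {
  // The framework-level type only tells you about `text`. To get at token
  // usage you either do the extra narrowing work...
  if ("usage" in result) {
    console.log((result as OpenAIResult).usage.completionTokens);
  }
  // ...or reach for the escape hatch, which is where many of us end up.
  console.log((result as any).usage?.completionTokens);
}
```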

Ok, I admit, despite years of experience the "hype" got me a little bit this time. Mostly because LLMs and agents and all that have been developing at such a fast pace, I was unsure where to even begin. Had to start somewhere, therefore LangChain. Long story short, through a combination of looking at their docs, the OpenAI API reference and spec (in YAML - shudder) and the LangChain source code, I managed to find answers, but wait. Wasn't the whole point of this abstraction to absolve me of the need to learn the target APIs and tools? Our framework is all you need - they said! Here's the unified API for LLM providers - they said. Here's a unified API for storage - they said. And on, and on. Except come on. We've been through this. It doesn't work this way. It almost never works at all. Let's be honest, you build your service with e.g. the OpenAI API in mind. And it's such a simple API. Ditto other providers. Same with storage. You can't just willy-nilly pretend it isn't Postgres backing you up as you write code. It'll leak through soon enough, cause you'll need something extra. And honestly, LangChain's API offers such shallow abstractions over most everything, I question the need to even have something like it. Also, if you go this route, your documentation needs to be top notch, precise, with a ton of examples, and basically incorporate the docs of whatever you're abstracting. When you have to develop this quickly, though, it'll inevitably lag behind. In the case of hosted LLMs and inference, the output is textual - completions, duh - and therefore the API tends to be simple even when you supply functions that LLMs can call. The whole chaining thing is nothing more than stringing together answers for the next query to the same provider or something else. But I get it, writing frameworks is tempting and a lot of fun. I love it.
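
To make that concrete, here's roughly what going to the source looks like - a sketch, with the model name and env var being my assumptions, not gospel:

```typescript
// One direct call to the OpenAI chat completions endpoint, no framework
// in between. Works on Node 18+ where fetch is global.
async function complete(prompt: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) throw new Error(`OpenAI API error: ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// And the dreaded "chain": feed one answer into the next prompt.
// It's function composition, no framework required.
async function summarizeThenTranslate(text: string): Promise<string> {
  const summary = await complete(`Summarize in one sentence:\n\n${text}`);
  return complete(`Translate to French:\n\n${summary}`);
}
```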

My point is that hosted inference APIs, at least, are minimal and straightforward. At the moment of this writing I see no reason whatsoever to use frameworks like LangChain. At best they solve trivial problems, yet place at least as much cognitive burden on you as going to the target being abstracted. I fear that with all the hype surrounding the field, newcomers will get lost in the noise and not even try to split the problem into essential subtasks. I feel like, yet again, instead of writing documentation, blogs, essays, lab journal entries, how-to guides explaining things in excruciating detail with lots of examples, we rush to abstract something quite simple. Has the WEB taught us nothing?

Finally, let me argue the following perhaps non-obvious point. It is subtle. LangChain et al are called upon to "solve LLMs and agents" - bring the diverse ecosystem under a unified umbrella to make it more amenable to programming, with parts being swapped, agents "communicating", etc. What LLM completions and "agents" driven by them offer is precisely to render any such frameworks entirely unnecessary. They are, therefore, for lack of a better term, stillborn products. Bear with me here. Let's assume the goal is to pretend inference providers have agreed upon APIs and can therefore be swapped at will. First of all, I think there's a real chance the OpenAI API becomes something like S3 - something everyone will have to copy - but we can't know for sure, so let's not even go there. Assume they all differ and there's tons of them. And you're like "we are not writing custom code every time we want to switch - too much work - too many bugs - everyone knows it's bad engineering". Something to note is that the OpenAI API may give you oranges while some other API gives you apples; LangChain will necessarily give you fruit. Maybe enough for you, but you can't know that when you start. In my experience not only does it always pay to ground your first implementation in something real, narrow and concrete, it is pretty much unavoidable - we humans work off of examples - we abstract later (unless we're software architects, in which case, I'm sorry). Back to the stillborn argument. Conventional wisdom does seem to suggest that writing custom code for every API provider you might want to use as a component would be too much. But is it true? More precisely, is it still true with LLMs in mind? I mean, it is the kind of code that's boilerplate heavy and where you need to get the details of the API right. Boilerplate - aka the kind of code you'd encounter in many codebases. Details - something taken from the API spec. Hm. Now, who would be good at spitting out stuff like this? I wonder. See where I'm going with this, yet? It is only a matter of time. Months? Weeks? Days - the way things are developing. Oh, wait, here comes 🦍:

Gorilla is a LLM that can provide appropriate API calls. It is trained on three massive machine learning hub datasets: Torch Hub, TensorFlow Hub and HuggingFace. We are rapidly adding new domains, including Kubernetes, GCP, AWS, OpenAPI, and more. Zero-shot Gorilla outperforms GPT-4, Chat-GPT and Claude. Gorilla is extremely reliable, and significantly reduces hallucination errors.

--- Gorilla LLM

How do you like them apples? I got her number ... sorry, wrong story. I bet it isn't quite there yet, but I think the progress being made illustrates my point nicely. My argument, essentially, is that everything LangChain et al abstract today - inference provider APIs, memory, vector storage and embeddings, RPC and tools, etc - isn't worth abstracting. Not in code anyway. Along will come an LLM that'll generate that custom code for me. It'll either be solid already or good enough for me to beat into submission with the actual target documentation and semantics in mind.1

What's left then? Exponential backoff to avoid rate limiting? Well, isn't that something you ought to know about anyway - as in, be aware of the rate limits that providers enforce? Even this amounts to closing over your provider calls in an off-the-shelf implementation, unless you'd rather write your own.
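
The "write your own" option is about a dozen lines - a sketch, with the retry count and delays being arbitrary picks:

```typescript
// Exponential backoff with jitter, closing over any provider call. A real
// version would retry only on 429s and transient 5xx errors, not everything.
async function withBackoff<T>(call: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      const delay = Math.random() * 1000 * 2 ** attempt; // jittered, doubling
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage, reusing the `complete` helper sketched above:
// const answer = await withBackoff(() => complete("hello"));
```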

Conclusion

I want to clarify something in case it got lost in the above text. I think having projects like LangChain around is great. And it is great work. A rich ecosystem is better than a poor one. But there's rich in terms of what you can do, and rich in all the different ways you can do it. One is definitely a win. The other, I'm not so sure. Just look at web-dev, man. I'm all for letting a thousand flowers bloom, but I worry. So I write. I feel like we should spend time experimenting and writing, as in, writing tons. How-tos. Blogs. Documentation. Examples. Take web-dev - much of the complexity is systemic - hiding it through abstraction instead of documenting it made things significantly worse. E.g. HTTP is not complex, but how many devs have bothered to read the relevant RFCs? I may be barking up the wrong tree. Time will tell. If you take nothing else from this hoot, I want it to be this: try the DIY path first - there isn't enough complexity in the ecosystem yet for this to be an insurmountable programming challenge. Chances are, you'll find it refreshingly easy. I guess the real challenge is filtering all the noise that's pouring from social media, offering wrapped and ready solutions, nudging us to hurry up, creating a sense of urgency and just stress.

I feel like the more interesting, meatier parts of the domain are in "composing agents" - having them commune, and us developers visualise, monitor and debug the process. Of the attempts I've seen so far, I feel like the approach people seem to take is wrong and ignores the conversational nature of agent interactions. I think I have a great solution that's been staring us in the face all this time - one we're very familiar and comfortable with. I dunno, pretty obvious now that I'm working on my own bot.

Any comments?

As always this hoot is nothing more than a GitHub gist. Please, use its respective comments section if you have something to say. Till next time.

Coda

I'm running a (mostly) Clojure SWAT team in London at fullmeta.co.uk. Now that git.ht is up, which you should go ahead and check out right now, I am back to CTOing and looking for my next gig. If you're hiring, check out my CV and shoot an email to vlad at fullmeta.co.uk. Happy to branch out into TypeScript, Go, Erlang, C, even Java unless you're stuck below Java'18.

Follow me on Twitter

Footnotes

  1. LangChain is in good company here. There's a ton of abstraction everywhere about to become obsolete. I think it is for the best. I love it when people write Terraform and pretend they'll be able to swap AWS for GCP at the click of a button. No one ever does, cause they've been writing CloudFormation in Terraform syntax all this time and their entire codebase depends on calling AWS anyway. There's any number of examples like this.
