yashbonde/chainfury_why_fury.md

## chainfury_why_fury.md

      
    Why the fury?

(ENG-01) The first engineering blog.
ChainFury started as a weekend hackathon but has since developed into a much bigger project (dare I say, one of the last systems). The core idea behind it is rapid development (with chains), deployment (with an embeddable chatbot UI) and gathering feedback on performance. Initially it was built with langflow as inspiration, which was in turn built on top of langchain.
Chandrani's written a great starting blog on ChainFury.
Success

langflow had a brilliant description of what the UI should look like, with a great Template for each node that allowed the creation of forms to populate. This standard allowed us to build out the front end against a clean spec and focus more on the backend during the 48-hour hackathon. We focused solely on the CRUDL of the chat, chatbots and other resources, industry-standard authentication and a JS-embeddable chatbox.
With years of experience under our belt, we shipped a solid product from nothing in less time than it takes any other company to get approval for a hackathon. The team was super excited to see the first release and to take a break from building our core platform, NimbleBox.

Challenges with langchain and langflow

In this post I want to talk a little bit more about the challenges we faced. The first and foremost was that langchain is very hard to use:

there is no straightforward syntax
interactivity with chats is not built in
support for multiple modalities is hard

These are hard problems, and langchain does a great job of managing the immense complexity of hundreds of different APIs while abstracting them away from the users. There are other concerns in langflow as well, e.g. the way it determines the steps is by using the langflow.utils.payload.get_root_node() function, which looks like this:
def get_root_node(graph):
    """
    Returns the root node of the template.
    """
    incoming_edges = {edge.source for edge in graph.edges}
    return next((node for node in graph.nodes if node not in incoming_edges), None)
This might appear to be a working solution, but there is a bug hiding in plain sight. What if the DAG (graph) was initialised incorrectly? The function relies on the assumption that whoever created the DAG, via the front end, did the job correctly. You cannot guarantee correct execution this way; the correct algorithm is a topological sort, which guarantees that the DAG will be executed in the correct order, at the cost of a small overhead at runtime.
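To make the alternative concrete, here is a minimal sketch of a topological sort using Kahn's algorithm. It is not langflow's or ChainFury's code: the function name `topological_order` and the representation of edges as `(source, target)` tuples are assumptions made for illustration. Unlike a root-node lookup, it rejects graphs that reference unknown nodes or contain cycles.

```python
from collections import defaultdict, deque
from typing import Dict, Hashable, Iterable, List, Tuple


def topological_order(
    nodes: Iterable[Hashable], edges: Iterable[Tuple[Hashable, Hashable]]
) -> List[Hashable]:
    """Order nodes so that every (source, target) edge points forward.

    Hypothetical helper: raises ValueError when an edge names an unknown
    node or when the graph contains a cycle.
    """
    # The dict both de-duplicates nodes and counts unmet prerequisites.
    in_degree: Dict[Hashable, int] = {node: 0 for node in nodes}
    children: Dict[Hashable, List[Hashable]] = defaultdict(list)

    for source, target in edges:
        if source not in in_degree or target not in in_degree:
            raise ValueError(f"edge ({source!r}, {target!r}) references an unknown node")
        children[source].append(target)
        in_degree[target] += 1

    # Start with every node that has no prerequisites.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    order: List[Hashable] = []

    while ready:
        node = ready.popleft()
        order.append(node)
        # Finishing a node may unblock the nodes that depend on it.
        for child in children[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                ready.append(child)

    # Nodes stuck with unmet prerequisites can only be part of a cycle.
    if len(order) != len(in_degree):
        raise ValueError("graph contains a cycle, so no valid execution order exists")
    return order


order = topological_order(
    nodes=["prompt", "llm", "parser", "memory"],
    edges=[("prompt", "llm"), ("memory", "llm"), ("llm", "parser")],
)
print(order)  # ['prompt', 'memory', 'llm', 'parser']
```

The extra cost is one pass over the nodes and edges, which is the small runtime overhead mentioned above. Python 3.9+ also ships `graphlib.TopologicalSorter` in the standard library, which could replace the hand-written loop.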
This is just a small critique and we stand on the shoulders of giants.
Introducing fury

We thus decided to rebuild the processing engine from the ground up with abstractions that are pretty future-proof and scalable. A lot of our production code is written in golang, which has helped our team of self-taught engineers design systems with the correct responsibilities. Python, despite all its greatness, is a very limited language for building complex applications that require interactions with other systems whose behaviour is unpredictable, but more on this later.
fury, which is available at chainfury.fury, keeps the Agent (chatbot) as the centrepiece and is inspired by the Von Neumann architecture, which is the backbone of the entirety of modern computing.

Each Agent will have the following:

chat interactivity: this is the new standard interface for the 2020s
its own memory: the agent can remember things as it wants and store them in the patterns it wants
multiple source models: models can provide all kinds of modalities as outputs
chains: so developers can choose to build their own flow and uniqueness


Code

Here is the pseudocode I have in mind for this:
from __future__ import annotations  # lets Chain refer to Agent before it is defined

from typing import Any, List

class Model:
  # user can subclass this and override the __call__
  def __call__(self, *args, **kwargs):
    ...

class Memory:
  # user can subclass this and override the following functions
  def get(self, key: str):
    ...

  def put(self, key: str, value: Any):
    ...

class Chain:
  def __init__(self, agent: Agent):
    # so the chain can access all the underlying elements of Agent including:
    # - models
    # - memories
    self.agent = agent

  # user can subclass this and override the __call__
  def __call__(self, user_input: Any):
    ...

# the main class, user can either subclass this or provide the chain
class Agent:
  def __init__(self, models: List[Model], memories: List[Memory], chain: Chain):
    self.models = models
    self.memories = memories
    self.chain = chain

  def __call__(self, user_input: Any):
    return self.chain(user_input)


Model allows for any kind of model to be put into the picture, whether it is OpenAI GPT, Stable Diffusion, or even a connection to a locally running endpoint.


Memory makes it such that users can choose to store things in a DB, a file, etc. I am not fully sure what the final APIs will look like, but starting with a key/value store never hurt anyone.


Agent is the simplest: its primary job is to act as a namespace and a standard interface for calling the chain.


Chain makes it so that any kind of flow that the user wants to implement can be handled.
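As a rough illustration of how these four pieces could fit together, here is a self-contained sketch. Everything beyond the four class names is hypothetical: `UppercaseModel`, `DictMemory` and `RememberingChain` are stand-ins, and the `chain_factory` argument is an assumption that differs from the pseudocode above. It is used here only because the chain needs the agent and the agent needs the chain, so the agent builds its chain from itself. The final fury API may look quite different.

```python
from __future__ import annotations

from typing import Any, Callable, Dict, List


class Model:
    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError


class Memory:
    def get(self, key: str) -> Any:
        raise NotImplementedError

    def put(self, key: str, value: Any) -> None:
        raise NotImplementedError


class Chain:
    def __init__(self, agent: Agent):
        self.agent = agent  # gives the chain access to models and memories

    def __call__(self, user_input: Any) -> Any:
        raise NotImplementedError


class Agent:
    # Assumption: take a chain factory (e.g. a Chain subclass) instead of a
    # ready-made chain, to avoid the circular constructor dependency.
    def __init__(
        self,
        models: List[Model],
        memories: List[Memory],
        chain_factory: Callable[[Agent], Chain],
    ):
        self.models = models
        self.memories = memories
        self.chain = chain_factory(self)

    def __call__(self, user_input: Any) -> Any:
        return self.chain(user_input)


class UppercaseModel(Model):
    # Hypothetical stand-in for a real model call (hosted LLM, local endpoint, ...).
    def __call__(self, text: str) -> str:
        return text.upper()


class DictMemory(Memory):
    # Hypothetical in-process key/value store; a DB or file could sit behind the same API.
    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._store.get(key)

    def put(self, key: str, value: Any) -> None:
        self._store[key] = value


class RememberingChain(Chain):
    # Hypothetical flow: call the first model and recall the previous input.
    def __call__(self, user_input: str) -> str:
        memory = self.agent.memories[0]
        previous = memory.get("last_input")
        memory.put("last_input", user_input)
        reply = self.agent.models[0](user_input)
        return reply if previous is None else f"{reply} (previously: {previous})"


agent = Agent(
    models=[UppercaseModel()],
    memories=[DictMemory()],
    chain_factory=RememberingChain,
)
print(agent("hello"))  # HELLO
print(agent("world"))  # WORLD (previously: hello)
```

Swapping `DictMemory` for a database-backed class, or `UppercaseModel` for a real model client, would not require touching `Agent`, which is the point of keeping it a thin namespace.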


Sharing Responsibilities

It is still not clear how all the different outputs will be standardised, e.g. a Stable Diffusion output can be an image while a ChatGPT output can be text. However, we will provide enough abstractions and guarantees that the flow I/O will be consistent, and devs can refer to the docs / chat to find out more.
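One possible shape for such a standard, offered purely as a sketch and not as the planned fury design, is a small envelope that tags every output with its modality. The `Output` class, its field names, and the `"text"` / `"image"` labels are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Output:
    # Hypothetical envelope: every model returns the same wrapper type.
    modality: str  # assumed labels, e.g. "text" or "image"
    data: Any  # str for text, raw bytes for an image
    meta: Dict[str, Any] = field(default_factory=dict)


def describe(output: Output) -> str:
    """Downstream code branches on the tag instead of guessing the type."""
    if output.modality == "text":
        return f"text: {output.data}"
    if output.modality == "image":
        return f"image: {len(output.data)} bytes"
    raise ValueError(f"unsupported modality: {output.modality!r}")


print(describe(Output("text", "a cat on a mat")))  # text: a cat on a mat
print(describe(Output("image", b"\x89PNG")))  # image: 4 bytes
```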
Future

I hinted above that ChainFury might be one of the last projects. The reason is simple: if chains are the new form of development and memory allows them to store abstracted concepts effectively, then all of software dev can eventually be abstracted, stored in a DB and applied as and when needed.
If you have any thoughts on this, you can raise an issue or start a discussion.