
** Terraform Module for Local LLM & Vector DB Orchestration

Migrating the Knowledge Retrieval API and evaluation script to Terraform-managed infrastructure. This Terraform configuration automates deployment of the AI stack, replacing manual Docker Compose steps with Infrastructure as Code (IaC).

Terraform:

Terraform is an infrastructure as code tool that lets you build, change, and version infrastructure safely and efficiently. This includes low-level components like compute instances, storage, and networking; and high-level components like DNS entries and SaaS features.

main.tf

terraform {
  required_version = ">= 1.0"
}

** Dockerized Knowledge Retrieval API with Local LLM Integration

Containerizing the Knowledge Retrieval API and evaluation script within the existing Docker Compose stack.

1. Dockerfile

FROM python:3.11-slim

WORKDIR /app

# Install dependencies first to leverage the Docker layer cache
# (assumes the project ships a requirements.txt)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

** How to implement RAG Evaluation

The purpose of Retrieval-Augmented Generation (RAG) evaluation is to objectively measure the performance of a RAG system and ensure it produces accurate, relevant, and trustworthy answers. We compare two models, qwen2.5-coder:7b and llama3.2.

1. JUDGE LOGIC

from langchain_community.llms import Ollama  # assuming the LangChain community Ollama wrapper

def ask_judge(query, expected, actual, judge_model_name):
    # Temperature 0 keeps the judge's verdicts deterministic
    judge_llm = Ollama(model=judge_model_name, base_url=BASE_URL, temperature=0)
    judge_template = """Compare these Docker commands for the query: "{query}"
    Expected: {expected}
    Actual: {actual}
    Reply with PASS if they are equivalent, otherwise FAIL."""
    return judge_llm.invoke(judge_template.format(query=query, expected=expected, actual=actual))
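The judge's verdicts can be aggregated into a per-model pass rate. A minimal sketch with a pluggable `judge` callable standing in for `ask_judge`; the case tuples and the PASS/FAIL convention are assumptions for illustration:

```python
def pass_rate(cases, judge):
    """Score one model: the fraction of cases where the judge answers PASS.

    cases -- list of (query, expected, actual) tuples
    judge -- callable returning "PASS" or "FAIL" (e.g. a wrapper around ask_judge)
    """
    verdicts = [judge(query, expected, actual) for query, expected, actual in cases]
    return verdicts.count("PASS") / len(verdicts)


def compare_models(cases_by_model, judge):
    """Return {model_name: pass_rate} so qwen2.5-coder:7b and llama3.2 can be ranked."""
    return {model: pass_rate(cases, judge) for model, cases in cases_by_model.items()}
```

Running both models over the same test set and comparing the resulting pass rates gives the qwen2.5-coder:7b vs. llama3.2 comparison described above.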

** How to implement local LLM RAG

To maintain an in-house PDF-based knowledge search agent, we implement a local LLM integrated with RAG to improve performance and accuracy. We start development with a simple PDF file.

Ollama:

Ollama is a lightweight tool that lets you run large language models (LLMs) locally on your own machine or server. Instead of calling a cloud API (like OpenAI), Ollama lets you download and serve models directly from your local environment.
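Because Ollama exposes a plain HTTP API (port 11434 by default), a client needs nothing beyond the standard library. A sketch against the documented `/api/generate` endpoint; the model name and prompt are placeholders:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default listen address


def build_payload(model, prompt):
    """Request body for /api/generate; stream=False returns a single JSON object."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model, prompt, base_url=OLLAMA_URL):
    """Send a prompt to a locally served model and return its completion text."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("llama3.2", "What is RAG?")` returns the model's answer, assuming `ollama serve` is running and the model has already been pulled.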

Qdrant:

Qdrant is an open-source vector database designed for storing and searching high-dimensional vectors (embeddings).
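"Searching high-dimensional vectors" means ranking stored embeddings by similarity to a query embedding. Qdrant does this at scale with indexes, but the core idea fits in a few lines of plain Python; the toy three-dimensional vectors below stand in for real embeddings:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(collection, query_vector, limit=3):
    """Rank (id, vector) pairs by cosine similarity to the query, best first."""
    scored = [(point_id, cosine_similarity(vec, query_vector)) for point_id, vec in collection]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]
```

A vector database replaces the linear scan in `search` with an approximate nearest-neighbour index, which is what makes retrieval fast over millions of embeddings.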

** WordPress + nginx + certbot on AWS EC2

This document explains how to set up an HTTPS WordPress site on AWS EC2.

Nginx:

NGINX is open-source web server software used for reverse proxy, load balancing, and caching. It provides HTTPS server capabilities and is mainly designed for maximum performance and stability.

Certbot:

Certbot is a free, open-source tool that automates obtaining and renewing TLS certificates from Let's Encrypt, so the nginx site can serve HTTPS.

** WordPress ECS + RDS on AWS

This document explains how to set up WordPress on AWS Elastic Container Service together with the Relational Database Service.

Elastic container service:

Amazon Elastic Container Service (ECS) is a cloud computing service in Amazon Web Services (AWS) that manages containers and lets developers run applications in the cloud without having to configure an environment for the code to run in.

Relational Database Service:

Amazon Relational Database Service (RDS) is a managed database service provided by Amazon Web Services (AWS). It makes it easy to set up and operate a scalable relational database in the AWS cloud. Amazon RDS supports an array of database engines to store and organize data.

** How to set up Kubeflow on AWS- P4- house prediction pipeline

This document explains how to run the house prediction script, then build and execute the pipeline on Kubeflow.

Compile kubeflow pipeline:

Following the official documentation, we can compile and build a simple pipeline.

1. Using Jupyter Notebook

** How to set up Kubeflow on AWS- P3- pipeline

This document explains how to run a first pipeline on Kubeflow.

kubeflow pipeline:

Kubeflow Pipelines (KFP) is a platform for building and deploying portable and scalable machine learning (ML) workflows using Docker containers.
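KFP workflows are defined in Python and compiled to a YAML package before being uploaded. A minimal sketch assuming the KFP v2 SDK (`pip install kfp`); the pipeline name, component, and output file name are illustrative:

```python
from kfp import dsl, compiler


@dsl.component(base_image="python:3.11")
def add(a: float, b: float) -> float:
    """Each component runs in its own container on the cluster."""
    return a + b


@dsl.pipeline(name="add-demo")
def add_pipeline(x: float = 1.0, y: float = 2.0):
    first = add(a=x, b=y)
    add(a=first.output, b=x)  # the second step consumes the first step's output


# Compile the Python definition into a portable pipeline package
compiler.Compiler().compile(add_pipeline, "add_pipeline.yaml")
```

The generated `add_pipeline.yaml` can then be uploaded through the Kubeflow Pipelines UI or submitted programmatically with `kfp.Client()`.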

1. Creating Jupyter Notebook

** How to set up Kubeflow on AWS- P2- microk8s

I tried to set up Kubeflow via minikube and MicroK8s, but due to resource limitations I was still blocked after two hours. So I set up Kubeflow on AWS instead.

microk8s:

MicroK8s is a lightweight Kubernetes distribution that makes it easy to run Kubernetes on a local machine, laptop, or edge device.

juju:

Juju is an operator lifecycle manager (OLM) for clouds, bare metal, or Kubernetes.

** How to set up Kubeflow locally- P1- minikube

Before building the Kubeflow pipeline, we need to set up minikube on Ubuntu 18.04 first.

Kubeflow:

Kubeflow is a community and ecosystem of open-source projects to address each stage in the machine learning (ML) lifecycle. It makes ML on Kubernetes simple, portable, and scalable.

Minikube:

Minikube is a tool that sets up a Kubernetes environment on a local PC or laptop. It’s technically a Kubernetes distribution, but because it addresses a different type of use case than most other distributions (like Rancher, OpenShift, and EKS), it’s more common to hear folks refer to it as a tool rather than a distribution.