
@tsisodia10
Last active October 26, 2023 03:21

Instrument the Python Chatbot application and fetch metrics with Prometheus

1. ChatBot app

import openai
import gradio as gr
import os
import threading
import time
import psutil
from prometheus_client import start_http_server, Counter, Histogram, Gauge

# Start the Prometheus metrics endpoint
start_http_server(8000)  # Exposes metrics at http://localhost:8000/metrics

# Define metrics
REQUESTS = Counter('generate_response_total', 'Total number of generate_response calls')
MEMORY_USAGE = Gauge('application_memory_usage_bytes', 'Memory usage of the application')
CPU_USAGE = Gauge('application_cpu_usage_seconds_total', 'Total CPU time used by the application')
RESPONSE_TIME = Histogram('generate_response_duration_seconds', 'Histogram for the response time of generate_response')
ERRORS = Counter('generate_response_errors_total', 'Total number of errors in generate_response')

# Set your OpenAI API key
openai.api_key = os.environ.get("OPENAI_KEY")

def generate_response(prompt, max_tokens=50):
    with RESPONSE_TIME.time():  # Measure response time
        try:
            response = openai.Completion.create(
                engine="text-davinci-002",
                prompt=prompt,
                max_tokens=max_tokens
            )
            REQUESTS.inc()  # Increment request count
            return response.choices[0].text
        except Exception:
            ERRORS.inc()  # Increment error count
            raise  # Re-raise with the original traceback
            
def generate_response_with_gradio(prompt, max_tokens=50):
    response = generate_response(prompt, max_tokens)
    return response

# Note: layout= and capture_session= were removed in Gradio 3.x,
# so they are omitted here.
iface = gr.Interface(
    fn=generate_response_with_gradio,
    inputs="text",
    outputs="text",
    flagging_dir="/tmp/flagged",
    title="Few-Shot Prompting",
    description="Enter a text prompt."
)

def collect_metrics():
    process = psutil.Process(os.getpid())
    while True:
        MEMORY_USAGE.set(process.memory_info().rss)  # RSS: Resident Set Size
        CPU_USAGE.set(process.cpu_times().user)
        time.sleep(5)  # Update metrics every 5 seconds

# Daemon thread, so it exits when the main process does
metrics_thread = threading.Thread(target=collect_metrics, daemon=True)
metrics_thread.start()

if __name__ == '__main__':
    iface.launch()

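The background collector above follows a common pattern: a daemon thread that samples process stats on a fixed interval. A minimal, dependency-free sketch of the same pattern, with a plain dict standing in for the Prometheus Gauge (all names here are illustrative):

```python
import threading
import time

metrics = {}

def collect_metrics(stop, interval=0.01):
    # Sample on each tick until asked to stop; stand-in for Gauge.set(...)
    while not stop.is_set():
        metrics['heartbeat'] = time.time()
        stop.wait(interval)

stop = threading.Event()
t = threading.Thread(target=collect_metrics, args=(stop,), daemon=True)
t.start()

time.sleep(0.05)        # let the collector run a few ticks
stop.set()
t.join()
print('heartbeat' in metrics)  # True
```

Using an `Event` for shutdown (rather than `while True`) makes the collector testable; the daemon flag is the safety net for process exit.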
2. Dockerize the app and push the image

docker build -t <username>/<image>:latest .
docker push <username>/<image>:latest
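For reference, a minimal Dockerfile for this app might look like the following (the file names and the pinned base image are assumptions, not part of the original gist):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
# Assumed dependency list: openai, gradio, psutil, prometheus_client
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
# Gradio UI on 7860, Prometheus metrics on 8000
EXPOSE 7860 8000
CMD ["python", "app.py"]
```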

3. Deploy the app on an OpenShift cluster

apiVersion: v1
kind: Secret
metadata:
  name: openai-secret
  namespace: prometheus
type: Opaque
data:
  OPENAI_KEY: <key>  # must be the base64-encoded key
---
apiVersion: v1
kind: Pod
metadata:
  name: openshift-ai-pod
  labels:
    app: frontend
  namespace: prometheus 
spec:
  containers:
  - name: openshift-ai-container
    image: twinkllsisodia/ai-chatbot:latest
    env:
      - name: OPENAI_KEY
        valueFrom:
          secretKeyRef:
            name: openai-secret
            key: OPENAI_KEY
    ports:
    - containerPort: 7860 
---
apiVersion: v1
kind: Service
metadata:
  name: ai-bot-service
  labels:
    app: frontend
  namespace: prometheus  
spec:
  ports:
  - name: metrics
    port: 8000
  selector:
    app: frontend
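Kubernetes Secret `data` values must be base64-encoded, so the `<key>` placeholder above should hold the encoded key. One way to produce it (the key string here is a dummy):

```shell
# Encode the API key for the Secret's data field (-n avoids a trailing newline)
echo -n 'sk-dummy-key' | base64

# Decode to verify the round trip
echo -n 'sk-dummy-key' | base64 | base64 --decode
```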

4. Deploy the Prometheus Operator through OperatorHub

5. Create a ServiceMonitor and a Prometheus instance

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sm
  namespace: prometheus
  labels:
    app: frontend
spec:
  selector:
    matchLabels:
      app: frontend
  namespaceSelector:
    matchNames:
    - prometheus    
  endpoints:
  - port: metrics
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: prometheus
spec:
  serviceAccountName: prometheus-k8s
  serviceMonitorSelector:
    matchLabels:
      app: frontend
  # By default, ServiceMonitors are discovered in the Prometheus
  # resource's own namespace (prometheus), which matches the
  # ServiceMonitor above.
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
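Once the ServiceMonitor is picked up, the exported metrics can be queried from the Prometheus UI. A few illustrative PromQL queries against the metric names defined in the app:

```promql
# Requests per second over the last 5 minutes
rate(generate_response_total[5m])

# Error ratio
rate(generate_response_errors_total[5m]) / rate(generate_response_total[5m])

# 95th-percentile response time from the histogram
histogram_quantile(0.95, rate(generate_response_duration_seconds_bucket[5m]))

# Current resident memory of the app process
application_memory_usage_bytes
```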

(Screenshots, 2023-10-25: Prometheus UI showing the exported chatbot metrics.)
