# Gist by @tuhinsharma121
# Last active April 2, 2024 11:38
kind: Deployment
apiVersion: apps/v1
metadata:
  name: fine-tuned-merlinite-7b
  labels:
    app: fine-tuned-merlinite-7b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fine-tuned-merlinite-7b
  template:
    metadata:
      labels:
        app: fine-tuned-merlinite-7b
    spec:
      restartPolicy: Always
      schedulerName: default-scheduler
      affinity: {}
      terminationGracePeriodSeconds: 120
      securityContext: {}
      containers:
        - resources:
            limits:
              nvidia.com/gpu: '1'
            requests:
              nvidia.com/gpu: '1'
          readinessProbe:
            httpGet:
              path: /health
              port: http
              scheme: HTTP
            timeoutSeconds: 5
            periodSeconds: 30
            successThreshold: 1
            failureThreshold: 3
          terminationMessagePath: /dev/termination-log
          name: server
          livenessProbe:
            httpGet:
              path: /health
              port: http
              scheme: HTTP
            timeoutSeconds: 8
            periodSeconds: 100
            successThreshold: 1
            failureThreshold: 3
          env: []
          # Model and tokenizer are both read from the PVC mounted at /opt/app-root/src.
          args:
            - "--model"
            - "/opt/app-root/src/training_results/final"
            - "--tokenizer"
            - "/opt/app-root/src/training_results/final"
          securityContext:
            capabilities:
              drop:
                - ALL
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            seccompProfile:
              type: RuntimeDefault
          ports:
            - name: http
              containerPort: 8000
              protocol: TCP
          imagePullPolicy: IfNotPresent
          # Startup budget: roughly 60 s initial delay + 24 failures x 30 s = 780 s
          # before the container is restarted. Liveness and readiness probes only
          # begin once this probe has succeeded.
          startupProbe:
            httpGet:
              path: /health
              port: http
              scheme: HTTP
            initialDelaySeconds: 60
            timeoutSeconds: 1
            periodSeconds: 30
            successThreshold: 1
            failureThreshold: 24
          volumeMounts:
            - name: shm
              mountPath: /dev/shm
            - name: instruct-lab-pvc
              mountPath: /opt/app-root/src
          terminationMessagePolicy: File
          image: 'quay.io/rh-aiservices-bu/vllm-openai-ubi9:0.3.3'
      volumes:
        # Memory-backed emptyDir mounted at /dev/shm, capped at 1Gi.
        - name: shm
          emptyDir:
            medium: Memory
            sizeLimit: 1Gi
        - name: instruct-lab-pvc
          persistentVolumeClaim:
            claimName: lab-train-shared-data-pvc
      dnsPolicy: ClusterFirst
      tolerations:
        - key: tenant
          value: rh-jarvis-ai
          effect: NoSchedule
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
      maxSurge: 1
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
---
kind: Service
apiVersion: v1
metadata:
  name: fine-tuned-merlinite-7b
  labels:
    app: fine-tuned-merlinite-7b
spec:
  clusterIP: None
  ipFamilies:
    - IPv4
  ports:
    - name: http
      protocol: TCP
      port: 8000
      targetPort: http
  type: ClusterIP
  ipFamilyPolicy: SingleStack
  sessionAffinity: None
  selector:
    app: fine-tuned-merlinite-7b
---
kind: Route
apiVersion: route.openshift.io/v1
metadata:
  name: fine-tuned-merlinite-7b
  labels:
    shard: internal
    app: fine-tuned-merlinite-7b
spec:
  host: fine-tuned-merlinite-7b.apps.int.stc.ai.dev.us-east-1.aws.paas.redhat.com
  to:
    kind: Service
    name: fine-tuned-merlinite-7b
    weight: 100
  port:
    targetPort: http
  tls:
    termination: edge
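#
# Usage sketch (not part of the original manifests). The file name and the
# namespace placeholder below are assumptions; adjust them to your cluster.
#
#   oc apply -f fine-tuned-merlinite-7b.yaml -n <your-namespace>
#   oc rollout status deployment/fine-tuned-merlinite-7b -n <your-namespace>
#
# Once the pod is Ready, the vLLM OpenAI-compatible server should be reachable
# over HTTPS on the Route host (TLS is terminated at the edge), for example:
#
#   curl https://fine-tuned-merlinite-7b.apps.int.stc.ai.dev.us-east-1.aws.paas.redhat.com/v1/models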