@thepycoder
Created October 11, 2022 08:43
FastAPI Server
import torch
from fastapi import FastAPI
from transformers import AutoTokenizer, BatchEncoding, AutoModelForSequenceClassification

application = FastAPI()

# Load the tokenizer and the SST-2 sentiment classifier once at startup,
# and move the model to the first GPU.
tokenizer = AutoTokenizer.from_pretrained("philschmid/MiniLM-L6-H384-uncased-sst2")
model = AutoModelForSequenceClassification.from_pretrained(
    "philschmid/MiniLM-L6-H384-uncased-sst2"
).to('cuda:0')


@application.get("/predict")
def predict(query: str):
    # Tokenize the incoming text and move the tensors to the same GPU as the model.
    inputs: BatchEncoding = tokenizer(
        text=query,
        max_length=128,
        truncation=True,
        return_tensors='pt',
    ).to('cuda:0')
    # Run inference without tracking gradients and return the raw logits
    # as a plain (JSON-serializable) nested list.
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.to('cpu').tolist()
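The endpoint above accepts the input text as a `query` GET parameter and returns the model's logits as a nested list. A minimal client sketch, assuming the server is run locally (for example with `uvicorn server:application --port 8000`; the module name and port are assumptions, not part of the gist):

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical base URL; adjust to wherever the FastAPI app is served.
BASE_URL = "http://localhost:8000"

def build_url(query: str) -> str:
    # URL-encode the text so spaces and punctuation survive the GET request.
    return f"{BASE_URL}/predict?{urlencode({'query': query})}"

def predict(query: str):
    # Calls /predict and decodes the JSON response: a nested list of logits.
    with urllib.request.urlopen(build_url(query)) as resp:
        return json.load(resp)
```

Note that the response contains raw logits, not probabilities; apply a softmax (or take the argmax) client-side if you want class scores.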