@mneedham
Created October 28, 2023 08:30
Hugging Face's Text Embeddings Inference Library
git clone git@github.com:huggingface/text-embeddings-inference.git
cd text-embeddings-inference
cargo install --path router -F candle -F accelerate
model=BAAI/bge-large-en-v1.5
revision=refs/pr/5
text-embeddings-router --model-id $model --revision $revision --port 8080
[
"Aston Villa climbed to the top of Group E in the Europa Conference League after a classy victory against Dutch side AZ Alkmaar at AFAS Stadion.",
"Leon Bailey and Youri Tielemans got Villa off to the perfect start with a goal apiece in the space of 10 first-half minutes.",
"Alkmaar spurned several good opportunities before the break, including a gilt-edged chance for top scorer Vangelis Pavlidis when he was clear through on goal.",
"Ollie Watkins then further strengthened the visitors' grip on the game early in the second period, reacting quickest to turn home after Bailey's shot had been saved.",
"Captain John McGinn grabbed Villa's fourth when he showed desire to get ahead of his man and poke in at the near post, ensuring it would be a memorable night for Unai Emery's side.",
"Alkmaar winger Ibrahim Sadiq pulled one back shortly after the hour mark with a pinpoint shot from distance into the bottom corner.",
"Villa top of Group E after three games in the competition, but Zrinjski Mostar and Legia Warsaw, who meet later, both sit just three points behind.",
"Villa's World Cup-winning goalkeeper Emiliano Martinez hailed Emery as 'one of the top five managers in the world right now' in the build-up to Thursday's encounter.",
"Emery, 51, is a four-time Europa League winner, lifting Europe's second-tier competition three seasons in a row between 2014 and 2016 and overseeing Villarreal's 2021 final success against Manchester United.",
"He is now trying to mastermind Villa's first trophy success since they won the League Cup in 1996.",
"All the signs look extremely positive as they demolished an Alkmaar side who reached the semi-finals of the Conference League in 2022-23, losing against West Ham.",
"Rotation has been key as Villa look to fight on multiple fronts and they did not skip a beat, despite four changes from Sunday's 4-1 Premier League home win over West Ham.",
"Two of those replacements, Bailey and Tielemans, slotted in seamlessly as Villa's high-pressing approach saw them enjoy the early exchanges.",
"Bailey, scoring for the third game in succession for club and country, broke the deadlock with a fizzing half-volley, while Tielemans showed composure to gather McGinn's defence-splitting through ball before prodding through the legs of goalkeeper Mathew Ryan.",
"Watkins had the easiest finish of the night to chalk up his ninth of the campaign when Bailey's effort fell kindly into his path just five yards out.",
"McGinn's endeavour to make a driving run into the box paid dividends when he met a low cross from Jamaica winger Bailey to register his second goal in the competition.",
"A key facet of Villa's game under Emery is playing with bravery, pressing high up the field and they reaped the rewards in Alkmaar.",
"It is the sixth time they have scored four or more goals in a game this season and extends their unbeaten streak to five.",
"Defensively, Villa do still have some issues to iron out and they were fortunate to end the first half with a clean sheet.",
"Greek striker Pavlidis, who scored a hat-trick against Heerenveen at the weekend and has 13 goals in nine Eredivisie appearances this term, was gifted an opportunity just minutes after Villa had gone 2-0 ahead.",
"Boubacar Kamara lost the ball on the edge of his own box and Pavlidis with the goal at his mercy inexplicably blazed over the bar.",
"Villa were also opened up far too easily when Sadiq scored a consolation and they have just one clean sheet in their last five outings and five all season."
]
http POST 127.0.0.1:8080/embed \
  inputs:="$(jq '.[0:1]' sentences.json)" |
  jq -c '.[][:5]'
http POST 127.0.0.1:8080/embed \
  inputs:="$(jq '.[0:10]' sentences.json)" |
  jq -c '.[][:5]'
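The same request can also be made from Python without httpie. The following is a minimal sketch using only the standard library; it assumes the server accepts a JSON body of the form {"inputs": [...]} on /embed, as the commands above do. The helper names build_embed_request and embed are hypothetical and not part of any library.

```python
import json
import urllib.request


def build_embed_request(texts, base_url="http://127.0.0.1:8080"):
    """Build a POST request for /embed with a JSON body {"inputs": [...]}."""
    body = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url="http://127.0.0.1:8080", timeout=60):
    """Send the texts to the server and decode the JSON response.

    Assumption: the response is a JSON array with one embedding
    (a list of floats) per input text.
    """
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With the server running, embed(sentences[0:10]) should mirror the second command above, and slicing each returned vector with [:5] corresponds to the jq filter.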
import json
with open("sentences.json", "r") as sentences_file:
    file_contents = sentences_file.read()
sentences = json.loads(file_contents)
# Text Embeddings Inference
from llama_index.embeddings import TextEmbeddingsInference
embed_model = TextEmbeddingsInference(
    base_url="http://localhost:8080",
    model_name="BAAI/bge-large-en-v1.5",
    timeout=60,
    embed_batch_size=10,
)
embeddings = embed_model.get_text_embedding_batch(sentences)
# Store in ChromaDB
from llama_index.vector_stores import ChromaVectorStore
import chromadb
chroma_client = chromadb.EphemeralClient()
chroma_collection = chroma_client.create_collection("chunks")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
from llama_index.schema import TextNode
nodes = [
    TextNode(text=sentence, embedding=embedding)
    for sentence, embedding in zip(sentences, embeddings)
]
vector_store.add(nodes)
# LLM
from llama_index import ServiceContext
from llama_index.llms import Ollama
llm = Ollama(model="zephyr")
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model=embed_model
)
from llama_index import VectorStoreIndex
index = VectorStoreIndex.from_vector_store(
    vector_store,
    service_context=service_context,
)
query_engine = index.as_query_engine()
# Ask a question
response = query_engine.query("How many games have they been unbeaten?")
print(response.response)
print([(node.text, node.score) for node in response.source_nodes])
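The scores printed for the source nodes come from comparing the question's embedding with the stored sentence embeddings. As a rough illustration of that retrieval step, here is a brute-force sketch in plain Python. It assumes cosine similarity as the ranking metric; the metric actually used depends on how the vector store is configured. The functions cosine_similarity and top_k and the toy two-dimensional vectors are hypothetical stand-ins, not LlamaIndex or ChromaDB APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_embedding, items, k=2):
    """Rank (text, embedding) pairs against the query, highest score first."""
    scored = [
        (text, cosine_similarity(query_embedding, embedding))
        for text, embedding in items
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 2-dimensional vectors standing in for real embeddings (hypothetical data)
items = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
results = top_k([1.0, 0.1], items, k=2)
print([text for text, _ in results])
```

A real vector store would typically use an approximate index rather than scanning every stored vector, but the ranking idea is the same.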