Haystack ingest/retrieve deployment with a REST API
This document describes how to deploy a Haystack instance to ingest documents and answer questions using those documents through the REST API.
This is a simple document-search deployment. It does not answer questions the way a large language model (e.g. GPT) does. The goal of this deployment is to understand the basic Haystack principles that are used in more complex deployments later.
The instructions are based on the Using Haystack with REST API tutorial (the page's "last updated" date is April 14, 2023). Check the tutorial before following the instructions here. If the instructions do not match the tutorial, trust the tutorial and update the instructions.
The tutorial uses Elasticsearch as the indexing engine. In a later step we will modify the deployment to use a different vector database.
Create and configure a virtual machine
These instructions apply to GCP VM. Adjust to your environment.
A small VM can be used for small-scale experiments.
e2-standard-2 with 2 vCPU and 8 GB memory
100 GB boot disk
Ubuntu 22.04
Create an instance schedule to shut the VM down automatically when not in use.
Protect the VM by allowing access to port 8000 only from specific external addresses. The easiest way to do that in GCP is to modify the standard http-server firewall rule:
Add port 8000 to the rule.
Set source addresses to allow only specific addresses or network segments.
Add the http-server network tag to the VM.
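If you use the gcloud CLI, the three steps above can be sketched as follows. The rule name default-allow-http, the VM name my-haystack-vm, the zone, and the source range 203.0.113.0/24 are placeholders — substitute your project's values:

```shell
# Add port 8000 to the standard http-server rule and restrict sources
# to a specific range (203.0.113.0/24 is a placeholder).
gcloud compute firewall-rules update default-allow-http \
  --allow=tcp:80,tcp:8000 \
  --source-ranges=203.0.113.0/24

# Attach the http-server network tag to the VM so the rule applies to it.
gcloud compute instances add-tags my-haystack-vm \
  --tags=http-server --zone=us-east1-b
```

You can verify the result with gcloud compute firewall-rules describe default-allow-http.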
Install the prerequisites
At this time, the only prerequisite is Docker Compose. Follow the instructions to install Docker. It will install Docker Compose as well.
Note that we do not want the Docker Desktop installation, just the Docker engine and its CLI.
Configure Docker to not require root permission.
sudo gpasswd -a $USER docker
Log out (e.g. close the SSH session) and log back in, then run the Docker "hello world" to confirm that it works.
docker run hello-world
Configure and start Haystack
The instructions here summarize the tutorial page. As before, if the instructions do not match the tutorial, trust the tutorial and update the instructions.
Create a directory and get the template Docker Compose file for this Haystack deployment.
# Choose an appropriate root directory: /opt if you have access,
# otherwise your own home directory
mkdir doc-search
cd doc-search
curl --output docker-compose.yml \
https://raw.githubusercontent.com/deepset-ai/haystack/main/docker-compose.yml
The next step is to assemble a pipeline with the nodes we need for this application. Nodes are how Haystack splits up the building blocks of a solution; pipelines are how we put the nodes together.
Create the (empty) pipeline definition file and note the directory we are in.
touch document-search.haystack-pipeline.yml
pwd
# copy the output - it will be used in the next step
Open docker-compose.yml and edit the volumes value to mount this directory (where the pipeline definition file lives), and set PIPELINE_YAML_PATH to the full path of that file.
haystack-api:
  ...
  volumes:
    - ./:<output from pwd (above)>
  environment:
    ...
    - PIPELINE_YAML_PATH=<output from pwd (above)>/document-search.haystack-pipeline.yml
Populate the pipeline definition file as shown below. The components of this pipeline are a store node, a retriever node, a classifier, a converter, and a preprocessor. The pipelines section combines the components to index (ingest) files and search (query) those files.
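The tutorial lists the exact definition to use. As a reference, here is a sketch of what such a file looks like in the Haystack 1.x pipeline YAML format; the component names and parameters below are illustrative, so copy the real definition from the tutorial:

```yaml
version: ignore

components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
  - name: Retriever
    type: BM25Retriever
    params:
      document_store: DocumentStore
  - name: FileTypeClassifier
    type: FileTypeClassifier
  - name: TextConverter
    type: TextConverter
  - name: Preprocessor
    type: PreProcessor

pipelines:
  - name: query            # serves /query requests
    nodes:
      - name: Retriever
        inputs: [Query]
  - name: indexing         # serves /file-upload requests
    nodes:
      - name: FileTypeClassifier
        inputs: [File]
      - name: TextConverter
        inputs: [FileTypeClassifier.output_1]
      - name: Preprocessor
        inputs: [TextConverter]
      - name: DocumentStore
        inputs: [Preprocessor]
```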
Copy the full pipeline definition from the tutorial into document-search.haystack-pipeline.yml, then start the deployment (on older installs the command is docker-compose up):
docker compose up
If everything worked, the Haystack API runs under a web server and the logs end with lines like these:
... many log lines not shown here
doc-search-haystack-api-1 | INFO: Started server process [1]
doc-search-haystack-api-1 | INFO: Waiting for application startup.
doc-search-haystack-api-1 | INFO: Application startup complete.
doc-search-haystack-api-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Open another terminal to the VM and verify that the Haystack API is up and running.
curl --request GET http://127.0.0.1:8000/initialized
# Should print "true"
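Startup can take a minute or two while Elasticsearch comes up, so the check may return false at first. A small polling loop (a sketch — adjust the host and timeout to your needs) saves re-running it by hand:

```shell
# Poll /initialized until Haystack reports "true", up to ~2 minutes.
for i in $(seq 1 24); do
  if curl --silent http://127.0.0.1:8000/initialized | grep -q true; then
    echo "Haystack is ready"
    break
  fi
  echo "waiting for Haystack..."
  sleep 5
done
```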
Test the Haystack deployment
You can run these steps on the same VM where Haystack is installed or on another machine that has network access to the Haystack VM.
The official tutorial uploads hundreds of Wikipedia articles, which may take a while. To test faster, we can upload just one of them.
# Replace the VM IP address
# If running locally (on the VM), use 127.0.0.1
export VM_IP=35.237.13.158
curl --request POST \
--url http://$VM_IP:8000/file-upload \
--header 'accept: application/json' \
--header 'content-type: multipart/form-data' \
--form files=@article_txt_countries_and_capitals/0_Minsk.txt \
--form meta=null
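To ingest the full data set instead, the same call can be looped over every article. This assumes the tutorial's article_txt_countries_and_capitals directory and the VM_IP variable set above:

```shell
# Upload each article in turn; --silent keeps the output readable.
for f in article_txt_countries_and_capitals/*.txt; do
  echo "uploading $f"
  curl --silent --request POST \
    --url http://$VM_IP:8000/file-upload \
    --header 'accept: application/json' \
    --header 'content-type: multipart/form-data' \
    --form "files=@$f" \
    --form meta=null
  echo
done
```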
Now we can ask a question.
# Replace the VM IP address
# If running locally (on the VM), use 127.0.0.1
export VM_IP=35.237.13.158
curl --request POST \
--url http://$VM_IP:8000/query \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '{ "query": "What football teams are based in Minsk" }'
Note that because we have only a simple retriever, we get back the document chunks that contain the answer, not the clean answer we would get from a large language model such as GPT. That is covered in a different tutorial.
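The /query response is JSON; to eyeball just the retrieved chunks, the documents list can be pulled apart with a few lines of python3. The sample below uses a trimmed, hand-written response rather than live output — the field names follow the Haystack 1.x REST API and may differ in your version, so check your actual output:

```shell
# A trimmed, hand-written sample of a /query response.
cat > response.json <<'EOF'
{"query": "What football teams are based in Minsk",
 "documents": [
   {"content": "FC Dinamo Minsk is a football club based in Minsk.", "score": 0.91},
   {"content": "Minsk is the capital of Belarus.", "score": 0.42}]}
EOF

# Print each chunk's score and the first 60 characters of its content.
python3 -c '
import json
for d in json.load(open("response.json"))["documents"]:
    print(d["score"], d["content"][:60])
'
```

Against the live API, pipe the curl output to a file (or straight into python3) instead of using the sample.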
Inspecting the API
Open a browser window to http://<server IP>:8000/docs to see the API docs.