@dtaivpp
Last active May 15, 2024 18:47

Semantic Search with OpenSearch and Cohere

Cluster Settings:

PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.allow_registering_model_via_url": true,
        "plugins.ml_commons.only_run_on_ml_node": false,
        "plugins.ml_commons.connector_access_control_enabled": true,
        "plugins.ml_commons.model_access_control_enabled": true,
        "plugins.ml_commons.trusted_connector_endpoints_regex": [
          "^https://runtime\\.sagemaker\\..*[a-z0-9-]\\.amazonaws\\.com/.*$",
          "^https://api\\.openai\\.com/.*$",
          "^https://api\\.cohere\\.ai/.*$"
        ]
    }
}
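
Before creating the connector, it can be useful to sanity-check that the endpoint you plan to call will pass the allow-list above. The following standalone Python sketch (not part of the cluster API) tests a URL against the same regexes:

```python
import re

# The trusted connector endpoint patterns from the cluster settings above.
trusted_patterns = [
    r"^https://runtime\.sagemaker\..*[a-z0-9-]\.amazonaws\.com/.*$",
    r"^https://api\.openai\.com/.*$",
    r"^https://api\.cohere\.ai/.*$",
]

def is_trusted(url: str) -> bool:
    """Return True if the URL matches any trusted endpoint pattern."""
    return any(re.match(p, url) for p in trusted_patterns)

print(is_trusted("https://api.cohere.ai/v1/embed"))  # True
print(is_trusted("https://example.com/v1/embed"))    # False
```

The Cohere embed URL used by the connector below matches the third pattern; anything outside the allow-list will be rejected at connector-creation time.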

Create Model Group:

POST /_plugins/_ml/model_groups/_register
{
    "name": "Cohere_Group",
    "description": "Public Cohere Model Group",
    "access_mode": "public"
}
# MODEL_GROUP_ID: 

Create Connector:

POST /_plugins/_ml/connectors/_create
{
    "name": "Cohere Connector",
    "description": "External connector for connections into Cohere",
    "version": "1.0",
    "protocol": "http",
    "credential": {
        "cohere_key": "<COHERE KEY HERE>"
    },
    "parameters": {
        "model": "embed-english-v2.0",
        "truncate": "END"
    },
    "actions": [{
        "action_type": "predict",
        "method": "POST",
        "url": "https://api.cohere.ai/v1/embed",
        "headers": {
            "Authorization": "Bearer ${credential.cohere_key}"
        },
        "request_body": "{ \"texts\": ${parameters.prompt}, \"truncate\": \"${parameters.truncate}\", \"model\": \"${parameters.model}\" }",
        "pre_process_function": "connector.pre_process.cohere.embedding",
        "post_process_function": "connector.post_process.cohere.embedding"
    }]
}
# CONNECTOR_ID:
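
At predict time, OpenSearch substitutes the `${parameters.*}` and `${credential.*}` placeholders in `request_body` before calling Cohere. A rough Python sketch of that substitution (a simplified stand-in, not the actual ML Commons implementation):

```python
import json

# The request_body template from the connector definition above.
template = ('{ "texts": ${parameters.prompt}, '
            '"truncate": "${parameters.truncate}", '
            '"model": "${parameters.model}" }')

# Placeholder values; the prompt is injected as a JSON array of strings.
values = {
    "${parameters.prompt}": json.dumps(["Testing neural search"]),
    "${parameters.truncate}": "END",
    "${parameters.model}": "embed-english-v2.0",
}

body = template
for placeholder, value in values.items():
    body = body.replace(placeholder, value)

payload = json.loads(body)  # the substituted template must be valid JSON
print(payload["model"])     # embed-english-v2.0
```

Note that `${parameters.prompt}` is unquoted in the template because it expands to a JSON array, while the other placeholders expand to quoted strings.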

Register and deploy a model to the cluster:

POST /_plugins/_ml/models/_register?deploy=true
{
    "name": "embed-english-v2.0",
    "function_name": "remote",
    "description": "test model",
    "model_group_id": "<MODEL_GROUP_ID>",
    "connector_id": "<CONNECTOR_ID>"
}
# TASK_ID: 

Check the task status to confirm the model deployed and get the Model ID:

GET /_plugins/_ml/tasks/<TASK_ID>
# MODEL_ID: 

Create Ingest Pipeline:

PUT _ingest/pipeline/cohere-ingest-pipeline
{
  "description": "Cohere Neural Search Pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "<MODEL_ID>",
        "field_map": {
          "content": "content_embedding"
        }
      }
    }
  ]
}
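
Conceptually, the `text_embedding` processor embeds each field in `field_map` and writes the vector to the mapped target field. A toy Python simulation of that behavior, with a fake `embed()` standing in for the remote Cohere model (the real one returns a 4096-dimension vector):

```python
def embed(text):
    # Placeholder embedding; the real model call returns a 4096-dim vector.
    return [float(len(text)), 0.0, 0.0]

def apply_pipeline(doc, field_map):
    """Mimic the text_embedding processor: embed each source field
    and store the vector under the mapped target field."""
    out = dict(doc)
    for source_field, target_field in field_map.items():
        out[target_field] = embed(doc[source_field])
    return out

doc = {"content": "Testing neural search"}
enriched = apply_pipeline(doc, {"content": "content_embedding"})
print(sorted(enriched.keys()))  # ['content', 'content_embedding']
```

Because the index below sets this pipeline as `default_pipeline`, every document indexed without an explicit pipeline gets a `content_embedding` field added automatically.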

Create the k-NN index. Note: the index's space type needs to match the space the model recommends; e.g. embed-english-v2.0 recommends cosine similarity:

PUT /cohere-index
{
    "settings": {
        "index.knn": true,
        "default_pipeline": "cohere-ingest-pipeline"
    },
    "mappings": {
        "properties": {
            "content_embedding": {
                "type": "knn_vector",
                "dimension": 4096,
                "method": {
                    "name": "hnsw",
                    "space_type": "cosinesimil",
                    "engine": "nmslib"
                }
            },
            "content": {
                "type": "text"
            }
        }
    }
}
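
For reference, the `cosinesimil` space scores neighbors by the cosine of the angle between vectors, so vector magnitude is ignored. A minimal Python illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity, the measure behind the 'cosinesimil' space."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

This is why the dimension (4096) and space type must match the embedding model: scores are only meaningful when query and document vectors come from the same space.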

Hydrate the index with _bulk:

POST _bulk
{ "create" : { "_index" : "cohere-index", "_id" : "1" }}
{ "content":"Testing neural search"}
{ "create" : { "_index" : "cohere-index", "_id" : "2" }}
{ "content": "What are we doing"}
{ "create" : { "_index" : "cohere-index", "_id" : "3" } }
{ "content": "This should exist"}
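
If you are generating the bulk payload programmatically, remember that `_bulk` expects newline-delimited JSON: alternating action and source lines, with a trailing newline. A small Python sketch that builds the payload above:

```python
import json

docs = [
    {"_id": "1", "content": "Testing neural search"},
    {"_id": "2", "content": "What are we doing"},
    {"_id": "3", "content": "This should exist"},
]

# Build NDJSON: one action line, then one source line, per document.
lines = []
for doc in docs:
    lines.append(json.dumps({"create": {"_index": "cohere-index",
                                        "_id": doc["_id"]}}))
    lines.append(json.dumps({"content": doc["content"]}))

# _bulk requires the body to end with a newline.
payload = "\n".join(lines) + "\n"
print(payload)
```

Each source line here triggers the default ingest pipeline, so every document gets embedded on the way in.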

Search

GET /cohere-index/_search
{
  "query": {
    "bool" : {
      "should" : [
        {
          "script_score": {
            "query": {
              "neural": {
                "content_embedding": {
                  "query_text": "How do I ingest to opensearch",
                  "k": 10
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        },
        {
          "script_score": {
            "query": {
              "match": { "content": "I want information about the new compression algorithms in OpenSearch" }
            },
            "script": {
              "source": "_score * 1.7"
            }
          }
        }
      ]
    }
  }
}
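
The `bool`/`should` query sums the scores of its matching clauses, after each `script_score` has scaled its clause's `_score`. A quick arithmetic illustration with hypothetical raw clause scores (the 0.82 and 2.40 below are made-up values, only the 1.5 and 1.7 weights come from the query above):

```python
# Hypothetical raw clause scores before the script multipliers.
neural_score = 0.82  # assumed neural (vector) clause score
match_score = 2.40   # assumed BM25 match clause score

# Each script_score scales its clause; bool/should sums them.
final = neural_score * 1.5 + match_score * 1.7
print(round(final, 2))  # 5.31
```

Tuning the two weights shifts the balance between semantic (neural) and lexical (BM25) relevance in the combined ranking.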

Cleanup

POST /_plugins/_ml/models/<MODEL_ID>/_undeploy
DELETE /_plugins/_ml/models/<MODEL_ID>
DELETE /_plugins/_ml/connectors/<CONNECTOR_ID>
DELETE _ingest/pipeline/cohere-ingest-pipeline
DELETE cohere-index

Troubleshoot:

POST /_plugins/_ml/models/<MODEL_ID>/_predict
{
  "parameters": {
    "texts": ["This should exist"]
  }
} 
GET /cohere-index/_search
{
  "query": {
    "match_all": {}
  }
}
Xu-Hardy commented Dec 8, 2023

Is this your locally deployed OpenSearch? I got an error using AWS hosting: "Message": "Your request: '/_cluster/settings' payload is not allowed."

dtaivpp commented Dec 8, 2023

Yes, Amazon OpenSearch Service does not have these feature flags (for conversational search and RAG). Connectors and hybrid search should be there, however, if I'm not mistaken.

jonwiggins commented Dec 12, 2023

@dtaivpp I get that error whenever I try to enable plugins.ml_commons.allow_registering_model_via_url - It looks like this is required to load the sparse search models (https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#step-2-register-a-local-opensearch-provided-model). Does Amazon OpenSearch not support this? Or do you know if there are any plans to?

dtaivpp commented Dec 13, 2023

@jonwiggins Reading through the docs it seems like it should but let me take a look and see if I can test it.

@jonwiggins

@dtaivpp Awesome, thank you! Totally stuck trying to use this feature, I really appreciate the help.

dtaivpp commented Dec 14, 2023

@jonwiggins Just occurred to me, which version of OpenSearch are you currently deploying?

@jonwiggins

@dtaivpp OpenSearch_2_11_R20231113-P1 - which I think is the latest version AWS allows.
Are you able to turn that feature on in AWS OpenSearch?

dtaivpp commented Dec 22, 2023

@jonwiggins so after some digging it seems it is available, but only with models hosted on SageMaker. Self-run models aren't supported currently -_-

@jonwiggins

Darn that's really disappointing. Thanks for getting back to me though, otherwise I would have kept trying to make it work, I appreciate it.

dtaivpp commented Dec 22, 2023

All good @jonwiggins. Also, if you have a TAM for your AWS account, I'd ask that they open what's called a PFR for this. Enough people have been asking for this that I want to start getting it documented so we can eventually support it.
