jasonbosco/typesense-query-suggestions-and-analytics.md Secret

## typesense-query-suggestions-and-analytics.md

      
    ⚠️ UPDATE: v0.25.0 is now generally available. Please upgrade to 0.25.0 and use the official docs here.


Analytics & Query Suggestions in Typesense

As of Typesense Server 0.25.0.rc34, you can configure Typesense to aggregate your top search queries across one or more collections into a new collection.
You can then use the data in this collection to view your top search terms, and also to show query suggestions to your users.

Enabling search analytics


If you're using Typesense Cloud, please update to 0.25.0.rc52 or later. We will then automatically do the following for you, so you can skip this section.

Search analytics is disabled by default. It must first be enabled via the --enable-search-analytics flag for query suggestions and other analytics features to work.
./typesense-server --data-dir=/tmp/data --api-key=abcd --enable-search-analytics=true --analytics-flush-interval=60

Search queries are first aggregated in memory on every Typesense node independently and then periodically persisted to a configured search suggestions collection (see below). The --analytics-flush-interval flag determines how often, in seconds, the search query aggregations are persisted to the suggestions collection. Set this to a smaller value (the minimum is 60 seconds) to get more frequent updates to the suggestions collection. The default value is 3600 (every hour).
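
The bounds on the flush interval can be sketched as a small wrapper script. This is only an illustration: `clamp_flush_interval` and the `FLUSH_INTERVAL` variable are hypothetical names, not part of Typesense, and the text above does not say how the server itself reacts to a value below 60, so the sketch simply raises such values before they reach the flag.

```shell
# Hypothetical helper: returns a flush interval that respects the limits
# described above (default 3600 seconds, minimum 60 seconds).
clamp_flush_interval() {
  interval="${1:-3600}"            # fall back to the default when empty or unset
  if [ "$interval" -lt 60 ]; then  # raise anything below the minimum
    interval=60
  fi
  echo "$interval"
}

# FLUSH_INTERVAL is our own (assumed) environment variable, not a Typesense setting.
FLUSH_INTERVAL="$(clamp_flush_interval "${FLUSH_INTERVAL:-}")"

# Print the resulting command line; drop the leading echo to actually start the server.
echo ./typesense-server --data-dir=/tmp/data --api-key=abcd \
  --enable-search-analytics=true \
  --analytics-flush-interval="${FLUSH_INTERVAL}"
```

Running this with `FLUSH_INTERVAL=10` prints a command that uses 60, while leaving the variable unset keeps the hourly default.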

Create a collection that will store the search terms

The q and count fields are mandatory.
curl -k "http://localhost:8108/collections" -X POST -H "Content-Type: application/json" \
      -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
        "name": "top_queries",
        "fields": [
          {"name": "q", "type": "string" },
          {"name": "count", "type": "int32" }
        ]
      }'


Create an analytics rule for the searches to be aggregated

We can now configure Typesense to aggregate popular queries by creating a popular_queries analytics rule that stores the most frequently occurring search queries in the collection we created above. The limit parameter caps the stored queries at the top 1000.
curl -k "http://localhost:8108/analytics/rules" -X POST -H "Content-Type: application/json" \
      -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
        "name": "top_queries",
        "type": "popular_queries",
        "params": {
          "source": {
            "collections": ["hnstories"]
          },
          "destination": {
            "collection": "top_queries"
          },
          "limit": 1000
        }
      }'
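
Since queries can be aggregated across one or more collections, the `collections` array in the rule can list several sources. The following is a sketch of that variant, not a tested recipe: the `products` collection, the `TYPESENSE_HOST` variable and the `create_rule` function are assumptions introduced for illustration.

```shell
# Assumed base URL; matches the host used in the examples above.
TYPESENSE_HOST="${TYPESENSE_HOST:-http://localhost:8108}"

# Same rule shape as above, with a second (hypothetical) source collection.
RULE_JSON='{
  "name": "top_queries",
  "type": "popular_queries",
  "params": {
    "source": { "collections": ["hnstories", "products"] },
    "destination": { "collection": "top_queries" },
    "limit": 1000
  }
}'

# Hypothetical helper that posts the rule to the analytics rules endpoint.
create_rule() {
  curl -s "${TYPESENSE_HOST}/analytics/rules" -X POST \
    -H "Content-Type: application/json" \
    -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
    -d "${RULE_JSON}"
}
```

Call `create_rule` once to register it. It reuses the rule name `top_queries`, so it is meant as an alternative to the single-collection rule rather than an addition to it.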

Searches will now be aggregated!

When you now search the source collection configured in the rule above, the query will be aggregated into the destination collection.
curl -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -H "X-TYPESENSE-USER-ID: 100" "http://localhost:8108/collections/hnstories/documents/search?q=apple&query_by=title&x-typesense-user-id=103"

Optionally, you can send an x-typesense-user-id parameter to identify the user, as shown above. The same value can instead be sent via an X-TYPESENSE-USER-ID header. This helps Typesense aggregate search queries more accurately. If neither is specified, Typesense uses the IP address by default.
Since Typesense can be used for type-ahead searches, a search query is counted for aggregation only when it is followed by a pause of at least 4 seconds. E.g. f -> fo -> foo -> 4 second pause will register the foo query.
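
Once the aggregations have been flushed, the destination collection can be searched like any other collection, either to list the top search terms or to fetch suggestions for what a user has typed so far. The sketch below assumes the `top_queries` collection from above and sorts by the `count` field. The function names and the `TYPESENSE_HOST` variable are hypothetical, and the page sizes are arbitrary choices.

```shell
# Assumed base URL; matches the host used in the examples above.
TYPESENSE_HOST="${TYPESENSE_HOST:-http://localhost:8108}"

# Search the aggregated collection, most frequent queries first.
# $1 = query text ("*" matches every stored query), $2 = number of results.
search_top_queries() {
  curl -s -G "${TYPESENSE_HOST}/collections/top_queries/documents/search" \
    -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" \
    --data-urlencode "q=$1" \
    --data-urlencode "query_by=q" \
    --data-urlencode "sort_by=count:desc" \
    --data-urlencode "per_page=${2:-10}"
}

# List the most popular search terms overall.
top_terms() { search_top_queries "*" "${1:-10}"; }

# Fetch up to 5 suggestions for the text typed so far.
suggest() { search_top_queries "$1" 5; }
```

For example, `top_terms 20` lists the 20 most frequent queries and `suggest app` returns stored queries matching "app". Results only reflect searches that have already been persisted, so a new query shows up after the next flush interval at the earliest.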

Listing all the analytics rules

curl -k "http://localhost:8108/analytics/rules" -X GET -H "Content-Type: application/json" \
      -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" 

Removing an analytics rule

curl -k "http://localhost:8108/analytics/rules/top_queries" -X DELETE -H "Content-Type: application/json" \
      -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"