⚠️ UPDATE: v0.25.0 is now generally available. Please upgrade to 0.25.0
and use the official docs here.
As of Typesense Server 0.25.0.rc34, you can configure Typesense to aggregate your top search queries across one or more collections into a new collection.
You can then use the data in this collection to view your top search terms, and also to show query suggestions to your users.
If you're using Typesense Cloud, please update to 0.25.0.rc52 or later. We will then do the following for you automatically, and you can skip this section.
Search analytics is disabled by default. It must first be enabled via the --enable-search-analytics flag for query suggestions and other analytics features to work.
./typesense-server --data-dir=/tmp/data --api-key=abcd --enable-search-analytics=true --analytics-flush-interval=60
Search queries are first aggregated in memory on each Typesense node independently, and then periodically persisted to a configured search suggestions collection (see below). The --analytics-flush-interval flag determines how often the aggregated queries are persisted to that collection. Set it to a smaller value (the minimum is 60 seconds) to get more frequent updates. The default is 3600 seconds (every hour).
Next, create the collection that will store the aggregated queries. The q and count fields are mandatory.
curl -k "http://localhost:8108/collections" -X POST -H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
"name": "top_queries",
"fields": [
{"name": "q", "type": "string" },
{"name": "count", "type": "int32" }
]
}'
We can now configure Typesense to aggregate popular queries by creating a popular_queries analytics rule that stores the most frequently occurring search queries in the collection we created above. The limit parameter caps the stored queries at the top 1000.
curl -k "http://localhost:8108/analytics/rules" -X POST -H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d '{
"name": "top_queries",
"type": "popular_queries",
"params": {
"source": {
"collections": ["hnstories"]
},
"destination": {
"collection": "top_queries"
},
"limit": 1000
}
}'
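Since a rule can aggregate queries from more than one source collection, the request body can also be generated rather than typed by hand. The following is a minimal sketch: popular_queries_rule is a hypothetical helper, the products collection in the usage comment is an illustrative name, and reusing the destination name as the rule name simply mirrors the example above.

```shell
# Hypothetical helper: print a popular_queries rule body.
# $1 is the destination collection (also used as the rule name here);
# every remaining argument is a source collection.
popular_queries_rule() {
  dest=$1
  shift
  # printf repeats its format once per argument: "a","b",
  sources=$(printf '"%s",' "$@")
  sources=${sources%,}   # drop the trailing comma
  printf '{"name":"%s","type":"popular_queries","params":{"source":{"collections":[%s]},"destination":{"collection":"%s"},"limit":1000}}\n' \
    "$dest" "$sources" "$dest"
}

# Usage against a running server (not executed here):
#   popular_queries_rule top_queries hnstories products |
#     curl -k "http://localhost:8108/analytics/rules" -X POST \
#       -H "Content-Type: application/json" \
#       -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -d @-
```

The helper does no JSON escaping, so it is only suitable for plain collection names.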
Searches made against the source collection configured in the rule above will now be aggregated into the destination collection.
curl -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" -H "X-TYPESENSE-USER-ID: 100" "http://localhost:8108/collections/hnstories/documents/search?q=apple&query_by=title&x-typesense-user-id=103"
Optionally, you can send an x-typesense-user-id parameter to identify the user, as shown above. The same value can instead be sent via an X-TYPESENSE-USER-ID header. Identifying the user helps Typesense aggregate search queries more accurately. If no user ID is specified, Typesense uses the client's IP address by default.
Since Typesense can be used for type-ahead searches, a search query is counted for aggregation only when it is followed by a pause of at least 4 seconds. For example, f -> fo -> foo -> 4 second pause will register only the foo query.
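To make the pause rule concrete, here is a small simulation. It is only an illustration of the behaviour described above, not Typesense's implementation: registered_queries is a hypothetical function, the input is a made-up log of one user's keystroke searches as "seconds query" pairs with single-word queries, and the final query in the log is assumed to be followed by a pause.

```shell
# Hypothetical simulation of the 4-second pause rule for a single user.
# Reads "timestamp_in_seconds query" lines from stdin and prints only the
# queries that are followed by a gap of at least 4 seconds.
registered_queries() {
  awk -v pause=4 '
    # A long enough gap before this line means the previous query counts.
    NR > 1 && $1 - prev_t >= pause { print prev_q }
    { prev_t = $1; prev_q = $2 }
    # Assumption: the last query in the log is followed by a pause.
    END { if (NR > 0) print prev_q }
  '
}

# f, fo, foo typed one second apart, then an 8 second gap, then b, ba.
printf '%s\n' '0 f' '1 fo' '2 foo' '10 b' '11 ba' | registered_queries
# prints:
# foo
# ba
```

The intermediate prefixes f, fo and b are dropped, so only the queries the user paused on are counted.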
To list all analytics rules:
curl -k "http://localhost:8108/analytics/rules" -X GET -H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
To remove a rule, delete it by name:
curl -k "http://localhost:8108/analytics/rules/top_queries" -X DELETE -H "Content-Type: application/json" \
-H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}"
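Once the rule has been collecting data for a while, the top_queries collection can be searched like any other collection to show query suggestions. The sketch below is one possible way to do that, under a few assumptions: suggestions_url is a hypothetical helper, the typed prefix is assumed to be URL-safe already, and sorting by count with five results per page is an illustrative choice rather than a recommended setting.

```shell
# Hypothetical helper: build a search URL against the top_queries collection
# for a typed prefix, most frequent queries first.
# Assumes $1 needs no URL encoding (no spaces or reserved characters).
suggestions_url() {
  printf 'http://localhost:8108/collections/top_queries/documents/search?q=%s&query_by=q&sort_by=count:desc&per_page=5\n' "$1"
}

# Usage against a running server (not executed here):
#   curl -s -H "X-TYPESENSE-API-KEY: ${TYPESENSE_API_KEY}" "$(suggestions_url app)"
```

Keeping the URL in double quotes when passing it to curl matters, because an unquoted & would be interpreted by the shell.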