@murdockcrc
Last active September 28, 2018 16:58
DocumentDB stored procedure to retrieve a random document
/*
The MIT License (MIT)
Copyright (c) 2015 HdV Media GmbH
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/
module.exports = {
  /**
   * This is executed as a stored procedure to retrieve a random document from the collection.
   * To avoid a script timeout on the server when there are lots of documents (100K+), the script is executed in batches:
   * each batch reads up to maxResult documents, picks one of them at random, and returns it with a continuation token.
   * The script is run multiple times, starting from an empty continuation,
   * then using the continuation returned by the last invocation, until the continuation returned by the script is null or an empty string.
   *
   * @param {String} filterQuery - Optional filter for the query (e.g. "SELECT * FROM docs WHERE docs.category = 'food'").
   * Preferably, select just the ID of the document (e.g. "SELECT docs.id FROM docs") so you can retrieve it later
   * and keep the load of the random query on the DB to a minimum.
   * @param {String} continuationToken - The continuation token passed by the request; reading continues from this token.
   */
  id: 'GetRandomDocument',
  serverScript: function getRandomDocument(filterQuery, continuationToken) {
    var collection = getContext().getCollection();

    // Max number of docs to process in one batch; when reached, return to the client with a continuation.
    // Intentionally set low to demonstrate the concept. This can be much higher. Try experimenting.
    // We've had it in the high thousands before seeing the stored procedure time out.
    var maxResult = 1000;

    // The documents read so far in this invocation.
    var result = [];

    tryQuery(continuationToken);

    // Helper method to check for max result and call query.
    function tryQuery(nextContinuationToken) {
      var responseOptions = { continuation: nextContinuationToken, pageSize: maxResult };

      // If the server has been running this script for a long time or is near its timeout, query() returns false.
      // In that case we set the response to the current continuation token,
      // and the client runs this script again starting from that continuation.
      // When the client calls this script the first time, it passes an empty continuation token.
      if (result.length >= maxResult || !query(responseOptions)) {
        setBody(nextContinuationToken);
      }
    }

    function query(responseOptions) {
      // For an empty query string, use readDocuments rather than queryDocuments -- it's faster as it doesn't need to process the query.
      return (filterQuery && filterQuery.length) ?
        collection.queryDocuments(collection.getSelfLink(), filterQuery, responseOptions, onReadDocuments) :
        collection.readDocuments(collection.getSelfLink(), responseOptions, onReadDocuments);
    }

    // Return a random integer in the inclusive range [min, max].
    function getRandomInt(min, max) {
      return Math.floor(Math.random() * (max - min + 1)) + min;
    }

    // This callback is called from collection.queryDocuments/readDocuments.
    function onReadDocuments(err, docFeed, responseOptions) {
      if (err) {
        throw new Error('Error while reading document: ' + err);
      }

      // Append all documents in docFeed to result.
      docFeed.forEach(function (element) {
        result.push(element);
      });

      // If there is a continuation, call query again with it;
      // otherwise we are done, in which case set the continuation to null.
      if (responseOptions.continuation) {
        tryQuery(responseOptions.continuation);
      } else {
        setBody(null);
      }
    }

    // Set response body: use an object the client is expecting (2 properties: randomDocument and continuationToken).
    // randomDocument is undefined when no documents were read in this batch.
    function setBody(continuationToken) {
      var randomIndex = getRandomInt(0, result.length - 1);
      var body = { randomDocument: result[randomIndex], continuationToken: continuationToken };
      getContext().getResponse().setBody(body);
    }
  }
}
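
The header comment describes a client that calls the procedure repeatedly, feeding each returned continuation token back in until it comes back null or empty. Below is a minimal sketch of that loop. It assumes a caller-supplied `executeSproc(filterQuery, continuationToken)` function (hypothetical, not part of any SDK) that runs GetRandomDocument and resolves to the response body; `collectBatchPicks` and `pickOne` are illustrative names as well.

```javascript
// Hypothetical client-side driver for GetRandomDocument.
// `executeSproc` is an assumed, caller-supplied async function that runs the
// stored procedure with (filterQuery, continuationToken) and resolves to the
// body it sets: { randomDocument: <doc or missing>, continuationToken: <string or null> }.
async function collectBatchPicks(executeSproc, filterQuery) {
  const picks = [];
  let continuationToken = null; // empty continuation on the first call

  do {
    const body = await executeSproc(filterQuery, continuationToken);
    // An empty batch yields no randomDocument, so skip it.
    if (body.randomDocument !== undefined) {
      picks.push(body.randomDocument);
    }
    continuationToken = body.continuationToken;
  } while (continuationToken); // stop on null or empty string

  return picks;
}

// Choose one of the per-batch picks.
// `random` is injectable so the choice can be made deterministic in tests.
function pickOne(picks, random = Math.random) {
  if (picks.length === 0) {
    return undefined;
  }
  return picks[Math.floor(random() * picks.length)];
}
```

Because each invocation picks from its own batch, choosing uniformly among the picks is uniform over batches rather than over documents: if batches differ in size, a document in a smaller batch is more likely to be chosen.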
@DarinMacRae

Thank you for this script. I am just now using it for a new project. I noticed that the syntax isn't correct for the latest CosmosDb so I forked and modified. Feel free to use my version if it helps.
