Skip to content

Instantly share code, notes, and snippets.

@Liversage
Created August 31, 2018 08:56
Show Gist options
  • Save Liversage/a2321fb6635c6b3717f68d65f8becac3 to your computer and use it in GitHub Desktop.
Save Liversage/a2321fb6635c6b3717f68d65f8becac3 to your computer and use it in GitHub Desktop.
High request unit cost in Cosmos DB when querying on ID

Cosmos DB high request unit cost issue

Introduction

This test application demonstrates an issue with Cosmos DB where the request unit cost (RU) for a SQL query on id is unexpectedly high and increases linearly with the number of documents in the collection. This issue only occurs when the collection is partitioned and the partition key is not id.

Cosmos DB provides an API to retrieve a document by ID and this operation will always have a RU cost that is "low". This cost essentially depends on the size of the document and as the number of documents increases this cost will not increase.

So even though there is an API to retrieve a document by ID in some cases it is still useful to perform SQL queries that involves the id property. The IDs may encode the type of the document allowing for retrieval of specific document types by using queries like

SELECT * FROM root WHERE STARTSWITH(root.id, "Foo")

The Microsoft Azure Storage Explorer also uses SQL to retrieve a document by ID:

SELECT * FROM c WHERE c.id = "Foobar"

The issue demonstrated by this test application can easily make the above sample queries require so many request units that the query fails. When this happens Microsoft Azure Storage Explorer will simply not display any results and you may belive that you have entered an incorrect ID when in fact the document exists.

Test

The test application is configured to use the emulator but can easily be modified to use a specific Cosmos DB instance.

The application creates the database Test and in this database two document collections, Fast and Slow, are created. To speed things up the offer throughput for these collections are set to 10000. Note that this may incur a cost as long as these document collections exists in Azure. The test application will delete the database when the test is done so the actual cost will be quite small.

The only difference between the two document collections are the partition key. In Slow the partition key is defined as /key and in Fast it is defined as /id.

The test application creates 5000 documents in each collection. A single document is then retrieved by ID from each collection using the following SQL:

SELECT * FROM root WHERE root.id = "..."

The request unit cost of this query is then reported.

Findings

In the Fast document collection the request unit cost is less than 3. This is the collection where the id property is the partition key. Changing the number of documents in the collection to something like 10000 will not change this cost.

In the Slow document collection the request unit cost is close to 90. This is the collection where the key property is the partition key. Doubling the number of documents in the collection will almost double the cost. With enough documents in the collection the cost to retrieve a single document by ID will be so high that the system trying to retrieve the document will become very slow which often can lead to failures in distributed systems (timeouts etc.).

Resolution

I have been in contact with Cosmos DB support and they have confirmed the issue. Fortunately, they have made some changes to how indexing is performed and a Cosmos DB instance using the new indexing strategy no longer suffers from this problem. Unfortunately, the new indexing strategy is not yet available by default so I had to ask Cosmos DB support to upgrade my Cosmos DB instances to use the new strategy (this will incur some downtime). Hopefully, the new strategy will be the default for all new Cosmos DB instances in the future.

Discussion

Without knowing the exact details on how Cosmos DB is implemented I believe that executing a SQL query on id in a document collection with a partition key that is not id will result in a scan of all documents leading to a request unit cost that increases linearly with the total size of the documents in the collection.

If the query involves other properties that are indexed the number of documents to scan can be greatly reduced bringing the request unit cost down. It is not possible to configure an index on id that will remove the need to perform the scan.

<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>netcoreapp2.1</TargetFramework>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|AnyCPU'">
<LangVersion>latest</LangVersion>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|AnyCPU'">
<LangVersion>latest</LangVersion>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Azure.DocumentDB.Core" Version="1.9.1" />
</ItemGroup>
</Project>
using System;
using System.Collections.ObjectModel;
using System.Linq;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
namespace CosmosDBIssue
{
internal class Program
{
private static async Task Main(string[] args)
{
const string serviceEndpoint = "https://localhost:8081";
const string accountKey = "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==";
const string databaseId = "Test";
const string fastDocumentCollectionId = "Fast";
const string slowDocumentCollectionId = "Slow";
const int documentCount = 5000;
var client = new DocumentClient(new Uri(serviceEndpoint), accountKey);
await client.CreateDatabaseIfNotExistsAsync(new Database { Id = databaseId });
Console.WriteLine($"Created database {databaseId}");
await CreateDocumentCollectionAsync(client, databaseId, fastDocumentCollectionId, "/id");
await CreateDocumentCollectionAsync(client, databaseId, slowDocumentCollectionId, "/key");
await CreateDataAsync(client, databaseId, fastDocumentCollectionId, documentCount);
await CreateDataAsync(client, databaseId, slowDocumentCollectionId, documentCount);
await MeasurePerformanceAsync(client, databaseId, fastDocumentCollectionId);
await MeasurePerformanceAsync(client, databaseId, slowDocumentCollectionId);
await client.DeleteDatabaseAsync(UriFactory.CreateDatabaseUri(databaseId));
Console.WriteLine($"Deleted database {databaseId}");
Console.ReadKey();
}
private static async Task MeasurePerformanceAsync(DocumentClient client, string databaseId, string documentCollectionId)
{
var (document, _) = await QueryAsync(client, databaseId, documentCollectionId, "SELECT * FROM root");
var knownId = document.Id;
var (_, requestCharge) = await QueryAsync(client, databaseId, documentCollectionId, $@"SELECT * FROM root WHERE root.id = ""{knownId}""");
Console.WriteLine($"Request charge = {requestCharge} in {documentCollectionId}");
}
private static async Task<(Document Document, double RequestCharge)> QueryAsync(DocumentClient client, string databaseId, string documentCollectionId, string sqlExpression)
{
var documentCollectionUri = UriFactory.CreateDocumentCollectionUri(databaseId, documentCollectionId);
var queryable = client.CreateDocumentQuery<Document>(documentCollectionUri, sqlExpression, new FeedOptions { EnableCrossPartitionQuery = true, MaxItemCount = 1 });
var query = queryable.AsDocumentQuery();
var response = await query.ExecuteNextAsync<Document>();
return (response.FirstOrDefault(), response.RequestCharge);
}
private static async Task CreateDocumentCollectionAsync(DocumentClient client, string databaseId, string documentCollectionId, string partitionKey)
{
try
{
var documentCollectionUri = UriFactory.CreateDocumentCollectionUri(databaseId, documentCollectionId);
await client.ReadDocumentCollectionAsync(documentCollectionUri);
await client.DeleteDocumentCollectionAsync(documentCollectionUri);
Console.WriteLine($"Deleted document collection {documentCollectionId}");
}
catch (DocumentClientException exception)
{
if (exception.StatusCode != HttpStatusCode.NotFound)
throw;
}
await client.CreateDocumentCollectionAsync(
UriFactory.CreateDatabaseUri(databaseId),
new DocumentCollection
{
Id = documentCollectionId,
PartitionKey = new PartitionKeyDefinition
{
Paths = new Collection<string>
{
partitionKey
}
},
IndexingPolicy = new IndexingPolicy
{
IncludedPaths = new Collection<IncludedPath>
{
new IncludedPath
{
Path = "/*",
Indexes = new Collection<Index>
{
new RangeIndex(DataType.Number, -1),
new RangeIndex(DataType.String, -1)
}
}
}
}
},
new RequestOptions { OfferThroughput = 10000 }
);
Console.WriteLine($"Created document collection {documentCollectionId}");
}
private static async Task CreateDataAsync(DocumentClient client, string databaseId, string documentCollectionId, int count)
{
var documentCollectionUri = UriFactory.CreateDocumentCollectionUri(databaseId, documentCollectionId);
for (var i = 0; i < count; i += 1)
{
await client.CreateDocumentAsync(documentCollectionUri, CreateDocument(i));
if (i % 1000 == 999)
Console.WriteLine($"Created {i + 1} documents in {documentCollectionId}");
}
}
private static Document CreateDocument(int i)
{
var id = Guid.NewGuid();
var document = new Document
{
Id = id.ToString()
};
document.SetPropertyValue("key", id);
return document;
}
}
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment