@prasann
Last active June 25, 2024 08:56
APIM policy to rate limit DALL-E endpoints based on the number of images requested

Rate limiting DALL-E requests based on the number of images requested

Approach

A request to the DALL-E image generation endpoint carries a parameter, n, that sets the number of images to generate. The sample policy uses this parameter to adjust, per request, how much of the rate limit the request consumes.

Reference: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#image-generation

Policy

<policies>
    <inbound>
        <base />
        <!-- Save the raw body first; preserveContent: true keeps the stream readable for the reads below. -->
        <set-variable name="body" value="@(context.Request.Body.As<string>(preserveContent: true))" />
        <!-- Count n against the limit when the body specifies n; otherwise count the request as 1. -->
        <choose>
            <when condition="@(context.Request.Body.As<JObject>()["n"] != null)">
                <rate-limit-by-key calls="1" renewal-period="60" counter-key="@(context.Request.IpAddress)" increment-condition="@(context.Response.StatusCode == 200)" increment-count="@(int.Parse(context.Request.Body.As<JObject>()["n"].ToString()))" retry-after-header-name="x-apim-custom-retry" remaining-calls-header-name="x-apim-remaining" />
            </when>
            <otherwise>
                <rate-limit-by-key calls="1" renewal-period="60" counter-key="@(context.Request.IpAddress)" increment-condition="@(context.Response.StatusCode == 200)" increment-count="1" retry-after-header-name="x-apim-custom-retry" remaining-calls-header-name="x-apim-remaining" />
            </otherwise>
        </choose>
        <!-- The As<JObject>() reads above do not preserve the content, so restore the saved body before forwarding. -->
        <set-body>@((string)context.Variables["body"])</set-body>
    </inbound>
    <!-- Control if and how the requests are forwarded to services  -->
    <backend>
        <base />
    </backend>
    <!-- Customize the responses -->
    <outbound>
        <base />
    </outbound>
    <!-- Handle exceptions and customize error responses  -->
    <on-error>
        <base />
    </on-error>
</policies>

Explanation of the policy

This XML snippet is a configuration for a rate-limiting policy in Azure API Management (APIM). It specifies how API requests are throttled based on the client's IP address. Here's a breakdown of its components:

calls="1": This attribute sets the maximum count allowed within the renewal period. Here the limit is 1 per 60 seconds (renewal-period="60"), tracked separately for each client IP address (counter-key).

increment-condition="@(context.Response.StatusCode == 200)": This condition specifies when the call count should be incremented. The count is incremented only if the API response status code is 200, indicating a successful request. If the condition is not met (e.g., if an error occurs), the call does not count against the limit.

increment-count="@(int.Parse(context.Request.Body.As<JObject>()["n"].ToString()))": This sets how much a single request adds to the counter, based on the value of n in the request body. For example, if the request body contains "n": 3 and the response status code is 200, the counter is incremented by 3. When the body has no n, the otherwise branch applies and the request counts as 1.

Why do we need set-body

In Azure API Management (APIM), reading the request body in a policy expression with context.Request.Body.As<T>() consumes the underlying body stream by default, so the body is no longer available to later policies or to the backend service. Passing preserveContent: true makes As<T>() work on a copy of the stream and leaves the original intact.

In this policy, the first read (which stores the body in the body variable) passes preserveContent: true, but the later As<JObject>() calls in the when condition and in increment-count do not, so they consume the stream. This is why the body is stored in a variable before anything else touches it and then written back with set-body once the rate-limit policy has been applied. Without that final step, the backend could receive an empty body. The same pattern works if you want to forward a modified body instead of the original.
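
One way to avoid consuming the body at all is to read it once with preserveContent: true and keep the parsed n in a variable. The fragment below is a minimal sketch under that assumption, not a tested replacement for the policy above: the variable name imageCount is hypothetical, and it assumes that n, when present, is an integer.

```xml
<inbound>
    <base />
    <!-- Read a copy of the body (preserveContent: true) so the original is still forwarded. -->
    <!-- "imageCount" is a hypothetical variable name; missing or null n falls back to 1. -->
    <set-variable name="imageCount" value="@{
        var body = context.Request.Body?.As<JObject>(preserveContent: true);
        // Value<int?>("n") yields null when n is absent or null.
        return body?.Value<int?>("n") ?? 1;
    }" />
    <rate-limit-by-key calls="1" renewal-period="60"
        counter-key="@(context.Request.IpAddress)"
        increment-condition="@(context.Response.StatusCode == 200)"
        increment-count="@(context.Variables.GetValueOrDefault<int>("imageCount", 1))"
        retry-after-header-name="x-apim-custom-retry"
        remaining-calls-header-name="x-apim-remaining" />
</inbound>
```

Because the only read works on a copy, this variant should need neither the body variable nor the closing set-body, and a single rate-limit-by-key covers both the "n present" and "n absent" cases.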
