Creating an HAProxy AI Gateway to Control LLM Costs, Security, and Privacy
http-request set-var(txn.openai_key_hash) http_auth_bearer(Authorization),sha2(256),hex
log-format-sd "%{+Q,+E}o[request@58750 host=%[var(txn.host)] referer=%[var(txn.referer)] user_agent=%[var(txn.user_agent)]][custom@58750 openai_key_hash=%[var(txn.openai_key_hash)]]"
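To build denylist or rate-limit entries for a given API key, the same hash can be computed outside HAProxy. The following is a minimal Python sketch, assuming the key is hashed as UTF-8 bytes with SHA-256 and hex-encoded, as the `sha2(256),hex` converters above suggest. The function name `key_hash` and the sample key are hypothetical.

```python
import hashlib


def key_hash(bearer_token: str) -> str:
    """Return the hex-encoded SHA-256 digest of an API key (hypothetical helper)."""
    return hashlib.sha256(bearer_token.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # "sk-example" is a placeholder, not a real key.
    print(key_hash("sk-example"))
```

Python's `hexdigest()` returns lowercase digits. This sketch does not assume that HAProxy's `hex` output uses the same letter case, so comparing the two case-insensitively is the safer choice.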
# example denylist.acl
5fd924625a10e0baacdb8
acl blocked_key var(txn.openai_key_hash) -m str -i -f /denylist.acl
http-request deny deny_status 403 if blocked_key
<key hash> <per minute prompt limit>:<per day prompt limit>:<per minute completion limit>:<per day completion limit>
5fd924625a10e0baacdb8 100:200:1000:50000
813490e4ba67813490e4 300:600:2000:30000
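The map format above packs four limits into one colon-separated value. As an illustration of how each field lines up with its meaning, here is a small Python sketch that parses one map line. The names `Limits` and `parse_map_line` are hypothetical and not part of HAProxy.

```python
from typing import NamedTuple, Tuple


class Limits(NamedTuple):
    prompt_per_minute: int
    prompt_per_day: int
    completion_per_minute: int
    completion_per_day: int


def parse_map_line(line: str) -> Tuple[str, Limits]:
    """Split '<key hash> <a>:<b>:<c>:<d>' into the hash and its four limits."""
    key_hash, value = line.split()
    fields = [int(f) for f in value.split(":")]
    if len(fields) != 4:
        raise ValueError("expected four colon-separated limits, got %d" % len(fields))
    return key_hash, Limits(*fields)


if __name__ == "__main__":
    print(parse_map_line("5fd924625a10e0baacdb8 100:200:1000:50000"))
```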
backend rates
  stick-table rates_prompt_minute.local type string len 128 size 50k expire 1m store gpc(1),gpc_rate(1,60s) peers "$peers_section_name"
  stick-table rates_prompt_minute.aggregate type string len 128 size 50k expire 1m store gpc(1),gpc_rate(1,60s) peers "$peers_section_name"
  stick-table rates_prompt_day.local type string len 128 size 50k expire 24h store gpc(1),gpc_rate(1,1d) peers "$peers_section_name"
  stick-table rates_prompt_day.aggregate type string len 128 size 50k expire 24h store gpc(1),gpc_rate(1,1d) peers "$peers_section_name"
  stick-table rates_completion_minute.local type string len 128 size 50k expire 1m store gpc(1),gpc_rate(1,60s) peers "$peers_section_name"
  stick-table rates_completion_minute.aggregate type string len 128 size 50k expire 1m store gpc(1),gpc_rate(1,60s) peers "$peers_section_name"
  stick-table rates_completion_day.local type string len 128 size 50k expire 24h store gpc(1),gpc_rate(1,1d) peers "$peers_section_name"
  stick-table rates_completion_day.aggregate type string len 128 size 50k expire 24h store gpc(1),gpc_rate(1,1d) peers "$peers_section_name"
frontend mysite
  http-request set-var(txn.maxrate_min_prompt) var(txn.openai_key_hash),map(/rate-limits.map,0),field(1,:)
  http-request set-var(txn.maxrate_day_prompt) var(txn.openai_key_hash),map(/rate-limits.map,0),field(2,:)
  http-request set-var(txn.maxrate_min_completion) var(txn.openai_key_hash),map(/rate-limits.map,0),field(3,:)
  http-request set-var(txn.maxrate_day_completion) var(txn.openai_key_hash),map(/rate-limits.map,0),field(4,:)
http-request track-sc0 var(txn.openai_key_hash) table rates/rates_prompt_minute.local
http-request track-sc1 var(txn.openai_key_hash) table rates/rates_prompt_minute.aggregate
http-request track-sc2 var(txn.openai_key_hash) table rates/rates_prompt_day.local
http-request track-sc3 var(txn.openai_key_hash) table rates/rates_prompt_day.aggregate
http-request track-sc4 var(txn.openai_key_hash) table rates/rates_completion_minute.local
http-request track-sc5 var(txn.openai_key_hash) table rates/rates_completion_minute.aggregate
http-request track-sc6 var(txn.openai_key_hash) table rates/rates_completion_day.local
http-request track-sc7 var(txn.openai_key_hash) table rates/rates_completion_day.aggregate
http-request set-var(txn.rate_prompt_minute) sc_gpc_rate(0,1)
http-request set-var(txn.rate_prompt_day) sc_gpc_rate(0,3)
http-request set-var(txn.rate_completion_minute) sc_gpc_rate(0,5)
http-request set-var(txn.rate_completion_day) sc_gpc_rate(0,7)
If (Current rate - Maximum rate > 0) then
  Over the limit
http-request deny status 429 if { var(txn.rate_prompt_minute),sub(txn.maxrate_min_prompt) gt 0 }
http-request deny status 429 if { var(txn.rate_prompt_day),sub(txn.maxrate_day_prompt) gt 0 }
http-request deny status 429 if { var(txn.rate_completion_minute),sub(txn.maxrate_min_completion) gt 0 }
http-request deny status 429 if { var(txn.rate_completion_day),sub(txn.maxrate_day_completion) gt 0 }
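The four deny rules share one comparison: subtract the configured maximum from the current rate and reject the request when the result is greater than zero. A rough Python sketch of that decision follows. The function names and dictionary keys are hypothetical and only mirror the variable names used in the rules.

```python
from typing import Dict


def over_limit(current_rate: int, max_rate: int) -> bool:
    """Mirror of 'var(rate),sub(maxrate) gt 0': True only when the rate exceeds the maximum."""
    return current_rate - max_rate > 0


def should_deny(rates: Dict[str, int], limits: Dict[str, int]) -> bool:
    """Deny (HTTP 429 in the config) if any tracked rate is over its limit."""
    return any(over_limit(rates[name], limits[name]) for name in limits)


if __name__ == "__main__":
    limits = {"prompt_minute": 100, "prompt_day": 200}
    print(should_deny({"prompt_minute": 100, "prompt_day": 50}, limits))
    print(should_deny({"prompt_minute": 101, "prompt_day": 50}, limits))
```

Under this comparison, a client sitting exactly at its limit is still allowed. Only a rate strictly above the limit triggers the deny.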
http-response set-var(txn.prompt_tokens) res.body,json_query('$.usage.prompt_tokens','int')
http-response set-var(txn.completion_tokens) res.body,json_query('$.usage.completion_tokens','int')
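The two `json_query` expressions read token counts from a `usage` object in the response body. The Python sketch below shows the equivalent extraction on a sample body. The sample JSON is made up for illustration, and treating missing fields as 0 is an assumption of this sketch, not documented HAProxy behavior.

```python
import json
from typing import Tuple


def token_usage(body: str) -> Tuple[int, int]:
    """Return (prompt_tokens, completion_tokens) from a JSON response body."""
    usage = json.loads(body).get("usage", {})
    # Assumption: treat absent counters as 0 rather than raising.
    return int(usage.get("prompt_tokens", 0)), int(usage.get("completion_tokens", 0))


if __name__ == "__main__":
    # Hypothetical response body containing only the fields the queries read.
    sample = '{"usage": {"prompt_tokens": 12, "completion_tokens": 34}}'
    print(token_usage(sample))
```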
http-response sc-add-gpc(0,0) var(txn.prompt_tokens) if { var(txn.rate_prompt_minute),sub(txn.maxrate_min_prompt) le 0 }
http-response sc-add-gpc(0,2) var(txn.prompt_tokens) if { var(txn.rate_prompt_day),sub(txn.maxrate_day_prompt) le 0 }
http-response sc-add-gpc(0,4) var(txn.completion_tokens) if { var(txn.rate_completion_minute),sub(txn.maxrate_min_completion) le 0 }
http-response sc-add-gpc(0,6) var(txn.completion_tokens) if { var(txn.rate_completion_day),sub(txn.maxrate_day_completion) le 0 }