Companion to Medium post #2, section 5 ("Going deeper — add a categorical gate").
Requires bigquery-agent-analytics >= 0.2.2, which ships the `categorical-eval`
command with the `--exit-code`, `--pass-category`, and `--min-pass-rate` flags.
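
A minimal invocation sketch of the categorical gate. The command and its three
flags are the ones named above; the category value and pass-rate threshold are
illustrative placeholders, not documented defaults:

    categorical-eval \
      --exit-code \
      --pass-category resolved \
      --min-pass-rate 0.95

Presumably mirroring the latency gate's exit-code contract in section 3,
`--exit-code` lets CI fail the job when fewer than 95% of sessions in the
evaluation window land in the `resolved` category.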
-- Section 6 INFORMATION_SCHEMA cost pivot from Medium post #2:
-- "Your Agent Events Table Is Also a Test Suite"
--
-- The BigQuery Agent Analytics SDK labels every query it issues
-- with the feature that triggered it. This pivot groups BQ jobs
-- from the last 24 hours by `sdk_feature`, so you can see what the
-- CI gate cost in BQ compute and what the developer trace-reads
-- after a failing run cost on top of that.
--
-- Swap `region-us` for the region your dataset lives in.
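--
-- A sketch of the pivot the header describes. The `sdk_feature` label
-- key is the only part the comments above confirm; the on-demand price
-- ($6.25/TiB billed) and the column choices are assumptions to adjust
-- for your edition and region.
SELECT
  (SELECT l.value
   FROM UNNEST(labels) AS l
   WHERE l.key = 'sdk_feature') AS sdk_feature,
  COUNT(*) AS job_count,
  -- Bytes billed converted to TiB, then priced at the assumed rate.
  ROUND(SUM(total_bytes_billed) / POW(1024, 4) * 6.25, 2) AS est_on_demand_usd,
  ROUND(SUM(total_slot_ms) / 1000, 0) AS slot_seconds
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
  AND job_type = 'QUERY'
GROUP BY sdk_feature
ORDER BY est_on_demand_usd DESC;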
# .github/workflows/evaluate_thresholds.yml
#
# Section 4 reference workflow from Medium post #2:
# "Your Agent Events Table Is Also a Test Suite"
#
# Four deterministic gates run against the last 24 hours of
# production traces every time a PR touches agent code or prompts.
# Each gate is its own step so a red status tells you which budget
# regressed.
#
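# A sketch of the workflow body, which this fragment omits. The latency
# gate (section 3) and categorical gate (section 5) are named by the
# post; the other two gate names, the path filters, and the script
# locations are placeholders. Cloud credentials setup is elided.
name: evaluate-thresholds
on:
  pull_request:
    paths:
      - "agent/**"     # placeholder: your agent code
      - "prompts/**"   # placeholder: your prompt files
jobs:
  gates:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install SDK
        # Assumes the SDK installs from PyPI under this name.
        run: pip install "bigquery-agent-analytics>=0.2.2"
      - name: Latency gate
        run: ./scripts/latency_gate.sh   # the section 3 hero command below
      - name: Categorical gate
        run: |
          categorical-eval --exit-code \
            --pass-category resolved --min-pass-rate 0.95
      - name: Cost gate          # placeholder name for the post's third gate
        run: ./scripts/cost_gate.sh
      - name: Error-rate gate    # placeholder name for the post's fourth gate
        run: ./scripts/error_rate_gate.sh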
#!/usr/bin/env bash
# Section 3 hero command from Medium post #2:
# "Your Agent Events Table Is Also a Test Suite"
#
# Runs the deterministic latency gate against the last 24 hours of
# production traces. Exit 0 = all sessions within budget; exit 1 =
# at least one session regressed; exit 2 = configuration error.
#
# The SDK's `evaluate --exit-code` path also prints one readable
# FAIL session=... observed=... budget=... line on stderr per failing
# session.
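#
# A sketch of the command body, which this fragment truncates. The
# header above only confirms `evaluate --exit-code`; the window, table,
# and budget flags below are illustrative assumptions, not documented
# CLI options.
set -euo pipefail

evaluate \
  --exit-code \
  --window 24h \
  --events-table "my_project.agent_ops.events" \
  --latency-budget-ms 2000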
| { | |
| "metadata": { | |
| "name": "", | |
| "signature": "sha256:a04c38d9604adb7eb9ca89860dfa1ef72db66037cc2c07c391ef8e67a31f9254" | |
| }, | |
| "nbformat": 3, | |
| "nbformat_minor": 0, | |
| "worksheets": [ | |
| { |