Compare BigQuery job costs when running a job on either BigQuery Editions with the autoscaler or on-demand with both new and old pricing models.
/*
* This query will look at the past 30 days of job history to analyze it for costs under
* BigQuery Editions while utilizing the new autoscaling feature that was introduced.
* It does this for those using both PAYG (Pay As You Go) and commitment models.
* It will also compare this versus running the query with the on-demand model.
*
* Note that this query models several behaviors of the BigQuery autoscaler.
* Namely, these are the up-to-10-second "slot scale up time," the minimum
* 60-second "slot scale down time," and the behavior that the autoscaler
* scales up and down in increments of 100 slots for each job.
* Note that slots are billed during those scale-up and scale-down periods.
*
* The costs generated by this are strictly an estimate and for Editions
* should be considered a minimum cost of the query as the autoscaler
* may not scale "evenly" and may change scaling during runtime.
*/
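/*
* Worked example (illustrative numbers only, not taken from any real job history):
* a job that consumed 18,000,000 slot ms (18,000 slot seconds) over a 50 second
* runtime gets an autoscaled duration of 50 + 60 + 10 = 120 s and an average of
* 18,000 / 120 = 150 slots, which rounds up to 200 autoscaled slots. That is
* 200 * 120 / 3600 ≈ 6.67 slot hours, or roughly $0.40 under Enterprise PAYG
* at $0.06 per slot hour.
*/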
WITH jobs_duration AS
(
  SELECT
    job_id,
    query,
    start_time,
    end_time,
    TIMESTAMP_DIFF(end_time, start_time, SECOND) AS job_duration_s,
    -- Add 60 seconds scale down time + 10 seconds scale up time to duration for autoscaled duration
    TIMESTAMP_DIFF(end_time, start_time, SECOND) + 60 + 10 AS autoscaled_duration_s,
    -- Turn slot ms into slot seconds
    total_slot_ms / 1000 AS total_slot_s
  FROM
    `region-us`.INFORMATION_SCHEMA.JOBS
  WHERE
    creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) AND CURRENT_TIMESTAMP()
    --AND job_id NOT LIKE 'script_%'
    AND total_slot_ms > 0
    AND total_bytes_billed > 0
    AND state = "DONE"
),
jobs_slot_usage AS
(
  SELECT
    jd.job_id,
    jd.query,
    job_duration_s,
    total_bytes_processed / POW(1024, 4) AS total_tib_processed,
    CAST(FLOOR((CEIL(total_slot_s / autoscaled_duration_s) + 99) / 100) * 100 AS INT64) AS autoscaled_slot_usage,
    autoscaled_duration_s,
    -- Slot usage (rounded up to the next 100 slots) * duration in seconds / (60 seconds * 60 seconds) = slot hours used by this job
    CAST(FLOOR((CEIL(total_slot_s / autoscaled_duration_s) + 99) / 100) * 100 AS INT64)
      * (autoscaled_duration_s / (60 * 60))
      AS slot_hour
  FROM
    jobs_duration AS jd JOIN `region-us`.INFORMATION_SCHEMA.JOBS AS jobs
      ON jd.job_id = jobs.job_id
)
SELECT
  job_id,
  query,
  job_duration_s,
  total_tib_processed,
  -- Removed this because it has been a few months since the price change and it isn't very relevant anymore
  --ROUND(total_tib_processed * 5, 2) AS old_on_demand_cost,
  ROUND(total_tib_processed * 6.25, 2) AS on_demand_cost,
  autoscaled_duration_s,
  slot_hour,
  ROUND(slot_hour * 0.04, 2) AS standard_edition_cost,
  ROUND(slot_hour * 0.06, 2) AS enterprise_edition_cost,
  ROUND(slot_hour * 0.1, 2) AS enterprise_plus_edition_cost,
  ROUND(slot_hour * 0.048, 2) AS enterprise_edition_1yr_commit_cost,
  ROUND(slot_hour * 0.036, 2) AS enterprise_edition_3yr_commit_cost,
  ROUND(slot_hour * 0.08, 2) AS enterprise_plus_edition_1yr_commit_cost,
  ROUND(slot_hour * 0.06, 2) AS enterprise_plus_edition_3yr_commit_cost
FROM
  jobs_slot_usage;
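If you would rather see one 30-day total per pricing model instead of per-job figures, a minimal follow-up (a sketch only; it assumes the final SELECT above has been wrapped into one more CTE, named job_costs here purely for illustration) is to aggregate the estimates:

-- Sketch: total the per-job estimates. job_costs is a placeholder name for a CTE
-- wrapping the final SELECT of the query above.
SELECT
  ROUND(SUM(on_demand_cost), 2)               AS on_demand_total,
  ROUND(SUM(standard_edition_cost), 2)        AS standard_edition_total,
  ROUND(SUM(enterprise_edition_cost), 2)      AS enterprise_edition_total,
  ROUND(SUM(enterprise_plus_edition_cost), 2) AS enterprise_plus_edition_total
FROM
  job_costs;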
@zendesk-sova
Great stuff. However, I should add that after BQ has scaled up it can reuse the already-available slots for subsequent jobs without additional penalty, which avoids repeated cool-down charges. I tried this query and it predicted double what we actually spend on Enterprise Edition.
Overall I like this approach, but the 70 s addition for each job is too general, and every practitioner will need to tune this coefficient to get results that better match their own usage patterns.
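One way to make that padding tunable (a sketch against the query above, not something in the original gist) is to declare it as a script variable and adjust it until the estimates line up with your actual bill:

-- Sketch: turn the hard-coded 60 s + 10 s padding into a tunable parameter.
DECLARE scaling_padding_s INT64 DEFAULT 70;

-- ...then inside jobs_duration, replace the +60+10 with:
-- TIMESTAMP_DIFF(end_time, start_time, SECOND) + scaling_padding_s AS autoscaled_duration_s,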

@sayle-doit (Author) commented May 23, 2023 via email

@hippopotony commented May 26, 2023

For enterprise_edition_1yr_commit_cost, the price is 0.048, not 0.48. I'm not sure how the previous commenter could still use the result. Or perhaps he noticed the same error and used the correct price, but didn't point it out?

@ben-cohen-deel

Hey Sayle, thanks for sharing this, super interesting.

I was calculating this for our team, and we noticed that the actual cost of the reservation can be anywhere from 20% to 200% higher than the slot_ms-based cost, depending heavily on the max_slots chosen.

When we had max_slots = 1600, we found BQ often scaling up to 1600 slots, even though only the 99th percentile of queries requires that much. It appears the autoscaler prioritizes speed.

@sayle-doit (Author)

Unfortunately, that's the default behavior of the autoscaler, and it's why this query has to be framed as an estimate. AFAIK there currently isn't any way to calculate what your actual usage will be, due to how the autoscaler works.

I generally recommend using a very conservative max_slots value, as the autoscaler tends to max out on basic queries; once other queries are running it levels itself out to a better "balanced" number of slots per query.
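For anyone who wants to try that, here is a rough sketch of lowering the ceiling with reservation DDL (run it in the reservation's admin project; the project, region, and reservation names are placeholders, and autoscale_max_slots assumes the option name as documented at the time of writing):

-- Sketch: cap how far the autoscaler is allowed to scale this reservation up.
ALTER RESERVATION `my-admin-project.region-us.my-reservation`
  SET OPTIONS (autoscale_max_slots = 300);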

@nonmanager

Thank you for this, Sayle! Do you know if there is a way to change the autoscaler behavior to make it allocate slots more conservatively?

@sayle-doit (Author)

As far as I know, there aren't currently any methods available to us as users to adjust the autoscaler's behavior. I highly suspect this is coming in the future (maybe as soon as Next in 2 weeks?), but I have not been given any information about it; it's just my conjecture that they will release some "knobs" for end users to adjust how the autoscaler works. Many customers I have worked with have experienced this same behavior and given feedback directly to the teams at Google, which makes me believe they are listening and will adjust appropriately.
