@jasonho-lynx
Created February 22, 2023 07:44
Formats SQL files in a repo using the clickhouse-format package.
"""Formats SQL files in this repo using the clickhouse-format package. Prints a list of files that
require manual formatting e.g. those with %()s params or queries that use backticks for column
names.
"""
import subprocess
from glob import glob
sql_files = glob("**/*.sql", recursive=True)
print("Files that require manual formatting:")
for sql_file in sql_files:
    with open(sql_file, "r+") as f:
        # read query from file
        query = f.read()
        # run the clickhouse-format command to format the SQL query
        # installation: https://clickhouse.com/docs/en/install/#install-from-deb-packages
        # clickhouse-format is within the clickhouse-common-static package
        process = subprocess.run(
            ["clickhouse-format", "-"],
            input=query.encode(),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        # get the formatted SQL query from the output.
        # if there are syntax errors, this will be an empty string
        formatted_query = process.stdout.decode()
        if formatted_query == "":
            print(sql_file)
            continue
        # if no errors, overwrite the file with the formatted query.
        # go back to the start of the file
        f.seek(0)
        # write the new query
        f.write(formatted_query)
        # truncate any old data that comes after the new query
        f.truncate()
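
The script above treats empty output as the only sign of failure and discards stderr. A minimal sketch of a helper that also checks the exit status and returns the error text, so the "manual formatting" list can say why a file was skipped. The name `run_formatter` and its `cmd` parameter are hypothetical and not part of the original gist. It assumes the formatter reads the query on stdin and signals failure with a non-zero exit status or empty output. The stand-in command in the demo exists only so the sketch runs without ClickHouse installed.

```python
import subprocess
import sys
from typing import Optional, Sequence, Tuple


def run_formatter(
    query: str, cmd: Sequence[str] = ("clickhouse-format",)
) -> Tuple[Optional[str], str]:
    """Pipe `query` to `cmd` on stdin and return (formatted, error_message).

    `formatted` is None when the command exits non-zero or prints nothing;
    in that case `error_message` holds whatever the command wrote to stderr.
    """
    process = subprocess.run(
        list(cmd),
        input=query.encode(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    formatted = process.stdout.decode()
    if process.returncode != 0 or formatted == "":
        return None, process.stderr.decode().strip()
    return formatted, ""


if __name__ == "__main__":
    # Stand-in "formatter" (assumption, for demonstration only): upper-cases stdin.
    fake_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read().upper())",
    ]
    formatted, error = run_formatter("select 1", fake_cmd)
    print(repr(formatted), repr(error))
```

In the loop, the call to `subprocess.run` could be swapped for `run_formatter(query)`, printing `sql_file` together with the returned error message whenever the first element is `None`.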