Skip to content

Instantly share code, notes, and snippets.

@joshschmelzle
Forked from nymous/README.md
Created March 1, 2023 21:03
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save joshschmelzle/e44b4476261d3bc641ceda89aa29f00d to your computer and use it in GitHub Desktop.
Save joshschmelzle/e44b4476261d3bc641ceda89aa29f00d to your computer and use it in GitHub Desktop.
Logging setup for FastAPI, Uvicorn and Structlog (with Datadog integration)

Logging setup for FastAPI

This logging setup configures Structlog to output pretty logs in development, and JSON log lines in production.

Then, you can use Structlog loggers or standard logging loggers, and they both will be processed by the Structlog pipeline (see the hello() endpoint for reference). That way any log generated by your dependencies will also be processed and enriched, even if they know nothing about Structlog!

Requests are assigned a correlation ID with the asgi-correlation-id middleware (either captured from incoming request or generated on the fly). All logs are linked to the correlation ID, and to the Datadog trace/span if instrumented. This data "global to the request" is stored in context vars, and automatically added to all logs produced during the request thanks to Structlog. You can add to these "global local variables" at any point in an endpoint with structlog.contextvars.bind_contextvars(customer_id=123). Any log sent after this call will contain this attribute, whether it comes from structlog or logging.

The duration of the request is also captured, and all technical information about the request (URL, method, client IP/port, status code) are logged as attributes.

In production, Uvicorn startup logs are silenced because they are printed before we have time to configure JSON logging (and log ingestion systems don't like unformatted logs 😅). Uvicorn standard access logs are disabled, and reimplemented using Structlog (to be formatted in JSON or in the console).

How to run

Install dependencies with Poetry.

Run in development:

uvicorn main:app --reload

and in production (to log in JSON format and instrument the code with Datadog):

LOG_JSON_FORMAT=true DD_TRACE_ENABLED=true ddtrace-run uvicorn main:app --host 0.0.0.0 --log-config uvicorn_disable_logging.json

Note

I will add my references and inspirations soon.

The "duration" logged and returned in the header does not encompass the full duration of the request. The code could be moved to a higher middleware running earlier in the chain, to also capture the duration of "inner" middlewares such as the correlation-id middleware, or others...

If you install Rich or better-exceptions, you will automagically get pretty formatted exception with context, variable values... Try it and see if you like it

There are some TODOs left in the code, you can suggest improvements in the comments 😄 The Datadog middleware is especially horrible, there must be a better way to add it manually. I also need to open an issue in Starlette comparing in which order middlewares are activated in different languages and frameworks, to try and make it more consistent with other ecosystems.

License

This code is available under the MIT license.

import logging
import sys
import structlog
from ddtrace import tracer
from structlog.types import EventDict, Processor
# https://github.com/hynek/structlog/issues/35#issuecomment-591321744
def rename_event_key(_, __, event_dict: EventDict) -> EventDict:
"""
Log entries keep the text message in the `event` field, but Datadog
uses the `message` field. This processor moves the value from one field to
the other.
See https://github.com/hynek/structlog/issues/35#issuecomment-591321744
"""
event_dict["message"] = event_dict.pop("event")
return event_dict
def drop_color_message_key(_, __, event_dict: EventDict) -> EventDict:
"""
Uvicorn logs the message a second time in the extra `color_message`, but we don't
need it. This processor drops the key from the event dict if it exists.
"""
event_dict.pop("color_message", None)
return event_dict
def tracer_injection(_, __, event_dict: EventDict) -> EventDict:
# get correlation ids from current tracer context
span = tracer.current_span()
trace_id, span_id = (span.trace_id, span.span_id) if span else (None, None)
# add ids to structlog event dictionary
event_dict["dd.trace_id"] = str(trace_id or 0)
event_dict["dd.span_id"] = str(span_id or 0)
return event_dict
def setup_logging(json_logs: bool = False, log_level: str = "INFO"):
timestamper = structlog.processors.TimeStamper(fmt="iso")
shared_processors: list[Processor] = [
structlog.contextvars.merge_contextvars,
structlog.stdlib.add_logger_name,
structlog.stdlib.add_log_level,
structlog.stdlib.PositionalArgumentsFormatter(),
structlog.stdlib.ExtraAdder(),
drop_color_message_key,
tracer_injection,
timestamper,
structlog.processors.StackInfoRenderer(),
]
if json_logs:
# We rename the `event` key to `message` only in JSON logs, as Datadog looks for the
# `message` key but the pretty ConsoleRenderer looks for `event`
shared_processors.append(rename_event_key)
# Format the exception only for JSON logs, as we want to pretty-print them when
# using the ConsoleRenderer
shared_processors.append(structlog.processors.format_exc_info)
structlog.configure(
processors=shared_processors
+ [
# Prepare event dict for `ProcessorFormatter`.
structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
],
logger_factory=structlog.stdlib.LoggerFactory(),
cache_logger_on_first_use=True,
)
log_renderer: structlog.types.Processor
if json_logs:
log_renderer = structlog.processors.JSONRenderer()
else:
log_renderer = structlog.dev.ConsoleRenderer()
formatter = structlog.stdlib.ProcessorFormatter(
# These run ONLY on `logging` entries that do NOT originate within
# structlog.
foreign_pre_chain=shared_processors,
# These run on ALL entries after the pre_chain is done.
processors=[
# Remove _record & _from_structlog.
structlog.stdlib.ProcessorFormatter.remove_processors_meta,
log_renderer,
],
)
handler = logging.StreamHandler()
# Use OUR `ProcessorFormatter` to format all `logging` entries.
handler.setFormatter(formatter)
root_logger = logging.getLogger()
root_logger.addHandler(handler)
root_logger.setLevel(log_level.upper())
for _log in ["uvicorn", "uvicorn.error"]:
# Clear the log handlers for uvicorn loggers, and enable propagation
# so the messages are caught by our root logger and formatted correctly
# by structlog
logging.getLogger(_log).handlers.clear()
logging.getLogger(_log).propagate = True
# Since we re-create the access logs ourselves, to add all information
# in the structured log (see the `logging_middleware` in main.py), we clear
# the handlers and prevent the logs to propagate to a logger higher up in the
# hierarchy (effectively rendering them silent).
logging.getLogger("uvicorn.access").handlers.clear()
logging.getLogger("uvicorn.access").propagate = False
def handle_exception(exc_type, exc_value, exc_traceback):
"""
Log any uncaught exception instead of letting it be printed by Python
(but leave KeyboardInterrupt untouched to allow users to Ctrl+C to stop)
See https://stackoverflow.com/a/16993115/3641865
"""
if issubclass(exc_type, KeyboardInterrupt):
sys.__excepthook__(exc_type, exc_value, exc_traceback)
return
root_logger.error(
"Uncaught exception", exc_info=(exc_type, exc_value, exc_traceback)
)
sys.excepthook = handle_exception
import os
import time
# This line needs to be run before any `ddtrace` import, to avoid sending traces
# in local dev environment (we don't have a Datadog agent configured locally, so
# it prints a stacktrace every time it tries to send a trace)
# TODO: Find a better way to activate Datadog traces?
os.environ["DD_TRACE_ENABLED"] = os.getenv("DD_TRACE_ENABLED", "false") # noqa
import structlog
import uvicorn
from asgi_correlation_id import CorrelationIdMiddleware
from asgi_correlation_id.context import correlation_id
from ddtrace.contrib.asgi.middleware import TraceMiddleware
from fastapi import FastAPI, Request, Response
from pydantic import parse_obj_as
from uvicorn.protocols.utils import get_path_with_query_string
from custom_logging import setup_logging
LOG_JSON_FORMAT = parse_obj_as(bool, os.getenv("LOG_JSON_FORMAT", False))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
setup_logging(json_logs=LOG_JSON_FORMAT, log_level=LOG_LEVEL)
access_logger = structlog.stdlib.get_logger("api.access")
app = FastAPI(title="Example API", version="1.0.0")
@app.middleware("http")
async def logging_middleware(request: Request, call_next) -> Response:
structlog.contextvars.clear_contextvars()
# These context vars will be added to all log entries emitted during the request
request_id = correlation_id.get()
structlog.contextvars.bind_contextvars(request_id=request_id)
start_time = time.perf_counter_ns()
# If the call_next raises an error, we still want to return our own 500 response,
# so we can add headers to it (process time, request ID...)
response = Response(status_code=500)
try:
response = await call_next(request)
except Exception:
# TODO: Validate that we don't swallow exceptions (unit test?)
structlog.stdlib.get_logger("api.error").exception("Uncaught exception")
raise
finally:
process_time = time.perf_counter_ns() - start_time
status_code = response.status_code
url = get_path_with_query_string(request.scope)
client_host = request.client.host
client_port = request.client.port
http_method = request.method
http_version = request.scope["http_version"]
# Recreate the Uvicorn access log format, but add all parameters as structured information
access_logger.info(
f"""{client_host}:{client_port} - "{http_method} {url} HTTP/{http_version}" {status_code}""",
http={
"url": str(request.url),
"status_code": status_code,
"method": http_method,
"request_id": request_id,
"version": http_version,
},
network={"client": {"ip": client_host, "port": client_port}},
duration=process_time,
)
response.headers["X-Process-Time"] = str(process_time / 10 ** 9)
return response
# This middleware must be placed after the logging, to populate the context with the request ID
# NOTE: Why last??
# Answer: middlewares are applied in the reverse order of when they are added (you can verify this
# by debugging `app.middleware_stack` and recursively drilling down the `app` property).
app.add_middleware(CorrelationIdMiddleware)
# UGLY HACK
# Datadog's `TraceMiddleware` is applied as the very first middleware in the list, by patching `FastAPI` constructor.
# Unfortunately that means that it is the innermost middleware, so the trace/span are created last in the middleware
# chain. Because we want to add the trace_id/span_id in the access log, we need to extract it from the middleware list,
# put it back as the outermost middleware, and rebuild the middleware stack.
# TODO: Open an issue in dd-trace-py to ask if it can change its initialization, or if there is an easy way to add the
# middleware manually, so we can add it later in the chain and have it be the outermost middleware.
# TODO: Open an issue in Starlette to better explain the order of middlewares
tracing_middleware = next(
(m for m in app.user_middleware if m.cls == TraceMiddleware), None
)
if tracing_middleware is not None:
app.user_middleware = [m for m in app.user_middleware if m.cls != TraceMiddleware]
structlog.stdlib.get_logger("api.datadog_patch").info(
"Patching Datadog tracing middleware to be the outermost middleware..."
)
app.user_middleware.insert(0, tracing_middleware)
app.middleware_stack = app.build_middleware_stack()
@app.get("/")
def hello():
custom_structlog_logger = structlog.stdlib.get_logger("my.structlog.logger")
custom_structlog_logger.info("This is an info message from Structlog")
custom_structlog_logger.warning("This is a warning message from Structlog, with attributes", an_extra="attribute")
custom_structlog_logger.error("This is an error message from Structlog")
custom_logging_logger = logging.getLogger("my.logging.logger")
custom_logging_logger.info("This is an info message from standard logger")
custom_logging_logger.warning("This is a warning message from standard logger, with attributes", extra={"another_extra": "attribute"})
return "Hello, World!"
if __name__ == "__main__":
uvicorn.run(app, host="127.0.0.1", port=8000, log_config=None)
[tool.poetry]
name = "example-api"
version = "0.1.0"
description = ""
authors = ["Nymous"]
[tool.poetry.dependencies]
python = "^3.10"
fastapi = "^0.78.0"
uvicorn = "^0.18.1"
ddtrace = "^1.2.1"
structlog = "^21.5.0"
asgi-correlation-id = "^3.0.0"
[tool.poetry.dev-dependencies]
black = "^22.3.0"
isort = "^5.10.1"
mypy = "^0.961"
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
{
"version": 1,
"disable_existing_loggers": false,
"formatters": {
"default": {
"()": "uvicorn.logging.DefaultFormatter",
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
},
"access": {
"()": "uvicorn.logging.AccessFormatter",
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
}
},
"handlers": {
"default": {
"formatter": "default",
"class": "logging.NullHandler"
},
"access": {
"formatter": "access",
"class": "logging.NullHandler"
}
},
"loggers": {
"uvicorn.error": {
"level": "INFO",
"handlers": [
"default"
],
"propagate": false
},
"uvicorn.access": {
"level": "INFO",
"handlers": [
"access"
],
"propagate": false
}
}
}
MIT License
Copyright (c) 2023 Thomas GAUDIN
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment