Auto instrumentation is a mechanism to produce the telemetry data of an uninstrumented application without modifying the original application code itself. It relies on patching libraries utilized by the application and running the application via a command line script:
auto-instrumentation-command python3 uninstrumented_program.py
When uninstrumented_program.py is run this way, it produces the same results as if it had been instrumented beforehand. The practical benefit of auto instrumentation is that it saves the end user the time and effort of instrumenting existing code.
To make this clearer, here is a brief example. The complete example can be found here. This example has 2 services running: formatter and publisher. We have a script in hello.py that communicates with these services.
               ------------------         ------------------
               |                |         |                |
hello.py ----> | formatter:8081 | <-----> | publisher:8082 |
               |                |         |                |
               ------------------         ------------------
The publisher comes in 2 flavors: instrumented and uninstrumented. The most relevant parts of these components are shown below:
# formatter
@app.route("/format_request")
def format_request():
    with tracer.start_as_current_span(
        "format_request",
        parent=propagators.extract(get_as_list, request.headers),
    ):
        hello_to = request.args.get("helloTo")
        return "Hello, %s!" % hello_to

# publisher_instrumented.py
@app.route("/publish_request")
def publish_request():
    with tracer.start_as_current_span(
        "publish_request", propagators.extract(get_as_list, request.headers)
    ):
        hello_str = request.args.get("helloStr")
        print(hello_str)
        return "published"

# publisher_uninstrumented.py
@app.route("/publish_request")
def publish_request():
    hello_str = request.args.get("helloStr")
    print(hello_str)
    return "published"

# hello.py
with tracer.start_as_current_span("hello") as hello_span:
    with tracer.start_as_current_span("hello-format", parent=hello_span):
        hello_str = http_get(8081, "format_request", "helloTo", hello_to)
    with tracer.start_as_current_span("hello-publish", parent=hello_span):
        http_get(8082, "publish_request", "helloStr", hello_str)
The instrumented publisher is first run with python3 publisher_instrumented.py. It produces output similar to the following when the "hello" script, hello.py, is run:
Hello, testing!
Span(name="publish", context=SpanContext(trace_id=0xd18be4c644d3be57a8623bbdbdbcef76, span_id=0x6162c475bab8d365, trace_state={}), kind=SpanKind.SERVER, parent=SpanContext(trace_id=0xd18be4c644d3be57a8623bbdbdbcef76, span_id=0xdafb264c5b1b6ed0, trace_state={}), start_time=2019-12-19T01:11:12.172866Z, end_time=2019-12-19T01:11:12.173383Z)
127.0.0.1 - - [18/Dec/2019 19:11:12] "GET /publish?helloStr=Hello%2C+testing%21 HTTP/1.1" 200 -
The uninstrumented publisher is now run with opentelemetry-auto-instrument python3 publisher_uninstrumented.py. Again, it produces output similar to the following when the "hello" script, hello.py, is run:
Hello, testing!
Span(name="publish", context=SpanContext(trace_id=0xd18be4c644d3be57a8623bbdbdbcef76, span_id=0x6162c475bab8d365, trace_state={}), kind=SpanKind.SERVER, parent=SpanContext(trace_id=0xd18be4c644d3be57a8623bbdbdbcef76, span_id=0xdafb264c5b1b6ed0, trace_state={}), start_time=2019-12-19T01:11:12.172866Z, end_time=2019-12-19T01:11:12.173383Z)
127.0.0.1 - - [18/Dec/2019 19:11:12] "GET /publish?helloStr=Hello%2C+testing%21 HTTP/1.1" 200 -
As you can see, both outputs are very similar, which shows that, in this example, auto instrumentation produces the same telemetry as manual instrumentation.
The Python auto instrumentation mechanism consists of the command line interface, entry points and the patchers.
The command line interface provides the opentelemetry-auto-instrument command. That command is defined in a specific entry point named console_scripts (more on entry points later). This entry point is currently part of the API package [1], and when the API package is installed, the command is available in the console. This command is implemented as a Python function that executes when the opentelemetry-auto-instrument script is run. Here is the aforementioned function:
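from os import environ, execl, pathsep
from os.path import dirname
from shutil import which
from sys import argv


def run() -> None:
    bootstrap_dir = dirname(__file__)
    python_path = environ.get("PYTHONPATH", None)
    # Add our bootstrap directory to the head of $PYTHONPATH to ensure
    # it is loaded before program code (entries are separated by os.pathsep)
    if python_path is not None:
        environ["PYTHONPATH"] = pathsep.join([bootstrap_dir, python_path])
    else:
        environ["PYTHONPATH"] = bootstrap_dir
    # argv[1] is the interpreter to run, argv[2:] are its arguments
    python3 = which(argv[1])
    execl(python3, python3, *argv[2:])  # type: ignore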
This function does 2 important things:
- Adds the path to a directory to the environment variable PYTHONPATH (more on this in the next section).
- Runs the given Python interpreter with the arguments passed to the script. For example, opentelemetry-auto-instrument python3 publisher_uninstrumented.py will call python3 publisher_uninstrumented.py.
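The first of these two steps can be sketched in isolation. The helper below is hypothetical (its name and signature are not part of the OpenTelemetry code) and works on a plain dictionary instead of the real environment; it only assumes that PYTHONPATH entries are separated by os.pathsep.

```python
import os


def prepend_to_pythonpath(directory, env):
    """Return a copy of env with directory at the head of PYTHONPATH.

    Hypothetical helper, for illustration only.
    """
    updated = dict(env)
    current = updated.get("PYTHONPATH")
    if current:
        # Entries are separated by os.pathsep (":" on POSIX, ";" on Windows).
        updated["PYTHONPATH"] = os.pathsep.join([directory, current])
    else:
        updated["PYTHONPATH"] = directory
    return updated


if __name__ == "__main__":
    print(prepend_to_pythonpath("/opt/bootstrap", {"PYTHONPATH": "/srv/app"}))
```

Putting the directory first matters because Python searches the PYTHONPATH entries in order, so modules found there take precedence over modules with the same name in later entries.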
The function called by the script inserts the auto_instrument directory at the beginning of the PYTHONPATH environment variable. This directory contains 3 files:
- __init__.py (irrelevant at this moment)
- auto_instrument.py (which holds the previous function)
- sitecustomize.py (which is now the relevant file)
The sitecustomize.py file is executed before Python begins to execute publisher_uninstrumented.py. This is a mechanism provided by site, a module in the Python Standard Library. It allows the auto instrumentation mechanism to tap into the Python execution order and run code before anything else.
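The effect of this mechanism can be observed with a small experiment. The sketch below is not part of the OpenTelemetry code: it writes a throwaway sitecustomize.py into a temporary directory (a stand-in for the bootstrap directory), puts that directory at the head of PYTHONPATH and starts a child interpreter that runs a one-line program. The function name and the printed messages are made up for the illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_with_sitecustomize():
    """Run a tiny program with a temporary sitecustomize.py on PYTHONPATH."""
    with tempfile.TemporaryDirectory() as bootstrap_dir:
        # Stand-in for the bootstrap directory added by the command.
        path = os.path.join(bootstrap_dir, "sitecustomize.py")
        with open(path, "w") as file:
            file.write('print("sitecustomize ran")\n')

        env = dict(os.environ)
        previous = env.get("PYTHONPATH")
        env["PYTHONPATH"] = (
            os.pathsep.join([bootstrap_dir, previous])
            if previous
            else bootstrap_dir
        )

        # Stand-in for the uninstrumented program.
        result = subprocess.run(
            [sys.executable, "-c", 'print("program ran")'],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(run_with_sitecustomize())
```

In the returned list, the line written by sitecustomize.py is expected to appear before the line written by the program, since the site module imports sitecustomize during interpreter startup (unless the interpreter is started with options that disable site).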
The code in this file runs before the uninstrumented code. Before continuing, let's first have a short explanation of entry points.
We have mentioned entry points before in this document. Nevertheless, we need to explain them better now.
Python provides a standard system to install packages, similar to how other languages do. A Python package may define an entry point, and the package itself or other packages can implement it. For example, the API package defines the entry point opentelemetry_patcher here. As you can see, an entry point is simply a string. The opentelemetry_patcher entry point is implemented by another package, the opentelemetry-ext-flask package [2], here. Here is the implementation of the entry point:
entry_points={
    "opentelemetry_patcher": [
        "flask = opentelemetry.ext.flask:FlaskPatcher"
    ]
},
This implementation of the entry point is named flask and it is just a path to a Python object, in this case a class [3], FlaskPatcher.
When a package is installed, its entry point implementations are registered against the definition of the entry points. Once this is done, the entry points library allows the user to load the objects pointed to by the paths in the different implementations of an entry point, along with their names. For example, when the entry point implementation named flask is loaded, it will return the FlaskPatcher class.
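To make the idea of "a path to a Python object" concrete, here is a minimal sketch of how such a path can be resolved. It is a simplified stand-in, not the entry points library itself: the REGISTRY dictionary, the demo_patcher group and both function names are hypothetical, and a class from the standard library is used as the target so that the example runs without any extra package installed.

```python
from importlib import import_module


def load_object(path):
    """Resolve a "module:attribute" string to the object it names."""
    module_name, _, attribute_path = path.partition(":")
    target = import_module(module_name)
    if attribute_path:
        # The attribute part may be dotted, for example "Class.method".
        for attribute in attribute_path.split("."):
            target = getattr(target, attribute)
    return target


# Hypothetical registry: entry point -> list of "name = path" strings,
# mirroring the layout of the entry_points argument shown above.
REGISTRY = {"demo_patcher": ["ordered = collections:OrderedDict"]}


def iter_loaded(group):
    """Yield (name, object) pairs for every implementation in a group."""
    for line in REGISTRY.get(group, []):
        name, _, path = (part.strip() for part in line.partition("="))
        yield name, load_object(path)


if __name__ == "__main__":
    for name, loaded in iter_loaded("demo_patcher"):
        print(name, loaded)
```

Note that loading returns the class itself and not an instance; creating an instance is left to the caller.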
Ok, back to the sitecustomize.py file. Here is its relevant content:
for entry_point in iter_entry_points("opentelemetry_patcher"):
    try:
        entry_point.load()().patch()  # type: ignore
        _LOG.debug("Patched %s", entry_point.name)
    except Exception:  # pylint: disable=broad-except
        _LOG.exception("Patching of %s failed", entry_point.name)
The code in this file iterates through all the entry point implementations registered against the opentelemetry_patcher entry point and calls the load function on each of them. When load is called, it returns the object pointed to by the entry point implementation path. In our example, load returns the FlaskPatcher class when the loop reaches the flask entry point.
The most important line here is this one:
entry_point.load()().patch()
When it is the turn of the flask entry point, load() returns the FlaskPatcher class, so the line becomes:
FlaskPatcher().patch()
Since we now have a pair of parentheses to the right of FlaskPatcher, the class gets instantiated into a FlaskPatcher object:
flask_patcher_object.patch()
Finally, the patch method is called.
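The same chain can be written one step at a time. The sketch below uses hypothetical stand-ins (FakeEntryPoint, GoodPatcher and BrokenPatcher are invented for this illustration and do not exist in any package) to show the three steps separately and to show that a patcher that fails does not prevent the remaining ones from being applied.

```python
import logging

_LOG = logging.getLogger(__name__)

PATCHED = []  # records which stand-in patchers ran successfully


class GoodPatcher:
    def patch(self):
        PATCHED.append("good")


class BrokenPatcher:
    def patch(self):
        raise RuntimeError("framework not available")


class FakeEntryPoint:
    """Hypothetical stand-in for an entry point implementation."""

    def __init__(self, name, target):
        self.name = name
        self._target = target

    def load(self):
        # Returns the class itself, not an instance.
        return self._target


def patch_all(entry_points):
    for entry_point in entry_points:
        try:
            patcher_class = entry_point.load()  # step 1: get the class
            patcher = patcher_class()  # step 2: instantiate it
            patcher.patch()  # step 3: apply the patch
        except Exception:  # pylint: disable=broad-except
            _LOG.exception("Patching of %s failed", entry_point.name)


if __name__ == "__main__":
    patch_all(
        [
            FakeEntryPoint("broken", BrokenPatcher),
            FakeEntryPoint("good", GoodPatcher),
        ]
    )
    print(PATCHED)
```

Catching a broad exception inside the loop is a deliberate trade-off in this kind of loader: a single faulty patcher is logged and skipped instead of stopping the program that is being instrumented.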
The idea of using entry points as explained in this section is that patchers can be added dynamically, that is, without modifying any code in the core auto instrumentation system. Each patcher comes in its own package, and installing that package is enough for the patcher to be loaded; there is no need to touch the loader code every time a new patcher is added. This is the standard way of doing this in Python, and it is a powerful mechanism for keeping the different components of the system cleanly separated.
In the previous section we saw how the patch method is called for every opentelemetry_patcher entry point implementation (in other words, for every patcher object). Let's take a look at what a patcher is.
Every patcher (just like FlaskPatcher before) is an object of a child class of BasePatcher, which can be found here.
This class is simply an interface (or in Python terms, an Abstract Base Class, or ABC) that requires its children to define patch and unpatch methods:
from abc import ABC, abstractmethod


class BasePatcher(ABC):
    """An ABC for patchers"""

    @abstractmethod
    def patch(self) -> None:
        """Patch"""

    @abstractmethod
    def unpatch(self) -> None:
        """Unpatch"""
This base class exists in the API package and serves as an interface for all the patchers, like FlaskPatcher before.
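What "requires" means in practice can be checked with a short experiment. The sketch below repeats the base class so that it is self-contained; IncompletePatcher, NoOpPatcher and can_instantiate are hypothetical names used only for this illustration.

```python
from abc import ABC, abstractmethod


class BasePatcher(ABC):
    """An ABC for patchers"""

    @abstractmethod
    def patch(self) -> None:
        """Patch"""

    @abstractmethod
    def unpatch(self) -> None:
        """Unpatch"""


class IncompletePatcher(BasePatcher):
    """Defines patch only, so it remains abstract."""

    def patch(self) -> None:
        pass


class NoOpPatcher(BasePatcher):
    """Defines both methods, so it can be instantiated."""

    def patch(self) -> None:
        pass

    def unpatch(self) -> None:
        pass


def can_instantiate(patcher_class):
    """Return True if the class can be instantiated without arguments."""
    try:
        patcher_class()
    except TypeError:
        # Raised by Python when abstract methods are left undefined.
        return False
    return True


if __name__ == "__main__":
    for cls in (BasePatcher, IncompletePatcher, NoOpPatcher):
        print(cls.__name__, can_instantiate(cls))
```

The check happens at instantiation time, not when the class is defined, so an incomplete patcher would only fail when the loader tries to create the object.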
Now, patchers are expected to do monkey patching on their respective frameworks. The FlaskPatcher basically replaces the flask.Flask class with another:
class _PatchedFlask(flask.Flask):
    ...


class FlaskPatcher(BasePatcher):
    ...

    def _patch(self):
        self._original_flask = flask.Flask
        flask.Flask = _PatchedFlask

    ...
So, when the FlaskPatcher.patch method is called after the entry point implementation is loaded, the flask.Flask class is replaced with _PatchedFlask. Every patcher implements patching in a different way because of the differences between their corresponding frameworks. The actual instrumentation is done there, for example, by modifying functions in these frameworks so that they are enclosed in OpenTelemetry spans.
Keep in mind that all this happens before the code in publisher_uninstrumented.py gets executed. By the time it does, flask.Flask has already been replaced with another class that does the instrumentation, so our uninstrumented code is now instrumented without having been modified.
So far, most of what has been explained here (with the exception of the opentelemetry-ext-flask package) can be placed in the opentelemetry-python repo. Several patcher packages already exist, provided by DataDog and SignalFx, that perform patching for different frameworks like Django, Requests, gRPC, PyMongo, etc. These components apparently provide an interface similar to the one explained here, with patch or instrument methods that do the actual patching. It should be relatively straightforward (although there will probably be a lot of details to consider) to incorporate this code into classes that have patch and unpatch methods instead.
Both DataDog and SignalFx have also implemented an auto instrumentation mechanism that works in a very similar way to the one explained before: both provide a command through the console_scripts entry point, and both use sitecustomize.py to tap into the execution order. The DataDog repo is much larger and also includes a lot of code to handle other things, like threading considerations.
Provided there is agreement on this approach, relevant code could be ported from both repos (and other community sources) on a feature by feature basis.
[1] The location of the console_scripts entry point for this command is expected to change and to be placed in an auto-instrumentation-specific package soon.
[2] The opentelemetry-python repo contains several Python packages: the opentelemetry-api package, the opentelemetry-sdk package, etc.
[3] Everything is an object in Python, even classes.