@adrianhall
Last active March 19, 2020 22:41

What I'm trying to do

Write a python program (Python 3.8.2, Visual Studio Code insiders + insiders plugin, Windows 10) to upload a python file to a new storage blob, then set up distributed tracing and logging.

Starting application

This should be fairly basic.

  • pip install --pre azure.identity
  • pip install azure.storage.blob
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

storage_url="https://pythonopentelemetrydemo.blob.core.windows.net/"
container_name="demo"
blob_name="main.py"

# Create a client for the Azure Storage Service
service_client = BlobServiceClient(
  account_url=storage_url,
  credential=DefaultAzureCredential()
)

# Create a client for the named container
container_client = service_client.get_container_client(container_name)
container_client.create_container()

# Create a client for the named blob
blob_client = container_client.get_blob_client(blob_name)
with open(blob_name, "rb") as data:
  blob_client.upload_blob(data)

First error: ClientAuthenticationError - need to provide a token. Not a biggie, since I know we don't support VS Code yet. There is a link provided.

ISSUE The instructions on the azure.identity page do not suggest a mechanism for creating a suitable identity. However, the ref-docs do - we need to ensure we are providing the right information to correct errors.

When reading the azure.identity ref-docs I noticed the exclude_interactive_browser_credential option, so I set that to False. I expected this to start a browser session, but it did not.

ISSUE Setting exclude_interactive_browser_credential=False in the constructor of DefaultAzureCredential did not pop up a browser, which is what I expected it to do.

So, back to the basics. I need to create a service principal, and I know how to do that on the command line.

  • az ad sp create-for-rbac --name https://myapp.azurewebsites.net --skip-assignment
  • $env:AZURE_TENANT_ID="blah"
  • $env:AZURE_CLIENT_ID="blah"
  • $env:AZURE_CLIENT_SECRET="blah"

RECOMMENDATION When discussing the command to create the SP on this page, it would be a good idea to provide Windows Powershell, CMD, bash and csh examples to set the environment variables right there, or a link to the environment variables. I had to hunt for it lower down.
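Before constructing the credential, it can help to confirm that the three variables set above are actually visible to the Python process. A minimal sketch, assuming the environment-based credential reads exactly these three names; the helper name missing_credential_vars is made up for illustration:

```python
import os

# The three variables set on the command line above.
REQUIRED_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_credential_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; pass a plain dict to check other values.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_credential_vars() at the top of main.py and printing the result gives a clearer message than the ClientAuthenticationError when one of the variables was set in a different shell session.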

On the next run, I got a different error:

azure.core.exceptions.HttpResponseError: This request is not authorized to perform this operation.
RequestId:bc39730c-c01e-007f-4d1f-febe46000000
Time:2020-03-19T18:49:45.9485911Z
ErrorCode:AuthorizationFailure
Error:None

ISSUE: There is no "Error" or description on what to do next. I get that it's a permissions error, but I want to understand how to fix the permissions error.
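Since the only actionable parts of that message are the ErrorCode and RequestId lines, pulling them out makes them easier to search for or paste into a support request. A rough sketch that works on the message text shown above; parse_storage_error is a hypothetical helper, not part of the SDK:

```python
def parse_storage_error(message):
    """Split a storage error message into (summary, fields).

    The first non-empty line is the summary; every following "Key:Value"
    line goes into the fields dict, split on the first colon only so that
    timestamps keep their own colons.
    """
    lines = [line.strip() for line in message.splitlines() if line.strip()]
    summary = lines[0] if lines else ""
    fields = {}
    for line in lines[1:]:
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return summary, fields
```

With the error above, fields["ErrorCode"] is AuthorizationFailure, which is a much better search term than the summary sentence.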

I had to resort to Google to fix this one. This led to this page as the first match.

The actual process for assigning RBAC in the storage IAM was relatively cumbersome. I would have liked a link in Azure Identity docs to say "grant permissions to your service principal" with a list of resources and a link to the appropriate docs. Also, I was literally guessing as to what permissions role I needed. Owner didn't work, nor did Managed Application Contributor Role.
So I guess a combination of Owner and Contributor.

That finally worked, but now I've got a ContainerAlreadyExists error. I seem to remember I can do MatchConditions, which is in Azure Core, to get around this. Let's go back to the docs.

ISSUE On the azure.identity page, I clicked on the home link marked Azure SDK for Python. It did not take me anywhere. I expected it to take me to the home page for the entire SDK.

None of the samples, nor the documentation provide an IfMissing sample.

ISSUE If you go to https://azure.github.io/azure-sdk-for-python, then click on Storage > 12.3.0, it will open a new window with the docs. Now scroll down to the bottom of the new page and select Samples to open one of the GitHub pages. Finally, go back to your index page and click on Storage > 12.3.0 again. Note that it switches to the right tab, but does not open the right page.

I think the actual reference is for match=MatchConditions.IfMissing, but I'm not sure.

ISSUE The docs for the **kwargs are missing on azure.github.io/azure-sdk-for-python in the azure-storage-blob pages. At least, I think so: I'm fairly certain we support IfMissing, but I might be wrong.

Instead, I used the ResourceExistsError that is documented. This is, however, documented in the API docs and not in the doc here. Surely, the exception handling is standard operating procedure?

ISSUE Where is ResourceExistsError? It doesn't say. I know it's in azure-core now, but how do I import it? This is not discussed where it is needed.
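For the record, the import that worked for me comes from azure.core.exceptions. A minimal sketch of the create-or-ignore pattern; the fallback class and FakeContainerClient are stand-ins I made up so the snippet can be tried without azure-core or a storage account, and ensure_container is just an illustrative name:

```python
try:
    from azure.core.exceptions import ResourceExistsError
except ImportError:
    # Fallback only so this sketch runs where azure-core is not installed.
    class ResourceExistsError(Exception):
        pass


def ensure_container(container_client):
    """Create the container; return False if it already existed."""
    try:
        container_client.create_container()
        return True
    except ResourceExistsError:
        return False


class FakeContainerClient:
    """Hypothetical stand-in for a container client, for offline testing."""

    def __init__(self, exists):
        self.exists = exists

    def create_container(self):
        if self.exists:
            raise ResourceExistsError("container already exists")
        self.exists = True
```

In the real program, the object passed to ensure_container would be the result of service_client.get_container_client(container_name).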

This moved me on to the upload_blob. I was fully expecting that since I had given my SP contributor access to the whole account, this would be a breeze in terms of permissions. Nope:

azure.core.exceptions.HttpResponseError: This request is not authorized to perform this operation using this permission.
RequestId:cb1687b2-501e-0030-2423-fecf12000000
Time:2020-03-19T19:17:46.5733390Z
ErrorCode:AuthorizationPermissionMismatch
Error:None

Again, no pointers, links, etc. on what I need to do to fix this. My SP is an Owner and a Contributor now. At this point, I'm guessing. Hmmm - scrolling through - Storage Account Contributor. That sounds good. Let's add that. Re-run - nope, not that. "Learn more" gets me in two clicks to this, which is the most useless page I've seen in a long time. But it does give me Storage Blob Data Owner - maybe that! Fortunately, that works.
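To avoid clicking through the portal next time, the same role assignment can be scripted with the az CLI. A sketch that only builds the command line, assuming the az role assignment create command with its --assignee, --role and --scope options; the subscription ID, resource group and client ID passed in are placeholders you would have to supply yourself:

```python
import shlex


def role_assignment_command(assignee, role, subscription_id, resource_group, account):
    """Build (but do not run) an `az role assignment create` command.

    The scope is limited to a single storage account.
    """
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]


def as_shell(command):
    """Render the argument list as a copy-and-paste shell line (Python 3.8+)."""
    return shlex.join(command)
```

Passing role="Storage Blob Data Owner" and account="pythonopentelemetrydemo" would reproduce the assignment that finally worked for me; whether a narrower data role is enough is something I did not test.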

Final issue: Visual Studio Code has yellow squiggly lines under azure.identity, etc., and that's an "unresolved import" error. No idea how to get rid of those, but it isn't hurting my environment right now, so move on.

So I've got a working application

Next, turn on logging. First stop, check the documentation. There is no page on "enabling logging" on this page, but I wrote a blog on this, so let's see how that works. Shockingly, it does! I'm psyched now. However, I got a ResourceExistsError, so want to fix that by allowing overwrites. Let's check the docs for upload_blob again! I actually see the overwrite=True option! (Yes, I'm getting really shocked here)
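For anyone without the blog post to hand, the gist of it is standard library logging pointed at the "azure" logger. A minimal sketch, assuming the SDK logs under logger names that start with azure (which matches the output below); enable_azure_logging is an illustrative name:

```python
import logging
import sys


def enable_azure_logging(level=logging.INFO, stream=None):
    """Attach a stream handler to the "azure" logger and return the logger.

    Child loggers such as azure.core.pipeline.policies.http_logging_policy
    inherit the level and propagate their records to this handler.
    """
    logger = logging.getLogger("azure")
    logger.setLevel(level)
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return logger
```

Call it once before creating the BlobServiceClient. The clients also accept a logging_enable=True keyword argument for more detailed DEBUG-level output, if the logger level is lowered to match.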

ISSUE I saw a lot of REDACTED, which I get. However, I think the number of redactions is way too high.

Example:

INFO:azure.core.pipeline.policies.http_logging_policy:Request URL: 'https://pythonopentelemetrydemo.blob.core.windows.net/demo/main.py'
INFO:azure.core.pipeline.policies.http_logging_policy:Request method: 'PUT'
INFO:azure.core.pipeline.policies.http_logging_policy:Request headers:
INFO:azure.core.pipeline.policies.http_logging_policy:    'Content-Type': 'application/octet-stream'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Content-Length': '1016'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-version': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-blob-type': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-date': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-client-request-id': '5463e063-6a18-11ea-8626-74d02bc692f1'
INFO:azure.core.pipeline.policies.http_logging_policy:    'User-Agent': 'azsdk-python-storage-blob/12.3.0 Python/3.8.2 (Windows-10-10.0.19587-SP0)'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Authorization': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:Response status: 201
INFO:azure.core.pipeline.policies.http_logging_policy:Response headers:
INFO:azure.core.pipeline.policies.http_logging_policy:    'Content-Length': '0'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Content-MD5': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Last-Modified': 'Thu, 19 Mar 2020 19:32:13 GMT'
INFO:azure.core.pipeline.policies.http_logging_policy:    'ETag': '"0x8D7CC3C39E36BB8"'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-request-id': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-client-request-id': '5463e063-6a18-11ea-8626-74d02bc692f1'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-version': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-content-crc64': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'x-ms-request-server-encrypted': 'REDACTED'
INFO:azure.core.pipeline.policies.http_logging_policy:    'Date': 'Thu, 19 Mar 2020 19:32:13 GMT'

Why is the request ID redacted? I need that in the debug logs every time. And the version? date?
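The output is consistent with an allow-list: any header whose name is not on the list gets its value replaced with REDACTED. A small illustration of that idea, which is my own sketch and not the azure-core code; it shows why adding x-ms-request-id and x-ms-version to the list would be enough to fix my complaint:

```python
REDACTED = "REDACTED"


def redact_headers(headers, allowed):
    """Return a copy of `headers` with non-allowed values replaced.

    Header names are compared case-insensitively, as HTTP header names are.
    """
    allowed_lower = {name.lower() for name in allowed}
    return {
        name: (value if name.lower() in allowed_lower else REDACTED)
        for name, value in headers.items()
    }
```

With allowed set to {"Content-Length", "x-ms-request-id"}, an Authorization header would still be redacted while the request ID would come through in the debug logs.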

Time for distributed tracing

The whole point of this, of course, was to turn on distributed tracing. For this, I started here. Probably not the best place.

ISSUE Go to https://azure.github.io/azure-sdk-for-python and search for OpenTelemetry - no matches, even though it's got an explicit package name.

There is no advice here on when to use OpenCensus vs. OpenTelemetry, so I picked one at random. I've got an App Insights app set up already. I followed the plan, but:

from azure.core.settings import settings
from azure.core.tracing.ext.opentelemetry_span import OpenTelemetrySpan
settings.tracing_implementation = OpenTelemetrySpan

Why isn't this something like:

from azure.core.tracing.opentelemetry import OpenTelemetryClient
OpenTelemetryClient.initialize()

Here, the documentation is just plain wrong. It doesn't work. Specifically, I did the following:

  • pip install opentelemetry-azure-monitor-exporter opentelemetry-api opentelemetry-sdk
  • pip install azure-core-tracing-opentelemetry

I then added the code:

# BEGIN-OpenTelemetry
from azure.core.settings import settings
from azure.core.tracing.ext.opentelemetry_span import OpenTelemetrySpan

# Azure Monitor Exporter
from opentelemetry.ext.azure_monitor import AzureMonitorSpanExporter

settings.tracing_implementation = OpenTelemetrySpan
exporter = AzureMonitorSpanExporter(
  instrumentation_key=os.environ['AZURE_MONITOR_INSTRUMENTATION_KEY']
)

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerSource
from opentelemetry.sdk.trace.export import SimpleExportSpanProcessor

trace.set_preferred_tracer_implementation(lambda T: TracerSource())
tracer = trace.get_tracer(__name__)
tracer.tracer_source().add_span_processor(SimpleExportSpanProcessor(exporter))

# END-OpenTelemetry

And finally, added the with tracer.start_as_current_span(name="MyApp") around the client code. I get an error:

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    from opentelemetry.ext.azure_monitor import AzureMonitorSpanExporter
ModuleNotFoundError: No module named 'opentelemetry.ext'

So, I am missing something here. There is a README for the exporter, so I went there... and moved to the code that they provided.

That gets me further:

Traceback (most recent call last):
  File "main.py", line 13, in <module>
    from azure_monitor import AzureMonitorSpanExporter
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\azure_monitor\__init__.py", line 3, in <module>
    from azure_monitor.trace import AzureMonitorSpanExporter
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\azure_monitor\trace.py", line 9, in <module>
    from azure_monitor import protocol, util
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\azure_monitor\util.py", line 10, in <module>
    from opentelemetry.sdk.version import __version__ as opentelemetry_version
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\opentelemetry\sdk\__init__.py", line 19, in <module>
    from . import metrics, trace, util
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\opentelemetry\sdk\metrics\__init__.py", line 20, in <module>
    from opentelemetry.sdk.metrics.export.batcher import UngroupedBatcher
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\opentelemetry\sdk\metrics\export\batcher.py", line 18, in <module>
    from opentelemetry.metrics import Counter, Measure, MetricT, Observer
ImportError: cannot import name 'Observer' from 'opentelemetry.metrics' (C:\Users\adrian\AppData\Local\Programs\Python\Python38-32\lib\site-packages\opentelemetry\metrics\__init__.py)

No idea what is going on, so I'm going to assume that this is not working and we've released a pile of non-working garbage.

UPDATE-1 I finally got OpenTelemetry working by pinning all the versions to v0.4a0 - OpenTelemetry released a new version 0.5b0 which is incompatible with 0.4a0. When you install azure-core-tracing-opentelemetry, it depends on 0.4a0 of the API, but not the SDK, so you end up in a situation where you have opentelemetry-api==0.4a0 and opentelemetry-sdk==0.5b0 - which is a bad place to be. I'm putting the blame for this on OpenTelemetry right now.
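A quick way to spot this situation is to compare the installed versions before running anything. A sketch using importlib.metadata from the standard library (Python 3.8+); the helper names and the idea of checking against a single expected version are my own:

```python
from importlib import metadata  # available from Python 3.8

PACKAGES = ("opentelemetry-api", "opentelemetry-sdk")


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def mismatched(versions, expected):
    """Return {package: version} for packages not at the expected version."""
    return {pkg: ver for pkg, ver in versions.items() if ver != expected}
```

Running mismatched({p: installed_version(p) for p in PACKAGES}, "0.4a0") in the broken environment would have pointed straight at opentelemetry-sdk.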

So, rewind on the OpenTelemetry piece. I removed all the OpenTelemetry pip installs, then I added the following packages

pip install opentelemetry-api==0.4a0
pip install opentelemetry-sdk==0.4a0
pip install azure-core-tracing-opentelemetry --pre

The version was discovered by inspecting the dependencies on azure-core-tracing-opentelemetry. From there, I installed the opentelemetry-sdk for the same version. Now I can add some initialization code and wrap blocks in the with tracer lines:

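import os

from azure.core.exceptions import ResourceExistsError
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

from opentelemetry import trace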
from opentelemetry.sdk.trace import TracerSource
from opentelemetry.sdk.trace.export import SimpleExportSpanProcessor

# Configure distributed tracing in Azure SDK
from azure.core.settings import settings
from azure.core.tracing.ext.opentelemetry_span import OpenTelemetrySpan
settings.tracing_implementation = OpenTelemetrySpan

# Configure OpenTelemetry Exporter
from opentelemetry.sdk.trace.export import ConsoleSpanExporter
exporter = ConsoleSpanExporter()

# Configure OpenTelemetry Tracing
trace.set_preferred_tracer_source_implementation(lambda T: TracerSource())
trace.tracer_source().add_span_processor(
    SimpleExportSpanProcessor(exporter)
)
tracer = trace.get_tracer(__name__)

def upload_to(url, container, blob):
  with tracer.start_as_current_span("upload_to"):
    # Create a client for this request
    client = BlobServiceClient(account_url=url,credential=DefaultAzureCredential())

    # Create a client for the named container
    container_client = client.get_container_client(container)
    try:
      container_client.create_container()
      print("Container created")
    except ResourceExistsError:
      print("Container already exists (ignored).")

    # Create a client for the named blob
    blob_client = container_client.get_blob_client(blob)
    with open(blob, "rb") as data:
      blob_client.upload_blob(data, overwrite=True)

# Upload main.py to the container
with tracer.start_as_current_span("mainapp"):
  storage_url=os.environ['APP_STORAGE_ACCOUNT_URL']
  upload_to(storage_url, "demo", "main.py")

Finally, I get spans being printed. Next, install the opentelemetry-azure-monitor-exporter package, then change the exporter to the following:

from azure_monitor import AzureMonitorSpanExporter
exporter = AzureMonitorSpanExporter(
  instrumentation_key = os.environ['APPINSIGHTS_INSTRUMENTATION_KEY']
)

I've already got an Application Insights resource, and I've set the instrumentation key in an environment variable. I can now go to the Application Insights resource, look at the application map, and see the awesome graphic. I can also look in the dependencies logs and see the spans.

ISSUE Cannot see the transaction diagnostics. It looks really cool in the documentation, but I cannot find it in the portal. Where is it?
