Skip to content

Instantly share code, notes, and snippets.

@josheinstein
Last active March 23, 2022 18:47
Show Gist options
  • Save josheinstein/51aadf6933b9765c742d3e352aa272a4 to your computer and use it in GitHub Desktop.
Save josheinstein/51aadf6933b9765c742d3e352aa272a4 to your computer and use it in GitHub Desktop.
Parsing Azure diagnostic blobs in Power BI
let
// Storage Account containing the blobs
Source = AzureStorage.Blobs("mystorageaccount"),
// Select the container containing azuread user risk events
Container = Source{[Name="insights-logs-userriskevents"]}[Data],
// Get only blobs for a specific day (parameter ReportDate must be defined)
// The tenantId below is a random example -- use your own tenantId, or add a parameter
FilteredRows = Table.SelectRows(Container, each Text.StartsWith([Name], "tenantId=63cc8934-5723-42fe-ab52-0bc17d946172/y=" & DateTime.ToText(ReportDate, [Format="yyyy"]) & "/m=" & DateTime.ToText(ReportDate, [Format="MM"]) & "/d=" & DateTime.ToText(ReportDate, [Format="dd"]) & "/")),
// Call the ParseJsonBlob function
ExpandedTable = ParseJsonBlob(FilteredRows[Content]),
// Expand the "properties" nested record
ExpandedProperties = Table.ExpandRecordColumn(ExpandedTable, "properties", {"id", "requestId", "correlationId", "riskType", "riskEventType", "riskState", "riskLevel", "riskDetail", "source", "detectionTimingType", "activity", "ipAddress", "location", "activityDateTime", "detectedDateTime", "lastUpdatedDateTime", "userId", "userDisplayName", "userPrincipalName", "additionalInfo", "tokenIssuerType", "resourceTenantId", "homeTenantId", "userType", "crossTenantAccessType"}, {"properties.id", "properties.requestId", "properties.correlationId", "properties.riskType", "properties.riskEventType", "properties.riskState", "properties.riskLevel", "properties.riskDetail", "properties.source", "properties.detectionTimingType", "properties.activity", "properties.ipAddress", "properties.location", "properties.activityDateTime", "properties.detectedDateTime", "properties.lastUpdatedDateTime", "properties.userId", "properties.userDisplayName", "properties.userPrincipalName", "properties.additionalInfo", "properties.tokenIssuerType", "properties.resourceTenantId", "properties.homeTenantId", "properties.userType", "properties.crossTenantAccessType"})
in
ExpandedProperties
(Binaries as list) as table => let
// Merge the blobs into one large Binary
CombinedBinary = Binary.Combine(Binaries),
// Extract each line of text from the blob as a list
Lines = Lines.FromBinary(CombinedBinary),
// Each line contains a complete JSON document -- parse it
JsonList = List.Transform(Lines, Json.Document),
// Turn the list of JSON objects into a table
JsonTable = Table.FromList(JsonList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
// The resulting table has just one column.
// Get the first row and extract its field names (assume all blobs have the same schema)
// Expand the properties of the JSON objects
ExpandedTable = Table.ExpandRecordColumn(JsonTable, "Column1", Record.FieldNames(List.First(JsonTable[Column1])))
in ExpandedTable
@josheinstein
Copy link
Author

Azure diagnostics blobs consist of multiple lines inside a JSON file, with each line being a complete JSON object. There is no containing array. So the contents of the blob cannot be parsed as a single JSON object, but instead must be parsed line by line. Not so straightforward in Power BI. This M function, ParseJsonBlob, can be used to take a list of binaries and return a single expanded table containing the parsed records from each blob.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment