- Azure
- Install azure-cli: pip install azure-cli
- Login: az login; to clear the login: az logout or az account clear
- If azure-cli was installed via pip, run . az.completion.sh to enable auto-completion.
A subscription is a logical container representing a set of services and resources.
- List subscription:
az account list
- Set subscription:
az account set --subscription "your subscription name"
- Show current subscription:
az account show --output table|json|yaml
A resource is a single unit of computing, storage, or networking (for example, a VM, a storage account, or a network interface). A resource group is a logical container for Azure resources. A service is a higher-level concept representing a collection of related resources and functionality; a Python SDK sketch follows the service list below.
Services:
- Azure Virtual Machines
- Azure App Service
- Azure SQL Database
- Azure Storage
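A minimal sketch of scripting against subscriptions and resource groups with the Azure SDK for Python, assuming azure-identity and azure-mgmt-resource are installed and az login has been run (the subscription ID is a placeholder):

```python
from azure.identity import AzureCliCredential
from azure.mgmt.resource import ResourceManagementClient

# Reuse the credentials from `az login`
credential = AzureCliCredential()
client = ResourceManagementClient(credential, "<SUBSCRIPTION_ID>")

# List every resource group in the current subscription
for rg in client.resource_groups.list():
    print(rg.name, rg.location)
```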
Overview of device management with Microsoft Azure IoT Hub | Microsoft Learn
First, you need to create an Azure IoT Hub instance and configure IoT Edge devices. Follow the official documentation to set up the IoT Hub and IoT Edge: Azure IoT Hub and IoT Edge setup
graph TD
A[Plan] --> B[Provision]
B --> C[Configure]
C --> D[Monitor]
D --> E[Retire]
A --> A1[1. Create metadata]
A --> A2[2. Group devices]
A --> A3[3. Device twin to store metadata: \n tags and properties]
B --> B1[1. Create flexible device \n identities and credentials]
B --> B2[2. Report their capabilities and \n conditions through device twin]
C --> C1[1. Changes and firmware updates to devices]
C --> C2[2. `desired` and `direct methods` or \n `broadcast jobs`]
D --> D1[1. Device twin to report real-time operational conditions]
E --> E1[1. Device twin to maintain device info]
E --> E2[2. IoT Hub registry for securely revoking \ndevice credentials and identities]
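A minimal sketch of the "device twin to store metadata" step above, using the azure-iot-hub service SDK (the connection string, device ID, and tag values are placeholders):

```python
from azure.iot.hub import IoTHubRegistryManager
from azure.iot.hub.models import Twin

registry_manager = IoTHubRegistryManager("<IOTHUB_CONNECTION_STRING>")

# Read the current twin, then patch its tags with plan-phase metadata
twin = registry_manager.get_twin("<DEVICE_ID>")
twin_patch = Twin(tags={"location": {"region": "US", "plant": "plant42"}})
registry_manager.update_twin("<DEVICE_ID>", twin_patch, twin.etag)
```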
graph LR
A[Event Hubs] --> B[Functions]
A --> C[Stream Analytics]
A --> D[Time Series Insights]
A --> E[Apache Spark]
A --> F[Databricks]
Understand the Azure IoT Hub query language | Microsoft Learn
- Columns: DeviceID, LastActivityTime
SELECT COUNT() AS TotalNumber FROM devices
"jobs" provide a way to execute operations on sets of devices.
SELECT <select_list>
FROM <from_specification>
[WHERE <filter_condition>]
[GROUP BY <group_specification>]
SELECT * FROM devices.jobs
WHERE devices.jobs.deviceId = 'device-0'
SELECT * FROM devices.jobs
WHERE devices.jobs.deviceId = 'myDeviceId'
AND devices.jobs.jobType = 'scheduleUpdateTwin'
AND devices.jobs.status = 'completed'
AND devices.jobs.createdTimeUtc > '2016-09-01'
SELECT properties.reported.telemetryConfig.status AS status,
COUNT() AS numberOfDevices
FROM devices
GROUP BY properties.reported.telemetryConfig.status
SELECT DeviceId, LastActivityTime
FROM devices
WHERE status = 'enabled' AND connectionState = 'Disconnected'
SELECT COUNT() as totalNumberOfDevices FROM devices
SELECT * FROM devices
WHERE tags.location.region = 'US'
SELECT * FROM devices
WHERE properties.reported.connectivity IN ['wired', 'wifi']
SELECT * FROM devices
WHERE is_defined(properties.reported.connectivity)
SELECT * FROM devices.modules
WHERE properties.reported.status = 'scanning'
AND deviceId IN ['device1', 'device2']
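These queries can also be run from code. A sketch with the azure-iot-hub package, based on the SDK samples (query_iot_hub and QuerySpecification are assumptions — verify against your SDK version):

```python
from azure.iot.hub import IoTHubRegistryManager
from azure.iot.hub.models import QuerySpecification

registry_manager = IoTHubRegistryManager("<IOTHUB_CONNECTION_STRING>")

# Same query language as above, executed through the service SDK
query = QuerySpecification(query="SELECT * FROM devices WHERE tags.location.region = 'US'")
result = registry_manager.query_iot_hub(query)
for twin in result.items:
    print(twin.device_id)
```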
- "IoT Hub" > "Message routing" > "Add" > "Add endpoint" > "Add route" (Events Hub, Storage, Cosmos DB)
- The built-in Event Hubs endpoint stops receiving messages once a custom route is added (unless a route to the built-in endpoint is also created).
- Hierarchy: Create Endpoint (Storage, Cosmos, Hub, etc) > Create Route
A message to IoT Hub contains three parts:
- System properties
- Application properties: e.g., a timestamp sent from the device in the iothub-creation-time-utc property to record when the message was sent.
- Message body
Query language for routing (Tutorial - Configure message routing | Microsoft Learn):
- System properties:
$contentType = 'application/json'
or $iothub-connection-device-id = 'myDevice'
- Application properties:
test = 'true'
- Message body:
$body.Weather.HistoricalData[0].Month = 'Feb'
- Logic:
$contentEncoding = 'UTF-8' AND processingPath = 'hot'
- Twin:
$twin.properties.desired.telemetryConfig.sendFrequency = '5m'
$twin.tags.deploymentLocation.floor = 1
Use base64 to read binary data in an IoT Hub message body: json.loads(base64.b64decode(msg["Body"]).decode("utf-8"))
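A self-contained sketch of that decoding step (the msg dict here is a stand-in for one record read from a routed endpoint):

```python
import base64
import json

# Stand-in for a record whose "Body" field holds base64-encoded JSON
msg = {"Body": base64.b64encode(json.dumps({"temperature": 21.5}).encode("utf-8"))}

payload = json.loads(base64.b64decode(msg["Body"]).decode("utf-8"))
print(payload["temperature"])  # 21.5
```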
Develop module for Linux devices using Azure IoT Edge tutorial | Microsoft Learn
Pros and cons of Azure service alternatives to Azure Stream Analytics:
Service | Pros | Cons |
---|---|---|
Azure Functions | Serverless architecture allows for automatic scaling and reduced costs; supports a variety of programming languages, including Python; can perform real-time data processing and analytics; easy to set up and use | Limited to processing small amounts of data; processes data in response to events rather than continuously; limited control over the underlying infrastructure |
Azure Data Factory | Supports a variety of sources and destinations, including Azure Blob Storage, Azure SQL Database, and Azure Cosmos DB; provides a visual interface for designing and monitoring data pipelines; integrates with other Azure services, such as Azure Databricks and Azure HDInsight; can perform batch processing and scheduled data transfers | Limited real-time data processing capabilities; limited control over the underlying infrastructure; can be complex to set up and configure |
Azure Event Hubs | Highly scalable and can handle millions of events per second; supports various protocols, including AMQP, Kafka, and HTTP; can ingest data from various sources, including IoT devices and applications; built-in support for event processing and streaming analytics | Limited control over the underlying infrastructure; limited support for data transformation and enrichment; can be complex to set up and configure |
Azure Databricks | Fully managed service that supports Apache Spark; collaborative environment for data engineers, data scientists, and machine learning practitioners; built-in support for data transformation and machine learning; can perform batch processing, real-time processing, and machine learning | Can be expensive, especially for large data volumes; limited control over the underlying infrastructure; can be complex to set up and configure |
Azure HDInsight | Fully managed service that supports various big data technologies, including Hadoop, Spark, and Hive; flexible and scalable environment for processing and analyzing large datasets; built-in support for data transformation and machine learning; can be used for batch processing, real-time processing, and machine learning | Can be expensive, especially for large data volumes; limited control over the underlying infrastructure; can be complex to set up and configure |
- Azure Event Hubs trigger and bindings for Azure Functions | Microsoft Learn
- Work with Azure Functions Core Tools | Microsoft Learn
Prerequisites: 1) install Azure CLI, 2) install Azure Functions Core Tools
Two types of bindings:
- Trigger: Respond to events sent to an event hub event stream
- Output binding: Write events to an event stream
Type | Trigger | Input binding | Output binding |
---|---|---|---|
HTTP | x | | x |
Timer | x | | |
Azure Queue Storage | x | | x |
Azure Service Bus topic | x | | x |
Azure Service Bus queue | x | | x |
Azure Cosmos DB | x | x | x |
Azure Blob Storage | x | x | x |
Azure Event Hubs | x | | x |
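A sketch of trigger vs. output binding in the Python v2 programming model (the route name, queue name, and the AzureWebJobsStorage connection setting are assumptions):

```python
import azure.functions as func

app = func.FunctionApp()

# HTTP row of the table: trigger; Azure Queue Storage row: output binding
@app.route(route="enqueue", auth_level=func.AuthLevel.ANONYMOUS)
@app.queue_output(arg_name="msg", queue_name="outqueue", connection="AzureWebJobsStorage")
def enqueue(req: func.HttpRequest, msg: func.Out[str]) -> func.HttpResponse:
    msg.set(req.params.get("payload", "hello"))  # write one queue message
    return func.HttpResponse("queued", status_code=200)
```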
brew tap azure/functions
brew install azure-functions-core-tools@4
# file: .zshrc
# rosetta terminal setup
if [ $(arch) = "i386" ]; then
alias python="/usr/local/bin/python3"
alias brew86='/usr/local/bin/brew'
alias pyenv86="arch -x86_64 pyenv"
alias func="/usr/local/Cellar/azure-functions-core-tools@4/4.0.4785/func"
fi
<project_root>/
| - .venv/ # used by local development.
| - .vscode/
| - function_app.py
| - additional_functions.py
| - tests/
| | - test_my_function.py
| - .funcignore # ignore .vscode/ .venv/
| - host.json # configuration options that affect all functions in a function app instance. This file does get published to Azure
| - local.settings.json # store app settings and connection strings when it's running locally. This file doesn't get published to Azure
| - requirements.txt
| - Dockerfile
- Install Azure Functions Core Tools
- Create Virtual Environment:
python -m venv .venv
and source .venv/bin/activate
func init LocalFunctionProj --python -m V2
- Create a function in an existing project:
func new --template "Http Trigger" --name MyHttpTrigger
func new --template "Azure Queue Storage Trigger" --name MyQueueTrigger
- List the templates available in an existing project:
cd LocalFunctionProj
func templates list -l python
- Azure Blob Storage trigger
- Azure Cosmos DB trigger
- Durable Functions activity
- Durable Functions entity
- Durable Functions HTTP starter
- Durable Functions orchestrator
- Azure Event Grid trigger
- Azure Event Hub trigger
- HTTP trigger
- Kafka output
- Kafka trigger
- Azure Queue Storage trigger
- RabbitMQ trigger
- Azure Service Bus Queue trigger
- Azure Service Bus Topic trigger
- Timer trigger
- (optional) Run the function locally
  - Start the storage emulator: azurite. This is used when the AzureWebJobsStorage setting in the local.settings.json project file is set to UseDevelopmentStorage=true
  - Start the function locally: func start (arm64 is not supported) - for x86 emulation on ARM64:
    - Enable Rosetta in Terminal: right-click the "Terminal" application, select "Get Info", and check "Open using Rosetta"
    - Make sure your shell is zsh
    - Run the arch command to verify (it should print i386)
    - Reinstall all dependencies
- Create Azure resources for your function
az login
az config param-persist on
az group create --name AzureFunctionsQuickstart-rg --location <REGION>
az storage account create --name <STORAGE_NAME> --sku Standard_LRS
az functionapp create --consumption-plan-location westeurope --runtime python --runtime-version 3.9 --functions-version 4 --name <APP_NAME> --os-type linux --storage-account <STORAGE_NAME>
- (optional) Get your storage connection strings
- Azure Portal
- "Storage accounts"
- "Settings"
- "Access keys"
- Copy the "Connection string"
- Deploy
func azure functionapp publish <APP_NAME>
- Update app setting
az functionapp config appsettings set --name <FUNCTION_APP_NAME> --resource-group <RESOURCE_GROUP_NAME> --settings AzureWebJobsFeatureFlags=EnableWorkerIndexing
(the v2 model requires AzureWebJobsFeatureFlags=EnableWorkerIndexing, but it is already included when the project is created with -m V2)
- Verify
func azure functionapp logstream <APP_NAME> --browser
- Cleanup
az group delete --name AzureFunctionsQuickstart-rg
- Kubernetes cluster:
func kubernetes deploy --name <DEPLOYMENT_NAME> --registry <REGISTRY_USERNAME>
- Extensions
- Install all extensions
func extensions install
- Specific extension
func extensions install --package Microsoft.Azure.WebJobs.Extensions.Storage --version 5.0.0
- Install all extensions
- Monitor executions in Azure Functions
- Configure monitoring for Azure Functions
- Enable streaming logs:
- built-in:
func azure functionapp logstream <FunctionAppName>
- live metrics:
func azure functionapp logstream <FunctionAppName> --browser
- built-in:
Python developer reference for Azure Functions | Microsoft Learn
func init <name> --python -m V2 [--dockerfile]
func templates list -l python
func new --name <name> --template <template> --language <language>
--authlevel <authlevel>
: function, anonymous, admin (for HTTP trigger)
func start: start the local runtime host
- Function App:
func azure functionapp fetch-app-settings <APP_NAME>
func azure functionapp list-functions <app_name>
func azure functionapp logstream <APP_NAME> [--browser]: connects the local command line to streaming logs
func azure functionapp publish <FunctionAppName>
--additional-packages: list of packages to install when building native dependencies
--build [remote|local]: build action when deploying to a Linux function app
--list-ignored-files: displays a list of files that are ignored during publishing, based on the .funcignore file
--list-included-files: displays a list of files that are published
--no-build: the project isn't built during publishing; for Python, pip install isn't performed
--slot: optional name of a specific slot to publish to
func azure storage fetch-connection-string <STORAGE_ACCOUNT_NAME>
- Deploy:
func kubernetes deploy [--max-replicas] [--min-replicas] [--name]
func kubernetes install
func kubernetes remove
- Setting:
func settings list
func settings add <name> <value>
and func settings delete <SETTING_NAME>
func settings decrypt
and func settings encrypt
- Break up the function app into modular components
- Reusable APIs
# http_blueprint.py
import logging
import azure.functions as func

bp = func.Blueprint()

@bp.route(route="default_template")
def default_template(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    if name:
        return func.HttpResponse(
            f"Hello, {name}. This HTTP-triggered function "
            f"executed successfully.")
    else:
        return func.HttpResponse(
            "This HTTP-triggered function executed successfully. "
            "Pass a name in the query string or in the request body for a"
            " personalized response.",
            status_code=200
        )
# function_app.py
import azure.functions as func
from http_blueprint import bp

app = func.FunctionApp()
app.register_functions(bp)
graph LR
A[Event Producers] -->|kafka| B[Azure Event Hubs]
B -->|partition| C1[partition1]
B -->|partition| C2[partition2]
B -->|partition| C3[partition3]
B -->|partition| C4[partition4]
C1 -->|Consumer Group 1| D1[Function\nLease 1-2]
C2 -->|Consumer Group 1| D1
C3 -->|Consumer Group 1| D2[Function\nLease 2-4]
C4 -->|Consumer Group 1| D2
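A minimal consumer sketch with the azure-eventhub package (connection string and hub name are placeholders; without a checkpoint store, a single client reads all partitions and nothing is persisted):

```python
from azure.eventhub import EventHubConsumerClient

client = EventHubConsumerClient.from_connection_string(
    "<EVENT_HUBS_CONNECTION_STRING>",
    consumer_group="$Default",
    eventhub_name="<EVENT_HUB_NAME>",
)

def on_event(partition_context, event):
    # Each callback is tied to one partition, mirroring the leases above
    print(partition_context.partition_id, event.body_as_str())

with client:
    client.receive(on_event=on_event, starting_position="-1")  # read from the beginning
```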
Once you have implemented the Azure Functions for processing IoT data and exposing it via REST API, you can deploy them to Azure. Follow the official documentation to deploy your Python Azure Functions: Deploy Python Azure Functions
After deploying your Azure Functions, you can test them by sending data from your factory production machines to the IoT Hub and then calling the REST API to retrieve the data.
For example, you can use a tool like Postman to send HTTP requests to your REST API and verify that the data is being returned correctly.
By following these steps, you should be able to build an IoT data pipeline for factory production machines using Azure IoT Edge, Azure IoT Hub, and Azure Functions in Python. The data is saved to SQL at hourly intervals, and you can expose this SQL data using a REST API.
- Set up Azure resources: Create an Azure IoT Hub and an Azure SQL Database. Follow the official Azure documentation to learn how to set up these resources and configure them.
- Set up Azure IoT Edge: Install Azure IoT Edge on your production machines and configure it to send data to your Azure IoT Hub.
- Set up Azure Functions: Create an Azure Functions app and a function that triggers on an IoT Hub event. Use this function to process the data sent by the IoT Edge device and save it to the SQL Database (see the sketch after this list).
- Expose SQL data using REST API: Create an Azure Function that exposes the data from the SQL Database using a REST API.
- Build and deploy your solution: Write Python code to implement the data processing logic in your Azure Functions and deploy the code to Azure.
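A sketch of the Azure Functions step under assumed names (IoTHubConnectionString and SqlConnectionString app settings, a Telemetry table, and the hub's Event Hub-compatible name): the function triggers on the IoT Hub's built-in endpoint and inserts each message into SQL via pyodbc.

```python
import json
import logging
import os

import azure.functions as func
import pyodbc

app = func.FunctionApp()

@app.event_hub_message_trigger(arg_name="event",
                               event_hub_name="<EVENT_HUB_COMPATIBLE_NAME>",
                               connection="IoTHubConnectionString")
def save_telemetry(event: func.EventHubEvent):
    payload = json.loads(event.get_body().decode("utf-8"))
    # Connection string and table schema are assumptions for this sketch
    with pyodbc.connect(os.environ["SqlConnectionString"]) as conn:
        conn.execute(
            "INSERT INTO Telemetry (DeviceId, Temperature, EnqueuedAt) "
            "VALUES (?, ?, SYSUTCDATETIME())",
            payload.get("deviceId"), payload.get("temperature"),
        )
    logging.info("Saved telemetry from %s", payload.get("deviceId"))
```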
To start developing your solution in Python, you can use the Azure SDK for Python. The SDK provides a set of Python libraries that allow you to interact with Azure services and resources from your Python code. You can use the SDK to manage Azure resources, send and receive data from IoT devices, and invoke Azure Functions.
The Azure SDK for Python can be installed using pip, like this:
pip install azure-iot-device azure-iothub-service-client azure-functions
Once you have the SDK installed, you can start writing Python code to interact with Azure services and resources. The official Azure documentation provides detailed guidance on how to use the Azure SDK for Python, as well as sample code and tutorials to help you get started.
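For example, a device-side sketch with azure-iot-device that sends one telemetry message (the connection string comes from the IoT Hub device registry and is a placeholder here):

```python
from azure.iot.device import IoTHubDeviceClient, Message

client = IoTHubDeviceClient.create_from_connection_string("<DEVICE_CONNECTION_STRING>")
client.connect()

# Content type/encoding let IoT Hub routing queries inspect the message body
msg = Message('{"temperature": 21.5}')
msg.content_type = "application/json"
msg.content_encoding = "utf-8"
client.send_message(msg)

client.disconnect()
```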
Stream Analytics can connect to:
- Azure Event Hubs and Azure IoT Hub for streaming data ingestion, and
- Azure Blob storage to ingest historical data.
- Job input can also include static or slow-changing reference data from Azure Blob storage or SQL Database that you can join to streaming data to perform lookup operations.
Runs on:
- serverless cloud
- IoT Edge, for ultra-low-latency analytics
Use for:
- Dashboards for data visualization
- Real-time alerts from temporal and spatial patterns or anomalies
- Extract, Transform, Load (ETL)
The Stream Analytics query language is consistent with SQL and supports:
- simple data manipulation,
- aggregation functions,
- complex geospatial functions,
- defining and invoking your own functions (only JavaScript and C# are supported), and
- invoking Azure Machine Learning models as functions.
You can continue to use Stream Analytics by sending events to Event Hubs using the Event Hubs Kafka API without changing the event sender
Spark Structured Streaming + Databricks for Python
Property | Description |
---|---|
EventProcessedUtcTime | The date and time the event was processed. |
EventEnqueuedUtcTime | The date and time the event was received by the IoT Hub. |
PartitionId | The zero-based partition ID for the input adapter. |
IoTHub.MessageId | An ID used to correlate two-way communication in IoT Hub. |
IoTHub.CorrelationId | An ID used in message responses and feedback in IoT Hub. |
IoTHub.ConnectionDeviceId | The authentication ID used to send this message. |
IoTHub.ConnectionDeviceGenerationId | The generation ID of the authenticated device. |
IoTHub.EnqueuedTime | The time when the message was received by the IoT Hub. |
Stream Analytics Query Language Reference - Stream Analytics Query | Microsoft Learn
SELECT
DeviceID,
Location.Lat,
Location.Long,
SensorReadings.SensorMetadata.Version
FROM input
-------------------
SELECT input.Location.*
FROM input
-------------------
SELECT
input.DeviceID,
thresholds.SensorName
FROM input -- stream input
JOIN thresholds -- reference data input
ON
input.DeviceId = thresholds.DeviceId
WHERE
GetRecordPropertyValue(input.SensorReadings, thresholds.SensorName) > thresholds.Value
-- the where statement selects the property value coming from the reference data
-------------------
SELECT
GetArrayElement(arrayField, 0) AS firstElement
FROM input
------------------- Select all array element as individual events. The APPLY operator together with the GetArrayElements built-in function extracts all array elements as individual events
SELECT
arrayElement.ArrayIndex,
arrayElement.ArrayValue
FROM input as event
CROSS APPLY GetArrayElements(event.arrayField) AS arrayElement
Function Name | Description |
---|---|
AVG | Calculates the average value of a numeric input over a specified time window. |
COUNT | Counts the number of input events over a specified time window. |
Collect | Collects input events into an array over a specified time window. |
CollectTop | Collects the top N input events into an array over a specified time window. |
MAX | Finds the maximum value of a numeric input over a specified time window. |
MIN | Finds the minimum value of a numeric input over a specified time window. |
Percentile_Cont | Calculates the continuous percentile of a numeric input over a specified time window. |
Percentile_Disc | Calculates the discrete percentile of a numeric input over a specified time window. |
STDEV | Calculates the standard deviation of a numeric input over a specified time window. |
STDEVP | Calculates the population standard deviation of a numeric input over a specified time window. |
SUM | Calculates the sum of a numeric input over a specified time window. |
TopOne | Finds the top input event over a specified time window. |
VAR | Calculates the variance of a numeric input over a specified time window. |
VARP | Calculates the population variance of a numeric input over a specified time window. |
Use reference data for lookups in Azure Stream Analytics | Microsoft Learn
- Azure Blob Storage
- Azure SQL Database
- Use Azure Data Factory to transform or copy reference data to Blob Storage from cloud-based and on-premises data stores: Azure-DataFactory/SamplesV1/ReferenceDataRefreshForASAJobs at main · Azure/Azure-DataFactory · GitHub
The LicensePlate stream can be joined with a static dataset that has registration details to identify license plates that have expired:
SELECT I1.EntryTime, I1.LicensePlate, I1.TollId, R.RegistrationId
FROM Input1 I1 TIMESTAMP BY EntryTime
JOIN Registration R
ON I1.LicensePlate = R.LicensePlate
WHERE R.Expired = '1'
- Tutorial - Analyze fraudulent call data with Azure Stream Analytics and visualize results in Power BI dashboard | Microsoft Learn
- Tutorial - Run Azure Functions in Azure Stream Analytics jobs | Microsoft Learn
- Update or merge records in Azure SQL Database with Azure Functions | Microsoft Learn
- Azure Stream Analytics - YouTube
Currently, Azure Stream Analytics (ASA) only supports inserting (appending) rows to SQL outputs (Azure SQL Databases, and Azure Synapse Analytics). This article discusses workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.
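A sketch of that intermediary layer (the DeviceState table and SqlConnectionString setting are assumptions): the Stream Analytics Azure Functions output POSTs a JSON array of events, and the function MERGEs each one into SQL.

```python
import os

import azure.functions as func
import pyodbc

app = func.FunctionApp()

@app.route(route="upsert", auth_level=func.AuthLevel.FUNCTION)
def upsert(req: func.HttpRequest) -> func.HttpResponse:
    events = req.get_json()  # Stream Analytics sends a JSON array of events
    with pyodbc.connect(os.environ["SqlConnectionString"]) as conn:
        for e in events:
            conn.execute(
                "MERGE INTO DeviceState AS t "
                "USING (SELECT ? AS DeviceId, ? AS Temperature) AS s "
                "ON t.DeviceId = s.DeviceId "
                "WHEN MATCHED THEN UPDATE SET t.Temperature = s.Temperature "
                "WHEN NOT MATCHED THEN INSERT (DeviceId, Temperature) "
                "VALUES (s.DeviceId, s.Temperature);",
                e.get("deviceId"), e.get("temperature"),
            )
    return func.HttpResponse(status_code=200)
```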
Detecting fraudulent calls using a self-join on "CallRecTime" (idea: if the same IMSI calls through two different switches within 1-5 seconds, it is a fraudulent call):
SELECT System.Timestamp AS WindowEnd, COUNT(*) AS FraudulentCalls
INTO "MyPBIoutput"
FROM "CallStream" CS1 TIMESTAMP BY CallRecTime
JOIN "CallStream" CS2 TIMESTAMP BY CallRecTime
ON CS1.CallingIMSI = CS2.CallingIMSI
AND DATEDIFF(ss, CS1, CS2) BETWEEN 1 AND 5
WHERE CS1.SwitchNum != CS2.SwitchNum
GROUP BY TumblingWindow(Duration(second, 1))
Create Bicep files - Visual Studio Code - Azure Resource Manager | Microsoft Learn
infrastructure-as-code: an instruction manual for your infrastructure. The manual details the end configuration of your resources and how to reach that configuration state.
- Right-click the Bicep file inside the VSCode, and then select Deploy Bicep file.
- From the Select Resource Group listbox on the top, select Create new Resource Group.
- Enter exampleRG as the resource group name, and then press [ENTER].
- Select a location for the resource group, and then press [ENTER].
- From Select a parameter file, select None.
- Enter a unique storage account name, and then press [ENTER]. If you get an error message indicating the storage account is already taken, the storage name you provided is in use. Provide a name that is more likely to be unique.
- From Create parameters file from values used in this deployment?, select No.
Using Azure-cli:
az group create --name exampleRG --location eastus
az deployment group create --resource-group exampleRG --template-file main.bicep --parameters storageName=uniquename
Cleanup resources: az group delete --name exampleRG
- Make sure the Docker image builds and runs locally
- Create a user-assigned managed identity (Azure Portal -> Managed Identities)
- Create a container registry (Azure Portal -> Container registries)
- Go to resource -> Container registries -> Access keys -> Enable Admin user -> Copy Username and Password
- Push the image to Azure Container registry
docker login strandaiapistaging.azurecr.io
docker tag strand-linux strandaiapistaging.azurecr.io/stranddemo:latest
docker push strandaiapistaging.azurecr.io/stranddemo:latest
- Authorize managed identity for your registry
- Container registry -> Access control (IAM) -> Role assignments -> Add role assignment
- Select
AcrPull
- Member: select the managed identity you created
- Create web app: App Services -> Create -> Web App
- Configure the Web App:
- Web App -> Configuration -> Application settings -> New application setting -> WEBSITES_PORT: 8502
- Identity -> User assigned -> Select the managed identity you created
- Deployment Center -> Authentication -> Managed Identity -> Select the managed identity you created (also enable Continuous Deployment at the end)
docker rmi strandgptapiprd.azurecr.io/strandapi:latest
docker tag strand-prod-amd64 strandgptapiprd.azurecr.io/strandapi:latest
docker push strandgptapiprd.azurecr.io/strandapi:latest