[CheatSheet] Azure

Azure


Helloworld

  1. Install azure-cli: pip install azure-cli
  2. Log in: az login; to clear cached credentials: az logout or az account clear

If azure-cli was installed via pip, run . az.completion.sh to enable auto-completion.
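
After az login, Python code can reuse the CLI credential through azure-identity. A minimal sketch (assumes pip install azure-identity; the management scope URL is only an example):

from azure.identity import DefaultAzureCredential

# DefaultAzureCredential falls back to the Azure CLI login created by `az login`
credential = DefaultAzureCredential()
token = credential.get_token("https://management.azure.com/.default")  # ARM scope (example)
print(token.expires_on)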

Subscription

A logical container representing a set of services and resources.

  1. List subscription: az account list
  2. Set subscription: az account set --subscription "your subscription name"
  3. Show current subscription: az account show --output table|json|yaml

A resource is a single unit of compute, storage, or networking (for example, a network interface). A resource group is a logical container for Azure resources. A service is a higher-level concept representing a collection of related resources and functionality.
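
As an illustration of this hierarchy, a hedged sketch that lists the resource groups in a subscription (assumes pip install azure-identity azure-mgmt-resource; <SUBSCRIPTION_ID> is a placeholder):

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# One client per subscription; resource groups are the containers for resources
client = ResourceManagementClient(DefaultAzureCredential(), "<SUBSCRIPTION_ID>")
for rg in client.resource_groups.list():
    print(rg.name, rg.location)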

Services:

  • Azure Virtual Machines
  • Azure App Service
  • Azure SQL Database
  • Azure Storage

IoT

Overview of device management with Microsoft Azure IoT Hub | Microsoft Learn

First, you need to create an Azure IoT Hub instance and configure IoT Edge devices. Follow the official documentation to set up the IoT Hub and IoT Edge: Azure IoT Hub and IoT Edge setup

graph TD
    A[Plan] --> B[Provision]
    B --> C[Configure]
    C --> D[Monitor]
    D --> E[Retire]

    A --> A1[1. Create metadata]
    A --> A2[2. Group devices]
    A --> A3[3. Device twin to store metadata: \n tags and properties]

    B --> B1[1. Create flexible device \n identities and credentials]
    B --> B2[2. Report their capabilities and \n conditions through device twin]

    C --> C1[1. Changes and firmware updates to devices]
    C --> C2[2. `desired` and `direct methods` or \n `broadcast jobs`]

    D --> D1[1. Device twin to report real-time operational conditions]

    E --> E1[1. Device twin to maintain device info]
    E --> E2[2. IoT Hub registry for securely revoking \ndevice credentials and identities]
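
A hedged Python sketch of the device-twin steps above, using the azure-iot-hub service SDK to read a twin and patch its tags and desired properties (assumes pip install azure-iot-hub; the connection string, device ID, and tag/property names are placeholders):

from azure.iot.hub import IoTHubRegistryManager
from azure.iot.hub.models import Twin, TwinProperties

registry = IoTHubRegistryManager("<IOTHUB_CONNECTION_STRING>")
twin = registry.get_twin("device-0")

# Patch tags (plan/group devices) and desired properties (configure)
patch = Twin(
    tags={"location": {"region": "US"}},
    properties=TwinProperties(desired={"telemetryConfig": {"sendFrequency": "5m"}}),
)
registry.update_twin("device-0", patch, twin.etag)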

IoT Hub

graph LR
  A[Event Hubs] --> B[Functions]
  A --> C[Stream Analytics]
  A --> D[Time Series Insights]
  A --> E[Apache Spark]
  A --> F[Databricks]

Hub Query Language

Understand the Azure IoT Hub query language | Microsoft Learn

  • Columns: DeviceID, LastActivityTime
  • SELECT COUNT() as TotalNumber

"jobs" provide a way to execute operations on sets of devices.

SELECT <select_list>
  FROM <from_specification>
  [WHERE <filter_condition>]
  [GROUP BY <group_specification>]

SELECT * FROM devices.jobs
  WHERE devices.jobs.deviceId = 'device-0'

SELECT * FROM devices.jobs
  WHERE devices.jobs.deviceId = 'myDeviceId'
    AND devices.jobs.jobType = 'scheduleUpdateTwin'
    AND devices.jobs.status = 'completed'
    AND devices.jobs.createdTimeUtc > '2016-09-01'

SELECT properties.reported.telemetryConfig.status AS status,
  COUNT() AS numberOfDevices
FROM devices
GROUP BY properties.reported.telemetryConfig.status

SELECT DeviceId, LastActivityTime
FROM devices
WHERE status = 'enabled' AND connectionState = 'Disconnected'

SELECT COUNT() as totalNumberOfDevices FROM devices

SELECT * FROM devices
WHERE tags.location.region = 'US'

SELECT * FROM devices
WHERE properties.reported.connectivity IN ['wired', 'wifi']

SELECT * FROM devices
WHERE is_defined(properties.reported.connectivity)

SELECT * FROM devices.modules
  WHERE properties.reported.status = 'scanning'
  AND deviceId IN ['device1', 'device2']
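
The same queries can be run from Python through the service SDK; a minimal sketch (assumes pip install azure-iot-hub; the connection string is a placeholder):

from azure.iot.hub import IoTHubRegistryManager
from azure.iot.hub.models import QuerySpecification

registry = IoTHubRegistryManager("<IOTHUB_CONNECTION_STRING>")
spec = QuerySpecification(query="SELECT * FROM devices WHERE tags.location.region = 'US'")
for twin in registry.query_iot_hub(spec).items:
    print(twin.device_id)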

Routing

  • "IoT Hub" > "Message routing" > "Add" > "Add endpoint" > "Add route" (Events Hub, Storage, Cosmos DB)
  • The built-in Event Hubs endpoint will be disabled once a route is added.
  • Hierarchy: Create Endpoint (Storage, Cosmos, Hub, etc) > Create Route

A message to IoT Hub contains three parts:

  • System properties
  • Application properties: for example, the device can send a timestamp in the iothub-creation-time-utc property to record when the message was sent.
  • Message body

Query language for routing (Tutorial - Configure message routing | Microsoft Learn):

  • System properties: $contentType = 'application/json' or $iothub-connection-device-id = 'myDevice'
  • Application properties: test = 'true'
  • Message body: $body.Weather.HistoricalData[0].Month = 'Feb'
  • Logic: $contentEncoding = 'UTF-8' AND processingPath = 'hot'
  • Twin: $twin.properties.desired.telemetryConfig.sendFrequency = '5m'
    • $twin.tags.deploymentLocation.floor = 1

Use base64 to read binary data in IoT Hub message body: json.loads(base64.b64decode(msg["Body"]).decode("utf-8"))
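
On the device side, the properties used by routing queries are set on the message before sending. A hedged sketch with the azure-iot-device SDK (the connection string, payload, and property values are placeholders):

import json
from azure.iot.device import IoTHubDeviceClient, Message

client = IoTHubDeviceClient.create_from_connection_string("<DEVICE_CONNECTION_STRING>")
msg = Message(json.dumps({"Weather": {"Temperature": 50}}))
msg.content_type = "application/json"            # matched by $contentType
msg.content_encoding = "utf-8"                   # matched by $contentEncoding
msg.custom_properties["processingPath"] = "hot"  # application property
client.send_message(msg)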

IoT Edge

Develop module for Linux devices using Azure IoT Edge tutorial | Microsoft Learn


Stream Analytics

A comparison of the pros and cons of Azure service alternatives to Azure Stream Analytics:

Azure Functions
  • Pros: serverless architecture allows for automatic scaling and reduced costs; supports a variety of programming languages, including Python; can be used to perform real-time data processing and analytics; easy to set up and use
  • Cons: limited to processing small amounts of data; limited to processing data in response to events rather than continuously; limited control over the underlying infrastructure

Azure Data Factory
  • Pros: supports a variety of sources and destinations, including Azure Blob Storage, Azure SQL Database, and Azure Cosmos DB; provides a visual interface for designing and monitoring data pipelines; integrates with other Azure services, such as Azure Databricks and Azure HDInsight; can be used for batch processing and scheduled data transfers
  • Cons: limited real-time data processing capabilities; limited control over the underlying infrastructure; can be complex to set up and configure

Azure Event Hubs
  • Pros: highly scalable and can handle millions of events per second; supports various protocols, including AMQP, Kafka, and HTTP; can ingest data from various sources, including IoT devices and applications; provides built-in support for event processing and streaming analytics
  • Cons: limited control over the underlying infrastructure; limited support for data transformation and enrichment; can be complex to set up and configure

Azure Databricks
  • Pros: fully managed service that supports Apache Spark; provides a collaborative environment for data engineers, data scientists, and machine learning practitioners; built-in support for data transformation and machine learning; can be used for batch processing, real-time processing, and machine learning
  • Cons: can be expensive, especially for large data volumes; limited control over the underlying infrastructure; can be complex to set up and configure

Azure HDInsight
  • Pros: fully managed service that supports various big data technologies, including Hadoop, Spark, and Hive; provides a flexible and scalable environment for processing and analyzing large datasets; built-in support for data transformation and machine learning; can be used for batch processing, real-time processing, and machine learning
  • Cons: can be expensive, especially for large data volumes; limited control over the underlying infrastructure; can be complex to set up and configure

Functions

Prerequisites: 1) install the Azure CLI, 2) install Azure Functions Core Tools

Two types of bindings:

  • Trigger: Respond to events sent to an event hub event stream
  • Output binding: Write events to an event stream
Type                      Trigger   Input binding   Output binding
HTTP                      x
Timer                     x
Azure Queue Storage       x                         x
Azure Service Bus topic   x                         x
Azure Service Bus queue   x                         x
Azure Cosmos DB           x         x               x
Azure Blob Storage        x         x               x
Azure Event Hubs          x                         x
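
A minimal sketch of a trigger plus an output binding in the Python v2 programming model (the queue name, blob container path, and the AzureWebJobsStorage setting are assumptions):

import azure.functions as func

app = func.FunctionApp()

# Queue trigger (input event) combined with a blob output binding
@app.queue_trigger(arg_name="msg", queue_name="inqueue", connection="AzureWebJobsStorage")
@app.blob_output(arg_name="outblob", path="processed/{rand-guid}.txt", connection="AzureWebJobsStorage")
def process_queue_item(msg: func.QueueMessage, outblob: func.Out[str]) -> None:
    outblob.set(msg.get_body().decode("utf-8"))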
brew tap azure/functions
brew install azure-functions-core-tools@4
# file: .zshrc
# rosetta terminal setup
if [ "$(arch)" = "i386" ]; then
    alias python="/usr/local/bin/python3"
    alias brew86='/usr/local/bin/brew'
    alias pyenv86="arch -x86_64 pyenv"
    alias func="/usr/local/Cellar/azure-functions-core-tools@4/4.0.4785/func"
fi

Function Project Structure

 <project_root>/
 | - .venv/ # used by local development.
 | - .vscode/
 | - function_app.py
 | - additional_functions.py
 | - tests/
 | | - test_my_function.py
 | - .funcignore #  ignore .vscode/ .venv/
 | - host.json #  configuration options that affect all functions in a function app instance. This file does get published to Azure
 | - local.settings.json #  store app settings and connection strings when it's running locally. This file doesn't get published to Azure
 | - requirements.txt
 | - Dockerfile

Function step-by-step

  1. Install Azure Functions Core Tools
  2. Create Virtual Environment: python -m venv .venv and source .venv/bin/activate
  3. func init LocalFunctionProj --python -m V2
    • Create a function in an existing project: func new --template "Http Trigger" --name MyHttpTrigger
    • func new --template "Azure Queue Storage Trigger" --name MyQueueTrigger
  4. cd LocalFunctionProj
  5. func templates list -l python:
    1. Azure Blob Storage trigger
    2. Azure Cosmos DB trigger
    3. Durable Functions activity
    4. Durable Functions entity
    5. Durable Functions HTTP starter
    6. Durable Functions orchestrator
    7. Azure Event Grid trigger
    8. Azure Event Hub trigger
    9. HTTP trigger
    10. Kafka output
    11. Kafka trigger
    12. Azure Queue Storage trigger
    13. RabbitMQ trigger
    14. Azure Service Bus Queue trigger
    15. Azure Service Bus Topic trigger
    16. Timer trigger
  6. (optional) Run the function locally
    1. start storage emulator: azurite. This is used when AzureWebJobsStorage setting in the local.settings.json project file is set to UseDevelopmentStorage=true
    2. Start the function locally: func start (arm64 is not supported)
    3. x86 emulation on ARM64
      1. Enable Rosetta for Terminal: select the Terminal application, open "Get Info", and check "Open using Rosetta"
      2. make sure your shell is zsh
      3. Run command arch
      4. Reinstall all dependencies
  7. Create Azure resources for your function
    1. az login
    2. az config param-persist on
    3. az group create --name AzureFunctionsQuickstart-rg --location <REGION>
    4. az storage account create --name <STORAGE_NAME> --sku Standard_LRS
    5. az functionapp create --consumption-plan-location westeurope --runtime python --runtime-version 3.9 --functions-version 4 --name <APP_NAME> --os-type linux --storage-account <STORAGE_NAME>
  8. (optional) Get your storage connection strings
    1. Azure Portal
    2. "Storage accounts"
    3. "Settings"
    4. "Access keys"
    5. Copy the "Connection string"
  9. Deploy func azure functionapp publish <APP_NAME>
  10. Update the app setting: az functionapp config appsettings set --name <FUNCTION_APP_NAME> --resource-group <RESOURCE_GROUP_NAME> --settings AzureWebJobsFeatureFlags=EnableWorkerIndexing (the v2 model requires AzureWebJobsFeatureFlags=EnableWorkerIndexing, but it is already included when the project is created with -m V2)
  11. Verify func azure functionapp logstream <APP_NAME> --browser
  12. Cleanup az group delete --name AzureFunctionsQuickstart-rg
  13. Kubernetes cluster: func kubernetes deploy --name <DEPLOYMENT_NAME> --registry <REGISTRY_USERNAME>
  14. Extensions
    • Install all extensions func extensions install
    • Specific extension func extensions install --package Microsoft.Azure.WebJobs.Extensions.Storage --version 5.0.0
  15. Monitor executions in Azure Functions
  16. Configure monitoring for Azure Functions
  17. Enable streaming logs:
    1. built-in: func azure functionapp logstream <FunctionAppName>
    2. live metrics: func azure functionapp logstream <FunctionAppName> --browser

Azure function core tools reference

Python developer reference for Azure Functions | Microsoft Learn

  1. func init <name> --python -m V2 [--docker]
  2. func templates list -l python
  3. func new --name <name> --template <template> --language <language>
    • --authlevel <authlevel>: function, anonymous, admin (for HTTP trigger)
  4. func start: start local runtime host
  5. Function App:
    • func azure functionapp fetch-app-settings <APP_NAME>
    • func azure functionapp list-functions <app_name>
    • func azure functionapp logstream <APP_NAME> [--browser]: Connects the local cmd to streaming logs
    • func azure functionapp publish <FunctionAppName>
      • --additional-packages: List of packages to install when building native dependencies
      • --build [remote|local]: build action when deploying to a Linux function app
      • --list-ignored-files: Displays a list of files that are ignored during publishing, which is based on the .funcignore file.
      • --list-included-files: Displays a list of files that are published
      • --no-build: Project isn't built during publishing. For Python, pip install isn't performed.
      • --slot: Optional name of a specific slot to which to publish.
    • func azure storage fetch-connection-string <STORAGE_ACCOUNT_NAME>
  6. Deploy:
    • func kubernetes deploy [--max-replicas] [--min-replicas] [--name]
    • func kubernetes install
    • func kubernetes remove
  7. Setting:
    • func settings list
    • func settings add <name> <value> and func settings delete <SETTING_NAME>
    • func settings decrypt and func settings encrypt

Function v2 blueprint

  • Break up the function app into modular components
  • Reusable APIs
# http_blueprint.py
import logging

import azure.functions as func

bp = func.Blueprint()

@bp.route(route="default_template")
def default_template(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')

    if name:
        return func.HttpResponse(
            f"Hello, {name}. This HTTP-triggered function "
            f"executed successfully.")
    else:
        return func.HttpResponse(
            "This HTTP-triggered function executed successfully. "
            "Pass a name in the query string or in the request body for a"
            " personalized response.",
            status_code=200
        )
# function_app.py
import azure.functions as func
from http_blueprint import bp

app = func.FunctionApp()

app.register_functions(bp)

Horizontal scaling for functions

graph LR
  A[Event Producers] -->|kafka| B[Azure Event Hubs]
  B -->|partition| C1[partition1]
  B -->|partition| C2[partition2]
  B -->|partition| C3[partition3]
  B -->|partition| C4[partition4]

  C1 -->|Consumer Group 1| D1[Function\nLease 1-2]
  C2 -->|Consumer Group 1| D1
  C3 -->|Consumer Group 1| D2[Function\nLease 2-4]
  C4 -->|Consumer Group 1| D2
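
Within a consumer group, each partition is processed by only one function host instance at a time; the host scales out by distributing partition leases across instances, which is what the diagram above shows. A hedged Python v2 sketch of an Event Hubs trigger (the hub name and connection setting name are placeholders):

import logging
import azure.functions as func

app = func.FunctionApp()

@app.event_hub_message_trigger(arg_name="event", event_hub_name="myhub",
                               connection="EVENT_HUB_CONNECTION")
def consume(event: func.EventHubEvent) -> None:
    # Invoked per event; scale-out happens by distributing partition leases
    logging.info("body: %s", event.get_body().decode("utf-8"))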

Deploy and test your Azure Functions

Once you have implemented the Azure Functions for processing IoT data and exposing it via REST API, you can deploy them to Azure. Follow the official documentation to deploy your Python Azure Functions: Deploy Python Azure Functions

After deploying your Azure Functions, you can test them by sending data from your factory production machines to the IoT Hub and then calling the REST API to retrieve the data.

For example, you can use a tool like Postman to send HTTP requests to your REST API and verify that the data is being returned correctly.

By following these steps, you should be able to build an IoT data pipeline for factory production machines using Azure IoT Edge, Azure IoT Hub, and Azure Functions in Python. The data is saved to SQL at hourly intervals, and the SQL data is exposed through a REST API.

  1. Set up Azure resources: Create an Azure IoT Hub and an Azure SQL Database. Follow the official Azure documentation to learn how to set up these resources and configure them.

  2. Set up Azure IoT Edge: Install Azure IoT Edge on your production machines and configure it to send data to your Azure IoT Hub.

  3. Set up Azure Functions: Create an Azure Functions app and a function that triggers on an IoT Hub event. Use this function to process the data sent by the IoT Edge device and save it to the SQL Database.

  4. Expose SQL data using a REST API: Create an Azure Function that exposes the data from the SQL Database using a REST API (see the sketch after this list).

  5. Build and deploy your solution: Write Python code to implement the data processing logic in your Azure Functions and deploy the code to Azure.
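
A hedged sketch for step 4: an HTTP-triggered function (Python v2 model) that reads rows from the SQL database and returns them as JSON. The table and column names and the SQL_CONNECTION_STRING app setting are assumptions; it uses pyodbc, so a matching ODBC driver must be available in the function app.

import json
import os
import pyodbc
import azure.functions as func

app = func.FunctionApp()

@app.route(route="measurements", auth_level=func.AuthLevel.FUNCTION)
def get_measurements(req: func.HttpRequest) -> func.HttpResponse:
    conn = pyodbc.connect(os.environ["SQL_CONNECTION_STRING"])  # assumed app setting
    rows = conn.cursor().execute(
        "SELECT TOP 100 DeviceId, AvgValue, WindowEnd FROM dbo.HourlyTelemetry"  # assumed table
    ).fetchall()
    payload = [{"deviceId": r.DeviceId, "avg": r.AvgValue, "windowEnd": str(r.WindowEnd)}
               for r in rows]
    return func.HttpResponse(json.dumps(payload), mimetype="application/json")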

To start developing your solution in Python, you can use the Azure SDK for Python. The SDK provides a set of Python libraries that allow you to interact with Azure services and resources from your Python code. You can use the SDK to manage Azure resources, send and receive data from IoT devices, and invoke Azure Functions.

The Azure SDK for Python can be installed using pip, like this:

pip install azure-iot-device azure-iothub-service-client azure-functions

Once you have the SDK installed, you can start writing Python code to interact with Azure services and resources. The official Azure documentation provides detailed guidance on how to use the Azure SDK for Python, as well as sample code and tutorials to help you get started.
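
For example, a device-side sketch that sends telemetry to IoT Hub with azure-iot-device (the connection string and payload fields are placeholders):

import json
import time
from azure.iot.device import IoTHubDeviceClient, Message

client = IoTHubDeviceClient.create_from_connection_string("<DEVICE_CONNECTION_STRING>")
while True:
    reading = {"machineId": "press-01", "temperature": 71.3, "ts": time.time()}  # hypothetical payload
    client.send_message(Message(json.dumps(reading)))
    time.sleep(60)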


Azure Stream Analytics


Stream Analytics can connect to:

  • Azure Event Hubs and Azure IoT Hub for streaming data ingestion, and
  • Azure Blob storage to ingest historical data.
  • Job input can also include static or slow-changing reference data from Azure Blob storage or SQL Database that you can join to streaming data to perform lookup operations.

Runs on:

  • serverless in the cloud
  • IoT Edge, for ultra-low-latency analytics

Used for:

  • Dashboards for data visualization
  • Real-time alerts from temporal and spatial patterns or anomalies
  • Extract, Transform, Load (ETL)

The Stream Analytics query language is consistent with SQL and supports:

  • simple data manipulation,
  • aggregation functions,
  • complex geospatial functions,
  • defining and invoking your own functions (JavaScript and C# only), and
  • calling functions defined in Azure Machine Learning

You can continue to use Stream Analytics by sending events to Event Hubs through the Event Hubs Kafka API, without changing the event sender.

Spark Structured Streaming + Databricks for Python

Property                              Description
EventProcessedUtcTime                 The date and time the event was processed by Stream Analytics.
EventEnqueuedUtcTime                  The date and time the event was received by the IoT Hub.
PartitionId                           The zero-based partition ID for the input adapter.
IoTHub.MessageId                      An ID used to correlate two-way communication in IoT Hub.
IoTHub.CorrelationId                  An ID used in message responses and feedback in IoT Hub.
IoTHub.ConnectionDeviceId             The authentication ID of the device used to send this message.
IoTHub.ConnectionDeviceGenerationId   The generation ID of the authenticated device.
IoTHub.EnqueuedTime                   The time the message was received by the IoT Hub.

Stream Analytics Query Language Reference - Stream Analytics Query | Microsoft Learn

SELECT
    DeviceID,
    Location.Lat,
    Location.Long,
    SensorReadings.SensorMetadata.Version
FROM input
-------------------
SELECT input.Location.*
FROM input
-------------------
SELECT
    input.DeviceID,
    thresholds.SensorName
FROM input      -- stream input
JOIN thresholds -- reference data input
ON
    input.DeviceId = thresholds.DeviceId
WHERE
    GetRecordPropertyValue(input.SensorReadings, thresholds.SensorName) > thresholds.Value
    -- the where statement selects the property value coming from the reference data
-------------------
SELECT
    GetArrayElement(arrayField, 0) AS firstElement
FROM input
------------------- Select all array elements as individual events. The APPLY operator together with the GetArrayElements built-in function extracts all array elements as individual events
SELECT
    arrayElement.ArrayIndex,
    arrayElement.ArrayValue
FROM input as event
CROSS APPLY GetArrayElements(event.arrayField) AS arrayElement
Function Name    Description
AVG              Calculates the average value of a numeric input over a specified time window.
COUNT            Counts the number of input events over a specified time window.
Collect          Collects input events into an array over a specified time window.
CollectTop       Collects the top N input events into an array over a specified time window.
MAX              Finds the maximum value of a numeric input over a specified time window.
MIN              Finds the minimum value of a numeric input over a specified time window.
Percentile_Cont  Calculates the continuous percentile of a numeric input over a specified time window.
Percentile_Disc  Calculates the discrete percentile of a numeric input over a specified time window.
STDEV            Calculates the standard deviation of a numeric input over a specified time window.
STDEVP           Calculates the population standard deviation of a numeric input over a specified time window.
SUM              Calculates the sum of a numeric input over a specified time window.
TopOne           Finds the top input event over a specified time window.
VAR              Calculates the variance of a numeric input over a specified time window.
VARP             Calculates the population variance of a numeric input over a specified time window.

Stream Analytics Using Reference Data

Use reference data for lookups in Azure Stream Analytics | Microsoft Learn

The LicensePlate data can be joined with a static dataset that has registration details to identify license plates that have expired.

SELECT I1.EntryTime, I1.LicensePlate, I1.TollId, R.RegistrationId
FROM Input1 I1 TIMESTAMP BY EntryTime
JOIN Registration R
ON I1.LicensePlate = R.LicensePlate
WHERE R.Expired = '1'

Examples

Currently, Azure Stream Analytics (ASA) only supports inserting (appending) rows to SQL outputs (Azure SQL Database and Azure Synapse Analytics). This article discusses workarounds to enable UPDATE, UPSERT, or MERGE on SQL databases, with Azure Functions as the intermediary layer.
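
A hedged sketch of that workaround: a function receives rows from ASA and performs the MERGE itself via pyodbc (the table and column names are assumptions):

import pyodbc

def upsert_device_status(conn_str: str, device_id: str, status: str) -> None:
    # MERGE keeps one row per device instead of appending (ASA's SQL output can only insert)
    with pyodbc.connect(conn_str) as conn:
        conn.execute(
            """
            MERGE dbo.DeviceStatus AS t
            USING (SELECT ? AS DeviceId, ? AS Status) AS s
            ON t.DeviceId = s.DeviceId
            WHEN MATCHED THEN UPDATE SET t.Status = s.Status
            WHEN NOT MATCHED THEN INSERT (DeviceId, Status) VALUES (s.DeviceId, s.Status);
            """,
            device_id, status,
        )
        conn.commit()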

Detecting fraudulent calls using a self-join on "CallRecTime" (idea: if the same IMSI calls through two different switches within 1-5 seconds, it is flagged as a fraudulent call)

 SELECT System.Timestamp AS WindowEnd, COUNT(*) AS FraudulentCalls
 INTO "MyPBIoutput"
 FROM "CallStream" CS1 TIMESTAMP BY CallRecTime
 JOIN "CallStream" CS2 TIMESTAMP BY CallRecTime
 ON CS1.CallingIMSI = CS2.CallingIMSI
 AND DATEDIFF(ss, CS1, CS2) BETWEEN 1 AND 5
 WHERE CS1.SwitchNum != CS2.SwitchNum
 GROUP BY TumblingWindow(Duration(second, 1))

Azure Bicep

Create Bicep files - Visual Studio Code - Azure Resource Manager | Microsoft Learn

infrastructure-as-code: instruction manual for your infrastructure. The manual details the end configuration of your resources and how to reach that configuration state.

Deploy bicep file

  1. Right-click the Bicep file inside the VSCode, and then select Deploy Bicep file.
  2. From the Select Resource Group listbox on the top, select Create new Resource Group.
  3. Enter exampleRG as the resource group name, and then press [ENTER].
  4. Select a location for the resource group, and then press [ENTER].
  5. From Select a parameter file, select None.
  6. Enter a unique storage account name, and then press [ENTER]. If you get an error message indicating the storage account is already taken, the storage name you provided is in use. Provide a name that is more likely to be unique.
  7. From Create parameters file from values used in this deployment?, select No.

Using Azure-cli:

az group create --name exampleRG --location eastus
az deployment group create --resource-group exampleRG --template-file main.bicep --parameters storageName=uniquename

Cleanup resources: az group delete --name exampleRG


Azure App Service (Web App)

  1. Make sure Docker image build and run in local successfully
  2. Create user-assigned managed identity (Azure Portal -> Managed Identities)
  3. Create container registry (Azure Portal -> Container registries)
  4. Go to resource -> Container registries -> Access keys -> Enable Admin user -> Copy Username and Password
  5. Push the image to Azure Container registry
    1. docker login strandaiapistaging.azurecr.io
    2. docker tag strand-linux strandaiapistaging.azurecr.io/stranddemo:latest
    3. docker push strandaiapistaging.azurecr.io/stranddemo:latest
  6. Authorize managed identity for your registry
    1. Container registry -> Access control (IAM) -> Role assignments -> Add role assignment
    2. Select AcrPull
    3. Member: select the managed identity you created
  7. Create web app: App Services -> Create -> Web App
  8. Configure the Web App:
    1. Web App -> Configuration -> Application settings -> New application setting -> WEBSITES_PORT: 8502
    2. Identity -> User assigned -> Select the managed identity you created
    3. Deployment Center -> Authentication -> Managed Identity -> Select the managed identity you created (also enable Continuous Deployment at the end)
docker rmi strandgptapiprd.azurecr.io/strandapi:latest
docker tag strand-prod-amd64 strandgptapiprd.azurecr.io/strandapi:latest
docker push strandgptapiprd.azurecr.io/strandapi:latest