Build WebNN EP

WebNN EP is built on top of ONNX Runtime for Web.

Prerequisites

Follow https://onnxruntime.ai/docs/build/web.html#prerequisites to install all prerequisites.
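
Once the prerequisites are installed, you can do a quick sanity check that the usual toolchain is on the PATH (the exact required versions are listed on the page above; these commands are only an illustration):

C:\code\onnxruntime>node --version
C:\code\onnxruntime>python --version
C:\code\onnxruntime>cmake --version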

Build

WebNN EP depends on JSEP in order to take advantage of the jsepWrapAsync and runAsync implementations in pre-jsep.js, which enable the asynchronous WebNN API usage in WebNN EP.
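
Conceptually, this wrapping is what lets the otherwise synchronous Wasm execution path wait on Promise-based browser APIs such as WebNN's MLContext.compute(). The snippet below is only a sketch of that idea; the function names are hypothetical and are not ONNX Runtime's actual internals:

// Illustrative sketch only: these names are hypothetical, not ORT internals.
// WebNN's MLContext.compute() returns a Promise, so the caller must be async.
async function runWebnnGraph(mlContext, mlGraph, inputs, outputs) {
  // Execution resumes only after the WebNN results are ready.
  return await mlContext.compute(mlGraph, inputs, outputs);
}

// A sync-looking entry point is wrapped so pending Promises can be awaited
// before control returns to the Wasm side.
function wrapAsync(run) {
  return async (...args) => await run(...args);
}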

  1. Use the following command line to build wasm with SIMD + multi-thread support:
  • Build Release:

C:\code\onnxruntime>.\build.bat --config Release --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --skip_tests

  • Build Debug (optional for debug purpose):

C:\code\onnxruntime>.\build.bat --config Debug --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --enable_wasm_debug_info --skip_tests

  2. Prepare js/web:

C:\code\onnxruntime>cd js
C:\code\onnxruntime\js>npm ci
C:\code\onnxruntime\js>cd common
C:\code\onnxruntime\js\common>npm ci
C:\code\onnxruntime\js\common>cd ..\web
C:\code\onnxruntime\js\web>npm ci
C:\code\onnxruntime\js\web>npm run pull:wasm

  • For Release:

C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.wasm .\dist\

  • For Debug:

C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.wasm .\dist\

  3. Build Web artifacts:

C:\code\onnxruntime\js\web>npm run build

This generates the final JavaScript bundle files to use; they are located under <ORT_ROOT>/js/web/dist.
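
To verify the output, you can list the generated bundles; the file names referenced in the Usage section below should be present:

C:\code\onnxruntime\js\web>dir dist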

Run npm tests

  • Run with Chrome Stable (supports the WebNN CPU deviceType only at present):

C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug

  • Run with Chrome Canary (supports both the WebNN CPU and GPU deviceTypes):
    Set the CHROME_BIN environment variable:

C:\code\onnxruntime\js\web>set CHROME_BIN=<path-to-chrome-canary>

  • Run test:

C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug
e.g. npm run test -- suite1 -b=webnn --wasm-number-threads 1 --debug --webnn-device-type gpu

Usage

Make sure the files mentioned above are placed in the dist folder.

  • Import from HTML's <script> tag:
<script src="./dist/ort.all.min.js"></script>

or

<script src="./dist/ort.all.js"></script>
  • Import from source code inside <script type="module"> tag (ESM):
<script type="module">
  import * as ort from "ort.all.min.mjs";
</script>

or

<script type="module">
  import * as ort from "ort.all.mjs";
</script>
  • Import in a CommonJS project (CJS format, resolve from package.json "exports" field):
const ort = require('onnxruntime-web/all');
  • Import in an ESM project (ESM format, resolve from package.json "exports" field):
import * as ort from 'onnxruntime-web/all';
  • Use released version:

WebNN EP has been released in the onnxruntime-web dev channel starting from 1.18.0-dev.20240126; you can use either this version or a newer one.

<script src="https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.18.0-dev.20240126-fc44f96ad5/ort.all.min.js" integrity="sha512-YZnoZqAi/xvZBkTDkyLGRAcNST4wpq/vtIJ+0NCvC8j0qJ9WyVWNfqSLe6co1VOoKfX+zc415jLZaCQuVu/QqA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>

or

npm i onnxruntime-web@1.18.0-dev.20240126-fc44f96ad5
  • Set ORT Web ENV:
ort.env.wasm.numThreads = 4; // only available when the server allows SharedArrayBuffer
ort.env.wasm.proxy = false; // set to true to run behind a Web Worker
ort.env.logLevel = 'warning'; // set the severity level for logging. 'verbose' | 'info' | 'warning' | 'error' | 'fatal'
ort.env.debug = false;  // set to true when using a Debug Wasm build
  • Create ORT inference session:
const options = {
  executionProviders: [
    {
      name: 'webnn',
      deviceType: 'cpu', // 'cpu', 'gpu' or 'npu'
      powerPreference: "default",
      numThreads: 1, // allows using multiple threads for WebNN deviceType 'cpu'
    },
  ],
}
// Free dimension override. Only needed for models with dynamic input shapes.
options.freeDimensionOverrides = {
  batch_size: 1, // "batch_size" is an example of a dynamic dimension name in the input shape
};
options.logSeverityLevel = 0; // 0: kVERBOSE|1: kINFO|2: kWARNING|3: kERROR|4: kFATAL
// Check https://github.com/microsoft/onnxruntime/blob/main/js/common/lib/inference-session.ts for more options.

// Create inference session.
const session = await ort.InferenceSession.create(modelPath, options);
  • Run ORT inference session:
// prepare inputs
const dataA = new Float32Array(12).fill(1);
const dataB = new Float32Array(12).fill(2);
const tensorA = new ort.Tensor('float32', dataA, [3, 4]);
const tensorB = new ort.Tensor('float32', dataB, [4, 3]);

// prepare feeds. use model input names as keys.
const feeds = {
  a: tensorA,
  b: tensorB
};

// feed inputs and run
const results = await session.run(feeds);
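
The returned results object maps each model output name to an ort.Tensor. For example, assuming the model has a single output named 'c' (a hypothetical name here; check session.outputNames for the actual ones):

// 'c' is a hypothetical output name; use session.outputNames to list the real ones.
const output = results.c;
console.log(output.dims); // e.g. [3, 3]
console.log(output.data); // Float32Array holding the result values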

See more API usage examples at https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js.
