Last active May 21, 2024 04:28
Build WebNN EP

WebNN EP is built on top of ONNX Runtime for Web.


Install all prerequisites before building.


WebNN EP depends on JSEP: it relies on the jsepWrapAsync and runAsync implementations in pre-jsep.js to call the asynchronous WebNN API.

  1. Use the following command line to build wasm with SIMD + multi-thread support:
  • Build Release:

C:\code\onnxruntime>.\build.bat --config Release --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --skip_tests

  • Build Debug (optional, for debugging):

C:\code\onnxruntime>.\build.bat --config Debug --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --enable_wasm_debug_info --skip_tests

  2. Prepare js/web:

C:\code\onnxruntime>cd js
C:\code\onnxruntime\js>npm ci
C:\code\onnxruntime\js>cd common
C:\code\onnxruntime\js\common>npm ci
C:\code\onnxruntime\js\common>cd ..\web
C:\code\onnxruntime\js\web>npm ci
C:\code\onnxruntime\js\web>npm run pull:wasm

  • For Release:

C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.wasm .\dist\

  • For Debug:

C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.wasm .\dist\

  3. Build Web artifacts:

C:\code\onnxruntime\js\web>npm run build

This generates the final JavaScript bundle files, which are written to <ORT_ROOT>/js/web/dist.
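The copy commands in the "Prepare js/web" step can also be scripted so they work outside a Windows shell. The following is a minimal Node.js sketch, not part of the ONNX Runtime tooling: the artifact names are taken from the copy commands above, while `copyWasmArtifacts` and its two directory arguments are hypothetical names chosen for illustration.

```javascript
// Sketch: copy the freshly built WebNN-enabled Wasm artifacts into js/web/dist.
// `copyWasmArtifacts` is a hypothetical helper, not an ONNX Runtime script.
const fs = require('node:fs');
const path = require('node:path');

// File names as they appear in the copy commands of the build steps.
const ARTIFACTS = [
  'ort-wasm-simd-threaded.jsep.mjs',
  'ort-wasm-simd-threaded.jsep.wasm',
];

function copyWasmArtifacts(buildDir, distDir) {
  fs.mkdirSync(distDir, { recursive: true });
  const copied = [];
  for (const name of ARTIFACTS) {
    const from = path.join(buildDir, name);
    if (!fs.existsSync(from)) {
      throw new Error(`Missing build artifact: ${from}`);
    }
    // copyFileSync overwrites an existing destination by default, like `copy /Y`.
    fs.copyFileSync(from, path.join(distDir, name));
    copied.push(name);
  }
  return copied;
}

// Example, run from the repository root after a Release build:
// copyWasmArtifacts('build/Windows/Release', 'js/web/dist');
```

Failing loudly on a missing artifact is a deliberate choice here: a stale file left in dist would otherwise be bundled silently.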

Run npm tests

  • Run with Chrome Stable (currently supports the WebNN CPU deviceType only):

C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug

  • Run with Chrome Canary (supports both the WebNN CPU and GPU deviceTypes):

C:\code\onnxruntime\js\web>set CHROME_BIN=<path-to-chrome-canary>

  • Run test:

C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug
e.g. npm run test -- suite1 -b=webnn --wasm-number-threads 1 --debug --webnn-device-type gpu


Make sure the files mentioned above are in the dist folder.

  • Import from HTML's <script> tag:
<script src="./dist/ort.all.min.js"></script>

or, for the unminified bundle:

<script src="./dist/ort.all.js"></script>
  • Import from source code inside <script type="module"> tag (ESM):
<script type="module">
  import * as ort from "./dist/ort.all.min.mjs";
</script>

or, for the unminified bundle:

<script type="module">
  import * as ort from "./dist/ort.all.mjs";
</script>
  • Import in a CommonJS project (CJS format, resolve from package.json "exports" field):
const ort = require('onnxruntime-web/all');
  • Import in an ESM project (ESM format, resolve from package.json "exports" field):
import * as ort from 'onnxruntime-web/all';
  • Use released version:

WebNN EP has been available in the onnxruntime-web dev channel since 1.18.0-dev.20240126; use that version or a newer one.

<script src="" integrity="sha512-YZnoZqAi/xvZBkTDkyLGRAcNST4wpq/vtIJ+0NCvC8j0qJ9WyVWNfqSLe6co1VOoKfX+zc415jLZaCQuVu/QqA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>


npm i onnxruntime-web@1.18.0-dev.20240126-fc44f96ad5
  • Set ORT Web ENV:
ort.env.wasm.numThreads = 4; // only available when the server allows SharedArrayBuffer
ort.env.wasm.proxy = false; // true to run inference in a Web Worker
ort.env.logLevel = 'warning'; // set the severity level for logging. 'verbose' | 'info' | 'warning' | 'error' | 'fatal'
ort.env.debug = false;  // true when using a Wasm build made with the Debug config
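Because multi-threading depends on SharedArrayBuffer, it can help to derive numThreads from what the page actually supports instead of hard-coding it. The sketch below assumes the usual browser rule that SharedArrayBuffer is only usable on cross-origin isolated pages; `pickNumThreads` is a hypothetical helper, not an onnxruntime-web API.

```javascript
// Sketch: choose a safe value for ort.env.wasm.numThreads.
// `pickNumThreads` is a hypothetical helper, not part of onnxruntime-web.
function pickNumThreads(requested, scope = globalThis) {
  // SharedArrayBuffer is only usable on cross-origin isolated pages,
  // i.e. when the server sends suitable COOP/COEP headers.
  const canShareMemory =
    typeof scope.SharedArrayBuffer === 'function' &&
    scope.crossOriginIsolated === true;
  // Fall back to a single thread when shared memory is unavailable.
  return canShareMemory ? Math.max(1, Math.floor(requested)) : 1;
}

// In a page that has already loaded ort:
// ort.env.wasm.numThreads = pickNumThreads(4);
```

The second argument exists only so the check can be exercised outside a browser; in a page the default `globalThis` is enough.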
  • Create ORT inference session:
const options = {
  executionProviders: [
    {
      name: 'webnn',
      deviceType: 'cpu', // 'cpu', 'gpu' or 'npu'
      powerPreference: 'default',
      numThreads: 1, // allows multi-threading for WebNN deviceType 'cpu'
    },
  ],
};
// Free dimension override. Only needed for models with a dynamic input shape.
options.freeDimensionOverrides = {
  batch_size: 1, // "batch_size" is an example of a dynamic dimension name in the input shape
};
options.logSeverityLevel = 0; // 0: kVERBOSE|1: kINFO|2: kWARNING|3: kERROR|4: kFATAL
// Check for more options.

// Create inference session.
const session = await ort.InferenceSession.create(modelPath, options);
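Not every browser exposes WebNN, so the executionProviders list can be built at runtime. The following sketch assumes that WebNN availability can be detected through the `navigator.ml` entry point and that the plain 'wasm' provider is an acceptable fallback; `buildExecutionProviders` is a hypothetical helper name.

```javascript
// Sketch: list WebNN first when the browser exposes the WebNN entry point
// (navigator.ml), and always keep 'wasm' as a fallback.
// `buildExecutionProviders` is a hypothetical helper, not an ORT API.
function buildExecutionProviders(nav, deviceType = 'cpu') {
  const hasWebNN = nav !== null && typeof nav === 'object' && 'ml' in nav;
  const providers = [];
  if (hasWebNN) {
    providers.push({ name: 'webnn', deviceType });
  }
  // Execution providers are tried in list order, so 'wasm' goes last.
  providers.push('wasm');
  return providers;
}

// In a page:
// const options = { executionProviders: buildExecutionProviders(navigator, 'gpu') };
```

Detecting `navigator.ml` only shows that the API exists; creating a context for a given deviceType may still fail, which is why the fallback stays in the list.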
  • Run ORT inference session:
// prepare inputs
const dataA = new Float32Array(12).fill(1);
const dataB = new Float32Array(12).fill(2);
const tensorA = new ort.Tensor('float32', dataA, [3, 4]);
const tensorB = new ort.Tensor('float32', dataB, [4, 3]);

// prepare feeds. use model input names as keys.
const feeds = {
  a: tensorA,
  b: tensorB,
};

// feed inputs and run
const results = await session.run(feeds);
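The object returned by `session.run` maps each model output name to a tensor. As a rough sketch of how to inspect it, the helper below lists every output with its type, dims and first few values. `summarizeResults` is a hypothetical name, and the sketch assumes the outputs are CPU tensors whose `data` can be read directly.

```javascript
// Sketch: summarize the outputs of an inference run.
// `summarizeResults` is a hypothetical helper; it assumes every output is a
// CPU tensor exposing `type`, `dims` and `data`.
function summarizeResults(results, preview = 4) {
  return Object.entries(results).map(([name, tensor]) => ({
    name,
    type: tensor.type,
    dims: Array.from(tensor.dims),
    // Only the first `preview` values, to keep the log readable.
    head: Array.from(tensor.data).slice(0, preview),
  }));
}

// After running the session:
// console.table(summarizeResults(results));
```

The output names depend on the model, so iterating over the entries avoids hard-coding them.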

Check more examples of API usage from
