WebNN EP is built on top of ONNX Runtime Web.
Follow https://onnxruntime.ai/docs/build/web.html#prerequisites to install all prerequisites.
WebNN EP depends on JSEP in order to take advantage of the jsepWrapAsync and runAsync implementations in pre-jsep.js, which enable the use of WebNN's async APIs in the WebNN EP.
- Use the following command line to build wasm with SIMD + multi-thread support:
- Build Release:
C:\code\onnxruntime>.\build.bat --config Release --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --skip_tests
- Build Debug (optional, for debugging purposes):
C:\code\onnxruntime>.\build.bat --config Debug --build_wasm --enable_wasm_simd --enable_wasm_threads --use_jsep --use_webnn --target onnxruntime_webassembly --enable_wasm_debug_info --skip_tests
- Prepare js/web:
C:\code\onnxruntime>cd js
C:\code\onnxruntime\js>npm ci
C:\code\onnxruntime\js>cd common
C:\code\onnxruntime\js\common>npm ci
C:\code\onnxruntime\js\common>cd ..\web
C:\code\onnxruntime\js\web>npm ci
C:\code\onnxruntime\js\web>npm run pull:wasm
- For Release:
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Release\ort-wasm-simd-threaded.jsep.wasm .\dist\
- For Debug:
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.mjs .\dist\
C:\code\onnxruntime\js\web>copy /Y ..\..\build\Windows\Debug\ort-wasm-simd-threaded.jsep.wasm .\dist\
- Build Web artifacts:
C:\code\onnxruntime\js\web>npm run build
This generates the final JavaScript bundle files to use. They are located under <ORT_ROOT>/js/web/dist.
- Run with Chrome Stable (supports only the WebNN CPU deviceType at present):
C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug
- Run with Chrome Canary (supports both the WebNN CPU and GPU deviceTypes):
- Set CHROME_BIN ENV:
C:\code\onnxruntime\js\web>set CHROME_BIN=<path-to-chrome-canary>
- Run test:
C:\code\onnxruntime\js\web>npm test -- model <path-to-ort-model-test-folder> -b=webnn --wasm-number-threads=1 --debug
e.g. npm run test -- suite1 -b=webnn --wasm-number-threads 1 --debug --webnn-device-type gpu
Make sure the files mentioned above are placed in the dist folder.
- Import from HTML's <script> tag:
<script src="./dist/ort.all.min.js"></script>
or
<script src="./dist/ort.all.js"></script>
- Import from source code inside a <script type="module"> tag (ESM):
<script type="module">
import * as ort from "./dist/ort.all.min.mjs";
</script>
or
<script type="module">
import * as ort from "./dist/ort.all.mjs";
</script>
- Import in a CommonJS project (CJS format, resolve from package.json "exports" field):
const ort = require('onnxruntime-web/all');
- Import in an ESM project (ESM format, resolve from package.json "exports" field):
import * as ort from 'onnxruntime-web/all';
- Use a released version:
WebNN EP has been released in the onnxruntime-web dev channel starting from 1.18.0-dev.20240126; you can use either this version or a newer one.
<script src="https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.18.0-dev.20240126-fc44f96ad5/ort.all.min.js" integrity="sha512-YZnoZqAi/xvZBkTDkyLGRAcNST4wpq/vtIJ+0NCvC8j0qJ9WyVWNfqSLe6co1VOoKfX+zc415jLZaCQuVu/QqA==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
or
npm i onnxruntime-web@1.18.0-dev.20240126-fc44f96ad5
- Set ORT Web ENV:
ort.env.wasm.numThreads = 4; // multi-threading is only available when the page is cross-origin isolated (SharedArrayBuffer enabled); see the server sketch below
ort.env.wasm.proxy = false; // set to true to run the Wasm backend inside a Web Worker proxy
ort.env.logLevel = 'warning'; // severity level for logging: 'verbose' | 'info' | 'warning' | 'error' | 'fatal'
ort.env.debug = false; // set to true when using the Debug Wasm build
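Because multi-threading depends on SharedArrayBuffer, the page that loads ORT Web must be cross-origin isolated. The following is a minimal sketch, not part of ONNX Runtime, of a Node.js static server that sends the required headers; the port, file paths, and MIME table are illustrative assumptions:
// Sketch: static file server that makes the page cross-origin isolated,
// so SharedArrayBuffer (and ort.env.wasm.numThreads > 1) becomes available.
const http = require('http');
const fs = require('fs');
const path = require('path');
const mime = { '.html': 'text/html', '.js': 'text/javascript', '.mjs': 'text/javascript', '.wasm': 'application/wasm' };
http.createServer((req, res) => {
  const file = path.join(__dirname, req.url === '/' ? 'index.html' : req.url);
  fs.readFile(file, (err, data) => {
    if (err) { res.writeHead(404); res.end(); return; }
    res.writeHead(200, {
      'Content-Type': mime[path.extname(file)] || 'application/octet-stream',
      'Cross-Origin-Opener-Policy': 'same-origin', // these two headers enable
      'Cross-Origin-Embedder-Policy': 'require-corp', // cross-origin isolation
    });
    res.end(data);
  });
}).listen(8080);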
- Create ORT inference session:
const options = {
executionProviders: [
{
name: 'webnn',
deviceType: 'cpu', // 'cpu', 'gpu' or 'npu'
powerPreference: "default",
},
],
}
// Free dimension overrides. Only needed for models with dynamic input shapes.
options.freeDimensionOverrides = {
batch_size: 1, // "batch_size" is an example of a dynamic dimension name in the input shape
};
options.logSeverityLevel = 0; // 0: kVERBOSE|1: kINFO|2: kWARNING|3: kERROR|4: kFATAL
// Check https://github.com/microsoft/onnxruntime/blob/main/js/common/lib/inference-session.ts for more options.
// Create inference session.
const session = await ort.InferenceSession.create(modelPath, options);
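If you are unsure whether the browser exposes WebNN at all, you can feature-detect it and fall back to the default wasm EP. This is a hedged, standalone variant of the snippet above rather than an official recommendation; it only assumes that WebNN-capable browsers expose navigator.ml and that executionProviders accepts a list tried in order:
// Sketch: prefer WebNN when navigator.ml exists, otherwise fall back to the wasm EP.
const options = {
  executionProviders: ('ml' in navigator)
    ? [{ name: 'webnn', deviceType: 'cpu', powerPreference: 'default' }, 'wasm']
    : ['wasm'],
};
const session = await ort.InferenceSession.create(modelPath, options);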
- Run ORT inference session:
// prepare inputs
const dataA = new Float32Array(12).fill(1);
const dataB = new Float32Array(12).fill(2);
const tensorA = new ort.Tensor('float32', dataA, [3, 4]);
const tensorB = new ort.Tensor('float32', dataB, [4, 3]);
// prepare feeds. use model input names as keys.
const feeds = {
a: tensorA,
b: tensorB
};
// feed inputs and run
const results = await session.run(feeds);
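Each entry of results is an ort.Tensor keyed by a model output name (session.outputNames lists them). As a small illustration, assuming a single output named 'c' (a hypothetical name, not taken from the model above):
// 'c' is a hypothetical output name; check session.outputNames for the real ones.
const output = results.c;
console.log(output.dims); // the output shape
console.log(output.data); // a typed array (e.g. Float32Array) holding the output values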
For more examples of API usage, see https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js.