Node.js' fetch() implementation depends on Undici. Undici's fetch() does not support fetching the file: protocol. See
- Isomorphic local file fetching #45798
- fetch: Enable fetch via file URL (under flag) #2751
- Standardize fetch calls to file: URLs
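To see the behaviour those issues describe on your own machine, a small probe like the following can help. This is a minimal sketch: probeFileFetch is a hypothetical helper name, and the exact error text depends on the Node.js and Undici versions in use.

```javascript
// Hypothetical helper: reports how the runtime's global fetch() treats a URL
// instead of letting the rejection propagate.
async function probeFileFetch(url) {
  try {
    const response = await fetch(url);
    return { ok: true, status: response.status };
  } catch (e) {
    // Undici wraps the underlying reason in `cause`.
    return { ok: false, name: e.name, message: e.message, cause: e.cause?.message };
  }
}

probeFileFetch("file:///home/user/bin/nm_host.js").then(console.log);
```

Running the same file under deno or bun is a quick way to compare how each runtime handles the file: URL.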
I've been using XMLHttpRequest() and fetch() to get file: protocol URLs for quite some time in the browser, both with and without browser extensions, on Chromium-based browsers and Firefox.
On Chromium and Chrome launched with the --allow-file-access-from-files flag, and on Firefox, we can do
let xhr = new XMLHttpRequest();
xhr.onload = (e) => console.log(xhr.response);
xhr.open("GET", "file:///home/user/bin/nm_host.js");
xhr.send(null);
In a Chromium-based browser (Chrome, Brave, Opera, Edge) we can create an unpacked extension with this in the manifest.json
"host_permissions": [
"file:///*"
],
then we can fetch file: URLs from the extension, which means we can control fetching file: URLs from any arbitrary URL, because we have "externally_connectable" and other ways to communicate with the extension.
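As a rough illustration of that idea, the extension's background script could answer messages from a web page by fetching the requested file: URL. This is an untested sketch: the { url } message shape and the handleFileRequest name are assumptions, and it presumes the extension can fetch file: URLs as described above. chrome.runtime.onMessageExternal is the extension API that receives messages from pages listed under "externally_connectable".

```javascript
// background.js (extension service worker), a sketch only.
// Hypothetical message shape: { url: "file:///..." }.
async function handleFileRequest(message, fetchImpl = fetch) {
  if (typeof message?.url !== "string" || !message.url.startsWith("file:")) {
    return { ok: false, error: "expected a file: URL" };
  }
  try {
    const response = await fetchImpl(message.url);
    return { ok: true, text: await response.text() };
  } catch (e) {
    return { ok: false, error: String(e) };
  }
}

// Register the listener only where the extension API exists.
if (globalThis.chrome?.runtime?.onMessageExternal) {
  chrome.runtime.onMessageExternal.addListener((message, sender, sendResponse) => {
    handleFileRequest(message).then(sendResponse);
    return true; // keep the message channel open for the async reply
  });
}
```

A handler like this hands local file contents to whichever pages are allowed to message the extension, so the "externally_connectable" matches should be kept as narrow as possible.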
Both deno and bun support the file: protocol for fetch().
I found this StackOverflow question, How to get a local file via fetch/axios?, interesting and specific about not using Node.js' fs module or import/import() to get the file:
For study reasons I need to use some of these network APIs like fetch or axios but to get a LOCAL file, so WITHOUT using fs module, or WITHOUT importing them.
So I dove into Undici's source. In the source code at /lib/web/fetch/index.js we find
case 'file:': {
// For now, unfortunate as it is, file URLs are left as an exercise for the reader.
// When in doubt, return a network error.
return Promise.resolve(makeNetworkError('not implemented... yet...'))
}
When we use the file: protocol without any modifications to Undici's fetch() we get this roadmap of errors
TypeError: fetch failed
at fetch (/node_modules/undici/index.js:109:13) {
[cause]: TypeError: Invalid URL
at new URL (node:internal/url:804:36)
at parseURL (/node_modules/undici/lib/core/util.js:51:11)
at Object.parseOrigin (/node_modules/undici/lib/core/util.js:117:9)
at new Pool (/node_modules/undici/lib/dispatcher/pool.js:70:23)
at Agent.defaultFactory (/node_modules/undici/lib/dispatcher/agent.js:22:7)
at [dispatch] (/node_modules/undici/lib/dispatcher/agent.js:93:34)
at Intercept (/node_modules/undici/lib/interceptor/redirect-interceptor.js:11:16)
at [Intercepted Dispatch] (/node_modules/undici/lib/dispatcher/dispatcher-base.js:158:12)
at Agent.dispatch (/node_modules/undici/lib/dispatcher/dispatcher-base.js:179:40)
at /node_modules/undici/lib/web/fetch/index.js:2079:51 {
code: 'ERR_INVALID_URL',
input: 'null'
}
}
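The input: 'null' at the bottom of that trace is likely explained by how the URL class treats file: URLs: they have no host, and their origin serializes to the string "null". The trace shows parseOrigin() handing its input to new URL(), which rejects that string. The snippet below only demonstrates the URL behaviour in Node.js; tryParse is a hypothetical helper, and how Undici passes the origin along is inferred from the stack trace rather than verified line by line.

```javascript
const url = new URL("file:///home/user/bin/nm_host.js");

console.log(url.protocol); // "file:"
console.log(url.host);     // "" (there is no host to connect to)
console.log(url.origin);   // "null" (a string, not the value null)

// Hypothetical helper: returns the parsed href, or the error code if parsing fails.
function tryParse(input) {
  try {
    return new URL(input).href;
  } catch (e) {
    return e.code;
  }
}

// Parsing the origin string fails the same way the trace shows.
console.log(tryParse(url.origin)); // "ERR_INVALID_URL"
```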
Debugging and modifying each line from the respective files is tedious, yet educational. I modified parseURL() and parseOrigin() in undici/lib/core/util.js, and [kInterceptedDispatch] in DispatcherBase.
That leaves this error from undici/lib/web/fetch/index.js
[cause]: Error: connect ECONNREFUSED 127.0.0.1:80
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1606:16)
{
errno: -111,
code: 'ECONNREFUSED',
syscall: 'connect',
address: '127.0.0.1',
port: 80
}
Looks like Undici's fetch() is using TCPConnectWrap for fetch(), which I have not dived into yet. We have some breadcrumbs to follow in
- core/request.js: http proxies support #446
- Unable to connect to establish a TCP connection (ECONNREFUSED) #40702
We'll set that branch aside for the moment and explore other approaches for the time being.
My own programming process usually involves concurrent branches of different approaches to solve an issue or achieve a requirement.
If we did not have the restriction to not use fs, or ECMAScript static import or dynamic import(), we could just use either of those directly, by short-circuiting Undici's fetch() in undici/lib/web/fetch/index.js at around #L129 when the URL is file:, where fs module usage will be unobservable to the end user
const p = createDeferredPromise()
const url = new URL(input);
if (url.protocol === "file:") {
  return import("node:fs").then((fs) => {
    p.resolve(new Response(fs.readFileSync(url)));
    return p.promise;
  })
}
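The same short circuit can also be applied from the outside, without editing files under node_modules. The following is a minimal sketch, assuming Node.js 18 or later for the global fetch, Request and Response; fetchWithFile is a hypothetical wrapper name, not something Undici or Node.js provides. It still uses fs internally, so it does not satisfy the StackOverflow restriction either.

```javascript
// Hypothetical wrapper: handles file: URLs with fs, delegates everything else
// to the runtime's own fetch().
async function fetchWithFile(input, init) {
  const url = new URL(input instanceof Request ? input.url : input);
  if (url.protocol !== "file:") {
    return fetch(input, init);
  }
  // Dynamic import keeps fs out of the caller's own import list.
  const { readFile } = await import("node:fs/promises");
  // readFile() accepts a file: URL object directly.
  return new Response(await readFile(url));
}

fetchWithFile("file:///home/user/bin/nm_host.js")
  .then((r) => r.text())
  .then(console.log)
  .catch(console.error);
```

Unlike a real fetch(), a missing file here rejects with the fs error (for example ENOENT) rather than a TypeError, and no Content-Type header is set.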
ECMAScript modules' import() fetches a module by URL (in the browser that is a network request), and import/import() works for file: URLs, e.g.,
await import(import.meta.resolve("file:///home/user/bin/exports.js"));
I find it interesting we can successfully do the above, but Undici's fetch() for the same URL throws.
While I was modifying Undici's fetch() source code I was also thinking about other ways to meet the requirement of not using Node.js' fs or ECMAScript modules, without using Undici's fetch() at all.
We know curl supports fetching file: URLs. curl is also portable. So we can fetch and build curl just for this purpose
git clone https://github.com/curl/curl.git
cd curl
autoreconf -fi
# Disable everything we are not using here
LDFLAGS="-static" ./configure --disable-alt-svc --disable-ares --disable-cookies --disable-basic-auth --disable-bearer-auth --disable-digest-auth --disable-kerberos-auth --disable-negotiate-auth --disable-aws --disable-dateparse --disable-dnsshuffle --disable-doh --disable-form-api --disable-get-easy-options --disable-hsts --disable-http-auth --disable-ipv6 --disable-libcurl-option --disable-manual --disable-ntlm --disable-ntlm-wb --disable-progress-meter --disable-proxy --disable-pthreads --disable-socketpair --disable-threaded-resolver --disable-tls-srp --disable-unix-sockets --disable-versioned-symbols --without-brotli --without-libpsl --without-nghttp2 --without-ngtcp2 --without-zstd --without-libidn2 --without-librtmp --without-ssl --without-zlib --enable-static --prefix=$HOME/bin
make -j $(nproc)
make install
then use curl for our file: URLs
import { spawn } from "node:child_process";
import { Duplex } from "node:stream";

// Extension-to-MIME-type map, adapted from
// https://github.com/chcunningham/atomics-post-message/blob/main/server.js
const mimeTypes = {
  "html": "text/html",
  "jpeg": "image/jpeg",
  "jpg": "image/jpeg",
  "png": "image/png",
  "js": "text/javascript",
  "wasm": "application/wasm",
  "css": "text/css",
};

async function fetchFile(path) {
  // -N: no output buffering, -S: show errors, -s: silent (no progress meter)
  const { stdout, stderr } = spawn("./bin/curl", ["-NSs", path]);
  // curl reports failures on stderr, e.g. for a file that does not exist:
  // "curl: (37) Couldn't open file /home/user/bin/nm_hosts.js\n"
  // stderr is read to its end before the Response is returned.
  const err = await new Response(Duplex.toWeb(stderr).readable).text();
  if (err) {
    throw new Error(err.trim());
  }
  // Stream curl's stdout as the Response body.
  return new Response(Duplex.toWeb(stdout).readable, {
    headers: {
      "Content-Type": mimeTypes[path.split(".").pop()] || "text/plain",
    },
  });
}

export { fetchFile };
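The Content-Type lookup in fetchFile() takes whatever follows the last dot in the path, so an upper-case extension such as .PNG falls through to text/plain, and a name that happens to match an inherited object property would return something other than a string. A slightly more defensive variant could look like this. It is only a sketch: contentTypeFor is a hypothetical helper, and the list of types is the same short list used above.

```javascript
// Same mapping as above, kept in a Map so only these keys can ever match.
const mimeTypeMap = new Map([
  ["html", "text/html"],
  ["jpeg", "image/jpeg"],
  ["jpg", "image/jpeg"],
  ["png", "image/png"],
  ["js", "text/javascript"],
  ["wasm", "application/wasm"],
  ["css", "text/css"],
]);

// Hypothetical helper: derive a Content-Type from the last path segment of a URL.
function contentTypeFor(fileUrl) {
  const { pathname } = new URL(fileUrl);
  const name = pathname.slice(pathname.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  // dot > 0 skips names without an extension and dotfiles such as ".css".
  const ext = dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
  return mimeTypeMap.get(ext) ?? "text/plain";
}

console.log(contentTypeFor("file:///home/user/bin/nm_host.js")); // "text/javascript"
console.log(contentTypeFor("file:///home/user/photo.PNG"));      // "image/png"
console.log(contentTypeFor("file:///home/user/README"));         // "text/plain"
```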
In the module that imports fetchFile
import { fetchFile } from "./fetchFile.js";
const file = "file:///home/user/bin/nm_host.js";
fetchFile(file)
  .then((r) => {
    console.log(r.headers.get("content-type"));
    return r.text();
  })
  .then(console.log)
  .catch((err) => console.error({ err }));
curl might be a little heavy just to fetch from the file: protocol in JavaScript files that use node and fetch(). The above is not a full implementation of WHATWG Fetch, either. It works within the restrictions of the question posted on StackOverflow.
Using the dd command and GNU Coreutils head, respectively, to get the file