Frenemies: Mutual suspicion in Node.js

Node Frenemies

Lets project teams grant code they know more authority than code they don't.

Node.js projects are collaborations between different authors:

  • First-party authors who understand the project's goals and security needs.
  • Authors of prod dependencies in package.json -- packages explicitly selected by first-party authors, and whose authors are familiar with the security needs of the kinds of projects that need them.
  • Deep dependencies authored by people trying to solve a general problem with little insight into any particular project's specific security needs. A dependency is "deep" when you have to go through multiple project manifests to find out why it is installed.

Goal

Allow a project's authors to grant more authority to modules they are familiar with while still granting some authority to deep dependencies.

We want to make sure the path of least resistance for a deep dependency is to degrade gracefully when granted less authority instead of working around security measures to provide a higher quality of service but at a higher level of risk.

Non-goals

It is not a goal to allow one to safely load malicious third-party code. Even code written in good faith can put a project at risk if parts are written carelessly or if its authors define "secure" differently from the first-party authors.

We assume that all module authors are acting in good faith but that first-party and third-party authors still end up working at cross purposes and that first-party authors are better placed to represent end-users' interests.

Use Case Summary

These use cases are discussed in more detail later when we relate them to the proposed solution.

Most of the use cases revolve around treating modules known to align with the project's security goals differently from the mass of deep dependencies whose behavior we want to bound.

  • Use JS engine hooks to restrict use of eval to carefully reviewed modules.
  • Prevent or warn on shell access and network sends to sensitive backends by unwhitelisted modules.
  • Restrict use of unsafe or error-prone APIs that require specific expertise and care to use correctly. Error messages could steer users toward wrapping APIs that are suitable for general use.

This can also enable safe-by-construction guarantees:

  • Allow some modules to mint values that pass a verifier so that any module can verify they are safe-by-construction regardless of what other modules they pass through. This is especially useful for contract values that encapsulate a security property. E.g. server-side trusted types benefit from being able to limit the ability to create values that pass a verifier to those that have been reviewed by domain experts.

Opaque values make it easier to keep sensitive data private:

  • Allow a layer that extracts sensitive inputs to ferry them to their destination without having to worry about intermediate modules. For example, code for a reset password or credit card processing form might want to ensure that neither shows up in logs. Being able to wrap a value lets us guarantee that the sensitive input is not going to show up in logs or error traces.
  • Similarly for PII like physical locations from client-side Geo APIs.

Solution: Node module loader acts as trusted intermediary to enable mutual suspicion

We can realize these use cases if we have a way for one module to open a reliable channel to another.

Reliable Channel Example

Two modules alice.js and bob.js wish to collaborate in a tightly coupled manner (be less suspicious of requests from one another) while checking inputs from other modules more carefully. They may wish to use carol.js to carry messages, but do not wish to grant read access to carol.js.

One way to allow this would be for Alice to create a box containing the value she wishes to communicate that only Bob can open (confidentiality), and provide Bob a way to know that the box came from Alice (authenticity).

We could do this via asymmetric crypto if all we wanted to pass were strings and Alice and Bob had a way to exchange keys, but JavaScript objects that close over local variables do not serialize well.

The random oracle model explains cryptographic primitives in terms of a stable mapping from inputs to unpredictable strings. JavaScript does not allow forging object references (e.g. casting ints to pointers) or extracting object references from a closure's local scope. These two properties, the uniqueness and privacy of new Objects, let us get the benefits of crypto-style operators for JavaScript objects without serializing by replacing that model with a distinct object reference oracle model.

In frenemies.js we implement a pure-JavaScript analogue of secure channels that can convey arbitrary values, providing Confidentiality and Authenticity, built on top of a JavaScript analogue of private/public key pairs.

It does not provide Integrity or Non-Repudiation. If the boxed value is available via other means, it might be modified between being boxed and being unboxed. For this reason, although a boxer can't repudiate the identity of the boxed value, they could repudiate the value (or absence) of any property reachable via properties that were not frozen/sealed when the object was boxed. To get non-repudiation, we would need to provide an unbox operator that additionally checks that the object was deeply-frozen and sealed before boxing completed.

Availability doesn't have a clear meaning in this context. There is no guarantee that Carol, in the example above, will deliver Alice's box to Bob; an out of memory error, stack overflow, or OS interrupt could prevent any step in the process.

A private key is a per-module function that takes a function. It calls its argument in such a way that a call to the corresponding public key will return its first argument instead of its second.

A public key is a per-module function that returns its first argument (defaults to true) when called in the context of the corresponding private key, or its second argument (defaults to false) otherwise.
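To make those semantics concrete, here is a minimal sketch of how such a per-module key pair could be built. This is not the actual frenemies.js implementation; makeKeyPair is a hypothetical helper.

function makeKeyPair() {
  let unlocked = false;  // true only while the private key is running its argument
  function privateKey(fn) {
    const previous = unlocked;
    unlocked = true;
    try {
      return fn();
    } finally {
      unlocked = previous;
    }
  }
  function publicKey(ifUnlocked = true, otherwise = false) {
    // Returns its first argument only when called in the context of privateKey.
    return unlocked ? ifUnlocked : otherwise;
  }
  return { privateKey, publicKey };
}

In this sketch's terms, unbox can call a box's mayOpen predicate with the would-be opener's public key inside that opener's privateKey, so opener() returns true only for the genuine opener; ifFrom is checked the same way with the boxer's key.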

Example Code

// alice.js
'use strict';
// Alice sends a message to Bob.
// `frenemies` is assumed to be bound per-module by the trusted loader.

const bob = require('./bob');
const carol = require('./carol');

const mayOpen = (opener) => opener === bob.publicKey && opener();
const messageForBob = frenemies.box(
  'Have a nice day, Bob! Sincerely, Alice',
  mayOpen);

exports.send = () => carol.convey(bob, messageForBob);

// bob.js
'use strict';
// Bob gets a message from Alice and verifies that it comes from her.
const alice = require('./alice');

function ifFrom(sender) {
  return sender === alice.publicKey && sender();
}

const messagesReceived = [];

// Carol puts messages in mailboxes.
exports.mailbox = function mailbox(box) {
  const value = frenemies.unbox(
    box, ifFrom, 'a message of questionable provenance!');
  messagesReceived.push(value);
  console.log(`Bob read: ${value}`);
};
exports.messagesReceived = messagesReceived;

// carol.js
'use strict';
// Carol passes messages between Alice and Bob.
// Maybe she's a message bus.  Who knows?!

// Maybe Carol is evil!  Maybe not!  Again, who knows?!
const evil = Math.random() >= 0.5;

exports.convey = function convey(recipient, message) {
  if (evil) {
    console.log('Carol got ' + message);  // OPAQUE.  No leak.
    // No leak.  Denied, since Alice's mayOpen gets called with Carol's
    // public key in the context of Carol's private key, not Bob's.
    console.log(frenemies.unbox(message, (x) => true));
  }
  // Carol delivers Bob's mail.  She may be evil, but she's not a monster!
  recipient.mailbox(message);
  if (evil) {
    recipient.mailbox(
      // Bob will not open it since his ifFrom predicate expects
      // Alice's public key, not Carol's.
      frenemies.box('Have an evil day! Sincerely, Alice', (x) => true));
  }
};

Use Case Solution Sketches

Here we sketch out how we address the use cases above.

Contract Values

Project members may trust some modules to produce HTML:

  • widely-used-html-sanitizer which filters HTML tags and attributes
  • autoescaping-template-engine which plugs untrusted values into a trusted HTML template

based on their extensive experience with these modules, but not others:

  • mysterious-markdown-converter which shows up in the project's dependencies for reasons no one has investigated.

Luckily all these modules produce TrustedHtml values, which wrap a box containing the HTML string.

When it's time to flush a chunk of HTML to an HTTP response buffer, the request handler unboxes the TrustedHtml content if it comes from one of the first two. If it comes from another source, it treats it with more suspicion, perhaps passing it back to the sanitizer, which unboxes it without regard to its source and whitelists tags and attributes.

The template engine is also a consumer -- when rendering, it consults the project policy to decide when to inline a chunk of trusted HTML without escaping.

Other project teams might trust widely-used-html-sanitizer with their own policy, but not trust third-party code to craft its own policies, so they only whitelist a wrapped version of widely-used-html-sanitizer.

This proposal provides a publicKey per module, which gives a basis for defining whitelists, and a mechanism for consumers of values to check those whitelists.

Boxes are tamper-proof in transit, so don't require reorganizing the way data flows through a system as long as the intermediate layers don't insist on coercing values to strings.

// Load a list of module IDs from a configuration file.
const { myWhitelist } = require('./package.json')

function makeWhitelist(allowed) {
  const idSet = new Set(allowed)
  return (publicKey) => (
      frenemies.isPublicKey(publicKey) &&
      publicKey() &&
      // Possible to avoid depending on Set.prototype
      idSet.has(publicKey.moduleIdentifier))
}

Opaque Values

We may trust our framework code to route inputs to the right place when everything is ok, but there's too much code that looks like

function execute(...args) {
  try {
    // ...
  } catch (exc) {
    log('Execution failed given %o: %o', args, exc);
    // ...
  }
}

where args may contain sensitive information:

  • Real names
  • Unsalted passwords
  • Geo locations
  • Credit cards

We need to encourage developers to build systems that they can debug in the field, but we still need to keep sensitive data out of logs that might not be as resistant to attack as our key stores.

Reliably opaque values can help us balance these two needs.

As soon as the request handler extracts sensitive data, it can box it and specify which modules can open it, using the same kind of module whitelist as above.
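A hedged sketch of that pattern, reusing makeWhitelist from above; handleCheckout, processPayment, and the payment-gateway-client module are hypothetical names.

// Only the payment client may open the box; loggers and middleware cannot.
const paymentOnly = makeWhitelist(['payment-gateway-client'])

function handleCheckout(req) {
  const cardNumber = frenemies.box(req.body.cardNumber, paymentOnly)
  // Intermediate layers see only an opaque box; logging `args` in a catch
  // block, as in the execute example above, no longer reveals the number.
  return processPayment({ ...req.body, cardNumber })
}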

Opaque values don't require mutual suspicion, so reasonably reliable opaque values could be had by other means.

Reifying Permissions

"What about eval" explains that quirks of JavaScript make it easier to unintentionally convert a string to code than in other languages, and "Dynamically bounding eval" proposes a way to allow some uses of eval without the blanket ban implied by --disallow_code_generation_from_strings.

We can represent permissions as boxes, and the permission checker can unbox it with an ifFrom parameter that only accepts the permission granter.

To request permissions, the requestor would create a box and give it to the grantor which could use the whitelist scheme above.

These permissions would be delegatable with the usual caveats.

Permissions can also be revocable.

// Granter
const whitelist = makeWhitelist(myWhitelist)  // see makeWhitelist above
function mayI(request) {
  let permission = () => false
  let revoke = () => {}
  if (frenemies.unbox(request, whitelist, false)) {
    let youMay = true;
    permission = () => youMay
    revoke = () => { youMay = false }
  }
  return {
    permission: frenemies.box(permission, () => true),
    revoke
  }
}


// Requester
const granter = require('granter')
const { permission, revoke } = granter.mayI(frenemies.box(true, () => true));
// Do something that invokes checker with permission


// Checker
const granter = require('granter')
function check(permission) {
  if (frenemies.unbox(
          permission, (k) => k === granter.publicKey && k(), () => false)()) {
    // Grant
  } else {
    // Deny
  }
}

Access Restrictions

Some modules are inherently sensitive:

  • fs
  • child_process
  • net
  • user-libraries that wrap the same

These are incredibly powerful, and most large projects use some or all of them to great effect.

They or their counterparts in other languages are also involved in many large-scale exploits.

Most third-party modules do not require all of these.

It would be nice for a project team to be able to bound which code should use those, review that code, and then enforce that going forward. Then if a new dependency needs access, it should fail loudly so that they can incrementally review the new use.

It would also be nice if tricky uses of reflective operators, like the exploitation of the Function constructor for which we advocated permissions above, failed safely.

We have shown above that this proposal provides a basis for whitelisting modules, which lets us define the boundary of what has been reviewed.

Enforcement is less clear. Further research is needed, but I think it should be possible to make this largely transparent to existing node code. Code that runs early in the main file could:

  1. Identify sensitive modules and load them.
  2. Patch require.cache with a getter and/or use require hooks to wrap the sensitive modules in a proxy that closes over the loader's private key, so that each loader has a subjective view via membranes (see the sketch below).
  3. Perform enforcement in the proxy.

These libraries also tend to be higher latency, so this is an appropriate place to pay the cost of proxies.
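A minimal sketch of the proxy in steps 2 and 3, showing only the interception shape. The isCallerAllowed check is a placeholder, and how the proxy is handed out in place of the real module (step 2) is part of the further research noted above.

const childProcess = require('child_process')

// Placeholder policy check; a real check would unbox a permission or
// consult a module whitelist as sketched earlier.
const isCallerAllowed = () => false

// Wraps a sensitive module so that every function call is policy-checked.
function guard(target, name) {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const member = Reflect.get(obj, prop, receiver)
      if (typeof member !== 'function') { return member }
      return function (...args) {
        if (!isCallerAllowed()) {
          throw new Error(`${name}.${String(prop)} denied by project policy`)
        }
        return Reflect.apply(member, obj, args)
      }
    }
  })
}

// Step 2 would arrange for non-whitelisted modules to receive this proxy
// instead of the real child_process.
const guardedChildProcess = guard(childProcess, 'child_process')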

Alternate approaches

Here are alternate approaches that have been suggested to granting different authority to different code.

We're all adults here.

Node.js takes a "we're all adults here" attitude to development. None of us is going to do something stupid like undefined = 1337, so we can all get along without worrying about corner cases that only occur when undefined === 1337.

This argument is raised in two different ways:

  • We're all adults so we can provide third-party library authors with all the authority they could need and trust them to manage risk on their clients' behalf.
  • We're all adults so, if we do need to implement a security policy in code, we can use half-measures since adults don't try to work around security policies.

This attitude is great in moderation, but we're not all adults -- we're a large collection of adults who can't possibly all talk to each other because that's an O(n²) problem.

Large groups don't have the common context that "we're all adults here" implies.

There are several kinds of context that affect the security of an end product.

  • Deep dependency authors don't understand the specific security needs of that end product.
  • Deep dependency authors are often domain experts in the problem their library solves, but often are not experts in how the powerful features they use can be turned against the end product.
  • End product developers do not understand how the deep dependency author solves the problem.

If a third-party developer has a choice between using a powerful feature to definitely solve a problem for one client who is requesting it, and maybe keeping risk low for other clients who are not present to ask them not to, they will likely use the feature. Ones who consistently don't will not gain users.

It is unreasonable to expect third-party developers to approximate POLA for a specific end product.

We need to enable a small group of security specialists to guarantee security properties while third-party developers focus on bringing their strengths to bear.

As to whether we can trust adults not to work around half-measures, recall that in the goals above we wanted to

make sure the path of least resistance for a deep dependency is to degrade gracefully

A library author wants to provide a high level of service. If they can by peeking inside a box and don't clearly see how this increases the risk to an end product, then they are likely to peek. If they can't peek inside, the path of least resistance is to degrade gracefully.

We're all adults here; sometimes adults with deadlines and not enough coffee. Strong fences make good neighbours: let security specialists manage the product's security, and guide third-party developers towards graceful degradation and away from hacks that work around policies.

Unit testing

Why not write unit tests that check your code doesn't do things with untrusted inputs that it shouldn't?

Unit test suites can give confidence that a system does what it should, but do a poor job checking that it doesn't do what it shouldn't. Mechanisms that limit or contain the damage that code can do when exposed to malicious inputs can help large development teams maintain velocity while giving confidence in the latter.

Turn off unneeded functionality

We talk about powerful modules like fs and sh and powerful operators like eval, and note that few modules need any one of these but most projects do. If this were not the case, then shell injection and leaks of sensitive files would not be as frequent as they are.

Project teams benefit from having them when they need them if we can limit the downside.

Examine all third-party dependencies

Some argue that developers should not depend on any code that they haven't read, or that they wouldn't be happy writing.

Developers do use code that they haven't read.

We don't advocate reading all your dependencies because that sounds super boring and would probably become your full-time job.

Write your own instead of using third-party code

This may be a fine approach for something project-critical where you have domain expertise on the project team, but it doesn't scale.

If a large project tried this, it would have to grow large enough that, with enough internal pressure to provide reusable components, it would recreate the first-party/third-party disconnect in-house.

Load modules in separate contexts

Some have proposed loading each module in its own realm, with its own version of globals.

It may be a fine idea, but probably breaks modules that assume Object.prototype and other builtins are identical across modules.

It also does not, by itself, address these use cases, though it might if there were a way to prevent some modules from require-ing specific others.

It could be complementary with this proposal.

Failure modes

Implicit in "grant ... to module" is the idea that a module is a principal. This brings along the usual risks when authorizing based on principals:

Impersonation - one module masquerading as another. For example, the vm module uses the filename option to load code in a way that makes it obvious in stack traces where code originated. Modules that use stack introspection (e.g. node-sec-patterns) to authenticate a caller may mistakenly authorize code run in a context created with { filename: './high-authority-module.js' }.
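To illustrate the risk, here is a hedged sketch; doSomethingPrivileged and its stack-based check are hypothetical stand-ins for a sensitive function that authenticates callers by stack introspection.

const vm = require('vm')

// A stand-in for a sensitive function that authorizes callers by looking
// for a whitelisted filename in the stack.
global.doSomethingPrivileged = () => {
  if (new Error().stack.includes('high-authority-module.js')) {
    console.log('authorized')  // spoofed: the caller merely claimed the name
  }
}

// The filename option attributes the evaluated code to any file we like,
// so its stack frames appear to come from the high-authority module.
vm.runInThisContext("doSomethingPrivileged()",
                    { filename: './high-authority-module.js' })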

Another way a module might try to prove its identity to a sensitive function that it calls is to pass in a value that only it knows. This is susceptible to replay attacks -- once passed, the receiver knows the value, and can save the value so that it can later use the privileges of the caller.

The attached code should not be susceptible to impersonation as long as a module does not leak its privateKey, box, or unbox functions.

Key replacement - a module might override another module's public key.

require('foo').publicKey = frenemies.publicKey

We attempt to make the public key read-only after a module has initialized, but common idioms like the two below mean we can't prevent this entirely.

// index.js
module.exports = require('./lib/foo')

// another index.js
Object.assign(exports, require('./lib/foo'), require('./lib/bar')) 

This does not affect confidentiality or integrity, but does affect availability. If Dave can replace the public key that Alice receives via require('./bob').publicKey then Alice's mayOpen predicate will deny Dave access. If Dave substitutes his public key for Bob's, he would still need access to Bob's private key or unbox operator to open boxes meant for Bob.

Attacking the policy - if we store grant decisions in a configuration file like package.json, a less-privileged module could (temporarily or not) edit that file to grant itself more privileges.

Attack of the clones - since CommonJS modules typically load from the file system, a lower privileged module could use the fs builtin to prepend script it wishes to run to the main file of a more highly privileged module.

We do not attempt to solve these last two problems. Existing techniques like denying the node runtime write access to source and configuration files and doing resource integrity checks on load should suffice.

Attacking objects - Node.js does not just run JS. C++ addons may be able to violate the assumptions that underlie our random oracle replacement, as may programmatic access to out-of-band debuggers (1 2). Deserialization APIs have also allowed object forgery in similar systems.

We do not attempt to solve these problems either. "If you can't trust native code who can you trust" is not an ideal security posture, but project teams should already be careful about which C++ addons they load in production, and a feature like this might allow bounding access to out-of-band APIs (like debug hooks and deserialization APIs) which would be a better security situation than having no such bounds.
