@joepie91
Last active June 25, 2023
A few notes on the "Gathering weak npm credentials" article

Yesterday, an article was released that describes how one person could obtain access to enough packages on npm to affect 52% of the package installations in the Node.js ecosystem. Unfortunately, this has brought about some comments from readers that completely miss the mark, and that draw away attention from the real issue behind all this.

To be very clear: This (security) issue was caused by 1) poor password management on the side of developers, 2) handing out unnecessary publish access to packages, and most of all 3) poor security on the side of the npm registry.

With that being said, let's address some of the common claims. This is going to be slightly ranty, because to be honest I'm rather disappointed that otherwise competent infosec people distract from the underlying causes like this. All that's going to do is prevent this from getting fixed in other language package registries, which almost certainly suffer from the same issues.

"This is what you get when you use small dependencies, because there are such long dependency chains"

This is very unlikely to be a relevant factor here. Don't forget that a key part of the problem here is that publisher access is handed out unnecessarily; if the Node.js ecosystem were to consist of a few large dependencies (that everybody used) instead of many small ones (that are only used by those who actually need the entire dependency), you'd just end up with each large dependency being responsible for a larger part of the 52%.

There's a potential point of discussion in that a modular ecosystem means that more different groups of people are involved in the implementation of a given dependency, and that this could provide for a larger (human) attack surface; however, this is a completely unexplored argument for which no data currently exists, and this particular article does not provide sufficient evidence to show it to be true.

Perhaps not surprisingly, the "it's because of small dependencies" argument seems to come primarily from people who don't fully understand the Node.js dependency model and make a lot of (incorrect) assumptions about its consequences, and who appear to take every opportunity to blame things on "small dependencies" regardless of technical accuracy.

In short: No, this is not because of small dependencies. It would very likely happen with large dependencies as well.

"See, that's why you should always lock your dependency versions. This is why semantic versioning is bad."

Aside from semantic versioning being a practice that's separate from automatically updating based on a semver range, preventing automatic updates isn't going to prevent this issue either. The problem here is with publish access to the modules, which is a completely separate concern from "how the obtained access is misused".
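To illustrate the distinction, here's a minimal sketch using the semver package (which implements the same range semantics npm uses): a caret range happily matches a newly published patch release, which is exactly how a malicious release published with stolen credentials would propagate on the next install.

// A caret range accepts any new minor/patch release, so a freshly
// published (possibly backdoored) 1.2.4 is picked up automatically
// on the next unpinned install.
import * as semver from "semver";

const range = "^1.2.3"; // a typical package.json dependency range

console.log(semver.satisfies("1.2.3", range)); // true  - the release you audited
console.log(semver.satisfies("1.2.4", range)); // true  - a brand-new patch release
console.log(semver.satisfies("2.0.0", range)); // false - new majors are excluded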

In practice, most people who "lock dependency versions" seem to follow a practice of "automatically merge any update that doesn't break tests" - which really is no different from just letting semver ranges do their thing. Even if you do audit updates before you apply them (and let's be realistic, how many people actually do this for every update?), it would be trivial to subtly backdoor most of the affected packages due to their often aging and messy codebase, where one more bit of strange code doesn't really stand out.

The chances of locked dependencies preventing exploitation are close to zero. Even if you do audit your updates, it's relatively trivial for a competent developer to sneak by a backdoor. At the same time, "people not applying updates" is a far bigger security issue than audit-less dependency locking will solve.

All of this applies to "vendoring in dependencies", too - vendoring a dependency is not technically different from pinning a version/hash of a dependency.

In short: No, dependency locking will not prevent exploitation through this vector. Unless you have a strict auditing process (which you should, but many do not), you should not lock dependency versions.

"That's why you should be able to add a hash to your package.json, so that it verifies the integrity of the dependency."

This solves a completely different and almost unimportant problem. The only thing a package hash does is ensure that everybody who installs the dependencies gets the exact same dependencies (for a locked set of versions). However, the npm registry already does that - it prevents republishing different code under an already-used version number, and even with publisher access you cannot bypass that.

Package hashes also give you absolutely zero assurances about future updates; package hashes are not signatures.
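For illustration, here is a minimal sketch of what such a hash check amounts to, assuming a hypothetical sha256_hash field in package.json (this is not an actual npm feature):

// Confirms that a *known* tarball is byte-for-byte identical to what
// was pinned. A new release under a new version number never passes
// through this check at all.
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

function verifyArtifact(tarballPath: string, expectedSha256: string): void {
  const actual = createHash("sha256")
    .update(readFileSync(tarballPath))
    .digest("hex");
  if (actual !== expectedSha256) {
    throw new Error(`integrity failure: expected ${expectedSha256}, got ${actual}`);
  }
  // A matching hash proves nothing about who published the artifact,
  // and nothing about future versions - it is not a signature.
}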

In short: This just doesn't even have anything to do with the credentials issue. It's totally unrelated.

"See? This is why Node.js is bad."

Unfortunately, plenty of people are conveniently using this article as an excuse to complain about Node.js (because that's apparently the hip thing to do?), without bothering to understand what happened. Very simply put: this issue is not in any way specific to Node.js. The issue here is an issue of developers with poor password policies and poor registry access controls. It just so happens that the research was done on npm.

As far as I am aware, this kind of research has not been carried out for any other language package registries - but many other registries appear to be similarly poorly monitored and secured, and are very likely to be subject to the exact same attack.

If you're using this as an excuse to complain about Node.js, without bothering to understand the issue well enough to realize that it's a language-independent issue, then perhaps you should reconsider exactly how well-informed your point of view of Node.js (or other tools, for that matter) really is. Instead, you should take this as a lesson and prevent this from happening in other language ecosystems.

In short: This has absolutely nothing to do with Node.js specifically. That's just where the research happens to be done. Take the advice and start looking at other language package registries, to ensure they are not vulnerable to this either.

So then how should I fix this?

  1. Demand from npm Inc. that they prioritize implementing 2FA immediately, actively monitor for incidents like this, and generally implement all the mitigations suggested in the article. It's really not reasonable how poorly monitored or secured the registry is, especially given that it's operated by a commercial organization, and it's been around for a long time.
  2. If you have an npm account, follow the instructions here.
  3. Carry out or encourage the same kind of research on the package registry for your favorite language. It's very likely that other package registries are similarly insecure and poorly monitored.

Unfortunately, as a mere consumer of packages, there's nothing you can do about this other than demanding that npm Inc. gets their registry security in order. This is fundamentally an infrastructure problem.

@siepkes commented Jun 22, 2017

@joepie91 The following is a follow-up on the short conversation we had on Twitter, where I pointed out a plugin for Gradle (mainly used with Java) which allows you to specify hashes in your build scripts ( https://github.com/WhisperSystems/gradle-witness ).

So let me first say that I'm not really familiar with NPM (haven't used it that much), so I'm more talking about a system in general which allows you to specify the hashes of the artifacts (dependencies) you are using in your project. In the case of NPM, I'm assuming that would be the ability to specify the hashes of the artifacts of your dependencies (including the transitive ones, i.e. the dependencies of your dependencies) in your package.json.

For example:

{ "dependencies" :
  { "foobar" : 
    { "version" : "1.0.0",
      "sha256_hash" : "362ab8daaec0823cbe52a395107193622e2a80ab6c8333b7e49fe3e2455affd8"
    }
  }
}

That will ensure that the package can never be tampered with. If someone were to hack the NPM repo and replace the artifact with some backdoored version, the hash would simply not match up and your build would fail. The trick is that the hash is kept in your project repository, so the attacker can't change it. That's the security feature. If the hash were specified in the dependency itself, yeah, that would be quite useless...

Now a signature is a totally different solution. That would require the publisher of the package to sign his/her package and publish the signature (perhaps inside the package, perhaps separately). The big advantage is that you only have to trust the publisher's key once to get verification of all artifacts and all future artifacts (provided that the private key of the publisher isn't compromised at some point, obviously). However, this is a far more complex system to set up, with trust and such.
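As a rough sketch of what the verification side of that could look like (assuming Ed25519 keys and Node's built-in crypto module; the function and file layout here are made up for illustration):

// A minimal sketch of signature verification, assuming the publisher's
// Ed25519 public key was obtained and trusted once, out of band.
// Unlike a pinned hash, the same key also verifies future releases.
import { createPublicKey, verify } from "node:crypto";
import { readFileSync } from "node:fs";

function verifyRelease(tarballPath: string, signaturePath: string, publicKeyPem: string): boolean {
  const publicKey = createPublicKey(publicKeyPem);
  const artifact = readFileSync(tarballPath);
  const signature = readFileSync(signaturePath);
  // For Ed25519 the algorithm argument must be null; the curve is
  // implied by the key itself.
  return verify(null, artifact, publicKey, signature);
}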

So the ability of specifying hashes in your own project repo actually does fix the problem.

@joepie91 (author)

@siepkes The issue is that the problem that hashes solve isn't the problem that's being laid out in the article. Lockfiles in NPM 5 actually already include hashes, but that doesn't change this scenario.

  • The problem that hashes solve: Somebody with administrative access to the registry might replace an existing release with a backdoored version.
  • The problem that the article is about: Somebody with regular publisher access might publish a new release containing a backdoor.

The problem was never "what if somebody with administrative access is evil", so "replacing existing releases" is not a concern, as a regular user (whether a maintainer of a package or otherwise) is not allowed to do this anyway. You'd need to actually compromise the registry to be able to do that, but that's not what this article is about.

The problem here is that somebody with legitimate publisher access to a package could publish a new malicious release. This is not prevented by a hash in the lockfile, because the new release is necessarily a different version than the one that everybody has installed. Nothing in the registry changes, a new release is just added.
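To make that concrete, here's an illustrative npm 5 lockfile entry (shape abbreviated, hash replaced by a placeholder):

// Illustration only: an npm 5+ package-lock.json pins one concrete
// version together with its integrity hash.
const lockfileEntry = {
  foobar: {
    version: "1.0.0",
    resolved: "https://registry.npmjs.org/foobar/-/foobar-1.0.0.tgz",
    integrity: "sha512-<placeholder>",
  },
};
// A compromised publisher never touches foobar@1.0.0, so this entry
// keeps verifying just fine; they publish foobar@1.0.1 instead, and
// the attack lands the moment you update to the new version.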

@siepkes commented Jun 22, 2017

@joepie91 So we are talking here about what happens if the original author is / becomes evil (or his/her account is hacked) and releases a new evil version? That's an easy fix: don't use version ranges. Like I said, I don't have much experience with NPM, but it seems quite common to specify a range in the NPM world instead of a specific version. In Java with Gradle and Maven this is theoretically possible, but no one specifies ranges. Everyone always specifies a specific version. That's why a thing like the Gradle Witness plugin works, and it also solves the problem you refer to.

I also don't see how 2-factor authentication is going to solve the problem of the author going rogue (evil) once and for all. 2FA basically only protects more against the author's account getting compromised, but it doesn't help you if it happens anyway (people also get hacked with 2FA). And you only need to compromise a couple of well-spread libraries for a good effect. Basically the only thing that will protect you in that scenario is specific versions of dependencies with hashes.

Besides the obvious security implications of not specifying an exact version of your dependencies, your build is also not reproducible. Meaning if I check out a specific version from version control (a tag) and build it, and then do the same thing a month later, I get a different application. For that reason alone it would be unacceptable for me to use ranges.

@joepie91 (author)

@siepkes I feel like you didn't fully read or understand the post. I'm explicitly addressing this in the post already:

In practice, most people who "lock dependency versions" seem to follow a practice of "automatically merge any update that doesn't break tests" - which really is no different from just letting semver ranges do their thing. Even if you do audit updates before you apply them (and let's be realistic, how many people actually do this for every update?), it would be trivial to subtly backdoor most of the affected packages due to their often aging and messy codebase, where one more bit of strange code doesn't really stand out.

The chances of locked dependencies preventing exploitation are close to zero. Even if you do audit your updates, it's relatively trivial for a competent developer to sneak by a backdoor. At the same time, "people not applying updates" is a far bigger security issue than audit-less dependency locking will solve.

And no, I'm not talking about "the original author going rogue" at all. I'm talking about the original author's account being compromised by a third party, which 2FA does prevent in most cases. "People can get hacked with 2FA too" is not an argument against that either; the goal of security measures is to reduce the chance of compromise or, in most cases, make it more expensive. 2FA absolutely does that, regardless of whether it works perfectly all of the time - and in many cases, it will make compromising an account too expensive to bother with.

As for builds not being reproducible: you're free to use a lockfile if this is what you desire, and you're willing to invest the effort in constantly keeping your dependencies up to date manually. But this is a very poor default; most people do not have the time or effort to invest in that, and therefore will simply never update their dependencies, leaving them vulnerable to security issues that are unpatched in their version of the dependencies. And given that with semantic versioning it's very rare for dependencies to break unannounced, it's certainly preferable for most projects to have a non-reproducible but automatically updated build.

There's a big problem in your reasoning, in that it only looks at things as technical absolutes. You consider something to either work perfectly, or not at all. Something either produces a reproducible build, or it does not. In reality, there are many more 'soft' factors involved in developing robust software - making attacks expensive, compensating for human mistakes and bias, encouraging the right behaviour, and so on. If you overlook those factors - which you seem to be doing in this discussion - then you will arrive at ineffective solutions, that look great on paper but totally fail to protect people in the real world.

Specifying explicit versions makes for a nice and elegant-sounding technical rationale, but in the real world it just leads to insecure software. The same applies to e.g. vendoring dependencies. 2FA is not perfect, but considerably cuts down on the number of account compromises, and makes them easier to detect while they're happening. And so on, and so forth.

@siepkes commented Jun 27, 2017

@joepie91

I feel like you didn't fully read or understand the post.

No, we just have different opinions on the matter, and the piece you are quoting was just not enough to convince me that pinning versions and keeping checksums is a bad idea.

it's relatively trivial for a competent developer to sneak by a backdoor. At the same time, "people not applying updates" is a far bigger security issue than audit-less dependency locking will solve.

That's a totally different threat. But when you have actually audited, validated, or even just glanced at the library, you need it to stay in place. And for that you need locked dependencies.

The chances of locked dependencies preventing exploitation are close to zero.

Zero? Once the author's account is taken over, having locked dependencies in place with their hashes specified will prevent the malicious actor from inserting malicious code into your applications. That's definitely a feature I call desirable. And sure, 2FA helps in preventing account takeover, but I like a safety net for when the account does get taken over. At least the guys from Open Whisper Systems seem to agree with me, since they created Gradle Witness for the Android Signal app.

And no, I'm not talking about "the original author going rogue" at all. I'm talking about the original author's account being compromised by a third party, which 2FA does prevent in most cases.

I'm not saying anywhere that having 2FA is a bad idea. I think NPM should definitely implement it! And like you say: it prevents it in most cases. Even if it only covers, say, 95% of cases. And in any case, going rogue or having an account taken over has almost the same outcome.

and you're willing to invest the effort in constantly keeping your dependencies up to date manually.

I think you're projecting here. Your reference point seems to be Node / NPM. Like I said, in the Java world locking your dependencies is actually the norm. So it's a matter of perspective on what "normal" is.

But this is a very poor default; most people do not have the time or effort to invest in that, and therefore will simply never update their dependencies, leaving them vulnerable to security issues that are unpatched in their version of the dependencies.

Even with semantic versioning, most libraries don't keep providing bug fixes and patches for all the minor versions they ever released. So while it buys you some time, in the end not maintaining your software will lead to the same thing.

There's a big problem in your reasoning, in that it only looks at things as technical absolutes.

Well, technical stuff like standards is usually quite absolute. Can you do just half of semantic versioning?

You consider something to either work perfectly, or not at all.

I never said anything of the sort. That's just putting words in my mouth, so I'll ignore this.

Something either produces a reproducible build, or it does not.

Well, yeah, something has a reproducible build or it does not. That's quite absolute. And yes, using version ranges leads to an unreproducible build. If that isn't a problem for you and you're happy with it: good for you. Personally I like my builds reproducible, because they make troubleshooting easier and provide "a way back" when I need to roll back because of instability.

In reality, there are many more 'soft' factors involved in developing robust software - making attacks expensive, compensating for human mistakes and bias, encouraging the right behaviour, and so on.

Could you please point out to me where exactly I state that the one and only factor in developing good software is having a reproducible build? The only claim I'm making is that having a reproducible build is, for me (and lots of other people), a very desirable trait.

If you overlook those factors - which you seem to be doing in this discussion - then you will arrive at ineffective solutions, that look great on paper but totally fail to protect people in the real world.

I'm only talking about how pinned versions with hashes lead to reproducible builds, and why those are a desirable trait in my opinion, since you stated that they have no value and I think they can actually be of value in the threat model discussed. So let's debate one topic at a time, shall we?

@chrisdlangton

@joepie91 I am a huge advocate for both signatures and digests; what npm is doing with sha256_hash is actually called a checksum digest in information security terminology.

@siepkes I can only agree with the point you make regarding the checksum digest solving integrity issues when an NPM administrator maliciously alters a package on a server (not using the NPM platform). The checksum digest is useless in solving a malicious publisher releasing via the NPM platform itself.

While signatures establish a trust relationship, holder-of-key methodology actually enables the propagation of signed malware as trusted code; it does nothing to mitigate the use of trusted code that has been made malicious when the private key has been leveraged or exposed. And by "leveraged" I do not mean that the bad actor actually has the private key; I refer specifically to leveraging a build system that has access to the private key to sign the altered code, producing trusted signed malware in a scenario where the bad actor never obtained the private key.

Bottom line, I agree with @joepie91 that perhaps there is a lot of misunderstanding from @siepkes on the details of hashes and signatures and what they do and do not mitigate.
