@blahah
Last active January 27, 2021 16:12
Reply to Wiley's claim that they "do not create fake DOIs"
NOTE: this is an archive of my response to Tom Griffin of Wiley's email to the Liblicense listserv
claiming that Wiley do not create fake DOIs.
My response is below, in case it does not pass moderation at the original list.

edit: My email response has appeared on the listserv: http://listserv.crl.edu/wa.exe?A2=LIBLICENSE-L;5e950629.1606

Hi Tom,

I am the person who posted the original document about fake DOIs (https://docs.google.com/document/d/1uTVHPI8r4VO31KihsyiBHsh_gp8jZ38fMvP5nP5XOkw/).

Every single claim you make above is demonstrably false.

You said:

Wiley does not use fake DOIs.

Wiley does have fake DOIs on its website. I have attached a screenshot, and I refer you to this blog post: https://go-to-hellman.blogspot.co.ke/2016/06/wileys-fake-journal-of-constructive.html. To claim otherwise is to creatively redefine 'fake DOI'. What you're doing is polluting the web with things that are designed to match the pattern of real, Crossref-registered DOIs, but which are in fact designed to trigger access restriction.

In addition to creating fake DOIs, the blog post linked shows that Wiley is creating fake articles attributed to real institutions.

We strongly support the DOI system and were a founding member of CrossRef

Geoff Bilder at Crossref has explicitly said (here: https://twitter.com/gbilder/status/736979917720702977) that Crossref discourages the behaviour you have exhibited. So whether you claim to support them or not, they don't support what you are doing.

What was contained in the document were URLs

The URLs in my original post were constructed by me using the pattern I observed on Wiley's site to reach a PDF directly from a Wiley DOI. That is, "http://onlinelibrary.wiley.com/doi/" + DOI + "/pdf". I found those DOIs in the wild and tried to resolve them via Wiley's website, as explained in the first link.
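The construction described above can be sketched in a few lines of Python. This is only an illustration of the string pattern quoted in the paragraph; the function name and the example DOI are hypothetical placeholders, and the snippet makes no network request.

```python
# Base of the URL pattern quoted above, as observed on Wiley's site.
WILEY_DOI_BASE = "http://onlinelibrary.wiley.com/doi/"


def wiley_pdf_url(doi: str) -> str:
    """Build the direct-PDF URL for a DOI by plain concatenation.

    Hypothetical helper: it mirrors the pattern base + DOI + "/pdf"
    and does nothing else (no validation, no escaping, no request).
    """
    return WILEY_DOI_BASE + doi + "/pdf"


# "10.1111/example.12345" is a made-up DOI used purely for illustration.
print(wiley_pdf_url("10.1111/example.12345"))
```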

These URLs are not discoverable online, they cannot be indexed

Here's one being indexed on Google: https://www.google.com/search?safe=off&q=Constructive+Metaphysics+in+Theories+of+Continental+Drift&cad=h.

... and will not have an impact on the DOI system

The entire problem is that they are having an impact. If academics in the course of their work find an apparent Wiley DOI and try to visit the corresponding page, then find their institution blocked, that is a very serious and damaging impact.

Only individual IP addresses were affected, no institutions were banned from accessing Wiley content

After I initially visited the URLs corresponding to the fake DOIs in the first link, several departments and a separate institute at Cambridge were blocked from visiting Wiley, including open access titles. The block lasted at least a week.

All access has now been restored and clicking on those links will no longer disable access.

Here's a thread on Twitter demonstrating this to be false: https://twitter.com/CT_Bergstrom/status/745860745611591680

The URLs are a security measure visible only to Wiley and our customers’ security officers, and we do not know how they came to be known more widely.

See my first post and the Google link above. They came to be known more widely because I posted about them after you blocked my institution.

I see only two possible explanations for your email: deliberate misinformation or complete technical incompetence. Either way, I hope the library community and the broader academic community will continue to hold Wiley accountable for their harmful behaviour.

Richard Smith-Unna

Mozilla Fellow for Science, University of Cambridge

@blahah

blahah commented Jun 23, 2016

Bizarrely, in the case above, resolving one fake DOI via Wiley's site led to another fake DOI being presented!

@steltenpower

What would happen if we got EVERYBODY to 'click the bait'? :-)

@eshellman

eshellman commented Jun 23, 2016

All the trap URLs lead to the same page. That's why Google's ranking likes it.
The trap URLs might be nonces that can be used to trace back to the user session. If so, it might not be useful for everyone to click the bait (i.e. there could be billions of fake URLs to click).
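To make the nonce idea concrete, here is a minimal sketch of how per-session trap URLs could work in principle. Everything in it is an assumption for illustration: the class name, the `publisher.example` base URL, and the in-memory lookup are hypothetical, and nothing here is known to reflect how Wiley's system actually behaves.

```python
import secrets


class TrapUrlRegistry:
    """Hypothetical sketch: issue one unique trap URL per session."""

    def __init__(self, base_url):
        self.base_url = base_url
        self._nonce_to_session = {}  # nonce -> session id

    def issue(self, session_id):
        # A fresh random nonce per call, so every session sees a different URL.
        nonce = secrets.token_hex(8)
        self._nonce_to_session[nonce] = session_id
        return f"{self.base_url}/{nonce}/pdf"

    def trace(self, url):
        # The nonce is the path segment just before the trailing "pdf".
        parts = url.rstrip("/").split("/")
        nonce = parts[-2] if len(parts) >= 2 else ""
        return self._nonce_to_session.get(nonce)


registry = TrapUrlRegistry("https://publisher.example/doi/10.0000")
trap = registry.issue("session-a")
print(registry.trace(trap))
```

Under this assumption, a click only identifies the session the URL was issued to, which is why mass clicking of already-published trap URLs would not exhaust the supply.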

@blahah

blahah commented Jun 23, 2016

@eshellman interesting points!

I have archived wget results for every trap URL - I'll check them tomorrow and verify the forwarding thing.

Many people have clicked the links and been banned - some people have only a session-based ban, while others are IP banned. Still a good point that new trap URLs could be generated for sessions they want to trace.

@eshellman

The trap URL nonces might be used to track downloads by a "bot". The "punishment" might depend on the download history that the nonce is tracking. This is just speculation, and it is not consistent with the lack of sophistication otherwise displayed by the traps.

@eshellman

@blahah what mechanism was used to hide the trap URLs?

@hvwaldow

@eshellman "the trap url nonces might be used to track downloads by a 'bot'."

Wiley denies that ("We also do not use these URLs to trap crawlers on Wiley Online Library.").

From the wording in that mail ("The URLs are a security measure") and their response to customers who click such a link, I suspect that they are trying to catch mass downloads through compromised institutional accounts (or just by ToS-violating customers). The former aim is a good thing and could make the world a better place. For sysadmins, that is.

Everything we know about what they actually do is completely unrelated to this aim and might easily produce antagonistic results.

Is it conceivable that a billion-dollar company in the business of information handling is that incompetent?

@peterjc

peterjc commented Jun 24, 2016

Good to see the email made it to the moderated Liblicense list's archive:
http://listserv.crl.edu/wa.exe?A2=ind1606&L=LIBLICENSE-L&F=&S=&P=56413
