@jaanli
Last active April 13, 2024 11:48
Research patterns for finding information online

Information is often locked behind paywalls or behind artificial intelligence interfaces that require payment, yet advancing basic science and biomedical research often requires going to the source.

Here are simple workflows tested over the years for accessing information:

graph TB
A[Can I access plain text content that contains the information needed to satisfy my intent?] --> B(<a href='https://web.archive.org'>Wayback Machine</a>)
A --> C(<a href='https://archive.is'>archive.is</a>)
A --> D(<a href='https://libgen.rs'>Library Genesis</a>)
A --> E[<a href='https://bing.com'>Bing Cached Pages</a><br/>Search on Bing, click arrow by result, click 'Cached']
A --> F(<a href='https://sci-hub.se'>Sci-Hub</a><br/>Requires DOI, search at <a href='https://search.crossref.org/'>search.crossref.org</a>)
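The archive and DOI-lookup branches of the workflow above can be sketched as a few URL builders. This is a minimal sketch, not a definitive tool: the function names are hypothetical, and the URL patterns (`web.archive.org/web/<url>`, `archive.is/newest/<url>`, and the `q` query parameter on search.crossref.org) are assumptions about how those public sites currently accept lookups and may change. The code only builds links to open in a browser; it makes no network requests.

```python
from urllib.parse import urlencode


def wayback_url(page_url: str) -> str:
    # Assumed pattern: with no timestamp in the path, the Wayback Machine
    # redirects to a capture of the page if it holds one.
    return "https://web.archive.org/web/" + page_url


def archive_is_url(page_url: str) -> str:
    # Assumed pattern: "/newest/" asks archive.is for its latest snapshot.
    return "https://archive.is/newest/" + page_url


def crossref_search_url(title: str) -> str:
    # Assumed pattern: search.crossref.org takes the search text in "q".
    # Use the result page to find the DOI of a paper.
    return "https://search.crossref.org/?" + urlencode({"q": title})


def candidate_urls(page_url: str, title: str = "") -> list[str]:
    # Collect the links to try, in the order the diagram lists them.
    urls = [wayback_url(page_url), archive_is_url(page_url)]
    if title:
        urls.append(crossref_search_url(title))
    return urls


if __name__ == "__main__":
    for url in candidate_urls("https://example.com/article", "deep learning"):
        print(url)
```

Open each printed link by hand and stop at the first one that returns readable text.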

Use at your own risk, and be aware of the law in your jurisdiction around piracy, copyrighted materials, and restrictions on re-using data to train AI models for downstream commercial or non-profit applications. The licensing of training data for AI is an active area of computer science and legal scholarship.
