Jan Čurn jancurn
@jancurn
jancurn / proxy-chain-example.js
Last active February 29, 2024 07:26
Example showing how to use the proxy-chain NPM package to let headless Chrome use a proxy server with username and password
const puppeteer = require('puppeteer');
const proxyChain = require('proxy-chain');

(async() => {
    const oldProxyUrl = 'http://bob:password123@proxy.example.com:8000';
    const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);

    // Prints something like "http://127.0.0.1:45678"
    console.log(newProxyUrl);
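
The preview above is cut off, so the rest of the gist is not reproduced here. As a rough illustration of why a credential-free local URL is useful, the sketch below splits a proxy URL into the parts a forwarding proxy would need: the upstream address without credentials and a Basic `Proxy-Authorization` header value. The helper name `splitProxyCredentials` is hypothetical and is not part of proxy-chain; how the package works internally is not shown in the source.

```javascript
// Hypothetical helper (not a proxy-chain API): separates the credentials
// embedded in a proxy URL from the upstream address.
function splitProxyCredentials(proxyUrl) {
    const url = new URL(proxyUrl);

    // URL components are percent-encoded, so decode them before use.
    const username = decodeURIComponent(url.username);
    const password = decodeURIComponent(url.password);

    // HTTP Basic credentials are "username:password" encoded as base64.
    const token = Buffer.from(`${username}:${password}`).toString('base64');

    return {
        // url.origin contains scheme, host and port, but no credentials.
        upstream: url.origin,
        proxyAuthorization: `Basic ${token}`,
    };
}

const parts = splitProxyCredentials('http://bob:password123@proxy.example.com:8000');
console.log(parts.upstream);
console.log(parts.proxyAuthorization);
```

A local forwarding proxy could attach the header to every request it relays to the upstream address, so the browser itself never has to handle the username and password.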
jancurn / Dockerfile
Last active February 15, 2024 05:38
Example of an Apify actor stored in a GitHub Gist.
# Here you choose the base Docker image for the actor. Apify provides the following images:
# apify/actor-node-basic
# apify/actor-node-chrome
# apify/actor-node-puppeteer
# However, you can use any other image from Docker Hub.
# For more information, see https://apify.com/docs/actor#base-images
FROM apify/actor-node-basic
# Copy all files and directories from the build context directory to the Docker image
COPY . ./
jancurn / puppeteer-proxy-page-authenticate.js
Last active January 11, 2023 17:31
Puppeteer's page.authenticate() does not work for proxy authorization!
const puppeteer = require('puppeteer');

(async() => {
    const proxyUrl = 'http://proxy.example.com:8000';
    const username = 'bob';
    const password = 'password123';

    const browser = await puppeteer.launch({
        args: [`--proxy-server=${proxyUrl}`],
        headless: false,
jancurn / apify_proxy_tunnel.js
Last active February 16, 2021 17:28
This example demonstrates how to create a tunnel via Apify's HTTP proxy service. For details, see https://blog.apify.com/tunneling-arbitrary-protocols-over-http-proxy-with-static-ip-address-b3a2222191ff
const { createTunnel, closeTunnel, redactUrl } = require('proxy-chain');

(async () => {
    // Select the proxy to tunnel through. Note that some proxies do not allow
    // HTTP traffic (port 80) over the HTTP CONNECT tunnel, or might not allow connection
    // to target on any other port than 80 (HTTP) or 443 (HTTPS).
    // You might want to try different proxy groups.
    const PROXY_URL = 'http://auto:<PROXY_PASSWORD>@proxy.apify.com:8000';

    // Target server to connect to. Here we use www.example.com and port 443 for HTTPS.
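
The preview stops before the tunnel is created, and the remaining lines are not filled in here. To illustrate what an HTTP CONNECT tunnel looks like on the wire, the sketch below builds the request a client sends to a proxy before raw bytes start flowing to the target. The function `buildConnectRequest` and the header value passed to it are assumptions made for illustration; they are not part of proxy-chain or of Apify's proxy service.

```javascript
// Hypothetical helper: builds the raw HTTP CONNECT request that asks a proxy
// to open a TCP tunnel to targetHost:targetPort.
function buildConnectRequest(targetHost, targetPort, proxyAuthorization) {
    const authority = `${targetHost}:${targetPort}`;
    const lines = [
        `CONNECT ${authority} HTTP/1.1`,
        `Host: ${authority}`,
    ];

    // Credentials are optional; when present they go in Proxy-Authorization.
    if (proxyAuthorization) {
        lines.push(`Proxy-Authorization: ${proxyAuthorization}`);
    }

    // An HTTP header section ends with an empty line.
    return `${lines.join('\r\n')}\r\n\r\n`;
}

// 'Basic <token>' is a placeholder, not a real credential.
const connectRequest = buildConnectRequest('www.example.com', 443, 'Basic <token>');
console.log(connectRequest);
```

If the proxy answers with a 2xx status, the connection becomes an opaque byte stream to the target. A proxy that refuses the target port typically rejects the request at this stage, which is consistent with the port restrictions mentioned in the comments above.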
jancurn / hello_world.js
Created September 24, 2018 14:03
Apify SDK hello world example
const Apify = require('apify');

Apify.main(async () => {
    const requestQueue = await Apify.openRequestQueue();
    await requestQueue.addRequest(new Apify.Request({ url: 'https://www.iana.org/' }));
    const pseudoUrls = [new Apify.PseudoUrl('https://www.iana.org/[.*]')];

    const crawler = new Apify.PuppeteerCrawler({
        requestQueue,
        handlePageFunction: async ({ request, page }) => {
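
The preview ends inside the crawler options, and the rest is left as-is. The pseudo-URL `'https://www.iana.org/[.*]'` used above mixes literal text with a bracketed regular expression. The sketch below is a simplified illustration of that idea, assuming that text outside brackets matches literally and text inside brackets is a regular expression. The helper `pseudoUrlToRegExp` is hypothetical and is not the SDK's implementation; it does not handle brackets nested inside the regular expression part.

```javascript
// Escapes characters that have a special meaning in regular expressions.
function escapeLiteral(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper (not an Apify SDK API): converts a pseudo-URL to a RegExp.
function pseudoUrlToRegExp(pseudoUrl) {
    let pattern = '^';
    let rest = pseudoUrl;

    while (rest.length > 0) {
        const open = rest.indexOf('[');
        if (open === -1) {
            // No more bracketed parts: the remainder is literal text.
            pattern += escapeLiteral(rest);
            break;
        }
        const close = rest.indexOf(']', open);
        if (close === -1) throw new Error(`Unclosed "[" in pseudo-URL: ${pseudoUrl}`);

        pattern += escapeLiteral(rest.slice(0, open)); // literal prefix
        pattern += `(${rest.slice(open + 1, close)})`; // bracketed regex part
        rest = rest.slice(close + 1);
    }

    return new RegExp(`${pattern}$`);
}

const matcher = pseudoUrlToRegExp('https://www.iana.org/[.*]');
console.log(matcher.test('https://www.iana.org/domains'));
console.log(matcher.test('https://example.com/'));
```

Under these assumptions, a crawler can use such a matcher to decide which discovered links belong to the site being crawled and which should be ignored.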
jancurn / spider.py
Created September 4, 2019 06:53
Scrapy Executor code example
import scrapy
import apify


class MySpider(scrapy.Spider):
    name = 'apifySpider'

    def start_requests(self):
        urls = [
            'https://apify.com',
            'https://apify.com/store',
jancurn / example.js
Created August 15, 2018 09:09
Apify actor that demonstrates usage of Puppeteer live view
const Apify = require('apify');

Apify.main(async () => {
    const browser = await Apify.launchPuppeteer({ liveView: true });
    const page = await browser.newPage();
    await page.goto('https://news.ycombinator.com/');
    const pageTitle = await page.title();
    console.log(`Page loaded: ${pageTitle}`);

const Apify = require('apify');

Apify.main(async () => {
    // Get input of your act and print it
    const input = await Apify.getValue('INPUT');
    console.log('My input:');
    console.dir(input);

    // Save the output
    const output = {