@kai11
Created October 25, 2020 11:20
Running Archivebox through SOCKS5 proxy

Corresponding Archivebox issue ArchiveBox/ArchiveBox#249

I cannot use WireGuard because I cannot set it up on my VPS, so I decided to use https://hub.docker.com/r/ncarlier/redsocks/ instead. By default, however, that image generates a redsocks config for an unauthenticated HTTP/HTTPS proxy; for SOCKS5 the config must be modified.

Running Archivebox through SOCKS5 proxy

1. Creating docker bridge with name "archivebox"

docker network create --opt com.docker.network.bridge.name=archivebox -d bridge archivebox

2. Running redsocks

  1. Update the attached redsocks.tmpl file with your proxy information. It's a slightly modified version of the one from https://hub.docker.com/r/ncarlier/redsocks/
  2. The host and port passed on the command line will be ignored - I hardcoded them in my config, but ${proxy_ip} and ${proxy_port} can be used in the template instead.
  3. The command below must use the full path to the redsocks.tmpl file - a relative binding like -v ./redsocks.tmpl:... doesn't work on Docker 19.03.
  4. Redsocks also ignores traffic to private networks (see https://github.com/ncarlier/dockerfiles/blob/master/redsocks/whitelist.txt), so it's possible to run archivebox web, which fetches pages through the proxy, and still access it from the local network without the proxy. AFAIK this won't work with the WireGuard solution.
docker run -e "DOCKER_NET=archivebox" --name=archivebox_redsocks \
    -v <full_path>/redsocks.tmpl:/etc/redsocks.tmpl \
    --privileged=true --net=host -d ncarlier/redsocks 1.1.1.1 9000
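
Since the bind mount needs an absolute path, one way to avoid typing it by hand is to resolve it in the shell first. This is a minimal sketch, assuming redsocks.tmpl sits in the current directory; abs_path and start_redsocks are hypothetical helper names, not part of the redsocks image.

```shell
# Hypothetical helper: print the absolute path of a file using only POSIX tools.
abs_path() {
    printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# Hypothetical wrapper around the command above; only the -v argument differs.
start_redsocks() {
    docker run -e "DOCKER_NET=archivebox" --name=archivebox_redsocks \
        -v "$(abs_path ./redsocks.tmpl):/etc/redsocks.tmpl" \
        --privileged=true --net=host -d ncarlier/redsocks 1.1.1.1 9000
}
```

Running start_redsocks from the directory that contains redsocks.tmpl should then behave like the command above with the full path filled in.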

3. Testing with curl

Both commands must return the IP address of the SOCKS5 proxy, not the IP of the server.

docker run -it --network=archivebox curlimages/curl:latest curl https://ifcfg.co
docker run -it --network=archivebox curlimages/curl:latest curl http://ifcfg.co
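
The comparison can also be scripted. The sketch below assumes the proxy's public address is known in advance and is the same address ifcfg.co will report; check_exit_ip and PROXY_IP are hypothetical names introduced here for illustration.

```shell
# Hypothetical helper: compare the address reported by ifcfg.co with the expected proxy address.
check_exit_ip() {
    expected="$1"
    actual="$2"
    if [ "$actual" = "$expected" ]; then
        echo "OK: traffic leaves through the proxy ($actual)"
    else
        echo "FAIL: expected $expected, got $actual"
        return 1
    fi
}

# Usage (PROXY_IP is an assumed variable holding the proxy's public IP).
# -it is dropped because command substitution provides no TTY; -s silences curl's progress meter.
#   check_exit_ip "$PROXY_IP" "$(docker run --rm --network=archivebox curlimages/curl:latest curl -s https://ifcfg.co)"
```

A non-zero exit status here would suggest that traffic from the archivebox network is not being redirected, in which case the redsocks logs are the first place to look.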

4. Running Archivebox

# init data folder
docker run -it -v <full_path>/data:/data nikisweeting/archivebox init
# create superuser
docker run -it -v <full_path>/data:/data nikisweeting/archivebox manage createsuperuser
# import url (ArchiveBox options must be passed into the container with -e)
docker run -it -e ONLY_NEW=False -e USE_COLOR=True -e SHOW_PROGRESS=False \
    --network=archivebox -v <full_path>/data:/data nikisweeting/archivebox add <url>
# serve web to local IPs; urls added via the UI are imported through the proxy
docker run -d -p 9001:9001 --network=archivebox --name=archivebox_web \
    -v <full_path>/data:/data nikisweeting/archivebox server 0.0.0.0:9001
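
The one-off commands above repeat the same network and volume arguments, so they can be wrapped in a small shell function. This is only a sketch: abx, DATA_DIR and DOCKER are hypothetical names, and DATA_DIR is assumed to default to a data folder in the current directory.

```shell
# Assumed location of the data folder; override DATA_DIR with your own full path.
DATA_DIR="${DATA_DIR:-$PWD/data}"

# Hypothetical wrapper for one-off ArchiveBox commands on the proxied network.
# Setting DOCKER=echo prints the arguments instead of running docker (dry run).
abx() {
    ${DOCKER:-docker} run -it --network=archivebox \
        -v "$DATA_DIR:/data" nikisweeting/archivebox "$@"
}
```

With this in place, importing a url becomes abx add <url>. The long-running web container is left out on purpose, since it needs -d, -p and --name instead of -it.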

Limitations

  1. The redsocks container adds a new chain named REDSOCKS to iptables. Without changing that name, it's not possible to run multiple networking containers.
  2. A better solution would be to build a new redsocks container that takes the proxy type (socks5), host, port, user and password as arguments instead of just host/port.
  3. disclose_src in the redsocks config doesn't work: with it set, redsocks in the container doesn't start.

Troubleshooting

  1. Redsocks logs. Most of the problems I had with this setup were caused by a broken redsocks config or by redsocks not running. The command below must show "main.c:152 main(...) redsocks started" or nothing will work.
docker logs archivebox_redsocks
  2. Iptables rules
sudo iptables-save | grep REDSOCKS
  3. Cleanup iptables rules. Either stop the docker container (it removes the extra rules on shutdown) or run
sudo iptables-save | grep -v REDSOCKS | sudo iptables-restore
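
The log check from the first item can be automated. In this minimal sketch redsocks_started is a hypothetical helper name, and it matches only the "redsocks started" part of the message, on the assumption that the file and line prefix may differ between redsocks builds.

```shell
# Hypothetical helper: succeed only if the log text on stdin contains the startup message.
redsocks_started() {
    grep -q 'redsocks started'
}

# Usage: the config sets log = "stderr", so merge stderr into stdout before piping.
#   docker logs archivebox_redsocks 2>&1 | redsocks_started || echo "redsocks is not running"
```
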

redsocks.tmpl

base {
    log_debug = off;
    log_info = on;
    log = "stderr";
    daemon = off;
    user = redsocks;
    group = redsocks;
    redirector = iptables;
}
redsocks {
    local_ip = 0.0.0.0;
    local_port = 12345;
    type = socks5;
    ip = <socks5_ip>;
    port = <socks5_port>;
    login = <socks5_user>;
    password = <socks5_password>;
}
redsocks {
    local_ip = 0.0.0.0;
    local_port = 12346;
    type = socks5;
    ip = <socks5_ip>;
    port = <socks5_port>;
    login = <socks5_user>;
    password = <socks5_password>;
}
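
The <socks5_...> placeholders in the template can be filled in by hand or with a small script. The following is a sketch only: render_redsocks_tmpl and the file name redsocks.tmpl.in are hypothetical, and it assumes none of the values contain the characters |, & or \, which sed would treat specially.

```shell
# Hypothetical helper: substitute the four placeholders in a template read from stdin.
# usage: render_redsocks_tmpl IP PORT USER PASSWORD < template > output
render_redsocks_tmpl() {
    sed -e "s|<socks5_ip>|$1|g" \
        -e "s|<socks5_port>|$2|g" \
        -e "s|<socks5_user>|$3|g" \
        -e "s|<socks5_password>|$4|g"
}

# Usage (redsocks.tmpl.in is an assumed name for the unfilled template):
#   render_redsocks_tmpl 203.0.113.10 1080 myuser mypassword < redsocks.tmpl.in > redsocks.tmpl
```

This keeps the credentials out of the stored template, which is in the spirit of the second limitation, although it still runs outside the container.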