@felixlohmeier
Last active January 17, 2024 10:14
How To Install OpenRefine on a web server with Ubuntu 22.04

OpenRefine is intended to be installed locally as a desktop application. Because of its client-server architecture, it can also be installed on a web server and shared among multiple users. This can be useful even though OpenRefine has no user management, e.g. temporarily for a workshop or permanently within a protected network.

Security warning

Can I somehow host OpenRefine for others to access?

OpenRefine has no built-in security for multi-user or multi-tenant scenarios. OpenRefine has a single data model that is not shared, so there is a risk of columnar data operations being overwritten by other users, so care must be taken by users. Having said that, if you are inclined to proceed at your own risk, you can get some security by using a proxy.

https://github.com/OpenRefine/OpenRefine/wiki/FAQ
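The FAQ's suggestion of a proxy could look roughly like the following sketch. It assumes nginx as the proxy and HTTP basic authentication, neither of which is named in the FAQ or used in this guide. The function name nginx_openrefine_conf and the default password file path are hypothetical. The function only prints a server block, so the result can be reviewed before it is saved anywhere.

```shell
# Hypothetical helper: print an nginx server block that puts a password
# prompt in front of OpenRefine. nginx and basic auth are assumptions.
# Usage: nginx_openrefine_conf [UPSTREAM_PORT] [HTPASSWD_FILE]
nginx_openrefine_conf() {
  upstream_port="${1:-3333}"
  htpasswd_file="${2:-/etc/nginx/openrefine.htpasswd}"
  cat <<EOF
server {
    listen 80;
    # nginx rejects request bodies over 1 MB by default; 0 disables the
    # limit so that larger data files can be uploaded.
    client_max_body_size 0;
    location / {
        auth_basic "OpenRefine";
        auth_basic_user_file ${htpasswd_file};
        proxy_pass http://127.0.0.1:${upstream_port};
        proxy_set_header Host \$host;
    }
}
EOF
}

# Example (as root): nginx_openrefine_conf > /etc/nginx/sites-available/openrefine
```

For the proxy to be the only way in, OpenRefine would have to listen on 127.0.0.1 only (that is, be started without -i 0.0.0.0), and the iptables redirect from the installation steps would be left out. The password file can be created with htpasswd from the apache2-utils package.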

Installation (without Authentication)

Tested on Ubuntu 22.04 LTS

  1. Install Java 11 and Unzip
apt update && apt install openjdk-11-jre-headless unzip
  2. Configure Java to behave better on cloud infrastructure
# Switch the entropy source from /dev/random to /dev/urandom
sed -i 's/securerandom.source=file:\/dev\/random/securerandom.source=file:\/dev\/urandom/' /etc/java-11-openjdk/security/java.security
  3. Install OpenRefine 3.7.7
mkdir /opt/openrefine
wget https://github.com/OpenRefine/OpenRefine/releases/download/3.7.7/openrefine-linux-3.7.7.tar.gz
# Unpack without the top-level directory of the archive
tar -xzf openrefine-linux-3.7.7.tar.gz -C /opt/openrefine --strip-components=1
  4. Create a systemd service for OpenRefine
adduser --system openrefine
echo "[Unit]
Description=OpenRefine
[Service]
User=openrefine
ExecStart=/opt/openrefine/refine -i 0.0.0.0
TimeoutStopSec=3600s
Restart=always
RestartSec=10
[Install]
WantedBy=default.target
" > /etc/systemd/system/openrefine.service
systemctl enable openrefine.service
systemctl start openrefine.service
  5. Redirect port 80 to 3333
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3333
apt install iptables-persistent
  6. Optional: Adjust memory setting in refine.ini
sed -i 's/REFINE_MEMORY=1400M/REFINE_MEMORY=6192M/' /opt/openrefine/refine.ini
systemctl restart openrefine.service
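
The sed call in the optional memory step only matches while refine.ini still contains the default value 1400M. A minimal sketch of a variant that replaces whatever value is currently set is shown here. The function name set_refine_memory is hypothetical, and GNU sed (as shipped with Ubuntu) is assumed for the -i option.

```shell
# Hypothetical helper: set REFINE_MEMORY in a refine.ini file, whatever
# its current value is. Assumes GNU sed for in-place editing (-i).
# Usage: set_refine_memory FILE SIZE   (SIZE e.g. 6192M)
set_refine_memory() {
  ini_file="$1"
  size="$2"
  # The ^ anchor restricts the match to lines that begin with
  # REFINE_MEMORY=, so other settings are left untouched.
  sed -i "s/^REFINE_MEMORY=.*/REFINE_MEMORY=${size}/" "$ini_file"
}

# Example (as root):
#   set_refine_memory /opt/openrefine/refine.ini 6192M
#   systemctl restart openrefine.service
```

As in the original step, the new value takes effect only after the service is restarted.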

@ymerouani

Thank you for the guide! I have followed it, but being as inexperienced as I am, I do not know how to then access OpenRefine remotely from my laptop. I can log in fine to my server using PuTTY (it uses SSH).

@felixlohmeier

Hi @matriim, if you followed all the steps (including step 5) and no other web server is running on the machine, then OpenRefine should be reachable in a browser at the IP address of the server. Example: http://198.51.100.10
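
Whether the server actually answers can also be checked from a terminal before opening a browser. The following is a small sketch, assuming curl is installed on the machine running the check. The function names openrefine_url and check_openrefine are hypothetical, and treating HTTP status 200 as "OpenRefine is up" is an assumption.

```shell
# Hypothetical helper: build the URL for a host and an optional port
# (default 80, the port that step 5 redirects to OpenRefine).
openrefine_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-80}"
}

# Hypothetical helper: request the start page and report the result.
# Usage: check_openrefine HOST [PORT]
check_openrefine() {
  url=$(openrefine_url "$@")
  # -m 5 gives up after five seconds; %{http_code} prints 000 when no
  # HTTP response was received at all.
  status=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")
  if [ "$status" = "200" ]; then
    echo "OpenRefine answers at $url"
  else
    echo "No usable answer from $url (HTTP status $status)"
    return 1
  fi
}

# Example: check_openrefine 198.51.100.10
```

Running check_openrefine localhost 3333 on the server itself skips the port redirect, which helps to tell a stopped service apart from a missing or blocked redirect.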
