@bagder
Last active May 31, 2018 09:13
URLs are dangerous things

curl is a tool, libcurl is a library. They're used to retrieve or send data, where the source or destination is specified by a URL.

URL is short for Uniform Resource Locator and typically identifies a "resource" on a remote server.

This document discusses some aspects and precautions that need to be considered when applications pass URLs to curl or libcurl to work on.

What if the user can set the URL

Applications may find it tempting to let users set the URL to work on. That can be fine, but it opens the door to mischief and trickery that you as an application author may want to address or take precautions against.

If your curl-using script allows a custom URL, do you also, perhaps unintentionally, allow the user to pass other options to the curl command line through creative use of special characters?

If the user can set the URL, the user can also set the scheme part to select other protocols that you did not intend and perhaps never considered. curl supports over 20 different URL schemes: "http://" might be what you expected, but "ftp://" or "imap://" might be what the user gives your application...

If .netrc use is enabled, setting a user name in the URL will make the application read the .netrc file and automatically pass credentials on to the remote server!

Remedies:

  • curl command lines can use --proto to limit which URL schemes curl accepts
  • libcurl programs can use CURLOPT_PROTOCOLS to the same effect
  • consider not allowing the user to set the full URL
  • consider strictly filtering input to only allow specific choices
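
The last two remedies can be sketched in code. Below is a small illustration (in Python, independent of curl itself) of strict input filtering before a URL is ever handed to a curl command line or to libcurl. The allowed-scheme set and the function name are assumptions for this example; adjust them to what your application actually needs.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only the schemes this application actually needs.
ALLOWED_SCHEMES = {"http", "https"}

def is_acceptable_url(url):
    """Return True only if the URL uses an explicitly allowed scheme,
    names a host, and carries no embedded user name (which could
    otherwise trigger .netrc credential lookups)."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        return False
    if parts.username is not None:  # reject user@host forms
        return False
    return bool(parts.hostname)

print(is_acceptable_url("https://example.com/path"))   # True
print(is_acceptable_url("imap://user@mail.example"))   # False
```

Rejecting embedded user names on top of the scheme check closes off the .netrc trick described above.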

Un-authenticated connections

Protocols that don't have any form of cryptographic authentication cannot know with any certainty that they are communicating with the right remote server.

Even if your application uses a fixed scheme or a fixed host name, it is not safe as long as the connection is un-authenticated: there can be a man-in-the-middle, or the whole server might in fact have been replaced by an evil actor.

Un-authenticated protocols are unsafe. The data that comes back to curl may have been injected by an attacker. The data that curl sends might be modified before it reaches the intended server, if it even reaches the intended server at all.

Remedies:

  • Restrict operations to authenticated transfers
  • Make sure the server's certificate and other credentials are verified
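
For libcurl programs, the verification remedy can be made explicit. A minimal sketch, assuming the Python pycurl binding is available; SSL_VERIFYPEER and SSL_VERIFYHOST correspond to libcurl's CURLOPT_SSL_VERIFYPEER and CURLOPT_SSL_VERIFYHOST (both on by default in modern libcurl, but stating them guards against accidental overrides elsewhere in the code).

```python
import pycurl

# Sketch: enforce an authenticated, certificate-verified transfer.
c = pycurl.Curl()
c.setopt(pycurl.URL, "https://example.com/")
c.setopt(pycurl.SSL_VERIFYPEER, 1)   # verify the server's certificate chain
c.setopt(pycurl.SSL_VERIFYHOST, 2)   # verify the certificate matches the host name
c.setopt(pycurl.PROTOCOLS, pycurl.PROTO_HTTPS)  # refuse everything but HTTPS
# c.perform() would now fail rather than talk to an unverified server.
c.close()
```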

FTP uses two connections

When performing an FTP transfer, two TCP connections are used: one for setting up the transfer and one for the actual data.

FTP is not only un-authenticated, but the setup of the second connection is also a weak spot. The data connection is established either with the PORT/EPRT command, which makes the server connect back to the client on a given IP+PORT, or with PASV/EPSV, which makes the server open a listening port and tell the client which IP+PORT to connect to.
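
To make that weak spot concrete, here is a sketch of how a client derives the data-connection target from a 227 PASV reply; on an un-authenticated connection, every number in that reply is under the server's (or a man-in-the-middle's) control. The reply string below is a made-up example.

```python
import re

def parse_pasv_reply(reply):
    """Extract (ip, port) from an FTP '227 Entering Passive Mode' reply.
    The six numbers are h1,h2,h3,h4,p1,p2: four IP bytes plus a 16-bit
    port split into its high and low bytes."""
    m = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not m:
        raise ValueError("not a PASV reply")
    n = [int(x) for x in m.groups()]
    ip = ".".join(str(x) for x in n[:4])
    port = n[4] * 256 + n[5]
    return ip, port

# The server fully controls these values, so the client can be pointed
# at any IP+PORT, including a third-party host:
print(parse_pasv_reply("227 Entering Passive Mode (192,0,2,7,19,136)"))
# → ('192.0.2.7', 5000)
```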

Again, un-authenticated means that the connection might be meddled with by a man-in-the-middle or that there's a malicious server pretending to be the right one.

A malicious FTP server can respond to PASV with the IP+PORT of a totally different machine, perhaps even a third-party host; when many clients then try to connect to that third party, it amounts to a DDoS against it. In an upload operation, the client can be made to send its data to another site, and if the attacker can also influence what data the client uploads, the upload can be shaped to look like an HTTP request, effectively making the client issue HTTP requests to third-party hosts.

An attacker that manages to control curl's command line options can tell curl to send an FTP PORT command to ask the server to connect to a third party host instead of back to curl.

The fact that FTP uses two connections makes it vulnerable in a way that is hard to avoid.

Malicious servers

Similar to a Man-In-The-Middle attack, a server can of course have been "taken over" by an attacker and is now used to send back malicious or otherwise deliberately bad content.

Authenticated protocols don't fully protect against servers being hacked and modified, since at times attackers manage to replace contents and affect responses while still being perfectly authenticated.

"Bounces" to another server

The dual-connection nature of FTP and the redirect feature of HTTP(S) allow a server to redirect curl to another server and port. With HTTP(S) redirects, it can even change protocol, as long as curl has not been told to disallow that protocol for redirects.

The risk of your application following malicious redirects of course increases if you allow users to enter URLs without strict filtering.
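
On the command line, --proto-redir exists to constrain this; in a libcurl program the equivalent option is CURLOPT_REDIR_PROTOCOLS. A minimal sketch, assuming the Python pycurl binding:

```python
import pycurl

# Sketch: follow redirects, but constrain where they may lead.
c = pycurl.Curl()
c.setopt(pycurl.URL, "https://example.com/")
c.setopt(pycurl.FOLLOWLOCATION, 1)                    # follow HTTP redirects
c.setopt(pycurl.MAXREDIRS, 5)                         # but not forever
c.setopt(pycurl.REDIR_PROTOCOLS, pycurl.PROTO_HTTPS)  # redirects may only go to HTTPS
c.close()
```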

Localhost is hard to protect

Allowing users to specify the host name part of the URL makes it really hard for your application to avoid the risk of hitting a local server instead of the (intended?) remote one.

Having your application connect to a local host, be it the same machine that runs the application or a machine on the same local network, may be exploitable to "port-scan" those hosts, depending on how the application reacts to the responses.

curl has no way to protect against accesses to localhost. Filtering the URL for 127.0.0.1, ::1 or variations of "localhost" will NOT be enough, since any host name can be set up to return a local IP address for curl to work with. And curl similarly cannot know the IP ranges of your local networks; even if it did, it connects to those networks just as easily as to any other network.
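
That said, an application can still reduce, though not eliminate, the exposure by checking the resolved addresses rather than the URL string. A sketch in Python; the function name is made up for this example, and as the text above warns, this remains a partial mitigation only, since a DNS name can change its answer between this check and curl's own lookup (DNS rebinding).

```python
import ipaddress
import socket

def resolves_only_to_public(hostname):
    """Best-effort check: resolve the name and reject loopback, private
    and link-local addresses. A partial mitigation only: the answer can
    differ when curl later resolves the same name itself."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_loopback or addr.is_private or addr.is_link_local:
            return False
    return True

print(resolves_only_to_public("localhost"))  # False
```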

RFC 3986 vs WHATWG URL

curl supports URLs mostly according to how they are defined in RFC 3986, and has done so since the beginning.

Web browsers mostly adhere to the WHATWG URL Specification.

This divergence means that some URLs copied between browsers (or returned over HTTP for redirection) and curl are not interpreted the same way. This can mislead users into getting the wrong thing, connecting to the wrong host, or otherwise seeing different behavior.

bagder commented May 31, 2018

The gist of this (haha) was merged into this official document:

https://curl.haxx.se/libcurl/security.html
