@Stanback
Last active December 14, 2016 16:59
Regex for fixing improperly formatted URIs
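/*
* Snippet to percent-encode invalid characters in improperly formatted URIs
*
* RFC 3986 allows the following characters in a URI, plus "%" when it
* introduces a percent-encoded octet such as "%20":
* ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=
*/
import java.net.URLEncoder
import scala.util.matching.Regex

// Match any character outside the allowed set, or a "%" that is not followed by two hex digits.
// Note: URLEncoder is a form encoder, so a space is written as "+" rather than "%20".
"""[^A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]|%(?![0-9a-fA-F]{2})""".r
  .replaceAllIn("http://testingurl.com/?foo%20bar baz",
    m => Regex.quoteReplacement(URLEncoder.encode(m.group(0), "UTF-8")))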
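The same idea can be wrapped in a reusable helper. The following is a minimal sketch, not part of the original gist: the names `UriFixer` and `fixUri` are hypothetical, and it assumes you want spaces written as `%20` instead of the `+` that `URLEncoder` produces. It also treats a backtick as invalid and encodes a stray `%` that is not followed by two hex digits.

```scala
import java.net.URLEncoder
import scala.util.matching.Regex

object UriFixer {
  // Any character outside the RFC 3986 set, or a "%" not followed by two hex digits.
  private val Invalid: Regex =
    """[^A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]|%(?![0-9a-fA-F]{2})""".r

  // Hypothetical helper: percent-encodes every invalid match as UTF-8.
  def fixUri(uri: String): String =
    Invalid.replaceAllIn(uri, m =>
      // "+" can only come from an encoded space here, because a literal "+"
      // is in the allowed set and is never matched.
      Regex.quoteReplacement(
        URLEncoder.encode(m.group(0), "UTF-8").replace("+", "%20")))
}
```

With this sketch, `UriFixer.fixUri("http://testingurl.com/?foo%20bar baz")` leaves the existing `%20` untouched and encodes only the raw space. Rewriting `+` to `%20` is a design choice: it keeps the result valid in path segments, where `+` is not read as a space.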