@ianmilligan1
Created April 16, 2020 20:25
.select($"crawl_date", $"url", RemoveHTMLDF(ExtractBoilerpipeTextDF(RemoveHTTPHeaderDF($"content"))))