@kylebgorman
Created March 29, 2020 13:32
Downloads English WMT news crawl data
#!/bin/bash
# Downloads all the WMT News Crawl data for English.
set -euo pipefail
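# curl itself expands [07-19] to 07, 08, ..., 19 and substitutes the current
# value for #1 in the output name; -C - resumes interrupted downloads.
# The URL is quoted so the shell never treats the brackets as a glob pattern.
curl -C - "http://data.statmt.org/news-crawl/en/news.20[07-19].en.shuffled.deduped.gz" -o "news.20#1.gz"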
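To see which files the bracket range in the command above requests, the expansion can be spelled out as an explicit loop. This is only a sketch: `news_crawl_urls` is a hypothetical helper, not part of the original script. It prints the URLs without downloading anything and does not check that each year exists on the server.

```bash
#!/bin/bash
# Sketch: print the URLs that curl's [07-19] range expands to, one per line.
set -euo pipefail

# Hypothetical helper (an assumption for illustration, not in the original gist).
news_crawl_urls() {
  local y
  for ((y = 7; y <= 19; y++)); do
    # %02d zero-pads the two-digit year suffix: 07, 08, ..., 19.
    printf 'http://data.statmt.org/news-crawl/en/news.20%02d.en.shuffled.deduped.gz\n' "$y"
  done
}

news_crawl_urls
```

Piping this list into a per-URL download loop would be one way to handle each year separately, at the cost of the single-command brevity of curl's built-in range.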