@mayhewsw
Created March 20, 2019 19:41
Assuming a text-vector file whose first line is a "<count> <dim>" header, this script keeps the first N vectors and rewrites the header count accordingly.
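# Usage: pass the input path and the output path as the two arguments.
# Number of vector lines you want
N=50000
IN=$1
OUT=$2
# Get the dimension from the header (second field of "<count> <dim>").
DIM=$(head -n 1 "$IN" | cut -d' ' -f2)
# Take the header plus the top N vectors.
head -n $((N + 1)) "$IN" > "$OUT"
# The input may hold fewer than N vectors, so count what was actually copied.
COUNT=$(awk 'END { print NR - 1 }' "$OUT")
# Fix the header. Note that -i.bak leaves a backup copy at $OUT.bak.
sed -i.bak "1 s/^.*$/${COUNT} ${DIM}/" "$OUT"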
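
After trimming, it can be worth confirming that the header still agrees with the body. The following is a minimal sketch, not part of the original gist: `check_vectors` is a hypothetical helper name, and it assumes whitespace-separated fields with one token followed by DIM numbers per line, so tokens that themselves contain spaces would be reported as malformed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original gist): sanity-check a
# text-vector file whose first line is "<count> <dim>".
# Returns 0 if the header matches the body, 1 otherwise.
check_vectors() {
    local file=$1
    awk '
        BEGIN { bad = 0 }
        # Header line: remember the declared count and dimension.
        NR == 1 { count = $1; dim = $2; next }
        # Every vector line should be one token plus dim numbers.
        NF != dim + 1 {
            printf "line %d: expected %d fields, got %d\n", NR, dim + 1, NF
            bad = 1
        }
        END {
            # NR counts the header too, so the body has NR - 1 lines.
            if (NR - 1 != count) {
                printf "header says %d vectors, found %d\n", count, NR - 1
                bad = 1
            }
            exit bad
        }
    ' "$file"
}
```

Calling `check_vectors "$OUT"` at the end of the script prints any mismatches and returns a non-zero status if the file is inconsistent.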