@aravindpai
Created May 22, 2020 19:54
i = 0
while True:
    # count the frequency of each adjacent symbol pair in the OOV word
    pairs = get_stats(oov)
    # keep only the pairs themselves
    pairs = pairs.keys()
    # positions, within the learned merge list, of the pairs present in the word
    ind = [merges.index(pair) for pair in pairs if pair in merges]
    if len(ind) == 0:
        print("\nBPE Completed...")
        break
    # lowest index = earliest learned merge, i.e. the most frequent one during training
    best = merges[min(ind)]
    # merge the best pair
    oov = merge_vocab(best, oov)
    print("Iteration ", i + 1, list(oov.keys())[0])
    i = i + 1
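
The loop above relies on `get_stats`, `merge_vocab`, `oov`, and `merges`, none of which are defined in this gist. The following is a minimal, self-contained sketch of how the pieces might fit together. It assumes `oov` is a dict mapping a space-separated word to a count, that `merges` is a list of symbol pairs in the order they were learned, and that words end with a `</w>` marker. The helper bodies are modeled on the commonly used reference BPE snippet and may differ from the author's versions. The `apply_merges` wrapper and the sample merge list are hypothetical, added only for illustration.

```python
import collections
import re


def get_stats(vocab):
    """Count adjacent symbol pairs across a {space-separated word: frequency} dict."""
    pairs = collections.Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_vocab(pair, vocab):
    """Fuse every standalone occurrence of `pair` into a single symbol."""
    bigram = re.escape(" ".join(pair))
    # the lookarounds ensure we match whole symbols, not parts of longer ones
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    merged = "".join(pair)
    return {pattern.sub(lambda m: merged, word): freq for word, freq in vocab.items()}


def apply_merges(word, merges):
    """Hypothetical wrapper: segment `word` using the learned merge list."""
    # assumption: characters are space-separated and the word ends with </w>
    oov = {" ".join(word) + " </w>": 1}
    # rank lookup replaces repeated merges.index() calls; lower rank = learned earlier
    rank = {pair: i for i, pair in enumerate(merges)}
    while True:
        known = [pair for pair in get_stats(oov) if pair in rank]
        if not known:
            break
        best = min(known, key=rank.__getitem__)
        oov = merge_vocab(best, oov)
    return next(iter(oov)).split()


# illustrative merge list, not taken from any real training run
merges = [("e", "s"), ("es", "t"), ("est", "</w>"), ("l", "o"), ("lo", "w")]
tokens = apply_merges("lowest", merges)
print(tokens)
```

Precomputing the rank dict keeps each lookup constant-time, whereas `merges.index` scans the list for every candidate pair; the selected merge is the same in both cases.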