@ayushoriginal
Created June 24, 2019 12:22
clean_stopword_cs
using System.Collections.Generic;
using System.IO;

// Load the stopword list (one word per line) straight into a HashSet
// so that each membership check below is O(1) on average.
var stopword_set = new HashSet<string>(File.ReadLines("stopword.txt"));

// Let tokens_str be the tokenized version of the string.
// stop_tokens_str collects the tokens that remain after stopwords are removed.
var stop_tokens_str = new List<string>();
foreach (string word in tokens_str)
{
    if (!stopword_set.Contains(word))
        stop_tokens_str.Add(word);
}
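
The snippet above assumes that `tokens_str` already exists and that matching is case-sensitive. As a rough sketch only, the same filtering could be wrapped in reusable helpers like the ones below. The class name `StopwordFilter`, the whitespace-only `Tokenize` helper, and the case-insensitive comparison are all assumptions made for illustration; none of them come from the original gist.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StopwordFilter
{
    // Hypothetical tokenizer: splits on whitespace only.
    // A real tokenizer would usually also strip punctuation.
    public static List<string> Tokenize(string text) =>
        text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).ToList();

    // Returns the tokens that are not in the stopword list, in their original order.
    // Assumption: "The" and "the" should both be treated as the same stopword.
    public static List<string> RemoveStopwords(
        IEnumerable<string> tokens, IEnumerable<string> stopwords)
    {
        var set = new HashSet<string>(stopwords, StringComparer.OrdinalIgnoreCase);
        return tokens.Where(t => !set.Contains(t)).ToList();
    }
}
```

One possible use is `StopwordFilter.RemoveStopwords(StopwordFilter.Tokenize(text), File.ReadLines("stopword.txt"))`, which reads the same `stopword.txt` file as the snippet above.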