@XavierGeerinck
Created October 4, 2016 08:26
Finds taboo words in an academic paper or thesis and recommends replacements

Info

This utility finds taboo words in .tex files and recommends replacements that suit an academic or thesis writing style. All recommendations come from: https://www.scribbr.com/academic-writing/taboo-words/

Usage: node taboolist_word_finder.js <WordListFile> <DirWithTexFiles>

taboolist_word_finder.js

// Make sure we got both a word list and a directory on the command line.
if (process.argv.length < 4) {
	console.log('Usage: node ' + process.argv[1] + ' <WordListFile> <DirWithTexFiles>');
	process.exit(1);
}

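const fs = require('fs');
const wordList = process.argv[2];
const dirPath = process.argv[3];
const extension = '.tex';

let files = fs.readdirSync(dirPath);
// Read the word list passed on the command line instead of a hard-coded file name.
let tabooList = fs.readFileSync(wordList).toString();
// Entries are separated by commas; drop blank entries such as the one after a trailing comma.
let tabooListWords = tabooList.split(',').filter(entry => entry.trim() !== '');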

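for (let file of files) {
	// Only check files whose name ends with the .tex extension.
	if (!file.endsWith(extension)) {
		continue;
	}

	let fileContent = fs.readFileSync(dirPath + '/' + file).toString();

	for (let tabooListWord of tabooListWords) {
		tabooListWord = tabooListWord.trim();

		// Split on the first colon only, because a correction may itself contain a colon.
		let separator = tabooListWord.indexOf(':');
		if (separator < 1) {
			continue; // skip blank or malformed entries
		}
		let tabooWord = tabooListWord.slice(0, separator).trim();
		let tabooCorrection = tabooListWord.slice(separator + 1).trim();

		// Report every occurrence, not just the first one.
		let position = fileContent.indexOf(tabooWord);
		while (position > -1) {
			console.log(`Found word "${tabooWord}" in file "${file}" at position ${position}, replace with: "${tabooCorrection}"`);
			position = fileContent.indexOf(tabooWord, position + tabooWord.length);
		}
	}
}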
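
The script above matches by plain, case-sensitive substring search, so a short entry such as "so" also fires inside "also", and "So" at the start of a sentence is missed. The sketch below shows one possible way to match whole words only and to report a line and column instead of a character offset. The helpers `escapeRegExp` and `findWholeWord` are hypothetical names that are not part of the original gist, and the sketch assumes a Node.js version that supports regular expression lookbehind and `String.prototype.matchAll`.

```javascript
// Hypothetical helpers, not part of the original gist.

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return every whole-word, case-insensitive match of `word` in `content`,
// with a 1-based line number and column for each hit.
function findWholeWord(content, word) {
  // The lookbehind and lookahead reject matches embedded in a longer word.
  const pattern = new RegExp(
    '(?<![A-Za-z])' + escapeRegExp(word) + '(?![A-Za-z])',
    'gi'
  );
  const hits = [];
  content.split('\n').forEach((line, index) => {
    for (const match of line.matchAll(pattern)) {
      hits.push({ line: index + 1, column: match.index + 1 });
    }
  });
  return hits;
}

// "so" inside "also" is ignored; "So" at the start of line 2 is reported.
const sampleText = 'We also show results.\nSo, you see.';
console.log(findWholeWord(sampleText, 'so'));
```

Because the text is searched line by line, a multi-word entry such as "a lot of" that wraps across a line break in the .tex source would not be found by this sketch.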

wordlist.txt

a bit:remove/somewhat, 
a lot of:many/several/a great number of/eight,
a couple of:many/several/a great number of/eight,
america:the United States/the US/the USA, 
isn’t:is not, 
can’t:cannot, 
doesn’t:does not, 
would've:would have, 
kind of:somewhat/… to some degree, 
sort of:somewhat/… to some degree, 
til:until/to, 
till:until/to, 
you:one/reform sentence (you can clearly see the results —> the results can clearly be seen), 
your:one/reform sentence (you can clearly see the results —> the results can clearly be seen), 
bad:poor/negative, 
big:large/sizable, 
humungous:large/sizable, 
get:receives,
gets:receives, 
give:provides/offers/presents, 
good:useful/prime, 
show:illustrates/demonstrates/reveals, 
stuff:belongings/possessions/personal effects, 
thing:details/findings/recommendations,
things:details/findings/recommendations, 
always:frequently/commonly/typically, 
never:frequently/commonly/typically, 
perfect:an ideal solution/one of the best solutions, 
best:an ideal solution/one of the best solutions, 
worst:an ideal solution/one of the best solutions, 
most:an ideal solution/one of the best solutions, 
always:an ideal solution/one of the best solutions, 
never:an ideal solution/one of the best solutions, 
very:important/critical/crucial, 
extremely:important/critical/crucial, 
really:important/critical/crucial, 
too:important/critical/crucial,
so:important/critical/crucial, 
beautiful:<remove>, 
ugly:<remove>, 
wonderful:<remove>, 
horrible:<remove>, 
good:<remove>, 
bad:<remove>, 
naturally:<remove>, 
obviously:<remove>, 
of course:<remove>, 
has got:has, 
have got:have, 
serves to:<remove>, 
helps to:<remove>, 
literally:<rephrase> example: were literally dying to —> Were dying/very eager to, 
would of:would have, 
had of:had, 
think outside of the box:<avoid>, 
but at the end of the day:<avoid>, 
photos:<avoid>, 
fridge:<avoid>, 
phone:<avoid>, 
info:<avoid>, 
cops:<avoid>, 
cool:<avoid>, 
firemen:<avoid>, 
mankind:<avoid>
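
The list above uses one "word:recommendation" entry per line, separated by commas. Some words (for example "bad", "good", "always" and "never") appear more than once, and at least one recommendation contains a second colon. A minimal sketch of a more forgiving parser follows, assuming that no recommendation contains a comma or a line break. `parseWordList` is a hypothetical helper that is not part of the original gist.

```javascript
// Hypothetical helper, not part of the original gist.
// Parse "word:recommendation" entries separated by commas or line breaks.
function parseWordList(text) {
  const entries = new Map();
  for (const rawEntry of text.split(/,|\r?\n/)) {
    const entry = rawEntry.trim();
    // Split on the first colon only; later colons belong to the recommendation.
    const separator = entry.indexOf(':');
    if (separator < 1) {
      continue; // skip blank entries and entries without a word
    }
    const word = entry.slice(0, separator).trim();
    const advice = entry.slice(separator + 1).trim();
    // Merge the advice for words that are listed more than once.
    const known = entries.get(word);
    entries.set(word, known ? known + ' / ' + advice : advice);
  }
  return entries;
}

const sampleList = [
  'bad:poor/negative, ',
  'gets:receives ',
  'give:provides/offers/presents, ',
  'bad:<remove>, ',
  'literally:<rephrase> example: were dying to'
].join('\n');

const parsed = parseWordList(sampleList);
console.log(parsed.get('bad'));
console.log(parsed.get('literally'));
```

Splitting on line breaks as well as commas means an entry that is missing its trailing comma is still read as a separate entry instead of being merged with the next line.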