Nitin Madnani desilinguist

function FindProxyForURL(url, host) {
    if (shExpMatch(url, "https://gitlab.ets.org/*") || shExpMatch(url, "https://artifactory.ets.org/*")) {
        return "PROXY localhost:9128";
    }
    return "DIRECT";
}
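
The routing rule in the PAC function above can be checked outside a browser with a rough Python sketch. This is only an approximation: it assumes `fnmatch.fnmatchcase` is a close enough stand-in for `shExpMatch` for these two simple `*` patterns, and the names `PROXIED_PATTERNS` and `find_proxy_for_url` are hypothetical, not part of the gist.

```python
from fnmatch import fnmatchcase

# URL patterns copied from the PAC function; the tuple name is hypothetical.
PROXIED_PATTERNS = (
    "https://gitlab.ets.org/*",
    "https://artifactory.ets.org/*",
)


def find_proxy_for_url(url):
    """Approximate the PAC decision: proxy the two hosts, go direct otherwise."""
    # fnmatchcase is used as a stand-in for shExpMatch (an assumption);
    # both treat '*' as "match any run of characters".
    if any(fnmatchcase(url, pattern) for pattern in PROXIED_PATTERNS):
        return "PROXY localhost:9128"
    return "DIRECT"


if __name__ == "__main__":
    for candidate in ("https://gitlab.ets.org/group/repo", "https://example.com/"):
        print(candidate, "->", find_proxy_for_url(candidate))
```

Note that the patterns include the scheme, so a plain `http://` URL for the same hosts would fall through to `DIRECT`.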
desilinguist / catplot.py
Created September 8, 2022 23:13
`generate_learning_curve_plots()` from SKLL but using `catplot()`
def generate_learning_curve_plots(experiment_name,
                                  output_dir,
                                  learning_curve_tsv_file):
    """
    Generate the learning curve plots given the TSV output
    file from a learning curve experiment.

    Parameters
    ----------
    experiment_name : str
desilinguist / facebook-pnas.md
Last active August 29, 2015 14:03
Analyzing Facebook's PNAS paper on Emotional Contagion

The Internet has been abuzz today about Facebook data scientists publishing a paper in the Proceedings of the National Academy of Sciences describing experiments in which they deliberately removed posts from users' news feeds. I decided to read the paper, describe what was actually done, and write up some of my observations, both as a scientist and as a Facebook user. I try my best to use language that should be accessible to almost everyone, not just other scientists.

What they did

The Facebook data scientists selected 689,003 people who viewed Facebook in English as subjects for this experiment. The experiments took place over one week, from January 11th through 18th, 2012. The basic idea of the experiment was to measure the effect of removing positive (or negative) posts from people's news feeds on how positive (or negative) their own posts were in the days after these changes were made.

Now on to the details of the experiment. The

desilinguist / download_videos.sh
Last active August 29, 2015 14:01
Downloading YouTube videos from titles in a filename.
python geturls.py --f test.txt | cut -d'|' -f2 | youtube-dl -a -
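
The middle stage of the pipeline above, `cut -d'|' -f2`, keeps the second `|`-separated field of each line. A minimal Python sketch of that one step follows, assuming (as the pipeline suggests but the gist preview does not show) that `geturls.py` prints one `|`-delimited record per line with the URL in the second field. The function name `second_field` and the sample record are hypothetical.

```python
def second_field(lines, delimiter="|"):
    """Yield the second delimiter-separated field of each line, like `cut -f2`."""
    for line in lines:
        line = line.rstrip("\n")
        if delimiter not in line:
            # Without -s, cut passes lines that lack the delimiter through unchanged.
            yield line
            continue
        fields = line.split(delimiter)
        yield fields[1]


if __name__ == "__main__":
    # Hypothetical record layout: "<title>|<url>".
    sample = ["Some video title|https://example.com/watch?v=abc\n"]
    for url in second_field(sample):
        print(url)
```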
desilinguist / tokenize.doctest
Created February 14, 2012 13:56
Latest version of treebank.py that fixes comma and colon errors when followed by numbers. Also the latest version of tokenize.doctest that tests for these errors.
.. Copyright (C) 2001-2012 NLTK Project
.. For license information, see LICENSE.TXT
>>> from nltk.tokenize import *
Regression Tests: Treebank Tokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some test strings.
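
The kind of error the description above refers to can be illustrated with a small regex sketch: a comma or colon should be split off as its own token unless a digit follows it, so that strings like `1,000` and `3:30` stay intact. This is an illustrative approximation written with the standard `re` module, not the code from `treebank.py`; the function name `split_commas_and_colons` is hypothetical.

```python
import re


def split_commas_and_colons(text):
    """Separate ',' and ':' into their own tokens unless a digit follows."""
    # Pad the punctuation with spaces only when the next character is not a digit.
    text = re.sub(r"([:,])([^\d])", r" \1 \2", text)
    # Handle a comma or colon at the very end of the string.
    text = re.sub(r"([:,])$", r" \1 ", text)
    return text.split()


if __name__ == "__main__":
    print(split_commas_and_colons("It costs 1,000 dollars, at 3:30 today:"))
```

With this rule, `1,000` and `3:30` each remain a single token, while the comma after `dollars` and the trailing colon become separate tokens.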
desilinguist / tokenize.doctest
Created December 21, 2011 15:35
Fixed Treebank Tokenizer for NLTK
.. Copyright (C) 2001-2012 NLTK Project
.. For license information, see LICENSE.TXT
>>> from nltk.tokenize import *
Regression Tests: Treebank Tokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some test strings.