Claudia Hauff chauff
@chauff
chauff / keybase.md
Created July 13, 2017 08:29
Keybase file required

Keybase proof

I hereby claim:

  • I am chauff on github.
  • I am claudiahauff (https://keybase.io/claudiahauff) on keybase.
  • I have a public key ASDQiFH6MvbiLTnkkf1busouMdCq0Yutwk-1GXLPAq0I-Qo

To claim this, I am signing this object:

@chauff
chauff / gist:d29bc2eb24cd533b2f0a
Created June 11, 2015 19:25
Interesting pointers for Big Data Processing
http://vision.cloudera.com/apache-kafka-a-platform-for-real-time-data-streams-part-1/
An introduction to Kafka
"This totaled to over 800 billion events per day, with 175TB of daily writes and over 650 TB of reads (since each write fans out to multiple readers)"
@chauff
chauff / README.md
Created December 6, 2012 13:21
Perl scripts that generate pseudo-relevance judgments and system rankings based on them, as well as data showing the unique contributions of each TREC run (or all runs by a group) to the assessment pool. Used in the visualizations shown here: http://www.st.ewi.tudelft.nl/~hauff/visualization/trecVis.html

To run the code, start run.sh (the script contains/explains all necessary parameters).

Four CSV files will be generated:

- one for the pseudo-qrels visualization,
- two for the unique contributions to the depth-k pool of each run and each group, respectively, and
- one for the overlap in retrieved relevant documents between runs.

Add your login and password for the TREC website in download.pl.

The first CSV file contains one row per TREC run, giving its effectiveness with respect to the true relevance judgments and with respect to the pseudo-qrel judgments. Used here: http://www.st.ewi.tudelft.nl/~hauff/visualization/trecVis.html

The second CSV file contains 1 row per TREC run with the number of unique document contributions to the assessment pool (relevant as well as non-relevant).
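The unique-contribution idea above can be sketched as follows. This is a hedged illustration, not the original Perl: it assumes each run is a ranked list of document IDs, pools the top k documents of every run, and counts, per run, the documents that no other run contributed.

```python
def unique_contributions(runs, k):
    """Return {run_id: number of documents only that run adds to the depth-k pool}.

    runs: dict mapping run id -> ranked list of document ids (best first).
    k: pool depth (how many top documents each run contributes).
    """
    # Map each pooled document to the set of runs that contributed it.
    pooled = {}
    for run_id, ranking in runs.items():
        for doc in ranking[:k]:
            pooled.setdefault(doc, set()).add(run_id)

    # A document is a unique contribution when exactly one run supplied it.
    counts = {run_id: 0 for run_id in runs}
    for contributors in pooled.values():
        if len(contributors) == 1:
            counts[next(iter(contributors))] += 1
    return counts


# Toy example (hypothetical run ids and documents): with k=2, runA's pool is
# {d1, d2} and runB's is {d2, d4}; d2 is shared, so each run adds one unique doc.
runs = {
    "runA": ["d1", "d2", "d3"],
    "runB": ["d2", "d4", "d5"],
}
print(unique_contributions(runs, k=2))  # {'runA': 1, 'runB': 1}
```

The same pooling map can be reused to aggregate contributions per group (all runs by one participant) by merging that group's runs before counting.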