
Emaad Ahmed Manzoor emaadmanzoor

View Word embeddings via PMI-matrix factorization.ipynb
@emaadmanzoor
emaadmanzoor / 95-865-Model_Evaluation_Demo.md
Last active Feb 16, 2018
95865 Model Evaluation Demo
View 95-865-Model_Evaluation_Demo.md
View get_spark_streaming_batch_statistics.py
#!/usr/bin/env python
# Copyright 2016 Emaad Ahmed Manzoor
# License: Apache License, Version 2.0
# http://www.eyeshalfclosed.com/blog/2016/07/22/spark-streaming-statistics/
"""
Get Spark Streaming microbatch statistics:
- Batch start time
- Scheduling delay (in seconds) for each microbatch
View 00-StreamSpot-Bootstrap-Clusters.md

StreamSpot Bootstrap Clusters

www3.cs.stonybrook.edu/~emanzoor/streamspot/

Below are the bootstrap clusters used for the experiments in the StreamSpot paper for each of the following datasets:

  • all (01-C50_k10_all.txt): Chunk length of 50, 10 clusters.
  • ydc (02-C25_k5_ydc.txt): Chunk length of 25, 5 clusters.
  • gfc (03-C50_k5_gfc.txt): Chunk length of 50, 5 clusters.
emaadmanzoor / QuantifyingMonotonyAversion.md
Last active Aug 29, 2015
Quantifying Monotony Aversion
View QuantifyingMonotonyAversion.md

See the project website for more details.

Please report any issues to emaadahmed.manzoor@kaust.edu.sa.

Execution

Running this requires having the following files in the same directory as calculate_cluster_statistics.py:

  • all_links.p
  • all_tweets.p
emaadmanzoor / AttentionPotentialValidation.md
Last active Aug 29, 2015
Attention Potential Validation Code
View AttentionPotentialValidation.md

See the project website for more details.

Please report any issues to emaadahmed.manzoor@kaust.edu.sa.

Correlation Results

The attention potential (as estimated in section 4), when evaluated on this Twitter dataset:

  • Is 73.61% correlated with the retweets obtained.
  • Is significantly correlated (p < 0.05).
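
A correlation figure and significance level like the ones above could be computed along the following lines. This is only a sketch: it assumes that "73.61% correlated" denotes a Pearson coefficient of about 0.7361, and it swaps in a simple permutation test for whatever significance test the project actually used. The function names `pearson_r` and `permutation_p_value`, and the `trials` and `seed` parameters, are illustrative and do not come from the project code.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided permutation p-value for the observed correlation.

    Shuffles ys repeatedly and counts how often the shuffled data is at
    least as strongly correlated with xs as the original pairing.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)
```

Here `xs` would hold the estimated attention potential per tweet and `ys` the retweets obtained; a p-value below 0.05 corresponds to the significance threshold quoted above.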
emaadmanzoor / freivald.py
Created Sep 9, 2013
Freivalds' Algorithm
View freivald.py
import random
import operator
t = int(raw_input())
randint = random.randint
def deterministic(a,b,c,n):
    no = 0
    for p in xrange(n):
        for q in xrange(n):
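
For orientation, here is a minimal sketch of Freivalds' randomized check that a matrix C equals the product AB. It is written in Python 3 and is not the gist's own code; the names `mat_vec` and `freivalds` and the `rounds` parameter are illustrative assumptions. Each round draws a random 0/1 vector r and compares A(Br) with Cr, which takes O(n^2) work instead of the O(n^3) needed to multiply the matrices directly.

```python
import random


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


def freivalds(a, b, c, rounds=20, rng=random):
    """Probabilistically check whether a * b == c for n x n matrices.

    Returns False as soon as one round exposes a mismatch. Returns True
    if every round passes, which means c is probably the product.
    """
    n = len(a)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(n)]
        # Compare A(Br) with Cr; both sides cost O(n^2).
        if mat_vec(a, mat_vec(b, r)) != mat_vec(c, r):
            return False
    return True
```

A correct product always passes. A wrong product survives a single round with probability at most 1/2, so with the default 20 rounds it is accepted with probability at most 2^-20.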
emaadmanzoor / ExpandEdinburghFSDCorpus.md
Last active Aug 16, 2019
Expand the Edinburgh Twitter FSD corpus
View ExpandEdinburghFSDCorpus.md

Expand The Edinburgh Twitter FSD Corpus

The Python scripts attached here take care of the following tedious work, and should help one quickly get started with some real work on the corpus:

  • Respect the Twitter API rate limits and throttle API hits.
  • Don't hit the API for already expanded tweet IDs, so you can resume tweet expansion after stopping midway.
  • Parse the API response and dump it into the correct column in the sqlite3 database.
  • Gracefully handle exceptions while acquiring tweets from the API.
  • Wrap version 1.1 of the Twitter API.
  • Start from a specified tweet ID, assuming the input file is sorted in increasing order of tweet ID.
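
The behaviour listed above can be sketched roughly as follows. This is not the attached scripts' code: the function `expand_tweets`, the `fetch` callback (a stand-in for a Twitter API wrapper), the `tweets` table, and its `id` and `text` columns are all hypothetical, and a fixed pause between requests stands in for real rate-limit handling.

```python
import sqlite3
import time


def expand_tweets(tweet_ids, fetch, conn, start_id=0, min_interval=1.0,
                  sleep=time.sleep):
    """Expand tweet IDs into text and store them in a sqlite3 database.

    tweet_ids is assumed to be sorted in increasing order. fetch(tweet_id)
    is a hypothetical callable that returns the tweet text or raises.
    Returns (number of tweets stored, list of (tweet_id, error message)).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, text TEXT)"
    )
    # IDs already in the database are skipped, so a stopped run can resume.
    done = {row[0] for row in conn.execute("SELECT id FROM tweets")}
    stored = 0
    errors = []
    for tweet_id in tweet_ids:
        if tweet_id < start_id or tweet_id in done:
            continue
        try:
            text = fetch(tweet_id)
        except Exception as exc:
            # Record the failure and keep going with the next ID.
            errors.append((tweet_id, str(exc)))
        else:
            conn.execute(
                "INSERT INTO tweets (id, text) VALUES (?, ?)", (tweet_id, text)
            )
            conn.commit()
            stored += 1
        # Throttle: pause after every API hit, successful or not.
        sleep(min_interval)
    return stored, errors
```

Committing after each insert means an interrupted run loses at most the tweet in flight, and failed IDs remain absent from the table, so a later run retries them.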