###Mining Digital Repositories: Challenges and Horizons, KB, Den Haag, 10 April 2014
Live notes, so an incomplete, partial record of what actually happened.
Tag #digrep14
####Challenges (Thursday)
Hans Jansen, KB
# dhist.
# Another algorithm for computing histogram breaks. Produces irregular bins.
# Provided by Lorraine Denby
#
#
# @keyword internal
dhist <- function(x, a=5*diff(quantile(x, c(0.25,0.75))), nbins=10, rx = range(x)) {
  x <- sort(x)
  if(a == 0)
    a <- diff(range(x))/100000000
<script type="text/javascript">
SyntaxHighlighter.autoloader(
"r path/to/your/syntaxhighlighter/scripts/shBrushR.js",
"plain path/to/your/syntaxhighlighter/scripts/shBrushPlain.js",
"sql path/to/your/syntaxhighlighter/scripts/shBrushSql.js",
"js path/to/your/syntaxhighlighter/scripts/shBrushJScript.js",
"html xml path/to/your/syntaxhighlighter/scripts/shBrushXml.js"
);
SyntaxHighlighter.defaults["toolbar"] = false;
SyntaxHighlighter.all();
doInstall <- TRUE
toInstall <- c("twitteR", "dismo", "maps", "ggplot2")
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")}
lapply(toInstall, library, character.only = TRUE)
searchTerm <- "#rstats"
searchResults <- searchTwitter(searchTerm, n = 1000) # Gather Tweets
tweetFrame <- twListToDF(searchResults) # Convert to a nice dF
userInfo <- lookupUsers(tweetFrame$screenName) # Batch lookup of user info
#
install.packages(c("twitteR","wordcloud","tm"))
library(twitteR); library(wordcloud); library(tm)
# Search for #bes12 tweets
bestweets <- searchTwitter("#bes12", n=5000)
length(bestweets) # ends up with 1344 as of 21-Dec-12 at 17:00 London time
# make into a data.frame
bestweets_df <- twListToDF(bestweets)
library("RCurl")
library("XML")
library("plyr")
library("ggplot2")
library("directlabels")
########################
# Download PubMed Data #
########################
Hi bwa users,
The bwa-mem manuscript has been rejected. Interestingly, the first
reviewer only raised a couple of minor concerns and then accepted the
manuscript in the second round of the review. The second reviewer made
quite a few mistakes on some basic concepts and was hostile from the
beginning. The third reviewer gave fair and good review in the first
round, all of which have been addressed, but he then tried hard to
argue one particular mapper to be the best in accuracy that on the
contrary is inferior to most others in my view. I admit that my
Publisher | Count
---|---
Elsevier | 529633
Springer-Verlag | 206527
Wiley Blackwell (John Wiley & Sons) | 110387
Wiley Blackwell (Blackwell Publishing) | 100235
Informa UK (Taylor & Francis) | 85869
Trans Tech Publications | 53310
Sage Publications | 42105
Oxford University Press | 40496
American Chemical Society | 39543
Ovid Technologies (Wolters Kluwer) - Lippincott Williams & Wilkins | 39186
import MySQLdb.cursors
from twisted.enterprise import adbapi
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals
from scrapy.utils.project import get_project_settings
from scrapy import log
SETTINGS = get_project_settings()
#!/bin/bash
usage() {
cat << EOF
Usage: $0 [OPTION]... COMMAND
Execute the given command in a way that works safely with cron. This should
typically be used inside of a cron job definition like so:
* * * * * $(which "$0") [OPTION]... COMMAND
Arguments: