Ross Mounce rossmounce

@rossmounce
rossmounce / wordles.r
Last active December 10, 2015 13:58 — forked from sckott/beswordles.r
# N.B. On *ubuntu, RCurl may not install straight away. If so, read http://www.omegahat.org/RCurl/FAQ.html and run: sudo apt-get install libcurl4-openssl-dev
install.packages(c("RCurl","twitteR","wordcloud","tm","stringr"))
library(twitteR); library(wordcloud); library(tm); library(stringr);
# Search for #mooc tweets
mooctweets <- searchTwitter("#mooc", n=2000)
length(mooctweets) # ends up with 713 as of 03-Jan-13 at 15:42 London time
# make into a data.frame
mooctweets_df <- twListToDF(mooctweets)
rossmounce / CCBYjournals
Created March 4, 2013 17:02
This is an incomplete list of 'gold' open access journals that use the CC BY licence. Thanks to Cameron Neylon. Please add any you know of that I've missed. Hybrid OA journals are not to be listed here.
Journal Name ISSN
Abstract and Applied Analysis 10853375
Acta Crystallographica Section E 16005368
Acta Electrotechnica et Informatica 13358243
Acta Linguistica Asiatica 22323317
Acta Medica Martiniana 13358421
Acta Societatis Botanicorum Poloniae 00016977
Acta Universitaria 01886266
Acta Universitatis Palackianae Olomucensis : Gymnica 12121185
Acta Veterinaria Scandinavica 17510147
rossmounce / grep_references.sh
Last active December 20, 2015 09:58
A grep command for all the different Reference List headings encountered in just two years' worth of Zootaxa articles. Some post-publication standardization needed, methinks!
egrep "(^Citations$|Cited Literature$|Literature [cC]ited$|Literatures cited$|Literature Cited\:$|References$|^references$|Refrences$|References [cC]ited$|REFERENCES$|Bibliography$|BIBLIOGRAPHY$|LITERATURE CITED$|LITERATURE cited$|REFERENCES CITED$|References \[not in Zootaxa format\]$|^Reference$|^Literature$|^References \(asterisks|^References \(except original descriptions|Litterature cited$|Literture Cited$)"
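The same heading check can be sketched in Python for anyone who wants line numbers back, or wants to feed the hits into a later clean-up step. This is only an illustration: the alternation is copied from the egrep command above, while the function name find_reference_headings and the sample lines are hypothetical and not part of the original gist.

```python
import re

# Alternation copied from the egrep command above; only the quoting differs.
HEADING_RE = re.compile(
    r"(^Citations$|Cited Literature$|Literature [cC]ited$|Literatures cited$"
    r"|Literature Cited\:$|References$|^references$|Refrences$"
    r"|References [cC]ited$|REFERENCES$|Bibliography$|BIBLIOGRAPHY$"
    r"|LITERATURE CITED$|LITERATURE cited$|REFERENCES CITED$"
    r"|References \[not in Zootaxa format\]$|^Reference$|^Literature$"
    r"|^References \(asterisks|^References \(except original descriptions"
    r"|Litterature cited$|Literture Cited$)"
)


def find_reference_headings(lines):
    """Return (line_number, text) pairs for lines matching a known heading variant."""
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if HEADING_RE.search(text):
            hits.append((number, text))
    return hits


# Hypothetical article text, one string per line.
sample = ["Introduction", "Some body text.", "Literture Cited", "Smith, J. (2010) ..."]
print(find_reference_headings(sample))
```

Keeping the misspelt variants in the pattern is deliberate here: the point of the list is to catch headings exactly as they were published.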
rossmounce / Trees2Trees
Created August 12, 2013 16:59
A _really_ basic script for doing many-to-many tree2tree distance (RF) comparisons in R, using the phangorn package and the function treedist. I should probably use one of the 'apply' functions here, right?
library(phangorn)
#264 REFERENCE trees in phylip format, PAUP numbering hence 2
ref2 <- read.tree("jackr2.tre")
#264 trees in phylip format to pair-wise compare to the reference trees, TNT numbering hence 1
tr2 <- read.tree("jack1.tre")
x <- {}
#all reference trees to one comp tree
for (i in seq_along(tr2)) {
rossmounce / feeds.xml
Last active February 28, 2023 15:20
My OPML bundle of academic journal RSS feeds related to my interests (phylogenetics, palaeontology), split into 4 different thematic sections.
<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
<head>
<title>Ross's academic journal RSS feed subscriptions</title>
</head>
<body>
<outline text="General Biology Journals" title="General Biology Journals">
<outline type="rss" text="BioEssays" title="BioEssays" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1002/(ISSN)1521-1878" htmlUrl="http://onlinelibrary.wiley.com/resolve/doi?DOI=10.1002%2F%28ISSN%291521-1878"/>
<outline type="rss" text="Biol J Linn Soc" title="Biol J Linn Soc" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1111/(ISSN)1095-8312" htmlUrl="http://onlinelibrary.wiley.com/resolve/doi?DOI=10.1111%2F%28ISSN%291095-8312"/>
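A bundle like this can be read back with the Python standard library. The sketch below is a minimal illustration, not part of the original gist: OPML_SAMPLE is a shortened, closed-off copy of the first two entries above (htmlUrl attributes omitted), and the function name feeds_by_section is hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the first entries of the bundle (htmlUrl omitted).
OPML_SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.0">
<head><title>Ross's academic journal RSS feed subscriptions</title></head>
<body>
<outline text="General Biology Journals" title="General Biology Journals">
<outline type="rss" text="BioEssays" title="BioEssays" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1002/(ISSN)1521-1878"/>
<outline type="rss" text="Biol J Linn Soc" title="Biol J Linn Soc" xmlUrl="http://onlinelibrary.wiley.com/rss/journal/10.1111/(ISSN)1095-8312"/>
</outline>
</body>
</opml>"""


def feeds_by_section(opml_bytes):
    """Map each top-level section title to a list of (feed title, feed URL) pairs."""
    root = ET.fromstring(opml_bytes)
    sections = {}
    for section in root.findall("./body/outline"):
        feeds = [
            (feed.get("title"), feed.get("xmlUrl"))
            for feed in section.findall("outline")
            if feed.get("type") == "rss"
        ]
        sections[section.get("title")] = feeds
    return sections


# Bytes are passed so the parser honours the encoding declaration.
sections = feeds_by_section(OPML_SAMPLE.encode("utf-8"))
for title, feeds in sections.items():
    print(title, len(feeds))
```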
rossmounce / trythis
Created February 11, 2014 17:24
Content Negotiation, example of Internal Server Error
curl -g --location --header 'Accept: application/x-bibtex' "http://dx.doi.org/10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2" > test.txt
RETURNS
<h1>Internal Server Error</h1>
(I've encountered about 91 DOIs that appear to give this error)
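For checking many DOIs in one go, the same request can be sketched with Python's urllib. This is an illustration under stated assumptions rather than a fix for the error: build_bibtex_request and fetch_bibtex are hypothetical names. The brackets and parentheses in the DOI are percent-encoded here, whereas the curl call sends them literally via -g; that is a choice made for this sketch.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_bibtex_request(doi, resolver="http://dx.doi.org/"):
    """Build a request asking the DOI resolver for BibTeX via the Accept header."""
    # Percent-encode everything except the slash between prefix and suffix.
    encoded = urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(
        resolver + encoded, headers={"Accept": "application/x-bibtex"}
    )


def fetch_bibtex(doi, timeout=30):
    """Return (HTTP status, body). Error responses such as 500 are returned, not raised."""
    request = build_bibtex_request(doi)
    try:
        # urlopen follows redirects by default, like curl --location.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8", errors="replace")


# Building the request needs no network access; fetch_bibtex does.
request = build_bibtex_request("10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2")
print(request.full_url)
```

Calling fetch_bibtex on each DOI and collecting those where the status is 500 would give a list of the failing identifiers to report.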
I know I'm doing all types of wrong here:
Source HTML file here: http://mdpi.com/1420-3049/19/4/5150/htm
I want the text for the dc.source:
Molecules 2014, Vol. 19, Pages 5150-5162
I am using Beautiful Soup, so it is probably best to do it with that, BUT it should also be regex-able. I can do this in bash, no problem!
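One way to pull out a value like this without a regex is an HTML parser that collects meta tags by name. The sketch below uses the standard library's html.parser instead of Beautiful Soup so that it runs with no install. SAMPLE_HTML is a hypothetical snippet: it assumes the page exposes dc.source as a meta tag with a content attribute, which has not been checked against the real page.

```python
from html.parser import HTMLParser


class MetaContentParser(HTMLParser):
    """Collect the content attribute of every <meta> tag with a given name."""

    def __init__(self, target_name):
        super().__init__()
        self.target_name = target_name.lower()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name == self.target_name and attributes.get("content") is not None:
            self.values.append(attributes["content"])


def meta_content(html, name):
    """Return all content values of <meta name=...> tags found in the HTML text."""
    parser = MetaContentParser(name)
    parser.feed(html)
    parser.close()
    return parser.values


# Hypothetical markup; the real page may lay out its metadata differently.
SAMPLE_HTML = """<html><head>
<meta name="dc.title" content="Example title" />
<meta name="dc.source" content="Molecules 2014, Vol. 19, Pages 5150-5162" />
</head><body></body></html>"""

print(meta_content(SAMPLE_HTML, "dc.source"))
```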
rossmounce / reply1.txt
Last active August 29, 2015 14:12
Reply to Rod Page (having technical problems posting this at PeerJ PrePrints)
Thanks for your feedback Rod. I really value it.
I don't pretend to have all the answers. All of the academic content discovery
services are fairly murky about how they actually index things,
as I'm sure you know (Google Scholar perhaps being the most open-ish about how it does things?).
> how comparable are PLoS and Zootaxa from the perspective of search engines?
I am not a search engine. I am a human researcher. Whether a paper is
published in Nature, Science, PLOS ONE or Zootaxa, it is the same to me -
rossmounce / sample.csv
Created May 21, 2015 08:35
Data for gephi
img1,Actinokineospora_fastidiosa,Amycolatopsis_alba_DSM_44262,Amycolatopsis_albidoflavus,Amycolatopsis_azurea,Amycolatopsis_balhimycina,Amycolatopsis_benzoatilytica,Amycolatopsis_coloradensis,Amycolatopsis_decaplanina_DSM_44594,Amycolatopsis_echigonensis,Amycolatopsis_kentuckyensis,Amycolatopsis_keratiniphila,Amycolatopsis_keratiniphila_subsp._nogabecina,Amycolatopsis_lexingtonensis,Amycolatopsis_lurida,Amycolatopsis_marina,Amycolatopsis_mediterranei,Amycolatopsis_methanolica_239,Amycolatopsis_nigrescens_CSC17Ta-90,Amycolatopsis_orientalis,Amycolatopsis_palatopharyngis,Amycolatopsis_plumensis,Amycolatopsis_regifaucium,Amycolatopsis_rifamycinica,Amycolatopsis_rubida,Amycolatopsis_saalfeldensis,Amycolatopsis_sacchari,Amycolatopsis_sp.,Amycolatopsis_sulphurea,Amycolatopsis_taiwanensis,Amycolatopsis_thermoflava_N1165,Amycolatopsis_tolypomycina,Amycolatopsis_vancoresmycina,Prauserella_rugosa
img2,Antarctobacter_heliothermus,Donghicola_eburneus,Jannaschia_helgolandensis,Ketogulonicigenium_vulgare,Loktanella_salsila
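The rows in this file have different lengths, which some table-based tools do not accept. If a two-column edge list is needed instead, a conversion could be sketched as below. This assumes each row is an identifier followed by the taxa linked to it, which is a guess from the layout rather than anything the gist states. The name rows_to_edges is hypothetical, and the img1 row in the sample is shortened to its first two taxa.

```python
import csv
import io


def rows_to_edges(csv_text):
    """Turn 'id,taxon,taxon,...' rows of any length into (id, taxon) pairs."""
    edges = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        source, targets = row[0], row[1:]
        edges.extend((source, target) for target in targets if target)
    return edges


sample = (
    # img1 row shortened for illustration; the real row has 34 fields.
    "img1,Actinokineospora_fastidiosa,Amycolatopsis_alba_DSM_44262\n"
    "img2,Antarctobacter_heliothermus,Donghicola_eburneus,Jannaschia_helgolandensis,"
    "Ketogulonicigenium_vulgare,Loktanella_salsila\n"
)

edges = rows_to_edges(sample)
for source, target in edges:
    print(source, target, sep=",")
```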
rossmounce / data_err.csv
Created May 23, 2015 15:09
csvclean data.csv ; 70 errors logged to data_err.csv 13666 rows were joined/reduced to 6628 rows after eliminating expected internal line breaks.
line_number msg _id _full_text occurrenceID catalogNumber scientificName scientificNameAuthorship typeStatus locality country waterBody expedition recordedBy collectionCode kingdom phylum class order family genus subgenus specificEpithet infraspecificEpithet higherClassification taxonRank stateProvince continent island islandGroup higherGeography habitat decimalLongitude decimalLatitude geodeticDatum georeferenceProtocol maxError verbatimLongitude verbatimLatitude minimumElevationInMeters maximumElevationInMeters minimumDepthInMeters maximumDepthInMeters recordNumber individualCount lifeStage sex preparations identifiedBy dateIdentified identificationQualifier eventTime day month year earliestEonOrLowestEonothem latestEonOrHighestEonothem earliestEraOrLowestErathem latestEraOrHighestErathem earliestPeriodOrLowestSystem latestPeriodOrHighestSystem earliestEpochOrLowestSeries latestEpochOrHighestSeries earliestAgeOrLowestStage latestAgeOrHighestStage lowestBiostratigraphicZone highestBiostratigraphicZone group
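The drop from 13666 to 6628 rows reflects the difference between physical lines and logical CSV rows: a quoted field may contain line breaks, so one record can span several lines. The sketch below illustrates that counting difference with Python's csv module. It is not how csvclean works internally; the column names are borrowed from the header above, and the data values and the function name count_lines_and_rows are hypothetical.

```python
import csv
import io


def count_lines_and_rows(csv_text):
    """Return (physical line count, logical CSV row count) for the given text."""
    physical_lines = len(csv_text.splitlines())
    # newline="" lets the csv module see line breaks inside quoted fields unchanged.
    logical_rows = sum(1 for _ in csv.reader(io.StringIO(csv_text, newline="")))
    return physical_lines, logical_rows


# Hypothetical records; the second line of the file continues the quoted locality field.
sample = (
    "_id,scientificName,locality\n"
    '1,Hypothetical species,"First line of locality\nsecond line of locality"\n'
    "2,Another hypothetical,Single-line locality\n"
)

print(count_lines_and_rows(sample))
```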