Ignasi Bartomeus ibartomeus

## Parse terribly formatted data
#question: Can we create a heuristic to parse this type of data:

#Input example:
Halictus crenicornis
GALICIA: 1♀, Monte do Gozo, Santiago de Compostela (La Coruña), 350 m, 5.VIII.2016, 29TNH404481. – 1♀, Río Castro, Cerdedo (Pontevedra), 20.VII.1996. – 1♀, Oca (Pontevedra), 20.VII.1996.
ASTURIAS: 1♂, 1♀, Raitán, Carreño (Asturias), 130 m, 30TTP72, 17.VIII.2005. – 1♀, Poreño, Villaviciosa (Asturias), 43,426443º, -5,445950º, 13.V.2015, sobre flor de Centaurea nigra, C. Guardado leg. – 1♀, Poreño, Villaviciosa (Asturias), 43,426443º, -5,445950º, 27.V.2014, M. Miñarro leg. – 1♀, Muñiz (Asturias), 14.VII.2016, sobre flor de Taraxacum, D. Luna leg.

#Desired output (csv):
Species, CCAA, female, male, locality, province, elevation, date, UTM, latitude, longitude, notes
Halictus crenicornis, Galicia, 1, 0, "Monte do Gozo, Santiago de Compostela", La Coruña, 350, 5.VIII.2016, 29TNH404481, NA, NA, NA
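
One possible heuristic, sketched in base R under several assumptions taken only from the example above: records within a region are separated by an en dash, fields by a comma followed by a space (so decimal commas in coordinates survive), and the field carrying the province in parentheses closes the locality. The function name `parse_region_line()` is hypothetical, and the patterns would need extending for records that deviate from the example.

```r
# Hypothetical helper: parse one "REGION: record - record" line into a data frame.
# Non-ASCII symbols are written as escapes: \u2640 female, \u2642 male,
# \u00ba degree sign, \u2013 en dash.
parse_region_line <- function(species, line) {
  ccaa <- sub(":.*$", "", line)
  ccaa <- paste0(substr(ccaa, 1, 1), tolower(substring(ccaa, 2)))  # "GALICIA" -> "Galicia"
  body <- sub("^[^:]*:\\s*", "", line)
  records <- strsplit(body, "\\s+\u2013\\s+")[[1]]

  pick <- function(x) if (length(x)) x[1] else NA_character_
  count <- function(x) if (length(x)) as.integer(sub("[^0-9]+$", "", x[1])) else 0L
  to_num <- function(x) as.numeric(sub(",", ".", sub("\u00ba$", "", x), fixed = TRUE))

  rows <- lapply(records, function(rec) {
    # Split on comma + space so "43,426443" is not broken apart
    fields <- trimws(strsplit(rec, ",\\s+")[[1]])
    bare <- sub("\\.$", "", fields)  # ignore a trailing full stop when matching

    is_f    <- grepl("^[0-9]+\u2640$", bare)
    is_m    <- grepl("^[0-9]+\u2642$", bare)
    is_elev <- grepl("^[0-9]+ m$", bare)
    is_date <- grepl("^[0-9]{1,2}\\.[IVX]+\\.[0-9]{4}$", bare)
    is_utm  <- grepl("^[0-9]{1,2}[A-Z]{3}[0-9]+$", bare)
    is_deg  <- grepl("\u00ba$", bare)
    other   <- !(is_f | is_m | is_elev | is_date | is_utm | is_deg)

    # Unclassified fields up to the "(Province)" field form the locality;
    # unclassified fields after it are kept as notes
    prov_idx <- which(grepl("\\(.+\\)$", bare))[1]
    pos <- seq_along(fields)
    if (is.na(prov_idx)) {
      loc_idx <- integer(0)
      notes_idx <- which(other)
    } else {
      loc_idx <- which(other & pos <= prov_idx)
      notes_idx <- which(other & pos > prov_idx)
    }
    degs <- bare[is_deg]

    data.frame(
      Species   = species,
      CCAA      = ccaa,
      female    = count(bare[is_f]),
      male      = count(bare[is_m]),
      locality  = if (length(loc_idx)) paste(sub("\\s*\\(.+\\)$", "", bare[loc_idx]), collapse = ", ") else NA_character_,
      province  = if (is.na(prov_idx)) NA_character_ else sub("^.*\\((.+)\\)$", "\\1", bare[prov_idx]),
      elevation = as.integer(sub(" m$", "", pick(bare[is_elev]))),
      date      = pick(bare[is_date]),
      UTM       = pick(bare[is_utm]),
      latitude  = if (length(degs) >= 1) to_num(degs[1]) else NA_real_,
      longitude = if (length(degs) >= 2) to_num(degs[2]) else NA_real_,
      notes     = if (length(notes_idx)) paste(fields[notes_idx], collapse = ", ") else NA_character_,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Usage on a shortened version of the Galicia line
galicia <- paste0(
  "GALICIA: 1\u2640, Monte do Gozo, Santiago de Compostela (La Coru\u00f1a), ",
  "350 m, 5.VIII.2016, 29TNH404481. \u2013 1\u2640, Oca (Pontevedra), 20.VII.1996."
)
out <- parse_region_line("Halictus crenicornis", galicia)
write.csv(out, row.names = FALSE)
```

The first-latitude, second-longitude rule and the title-casing of the region name are simplifications; multi-word region names would need their own handling.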

## set.R
#Calculate how many sets there are in a set game.


#each card has 4 characteristics with 3 factors each
color <- c("red", "green", "purple")
shape <- c("round", "diamond", "curly")
texture <- c("fill", "empty", "striped")
number <- c("one", "two", "three")
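
The counting itself can be sketched independently of the truncated `create_board()` below. This is an illustration, not the author's implementation: it assumes each attribute level is coded 0, 1 or 2, and `count_sets()` is a hypothetical name. With that coding, three cards form a set exactly when every attribute sums to a multiple of 3 (all the same or all different).

```r
# Full deck: every combination of 4 attributes with 3 levels each (3^4 = 81 cards)
deck <- as.matrix(expand.grid(color = 0:2, shape = 0:2, texture = 0:2, number = 0:2))

# Hypothetical helper: count the valid sets among the rows of a card matrix
count_sets <- function(cards) {
  triples <- combn(nrow(cards), 3)  # every combination of three cards
  sum(apply(triples, 2, function(idx) {
    all(colSums(cards[idx, , drop = FALSE]) %% 3 == 0)
  }))
}

count_sets(deck)  # 1080 for the full deck

# Sets on a random 12-card board (the count varies with the draw)
set.seed(1)
board <- deck[sample(nrow(deck), 12), ]
count_sets(board)
```

The full-deck result can be checked by hand: any two cards determine a unique third, so there are 81 * 80 / 6 = 1080 sets.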

create_board <- function(n = 12){

## origin_demo.txt
#This is a quick demo performed for the Sevilla R users group

#Elena asked about IUCN data. You can retrieve this data using taxize: https://github.com/ropensci/taxize
#Elena also suggested vectorbase can be scraped. Maybe for the next hackathon?

#package OriginR
library(originr)

#define species (don't worry about typos)
sp <- c("Apis mellifera", "carpobrotus edulis", "Lavandula stoechas",

## clean_species
#I have >1000 bee names to check, so I want to automate taxize for
# fixing misspellings when possible
# updating synonyms to accepted names
# keeping ONLY accepted species (fully resolved at species level)

# This uses taxize > 0.7.6.9157. If you are using an older version (e.g. what is now on CRAN), see the history of this file.
library(taxize)
library(dplyr)

#example: good, synonym, typo, nonexistent, genus only.

## ggplot
---
title: "Sevillarusers - ggplot2 intro"
author: "Raúl Ortiz"
date: "Tuesday, October 27, 2015"
output: pdf_document
---

# Introduction to the "ggplot2" graphics package.

## Set the working directory.

## PowerSevilla.R
#SevillaR talk

#The problem:

time <- c(2000:2015)
abundance <- rnorm(16, 150, 50) #Poisson??
plot(abundance ~ time, type = "l")


#can I detect a trend?
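
One way to approach the question is a simulation-based power estimate: simulate the series many times with a known trend, fit a linear model each time, and record how often the slope is significant. This is a sketch under assumptions that are not in the original script: the slope of -5 individuals per year is a hypothetical effect size, and `simulate_p()` is an illustrative name.

```r
set.seed(42)
time <- 2000:2015

# Simulate one series with a known linear trend and return the p-value of the slope
simulate_p <- function(slope = -5, sd = 50) {
  abundance <- 150 + slope * (time - min(time)) + rnorm(length(time), 0, sd)
  fit <- lm(abundance ~ time)
  summary(fit)$coefficients["time", "Pr(>|t|)"]
}

# Power = proportion of simulated series in which the trend is detected at alpha = 0.05
p_values <- replicate(1000, simulate_p())
power <- mean(p_values < 0.05)
power
```

Rerunning with other values of `slope` and `sd` shows how power changes with effect size and noise.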

## CIS
#Explore CIS data


#load data----
load("barometro_enero.RData")
head(barometro)
str(barometro)
head(nombres_etiquetas)
nombres_etiquetas

## random_slope_models
#This shows how to get the random slopes and CIs for each level in a hierarchical model

#dataset used
head(iris)

#what we want to investigate
#Is there a general relationship, and how does it differ by species?
plot(iris$Sepal.Width ~ iris$Petal.Width, col = iris$Species, las = 1)

#Our model with random slope and intercept
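
The preview stops before the model itself, so the following is only a sketch of what a random slope and intercept model for this plot could look like. Using the lme4 package is an assumption (the original may use a different one), and with only three species the fit may emit a singularity warning.

```r
library(lme4)  # assumption: the original script may use another package

# Sepal width as a function of petal width, with intercept and slope varying by species
m <- lmer(Sepal.Width ~ Petal.Width + (1 + Petal.Width | Species), data = iris)

fixef(m)         # overall intercept and slope
coef(m)$Species  # per-species intercept and slope (fixed effect + random deviation)
```

Confidence intervals for the per-species slopes are not covered here.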

## pref
---
title: "Preferring a preference index"
author: "I. Bartomeus"
output: html_document
---

I've been reading about preference indices lately, specifically for characterizing pollinator preferences for plants, so here is what I learnt. Preference is defined as using an item (e.g. a plant) more than expected given the item's abundance.

First, I like to use a quantitative framework (you can also use rank-based indices as in Williams et al. 2011, which have nice properties too). The simplest quantitative index is the forage ratio:
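
The forage ratio divides the proportion of use of each item by the proportion in which it is available, so values above 1 indicate preference and values below 1 indicate avoidance. A minimal sketch with made-up numbers (the plant names and counts are hypothetical):

```r
# Hypothetical data: visits by a pollinator to each plant, and each plant's abundance
visits    <- c(plantA = 30, plantB = 10, plantC = 10)
abundance <- c(plantA = 100, plantB = 100, plantC = 300)

used      <- visits / sum(visits)        # proportion of use
available <- abundance / sum(abundance)  # proportion available

forage_ratio <- used / available
forage_ratio
```

Here plantA is used three times more than expected from its abundance, plantB is used as expected, and plantC is used a third as much as expected.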

## multifunc2
# This approach to assessing multifunctionality is based on the idea that sites that best
# provide multiple functions will have not only a high mean value across functions
# (approach 3 in Byrnes et al.) but also low variability in the function delivered
# across functions (i.e. a low coefficient of variation).

#I use Byrnes' multifunc package to illustrate it.
library(devtools)
install_github("jebyrnes/multifunc")
library(multifunc)
library(ggplot2)
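
The core of the idea can be sketched in base R before bringing in any package functions. The data below are hypothetical (4 sites, 3 functions already scaled to 0-1), and nothing here uses the multifunc API; it only computes the two quantities described above.

```r
# Hypothetical data: rows are sites, columns are standardized functions
set.seed(1)
funcs <- matrix(runif(12), nrow = 4,
                dimnames = list(paste0("site", 1:4), c("f1", "f2", "f3")))

site_mean <- rowMeans(funcs)                  # average level across functions
site_cv   <- apply(funcs, 1, sd) / site_mean  # coefficient of variation across functions

# Sites with a high mean and a low CV deliver multiple functions most evenly
data.frame(mean = site_mean, cv = site_cv)
```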