## UK_dialect_maps.tpl
<!--head
meta:
  title: UK language usage
  description: Analysing the results of The Cambridge Online Survey of World Englishes
    in the United Kingdom
  author: ' (@daroczig)'
  packages:
  - class
  - descr
  - dismo
  - raster
  - RColorBrewer
  - rgdal
  - scales
  - MASS
inputs:
- required: yes
  class: character
  name: q
  label: Question
  standalone: yes
  value: Pop or soda?
  length:
    min: 1.0
    max: 1.0
  description: Question to analyse
  matchable: yes
  options:
  - Pop or soda?
  - What do you call the long cold sandwich that contains cold cuts, lettuce, and
    so on?
  - What is your generic casual or informal term for a sweetened carbonated beverage?
  - What is your general, informal term for the rubber-soled shoes worn in gym class,
    for athletic activities, etc.?
  - What do you call the kind of crustacean that looks like a tiny lobster and lives
    in lakes and streams?
  - What word(s) do you use in casual speech to address a group of two or more people?
  - What do you call the little gray (or black or brown) creature (that looks like
    an insect but is actually a crustacean) that rolls up into a ball when you touch
    it?
  - What do you call the kind of rain that falls while the sun is shining?
  - What do you call the gooey or dry matter that collects in the corners of your
    eyes, especially while you are sleeping?
  - How do you pronounce the vowel sound in the word 'aunt' ("parent's sister")?
  - What is your preferred general and casual term for a sale of your unwanted items
    (which may be held on your porch, in your yard, garden, or house, from the back
    of your car, etc.)?
  - What do you call the wheeled contraption in which you carry groceries at the grocery
    store or supermarket?
  - What do you call a traffic intersection in which several roads meet in a circle
    and you have to get off at a certain point?
  - Do you pronounce r's when they aren't followed by a vowel, as in car, cart, carton,
    and so on?
  - How do you pronounce 'sawing' and 'saw it', as in "I enjoying sawing wood" and
    "she saw it"?
  - How do you pronounce 'Shah of', as in "Abbas was a famous Shah of Iran"?
  - How do you pronounce 'which' and 'witch'?
  - What do you call the meal you eat in the evening, normally somewhere between 5
    and 10 PM?
  - What do you call an upholstered seat for more than one person?
  - What do you a call a store that is devoted primarily to selling alcoholic beverages?
  - What do you call a room equipped with toilets and lavatories for public use?
  - What do you call the auxiliary brake that's attached to a rear wheel or the transmission
    and keeps the car from moving accidentally?
  - What do you call an automobile transmission system in which gears are selected
    by the driver by means of a hand-operated gearshift and a foot-operated clutch?
  - What do you call an artificial nipple, usually made of plastic, which an infant
    can suck or chew on?
  - What do you call food purchased at a restaurant to be eaten elsewhere?
  - What do you call this large aquatic bug that skims along the surface of water?
  - What do you call a narrow street or passageway between or behind buildings?
  - What do you call an unattended machine (normally outside a bank) that dispenses
    money when a personal coded card is used?
  - What do you call your fifth/smallest toe?
  - What do you call this long green herb that is used as a garnish or in soups, salads
    and stir-fry dishes? (It belongs to the genus Allium and lacks a fully-developed
    bulb.)
  - How do you pronounce the last vowel in the word "cinema"?
  - How do you pronounce the last vowel in the word "happy"?
  - How do you pronounce the letter "H"?
  - How do you pronounce the name of this small British quick bread (or cake if the
    recipe includes sugar)?
  - How do you pronounce the past tense of the verb "eat"?
  - How do you pronounce the word "again"?
  - How do you pronounce the word "bald"?
  - How do you pronounce the word "cut"?
  - How do you pronounce the word "last"?
  - How do you pronounce the word "sandwich"?
  - How do you pronounce the word "schedule"?
  - How do you pronounce the word "sixth"?
  - What do you call a a sandwich made with bread or bread roll (usually white and
    buttered) and chips, often with some sort of sauce?
  - What do you call a narrow, pedestrian lane found in urban areas which usually
    runs between or behind buildings?
  - What do you call a rack you dry your clothes on in a house?
  - What do you call a small round piece of bread typically used as a side dish?
  - What do you call a young person in cheap trendy clothes and jewellery?
  - What do you call circular junction in which road traffic must travel in one direction
    around a central island?
  - What do you call item of clothing worn on the lower part of the body from the
    waist to the ankles, covering both legs separately?
  - What do you call short undergarments worn on the lower body?
  - What do you call the creepy crawly thing that often rolls into a ball when touched?
  - What do you call the person who collects and removes rubbish from residential
    areas for further processing and disposal?
  - What do you call the popular sport played between two teams of eleven players
    with a spherical ball?
  - What do you say to call for a temporary respite or truce during a game or activity?
  - What is your general term for sweetened carbonated beverages?
  - What is your general term for the type of rubber-soled shoes that one typically
    wears for athletic activities or casual situations?
  allow_multiple: no
- required: no
  class: integer
  name: k
  label: Number of neighbours to check
  standalone: yes
  value: 3.0
  length:
    min: 1.0
    max: 1.0
  description: Number of neighbours to check in the k-nearest neighbour classification
  limit:
    min: 1.0
    max: 10.0
- required: no
  class: character
  name: colp
  label: Color palette
  standalone: yes
  value: Set1
  length:
    min: 1.0
    max: 1.0
  description: Color palette from colorbrewer.com
  matchable: yes
  options:
  - BrBG
  - PiYG
  - PRGn
  - PuOr
  - RdBu
  - RdGy
  - RdYlBu
  - RdYlGn
  - Spectral
  - Accent
  - Dark2
  - Paired
  - Pastel1
  - Pastel2
  - Set1
  - Set2
  - Set3
  - Blues
  - BuGn
  - BuPu
  - GnBu
  - Greens
  - Greys
  - Oranges
  - OrRd
  - PuBu
  - PuBuGn
  - PuRd
  - Purples
  - RdPu
  - Reds
  - YlGn
  - YlGnBu
  - YlOrBr
  - YlOrRd
  allow_multiple: no
head-->

## Data source

Bert Vaux and Marius L. Jøhndal (University of Cambridge, United Kingdom) recently published some exciting results of [The Cambridge Online Survey of World Englishes](http://www.tekstlab.uio.no/cambridge_survey/), which we analyse a bit further below.

## Apologetics

Please note that the report below is generated automatically from a [statistical report template](http://support.rapporter.net/entries/22471338-What-is-a-template-): the results, map, tables and all of this text are generated in real time or served from cache. This means that you are now reading a quick report written by computers that has not been proofread.

## Map

First, let us plot the raw results about _<%=q%>_ gathered in the United Kingdom on a terrain map borrowed from [Google](https://developers.google.com/maps/):

<%=
df <- UK_language_data$df
polies <- UK_language_data$polies
poliesM <- UK_language_data$poliesM
bgmap <- UK_language_data$bgmap
smallpolies <- UK_language_data$smallpolies

## load data
#df <- readRDS(system.file('custom-data/UK-survey.RData', package = 'rapport.server'))

## fix levels
levels(df$Q)[2] <- 'What do you call the long cold sandwich that contains cold cuts, lettuce, and so on?'
levels(df$Q)[3] <- 'What is your generic casual or informal term for a sweetened carbonated beverage?'
levels(df$Q) <- gsub('[>|<]', '\'', levels(df$Q))

## filter data
df <- df[which(df$Q == q), ]
df$A <- factor(df$A)
levels(df$A) <- gsub('[>|<]', '\'', levels(df$A))

## order
lt <- as.numeric(table(df$A))
lno <- length(lt)
ln <- min(length(lt), ifelse('other' %in% levels(df$A), 6, 5))
lo <- order(lt, decreasing = TRUE)[1:ln]
lt <- lt[lo]
llo <- names(table(df$A))
ll <- llo[lo]

## drop 5+ cats
if (!'other' %in% ll) {
  ll <- c(ll, 'other')
  ln <- ln + 1
}
df$A <- as.character(df$A)
ids <- which(!df$A %in% ll)
if (length(ids)>0) {
  df$A[ids] <- 'other'
}
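df$A <- factor(df$A, levels = ll)
if (table(df$A)[['other']] == 0) {
  ll <- setdiff(ll, 'other')
  df$A <- factor(df$A, levels = ll)
  ln <- ln - 1
}
## recompute the counts so that they line up with the final categories shown in the legend
lt <- as.numeric(table(df$A))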

## strwrap
ll <- sapply(ll, function(l) paste(strwrap(l, 30), collapse = '\n'))

## colors
if (nrow(df) > 0) {
cs <- brewer.pal(ln, colp)
ct <- alpha(cs, 0.4)
df$cs <- df$ct <- df$A
levels(df$cs) <- cs
levels(df$ct) <- ct
}

## map data
#polies  <- readRDS(system.file('custom-data/UK-polies.RData', package = 'rapport.server'))
#poliesM <- readRDS(system.file('custom-data/UK-polies-mercator.RData', package = 'rapport.server'))
#bgmap   <- readRDS(system.file('custom-data/UK-raster.RData', package = 'rapport.server'))
bgmap@file@name  <- "/usr/local/lib/R/site-library/rapport.server/custom-data/UK-raster-raw.gif"
#bgmap@data@names <- system.file('custom-data/UK.raster.raw', package = 'rapport.server')

## cluster
centroids <- coordinates(polies)
require(class)
if (nrow(df) > 0) {
cols <- knn(df[, c('LNG', 'LAT')], centroids, df$ct, k)
}

## update plot settings
evalsOptions('width', 700)
evalsOptions('height', 700)
#evalsOptions('res', 150)
evalsOptions('graph.unify', FALSE)
%>

<%=
## plot
set.caption(q)
%>

<% if (nrow(df) > 0) { %>

<%=
plot(bgmap, maxpixels = 10e7, xaxs = 'i', yaxs = 'i')
+plot(poliesM, add = TRUE, col = as.character(cols))
+points(Mercator(df[, c(2,1)]) , col = as.character(df$cs), pch = '*', cex = 2)
+if (legend("topright", legend = paste0(ll, ' [', lt, ']'), col = cs, pch = '*', box.col = '#B2B2B2', bg = '#B2B2B2', cex = 1, plot = FALSE)$rect$h > 700000) {
  legend("topright", legend = paste0(ll, ' [', lt, ']'), col = cs, pch = '*', box.col = '#B2B2B2', bg = '#B2B2B2', cex = 1)
} else {
  legend("topright", legend = paste0(ll, ' [', lt, ']'), col = cs, pch = '*', box.col = '#B2B2B2', bg = '#B2B2B2', cex = 1.5)
}
%>
<% } else { %>
<%= plot(bgmap, maxpixels = 10e7, xaxs = 'i', yaxs = 'i') %>
<% } %>

### Responses

The map above shows the raw results, geocoded by the postcode of the respondents and marked by coloured stars, for the <%=lno%> categories offered in the survey. See the legend in the top right corner for details: the number of cases for each category is shown in square brackets after its label.
<% if (lno > ln) { %>

### Merged categories

Please note that <%=lno-ln%> categories were merged into the "other" category (n=<%=length(which(df$A == 'other'))%>) on the map for convenience:

<%= paste(pandoc.list.return(llo[-lo]), collapse = '\n') %>
<% } %>
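
The idea of merging rare answers into "other" can be illustrated with a minimal sketch in base R. The answers and the `keep` threshold below are hypothetical and chosen only for illustration; this report applies its own cut-off to the real survey data.

```r
## hypothetical answers, made up for this illustration
answers <- c('pop', 'pop', 'pop', 'soda', 'soda', 'fizzy drink',
             'fizzy drink', 'coke', 'juice', 'mineral')
keep <- 3  # hypothetical number of categories to keep

## count the answers and pick the most frequent categories
counts <- sort(table(answers), decreasing = TRUE)
top <- names(counts)[seq_len(min(keep, length(counts)))]

## everything outside the top categories becomes 'other'
merged <- factor(ifelse(answers %in% top, answers, 'other'),
                 levels = unique(c(top, 'other')))
merged_counts <- table(merged)
merged_counts
```

Keeping the levels in order of frequency means that the legend and the tables list the most common answers first.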

### K-nearest neighbours

Besides the <%=nrow(df)%> answers, 192 subdivisions of the United Kingdom are also shown in similar (a bit dimmer and transparent) colours assigned by the [k-nearest neighbour algorithm](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) with _k_ = <%=k%>. This classification method uses the survey data to determine the most likely category for each subdivision based on the _k_ responses nearest to its centre.

This means that setting _k_ to _1_ would find the response nearest to each subdivision's centre and colour the polygon accordingly, while a higher _k_ would return a smoother map of colours.
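
The effect of _k_ can be seen in a minimal sketch with the `knn` function of the `class` package. The coordinates and answers below are hypothetical toy values, not survey data, and the distance is plain Euclidean distance on longitude and latitude.

```r
library(class)  # recommended package shipped with R

## hypothetical survey points: longitude, latitude and the answer given
train <- data.frame(
    LNG = c(-0.1, -0.2, -0.3, -3.2, -3.1, -3.3),
    LAT = c(51.5, 51.4, 51.6, 55.9, 56.0, 55.8))
answer <- factor(c('pop', 'pop', 'soda', 'soda', 'soda', 'pop'))

## hypothetical centres of two subdivisions to classify
centres <- data.frame(LNG = c(-0.28, -3.2), LAT = c(51.6, 55.9))

## k = 1 copies the single nearest answer,
## k = 3 takes a majority vote among the three nearest answers
k1 <- knn(train, centres, answer, k = 1)
k3 <- knn(train, centres, answer, k = 3)
data.frame(centres, k1 = k1, k3 = k3)
```

In this toy example the first centre lies right next to a lone "soda" answer, so it is classified as "soda" with _k_ = 1, but as "pop" with _k_ = 3 because two of its three nearest neighbours answered "pop".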

## Language usage across the UK

Although the characteristics of the four countries addressed in this report may be seen in the above map, some more detailed descriptive statistics are also worth noting.

### Observations

<% if (nrow(df) > 0) { %>

<%=
## find country
ps <- attributes(smallpolies)$polygons
df$country <- factor(rowSums(sapply(1:4, function(i) {
    pss <- ps[[i]]@Polygons
    ifelse(rowSums(sapply(1:length(pss), function(j) {
        coordsPolygon <- pss[j][[1]]@coords
        point.in.polygon(df$LNG, df$LAT, coordsPolygon[, 1], coordsPolygon[, 2])
    })), i, 0)
})))
levels(df$country) <- smallpolies@data$NAME_1

## crosstable
ct <- table(df$A, df$country)
ct
%>

The above table shows the number of geocoded cases for each category in each country, which is not too informative on its own. A column-percentage table with the marginal distribution, and with cells emphasized based on the computed standardized Pearson residuals, might be a lot better to check out.

### Percentages

<%=
ctr <- apply(round(prop.table(addmargins(ct, 2), 2)*100, 2), c(1,2), function(s) paste0(s, '%'))
emphasize.cols(5)
ctres <- suppressWarnings(CrossTable(ct))$CST$stdres
ctre  <- which(ctres < -2 | ctres > 2, arr.ind = TRUE)
emphasize.strong.cells(ctre)
set.caption('Cells with standardized residuals above 2 or below -2 are highlighted in bold')
ctr
%>

The last column of the above table shows the overall distribution of the answers to _<%=q%>_, which is worth comparing to the country-specific values. The <%=nrow(ctre)%> most interesting values are highlighted based on their residuals.
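
How the percentages and the highlighted cells relate to the raw counts can be sketched in base R. The 2x2 table below is hypothetical, and the residuals are taken from `chisq.test` here, which is an assumption made for this illustration and may differ in detail from the way the table above was computed.

```r
## hypothetical counts: answers in rows, countries in columns
ct <- as.table(matrix(c(30, 10, 20, 40), nrow = 2,
    dimnames = list(answer = c('pop', 'soda'),
                    country = c('England', 'Scotland'))))

## add an overall 'Sum' column, then convert every column to percentages
pct <- round(prop.table(addmargins(ct, 2), 2) * 100, 2)

## standardized residuals of the independence model
stdres <- suppressWarnings(chisq.test(ct))$stdres

## cells whose residual is above 2 or below -2
flagged <- which(abs(stdres) > 2, arr.ind = TRUE)
pct
flagged
```

Each column of `pct` adds up to 100%, so a country's column can be compared directly with the `Sum` column.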

### Statistical tests

<%=
t <- suppressWarnings(chisq.test(ct))
lambda <- lambda.test(ct)
## Cramer's V: the chi-squared statistic is divided by n * (min(rows, columns) - 1)
cramer <- sqrt(as.numeric(t$statistic)/(sum(ct)*(min(dim(ct)) - 1)))
%>

<%if (t$p.value < 0.05) { %>

It seems that a real association can be pointed out between the answer and the country ($\chi^2$=<%=as.numeric(t$statistic)%>, degrees of freedom: <%=as.numeric(t$parameter)%>, p=<%=t$p.value%>). This means that there is a significant difference in how people answer _<%=q%>_ across the four analysed countries. This association seems to be <%=ifelse(cramer < 0.5, "weak", "strong")%> based on Cramér's V (<%=cramer%>).

<% } else { %>

It seems that no real association can be pointed out between the answer and the country ($\chi^2$=<%=as.numeric(t$statistic)%>, degrees of freedom: <%=as.numeric(t$parameter)%>, p=<%=t$p.value%>). This means that there is no significant difference in how people answer _<%=q%>_ across the four analysed countries. Therefore, no further statistical tests were performed.

<% } %>
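
As a rough guide to reading these numbers, the following minimal sketch computes the chi-squared statistic and Cramér's V for a hypothetical 2x2 table in base R. It uses the common definition of Cramér's V, the square root of the statistic divided by n times (the smaller table dimension minus one), and the counts are made up for illustration.

```r
## hypothetical counts: answers in rows, countries in columns
ct <- as.table(matrix(c(30, 10, 20, 40), nrow = 2,
    dimnames = list(answer = c('pop', 'soda'),
                    country = c('England', 'Scotland'))))

## Pearson's chi-squared test without continuity correction
test <- suppressWarnings(chisq.test(ct, correct = FALSE))
chi2 <- as.numeric(test$statistic)

## Cramer's V ranges from 0 (no association) to 1 (perfect association)
v <- sqrt(chi2 / (sum(ct) * (min(dim(ct)) - 1)))
c(chi2 = chi2, df = as.numeric(test$parameter), V = v)
```

Unlike the chi-squared statistic, Cramér's V does not grow with the sample size, which is why it is used to describe the strength of the association.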

## Summary

<%=
fraction.to.string <- function(x) {
    ## approximate x by a simple fraction and split it into numerator and denominator
    s <- attr(fractions(x, max.denominator = 10), 'frac')
    s <- strsplit(s, '/')[[1]]
    s <- as.numeric(s)
    if (length(s) == 1 && s == 0)
        return('less than one tenth')
    if (length(s) == 1 && s == 1)
        return('more than nine tenths')
    ## fall back to tenths if the denominator is still too large
    if (s[2] > 10)
        s <- c(round(x*10, 0), 10)
    s1 <- factor(s[1], levels = 1:9)
    levels(s1) <- c('one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine')
    s2 <- factor(s[2], levels = 2:10)
    levels(s2) <- c('half', 'third', 'fourth', 'fifth', 'sixth', 'seventh', 'eighth', 'ninth', 'tenth')
    ## use the plural form of the denominator if the numerator is above one
    paste(s1, paste0(s2, ifelse(s[1] > 1, 's', '')))
}
%>

The **most popular category** in the United Kingdom for <<_<%=q%>_>> was <<_<%=names(which.max(prop.table(addmargins(ct, 2), 2)[, 5]))%>_>>, chosen by _<%=fraction.to.string(max(prop.table(addmargins(ct, 2), 2)[, 5]))%>_ of the respondents.

<%if (t$p.value < 0.05) { %>

And the most important differences between the countries can be summarised as:

<%=
df$citizen <- df$country
levels(df$citizen) <- c('English', 'Northern Irish', 'Scottish', 'Welsh')
#apply(ctre[!duplicated(ctre[, 1]), ], 1, function(x) {
res <- apply(ctre, 1, function(x) {
    paste(sample(c('it seems that', 'one may say that', 'in short,', 'all in all,'), 1),
          paste(paste0('_', fraction.to.string(prop.table(addmargins(ct, 2), 2)[x[1], x[2]]), '_'), 'of'),
          sample(c(paste('people living in', levels(df$country)[x[2]]),
                   paste(levels(df$citizen)[x[2]], 'people')), 1),
          paste(ifelse(ctres[x[1], x[2]] < 0,
                       sample(c('dislike the answer', 'do not really like the answer', 'tend to dislike the answer', 'disagree with', 'do not agree with'), 1),
                       sample(c('like the answer', 'love the answer', 'tend to like the answer', 'agree with', 'sympathise with'), 1)),
                paste0('<<_', row.names(ctres)[x[1]], '_>>'),
                'which is', ifelse(ctres[x[1], x[2]] < 0, 'low', 'high'),
                sample(c('compared to the average', 'compared to the other countries', 'overall', paste('compared to e.g.', sample(setdiff(levels(df$citizen), levels(df$citizen)[x[2]]), 1), sample(c('people', 'citizens'), 1)), paste("compared to, let's say,", sample(setdiff(levels(df$citizen), levels(df$citizen)[x[2]]), 1), sample(c('people', 'citizens'), 1)), paste('compared with e.g.', sample(setdiff(levels(df$citizen), levels(df$citizen)[x[2]]), 1), sample(c('people', 'citizens'), 1))), 1)
          ))
})
paste(pandoc.list.return(res), collapse = '\n')
%>


<% } else { %>

People tend to think the same way about _<%=q%>_ in England, Scotland, Wales and Northern Ireland. Why not try to [analyse another question](http://rapporter.net/api/form/b6591e70fa19b53786dc9e1f7e734e5ca26bd4c6e13acffc07fbdc77092d8c55)?

<% } %>

<% } else { %>
No responses about _<%=q%>_ were gathered in the United Kingdom. Why not try to [analyse another question](http://rapporter.net/api/form/b6591e70fa19b53786dc9e1f7e734e5ca26bd4c6e13acffc07fbdc77092d8c55)?
<% } %>