@ArephB
Created October 7, 2013 12:11
Homework 5: Visualize anything with ggplot2
========================================================
In this assignment we use a data set from [CANSIM tables](http://www5.statcan.gc.ca/cansim/home-accueil?lang=eng&p2=50&HPA). We work with Table 202-0101: Distribution of earnings, by sex, in 2011 constant dollars. The table contains 2100 series with data for the years 1976-2011 (not all combinations necessarily have data for all years) and was last released on 2013-06-27.
The table is described by the following dimensions (not all combinations are available):
* Geography (35 items: Canada; Atlantic provinces; Newfoundland and Labrador; Prince Edward Island; ...)
* Sex (3 items: Both sexes; Males; Females)
* Earnings group (20 items: Average earnings; Median earnings; Average total income; Median total income; ...)
We use only a subset of this data set: 240 observations of 7 variables.
# Loading the Data and Initializations
```{r}
ErnDat <- read.csv("EarningDistribution.csv")
str(ErnDat)
```
We selected the 24 most recent years of data (1988-2011) and restricted ourselves to the following variables:
* Income: Average total income (dollars) for the corresponding sex group and province in a given year
* EarnersCount: Number of all earners (x1000)
* FYFTEarning: Average earnings of full-year full-time workers (dollars)
* FYFTCount: Number of full-year full-time workers (x1000)
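Before plotting, it can help to confirm that the subset has the expected shape. The following is only a sketch under assumptions, not part of the original analysis: `toy` is a tiny made-up data frame that borrows the seven column names used in this document, and `check_earnings` is a hypothetical helper name.

```r
# Toy stand-in for ErnDat: same column names, made-up values.
toy <- data.frame(
  Year         = rep(1988:1989, times = 4),
  SEX          = rep(c("Males", "Females"), each = 2, times = 2),
  Province     = rep(c("Ontario", "Quebec"), each = 4),
  Income       = c(45000, 45500, 28000, 28600, 41000, 41300, 26000, 26500),
  EarnersCount = c(2900, 2950, 2500, 2560, 1900, 1920, 1600, 1640),
  FYFTEarning  = c(62000, 62400, 43000, 43800, 56000, 56500, 40000, 40700),
  FYFTCount    = c(1900, 1920, 1200, 1240, 1200, 1210, 800, 830)
)

# Hypothetical helper: TRUE when every expected column is present and
# each Year/SEX/Province combination occurs at most once.
check_earnings <- function(d) {
  expected <- c("Year", "SEX", "Province", "Income",
                "EarnersCount", "FYFTEarning", "FYFTCount")
  if (!all(expected %in% names(d))) return(FALSE)
  !any(duplicated(d[, c("Year", "SEX", "Province")]))
}

check_earnings(toy)
```

The same call on the real data would be `check_earnings(ErnDat)`, assuming its columns carry exactly these names.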
```{r}
library(ggplot2)
```
# Income Distribution by Sex
In this part we plot the empirical distribution of income for each sex separately and compare the two. Each curve is estimated from the yearly average incomes of one sex in one province.
```{r fig.width=10, fig.height=7}
ggplot(ErnDat, aes(x = Income, color = SEX)) + geom_density() + facet_wrap(~Province) + xlab("average total income (dollars)")
```
As we see, in all provinces men generally have higher incomes than women.
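The visual comparison can be backed by a small numeric summary. The sketch below is illustrative only: `toy` holds made-up values under the column names used in this document, and `income_gap` is a name introduced here.

```r
# Made-up values with the same column names as ErnDat.
toy <- data.frame(
  Province = rep(c("Ontario", "Quebec"), each = 4),
  SEX      = rep(c("Males", "Females"), each = 2, times = 2),
  Income   = c(45000, 45500, 28000, 28600, 41000, 41300, 26000, 26500)
)

# Mean income per Province/SEX combination (rows = provinces, columns = sexes).
mean_income <- tapply(toy$Income, list(toy$Province, toy$SEX), mean)

# Male-minus-female difference per province; positive means men have higher incomes.
income_gap <- mean_income[, "Males"] - mean_income[, "Females"]
income_gap
```

A positive entry for every province would agree with what the density plots suggest.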
# Earnings of Full-Year Full-Time workers in different Provinces
To compare the average earnings of FYFT workers across provinces, we can use the following simple plot.
```{r fig.width=10, fig.height=7}
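# Jittered points only (geom_point() plus geom_jitter() would draw every observation twice); height = 0 keeps the earnings values exact
ggplot(ErnDat, aes(reorder(Province, FYFTEarning), FYFTEarning)) + geom_jitter(position = position_jitter(width = .1, height = 0)) + facet_wrap(~SEX) + xlab("Province") + ylab("Earnings of a Full-Year Full-Time worker (dollars)")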
```
We see that FYFT workers in Ontario have the highest average earnings of all the provinces, for both sexes. We also see that the highest earnings of female FYFT workers in any province are still lower than the lowest earnings of male FYFT workers.
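Both observations can be checked numerically rather than read off the plot. This is a minimal sketch on made-up numbers; the province labels and values in `toy` are illustrative, as are the names `women_below_men` and `top_province`.

```r
# Made-up values with the same column names as ErnDat.
toy <- data.frame(
  Province    = rep(c("Ontario", "Quebec", "Atlantic provinces"), each = 2),
  SEX         = rep(c("Males", "Females"), times = 3),
  FYFTEarning = c(62000, 43000, 56000, 40000, 54000, 39000)
)

# Minimum and maximum FYFT earnings within each sex.
earn_range <- sapply(split(toy$FYFTEarning, toy$SEX), range)
rownames(earn_range) <- c("min", "max")

# TRUE when the highest female value is below the lowest male value.
women_below_men <- earn_range["max", "Females"] < earn_range["min", "Males"]

# Province with the highest mean FYFT earnings.
prov_mean    <- tapply(toy$FYFTEarning, toy$Province, mean)
top_province <- names(which.max(prov_mean))
```

Replacing `toy` with `ErnDat` would test the two statements on the actual data, assuming the column names match.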
# Full-Year Full-Time workers over time
```{r fig.width=10, fig.height=7}
ggplot(ErnDat, aes(Year, FYFTCount, color=SEX)) + geom_point() + geom_line() + facet_wrap(~Province) + ylab("Number of Full-Year Full-Time workers (x1000)") + xlab("Year")
```
In every province there is a gap, larger or smaller, between the number of male and female Full-Year Full-Time workers. In the Atlantic provinces, for example, this gap has been closing over time.
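One way to make the gap explicit is to compute it per province and year. The sketch below assumes the column names used in this document and runs on made-up counts; `wide` and `Gap` are names introduced here for illustration.

```r
# Made-up counts (x1000) with the same column names as ErnDat.
toy <- data.frame(
  Year      = rep(1988:1990, times = 2),
  SEX       = rep(c("Males", "Females"), each = 3),
  Province  = "Atlantic provinces",
  FYFTCount = c(400, 395, 390, 250, 270, 290)
)

# Put the male and female counts side by side for each Province/Year.
males   <- toy[toy$SEX == "Males",   c("Province", "Year", "FYFTCount")]
females <- toy[toy$SEX == "Females", c("Province", "Year", "FYFTCount")]
wide <- merge(males, females, by = c("Province", "Year"),
              suffixes = c(".m", ".f"))

# Gap in thousands of workers; a closing gap shows up as decreasing values.
wide$Gap <- wide$FYFTCount.m - wide$FYFTCount.f
wide <- wide[order(wide$Province, wide$Year), ]
wide[, c("Province", "Year", "Gap")]
```

Plotting `Gap` against `Year` per province would show the trend directly instead of leaving it to a comparison of two lines.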
# Full-Time Average Earnings vs. Average Income
We want to show the relationship between the average earnings of full-year full-time workers and the average total income of the corresponding group. The sexes are shown in different colors.
```{r fig.width=10, fig.height=7}
ggplot(ErnDat, aes(x = Income, y = FYFTEarning,col=SEX)) + geom_point() + geom_smooth(method="lm") + xlab("Average total income (dollars)") + ylab("Average earnings of full-year full-time workers (dollars)")
```
We see that there is a close relationship between average total income and the average earnings of full-time workers, across provinces and years.
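The strength of this relationship can be quantified per sex. The following is a sketch on made-up values, assuming the column names used in this document; `cor_by_sex` and `slope_by_sex` are illustrative names. The per-sex linear model is the same kind of fit that `geom_smooth(method = "lm")` draws when the color aesthetic splits the data by `SEX`.

```r
# Made-up values with the same column names as ErnDat.
toy <- data.frame(
  SEX         = rep(c("Males", "Females"), each = 4),
  Income      = c(40000, 42000, 44000, 46000, 25000, 26000, 27500, 29000),
  FYFTEarning = c(55000, 57500, 59000, 62000, 39000, 40500, 42000, 44000)
)

by_sex <- split(toy, toy$SEX)

# Pearson correlation between income and FYFT earnings within each sex.
cor_by_sex <- sapply(by_sex, function(d) cor(d$Income, d$FYFTEarning))

# Slope of the per-sex linear fit of FYFT earnings on income.
slope_by_sex <- sapply(by_sex, function(d) {
  coef(lm(FYFTEarning ~ Income, data = d))[["Income"]]
})

cor_by_sex
slope_by_sex
```

Correlations near 1 together with positive slopes would support the reading of the scatter plot.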