Created
April 12, 2018 20:49
code to find group data
---
title: 'Find the #Group'
output:
  html_document:
    df_print: paged
---

```{r}
library(tidyverse)
```

## Create the dataframe

Here is the dataframe with weird group inserts!

```{r}
# create a dataframe for reprex
df <- data.frame(col1 = c("#GROUP 1", 1, 4, "#GROUP 2", 7, 10, "#GROUP 3", 13, 16),
                 col2 = c(NA, 2, 5, NA, 8, 11, NA, 14, 17),
                 col3 = c(NA, 3, 6, NA, 9, 12, NA, 15, 18))
df
```

## Find where the groups are

We use `grep` to find the rows that contain a `#GROUP` marker, then combine them into a dataframe that tells us which marker row belongs to which group, so we can work out the top and bottom data rows for each group. In this case, group 1 is rows 2 and 3 (because `#GROUP 1` is at row 1 and `#GROUP 2` is at row 4).

```{r}
# get a vector that identifies the row location of each `#GROUP` marker
group_range <- grep("#GROUP", df$col1)
# what rows have group in them?
group_range
```

## Combine into a dataframe

This lets us see which `group_number` matches which marker row in `group_range`.

```{r}
# vector of group numbers (seq_along is safe even if no groups are found)
group_number <- seq_along(group_range)
# combine the group_number with group_range
group_df <- data.frame(group_number, group_range)
group_df
```
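
Since `group_range` only stores the marker rows, the data rows for every group can be derived from it in one step. The following is a minimal sketch, assuming the last group runs to the final row of `df`; the names `first_row`, `last_row`, and `group_bounds` are hypothetical and not part of the original code.

```r
# rebuild the example data so this chunk runs on its own
df <- data.frame(col1 = c("#GROUP 1", 1, 4, "#GROUP 2", 7, 10, "#GROUP 3", 13, 16),
                 col2 = c(NA, 2, 5, NA, 8, 11, NA, 14, 17),
                 col3 = c(NA, 3, 6, NA, 9, 12, NA, 15, 18))

# rows holding a `#GROUP` marker
group_range <- grep("#GROUP", df$col1)

# the first data row of each group sits just below its marker
first_row <- group_range + 1
# the last data row sits just above the next marker;
# assumption: the final group runs to the last row of df
last_row <- c(group_range[-1] - 1, nrow(df))

# hypothetical lookup table: one row per group
group_bounds <- data.frame(group_number = seq_along(group_range),
                           first_row,
                           last_row)
group_bounds
```

For the example data this gives rows 2 to 3, 5 to 6, and 8 to 9 for groups 1, 2, and 3.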

## Function for finding top and bottom rows for each of the groups

This spits out the first and last rows of actual data for a group, given the marker above it (`top_group`) and the marker below it (`bottom_group`).

```{r}
find_range_fun <- function(top_group, bottom_group, df_name) {
  # column 2 of df_name holds the marker rows (group_range)
  print(paste0("The top number is ", df_name[top_group, 2] + 1))
  print(paste0("The bottom number is ", df_name[bottom_group, 2] - 1))
}
```

### Test the function

Test the function for group 1

```{r}
# test function for group 1 (between just below #GROUP 1 and just above #GROUP 2)
find_range_fun(top_group = 1,
               bottom_group = 2,
               group_df)
```

and for group 2.

```{r}
# test function for group 2 (between just below #GROUP 2 and just above #GROUP 3)
find_range_fun(2, 3, group_df)
```
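
`find_range_fun` prints its result, and it needs a marker below the group, so it cannot describe the last group. A possible variant that returns the rows instead is sketched here; `find_range_values` and its `n_rows` argument are hypothetical names, and the fallback to the final row for the last group is an assumption about the data layout.

```r
# marker rows from the example: #GROUP 1, 2, 3 sit at rows 1, 4, 7
group_df <- data.frame(group_number = 1:3, group_range = c(1, 4, 7))

# hypothetical variant of find_range_fun that returns the rows instead of printing
# n_rows (assumed argument) is the number of rows in the raw dataframe
find_range_values <- function(group, df_name, n_rows) {
  top <- df_name[group, 2] + 1
  # the last group has no marker below it, so fall back to the final row
  bottom <- if (group < nrow(df_name)) df_name[group + 1, 2] - 1 else n_rows
  c(top = top, bottom = bottom)
}

find_range_values(1, group_df, n_rows = 9)
find_range_values(3, group_df, n_rows = 9)
```

Returning a named vector means the values can be passed straight into later subsetting rather than copied by hand from printed output.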

## Function for subsetting original dataframe

This function grabs one subset of rows and tags it with its group number.

```{r}
# create a function that grabs a range of rows and adds a group column
find_fun <- function(row_top, row_bottom, grp_num, df_name, rep_length){
  cbind(df_name[row_top:row_bottom, ], group_var = rep(grp_num, rep_length))
}
```

### Testing the function

This works for group 1 and group 2.

```{r}
# testing the function for group 1
group_1_subset <- find_fun(row_top = 2,
                           row_bottom = 3,
                           grp_num = 1,
                           df_name = df,
                           rep_length = 2)
group_1_subset
# testing the function for group 2
group_2_subset <- find_fun(row_top = 5,
                           row_bottom = 6,
                           grp_num = 2,
                           df_name = df,
                           rep_length = 2)
group_2_subset
```
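
The two calls above are typed out by hand. As a rough sketch, the same `find_fun` can be applied to every group with `mapply`, with the row bounds computed from the `#GROUP` marker positions; the names `row_top`, `row_bottom`, `subsets`, and `all_groups` are hypothetical, and the last group is assumed to run to the end of `df`.

```r
# rebuild the example data and find_fun so this chunk runs on its own
df <- data.frame(col1 = c("#GROUP 1", 1, 4, "#GROUP 2", 7, 10, "#GROUP 3", 13, 16),
                 col2 = c(NA, 2, 5, NA, 8, 11, NA, 14, 17),
                 col3 = c(NA, 3, 6, NA, 9, 12, NA, 15, 18))

find_fun <- function(row_top, row_bottom, grp_num, df_name, rep_length){
  cbind(df_name[row_top:row_bottom, ], group_var = rep(grp_num, rep_length))
}

# marker rows, then the data rows just below and just above them
group_range <- grep("#GROUP", df$col1)
row_top <- group_range + 1
row_bottom <- c(group_range[-1] - 1, nrow(df))  # assumption: last group ends at the last row

# call find_fun once per group; df_name is the same for every call
subsets <- mapply(find_fun,
                  row_top = row_top,
                  row_bottom = row_bottom,
                  grp_num = seq_along(group_range),
                  rep_length = row_bottom - row_top + 1,
                  MoreArgs = list(df_name = df),
                  SIMPLIFY = FALSE)

# stack the per-group subsets into one dataframe
all_groups <- do.call(rbind, subsets)
all_groups
```

Computing `rep_length` from the bounds keeps it in step with the number of rows actually selected, which avoids a mismatch if groups have different sizes.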

## Look at raw DF

```{r}
# compare those results to raw df
df
```

## Compare to the subset data

```{r}
# stack the two subsets; the #GROUP marker rows are gone and group_var labels each row
rbind(group_1_subset, group_2_subset)
```
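
The row-range bookkeeping can also be avoided altogether. This is an alternative technique, not the approach used above: a running count of the `#GROUP` markers labels every row with its group, and the marker rows are then dropped. It is only a sketch, assuming every data row belongs to the most recent marker above it; `is_marker` and `tidy_df` are hypothetical names.

```r
# rebuild the example data so this chunk runs on its own
df <- data.frame(col1 = c("#GROUP 1", 1, 4, "#GROUP 2", 7, 10, "#GROUP 3", 13, 16),
                 col2 = c(NA, 2, 5, NA, 8, 11, NA, 14, 17),
                 col3 = c(NA, 3, 6, NA, 9, 12, NA, 15, 18))

# TRUE for rows that hold a `#GROUP` marker
is_marker <- grepl("#GROUP", df$col1)

# running count of markers seen so far = the group each row belongs to
group_var <- cumsum(is_marker)

# keep only the data rows and attach their group label
tidy_df <- cbind(df[!is_marker, ], group_var = group_var[!is_marker])

# col1 was stored as text (or a factor) because of the markers; convert it back
tidy_df$col1 <- as.numeric(as.character(tidy_df$col1))
tidy_df
```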