@benmarwick
Last active February 9, 2021 15:06
Convert a single CSV file (one text per row) into separate text files. A function in R.
#' Making several text files from a single CSV file
#'
#' Convert a single CSV file (one text per row) into
#' separate text files. A function in R.
#'
#' To use this function for the first time, run:
#' install.packages("devtools")
#' After that, you only need to load the function
#' from GitHub like so:
#' library(devtools) # Windows users need Rtools installed, Mac users need Xcode installed
#' source_url("https://gist.github.com/benmarwick/9266072/raw/csv2txts.R")
#'
#' Here's how to set the arguments to the function:
#'
#' mydir is the full path of the folder that contains your CSV file,
#' for example "C:/Downloads/mycsvfile". Note that it must have
#' quote marks around it and use forward slashes, which are not the
#' default in Windows. This should be a folder with a *single* CSV file in it.
#'
#' labels refers to the column number with the document labels. The
#' default is the first column, but in case you want to use a different
#' column you can set it like so (for example, if column 2 has the labels):
#' labels = 2
#'
#'
#' A full example, assuming you've already sourced the
#' function from GitHub:
#' csv2txt("C:/Downloads/mycsvfile", labels = 2)
#' After a moment you'll get a message in the R console
#' saying 'Your text files can be found in C:/Downloads/mycsvfile'
csv2txt <- function(mydir, labels = 1){
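  # Get the full path of the CSV file (the folder should contain exactly one).
  # The pattern is a regular expression, so match a literal ".csv" at the end
  # of the name, in either upper or lower case.
  mycsvfile <- list.files(mydir, full.names = TRUE, pattern = "\\.csv$", ignore.case = TRUE)
  # Read the contents of the CSV file into a data frame,
  # keeping text columns as character strings rather than factors
  mycsvdata <- read.csv(mycsvfile, stringsAsFactors = FALSE)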
  # Combine all columns after the label column (columns 1 to `labels`
  # are dropped) into one long character string for each row
  mytxtsconcat <- apply(mycsvdata[-(1:labels)], 1, paste, collapse = " ")
  # Make a data frame with the file names and texts
  mytxtsdf <- data.frame(filename = mycsvdata[, labels], # label column gives the text file names
                         fulltext = mytxtsconcat)
  # Now write one text file for each row of the CSV, using full paths
  # so the caller's working directory is left unchanged.
  # Use 'invisible' so we don't see anything in the console
  invisible(lapply(seq_len(nrow(mytxtsdf)), function(i) write.table(mytxtsdf[i, 2],
    file = file.path(mydir, paste0(mytxtsdf[i, 1], ".txt")),
    row.names = FALSE, col.names = FALSE,
    quote = FALSE)))
  # Now check your folder to see the txt files
  message(paste0("Your text files can be found in ", mydir))
}
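
The row-to-file technique above can be tried safely on a throwaway folder. What follows is a minimal sketch, not the gist author's code: `csv2txt_sketch`, `demo_dir`, and the sample data are all hypothetical names invented for illustration. It also assumes that labels may contain characters that are not valid in file names (such as "/"), which is one possible cause of a "cannot open the connection" error when writing, so it replaces them with "_" before building each path.

```r
# Hypothetical helper, not part of the gist: a compact variant of csv2txt()
# that writes with full paths and cleans labels before using them as file names.
csv2txt_sketch <- function(mydir, labels = 1) {
  csvfile <- list.files(mydir, pattern = "\\.csv$", ignore.case = TRUE,
                        full.names = TRUE)
  stopifnot(length(csvfile) == 1)  # the folder must hold exactly one CSV
  dat <- read.csv(csvfile, stringsAsFactors = FALSE)

  # Paste every column after the label column into one string per row
  texts <- apply(dat[-(1:labels)], 1, paste, collapse = " ")

  # Assumption: anything other than letters, digits, dot, underscore or
  # hyphen is replaced with "_" so the label is safe to use as a file name
  safe <- gsub("[^A-Za-z0-9._-]", "_", as.character(dat[[labels]]))

  # Build full output paths and write one text file per row
  out <- file.path(mydir, paste0(safe, ".txt"))
  for (i in seq_along(out)) writeLines(texts[i], out[i])
  invisible(out)
}

# Demo on a throwaway folder with made-up data
demo_dir <- file.path(tempdir(), "csv2txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = c("doc/1", "doc2"),
                     text = c("first text", "second text")),
          file.path(demo_dir, "demo.csv"), row.names = FALSE)
written <- csv2txt_sketch(demo_dir)
basename(written)  # names of the files that were written
```

Returning the written paths invisibly makes it easy to check the result with `file.exists(written)` without printing anything by default.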

keilabcs commented Nov 2, 2018

I get this error when I run it. Can you help?

Error in file(file, ifelse(append, "a", "w")) :
cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
