@tbobin
Last active November 28, 2018 07:29
How to extract attachments from Outlook .msg files with msgxtractr
library(tidyverse)
library(msgxtractr)
library(lubridate)
myReadSave_attachments <- function(msgFile){
# read the msg file
msg <- read_msg(msgFile)
# extract date from mail header
date <- str_extract(msg$headers$Date,
"(?<=,\\s)[0-9]+\\s[[:graph:]]{2,3}\\s[0-9]{4}(?=\\s[0-9]+:)")
# format date
date <- lubridate::dmy(date)
# directory where the attachments will be stored (one folder per mail date)
att_dir <- str_glue("./Data/attachments/{date}")
# create the directory (including missing parent directories) if it does not exist yet
if (!dir.exists(att_dir)) dir.create(att_dir, recursive = TRUE)
# save attachment to directory
save_attachments(msg, path = att_dir, use_short = FALSE)
### moving processed msg files
# extract filename from path
filename <- basename(msgFile)
# destination to move the file to
moveTo <- str_glue("./Data/Mails_processed/{filename}")
# move the file
file.rename(from = msgFile, to = moveTo)
}
# list the Outlook .msg files to process (match the file extension only)
msg_files <- list.files("./Data/Mails", pattern = "\\.msg$", full.names = TRUE)
# walk through all files in the directory
walk(msg_files, myReadSave_attachments)
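
The date extraction in myReadSave_attachments assumes that the Date header starts with a weekday and a comma (for example "Wed, 28 Nov 2018 07:29:00 +0100"). The sketch below shows one possible way to make that step more tolerant, using base R only. The helper name parse_header_date is hypothetical and not part of msgxtractr, and it assumes the header is a single character string with an English three-letter month abbreviation.

```r
# Hypothetical helper (not part of msgxtractr): parse the date part of a
# mail "Date" header without depending on the session locale.
parse_header_date <- function(header_date) {
  # Match "<day> <Mon> <year>"; a leading weekday ("Wed, ") is optional.
  m <- regmatches(
    header_date,
    regexec("([0-9]{1,2})\\s+([A-Za-z]{3})\\s+([0-9]{4})", header_date)
  )[[1]]
  # No match: return a missing Date instead of failing.
  if (length(m) == 0) return(as.Date(NA))
  # month.abb holds the English abbreviations regardless of locale.
  month <- match(m[3], month.abb)
  if (is.na(month)) return(as.Date(NA))
  # An explicit format makes impossible dates yield NA rather than an error.
  as.Date(sprintf("%s-%02d-%02d", m[4], month, as.integer(m[2])),
          format = "%Y-%m-%d")
}

d1 <- parse_header_date("Wed, 28 Nov 2018 07:29:00 +0100")
d2 <- parse_header_date("28 Nov 2018 07:29:00 +0100")
d3 <- parse_header_date("not a date")
```

Inside myReadSave_attachments, the result could replace the str_extract() and dmy() pair. A missing date (NA) should probably be checked before building att_dir, since otherwise the attachments would end up in a folder literally named "NA".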