Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Google Docs/Sheets/Slides automated backups script

Update: I now use a slightly different version of this script, which creates a single zip file instead of one per document, and puts a timestamp in the filename rather than overwriting the previous backup file. That version can be found at https://github.com/brokensandals/export-google-docs.

Google Apps Script that exports all your Google Docs/Sheets/Slides into docx/xlsx/pptx files and PDFs into a folder in your Google Drive. For more info and step-by-step setup instructions, see here: http://brokensandals.net/google-docs-backup

Replace INSERT_FOLDER_ID_HERE with the ID of the folder you want backups to be placed in.

Create a trigger to run the backupAll function if you want to do this on a schedule (e.g. nightly).

Notes:

  • By default, only files that you own (as opposed to files others have shared with you) will be backed up. Remove the file.getOwner() check from the backupAll method if you want to change that.
  • For each file, both an Office file (docx/xlsx/pptx) and a PDF are generated, and combined into a zip file that's placed in the backup folder. Zipping the backup files ensures that they don't clutter up the recent activity list for Docs/Sheets/Slides.
  • The script depends on the lastUpdated dates being correct on both the input files and the files in the backup directory. If that seems problematic, you could change the createOrUpdateFileForBlob method to delete existing backup files rather than updating them.

As always, this code may have defects that prevent it from working properly. Use at your own risk and remember to periodically verify that your backups are actually working as expected.

var BACKUP_FOLDER_ID = 'INSERT_FOLDER_ID_HERE';
var NATIVE_MIME_TYPES = {};
NATIVE_MIME_TYPES[MimeType.GOOGLE_DOCS] = MimeType.MICROSOFT_WORD;
NATIVE_MIME_TYPES[MimeType.GOOGLE_SHEETS] = MimeType.MICROSOFT_EXCEL;
NATIVE_MIME_TYPES[MimeType.GOOGLE_SLIDES] = MimeType.MICROSOFT_POWERPOINT;
var NATIVE_EXTENSIONS = {};
NATIVE_EXTENSIONS[MimeType.GOOGLE_DOCS] = '.docx';
NATIVE_EXTENSIONS[MimeType.GOOGLE_SHEETS] = '.xlsx';
NATIVE_EXTENSIONS[MimeType.GOOGLE_SLIDES] = '.pptx';
var BACKUP_MIME_TYPES = Object.keys(NATIVE_MIME_TYPES);
function backupAll() {
const backupFolder = DriveApp.getFolderById(BACKUP_FOLDER_ID);
BACKUP_MIME_TYPES.forEach(function(mimeType) {
var files = DriveApp.getFilesByType(mimeType);
while (files.hasNext()) {
var file = files.next();
if (file.getOwner() && file.getOwner().getEmail() == Session.getActiveUser().getEmail()) {
backup(file, backupFolder);
}
}
});
}
function backup(file, folder) {
var targetName = file.getName() + ' ' + file.getId();
var lastUpdated = file.getLastUpdated();
var pdf = getPdfBlob(file);
var native = getNativeBlob(file);
var zip = Utilities.zip([pdf, native], targetName + '.zip');
createOrUpdateFileForBlob(zip, folder, lastUpdated);
}
function createOrUpdateFileForBlob(blob, folder, ifOlderThan) {
var existingFiles = folder.getFilesByName(blob.getName());
if (existingFiles.hasNext()) {
var file = existingFiles.next();
if (file.getLastUpdated() < ifOlderThan) {
updateFile(file, blob);
}
} else {
folder.createFile(blob);
}
}
function updateFile(file, blob) {
const url = 'https://www.googleapis.com/upload/drive/v2/files/' + file.getId() + '?uploadType=media';
const params = {
method: 'put',
headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() },
payload: blob
};
var response = UrlFetchApp.fetch(url, params);
if (response.getResponseCode() < 200 || response.getResponseCode() > 299) {
throw 'Failed to update file named ' + file.getName();
}
}
function getPdfBlob(file) {
var blob = file.getAs('application/pdf');
return blob;
}
function getNativeBlob(file) {
const nativeMimeType = NATIVE_MIME_TYPES[file.getMimeType()];
const extension = NATIVE_EXTENSIONS[file.getMimeType()];
const url = 'https://www.googleapis.com/drive/v2/files/' + file.getId() + '/export?mimeType=' + nativeMimeType;
const params = {
method: 'get',
headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() }
};
const blob = UrlFetchApp.fetch(url, params).getBlob();
blob.setName(file.getName() + extension);
return blob;
}
@felix11h

This comment has been minimized.

Copy link

felix11h commented Nov 20, 2019

Exactly what I've been looking for and works precisely as described. I don't have any further comments, just wanted to say thanks for sharing this!

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Nov 20, 2019

@felix11h awesome, glad you found it useful!

I did run into one limitation of this approach: the API has a file size limit for exports (10MB if I remember right). So for example a document with lots of photos in it may fail to export. I haven’t put much time into looking for a workaround since I rarely keep large files in Google Docs.

@felix11h

This comment has been minimized.

Copy link

felix11h commented Nov 21, 2019

@brokensandals thanks for pointing that out. I'll keep it in mind as it might become important later on, but certainly not a problem for me right now either!

@FutureBrock

This comment has been minimized.

Copy link

FutureBrock commented Mar 4, 2020

@brokensandals Bingo! This works perfectly.
There is only 1 tweak I want to make and I am having difficulty figuring it out.
Do you know how I can change this so that it does not ZIP the files? I want all the files in the backup folder to just be .XLSX etc. without being zipped.
I have toyed around with the "backup" function but have not been successful.

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Mar 4, 2020

@FutureBrock glad you found it helpful! To skip zipping, replace the backup function with the following:

function backup(file, folder) {
  var lastUpdated = file.getLastUpdated();
  
  var pdf = getPdfBlob(file);
  var native = getNativeBlob(file);
  
  createOrUpdateFileForBlob(pdf, folder, lastUpdated);
  createOrUpdateFileForBlob(native, folder, lastUpdated);
}

This will dump both the .pdf and the .xlsx/etc files directly into the backup folder. If you don't want the pdf files, remove the two lines that mention pdf.

The big downside of not zipping the files is that all the backups will now show up in your recent files list if you go to sheets.google.com, etc, every time the backup runs. Zipping was a hack to avoid that.

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 20, 2020

This is amazing. Is there a way to automatically name the exported zips as "file-YYYY-MM-DD"?

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Mar 20, 2020

@cnlbrnsllvn Thanks for comment! Try changing the first two lines of the backup function to this:

var lastUpdated = file.getLastUpdated();
var targetName = file.getName() + ' ' + file.getId() + '-' + lastUpdated.toISOString().slice(0, 10);

I haven't tested it, but I think that should work. The date in each zip file name will be the date that the original file had last been modified, at the time the backup was made.

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 20, 2020

@cnlbrnsllvn Thanks for comment! Try changing the first two lines of the backup function to this:

var lastUpdated = file.getLastUpdated();
var targetName = file.getName() + ' ' + file.getId() + '-' + lastUpdated.toISOString().slice(0, 10);

I haven't tested it, but I think that should work. The date in each zip file name will be the date that the original file had last been modified, at the time the backup was made.

Thanks!

@eCigDistributors

This comment has been minimized.

Copy link

eCigDistributors commented Mar 28, 2020

I want to remove formulas and only copy values, I also want to keep formatting and colors though.

Will you please help me with that?

Thank you for sharing this

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Mar 28, 2020

@eCigDistributors try replacing the backup function with the following:

function backup(file, folder) {
  var targetName = file.getName() + ' ' + file.getId();
  var lastUpdated = file.getLastUpdated();
  
  var pdf = getPdfBlob(file);
  
  var native;
  if (file.getMimeType() === MimeType.GOOGLE_SHEETS) {
    // Create a copy of the file with all formulas replaced with their literal values
    // This code was adapted from https://support.google.com/docs/thread/17355100?hl=en&msgid=17382472
    var copy = file.makeCopy('VALUES ' + file.getName());
    var spreadsheet = SpreadsheetApp.open(copy);
    var sheets = spreadsheet.getSheets();
    for (var si = 0; si < sheets.length; si++) {
      var sheet = sheets[si];
      if (sheet.getLastRow() > 0 && sheet.getLastColumn() > 0) {
        var range = sheet.getRange(1, 1, sheet.getLastRow(), sheet.getLastColumn());
        range.setValues(range.getValues());
      }
    }
    native = getNativeBlob(copy);
    copy.setTrashed(true);
  } else {
    native = getNativeBlob(file);
  }
  
  var zip = Utilities.zip([pdf, native], targetName + '.zip');
  
  createOrUpdateFileForBlob(zip, folder, lastUpdated);
}

This doesn't work perfectly. In some of my spreadsheets, it does replace most or all formulas with values in the backups; but in some spreadsheets, I still see some formulas. So there is probably a bug in one of the three lines beginning at "if (sheet.getLastRow()...". I don't think I'll have time to investigate further, though.

@K-Langoso

This comment has been minimized.

Copy link

K-Langoso commented Apr 9, 2020

Hi!

This script is just what I needed. I had a little trouble running it but I think it is just when run manually. Posting here if anyone runs into the same thing.

When manually run, it gives an error at row 30, can't get property getName on “undefined”. However, when run automatically by the scheduler, it works fine. (Never mind, I was running a sub-function. 🤦‍♂️ See below.)

So, success, I guess! 😁

Thanks!

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Apr 9, 2020

@K-Langoso thanks for the comment! Make sure you run the backupAll function, not the backup function. Trying to run the backup function could give the error you described (because it depends on parameters that have to be supplied by the backupAll function).

Screen Shot 2020-04-09 at 12 15 47 PM

@K-Langoso

This comment has been minimized.

Copy link

K-Langoso commented Apr 9, 2020

Yep. Looks like it. 🙈

I'll update my comment accordingly, thanks!

@jmcastellano

This comment has been minimized.

Copy link

jmcastellano commented Apr 14, 2020

Hi,
I also need to make a copy of all the .pdf files.
I have tried adding it as NATIVE_MIME_TYPES [MimeType.pdf] but it doesn't work,
could you help me?

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented Apr 14, 2020

@jmcastellano The script should already include PDF copies of your docs/sheets/slides in the .zip files it produces. If you're wanting to make backups of other PDF files you have stored in Drive, do this:

  1. Change the line var BACKUP_MIME_TYPES... to var BACKUP_MIME_TYPES = Object.keys(NATIVE_MIME_TYPES).concat([MimeType.PDF]);
  2. Rewrite the backup function as follows:
function backup(file, folder) {
  var targetName = file.getName() + ' ' + file.getId();
  var lastUpdated = file.getLastUpdated();
  
  var toZip;
  if (NATIVE_MIME_TYPES[file.getMimeType()]) {
    var pdf = getPdfBlob(file);
    var native = getNativeBlob(file);
    toZip = [pdf, native];
  } else {
    toZip = [file];
  }
  
  var zip = Utilities.zip(toZip, targetName + '.zip');
  
  createOrUpdateFileForBlob(zip, folder, lastUpdated);
}
@jmcastellano

This comment has been minimized.

Copy link

jmcastellano commented Apr 15, 2020

Thanks!! it works perfectly!!

@freetimecoder

This comment has been minimized.

Copy link

freetimecoder commented May 22, 2020

Hello!

This script is amazing, thank you very much for making it!

I have a problem exporting files bigger than 10mb though, I searched and it looks like the answer is to use a method called exportLinks. However I do not understand enough of the api and script to implement it. Any pointers would be appreciated.

Also is it possible to retain the folder structure in the backup? At the moment the script exports all documents into the same folder.

Kind Regards!

@brokensandals

This comment has been minimized.

Copy link
Owner Author

brokensandals commented May 22, 2020

Hi @freetimecoder! Good find regarding exportLinks, I didn't realize that existed. Using that, I was able to get the script to download exports of files larger than 10MB, but it fails when it goes to upload them. I think this is just a limitation of UrlFetchApp, or maybe just the particular way I'm using UrlFetchApp, not the Google Drive API itself. I can't devote any more time to it right now, but here's the code in case you want to play with it: https://github.com/brokensandals/export-google-docs/tree/zip-per-document-large-files (the updateFile method is what would have to be fixed).

(If you get that working, though, you may still run into another issue if you have a lot of large files: there's a limit to how long a script can run, and streaming a bunch of large files to/from the drive API might take too long.)

Regarding retaining the folder structure, that would take some work. One option would be to use the https://developers.google.com/apps-script/reference/drive/file#getParents() method on the file objects to find the parent folder, and walk that recursively until you get to the root, building a list of all the folders along the way. (This is complicated by the fact that a file can have multiple parents, but if you never use that feature you could just use the first parent. I think Google is removing the multiple-parents feature later this year anyway.) Then you could go back through the list and use https://developers.google.com/apps-script/reference/drive/folder#createFolder(String) to create corresponding folders inside the backup directory.

Alternatively, you could change the backupAll method so that instead of just requesting all files for a given mime type, it starts at https://developers.google.com/apps-script/reference/drive/drive-app#getRootFolder() and walks the directory tree recursively, so that you'd always know where you are in the tree. Then you can create corresponding folders in the backup directory as you go along.

@freetimecoder

This comment has been minimized.

Copy link

freetimecoder commented May 22, 2020

Thanks for your answer @brokensandals!

I've been reading into the whole thing today and made some adjustments, especially for speed as the script terminates after a few minutes. Since I don't want to zip or create pdfs I removed them from my version. A probably volatile solution to get more speed is to hard code the export links, skipping one query (I also removed the . from the extension list):

var NATIVE_EXPORT_URLS = {};
NATIVE_EXPORT_URLS[MimeType.GOOGLE_DOCS] = "https://docs.google.com/feeds/download/documents/export/Export?id=";
NATIVE_EXPORT_URLS[MimeType.GOOGLE_SHEETS] = "https://docs.google.com/spreadsheets/export?id=";
NATIVE_EXPORT_URLS[MimeType.GOOGLE_SLIDES] = "https://docs.google.com/feeds/download/presentations/Export?id=";

[...]

function getNativeBlob(file) {
  const nativeMimeType = NATIVE_MIME_TYPES[file.getMimeType()];
  const extension = NATIVE_EXTENSIONS[file.getMimeType()];
  const exportURL = NATIVE_EXPORT_URLS[file.getMimeType()];
  
  //Construct query url
  const url = exportURL + file.getId() + '&exportFormat='+extension;

  const params = {
    method: 'get',
    headers: { Authorization: 'Bearer ' + ScriptApp.getOAuthToken() }
  };
  
  //Download converted file
  const blob = UrlFetchApp.fetch(url, params).getBlob();
  
  var fileName = file.getName() +"."+ extension;
  
  //Set filename
  blob.setName(fileName);
  return blob;
}

I got another big speed gain from checking if the existing backup file is newer before downloading the blobs and removed the check in the update function:

BACKUP_MIME_TYPES.forEach(function(mimeType) {
	//Iterate through all files on the drive
	var files = DriveApp.getFilesByType(mimeType);
	while (files.hasNext()) {
	  var file = files.next();
			   
	  //Check if older backups exist of this file
	  var existingFiles = backupFolder.getFilesByName(file.getName()+"."+NATIVE_EXTENSIONS[mimeType])
	  if(existingFiles.hasNext()) {
		var existingFile = existingFiles.next();

		//Don't backup if existing backup is up to date
		if(file.getLastUpdated() < existingFile.getLastUpdated())
		{
		  continue;
		}
	  }
	  
	  //Backup file 
	  backup(file, backupFolder);
	}
});

This way the existing files are basically skipped over fast, which means the script can run multiple times and has a better chance to reach the end if the files at the front are done by a previous iteration. It's not ideal as it still suffers from the same problem as before if there are too many files. I read about being able to cache values between script runs so I'll look into that next. (Or not, depending on time I can spend on it)

Thanks for the links on folder structure, the multi parents threw me off, I thought it was supposed to list every parent kind of like this [Root, Subfolder, SubSubfolder], which it doesn't. I think walking the tree is probably faster than reconstructing the hierarchy from each file.

Thanks again, the script was a fantastic starting point to reverse engineer! I will report back if I get the folders to work.

Kind Regards

@pantha007

This comment has been minimized.

Copy link

pantha007 commented Jun 3, 2020

@freetimecoder, can you please help me I am trying to implement your changes but can not seem to get it to run properly, is there any chance I can see you full code ??? ( without the drive link of course :-) )

@freetimecoder

This comment has been minimized.

Copy link

freetimecoder commented Jun 3, 2020

@pantha007 Sure, I put my forked changes up here:
https://gist.github.com/freetimecoder/dc916ace62b0e67f89ed80e617b6f630
Beware though, it's probably not as stable, because I optimized for speed and skipped some checks along the way.

@pantha007

This comment has been minimized.

Copy link

pantha007 commented Jun 3, 2020

@freetimecoder, thanks so much I will be keeping an eye on that :-) :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.