Skip to content

Instantly share code, notes, and snippets.

@jmodjeska
Forked from roblg/gist:2400902
Last active July 25, 2023 10:03
Show Gist options
  • Save jmodjeska/b0af2372c75c903700aeca4afb1fd56f to your computer and use it in GitHub Desktop.
Save jmodjeska/b0af2372c75c903700aeca4afb1fd56f to your computer and use it in GitHub Desktop.
retrieving a basic auth-protected CSV with Google Spreadsheets and Google App Scripting
/** Import CSV with basic auth
* https://modjeska.us/csv-google-sheets-basic-auth/
* Synopsis:
populateSheetWithCSV(foo, csvUrl, base64pw, ignoreHeaders, bustCache)
* Usage:
=populateSheetWithCSV("Any Value", "https://csv-url", "base64pw", TRUE, TRUE)
* Variables:
* foo: Any value. Google Sheets doesn't always recalculate your formula.
To force it, you can manually or formulaically change this value,
which will trigger a fresh fetch of the CSV data from the source.
* csvUrl: Location of the CSV.
* base64pw: Pre-encoded base-64 password. You can use something like
https://www.base64encode.net/ to get the encoded version of
username:password.
* ignoreHeaders: If TRUE, doesn't return the first row of the CSV.
* bustCache: If TRUE, appends a random token to the CSV URL to traverse
a cache on the host serving your CSV.
**/
// Helper function: generate a random number for a cache busting token
function cacheBust() {
return Math.floor((Math.random() * 100000) + 1);
}
// Helper function: parse the CSV response
function parseCsvResponse(csvString, ignoreHeaders) {
var retArray = [];
var strLines = csvString.split(/\n/g);
startLine = ignoreHeaders ? 1 : 0;
for (var i = startLine; i < strLines.length; i++) {
var line = strLines[i];
if (line != '') {
retArray.push(line.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/));
}
}
return retArray;
}
// Main function: retrieve the CSV and poppulate the data in-place
function populateSheetWithCSV(foo, csvUrl, base64pw, ignoreHeaders, bustCache) {
var url = cacheBust ? csvUrl .concat(cacheBust()) : csvUrl;
var resp = UrlFetchApp.fetch(url, {
headers: {
'Authorization': 'Basic '.concat(base64pw)
}
});
var csvContent = parseCsvResponse(resp.getContentText(), ignoreHeaders);
return csvContent;
}
@jmodjeska
Copy link
Author

jmodjeska commented Aug 16, 2018

I really appreciate the original Gist; it got me started on the right path to solving an important problem. I suspect the various things that weren't working in the original are due to changes over time in Google Sheets functionality rather than any oversight by the author. But, since we live in the future now, here's the modified solution I ended up with.

This version solves a few problems

  1. CSVs often enclose commas inside sets of double quotes. We should be able to deal with that.
  2. Google restricts custom scripts' access to some of the functions used in the original version, including clearContents and setValue.
  3. This kind of function is most intuitive if it pulls the data in-place — the kind of behavior you get with IMPORTDATA() — rather than trying to clear and re-populate a remote sheet.
  4. Avoids storing plaintext username and password right there in the spreadsheet (not really more secure though; don't be fooled).

Gold plating

  1. Optionally, ignore CSV header rows.
  2. Optionally, use a cache-busting token in case your CSV source is caching and serving you old data.
  3. Allow for an arbitrary variable so you can force Google to recalculate the formula on-demand.

@swvajanyatek
Copy link

@jmodjeska - this is great. Any chance you worked on a similar script to mimic IMPORTHTML with auth? thanks for this script!

@jmodjeska
Copy link
Author

@swvajanyatek — I have not done this with HTML. There is a JSON solution here that I have used successfully, so if your HTML page has a JSON endpoint behind it that could be an option for you. Otherwise I don't know what to suggest; the crux of the problem would be how to hand off a variable full of raw HTML to Google's importHTML function in order to take advantage of its parsing capabilities.

@TimetravelerDD
Copy link

love it!!! I have only one minor suggestion:

If a field contains commas and is therefore enclosed in quotes, the script is correctly not splitting up the field, however it should also omit the quotes in this case which it doesn't do yet. How could we achieve that?

image

@TimetravelerDD
Copy link

TimetravelerDD commented Oct 25, 2019

after some trail and error and with the help of my dear friend Lars I found a solution to correctly handle the double quotes in the csv. Feel free to include it in your version if you like it.

On top of that I implemented context help for the function and made an optional change that enables you to supply the unencrypted string


/**
Import CSV with basic auth
  * https://modjeska.us/csv-google-sheets-basic-auth/
  * Synopsis:
    populateSheetWithCSV(foo, csvUrl, base64pw, ignoreHeaders, bustCache)
  * Usage:
    =populateSheetWithCSV("Any Value", "https://csv-url", "base64pw", TRUE, TRUE)
  * Variables:
  * foo:            Any value. Google Sheets doesn't always recalculate your formula. 
                    To force it, you can manually or formulaically change this value,
                    which will trigger a fresh fetch of the CSV data from the source.
  * csvUrl:         Location of the CSV.
  * base64pw:       Pre-encoded base-64 password. You can use something like 
                    https://www.base64encode.net/ to get the encoded version of
                    username:password.
  * ignoreHeaders:  If TRUE, doesn't return the first row of the CSV.
  * bustCache:      If TRUE, appends a random token to the CSV URL to traverse
                    a cache on the host serving your CSV. 

*/

// Helper function: generate a random number for a cache busting token 
function cacheBust() {
  return Math.floor((Math.random() * 100000) + 1);
}

// Helper function: parse the CSV response
function parseCsvResponse(csvString, ignoreHeaders) {
  var retArray = [];
  var row = [];
  var strLines = csvString.split(/\n/g);
  startLine = ignoreHeaders ? 1 : 0;
  for (var i = startLine; i < strLines.length; i++) {
    var line = strLines[i];
    if (line != '') {
      row = line.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/)
      for (var j = 0; j < row.length; j++) {
        row[j] = row[j].replace(/^"(.*(?="$))"$/, '$1'); //remove double quotes at the start and the end
        row[j] = row[j].replace(/""/g, /"/); //unescape escaped double quotes
      }
      retArray.push(row);                                                                          
    }
  }
  return retArray;
}

// Main function: retrieve the CSV and poppulate the data in-place

/**
Import CSV with basic auth
@param {1} foo Any value. Change it to force an update of the CSV
@param {"toggl.com/reports/api/v2/details.csv?rounding=Off&status=active&calculate=time"} csvUrl Location of the CSV
@param {"dXNlcm5hbWU6cGFzc3dvcmQ="} base64pw Pre-encoded base-64 username:password. Get it from https://www.base64encode.net/ or a similar website
@param {FALSE} ignoreHeaders If TRUE, doesn't return the first row of the CSV.
@param {FALSE} bustCache If TRUE, appends a random token to the CSV URL to traverse a cache on the host serving your CSV.
@customfunction*/
function populateSheetWithCSV(foo, csvUrl, base64pw, ignoreHeaders, bustCache) {
  //var base64pw = Utilities.base64Encode(base64pw); //uncomment this line if you want to supply the unencrypted 'user:pass' string 
  var url = cacheBust ? csvUrl .concat(cacheBust()) : csvUrl;
  var resp = UrlFetchApp.fetch(url, {
    headers: {
      'Authorization': 'Basic '.concat(base64pw)
    }
  });
  var csvContent = parseCsvResponse(resp.getContentText(), ignoreHeaders);
  return csvContent;  
}

@TMac1112
Copy link

Hi.
The works works well for me, but is it possible to:

  1. Add a sort by asc or DESC option?
  2. Remove/Replace any linebreaks when importing the csv file?

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment