Skip to content

Instantly share code, notes, and snippets.

@allenyllee
Forked from paulgambill/import_json_appsscript.js
Last active August 2, 2023 09:28
Show Gist options
  • Save allenyllee/c764c86ed722417948fc256b7a5077c4 to your computer and use it in GitHub Desktop.
Save allenyllee/c764c86ed722417948fc256b7a5077c4 to your computer and use it in GitHub Desktop.
// import_json_appsscript.js
// https://gist.github.com/allenyllee/c764c86ed722417948fc256b7a5077c4
//
// Changelog:
// (Oct. 16 2019) tag: allenyllee-20191016
// 1. Fixed google script error: urlfetchapp - service invoked too many times https://stackoverflow.com/questions/10598179/google-apps-script-urlfetchapp-service-invoked-too-many-times
// (Jul. 16 2018) tag: allenyllee-20180716
// 1. Fixed the issue "If you try to query /arrayA[k]/arrayB[n]/arrayC[m]/.../member, you will always get /arrayA[k]/arrayB[k]/arrayC[k]/.../member."
// (Nov. 30 2017) tag: allenyllee-20171130
// 1. Add the ability to query array elements by using xpath like "/array[n]/member" where "n" is array index
// 2. Fixed Issue: when the path and the rule partially matched but not under the same xpath, it will still print that value under that xpath.
/**
* Retrieves all the rows in the active spreadsheet that contain data and logs the
* values for each row.
* For more information on using the Spreadsheet API, see
* https://developers.google.com/apps-script/service_spreadsheet
*/
function readRows() {
var sheet = SpreadsheetApp.getActiveSheet();
var rows = sheet.getDataRange();
var numRows = rows.getNumRows();
var values = rows.getValues();
for (var i = 0; i <= numRows - 1; i++) {
var row = values[i];
Logger.log(row);
}
};
/**
* Adds a custom menu to the active spreadsheet, containing a single menu item
* for invoking the readRows() function specified above.
* The onOpen() function, when defined, is automatically invoked whenever the
* spreadsheet is opened.
* For more information on using the Spreadsheet API, see
* https://developers.google.com/apps-script/service_spreadsheet
*/
function onOpen() {
var sheet = SpreadsheetApp.getActiveSpreadsheet();
var entries = [{
name : "Read Data",
functionName : "readRows"
}];
sheet.addMenu("Script Center Menu", entries);
};
/*====================================================================================================================================*
ImportJSON by Trevor Lohrbeer (@FastFedora)
====================================================================================================================================
Version: 1.1
Project Page: http://blog.fastfedora.com/projects/import-json
Copyright: (c) 2012 by Trevor Lohrbeer
License: GNU General Public License, version 3 (GPL-3.0)
http://www.opensource.org/licenses/gpl-3.0.html
------------------------------------------------------------------------------------------------------------------------------------
A library for importing JSON feeds into Google spreadsheets. Functions include:
ImportJSON For use by end users to import a JSON feed from a URL
ImportJSONAdvanced For use by script developers to easily extend the functionality of this library
Future enhancements may include:
- Support for a real XPath like syntax similar to ImportXML for the query parameter
- Support for OAuth authenticated APIs
Or feel free to write these and add on to the library yourself!
------------------------------------------------------------------------------------------------------------------------------------
Changelog:
1.1 Added support for the noHeaders option
1.0 Initial release
*====================================================================================================================================*/
/**
* Imports a JSON feed and returns the results to be inserted into a Google Spreadsheet. The JSON feed is flattened to create
* a two-dimensional array. The first row contains the headers, with each column header indicating the path to that data in
* the JSON feed. The remaining rows contain the data.
*
* By default, data gets transformed so it looks more like a normal data import. Specifically:
*
* - Data from parent JSON elements gets inherited to their child elements, so rows representing child elements contain the values
* of the rows representing their parent elements.
* - Values longer than 256 characters get truncated.
* - Headers have slashes converted to spaces, common prefixes removed and the resulting text converted to title case.
*
* To change this behavior, pass in one of these values in the options parameter:
*
* noInherit: Don't inherit values from parent elements
* noTruncate: Don't truncate values
* rawHeaders: Don't prettify headers
* noHeaders: Don't include headers, only the data
* debugLocation: Prepend each value with the row & column it belongs in
*
* For example:
*
* =ImportJSON("http://gdata.youtube.com/feeds/api/standardfeeds/most_popular?v=2&alt=json", "/feed/entry/title,/feed/entry/content",
* "noInherit,noTruncate,rawHeaders")
*
* @param {url} the URL to a public JSON feed
* @param {query} a comma-separated lists of paths to import. Any path starting with one of these paths gets imported.
* @param {options} a comma-separated list of options that alter processing of the data
*
* @return a two-dimensional array containing the data, with the first row containing headers
* @customfunction
**/
function ImportJSON(url, query, options) {
return ImportJSONAdvanced(url, query, options, includeXPath_, defaultTransform_);
}
/**
* An advanced version of ImportJSON designed to be easily extended by a script. This version cannot be called from within a
* spreadsheet.
*
* Imports a JSON feed and returns the results to be inserted into a Google Spreadsheet. The JSON feed is flattened to create
* a two-dimensional array. The first row contains the headers, with each column header indicating the path to that data in
* the JSON feed. The remaining rows contain the data.
*
* Use the include and transformation functions to determine what to include in the import and how to transform the data after it is
* imported.
*
* For example:
*
* =ImportJSON("http://gdata.youtube.com/feeds/api/standardfeeds/most_popular?v=2&alt=json",
* "/feed/entry",
* function (query, path) { return path.indexOf(query) == 0; },
* function (data, row, column) { data[row][column] = data[row][column].toString().substr(0, 100); } )
*
* In this example, the import function checks to see if the path to the data being imported starts with the query. The transform
* function takes the data and truncates it. For more robust versions of these functions, see the internal code of this library.
*
* @param {url} the URL to a public JSON feed
* @param {query} the query passed to the include function
* @param {options} a comma-separated list of options that may alter processing of the data
* @param {includeFunc} a function with the signature func(query, path, options) that returns true if the data element at the given path
* should be included or false otherwise.
* @param {transformFunc} a function with the signature func(data, row, column, options) where data is a 2-dimensional array of the data
* and row & column are the current row and column being processed. Any return value is ignored. Note that row 0
* contains the headers for the data, so test for row==0 to process headers only.
*
* @return a two-dimensional array containing the data, with the first row containing headers
**/
function ImportJSONAdvanced(url, query, options, includeFunc, transformFunc) {
// allenyllee-20191016-var jsondata = UrlFetchApp.fetch(url);
var jsondata = null; // allenyllee-20191016+
// allenyllee-20191016 >>>
// retry when fetch failed
const NB_RETRY = 10;
var nbSecPause = 1.5;
var nbErr = 0;
while (nbErr < NB_RETRY) {
try {
jsondata = UrlFetchApp.fetch(url);
break;
}
catch (error) {
nbErr++;
Utilities.sleep(nbSecPause * 1000)
nbSecPause += 0.5;
}
}
// allenyllee-20191016 <<<
var object = JSON.parse(jsondata.getContentText());
return parseJSONObject_(object, query, options, includeFunc, transformFunc);
}
/**
* Encodes the given value to use within a URL.
*
* @param {value} the value to be encoded
*
* @return the value encoded using URL percent-encoding
*/
function URLEncode(value) {
return encodeURIComponent(value.toString());
}
/**
* Parses a JSON object and returns a two-dimensional array containing the data of that object.
*/
function parseJSONObject_(object, query, options, includeFunc, transformFunc) {
var headers = new Array();
var data = new Array();
if (query && !Array.isArray(query) && query.toString().indexOf(",") != -1) {
query = query.toString().split(",");
}
if (options) {
options = options.toString().split(",");
}
parseData_(headers, data, "", 1, object, query, options, includeFunc);
parseHeaders_(headers, data);
transformData_(data, options, transformFunc);
return hasOption_(options, "noHeaders") ? (data.length > 1 ? data.slice(1) : new Array()) : data;
}
/**
* Parses the data contained within the given value and inserts it into the data two-dimensional array starting at the rowIndex.
* If the data is to be inserted into a new column, a new header is added to the headers array. The value can be an object,
* array or scalar value.
*
* If the value is an object, it's properties are iterated through and passed back into this function with the name of each
* property extending the path. For instance, if the object contains the property "entry" and the path passed in was "/feed",
* this function is called with the value of the entry property and the path "/feed/entry".
*
* If the value is an array containing other arrays or objects, each element in the array is passed into this function with
* the rowIndex incremeneted for each element.
*
* If the value is an array containing only scalar values, those values are joined together and inserted into the data array as
* a single value.
*
* If the value is a scalar, the value is inserted directly into the data array.
*/
function parseData_(headers, data, path, rowIndex, value, query, options, includeFunc) {
var dataInserted = false;
if (isObject_(value)) {
for (key in value) {
if (parseData_(headers, data, "/" + key, rowIndex, value[key], query, options, includeFunc)) { //allenyllee-20180716+
//allenyllee-20180716- if (parseData_(headers, data, path + "/" + key, rowIndex, value[key], query, options, includeFunc)) {
dataInserted = true;
}
}
} else if (Array.isArray(value) && isObjectArray_(value)) {
// allenyllee-20171130 >>>
var string1 = query.toString();
var string2 = query.toString(); //allenyllee-20180716+
var index = -1;
if(string1.match(/\[[0-9]+\]/g)) {
string1 = string1.match(/\[[0-9]+\]/g)[0];
string1 = string1.slice(1,string1.length-1);
index = parseInt(string1);
//allenyllee-20180716 >>>
// sympotom:
// If you try to query /arrayA[k]/arrayB[n]/arrayC[m]/.../member,
// you will always get /arrayA[k]/arrayB[k]/arrayC[k]/.../member.
// rootcause:
// The code always extract index from the first ecountered array in the query xpath,
// and doesn't remove that array from the query xpath.
// solution:
// Remove the first encountered array everytime from the current query xpath as a new query xpath for next parseing use.
//
string2 = string2.match(/\/\w+\[[0-9]+\]/g)[0];
string2 = query.slice(string2.length)
//allenyllee-20180716 <<<
}
// allenyllee-20171130 <<<
for (var i = 0; i < value.length; i++) {
if (index>=0 && i!=index) continue; // allenyllee-20171130+
if (parseData_(headers, data, path, rowIndex, value[i], string2, options, includeFunc)) { //allenyllee-20180716+
//allenyllee-20180716- if (parseData_(headers, data, path, rowIndex, value[i], query, options, includeFunc)) {
dataInserted = true;
rowIndex++;
}
}
} else if (!includeFunc || includeFunc(query, path, options)) {
// Handle arrays containing only scalar values
if (Array.isArray(value)) {
value = value.join();
}
// Insert new row if one doesn't already exist
if (!data[rowIndex]) {
data[rowIndex] = new Array();
}
// Add a new header if one doesn't exist
if (!headers[path] && headers[path] != 0) {
headers[path] = Object.keys(headers).length;
}
// Insert the data
data[rowIndex][headers[path]] = value;
dataInserted = true;
}
return dataInserted;
}
/**
* Parses the headers array and inserts it into the first row of the data array.
*/
function parseHeaders_(headers, data) {
data[0] = new Array();
for (key in headers) {
data[0][headers[key]] = key;
}
}
/**
* Applies the transform function for each element in the data array, going through each column of each row.
*/
function transformData_(data, options, transformFunc) {
for (var i = 0; i < data.length; i++) {
for (var j = 0; j < data[i].length; j++) {
transformFunc(data, i, j, options);
}
}
}
/**
* Returns true if the given test value is an object; false otherwise.
*/
function isObject_(test) {
return Object.prototype.toString.call(test) === '[object Object]';
}
/**
* Returns true if the given test value is an array containing at least one object; false otherwise.
*/
function isObjectArray_(test) {
for (var i = 0; i < test.length; i++) {
if (isObject_(test[i])) {
return true;
}
}
return false;
}
/**
* Returns true if the given query applies to the given path.
*/
function includeXPath_(query, path, options) {
if (!query) {
return true;
} else if (Array.isArray(query)) {
for (var i = 0; i < query.length; i++) {
if (applyXPathRule_(query[i], path, options)) {
return true;
}
}
} else {
// allenyllee-20171130 >>>
var string1 = query.toString();
if(string1.match(/\[[0-9]+\]/g)) {
string1 = string1.replace(/\[[0-9]+\]/g,"");
}
// allenyllee-20171130 <<<
return applyXPathRule_(string1, path, options); //allenyllee-20171130+
// allenyllee-20171130- return applyXPathRule_(query, path, options);
}
return false;
};
/**
* Returns true if the rule applies to the given path.
*/
function applyXPathRule_(rule, path, options) {
//allenyllee-20171130- return path.indexOf(rule) == 0;
// allenyllee-20171130 >>>
// sympotom:
// when the path and the rule partially matched but not under the same xpath
// it will still print that value under that xpath
// rootcause:
// while the path matched the rule at the begining, the path can contain other characters
// for example, path = /data/asset_longname and rule = /data/asset
// solution:
// check the first remaining charater, if it's empty "" or slash "/" the function should return true.
// the slash "/" means that there are members under the same xpath, so do not remove it.
if ( path.indexOf(rule) == 0 ){
return (path.charAt(rule.length) == "" || path.charAt(rule.length) == "/")
}
// allenyllee-20171130 <<<
}
/**
* By default, this function transforms the value at the given row & column so it looks more like a normal data import. Specifically:
*
* - Data from parent JSON elements gets inherited to their child elements, so rows representing child elements contain the values
* of the rows representing their parent elements.
* - Values longer than 256 characters get truncated.
* - Values in row 0 (headers) have slashes converted to spaces, common prefixes removed and the resulting text converted to title
* case.
*
* To change this behavior, pass in one of these values in the options parameter:
*
* noInherit: Don't inherit values from parent elements
* noTruncate: Don't truncate values
* rawHeaders: Don't prettify headers
* debugLocation: Prepend each value with the row & column it belongs in
*/
function defaultTransform_(data, row, column, options) {
if (!data[row][column]) {
if (row < 2 || hasOption_(options, "noInherit")) {
data[row][column] = "";
} else {
data[row][column] = data[row-1][column];
}
}
if (!hasOption_(options, "rawHeaders") && row == 0) {
if (column == 0 && data[row].length > 1) {
removeCommonPrefixes_(data, row);
}
data[row][column] = toTitleCase_(data[row][column].toString().replace(/[\/\_]/g, " "));
}
if (!hasOption_(options, "noTruncate") && data[row][column]) {
data[row][column] = data[row][column].toString().substr(0, 256);
}
if (hasOption_(options, "debugLocation")) {
data[row][column] = "[" + row + "," + column + "]" + data[row][column];
}
}
/**
* If all the values in the given row share the same prefix, remove that prefix.
*/
function removeCommonPrefixes_(data, row) {
var matchIndex = data[row][0].length;
for (var i = 1; i < data[row].length; i++) {
matchIndex = findEqualityEndpoint_(data[row][i-1], data[row][i], matchIndex);
if (matchIndex == 0) {
return;
}
}
for (var i = 0; i < data[row].length; i++) {
data[row][i] = data[row][i].substring(matchIndex, data[row][i].length);
}
}
/**
* Locates the index where the two strings values stop being equal, stopping automatically at the stopAt index.
*/
function findEqualityEndpoint_(string1, string2, stopAt) {
if (!string1 || !string2) {
return -1;
}
var maxEndpoint = Math.min(stopAt, string1.length, string2.length);
for (var i = 0; i < maxEndpoint; i++) {
if (string1.charAt(i) != string2.charAt(i)) {
return i;
}
}
return maxEndpoint;
}
/**
* Converts the text to title case.
*/
function toTitleCase_(text) {
if (text == null) {
return null;
}
return text.replace(/\w\S*/g, function(word) { return word.charAt(0).toUpperCase() + word.substr(1).toLowerCase(); });
}
/**
* Returns true if the given set of options contains the given option.
*/
function hasOption_(options, option) {
return options && options.indexOf(option) >= 0;
}
@ofmarconi
Copy link

Hello,
I’m experimenting with the Facebook API and I’m watching a strange behaviour. I have following API-String to get informations for two facebook pages at once):

=ImportJSON(“http://graph.facebook.com/?ids=15765808054,116335138405853″; “”; “noHeaders”))

The result of this call is put in only one lines of my spreadshet although I get results for two pages. The results of the second page is put in the same line after the results of the first line. I expected these data of the second page under the first page in a second line.

The JSON output looks good for me I cannot find any error.

The problem is not with the script, but with the way Facebook is returning multiple objects in the API. Rather than returning the objects as a top-level array, which ImportJSON would correctly parse as multiple rows, Facebook is returning the objects as a key-value map where the key is the id of each object.

I found this post, but no one answers.
bradjasper/ImportJSON#2

Cheers,

@raulperdomo
Copy link

Thank you for updating this script. I was wondering if there was a way to query for relative paths?

Instead of searching top down /top/second/third[1]
I would like to be able to do something like //third to return all the data that starts under the 'third' keys, like XPATH allows in python.

@dabros
Copy link

dabros commented Dec 23, 2018

Same as @rawl28 - something with XPath wildcards like "//name" or "/items/*/name" would solve my purpose, as there is numerical index after items.

bradjasper/ImportJSON#71 (comment)

bradjasper/ImportJSON#91 (comment)

@Gusristic
Copy link

@dd36
Copy link

dd36 commented Oct 5, 2022

I'm not a computer scientist. =importjson("https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?modules=assetProfile";"/city") Works well. =importjson("https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?modules=assetProfile";"/zip") Works well. but =importjson("https://query2.finance.yahoo.com/v10/finance/quoteSummary/AAPL?modules=assetProfile";"/zip,/city") The reference does not exist. Why multiple queries do not work???

I am having the same problem. Did the syntax change?

@ONEKENO
Copy link

ONEKENO commented Feb 3, 2023

I’m having issue as following:


JSON

[

    { "id": 1, "items": [ { "itemId": 11 }, { "itemId": 12 } , { "itemId": 13 }  ] },

    { "id": 2, "items": [ { "itemId": 21 }, { "itemId": 22 }, { "itemId": 23 }  ] }

]


It is imported as

/id /items/itemId
1 11
2 21
2 22
2 23

with itemId 12 and itemId 13 missing.


Could you please help?

Thank you very much

@ONEKENO
Copy link

ONEKENO commented Feb 5, 2023

I’m having issue as following:

JSON

[

    { "id": 1, "items": [ { "itemId": 11 }, { "itemId": 12 } , { "itemId": 13 }  ] },

    { "id": 2, "items": [ { "itemId": 21 }, { "itemId": 22 }, { "itemId": 23 }  ] }

]

It is imported as

/id /items/itemId
1 11
2 21
2 22
2 23
with itemId 12 and itemId 13 missing.

Could you please help?

Thank you very much

The issue is fixed with this one, https://github.com/qeet/IMPORTJSONAPI
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment