Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save Cauen/f424c42d95cca79d5aae51d41741a7fc to your computer and use it in GitHub Desktop.
Save Cauen/f424c42d95cca79d5aae51d41741a7fc to your computer and use it in GitHub Desktop.
Parse Google Sheet or Spreadsheet Data Copied to Clipboard to JSON Object (Array of Objects)

Why?

It parse copied data from Google Sheets (who is a space separated values) in JSON to work more easily.

Input

image

Output

image

How? (steps)

  1. Split string in rows
  2. Remove emtpy rows
  3. Remove empty columns
  4. Parse arrays to JSON

Play

Typescript playground

const fake = 'Name\tAge\t\t\nEmanuel\t18\t\t\nCauê\t13\t\t\n\t\t\t\n\t\t\t'
const data = prompt("Digite o conteúdo") || fake
const splitInRows = (data: string): string[] => data.split("\n")
const isEmptyRow = (data: string): boolean => data.replaceAll('\t', '').length === 0
const splitRowInColumns = (row: string): string[] => row.split("\t")
const splitRowsInColumns = (rows: string[]) => rows.map(row => splitRowInColumns(row))
const getFilledColumnsIndexes = (headerRow: string[]): number[] => headerRow.reduce((prev, curr, currentIndex) => (curr.trim()) ? [...prev, currentIndex] : prev, [] as number[])
const keepOnlySpecificIndexesInArray = (row: string[], indexes: number[]): string[] => indexes.map(index => row[index])
const cleanifyRows = (rows: string[][], indexesToKeep: number[]) => rows.map(row => keepOnlySpecificIndexesInArray(row, indexesToKeep))
/**
* It turns 'Name\tAge\t\t\nEmanuel\t18\t\t\nCauê\t13\t\t\n\t\t\t\n\t\t\t'
*
* into
*
* [
[
"name",
"age"
],
[
"Emanuel",
"18"
],
[
"Cauê",
"13"
]
]
*/
function parseSheetStringInArraySheetData(data: string) {
const rows = splitInRows(data)
const contentRows = rows.filter(el => !isEmptyRow(el))
// console.log({ contentRows })
const columns = splitRowsInColumns(contentRows)
// console.log({ columns })
const headerRow = columns[0]
const filledColumnsIndexes = getFilledColumnsIndexes(headerRow)
// console.log({ filledColumnsIndexes })
const withOnlySpecificIndexes = cleanifyRows(columns, filledColumnsIndexes)
// console.log({ withOnlySpecificIndexes })
return withOnlySpecificIndexes
}
/**
* @param items [['name', 'age'], ['Emanuel', '18'], ['Cauê', '13']]
* @returns [{"name":"Emanuel", "age":"18"}, {"name":"Cauê", "age": "13"}]
*/
function parseSheetDataToArrayOfObjects(
items: string[][],
addId?: boolean
): Record<string, string>[] {
const headers = items[0]
return items.reduce((prev, current, index) => {
const isHeader = index === 0
const item = current.reduce((prevItem, currentItem, currentIndex) => {
prevItem[headers[currentIndex]] = currentItem
return prevItem
}, {} as Record<string, string>)
if (addId) item.id = `${index}`
if (!isHeader) prev.push(item)
return prev
}, [] as Record<string, string>[])
}
function parseSheetStringInJSON({ data, addId }: { data: string, addId?: boolean }) {
const arrays = parseSheetStringInArraySheetData(data)
const json = parseSheetDataToArrayOfObjects(arrays, addId)
return json
}
console.log(parseSheetStringInJSON({ data, addId: true }))
@Cauen
Copy link
Author

Cauen commented Mar 29, 2023

v2

import parser from "csvtojson";

/**
 * - Takes into account \t and \n inside cells (🎉NEW)
 * - Remove empty columns
 * - Remove empty rows
 * - Trim cells
 */
export async function parseSheetStringInJSON(
  tsvString: string
): Promise<Record<string, string>[]> {
  function removeEmpty(obj: Object) {
    return Object.fromEntries(
      Object.entries(obj).filter(([_, v]) => Boolean(v))
    );
  }

  const parsed: Record<string, string>[] = await new Promise(
    (resolve, reject) =>
      parser({
        flatKeys: true,
        delimiter: "\t",
        trim: true,
      })
        .fromString(tsvString.trim() ?? "")
        .then(function (result) {
          const clean = result.map((el) => removeEmpty(el));
          resolve(clean);
        })
  );

  return parsed;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment