Skip to content

Instantly share code, notes, and snippets.

@tanaikech
Last active March 28, 2024 15:29
Show Gist options
  • Star 17 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save tanaikech/e6251657d425d2827fee6dd3daf47976 to your computer and use it in GitHub Desktop.
Save tanaikech/e6251657d425d2827fee6dd3daf47976 to your computer and use it in GitHub Desktop.
Workaround for Retrieving Direct Links of All Sheets from URL of 2PACX-### of Web Published Google Spreadsheet

Workaround for Retrieving Direct Links of All Sheets from URL of 2PACX-### of Web Published Google Spreadsheet

This is a sample script for retrieving the direct links of all sheets from the URL like https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml of the web published Google Spreadsheet. This sample script can be used for the following situation.

  1. The Spreadsheet is published to Web and the URL like https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml is known
  2. You are not the owner of Google Spreadsheet.
  3. You don't know the Spreadsheet ID and Sheet IDs.

Under above situation, unfortunately, the direct links of each sheet cannot be directly retrieved. I think that this is the specification of Google side. So in this post, I would like to introduce a workaround for retrieving the direct links of each sheet under above situation.

Flow

The flow of this workaround is as follows.

  1. Download the Google Spreadsheet as a XLSX data from the URL of https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml.
  2. Convert the XLSX data to Google Spreadsheet.
  3. Publish the converted Google Spreadsheet to Web.
  4. Retrieve the URLs of each sheet.

Sample script 1

Please copy and paste the following script (Google Apps Script) to the script editor. And please enable Google Drive API at Advanced Google services. This script is for the Web Apps.

function myFunction() {
  const inputUrl = "https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml";

  const prop = PropertiesService.getScriptProperties();
  const ssId = prop.getProperty("ssId");
  if (ssId) {
    DriveApp.getFileById(ssId).setTrashed(true);
    prop.deleteProperty("ssId");
  }
  const re = new RegExp(
    "(https?:\\/\\/docs\\.google\\.com\\/spreadsheets\\/d\\/e\\/2PACX-.+?\\/)"
  );
  if (!re.test(inputUrl)) throw new Error("Wrong URL.");
  const url = `${inputUrl.match(re)[1]}pub?output=xlsx`;
  const blob = UrlFetchApp.fetch(url).getBlob();
  const id = Drive.Files.insert(
    { mimeType: MimeType.GOOGLE_SHEETS, title: "temp" },
    blob
  ).id;
  prop.setProperty("ssId", id);
  Drive.Revisions.update(
    { published: true, publishedOutsideDomain: true, publishAuto: true },
    id,
    1
  );
  const sheets = SpreadsheetApp.openById(id).getSheets();
  const pubUrls = sheets.map((s) => ({
    [s.getSheetName()]: `https://docs.google.com/spreadsheets/d/${id}/pubhtml?gid=${s.getSheetId()}`,
  }));
  console.log(pubUrls);
}
  • When this script is run, the Google Spreadsheet is created to the root folder and the direct links of each sheet of the web published Spreadsheet are returned.

  • In this script, when the script is run, the Google Spreadsheet is downloaded as a XLSX data, and the XLSX data is converted to Google Spreadsheet. Then, the converted Spreadsheet is published to the web. By this, the direct links of each sheet can be retrieved.

    • Also, in this script, it supposes that the original Spreadsheet is changed. So if you run the script again, the existing Spreadsheet is deleted and new Spreadsheet is created by downloading from the original Spreadsheet. In this case, the URLs are updated.

    • So if the Spreadsheet is not changed, you can continue to use the retrieved URLs. Of course, you can also directly use the downloaded and converted Spreadsheet.

Sample script 2

Please copy and paste the following script (Google Apps Script) to the script editor. And please enable Google Drive API at Advanced Google services. This script is for the Web Apps.

As another workaround, when the original Spreadsheet is often changed, and the number of sheet is constant in the original Spreadsheet, and then, you want to retrieve only values, you can also use the following script. In this script, the URL is not changed even when the script is run again. So you can continue to use the URL. And, when this function is run using the time-driven trigger, you can also synchronize with the original Spreadsheet.

function myFunction() {
  const inputUrl = "https://docs.google.com/spreadsheets/d/e/2PACX-###/pubhtml";
  
  const re = new RegExp("(https?:\\/\\/docs\\.google\\.com\\/spreadsheets\\/d\\/e\\/2PACX-.+?\\/)");
  if (!re.test(inputUrl)) throw new Error("Wrong URL.");
  const url = `${inputUrl.match(re)[1]}pub?output=xlsx`;
  const blob = UrlFetchApp.fetch(url).getBlob();
  const prop = PropertiesService.getScriptProperties();
  let sheets;
  let ssId = prop.getProperty("ssId");
  if (ssId) {
    const temp = Drive.Files.insert({mimeType: MimeType.GOOGLE_SHEETS, title: "tempSpreadsheet"}, blob).id;
    const tempSheets = SpreadsheetApp.openById(temp).getSheets();
    sheets = SpreadsheetApp.openById(ssId).getSheets();
    tempSheets.forEach((e, i) => {
      const values = e.getDataRange().getValues();
      sheets[i].getRange(1, 1, values.length, values[0].length).setValues(values);
    });
    DriveApp.getFileById(temp).setTrashed(true);
  } else {
    ssId = Drive.Files.insert({mimeType: MimeType.GOOGLE_SHEETS, title: "copiedSpreadsheet"}, blob).id;
    Drive.Revisions.update({published: true, publishedOutsideDomain: true, publishAuto: true}, ssId, 1);
    prop.setProperty("ssId", ssId);
    sheets = SpreadsheetApp.openById(ssId).getSheets();
  }
  const pubUrls = sheets.map(s => ({[s.getSheetName()]: `https://docs.google.com/spreadsheets/d/${ssId}/pubhtml?gid=${s.getSheetId()}`}));
  console.log(pubUrls);  // You can see the URLs for each sheet at the log.
}

References

@Nevidebla
Copy link

how to download xlsx in "pubhtml" link?

@NoLimitCoder
Copy link

how to download xlsx in "pubhtml" link?

To download it, as any allowed format (xslx, pdf, csv...), just replace "/pubhtml" with "/export?format=[Yourformat]".
Ex:
Original:
https://docs.google.com/spreadsheets/d/[SpreadSheetID]/pubhtml
Downloadable version:
https://docs.google.com/spreadsheets/d/[SpreadSheetID]/export?format=xlsx

@Nevidebla
Copy link

how to download xlsx in "pubhtml" link?

To download it, as any allowed format (xslx, pdf, csv...), just replace "/pubhtml" with "/export?format=[Yourformat]". Ex: Original: https://docs.google.com/spreadsheets/d/[SpreadSheetID]/pubhtml Downloadable version: https://docs.google.com/spreadsheets/d/[SpreadSheetID]/export?format=xlsx

Thx, but it is no use to this:
https://docs.google.com/spreadsheets/d/e/2PACX-1vSTWLlj1luPBQBGNpzs_npdN7oM-0OFwmfdduufbXSOjxQDD3bkPdeo23xE0r6rHFwX1SWmYM0j9xJW/pubhtml#

@TuVieja007
Copy link

BRO THANK YOU

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment