Skip to content

Instantly share code, notes, and snippets.

@oldwestaction
Last active January 3, 2023 01:37
Show Gist options
  • Star 7 You must be signed in to star a gist
  • Fork 4 You must be signed in to fork a gist
  • Save oldwestaction/68b5c3bedf6d8acb84ff5a846cd3d8f8 to your computer and use it in GitHub Desktop.
Save oldwestaction/68b5c3bedf6d8acb84ff5a846cd3d8f8 to your computer and use it in GitHub Desktop.
google apps script to populate a google sheet w/ fragrance notes from fragrantica

how to use

  1. open sample-spreadsheet.csv (or your own fragrance spreadsheet) in google sheets
  2. in the toolbar, go to Tools > Script Editor
  3. copy and paste Code.gs into 'Code.gs'
    • by default google apps script should be using the v8 runtime -- this script will not work if it is not!
    • you can edit the variables at top if your columns are arranged differently than in my example, or if you want to name the menu item something other than 'Custom scripts'
  4. hit 'save' (you will have to enter a project name)
  5. go back to your spreadsheet and refresh the page
  6. in the toolbar, click Custom scripts > Look up fragrance notes (you will need to authorize the script)
  7. it should populate all empty cells in the 'Fragrance notes' column with the fragrance notes listed on fragrantica.com
    • if it can't find the fragrance/notes it'll fail silently
  8. run whenever you add new rows to your sheet!
// full instructions on using this script are available here: https://gist.github.com/oldwestaction/68b5c3bedf6d8acb84ff5a846cd3d8f8
// edit these variables depending on how your sheet is set up
// 1 = column A, 2 = column B, and so on
var googleSheetMenuButton = 'Custom scripts';
var perfumerColumn = 1;
var titleColumn = 2;
var fragranceNotesColumn = 3;
// you shouldn't need to edit anything below this point!
function onOpen() {
var ui = SpreadsheetApp.getUi();
ui.createMenu(googleSheetMenuButton)
.addItem('Look up fragrance notes', 'getFragranceNotes')
.addToUi();
}
function getFragranceNotes() {
var ss = SpreadsheetApp.getActiveSpreadsheet();
var allSheets = ss.getSheets();
// using google returns a 429 because the "i'm feeling lucky" flag doesn't reliably resolve, so we're going to use duck duck go
// var baseUrl = 'https://www.google.com/search?btnI=1&q=+';
var baseUrl = 'https://duckduckgo.com/?q=!ducky+';
var siteParam = '+site%3Afragrantica.com/perfume'
for (var sheet in allSheets){
var currentSheet = allSheets[sheet]
var selection = currentSheet.getDataRange();
var rows = selection.getNumRows() + 1;
for (var row = 2; row < rows; row++) { // assumes you're using row 1 for column headers
var targetCell = selection.getCell(row, fragranceNotesColumn);
if (targetCell.isBlank()) { // don't do anything unless the fragrance notes column is empty
var perfumer = selection.getCell(row, perfumerColumn).getValue();
var title = selection.getCell(row, titleColumn).getValue();
var searchQuery = (perfumer + ' ' + title).replace(/ /g, '+');
var searchUrl = baseUrl + searchQuery + siteParam;
var fragranceNotes = [];
// https://stackoverflow.com/questions/19455158/what-is-the-best-way-to-parse-html-in-google-apps-script
// https://sites.google.com/site/scriptsexamples/learn-by-example/parsing-html
// https://stackoverflow.com/questions/39036270/how-can-i-see-the-full-server-response-for-this-api-error-message-in-google-scri
var response = UrlFetchApp.fetch(searchUrl, { muteHttpExceptions: true });
var page = response.getContentText();
// in a better world, we could just use `XmlService.parse()` on `page` directly, but it can't deal with fragrantica's malformed HTML
// so instead, we are going to grab a chunk of well-formed markup using regex (barf) THEN parse it using XmlService
// see https://regex101.com/r/zwoXnB/1 for an interactive example of wtf this regex is doing
var regex = new RegExp(/(?<=<h6>main accords<\/h6>).*(?<= )/);
var markupChunk = page.match(regex);
if (markupChunk !== null) {
var parsedXml = XmlService.parse(markupChunk);
var rootElement = parsedXml.getRootElement();
var noteNodes = rootElement.getChildren();
noteNodes.forEach(function (noteNode) {
fragranceNotes.push(noteNode.getChildText('div'));
});
}
targetCell.setValue(fragranceNotes.join(', '));
}
}
}
}
Brand Title Notes
Aroma M Geisha Vanilla Hinoki woody, aromatic, fresh spicy, citrus, floral
CB I Hate Perfume To See A Flower earthy, green, yellow floral, floral, warm spicy, woody
DS and Durga White Peacock Lily white floral, aquatic, floral, sweet, musky
Heeley Iris de Nuit powdery, musky, floral, woody, amber
Monsillage Pays Dogon woody, fresh spicy, earthy, green, aromatic
William Eadon No. 12 warm spicy, citrus, aromatic, fresh spicy, musky, floral
Brand Title Fragrance notes
Aroma M Geisha Vanilla Hinoki
CB I Hate Perfume To See A Flower
DS and Durga White Peacock Lily
Heeley Iris de Nuit
Monsillage Pays Dogon
William Eadon No. 12
@Curlbread
Copy link

Curlbread commented Dec 1, 2018

Thanks for this! Really cool. I've tried to add the rating but don't really know what I'm doing. Seems a bit tricky with itemprop. Any advice are welcome!
At least managed to insert a link with:

targetCell.setValue('=HYPERLINK("' + searchUrl + '";"' + fragranceNotes +'")');

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment