@troyharvey
Created May 29, 2020 03:17
Transform the opawg podcast user agent list into a JSONL file for BigQuery
  1. Generate a JSONL file containing every podcast user agent pattern, one row per regex:

     node agents.js

  2. Use the BigQuery console to create a table and load podcast-user-agents.jsonl into it, selecting JSONL (newline-delimited JSON) as the file format.
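
Step 2 can lean on schema auto-detection, but writing the schema down makes the table's shape explicit. The following is a minimal sketch, not part of the gist: it assumes the five fields the script emits, uses the name/type/mode layout of BigQuery's JSON schema format, and adds a hypothetical `checkRow` helper for spot-checking a line before loading.

```javascript
'use strict';

// Assumed schema for the table; field names mirror the objects written by agents.js.
const schema = [
  { name: 'regex', type: 'STRING', mode: 'REQUIRED' },
  { name: 'os', type: 'STRING', mode: 'NULLABLE' },
  { name: 'device', type: 'STRING', mode: 'NULLABLE' },
  { name: 'app', type: 'STRING', mode: 'NULLABLE' },
  { name: 'bot', type: 'BOOLEAN', mode: 'REQUIRED' },
];

// Hypothetical helper: returns a list of problems found in one JSONL line.
function checkRow(line) {
  const row = JSON.parse(line);
  const problems = [];
  for (const field of schema) {
    const value = row[field.name];
    if (value === null || value === undefined) {
      if (field.mode === 'REQUIRED') problems.push(`${field.name} is missing`);
      continue;
    }
    const expected = field.type === 'BOOLEAN' ? 'boolean' : 'string';
    if (typeof value !== expected) {
      problems.push(`${field.name} should be ${expected}`);
    }
  }
  return problems;
}
```

An empty array from `checkRow` means the line fits the assumed schema. The `schema` array can also be serialized with `JSON.stringify` if the console's "edit as text" schema field is preferred over auto-detection.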

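'use strict';

// agents.js: flatten the opawg user agent list into one JSON object per line.
const fs = require('fs');
const https = require('https');

const url =
  'https://raw.githubusercontent.com/opawg/user-agents/master/src/user-agents.json';

https
  .get(url, (res) => {
    if (res.statusCode !== 200) {
      res.resume(); // discard the body so the socket is released
      console.error(`Request failed with status ${res.statusCode}`);
      process.exitCode = 1;
      return;
    }

    res.setEncoding('utf8');
    let body = '';
    res.on('data', (chunk) => {
      body += chunk;
    });
    res.on('end', () => {
      const results = [];
      // Each source entry lists several regexes; emit one row per regex.
      for (const ua of JSON.parse(body)) {
        for (const regex of ua.user_agents) {
          results.push({
            regex: regex,
            os: ua.os || null,
            device: ua.device || null,
            app: ua.app || null,
            bot: ua.bot || false,
          });
        }
      }
      // JSONL: one JSON document per line, each terminated by a newline.
      fs.writeFileSync(
        'podcast-user-agents.jsonl',
        results.map((row) => JSON.stringify(row) + '\n').join('')
      );
    });
  })
  .on('error', (err) => {
    console.error(`Request failed: ${err.message}`);
    process.exitCode = 1;
  });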
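
Before loading the file, it can help to confirm that the rows classify a user agent string as expected. The sketch below is an illustration rather than part of the gist: `parseJsonl` and `matchUserAgent` are hypothetical helpers, and the two sample rows use made-up patterns in the same shape the script writes.

```javascript
'use strict';

// Hypothetical rows in the shape agents.js writes; the patterns are made up.
const sampleJsonl =
  [
    '{"regex":"^ExamplePlayer/","os":"ios","device":"phone","app":"Example Player","bot":false}',
    '{"regex":"^ExampleCrawler","os":null,"device":null,"app":null,"bot":true}',
  ].join('\n') + '\n';

// Parse JSONL text into an array of row objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Return the first row whose regex matches the user agent, or null if none do.
function matchUserAgent(rows, userAgent) {
  return rows.find((row) => new RegExp(row.regex).test(userAgent)) || null;
}

const rows = parseJsonl(sampleJsonl);
console.log(matchUserAgent(rows, 'ExamplePlayer/1.2 (iPhone)'));
```

To check the real output instead of the sample rows, pass `fs.readFileSync('podcast-user-agents.jsonl', 'utf8')` to `parseJsonl`. Taking the first match is an assumption made here for simplicity; if several patterns can match one user agent, the rows would need an explicit ordering.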