@benozol
Created March 13, 2017 11:36
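// Trigger a browser download of `text` as a file called `name` with MIME type `type`.
function download(text, name, type) {
  var a = document.createElement("a");
  // The link is attached to the document because some browsers ignore click() on detached elements.
  document.body.appendChild(a);
  var file = new Blob([text], {type: type});
  a.href = URL.createObjectURL(file);
  a.download = name;
  a.click();
  // Remove the temporary link again so it does not linger in the page.
  document.body.removeChild(a);
}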
// Extract the fields of one tweet from its jQuery-wrapped `.content` element.
// The selectors target Twitter's search-result markup and need updating if that markup changes.
function getMessage(elt) {
  return {
    'username': elt.find('.username b').text(),
    'user_id': elt.find('.account-group').attr('data-user-id'),
    'timestamp': Number(elt.find('small.time a .js-short-timestamp').attr('data-time')),
    'href': elt.find('small.time a').attr('href'),
    'content': elt.find('p.tweet-text').text(),
    'content_html': elt.find('p.tweet-text').html(),
  };
}
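// Serialise the messages as CSV text: a header row followed by one row per message.
function toCSV(msgs) {
  // Turn one array of values into a CSV line.
  var processRow = function (row) {
    var finalVal = '';
    for (var j = 0; j < row.length; j++) {
      var innerValue;
      if (row[j] == null) {
        // null or undefined, e.g. an attribute that is missing from the page
        innerValue = '';
      } else if (row[j] instanceof Date) {
        innerValue = row[j].toLocaleString();
      } else {
        innerValue = row[j].toString();
      }
      // Double embedded quotes, then quote any field containing a quote, comma or newline.
      var result = innerValue.replace(/"/g, '""');
      if (result.search(/("|,|\n)/g) >= 0) {
        result = '"' + result + '"';
      }
      if (j > 0) {
        finalVal += ',';
      }
      finalVal += result;
    }
    return finalVal + '\n';
  };
  var csvFile = processRow(['username', 'user_id', 'timestamp', 'link', 'content', 'content_html']);
  for (var i = 0; i < msgs.length; i++) {
    var msg = msgs[i];
    // The timestamp is treated as seconds; Date expects milliseconds.
    csvFile += processRow([msg.username, msg.user_id, new Date(1000 * msg.timestamp), msg.href, msg.content, msg.content_html]);
  }
  return csvFile;
}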
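// Collect every tweet currently loaded in the page and download them as CSV.
// `$` is expected to be the page's jQuery.
var msgs = $('.tweet .content').map(function() { return getMessage($(this)); });
download(toCSV(msgs.toArray()), 'messages.csv', 'text/csv');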

benozol commented Mar 13, 2017

The entry point for obtaining tweets is Twitter's advanced search: https://twitter.com/search-advanced

  1. Enter the query terms.
  2. Click "Search".
  3. Select "Latest" to order the tweets by time.
  4. Keep scrolling to the very bottom (press Shift+Page Down) until no further tweets are loaded.
  5. Open a console (Firefox: right-click anywhere, "Inspect element (Q)"; Chrome: right-click, "Inspect", then select "Console" in the window that opens).
  6. Copy and paste the script above after the prompt (») in the console at the very bottom of the browser window and press Enter.
  7. A download of the results starts.
  8. Save the resulting file (messages.csv).
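
Scrolling by hand (step 4) gets tedious for large result sets. Below is a minimal sketch of automating it from the console before running the script above. It assumes the tweets still match the `.tweet .content` selector used by the script and that each new batch loads within a couple of seconds of scrolling. The helper `scrollUntilStable`, the 2000 ms delay and the three-round cutoff are illustrative choices, not part of the gist.

```javascript
// Hypothetical helper (not part of the gist): calls `scrollOnce` repeatedly and
// resolves once `countItems` has returned the same value `maxIdleRounds` times in a row.
function scrollUntilStable(countItems, scrollOnce, delayMs, maxIdleRounds) {
  return new Promise(function (resolve) {
    var lastCount = countItems();
    var idleRounds = 0;
    function step() {
      scrollOnce();
      // Give the page time to load the next batch before counting again.
      setTimeout(function () {
        var count = countItems();
        idleRounds = count === lastCount ? idleRounds + 1 : 0;
        lastCount = count;
        if (idleRounds >= maxIdleRounds) {
          resolve(count);
        } else {
          step();
        }
      }, delayMs);
    }
    step();
  });
}

// Only runs in a browser; the selector, delay and cutoff are assumptions.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  scrollUntilStable(
    function () { return document.querySelectorAll('.tweet .content').length; },
    function () { window.scrollTo(0, document.body.scrollHeight); },
    2000, // assumed wait for the next batch of tweets
    3     // assumed number of unchanged rounds before stopping
  ).then(function (count) {
    console.log('Stopped scrolling with ' + count + ' tweets loaded');
  });
}
```

Once the message is logged, continue with step 6. On a slow connection a longer delay may be needed, otherwise the loop can stop before all tweets have loaded.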
