@jensgro
Last active March 1, 2023 20:32
Save a JSON file with all public or private pens from CodePen

Scrape pens

Select the script and copy it to your clipboard. Then go to codepen.io and sign in. Next, open your dashboard and select your private pens. It is important to switch to the List view so the pens show up as a list instead of a grid, because the script only reads rows of the list view. If you don't know where that is, you can use the URL below.

https://codepen.io/${your_username}/pens/private?grid_type=list
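
If you prefer to build that URL in code, here is a minimal sketch. `privatePensListUrl` is a hypothetical helper name, not part of the original script; it only fills the username into the URL pattern shown above.

```javascript
// Hypothetical helper: build the list-view URL for a user's private pens.
function privatePensListUrl(username) {
  const url = new URL(
    `https://codepen.io/${encodeURIComponent(username)}/pens/private`
  );
  // grid_type=list selects the list view instead of the grid.
  url.searchParams.set("grid_type", "list");
  return url.toString();
}

console.log(privatePensListUrl("your_username"));
```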

You have to do some manual work now. Open the Chrome DevTools console (Option + Cmd + I on macOS, Ctrl + Shift + I on Windows and Linux). Paste the script you copied and press Enter.

Now you are ready to run scrape() in the console. It should log an array with the pens collected so far, which means everything went well. If there is another page of pens, continue with it as follows.

Run next() in the console, then wait until the next list of pens has loaded and appears on the screen (this might take a second). Once it is there, run scrape() again. Repeat next() and scrape() until you have gone through all your private pens. Run scrape() only once per page, since every run appends the current page to the stored data.
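
The alternating scrape() and next() routine could also be automated. The following is a minimal sketch, not part of the original script: `sleep`, `scrapeAllPages`, `hasNext`, `waitMs` and `maxPages` are hypothetical names, and the fixed delay is only an assumption about how long a page takes to load.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: alternate scrape() and next() until hasNext() is false.
// The three callbacks are passed in, so the loop itself does not touch the DOM.
async function scrapeAllPages({ scrape, next, hasNext, waitMs = 2000, maxPages = 200 }) {
  let pages = 0;
  while (pages < maxPages) {
    scrape();
    pages += 1;
    if (!hasNext()) break;
    next();
    // Crude fixed delay; assumes the next page has rendered within waitMs.
    await sleep(waitMs);
  }
  return pages;
}

// In the browser console, after pasting the script. This assumes the "next"
// button is absent on the last page, which is not verified here:
// scrapeAllPages({
//   scrape,
//   next,
//   hasNext: () => document.querySelector("[data-direction='next']") !== null,
// }).then((n) => console.log(`scraped ${n} pages`));
```

If pages load slowly, a larger `waitMs` is the safer choice; the loop has no way to detect that a page is still loading.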

As the last step, copy all the pens saved in your localStorage and put them in a local JSON file.

Run JSON.stringify(JSON.parse(localStorage.getItem("cp")), null, 2) in the Chrome console. It should output one large formatted string with all the data. Copy it and paste it into a local JSON file.
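
If scrape() was run more than once on the same page, the stored array contains those pens more than once. A minimal sketch for cleaning that up before copying; `dedupeByLink` and `toPrettyJson` are hypothetical helpers, not part of the original script.

```javascript
// Hypothetical helper: keep one entry per pen link.
// A later duplicate overwrites the earlier one but keeps its position.
function dedupeByLink(pens) {
  const byLink = new Map();
  for (const pen of pens) {
    byLink.set(pen.link, pen);
  }
  return [...byLink.values()];
}

// Serialize like the command above: JSON indented with 2 spaces.
function toPrettyJson(pens) {
  return JSON.stringify(dedupeByLink(pens), null, 2);
}

// In the Chrome console, the DevTools utility copy() puts the result
// straight on the clipboard:
// copy(toPrettyJson(JSON.parse(localStorage.getItem("cp"))));
```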

function scrape() {
  // Pens collected by earlier runs are stored under the "cp" key.
  const lsData = localStorage.getItem("cp");
  const data = lsData ? JSON.parse(lsData) : [];
  const items = document.querySelectorAll(".item-in-list-view");
  // Each row is structured like this:
  // td -> title
  //   - a -> href link
  // td -> preview (skip this)
  // td -> date created
  // td -> date updated
  // td -> list stats
  //   - hearts
  //   - comments
  //   - views
  const nextData = [...items].map((item) => {
    // Use element children only, so stray text nodes cannot shift the indices.
    return [...item.children].reduce((acc, child, i) => {
      if (i === 0) {
        // 1st td is the title
        acc.title = child.innerText;
        acc.link = [...child.children].find((a) => a.nodeName === "A")?.href;
      } else if (i === 2) {
        // 3rd td is date created
        acc.createdAt = child.innerText;
      } else if (i === 3) {
        // 4th td is date updated
        acc.updatedAt = child.innerText;
      } else if (i === 4) {
        // 5th td is stats, one value per line
        const [hearts, comments, views] = child.innerText.split("\n");
        acc.hearts = hearts;
        acc.comments = comments;
        acc.views = views;
      }
      return acc;
    }, {});
  });
  // Append this page to the stored data and save it again.
  const merged = [...data, ...nextData];
  console.log(merged);
  localStorage.setItem("cp", JSON.stringify(merged));
}

function next() {
  // Click the last "next" button on the page, if there is one.
  const nextBtn = [
    ...document.querySelectorAll("[data-direction='next']")
  ].pop();
  if (!nextBtn) {
    console.log("No next page found.");
    return;
  }
  nextBtn.click();
}
// scrape();
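
scrape() stores hearts, comments and views as the raw strings it gets from splitting the stats cell on line breaks. If you want numbers in the JSON file, a conversion could look like the sketch below. `toCount` and `parseStats` are hypothetical helpers, and the assumption that the values are plain integers, optionally with comma separators, is not verified against CodePen; anything else becomes null instead of a guessed number.

```javascript
// Hypothetical helper: "1,204" -> 1204; anything that is not a plain
// integer (for example an abbreviated count) becomes null.
function toCount(text = "") {
  const cleaned = text.trim().replace(/,/g, "");
  return /^\d+$/.test(cleaned) ? Number(cleaned) : null;
}

// Split the stats cell text the same way scrape() does:
// one value per line, in the order hearts, comments, views.
function parseStats(innerText) {
  const [hearts, comments, views] = innerText.split("\n");
  return {
    hearts: toCount(hearts),
    comments: toCount(comments),
    views: toCount(views),
  };
}

console.log(parseStats("3\n0\n1,204"));
```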