@nikitalpopov
Created June 26, 2019 19:05
Extract {track name} - {artist name} lines from a saved VK audios.html page into separate *.txt files (200 songs per file)
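from bs4 import BeautifulSoup

vk_audios_file = 'audios.html'  # SET YOUR FILE NAME

# Parse the saved VK page; adjust the encoding if names come out garbled
with open(vk_audios_file, encoding='ISO-8859-1') as html:
    soup = BeautifulSoup(html, 'lxml')
songs = [song.get_text() for song in soup.find_all("div", {"class": "audio__title"})]


def revert(songname):
    """Turn 'artist – track' into 'track - artist'; return None if there is no separator."""
    parts = songname.split(' –')
    if len(parts) > 1:
        return parts[1].strip() + ' - ' + parts[0].strip()
    return None


# Call revert() once per title and drop titles that have no separator
songs = [song for song in map(revert, songs) if song]

chunk_size = 200  # songs per output file
outer = 0         # index of the current output file
counter = 0       # songs written to the current file
file = open("songlist" + str(outer) + ".txt", "w")  # SET YOUR FILE NAME
for song in songs:
    file.write(song + '\n')
    counter += 1
    if counter == chunk_size:
        # Current file is full: close it and start the next one
        outer += 1
        print(outer)
        file.close()
        file = open("songlist" + str(outer) + ".txt", "w")  # SET YOUR FILE NAME
        counter = 0
file.close()  # close the last (possibly partial) file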
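
The swap and the 200-per-file split above can also be written as small pure functions, which makes them easy to check on sample data without an audios.html file. This is a minimal sketch, not the gist author's code: `swap_title` and `chunked` are hypothetical helper names, and it assumes titles look like `Artist – Track` with an en dash as the separator. Unlike the script above, it splits on the first en dash only and keeps everything after it as the track name.

```python
from typing import Iterator, List, Optional


def swap_title(title: str, sep: str = "–") -> Optional[str]:
    """Turn 'Artist – Track' into 'Track - Artist'; return None if sep is absent."""
    artist, found, track = title.partition(sep)
    if not found:
        return None
    return f"{track.strip()} - {artist.strip()}"


def chunked(items: List[str], size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    # Sample titles (made up for illustration, not taken from a real VK page)
    titles = ["Artist A – Track 1", "no separator here", "Artist B – Track 2"]
    songs = [s for s in map(swap_title, titles) if s]
    for index, chunk in enumerate(chunked(songs, size=200)):
        print(f"songlist{index}.txt would get {len(chunk)} song(s)")
```

Each chunk yielded by `chunked` can then be written to its own songlist file, and an empty song list produces no files at all.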