@ShayanRiyaz
Created April 24, 2020 10:28
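import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Download the Wikipedia list of Los Angeles districts and neighbourhoods.
html = requests.get('https://en.wikipedia.org/wiki/List_of_districts_and_neighbourhoods_of_Los_Angeles').text
soup = BeautifulSoup(html, "html.parser")

lis = []
for li in soup.find_all('li'):
    # Stop once the Los Angeles portal link is reached.
    if li.find(href="/wiki/Portal:Los_Angeles"):
        break
    # Keep list items that link to another Wikipedia article.
    if li.find(href=re.compile("^/wiki/")):
        lis.append(li)
    # Pico Robertson is the only item on the list that does not have a hyperlink reference.
    if li.text == 'Pico Robertson[34]':
        lis.append(li)

# Extract the visible text of each item and build a single-column DataFrame.
neigh = [li.text.strip() for li in lis]
df = pd.DataFrame(neigh, columns=['Neighbourhood'])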
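
The scraped names can still carry Wikipedia footnote markers, as the 'Pico Robertson[34]' check above shows. A minimal sketch of one way to strip them, assuming the markers are always bracketed numbers at the end of the name; `strip_footnotes` and the sample list are hypothetical and not part of the original gist:

```python
import re

# Hypothetical helper (not in the original gist): removes trailing
# footnote markers such as "[34]" or "[1][2]" from a scraped name.
# Assumes markers are bracketed integers at the end of the string.
FOOTNOTE = re.compile(r"(\[\d+\])+\s*$")


def strip_footnotes(name):
    """Return the name without trailing footnote markers or padding."""
    return FOOTNOTE.sub("", name).strip()


# Sample input for illustration only.
names = ["Pico Robertson[34]", "Angelino Heights", " Bel Air[1][2] "]
cleaned = [strip_footnotes(n) for n in names]
```

If this helper suits the data, it could be applied to the column with `df['Neighbourhood'].map(strip_footnotes)`.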