@cab938
Last active August 2, 2018 13:28
#!pip install html5lib  # install html5lib; only needs to be run once
import pandas as pd
import numpy as np

earthquake_data = 'https://proxy.mentoracademy.org/getContentFromUrl/?userid=brooks&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FList_of_earthquakes_in_the_United_States'
# read_html returns a list with one DataFrame per table on the page; take the first
df = pd.read_html(earthquake_data, header=0)[0]
# drop rows with no known magnitude; .copy() avoids SettingWithCopyWarning on the assignments below
df = df[df['Magnitude'] != 'Unknown'].copy()
# where two values are reported separated by a comma, keep only the first
df['Magnitude'] = df['Magnitude'].apply(lambda x: x.split(", ")[0])
# average ranges written with an en dash; single values pass through unchanged
df['Magnitude'] = df['Magnitude'].apply(lambda x: np.mean(np.array(x.split('–')).astype(float)))
# count how many earthquakes on this list have magnitude > 7
print(len(df[df['Magnitude'] > 7]))
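
The two cleaning steps above (keep the first comma-separated value, then average any en-dash range) can be checked without fetching the page. The following is a minimal sketch using only the standard library; `parse_magnitude` is a hypothetical helper name, and the sample strings are made up for illustration rather than taken from the Wikipedia table.

```python
from statistics import mean

def parse_magnitude(raw):
    """Hypothetical helper mirroring the two cleaning steps in the script."""
    # Keep only the first value when several are separated by ", "
    first = raw.split(", ")[0]
    # Average the endpoints of an en-dash range; a single value is returned unchanged
    return mean(float(part) for part in first.split("–"))

# Made-up sample strings, not real earthquake data
samples = ["7.9", "6.0–6.5", "7.2, 7.5"]
parsed = [parse_magnitude(s) for s in samples]
count_over_7 = sum(1 for m in parsed if m > 7)
print(parsed, count_over_7)
```

If the behaviour matches what is wanted, the same function could be passed to `df['Magnitude'].apply(...)` in place of the two lambdas.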