misho-kr/Analyzing Police Activity with pandas.md

## Analyzing Police Activity with pandas.md

      
Analyzing Police Activity with pandas

You will explore the Stanford Open Policing Project dataset and analyze the impact of gender on police behavior. Practice cleaning messy data, creating visualizations, combining and reshaping datasets, and manipulating time series data. Analyzing Police Activity with pandas will give you valuable experience analyzing a dataset.
Led by Kevin Markham, founder of Data School
Preparing the data for analysis

Examine and clean the dataset to make working with it more efficient. Fix data types, handle missing values, and drop columns and rows while learning about the Stanford Open Policing Project dataset.

Traffic stops by police officers download
Preparing the data -- examine and clean

Locating missing values
Dropping a column
Dropping rows


Examining the data types

Fixing a data type


Creating a DatetimeIndex

Using datetime format
Setting the index


import pandas as pd

ri = pd.read_csv('police.csv')
ri.isnull().sum()

ri.drop('county_name', axis='columns', inplace=True)
ri.dropna(subset=['stop_date', 'stop_time'], inplace=True)

ri.dtypes
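# `apple` is a separate example DataFrame (not the police data) used to demonstrate these techniques
apple['price'] = apple.price.astype('float')

# String methods return a new Series; the result here is displayed, not stored
apple.date.str.replace('/', '-')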
combined = apple.date.str.cat(apple.time, sep=' ')
apple['date_and_time'] = pd.to_datetime(combined)
apple.set_index('date_and_time', inplace=True)
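
The `apple` snippets above show the technique on a separate example frame. A minimal sketch of the same steps on the police data might look like the following. The `stop_date`, `stop_time`, and `is_arrested` column names come from these notes. The sample rows and the `stop_datetime` column name are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows standing in for police.csv (assumption)
ri = pd.DataFrame({
    'stop_date': ['2005-01-04', '2005-01-23', '2005-02-17'],
    'stop_time': ['12:55', '23:15', '04:15'],
    # object dtype holding Python bools, as can happen when a column once held NaN
    'is_arrested': pd.Series([False, True, False], dtype='object'),
})

# Fix the data type: object -> bool enables Boolean math later on
ri['is_arrested'] = ri.is_arrested.astype('bool')

# Concatenate date and time strings, then parse them into datetimes
combined = ri.stop_date.str.cat(ri.stop_time, sep=' ')
ri['stop_datetime'] = pd.to_datetime(combined)  # 'stop_datetime' is a hypothetical name

# Use the parsed datetimes as the index
ri.set_index('stop_datetime', inplace=True)
```

With a `DatetimeIndex` in place, attributes such as `ri.index.hour` and methods such as `resample` become available for the time-based analysis later in these notes.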

Exploring the relationship between gender and policing

Does the gender of a driver have an impact on police behavior during a traffic stop? Explore that question while practicing filtering, grouping, method chaining, Boolean math, string methods, and more!

Counting unique values

Expressing counts as proportions


Filtering by multiple conditions
Correlation, not causation

Analyze the relationship between gender and stop outcome
We will not draw any conclusions about causation

That would require additional data and expertise


Math with Boolean values
Comparing groups using groupby
Examining the search types

Searching for a string


ri.stop_outcome.value_counts()

white = ri[ri.driver_race == 'White']
white.stop_outcome.value_counts(normalize=True)

import numpy as np

np.mean([False, True, False, False])  # 0.25: the mean of Booleans is the proportion of True values
ri.is_arrested.value_counts(normalize=True)

ri.groupby('district').is_arrested.mean()
ri.groupby(['district', 'driver_gender']).is_arrested.mean()

ri['inventory'] = ri.search_type.str.contains('Inventory', na=False)
ri.inventory.dtype
ri.inventory.sum()
ri.inventory.mean()
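
The outline lists "Filtering by multiple conditions", but the snippets above only filter on one. A minimal sketch of combining conditions with `&` (and) and `|` (or) follows. The sample rows and the gender codes `'F'` and `'M'` are assumptions made for illustration. The column names `driver_gender` and `is_arrested` come from these notes.

```python
import pandas as pd

# Hypothetical sample rows (assumption); gender codes 'F'/'M' are assumed
ri = pd.DataFrame({
    'driver_gender': ['F', 'M', 'F', 'M', 'F'],
    'is_arrested': [True, False, False, True, False],
})

# Each condition needs its own parentheses because & and | bind tighter than ==
female_and_arrested = ri[(ri.driver_gender == 'F') & (ri.is_arrested)]
female_or_arrested = ri[(ri.driver_gender == 'F') | (ri.is_arrested)]

# Boolean math: the mean of a Boolean column is the proportion of True values,
# so a groupby mean gives the arrest rate per gender
arrest_rate_by_gender = ri.groupby('driver_gender').is_arrested.mean()
```

Comparing rates across groups this way shows a correlation only. As noted above, it does not support conclusions about causation.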

Visual exploratory data analysis

Are you more likely to get arrested at a certain time of day? Are drug-related stops on the rise? Answer these and other questions by analyzing the dataset visually, since plots can help you to understand trends in a way that examining the raw data cannot.

Analyzing datetime data

Accessing datetime attributes
Calculating the monthly mean price


Resampling the price

Plotting price and volume


Computing a frequency table

Tally of how many times each combination of values occurs


Analyzing an object column

Mapping one set of values to another


Creating a bar plot

Ordering the bars
Rotating the bars


apple.date_and_time.dt.month

apple.set_index('date_and_time', inplace=True)
apple.index

apple.groupby(apple.index.month).price.mean()

# 'M' is the month-end alias; pandas 2.2 and later deprecate it in favor of 'ME'
monthly_price = apple.price.resample('M').mean()
monthly_volume = apple.volume.resample('M').mean()
monthly = pd.concat([monthly_price, monthly_volume], axis='columns')

import matplotlib.pyplot as plt

monthly.plot(subplots=True)
plt.show()

table = pd.crosstab(ri.driver_race, ri.driver_gender)
table.loc['Asian':'Hispanic']

table.plot()
table.plot(kind='bar')
table.plot(kind='bar', stacked=True)
plt.show()

mapping = {'up':True, 'down':False}
apple['is_up'] = apple.change.map(mapping)

search_rate = ri.groupby('violation').search_conducted.mean()
search_rate.sort_values().plot(kind='bar')
search_rate.sort_values().plot(kind='barh')
plt.show()
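
The section's opening question, whether arrests are more likely at a certain time of day, can be approached with the same datetime-attribute grouping shown for `apple`. This is a minimal sketch that assumes `ri` already has a `DatetimeIndex`. The sample rows are invented for illustration.

```python
import pandas as pd

# Hypothetical sample stops indexed by stop datetime (assumption)
ri = pd.DataFrame(
    {'is_arrested': [True, False, False, False, True]},
    index=pd.to_datetime([
        '2005-01-04 00:10', '2005-01-05 00:45',
        '2005-02-17 13:05', '2005-03-02 13:30',
        '2005-03-09 23:15',
    ]),
)

# Group by the hour of the index, then take the mean of the Boolean column
# to get the arrest rate for each hour of the day
hourly_arrest_rate = ri.groupby(ri.index.hour).is_arrested.mean()
```

Calling `hourly_arrest_rate.plot()` followed by `plt.show()` would draw the rate as a line plot, in the same way as the plots above.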

Analyzing the effect of weather on policing

Use a second dataset to explore the impact of weather conditions on police behavior during traffic stops. Practice merging and reshaping datasets, assessing whether a data source is trustworthy, working with categorical data, and other advanced skills.
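
These notes include no code for this chapter, so the following is only a rough sketch of merging a second dataset and working with an ordered category. Everything about the weather data here is an assumption: the `weather` frame, its `DATE` and `rating` columns, and the rating labels are hypothetical. Only `stop_date` and `is_arrested` appear in the notes above.

```python
import pandas as pd

# Hypothetical traffic stops (assumption)
ri = pd.DataFrame({
    'stop_date': ['2005-01-04', '2005-01-04', '2005-01-05', '2005-01-06'],
    'is_arrested': [False, True, False, True],
})

# Hypothetical daily weather ratings; column names and labels are assumptions
weather = pd.DataFrame({
    'DATE': ['2005-01-04', '2005-01-05', '2005-01-06'],
    'rating': ['good', 'bad', 'worse'],
})

# An ordered categorical keeps the ratings in a logical rather than alphabetical order
rating_type = pd.CategoricalDtype(categories=['good', 'bad', 'worse'], ordered=True)
weather['rating'] = weather.rating.astype(rating_type)

# A left merge keeps every stop and attaches that day's weather rating
ri_weather = pd.merge(ri, weather, left_on='stop_date', right_on='DATE', how='left')

# Arrest rate for each weather rating
arrest_rate = ri_weather.groupby('rating', observed=False).is_arrested.mean()
```

A left merge is used so that no stops are dropped if a date is missing from the weather data. Such stops would simply get a missing rating.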