Skip to content

Instantly share code, notes, and snippets.

@ivopbernardo
Created August 20, 2019 19:44
Show Gist options
  • Save ivopbernardo/84f6c8db11f2a08094cece008e62d2f2 to your computer and use it in GitHub Desktop.
Save ivopbernardo/84f6c8db11f2a08094cece008e62d2f2 to your computer and use it in GitHub Desktop.
Examples of extracting data using Pandas
import pandas as pd
import pyodbc
#Ask for user and password input
user = input('Provide user: \n')
pwd = input('Provide password: \n')
#Make connection to My SQL local host
mydb = pyodbc.connect("DRIVER={MySQL ODBC 8.0 ANSI Driver}; SERVER=localhost; PORT=3306;DATABASE=sakila; UID=%s; PASSWORD=%s;" % (user, pwd))
#Create cursor and extract via pyodbc
cursor = mydb.cursor()
cursor.execute('select title, release_year from film')
movies_release_year = pd.DataFrame([tuple(t) for t in cursor.fetchall()])
#Extract SQL query via Pandas
movies_release_year = pd.read_sql_query('select title, release_year from film', mydb, index_col = 'title')
#Pandas select with "in" clause
movies_release_year.loc[['BLUES INSTINCT','WONKA SEA']]
#Read sql query File
with open('average_actors_per_category.sql', 'r') as f:
sql_query = f.read().split(';')
f.close()
#Run query sequentially
for query in sql_query:
print('Now executing query.. /n {} /n'.format(query))
cursor.execute(query)
#Select temporary table query
average_actors_per_category = pd.read_sql_query('select * from sakila.AVERAGE_ACTORS_CATEGORY', mydb)
'''
The SQL Query used:
CREATE TEMPORARY TABLE sakila.FILM_ACTOR_COUNT
select film_id, count(*) as number_actors
from sakila.film_actor
group by film_id;
CREATE TEMPORARY TABLE sakila.AVERAGE_ACTORS_CATEGORY
select categories.name, avg(film_act.number_actors)
from sakila.FILM_ACTOR_COUNT as film_act
inner join
sakila.film_category as films
on film_act.film_id = films.film_id
inner join
sakila.category as categories
on films.category_id = categories.category_id
group by name
'''
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment