-
-
Save jamescalam/9a5008f064516d5fb4177362a03e2594 to your computer and use it in GitHub Desktop.
import sys | |
sys.path.insert(0, r'C:\\User\medium\pysqlplus\lib') | |
import os | |
from data import Sql | |
sql = Sql('database123') # initialise the Sql object | |
directory = r'C:\\User\medium\data\\' # this is where our generic data is stored | |
file_list = os.listdir(directory) # get a list of all files | |
for file in file_list: # loop to import files to sql | |
df = pd.read_csv(directory+file) # read file to dataframe | |
sql.push_dataframe(df, file[:-4]) | |
# now we convert our file_list names into the table names we have imported to SQL | |
table_names = [x[:-4] for x in file_list] | |
sql.union(table_names, 'generic_jan') # union our files into one new table called 'generic_jan' | |
sql.drop(table_names) # drop our original tables as we now have full table | |
# get list of categories in colX, eg ['hr', 'finance', 'tech', 'c_suite'] | |
sets = list(sql.manual("SELECT colX AS 'category' FROM generic_jan GROUP BY colX", response=True)['category']) | |
for category in sets: | |
sql.manual("SELECT * INTO generic_jan_"+category+" FROM generic_jan WHERE colX = '"+category+"'") | |
Hi toconner9, I've adjusted your code to use pandas dataframe, which is the object type that my code exports the data from your query as, maybe this works for you?
I had a difficult time getting more than one field back from the 'sql.manual' method. I was wondering if anyone could look at my code and give me some feedback. It works, but is there a better way?
from mssql_connection_short import Sql from datetime import date, datetime import pandas as pd # here we import the pandas modules so we can work with dataframes sql = Sql('AdventureWorks2017') SQLcommand = """SELECT TOP 20 P.ProductID, P.Name, P.Weight, P.ModifiedDate FROM AdventureWorks2017.Production.Product p (NOLOCK) WHERE P.weight > 0""" product_data = sql.manual(SQLcommand, response=True).astype( dtype={ "ProductID": "int64", "Name": "object", "Weight": "float64", "ModifiedDate": "datetime64[ns]" }) # here we are converting datatypes using pandas dataframe # to print all columns and rows to the console, must adjust pandas display options pd.set_option('display.max_rows', None) pd.set_option('display.max_columns', None) print(product_data) # then simply print
Printing the dataframe isn't always the easiest, so maybe (depending on what you are using it for) it would be easier to save to Excel or csv?
product_data.to_excel('output.xlsx', index=False) # index = False just prevents an index column from being included in your output
or
product_data.to_csv('output.csv', sep='|') # I usually set the delimiter (sep) to |, but can use any character
Also, there is documentation for using it all here if that helps!
I had a difficult time getting more than one field back from the 'sql.manual' method. I was wondering if anyone could look at my code and give me some feedback. It works, but is there a better way?