Last active
November 26, 2015 07:13
-
-
Save ashutoshnanda/2f79d6d310dd5c1cc08f to your computer and use it in GitHub Desktop.
This code (in both Python and R) will read from a SQLite database of Hillary Clinton's emails and display basic information about the database.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import pandas as pd | |
import sqlite3 | |
#Set up connection to the SQLite database | |
connection = sqlite3.connect('clinton.sqlite') | |
#Print all tables | |
print("Tables") | |
all_tables = pd.read_sql("SELECT name FROM sqlite_master WHERE type = 'table'", connection) | |
print(all_tables) | |
#Print information about 'docs' table | |
docs = pd.read_sql("SELECT * FROM %s" % ("docs"), connection) | |
print("Column Names") | |
print(list(docs.columns.values)) | |
print("Number of Rows: %d" % len(docs)) | |
#Clean up connection to the database | |
connection.close() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
library(dplyr) | |
library(RSQLite) | |
#Set up connection to the SQLite database | |
connection <- dbConnect(RSQLite::SQLite(), dbname = "clinton.sqlite") | |
#Print all tables | |
print("Tables") | |
all_tables <- dbListTables(connection) | |
print(all_tables) | |
#Print information about 'docs' table | |
docs <- dbGetQuery(connection, sprintf("SELECT * FROM %s", "docs")) | |
print("Column Name") | |
print(colnames(docs)) | |
print(sprintf("Number of Rows: %d", nrow(docs))) | |
#Clean up connection to the database | |
dbDisconnect(connection) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Tables | |
name | |
0 classreason_doc | |
1 classreasons | |
2 docs | |
3 recipient_doc_cc | |
4 recipient_doc_from | |
5 recipient_doc_to | |
6 recipients | |
Column Names | |
['date', 'posted_date', 'pdf_link', 'doc_class', 'message_num', 'case_num', 'declass_date', 'classification', 'reason', 'author', 'id', 'subject', 'body', 'full_path', 'is_document', 'doc_date', 'pdf_url'] | |
Number of Rows: 40737 |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
[1] "Tables" | |
[1] "classreason_doc" "classreasons" "docs" "recipient_doc_cc" | |
[5] "recipient_doc_from" "recipient_doc_to" "recipients" | |
[1] "Column Name" | |
[1] "date" "posted_date" "pdf_link" "doc_class" "message_num" | |
[6] "case_num" "declass_date" "classification" "reason" "author" | |
[11] "id" "subject" "body" "full_path" "is_document" | |
[16] "doc_date" "pdf_url" | |
[1] "Number of Rows: 40737" |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment