Created
January 6, 2018 01:00
-
-
Save bourque/c785c4b2f53b4142b3e5ca32de581a3f to your computer and use it in GitHub Desktop.
An astronomy-themed SQL tutorial for the AAS231 Software Carpentry workshop
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "code", | |
"execution_count": 113, | |
"metadata": { | |
"collapsed": true, | |
"slideshow": { | |
"slide_type": "skip" | |
} | |
}, | |
"outputs": [], | |
"source": [ | |
"# To open this as a presentation\n", | |
"# jupyter nbconvert Interacting\\ with\\ SQL\\ databases.ipynb --to slides --post serve" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "skip" | |
} | |
}, | |
"source": [ | |
"# Why use databases?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Types of Databases\n", | |
"\n", | |
"<img src=\"https://bytesforlunch.files.wordpress.com/2011/01/relational-vs-non-relational.jpg?w=700\">" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"### Relational\n", | |
"- Represent and store data in tables and rows.\n", | |
"- Use Structured Query Language (`SQL`), which is like a programming language for relational databases.\n", | |
"- Examples include `SQLite`, `MySQL`, `PostreSQL`, `T-SQL`.\n", | |
"\n", | |
"### Non-relational\n", | |
"- Represent and store data as collections of documents, objects, key-value stores, or heirarchal data formats.\n", | |
"- Also known as \"NoSQL\" databases.\n", | |
"- Examples include `JSON`, `MongoDB`." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# SQL: Structured Query Language\n", | |
"\n", | |
"- Programming language for storing, manipulating, and retreiving data stored in a relational database.\n", | |
"- All relational database management systems (e.g. `MySQL`, `SQLite`, `PostgreSQL`) use `SQL` as their standard database landuage.\n", | |
"- `SQL` can have different *dialects* that contain small, subtle differences (just like there are different *accents* in the english language). For example, a string-like data type in `MySQL` is defined as **`STRING`**, while in `SQLite` it is defined as **`TEXT`**. \n", | |
"- However, the vast majority of `SQL` is the same in all relational database management systems." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# What will we learn?\n", | |
"\n", | |
"In this lesson, we will learn how to retreive astronomical data from a `SQLite` database.\n", | |
"\n", | |
"- How to retrieve data from a database\n", | |
"- How to select a subset of the data\n", | |
"- How to sort a query's results\n", | |
"- How to combine data from multiple tables\n", | |
"\n", | |
"We will first learn how to do this through the `SQLite` command line interface, then we will learn how to accomplish the same tasks using Python." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# The `hipparcos.db` Database\n", | |
"\n", | |
"In this lesson, we will use with the `hipparcos.db` database, which stores data from the Extended Hipparcos Compilation (XHIP). \n", | |
"\n", | |
"It contains:\n", | |
"- Spectral classificaitons\n", | |
"- Radial velocities\n", | |
"- Iron abundances\n", | |
"- Stellar classifications\n", | |
"- Stellar ages\n", | |
"- 2MASS photometry\n", | |
"\n", | |
"The database can be downloaded from this URL:\n", | |
"\n", | |
"https://www.dropbox.com/s/o80up2k4h3z7uud/hipparcos.db?dl=0\n", | |
"\n", | |
"More information about the Extended Hipparcos Compilation:\n", | |
"\n", | |
"- https://arxiv.org/abs/1108.4971\n", | |
"- http://cdsarc.u-strasbg.fr/viz-bin/Cat?cat=V%2F137D&target=readme&#sRM-F.42\n", | |
"\n", | |
"This database contains two tables, named `data` and `photometry`. The `data` table stores Astrometry, spectrography, space motions, and exoplanet indications. The `photometry` tables stores photometry information." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Opening and Navigating the Database\n", | |
"\n", | |
"To connect to the database, we can use the `sqlite3` command line interface:\n", | |
"\n", | |
"```\n", | |
">>> sqlite3 hipparcos.db\n", | |
"```\n", | |
"\n", | |
"```\n", | |
"SQLite version 3.13.0 2016-05-18 10:57:30\n", | |
"Enter \".help\" for usage hints.\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Excercise 1:** Try connecting to the `hipparcos.db` database. Use the `.help` command to determine which command to use to list the names of tables of the database and run it." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```\n", | |
"sqlite> .tables\n", | |
"data photometry\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "skip" | |
} | |
}, | |
"source": [ | |
"Instructor's notes:\n", | |
"- Mention how `SQLite` databases are simply files" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"To make the output a bit more human-readable, let's run the following commands:\n", | |
"\n", | |
"```\n", | |
".mode column\n", | |
".header on\n", | |
"```\n", | |
"\n", | |
"In `sqlite`, we can list the columns of a table using the `PRAGMA table_info()` command:\n", | |
"\n", | |
"```\n", | |
"sqlite> PRAGMA table_info(data);\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Excercise 2:** List all of the columns in the `photometry` table. How many columns are there?" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```SQL\n", | |
"PRAGMA table_info(photometry);\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "skip" | |
} | |
}, | |
"source": [ | |
"Instructor's notes:\n", | |
"- `PRAGMA` is short for \"Pragmatics\", meaning dealing with language in context in which it is used.\n", | |
"- Specific to `SQLlite`\n", | |
"- Used to query the SQLite library for internal non-table data." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Selecting everything from a table" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"```SQL\n", | |
"SELECT *\n", | |
"FROM <table_name>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT *\n", | |
"FROM data;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"Note that we can also limit the number of results returned:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT *\n", | |
"FROM data\n", | |
"LIMIT 10;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Excercise 3:** Write a query that selects all columns from the `photometry` table and limits the results to 50 rows." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution*:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT *\n", | |
"FROM photometry\n", | |
"LIMIT 50;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "skip" | |
} | |
}, | |
"source": [ | |
"Instructor's notes:\n", | |
"- Note that `SQL` is *case insenstive*, but data is case-sensitive\n", | |
"- Note that it is common practice to have SQL commands be uppercase" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Filtering specific columns" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"```SQL\n", | |
"SELECT <column1_name>, <column2_name>, ..., <columnN_name>\n", | |
"FROM <table_name>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, RV, e_RV\n", | |
"FROM data;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Exercise 4**: Write a query that selects the Hipparcos Identifier, the Hipparcos magnitude, and the variability period from the `photometry` table. You may need to list the table's columns or refer to the documentation to check their column name." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution*:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, Hpmag, Per\n", | |
"FROM photometry;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Filtering specific rows" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"```SQL\n", | |
"SELECT <column1_name>, <column2_name>, ..., <columnN_name>\n", | |
"FROM <table_name>\n", | |
"WHERE <column_name> <condition>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, RV, e_RV\n", | |
"FROM data\n", | |
"WHERE RV != 0;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"More conditions can be added to the query using the `AND` command. There is no limit to the number of conditions you can put on a query.\n", | |
"\n", | |
"\n", | |
"```SQL\n", | |
"SELECT <column1_name>, <column2_name>, ..., <columnN_name>\n", | |
"FROM <table_name>\n", | |
"WHERE <column_name> <condition>\n", | |
"AND <column_name> <condition>\n", | |
"...\n", | |
"AND <column_name> <condition>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"```SQL\n", | |
"SELECT HIP, RV, e_RV\n", | |
"FROM data\n", | |
"WHERE RV != 0\n", | |
"AND e_RV < 1;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Exercise 5:** Write a query that selects the Hipparcos indentifier, the Johnson B apparent magnitute, the Johnson V apparent magnitude, and the variability period. Only select sources whose variability period and B-magnitute are non-zero, and have a B-magnitude less than 10. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, Bmag, Vmag, Per\n", | |
"FROM photometry\n", | |
"WHERE Per != 0\n", | |
"AND Bmag != 0\n", | |
"AND Bmag < 10;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Sorting results" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"```SQL\n", | |
"SELECT <column1_name>, <column2_name>, ..., <columnN_name>\n", | |
"FROM <table_name>\n", | |
"WHERE <column_name> <condition>\n", | |
"AND <column_name> <condition>\n", | |
"...\n", | |
"AND <column_name> <condition>\n", | |
"ORDER BY <column_name>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, Bmag, Vmag, Per\n", | |
"FROM photometry\n", | |
"WHERE Per != 0\n", | |
"AND Bmag != 0\n", | |
"AND Bmag < 10\n", | |
"ORDER BY Vmag;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Exercise 6:** Write a query that selects the Hipparcos identifer, the Johnson B and V apparent magnitudes, and the variability period. Only select sources whose V-magnitude is greater than 11, and whose variability type is 'P'. Order the results by the variability period. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```SQL\n", | |
"SELECT HIP, Bmag, Vmag, Per\n", | |
"FROM photometry\n", | |
"WHERE Hvar = 'P'\n", | |
"AND Bmag > 11\n", | |
"ORDER BY Per;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Joining Tables" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"<img src=\"https://image.slidesharecdn.com/2-background-140403152126-phpapp02/95/database-2-background-30-638.jpg?cb=1396538564\">\n", | |
"\n", | |
"```SQL\n", | |
"SELECT <column1_name>, <column2_name>, ..., <columnN_name>\n", | |
"FROM <table1_name>\n", | |
"JOIN <table2_name> ON <condition>;\n", | |
"```\n", | |
"\n", | |
"for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT data.HIP, pmRA, pmDE, Bmag, Vmag\n", | |
"FROM data\n", | |
"JOIN photometry ON data.HIP = photometry.HIP;\n", | |
"```\n", | |
"\n", | |
"Note that `WHERE`, `AND`, and `ORDER BY` conditions can be applied to the query after the `JOIN`, for example:\n", | |
"\n", | |
"```SQL\n", | |
"SELECT data.HIP, pmRA, pmDE, Bmag, Vmag\n", | |
"FROM data\n", | |
"JOIN photometry ON data.HIP = photometry.HIP\n", | |
"WHERE Vmag > 11\n", | |
"AND Bmag > 11\n", | |
"ORDER BY Vmag;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Exercise 7:** Using `JOIN`, write a query that selects the Hipparcos identifier and the radial velocity from the `data` table, as well as the Johnson B and V apparent magnitudes from the `photometry` table. Only select sources whose radial velocity are greater than 0, and order the results by the Johnson B magnitude." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```SQL\n", | |
"SELECT data.HIP, RV, Bmag, Vmag\n", | |
"FROM data\n", | |
"JOIN photometry ON data.HIP = photometry.HIP\n", | |
"WHERE RV > 0\n", | |
"ORDER BY Bmag;\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"# Programming with Databases using Python" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"We have now learned how to retreive, filter, and sort data from a `SQL` database using the `SQLite` command line interface. The same tasks can be performed in Python, using the `sqlite3` library.\n", | |
"\n", | |
"```python\n", | |
"import sqlite3\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"To connect to the database, we can use the `sqlite3` `connect()` method:\n", | |
"\n", | |
"```python\n", | |
"connection = sqlite3.connect(\"/Users/bourque/Desktop/swc/hipparcos.db\")\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"Next, we set up a `cursor` object, which will later allow us to execute `SQL` commands:\n", | |
"\n", | |
"```python\n", | |
"cursor = connection.cursor()\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"Executing a database query is not as simple as passing the hard-coded query to the `cursor.execute()` command:\n", | |
"\n", | |
"```python\n", | |
"cursor.execute(\"<SQL COMMAND>;\")\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"For example:\n", | |
"\n", | |
"```python\n", | |
"sql_command = \"\"\"\n", | |
" SELECT data.HIP, RV, Bmag, Vmag\n", | |
" FROM data\n", | |
" JOIN photometry ON data.HIP = photometry.HIP\n", | |
" WHERE RV > 0\n", | |
" ORDER BY Bmag\n", | |
" LIMIT 20;\n", | |
" \"\"\"\n", | |
"cursor.execute(sql_command)\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"To retreive the results of the query, we can use the `cursor.fetchall()` method. This returns a list of tuples, with each element in the list being a row of results:\n", | |
"\n", | |
"```python\n", | |
"results = cursor.fetchall()\n", | |
"for result in results:\n", | |
" print(result)\n", | |
"```\n", | |
"\n", | |
"```\n", | |
"(30438, 20.299999237060547, -0.4560000002384186, -0.6200000047683716)\n", | |
"(30438, 20.299999237060547, -0.4560000002384186, -0.6200000047683716)\n", | |
"(30438, 20.299999237060547, -0.4560000002384186, -0.6200000047683716)\n", | |
"(30438, 20.299999237060547, -0.4560000002384186, -0.6200000047683716)\n", | |
"(53, 11.420000076293945, 0.0, 10.960000038146973)\n", | |
"(53, 11.420000076293945, 0.0, 10.960000038146973)\n", | |
"(3937, 6.0, 0.0, 12.010000228881836)\n", | |
"(3937, 6.0, 0.0, 12.010000228881836)\n", | |
"(4725, 57.0, 0.0, 11.800000190734863)\n", | |
"(4725, 57.0, 0.0, 11.800000190734863)\n", | |
"(6115, 130.0, 0.0, 11.630000114440918)\n", | |
"(6115, 130.0, 0.0, 11.630000114440918)\n", | |
"(8939, 24.0, 0.0, 12.369999885559082)\n", | |
"(8939, 24.0, 0.0, 12.369999885559082)\n", | |
"(11517, 9.0, 0.0, 11.850000381469727)\n", | |
"(11517, 9.0, 0.0, 11.850000381469727)\n", | |
"(12886, 28.0, 0.0, 10.800000190734863)\n", | |
"(12886, 28.0, 0.0, 10.800000190734863)\n", | |
"(14574, 11.800000190734863, 0.0, 11.720000267028809)\n", | |
"(14574, 11.800000190734863, 0.0, 11.720000267028809)\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"With some unpacking, we can get lists of the results:\n", | |
"\n", | |
"```python\n", | |
"HIPs = [item[0] for item in results]\n", | |
"RVs = [item[1] for item in results]\n", | |
"Bmags = [item[2] for item in results]\n", | |
"Vmags = [item[3] for item in results]\n", | |
"\n", | |
"print(Vmags)\n", | |
"```\n", | |
"\n", | |
"```\n", | |
"[-0.6200000047683716, -0.6200000047683716, -0.6200000047683716, -0.6200000047683716, 10.960000038146973, 10.960000038146973, 12.010000228881836, 12.010000228881836, 11.800000190734863, 11.800000190734863, 11.630000114440918, 11.630000114440918, 12.369999885559082, 12.369999885559082, 11.850000381469727, 11.850000381469727, 10.800000190734863, 10.800000190734863, 11.720000267028809, 11.720000267028809]\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"```python\n", | |
"import sqlite3\n", | |
"\n", | |
"connection = sqlite3.connect(\"/Users/bourque/Desktop/swc/hipparcos.db\")\n", | |
"cursor = connection.cursor()\n", | |
"\n", | |
"sql_command = \"\"\"\n", | |
" SELECT data.HIP, RV, Bmag, Vmag\n", | |
" FROM data\n", | |
" JOIN photometry ON data.HIP = photometry.HIP\n", | |
" WHERE RV > 0\n", | |
" ORDER BY Bmag\n", | |
" LIMIT 20;\n", | |
" \"\"\"\n", | |
"cursor.execute(sql_command)\n", | |
"results = cursor.fetchall()\n", | |
"\n", | |
"HIPs = [item[0] for item in results]\n", | |
"RVs = [item[1] for item in results]\n", | |
"Bmags = [item[2] for item in results]\n", | |
"Vmags = [item[3] for item in results]\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"**Exercise 8:**\n", | |
"\n", | |
"1. Using `sqlite3` and python, query the database for `B-V` and Absolute V mag (i.e. `VAbsMag`) measurements.\n", | |
" - Select only sources that are have a Distance error (`e_Dist`) that is non-zero and less than 10. \n", | |
" - Also filter out non-zero `B-V` values\n", | |
" - *Hint: You will need to join the `data` and `photometry` tables.*\n", | |
" - *Hint: You will need to wrap 'B-V' in double-quotes in the `SELECT` statement to avoid errors caused by the minus-sign.*\n", | |
"2. Parse the results by creating a list that holds the `B-V` values and a list that holds the `VabsMag` values.\n", | |
"3. Plot the results using `pyplot` *(hint: It may be useful to invery the y-axis using `plt.ylim(plt.ylim()[::-1])`*" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*solution:*\n", | |
"\n", | |
"\n", | |
"```python\n", | |
"import matplotlib.pyplot as plt\n", | |
"\n", | |
"connection = sqlite3.connect(\"/Users/bourque/Desktop/swc/hipparcos.db\")\n", | |
"cursor = connection.cursor()\n", | |
"\n", | |
"sql_command = \"\"\"\n", | |
" SELECT `B-V`, VAbsMag\n", | |
" FROM data\n", | |
" JOIN photometry ON data.HIP = photometry.HIP\n", | |
" WHERE e_Dist != 0\n", | |
" AND e_Dist < 10\n", | |
" AND `B-V` !=0;\n", | |
" \"\"\"\n", | |
"cursor.execute(sql_command)\n", | |
"\n", | |
"results_all = cursor.fetchall()\n", | |
"\n", | |
"BVs = [item[0] for item in results]\n", | |
"VAbsMags = [item[1] for item in results]\n", | |
"\n", | |
"plt.plot(BVs, VAbsMags, '.', ms=4)\n", | |
"plt.xlabel('B-V [mag]')\n", | |
"plt.ylabel('Absolute V [mag]')\n", | |
"plt.ylim(plt.ylim()[::-1])\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"<img src=\"cmd.png\">" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "slide" | |
} | |
}, | |
"source": [ | |
"**(Bonus) Exercise 9:**\n", | |
"\n", | |
"1. Execute the previous query, only this time add conditions to select only white dwarf stars (`\"B-V\" < 0.7`, `VAbsMag > 10`).\n", | |
"2. Overplot the white dwarf stars on the plot from Excercise 8 using a different marker color." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"*Solution:*\n", | |
"\n", | |
"```python\n", | |
"connection = sqlite3.connect(\"/Users/bourque/Desktop/swc/hipparcos.db\")\n", | |
"cursor = connection.cursor()\n", | |
"\n", | |
"sql_command = \"\"\"\n", | |
" SELECT `B-V`, VAbsMag, pmRA, pmDE\n", | |
" FROM data\n", | |
" JOIN photometry ON data.HIP = photometry.HIP\n", | |
" WHERE e_Dist != 0\n", | |
" AND e_Dist < 10\n", | |
" AND `B-V` !=0\n", | |
" AND `B-V` < 0.7\n", | |
" AND VAbsMag > 10;\n", | |
" \"\"\"\n", | |
"cursor.execute(sql_command)\n", | |
"results_wd = cursor.fetchall()\n", | |
"\n", | |
"BVs_wd = [item[0] for item in results_wd]\n", | |
"VAbsMags_wd = [item[1] for item in results_wd]\n", | |
"\n", | |
"plt.plot(BVs, VAbsMags, '.', ms=4)\n", | |
"plt.plot(BVs_wd, VAbsMags_wd, '.', ms=4, color='orange')\n", | |
"plt.xlabel('B-V [mag]')\n", | |
"plt.ylabel('Absolute V [mag]')\n", | |
"plt.ylim(plt.ylim()[::-1])\n", | |
"```" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"slideshow": { | |
"slide_type": "fragment" | |
} | |
}, | |
"source": [ | |
"<img src=\"cmd_wd.png\">" | |
] | |
} | |
], | |
"metadata": { | |
"anaconda-cloud": {}, | |
"celltoolbar": "Slideshow", | |
"kernelspec": { | |
"display_name": "Python [conda env:astroconda3]", | |
"language": "python", | |
"name": "conda-env-astroconda3-py" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.5.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 1 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment