Navigation Menu

Skip to content

Instantly share code, notes, and snippets.

@jonalv
Created January 20, 2021 13:08
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jonalv/6305c13c491b65bf4d8e6cb709a41f18 to your computer and use it in GitHub Desktop.
Save jonalv/6305c13c491b65bf4d8e6cb709a41f18 to your computer and use it in GitHub Desktop.
\documentclass[12pt, oneside]{memoir}
\usepackage{tikz}
\usetikzlibrary{intersections}
\usetikzlibrary{calc}
\usetikzlibrary{er}
\usepackage{letltxmacro}
\usepackage[utf8]{inputenc}
\usepackage[cmintegrals,cmbraces]{newtxmath}
\usepackage[lining]{ebgaramond}
\usepackage{ebgaramond-maths}
\usepackage[T1]{fontenc}
\usepackage{nicefrac}
\usepackage{microtype}
\usepackage{icomma}
\usepackage{amsmath}
\usepackage{mathtools}
\usepackage{pgfplots}
\usepackage{keystroke}
\usepackage{alltt}
\usepackage{cmap}
\usepackage{textcomp}
\usepackage{wasysym}
\usepackage{fancyvrb}
\usepackage{datetime}
\usepackage[textsize=subscriptsize, linecolor=magenta, bordercolor=magenta,
backgroundcolor=magenta, textwidth=7cm]{todonotes}
\makeatletter
\renewcommand{\@todonotes@drawMarginNoteWithLine}{%
\begin{tikzpicture}[remember picture, overlay, baseline=-0.75ex]%
\node [coordinate] (inText) {};%
\end{tikzpicture}%
\marginpar[{% Draw note in left margin
\@todonotes@drawMarginNote{r}%
\@todonotes@drawLineToLeftMargin%
}]{% Draw note in right margin
\@todonotes@drawMarginNote{l}%
\@todonotes@drawLineToRightMargin%
}%
}
\renewcommand{\@todonotes@drawMarginNote}[1]{
\makebox[\marginparwidth][#1]{\begin{tikzpicture}[remember picture,baseline=(X.base)]%
\node(X){\vphantom{X}};%
\draw node[notestyle,font=\@todonotes@sizecommand,anchor=north, text width=2.5cm] (inNote) at (X.north)%
{\@todonotes@text};%
\if@todonotes@authorgiven%
\draw node[notestyle,font=\@todonotes@sizecommand,anchor=north, text width=2.5cm] (inNote) at (X.north)%
{\@todonotes@sizecommand\@todonotes@author};%
\node(Y)[below=of X]{};%
\draw node[notestyle,font=\@todonotes@sizecommand,anchor=north, text width=2.5cm] (inNote) at (X.south)%
{\@todonotes@text};%
\else%
\draw node[notestyle,font=\@todonotes@sizecommand,anchor=north, text width=2.5cm] (inNote) at (X.north)%
{\@todonotes@text};%
\fi%
\end{tikzpicture}%
}}
\makeatother
\LetLtxMacro{\oldtodo}{\todo}
\renewcommand{\todo}[1]{{\color{white}\oldtodo{\textsf{#1}}}}
\renewcommand*{\sideparfont}{\footnotesize\itshape}
\sideparmargin{outer}
\renewcommand*{\sideparform}{\raggedright}
\begin{document}
\sidepar{\miniscule \textsf{Version: \today, \currenttime}}
\vskip 1 \baselineskip
{\let\clearpage\relax\chapter*{Databases exercise 1}
In this exercise you will create a simple SQLite database, populate it with some
data and execute some select statements.\sidepar{
When writing on the command line you should always hit
\Enter to execute, so after this page I will not remind you.}
\section*{Starting the SQLite command line client}
Open Jupyter Lab and then start a terminal by: \texttt{File} $\rightarrow$
\texttt{New} $\rightarrow$ \texttt{Terminal}. You might want to create a
separate directory for this exercise and go into it. If so, write: (don't write
the \$ symbol, it is already there)
\begin{alltt}
\$ mkdir sqlite \Enter \vspace{2pt}
\$ cd sqlite \Enter
\end{alltt}
Next start SQLite using a database file names \texttt{work.db}:\sidepar{We will be using SQLite version 3}
\begin{alltt}
\$ sqlite3 work.db \Enter
\end{alltt}%
When doing SQL it is very helpful to have a text editor
open to write things in first and then copy from and paste into the terminal.
Do \texttt{File} $\rightarrow$ \texttt{New} $\rightarrow$ \texttt{Text File}
and drag that tab next to your terminal tab to get something looking a bit like this:
\sidepar{This is the \texttt{Settings}$ \rightarrow$ \\ \texttt{Jupyter Lab Theme}$ \rightarrow$ \\ \texttt{Jupyter Lab Dark} theme.}
\enlargethispage{3\baselineskip}
\noindent
%\includegraphics[width=1\textwidth]{images/texteditor.png}
Next we will create tables in our new database but first a few words about
SQLite. SQLite is not very strict with data types. Normally in SQL there are a
few standard data types. SQLite will accept them and transfer them into it's
own datatypes. For example most SQL languages have a multitude of
\texttt{Integer} versions of different size. This because in some cases using a
smaller datatype can mean that the database becomes significantly smaller.
SQLite maps all of them to it's own \texttt{Integer} datatype. In this exercise
I will use the more standard SQL datatypes meaning that if you find yourself
with another database system than SQLite it should still look pretty familiar.
\section*{Creating tables}\label{createTableSection}
We are going to create the tables from the lecture. It will be a bit to
write but take it slowly and step by step. We start by creating the
\texttt{Person} table. A \texttt{Person} has a name, an address and a birth
date. We also give each table a column \texttt{id} so that we can easily refer
to a specific row in the table.
\Needspace{8\baselineskip}
\sidepar{also the extra white spaces (indentations) are not important, I have
just added them to make it look good}
We need to write: (Don't write the \texttt{...>} symbols,
SQLite adds them for you when you make a newline,)
\begin{alltt}
sqlite> CREATE TABLE Person (
...> id INTEGER PRIMARY KEY AUTOINCREMENT,
...> name VARCHAR(128),
...> address VARCHAR(256),
...> birthdate DATE
...> );
\end{alltt}\label{CreateTableStatement}
\sidepar{this works differently on different database systems, in fact the
\texttt{AUTOINCREMENT} keyword is not strictly needed in SQL\kern1ptite but it is
supported}The \texttt{AUTOINCREMENT} command means that whenever a new person
is created it will get a new increasing number.
Next, a \texttt{Location} has a name and is situated in a town. Create the
\texttt{Location} table:
\begin{alltt}
sqlite> CREATE TABLE Location (
...> id INTEGER PRIMARY KEY AUTOINCREMENT,
...> name VARCHAR(128),
...> town VARCHAR(128)
...> );
\end{alltt}
Finally a \texttt{Job} ties it together. The \texttt{Job} table specifies which
\texttt{Person} has what title and works at which \texttt{Location}. Create the
\texttt{Job} table:
\Needspace{10\baselineskip}
\begin{alltt}
sqlite> CREATE TABLE Job (
...> id INTEGER PRIMARY KEY AUTOINCREMENT,
...> title VARCHAR(128),
...> Location_id int,
...> Person_id int,
...>
...> FOREIGN KEY (Location_id) REFERENCES Location(id),
...> FOREIGN KEY (Person_id) REFERENCES Person(id)
...> );
\end{alltt}
Now, the \texttt{Job} table has two \texttt{FOREIGN KEY} specifications. They
enforce that we can't create a \texttt{Job} that references a \texttt{Location}
or \texttt{Person} that doesn't exist yet. As we will see later on.
\section*{Populating tables}
\sidepar{Tired of writing? These lines are without `\texttt{...>}' so you should be
able to copy from the pdf.}Next we will add some data to our
tables. Remember that the command for adding rows to a table is called
\texttt{INSERT} and that we must specify which fields we want to add data to.
In fact, any field that we don't specify will be set to \texttt{NULL},
\textit{i.e.}, set to nothing, for that entry. First let's create some rows in
the \texttt{Person} table:
\begin{alltt}
INSERT INTO Person(name, address, birthdate)
VALUES (\texttt{\textquotesingle}Mary Smith\texttt{\textquotesingle}, \texttt{\textquotesingle}Main street 1\texttt{\textquotesingle}, \texttt{\textquotesingle}1970-04-17\texttt{\textquotesingle});
INSERT INTO Person(name, address, birthdate)
VALUES (\texttt{\textquotesingle}John Doe\texttt{\textquotesingle}, \texttt{\textquotesingle}Highway 2\texttt{\textquotesingle}, \texttt{\textquotesingle}1972-07-24\texttt{\textquotesingle});
INSERT INTO Person(name, address, birthdate)
VALUES (\texttt{\textquotesingle}Clara Doe\texttt{\textquotesingle}, \texttt{\textquotesingle}Highway 2\texttt{\textquotesingle}, \texttt{\textquotesingle}1995-11-11\texttt{\textquotesingle});
INSERT INTO Person(name, address, birthdate)
VALUES (\texttt{\textquotesingle}Anna Jones\texttt{\textquotesingle}, \texttt{\textquotesingle}Midway 34\texttt{\textquotesingle}, \texttt{\textquotesingle}1974-02-03\texttt{\textquotesingle});
\end{alltt}
While we are at it let's create the two example rows into the \texttt{Location}
table as well:
\begin{alltt}
INSERT INTO Location(name, town)
VALUES (\texttt{\textquotesingle}Site A\texttt{\textquotesingle}, \texttt{\textquotesingle}New York\texttt{\textquotesingle});
INSERT INTO Location(name, town)
VALUES (\texttt{\textquotesingle}Site B\texttt{\textquotesingle}, \texttt{\textquotesingle}London\texttt{\textquotesingle});
\end{alltt}
All that remains now is the \texttt{Job} table. Remember that it has references
to rows in the other tables. In a sense it connects the other two tables. In
order to create an entry in the \texttt{Job} table we need to references a
\texttt{Person} and a \texttt{Location}. We do this reference by their id
field. What id's for \texttt{Location} and \texttt{Person} are are in the
database then? An easy way to figure this out is to run:
\Needspace{8\baselineskip}
\begin{alltt}
sqlite> SELECT * FROM Person;
{ \fontfamily{pcr}\selectfont
1|Mary Smith|Main street 1|1970-04-17
2|John Doe|Highway 2|1972-07-24
3|Clara Doe|Highway 2|1995-11-11
4|Anna Jones|Midway 34|1974-02-03}
\end{alltt}
However, it would be nice to see the name of the headers, and we can configure
SQLite to also print the columns in a nicer way. Run:
\begin{alltt}
sqlite> .headers on
sqlite> .mode column
\end{alltt}
and the same command as before will look like this instead:
\begin{alltt}
sqlite> SELECT * FROM Person;
{ \fontfamily{pcr}\selectfont
id name address birthdate
---------- ---------- ------------- ----------
1 Mary Smith Main street 1 1970-04-17
2 John Doe Highway 2 1972-07-24
3 Clara Doe Highway 2 1995-11-11
4 Anna Jones Midway 34 1974-02-03}
\end{alltt}
and let's also list the locations:
\begin{alltt}
sqlite> SELECT * FROM Location;
{ \fontfamily{pcr}\selectfont
id name town
---------- ---------- ----------
1 Site A New York
2 Site B London}
\end{alltt}
As you can see from these tables every time a new entry has been inserted a new
id number has been given to it. Now, if we try to insert a row into the
\texttt{Job} table that references a \texttt{Person} by an id that doesn't yet
exist I have said that it shouldn't work. That is not entirely true because SQLite only honors foreign key relations if that feature is activated. Let's try it out by first activating it:
\begin{alltt}
sqlite> PRAGMA foreign_keys = ON;
\end{alltt}
and then try:
\begin{alltt}
sqlite> INSERT INTO Job(title, Location_id, Person_id)
...> VALUES ('CEO', 1, 5);
\end{alltt}
%\sidepar{You should be able to copy and paste these inserts from the pdf
%too.}
What did you get? At least now we can't create a job for a
non-existing \texttt{Person}. That is good to know.
\Needspace{8\baselineskip}
Now, let's add some rows to the \texttt{Jobs} table as well so that we have
something to work with:
\begin{alltt}
INSERT INTO Job(title, Location_id, Person_id)
VALUES (\texttt{\textquotesingle}Secretary\texttt{\textquotesingle}, 1, 1);
INSERT INTO Job(title, Location_id, Person_id)
VALUES (\texttt{\textquotesingle}Secretary\texttt{\textquotesingle}, 2, 2);
INSERT INTO Job(title, Location_id, Person_id)
VALUES (\texttt{\textquotesingle}Intern\texttt{\textquotesingle}, 1, 3);
INSERT INTO Job(title, Location_id, Person_id)
VALUES (\texttt{\textquotesingle}Administrator\texttt{\textquotesingle}, 2, 4);
INSERT INTO Job(title, Location_id, Person_id)
VALUES (\texttt{\textquotesingle}Fill-in\texttt{\textquotesingle}, 1, 4);
\end{alltt}
\needspace{12\baselineskip}
\section*{Working with our data}
\sidepar{``Do I really need to write all this code?'' --- Yes, and no\dots\ I
think that you actually learn by trying it out and doing it yourself though.
Also, if you think of something else that you want to try, Go Right
Ahead! \smiley{}}Finally, we now have a populated database. Maybe just a small example
with only a few rows in each table but at least enough for us to try some
things on. First let's try the \texttt{UPDATE} command. We can pretend that it
turns out that 'Anna Jones' has moved so we need to update her address:
\begin{alltt}
sqlite> UPDATE Person
...> SET address = 'Upper street 12'
...> WHERE name = 'Anna Jones';
\end{alltt}
We should also try the \texttt{DELETE} command. Let's say that it turns out
that this same 'Anna Jones' no longer fills in at the New York office. Then we
should delete that entry from the \texttt{Job} table:
\Needspace{8\baselineskip}
\begin{alltt}
sqlite> DELETE FROM Job
...> WHERE title='Fill-in';
\end{alltt}
Notice that this command removes all fill-ins. In this case only one exists though.
\Needspace{8\baselineskip}
These two commands didn't really show what the data looked like after they were
executed. If we want to do that we need a \texttt{SELECT} statement again. We
want to check the address and the jobs of\kern 4pt `Anna Jones' to see that our changes
took effect. Remembering what our database looks like: \sidepar{You might also
want to have a look at the \texttt{CREATE TABLE} statements on
page~\pageref{CreateTableStatement}}
\begin{center}
\begin{tikzpicture}
\footnotesize
\node[entity] (person) at (0,0) {Person}
child {node [key attribute] {id}}
child {node {address}}
child {node {name}}
child {node {birth date}};
\node[entity] (job) at (3,1) {Job}
child {node [key attribute] {id}}
child {node {title}};
\node[relationship] (has) at ($(person)!0.5!(job)$) {has}
edge (person)
edge (job);
\node[entity] (location) at (6,0) {Location}
child {node [key attribute] {id}}
child {node {name}}
child {node {town}};
\node[relationship] (has2) at ($(location)!0.5!(job)$) {has}
edge (location)
edge (job);
\end{tikzpicture}
\end{center}
\noindent
we know that we need to join the \texttt{Person} and \texttt{Job} table and
that we want the person's name to be \texttt{'Anna Jones'}. Try to write the
query yourself before looking at the solution on next page:
\newpage
\begin{alltt}
sqlite> SELECT p.name, p.address, j.title
...> FROM Person p, Job j
...> WHERE p.name='Anna Jones'
...> AND p.id = j.Person_id;
{ \fontfamily{pcr}\selectfont
name address title
---------- --------------- -------------
Anna Jones Upper street 12 Administrator}
\end{alltt}
We see that our \texttt{UPDATE} of the address had effect and that the
\texttt{Job} entry for 'Fill-in' has been removed.
Our database is quite small so an \texttt{INDEX} will not make a difference
that we can measure but if the database were to grow large it might be useful
with and \texttt{INDEX} on the name of \texttt{Person}. Let's add one:
\begin{alltt}
sqlite> CREATE INDEX name_index ON Person(name);
\end{alltt}
\newpage
\section*{Exercises}
Now it's time for you to try to write your own queries. Here are a few
exercises for you to work on. We will go through the solutions tomorrow.
Tomorrow we will also look at the ChEMBL database which is a database
containing chemical data.
\begin{itemize}
\item [Exercise 1:] Write a \texttt{SELECT} statement that retrieves title,
name and address for every person in the database. You will need to join the
Person table with the Job table to do that.
\item [Exercise 2:] Do you remember the flat file example from the lecture? It
looked a bit like this in the lecture:
\sidepar{The column headers used in the flat file example is not the same as in
the database. If you want to change them, try using Google to find out how to do it.}
\tiny
\begin{Verbatim}[frame=single]
ID NAME HOME ADDRESS BIRTHDATE TITLE OFFICE SITE OFFICE TOWN
1 Mary Smith Main street 1 1970-04-17 Secretary Site B London
2 John Doe Highway 2 1972-07-24 Secretary Site A New York
3 Clara Doe Highway 2 1995-11-11 Intern Site A New York
4 Anna Jones Midway 34 1974-02-03 Administrator Site B London
5 Anna Jones Midway 34 1974-02-03 Fill-in Site A New York
\end{Verbatim}
\normalsize
Write a \texttt{SELECT} statement that outputs those fields. The output won't
be identical, for example remember that we removed the `Fill-in' \texttt{Job}
and changed the address of `Anna Jones'. This time you will need to join three
tables and list quite a few fields to be included.
\item [Exercise 3:] Write a \texttt{SELECT} statement for name, address and
title of all the persons working at the New York office.
\end{itemize}
\end{document}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment