- Know what SQL is and what it does
- Explain different kinds of data allowed to be stored in our database
- Be able to understand a Schema and how it represents data in our application
- Understand Primary and Foreign Keys and table relationships
Before this lesson, students should already be able to:
- Install PostgreSQL
- Let's review: at it's simplest, a relational database is a mechanism to store and retrieve data in a tabular form.
- Spreadsheets are a good analogy
- But how do we interact with our database: inserting data, updating data, retrieving data, and deleting data?
- That's where SQL comes in!
- SQL stands for Structured Query Language, and it is a language universally used and adapted to interact with relational databases.
- SQL allows us to store and control data in our application
- With SQL, we can do the following with our data
- Inserting data
- Querying or retrieving data
- Updating or deleting data
- Creating new tables and entire databases
- Control permissions of who can have access to our data
- A database is just a repository to store the data and you need to use systems that dictate how the data will be stored and as a client to interact with the data.
- We call these systems "Database Management Systems", they come in many forms:
- MySQL
- SQLite
- PostgreSQL (what we'll be using!)
- The foundational idea underneath relational databases is a simple but powerful structure.
- Each table is a set of sets, and within a single table all of these sets have the same data structure, containing a list of named fields and their values.
- For convenience, each set within a table is called a row, and each field within that row is part of a larger named column
- It looks a lot like a spreadsheet with named columns and unnamed rows.
- Databases offer much less of that kind of flexibility, and in return can offer tremendous power because of their obsession with neatly ordered data.
- Every row within a table has to have the same structure for its data, and calculations generally take place outside of the tables, not within them.
- Tables just contain data.
- A schema is its structure described in a formal language supported by the database syntax (it varies between SQL and PostgreSQL)
- It represents the name of the Table (our object) and the attributes corresponding data type
- The term "schema" refers to the organization of data as a blueprint of how the database is constructed
Field Name | Data Type |
---|---|
id | :integer |
given_name | :string |
middle_name | :string |
family_name | :string |
date_of_birth | :date |
grade_point_average | :float |
drivers_license | :boolean |
- Depending on the database, we can only declare data to be stored in certain select formats
- Most databases respond to similar types of data, but they may have different names
- When declaring a field name on our Table, we need to give it a certain data type
- Data types can be grouped together in the following way:
- boolean
- character values (strings)
- integer values
- floating point values
- dates/time, datetime
-
strings are most commonly used. They have a max length of 225 characters and are best used for names, addresses, emails, etc
-
when using SQL, use single quote strings
' '
some databases do not recognize double quotes" "
-
but if you're string contains a string i.e.
'it's a nice day'
-
write it as:
"it's a nice day"
or'it''s a nice day'
-
In SQL, strings are referred to in the following names:
-
char: holds a single character. example: 'a'
-
char (#): holds # number of characters. Spaces will be inserted to fill any extra room.
-
varchar (#): holds a maximum of # number of character. Can contain less characters. We will be using this the most often
-
color varchar(24)
- examples: large text files, messaging apps, email
- the text data type is best used for large amounts of text
- text also does not have a max length, is is virtually unlimited
- text should be in single quotes
' '
as well description text
-
when using single vs double quotes remember this
[S]ingle quote for [S]trings, [D]ouble quote for things in the [D]atabase
-
not all databases support the use of double strings
" "
-
to escape single strings, remember they cancel each other out
'it''s been suzie''s turn how''d you forget?'
- examples:
3
56
789
19876
- integers are used for whole numbers
- the main thing we will use integers for are
id
values on objects - when making objects, it is convenient to keep track of them with an ordered unique identifier
- we can accomplish that by adding the
serial
keyword to oursql
command which auto populates the number
- examples:
3.45
0.79
1234.6543918
- use if exact precision isnt needed i.e. GPA's, "random" numbers
- ruby's
rand
method and javascript'sMath.random()
function both return floating point integer values
3.4520928172
0.628172531
- similar to float, but they are geared towards exact precision.
- use these if precision is necessary i.e. dealing with currency
-
examples:
'2004-10-19 10:23:54'
-
calling an object with a datetime value:
user.created_at # => 2016-09-16 10:04:32
-
the format of
datetime
is YYYY-MM-DD HH:MM:SS -
a
timestamp
is a "method" that uses a return value of datetime -
it is commonly used in capturing precise times of when users and data has taken action
null
is essentially nothing. It is a false value, like ruby'snil
- in SQL, to validate data you can append
NOT NULL
to ensure a value is not saved into the database without it - this lets us control what get's put into our database so things don't break
- we will get into creating tables and schemas next, but based on the data types above, if we wanted to make a
user
object for our application, we could model it by setting up a basic table like this
CREATE TABLE user (
id serial PRIMARY KEY,
name varchar (50) NOT NULL,
nickname varchar (25) NOT NULL,
description text,
created_at timestamp
);
-
when modeling data we use integer values called
id
to keep track of the object number we created. -
in the above example where we create our user table we write it like this:
id serial PRIMARY KEY
-
3 things are done here
-
we give a
field name
calledid
onto our user table -
we give a "method" called
serial
which allows it to auto-increment meaning we don't have to manually pass it a value -
and we define it as a
primary key
-
if we did not want to add the
serial
option, but still wanted to ensure anid
was made and is unique, we could write it like this
CREATE TABLE user (
ID INT NOT NULL,
NAME VARCHAR (35) NOT NULL,
PRIMARY KEY (ID)
);
-
If you were looking at a giant spreadsheet of data, and someone told you to look for data in a certain row/column, what would be the first thing you would ask for?
-
You can think of the primary key as an address. If the rows in a table were mailboxes, then the primary key would be the listing of street addresses.
- In order for a table to qualify as a relational table it must have a primary key
- The primary key consists of one or more columns whose data contained within is used to uniquely identify each row in the table.
- In order to be a primary key, several conditions must hold true.
- First, as mentioned, the columns must be unique.
- Second, no value in the columns can be blank or NULL.
- When defining a table you specify the primary key. A table has just one primary key, and its definition is mandatory.
- The primary key for each table is stored in an index. The index is used to enforce the uniqueness requirement. It also makes it easy for foreign key values to refer back to corresponding primary key values
-
A database index allows a query to efficiently retrieve data from a database.
-
Indexes are related to specific tables and consist of one or more keys. A table can have more than one index built from it. The keys are a fancy term for the values we want to look up in the index.
-
For this example consider the index in the back of a book:
-
The keys for this index are the subject words we reference. The index entries consist of the key and page number.
-
The keys are in alphabetical order, which makes really easy for us to scan the index, find an entry, note the pages, and then flip the book to the correct pages.
- A foreign key is a set of one or more columns in a table that refers to the primary key in another table.
- There isn’t any special code, configurations, or table definitions you need to place to officially “designate” a foreign key.
- Foreign keys allow us to find relationships between groups of objects
- A FOREIGN KEY in one table points to a PRIMARY KEY in another table.
- Note that the "P_Id" column in the "Orders" table points to the "P_Id" column in the "Persons" table.
- The "P_Id" column in the "Persons" table is the PRIMARY KEY in the "Persons" table.
- The "P_Id" column in the "Orders" table is a FOREIGN KEY in the "Orders" table.
- The FOREIGN KEY constraint is used to prevent actions that would destroy links between tables.
- The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key column, because it has to be one of the values contained in the table it points to.
- foreign keys are stored as indexes, as they are commonly used with advanced operations such as grouping and joining objects
- the same rule and concepts for
primary key
indexes related to foreign keys