Skip to content

Instantly share code, notes, and snippets.

@Yogendra0Sharma
Created January 17, 2017 12:05
Show Gist options
  • Save Yogendra0Sharma/7cb585ff43b8cb542e9c6fbc65527328 to your computer and use it in GitHub Desktop.
Save Yogendra0Sharma/7cb585ff43b8cb542e9c6fbc65527328 to your computer and use it in GitHub Desktop.
PostgreSQL Basics

PostgreSQL (v9.6.1) Basics

Architectural Fundamentals

  • uses a client/server model
  • a PostgreSQL session consists of:
    • running the postgres database server program, which:
      • manages the database files
      • accepts connections to the database from client applications
      • performs database actions on behalf of the clients
    • connecting the user's client (frontend) application that wants to perform database operations
  • the client and the server can be on different hosts; they communicate over a TCP/IP network connection
  • PostgreSQL server can handle multiple concurrent connections from clients; it forks a new process for each connection for the client to communicate with without intervention from the main postgres process

Creating a Database

$ createdb mydb
$ dropdb mydb

Accessing a Database

-- activate 'psql' for the 'mydb' database
$ psql mydb

The psql program has a number of internal commands that are not SQL commands. They begin with \.

-- get help on the syntax of various PostgreSQL SQL commands
mydb=> \h
-- exit 'psql'
mydb=> \q

The SQL Language

Concepts

  • PostgreSQL is a relational database management system (RDBMS). It is a system for managing data stored in relations.
  • A relation is a mathematical term for table.
  • Each table is a named collections of rows.
  • Each row of a given table has the same set of named columns.
  • Each column is of a specific data type.
  • Columns have a fixed order in each row, rows do NOT have a fixed order within the table.
  • Tables are grouped into databases.
  • A collection of databases managed by a single PostgreSQL server instance constitutes a database cluster.

Creating a New Table

CREATE TABLE weather (
  city        varchar(80),
  temp_lo     int,          -- low temperature
  temp_hi     int,          -- high temperature
  prcp        real,         -- precipitation
  date        date
);
CREATE TABLE cities (
  name        varchar(80),
  location    point         -- a PostgreSQL-specific data type
);
DROP TABLE <tablename>;

Populating a Table with Rows

INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');

-- alternative syntax: list the columns explicitly
INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
  VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');
INSERT INTO weather (date, city, temp_hi, temp_lo)
  VALUES ('1994-11-29', 'Hayward', 54, 37);

Use COPY to load large amounts of data from flat-text files.

COPY weather FROM '/home/user/weather.txt';

Querying a Table

To retrieve data from a table, the table is queried. An SQL statement is used to do this. The statement is divided into:

  • a select list (lists the columns to be returned)
  • a table list (lists the tables from which to retrieve the data)
  • an optional qualification (specifies any restrictions)
-- retrieve all the rows of table 'weather'
SELECT * FROM weather;
-- write expressions and column references in the select list
SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
-- retrieve the weather of San Francisco on rainy days
SELECT * FROM weather
  WHERE city = 'San Francisco' AND prcp > 0.0;
-- request the the results of a query be returned in a sorted order
SELECT * FROM weather
  ORDER BY city, temp_lo;
-- request that duplicate rows be removed
SELECT DISTINCT city FROM weather
  ORDER BY city;

Joins Between Tables

A query that accesses multiple rows of the same or different tables at one time is called a join query.

-- list all the weather records together with the location of the associated city
SELECT *                -- inefficient, for demo purposes only
  FROM weather, cities
  WHERE city = name;

-- list the output columns explicitly (best practice)
SELECT city, temp_lo, temp_hi, prcp, date, location
  FROM weather, cities
  WHERE city = name;

-- if duplicate column names exist, you need to use a qualifier (best practice)
SELECT weather.city, weather.temp_lo, weather.temp_hi,
       weather.prcp, weather.date, cities.location
  FROM weather, cities
  WHERE cities.name = weather.city;

-- alternative form (not common)
SELECT *
  FROM weather
  INNER JOIN cities
  ON (weather.city = cities.name);

-- if no matching row is found we want some "empty values" to be substituted (outer join)
SELECT *
  FROM weather
  LEFT OUTER JOIN cities -- 'LEFT OUTER' means the table on
  --- the left of the JOIN operator will have each of its
  --- rows in the output at least once; when there is no
  --- right-table match, empty (null) values are substituted
  --- for the right-table columns.
  ON (weather.city = cities.name);

-- join the table against itself (self join)
SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
  W2.city, W2.temp_lo AS low, W2.temp_hi AS high
  FROM weather W1, weather W2
  WHERE W1.temp_lo < W2.temp_lo
  AND W1.temp_hi > W2.temp_hi;

-- alias short-cut
SELECT *
  FROM weather w, cities c
  WHERE w.city = c.name;

Aggregate Functions

An aggregate function computes a single result from multiple input rows.

-- find the highest low-temperature reading anywhere
SELECT max(temp_lo) FROM weather;

-- select which city that reading occurred in
-- use a subquery for this
SELECT city
  FROM weather
  WHERE temp_lo = (SELECT max(temp_lo) FROM weather);

-- get the max low temperature observed in each city with GROUP BY
SELECT city, max(temp_lo)
  FROM weather
  GROUP BY city;

-- filter grouped rows with HAVING
SELECT city, max(temp_lo)
  FROM weather
  GROUP BY city
  HAVING max(temp_lo) < 40;

-- cities whose names begin with "S"
SELECT city, max(temp_lo)
  FROM weather
  WHERE city LIKE 'S%' (1)
  GROUP BY city
  HAVING max(temp_lo) < 40;
  • WHERE selects input rows before groups and aggregates are computed (i.e. controls which rows go into the aggregate computation)
  • HAVING always contains aggregate functions

Updates

-- fix temperature readings after November 28, which are off by 2 degrees
UPDATE weather
  SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
  WHERE date > '1994-11-28';

Deletions

-- delete weather data from 'Hayward'
DELETE FROM weather WHERE city = 'Hayward';

-- remove ALL rows from a given table
DELETE FROM <tablename>;

Advanced Features

Views

You can create a view over a commonly used query, which gave a name to the query that you can refer to like an ordinary table.

CREATE VIEW myview AS
  SELECT city, temp_lo, temp_hi, prcp, date, location
    FROM weather, cities
    WHERE city = name;

SELECT * FROM myview;
  • Making liberal use of views is a key aspect of good SQL database design.
  • Views allow you to encapsulate the details of the structure of your tables behind consistent interfaces.

Foreign Keys

You want to make sure that no one can insert rows in the weather table that do not have a matching entry in the cities table. This is called maintaining the referential integrity of your data. PostgreSQL can do this for you.

CREATE TABLE cities (
  city      varchar(80) primary key,
  location  point
);

CREATE TABLE weather (
  city      varchar(80) references cities(city),
  temp_lo   int,
  temp_hi   int,
  prcp      real,
  date      date
);

Transactions

A transaction bundles multiple steps into a single, all-or-nothing operation. A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affect the database at all. Furthermore, a transactional database guarantees that all updated made by a transaction are logged in permanent storage (i.e., on disk) before the transaction is reported complete.

PostgreSQL treats every SQL statement as being executed within a transaction. A group of statements surrounded by BEGIN and COMMIT is called a transaction block.

-- A transaction block
BEGIN;
UPDATE accounts SET balance = balance - 100.00
  WHERE name = 'Alice';
-- etc. etc.
COMMIT;

Savepoints allow you to selectively discard parts of the transaction while committing the rest. After defining a savepoint with SAVEPOINT, you can if need roll back to the savepoint with ROLLBACK TO.

BEGIN;
UPDATE accounts
  SET balance = balance - 100.00
  WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts
  SET balance = balance + 100.00
  WHERE name = 'Bob';
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts
  SET balance = balance + 100.00
  WHERE name = 'Wally';
COMMIT;

Window functions

A window function performs a calculation across a set of table rows that are somehow related to the current row. Unlike aggregate functions, window functions do not cause rows to become grouped into a single output row -- the rows retain their separate identities.

-- compare each employee's salary with the average salary in his/her department
SELECT depname, empno, salary, avg(salary)
  OVER (PARTITION BY depname) -- This clause determines
  --- exactly how the rows are split up for processing by the
  --- window function. PARTITION BY specifies dividing the
  --- rows into partitions of 'depname'. For each row, the
  --- window function is computed across the rows that fall
  --- into the same partition as the current row.
  FROM empsalary;

-- control the order in which the rows are processed by window functions (ORDER BY within OVER)
SELECT depname, empno, salary, rank() -- The rank() function
  --- produces a numerical rank within the current row's
  --- partition for each distinct ORDER BY value, in order`
  --- defined in the ORDER BY clause.
  OVER (PARTITION BY depname ORDER BY salary DESC)
  FROM empsalary;

-- when ORDER BY is omitted, the default window frame consists of all rows in the partition
SELECT salary, sum(salary)
  OVER ()
  FROM empsalary;

-- when ORDER BY is defined, the window frame consists of the first (lowest) salary up through the current once, including any duplicates of the current one
SELECT salary, sum(salary)
  OVER (ORDER BY salary)
  FROM empsalary;

-- filter or group rows after the window calculations are performed
SELECT depname, empno, salary, enroll_date
  FROM
    (SELECT depname, empno, salary, enroll_date,
      rank() OVER (PARITION BY depname ORDER BY salary DESC, empno) AS pos
      FROM empsalary
    ) AS ss
  WHERE pos < 3;

-- using multiple window functions
SELECT sum(salary) OVER w, avg(salary) OVER w
  FROM empsalary
  WINDOW w as (PARTITION BY depname ORDER BY salary DESC);
  • Window functions are permitted only in the SELECT list and the ORDER BY clause of the query.
  • Window functions execute after regular aggregate functions.

Inheritance

CREATE TABLE cities (
  name          text,
  population    real,
  altitude      int     -- (in ft)
);

CREATE TABLE capitals (
  state         char(2)
) INHERITS (cities);
-- all cities, including state capitals, that are located at an altitude over 500 feet
SELECT name, altitude
  FROM cities
  WHERE altitude > 500;

-- all cities that are not state capitals and are situated at an altitude over 500 feet
SELECT name, altitude
  FROM ONLY cities -- ONLY indicates that the query should be
  --- run only over the 'cities' table, and not tables below
  --- 'cities' in the inheritance hierarchy
  WHERE altitude > 500;

Although inheritance is frequently useful, it has not been integrated with unique constraints or foreign keys, which limits its usefulness.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment