Skip to content

Instantly share code, notes, and snippets.

@iSarCasm
Last active February 26, 2018 15:57
Show Gist options
  • Save iSarCasm/842d93f19094d2d28e4fb1cbf3b68b2d to your computer and use it in GitHub Desktop.
Save iSarCasm/842d93f19094d2d28e4fb1cbf3b68b2d to your computer and use it in GitHub Desktop.

Introduction

  • SQL Structures Query Language. Declarative programming language
    • DDL Data Definition Language
      • Create, Alter, Drop
    • DML Data Manipulation Language
      • Insert, Update, Delete
    • TCL Transaction Control Language
      • begin, commit, rollback, start transaction, set transaction (savepoint, release savepoint, rollback to savepoint)
    • DCL Data Control Language
      • grant, revoke
    • PSQL mainatance commands
      • vacuum, analyze, cluster
  • RDBMS Relational Database Management System
  • Atomic, Consistent, Isolated, Durable ACID
  • SQL Injection
    • PostgreSQL implements a protocol level facility to send the static SQL query text separately from its dynamic arguments.
  • libpq C Driver
  • Why?
    • Bussiness Logic vs Data access
    • Correctness
      • Control of Connection and Transaction semantics
      • Transaction Isolation (https://www.postgresql.org/docs/current/static/transaction-iso.html)
        • PostgreSQL implements 3 out of 4 SQL standard isolation levels
        • Determines which side effects from other transactions your transaction is sensitive to.
          • Read uncommiteed (not in PG)
          • Read committed (default)
          • Repeatable read
          • Serializable
    • Efficiency
      • Reduces amount of round-trips between application and database

Indexing Strategy

Procedural Code and Stored Procedures

(PLpgSQL and SQL)

  • Allows to build a data access API.
  • Shrotens Query text
create or replace function get_all_albums
(
  in artistid bigint,
  out album
  text,
  out duration interval
)
returns setof record
language sql
as $$
  select album.title as album,
      sum(milliseconds) * interval '1 ms' as duration
  from album
      join artist using(artistid)
      left join track using(albumid)
  where artist.artistid = get_all_albums.artistid
group by album
order by album;
$$;

psql

  • Implements REPL (read-eval-print loop)
  • ~/.psqlrc
    • \set [ name [ value [ ... ] ] ]
      • \set ON_ERROR_STOP on
      • \set ON_ERROR_ROLLBACK interactive
      • \set PROMPT1 '%~%x%# '
    • \setenv name [ value ]
    • \pset [ option [ value ] ] (output format)
  • Psql commands (https://www.postgresql.org/docs/current/static/app-psql.html)

SQL

  • SELECT
    • Specifies the columns that result dataset has to include
    • Its a good practice to avoid * in your application
      • hides the intentions
      • provides less meaningful errors
      • makes code review easier
  • FROM
    • Specifies the sources
  • WHERE
    • combine filters
    • and, or, not, not in (NOT IN (1,2,3, NULL) returns no rows), not exists (SUBQUERY)
  • ORDER BY
    • SQL doesn't guarantee any ordering by default.
    • nulls last
    • kNN (k Nearest Neighbours) ordering
    • order by point(lng,lat) <-> point(2.349014, 48.864716) (GiST indexing)
  • LIMIT, OFFSET
    • offset
      • bad query performance
      • FETCH
        • fetch first 3 rows only
        • where row(lap, position) > (1, 3)
  • GROUP BY
    • Used with Aggregates (aka Map/Reduce)
      • sum, count, avg, bool_and
      • filter
      • ERROR: aggregate function calls cannot be nested
    • HAVING
      • filter for groups (like WHERE)
    • grouping sets
      • allows computing several groups in parallel
      • group by grouping sets((drivers.surname),(constructors.name))
    • rollup
      • permutations
    • cube
      • permustations with nulls
  • WITH (Common Table Expression)
  • DISTINCT ON
    • SELECT DISTINCT ON ( expression [, . . . ] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the “first row” of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
  • Set operations
    • UNION
      • UNION ALL (allows duplicates)
    • INTERSECT
    • EXCEPT
  • Comments
    • -- or /* */
  • cast(smth as type)
  • common table expression the with x ()
  • COALESCE returns the first of its arguments that is not null.
  • Set returning functions (https://www.postgresql.org/docs/current/static/functions-srf.html)
    • generate_series()

Nulls

  • Three-valued logic
a b a = b op result
true true true true = true is true
true false false true = false is false
true Null Null true = null is null
false true false false = true is false
false false true false = false is true
false Null Null false = null is null
Null true Null null = true is null
Null false Null null = false is null
Null Null Null null = null is null
  • We can think of null as meaning I don’t know what this is rather than no value here. Say you have in A (left hand) something (hidden) that you don’t know what it is and in B (right hand) something (hidden) that you don’t know what it is. You’re asked if A and B are the same thing. Well, you can’t know that, can you?
  • is distinct from and is not distinct from. Those two operators not only have a very long name, they also pretend that null is the same thing as null.
  • The default value for any column, unless you specify something else, is always null.
  • CONSTRAINT: Not null
  • Use is null instead of = null where you mean it.

Joins

  • noise words OURER / INNER
    • left, right, full = OUTER
    • join = INNER

Use Cases

select cast(calendar.entry as date) as date,
    coalesce(shares, 0) as shares,
    coalesce(trades, 0) as trades,
    to_char(
      coalesce(dollars, 0),
      'L99G999G999G999'
    ) as dollars
  from /*
  * Generate the target month's calendar then LEFT JOIN
  * each day against the factbook dataset, so as to have
  * every day in the result set, wether or not we have a
  * book entry for the day.
  */
  generate_series(date :'start',
    date :'start' + interval '1 month'
    - interval '1 day',2.2 A First Use Case | 17
    interval '1 day'
  )
  as calendar(entry)
    left join factbook
      on factbook.date = calendar.entry
order by date;
with computed_data as
(
  select cast(date as date) as date,
    to_char(date, 'Dy') as day,
    coalesce(dollars, 0) as dollars,
    lag(dollars, 1)
      over(
        partition by extract('isodow' from date)
        order by date
      )
    as last_week_dollars
  from /*
  * Generate the month calendar, plus a week before
  * so that we have values to compare dollars against
  * even for the first week of the month.
  */
    generate_series(date :'start' - interval '1 week',
      date :'start' + interval '1 month'
      - interval '1 day',
      interval '1 day'2.2 A First Use Case | 20
    )
    as calendar(date)
    left join factbook using(date)
)
select date, day,
  to_char(
  coalesce(dollars, 0),
  'L99G999G999G999'
  ) as dollars,
  case when dollars is not null
                    and dollars <> 0
      then round( 100.0
                  * (dollars - last_week_dollars)
                  / dollars
                  , 2)
  end
  as "WoW %"
from computed_data
where date >= date :'start'
order by date;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment