Skip to content

Instantly share code, notes, and snippets.

@hjwp
Created August 1, 2014 11:53
Show Gist options
  • Save hjwp/56972b110c969ee70628 to your computer and use it in GitHub Desktop.
Save hjwp/56972b110c969ee70628 to your computer and use it in GitHub Desktop.
Building a postgres docker image...
FROM ubuntu:trusty
MAINTAINER developers@pythonanywhere.com
# Add the PostgreSQL PGP key to verify their Debian packages.
# It should be the same key as https://www.postgresql.org/media/keys/ACCC4CF8.asc
RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys B97B0AFCAA1A47F044F244A07FCC7D46ACCC4CF8
# Add PostgreSQL's repository. It contains the most recent stable release
RUN echo "deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main 9.4" > /etc/apt/sources.list.d/pgdg.list
# Update the Ubuntu and PostgreSQL repository indexes
RUN apt-get update
# Install PostgreSQL 9.4 from repository#
RUN apt-get -y -q install postgresql-9.4 postgresql-client-9.4 postgresql-contrib-9.4
# Install some useful postgres add-ons
RUN apt-get -y -q install postgresql-plpython-9.4 # postgresql-9.4-postgis-2.1 does not exist yet
# Install postgis by compiling from source - several steps
#
# Start with dependencies
RUN apt-get -y -q install postgresql-server-dev-9.4
RUN apt-get -y -q install autoconf build-essential wget
# surely some of these are unnecessary!
RUN apt-get -y -q install libgeos-c1 libgdal-dev libproj-dev libjson-c-dev libxml2-dev libxml2-utils xsltproc docbook-xsl docbook-mathml
# postgis 2.1.3 is broken against libjson-c-dev, so compile one instead
RUN wget https://s3.amazonaws.com/json-c_releases/releases/json-c-0.11.tar.gz -O json-c-0.11.tar.gz
RUN tar -xvf json-c-0.11.tar.gz
RUN cd json-c-0.11/ && ./configure
RUN cd json-c-0.11/ && make
RUN cd json-c-0.11/ && make install
# Now download and compile postgis itself
RUN wget http://download.osgeo.org/postgis/source/postgis-2.1.3.tar.gz -O postgis-2.1.3.tar.gz
RUN tar xvfz postgis-2.1.3.tar.gz
RUN cd postgis-2.1.3 && ./configure
RUN cd postgis-2.1.3 && make
RUN cd postgis-2.1.3 && make install
# Finally, create all locales, so users can have any language they like...
RUN cp /usr/share/i18n/SUPPORTED /var/lib/locales/supported.d/local
RUN locale-gen
RUN update-locale "en_US.UTF-8"
# Expose the PostgreSQL port
EXPOSE 5432
# two main config files
ADD postgres_config_files/postgresql.conf /etc/postgresql/9.4/main/postgresql.conf
ADD postgres_config_files/pg_hba.conf /etc/postgresql/9.4/main/pg_hba.conf
# And set the user to be postgres for future commands in this container
USER postgres
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all postgres peer
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 md5
# IPv6 local connections:
host all all ::1/128 md5
# password IPv4 connections from any host:
host all all 0.0.0.0/0 md5
# -------------------------------------------
# Simple PostgreSQL Configuration File v. 9.3 (partially ported to 9.4)
# -------------------------------------------
# This file provides a simple configuration with the most common options
# which most users need to modify for running PostgreSQL in production,
# including extensive notes on how to set each of these. If your configuration
# needs are more specific, then use the standard postgresql.conf, or add
# additional configuration options to the bottom of this file.
#
# This file is re-read when you send a SIGHUP to the server, or on a full
# restart. Note that on a SIGHUP simply recommenting the settings is not
# enough to reset to default value; the last explicit value you set will
# still be in effect.
#
# AvRAM: Several of the formulas below ask for "AvRAM", which is short for
# "Available RAM". This refers to the amount of memory which is available for
# running PostgreSQL. On a dedicated PostgreSQL server, you can use the total
# system RAM, but on shared servers you need to estimate what portion of RAM
# is usually available for PostgreSQL.
#
# Each setting below lists one recommended starting setting, followed by
# several alternate settings which are commented out. If multiple settings
# are uncommented, the *last* one will take effect.
# listen_addresses
# ------------------------
# listen_addresses takes a list of network interfaces the Postmaster will
# listen on. The setting below, '*', listens on all interfaces, and is only
# appropriate for development servers and initial setup. Otherwise, it
# should be restrictively set to only specific addresses. Note that most
# PostgreSQL access control settings are in the pg_hba.conf file.
listen_addresses = '*' # all interfaces
# listen_addresses = 'localhost' # unix sockets and loopback only
# listen_addresses = 'localhost,192.168.1.1' # local and one external interface
# max_connections
# ------------------------
# An integer setting a limit on the number of new connection processes which
# PostgreSQL will create. Should be set to the maximum number of connections
# which you expect to need at peak load. Note that each connection uses
# shared_buffer memory, as well as additional non-shared memory, so be careful
# not to run the system out of memory. In general, if you need more than 200
# connections, you should probably be making more use of connection pooling.
#
# Note that by default 3 connections are reserved for autovacuum and
# administration, and more may be used by replication.
# max_connections = 100 # small server
# max_connections = 300 # web application database
# max_connections = 40 # data warehousing database
# demo server setting
max_connections = 20
# shared_buffers
# ------------------------
# A memory quantity defining PostgreSQL's "dedicated" RAM, which is used
# for connection control, active operations, and more. However, since
# PostgreSQL also needs free RAM for file system buffers, sorts and
# maintenance operations, it is not advisable to set shared_buffers to a
# majority of RAM.
#
# The setting below is a formula. Calculate the resulting value, then
# uncomment it. Values should be expressed in kB, MB or GB.
# shared_buffers = ( AvRAM / 4 ) limit 8GB
# shared_buffers = 512MB # basic 2GB web server
# shared_buffers = 8GB # 64-bit server with 32GB RAM
# eg. 384MB / 4 = 96MB
shared_buffers = 96MB
# work_mem
# ------------------------
# This memory quantity sets the limit for the amount of non-shared RAM
# available for each query operation, including sorts and hashes. This limit
# acts as a primitive resource control, preventing the server from going
# into swap due to overallocation. Note that this is non-shared RAM per
# *operation*, which means large complex queries can use multple times
# this amount. Also, work_mem is allocated by powers of two, so round
# to the nearest binary step.
# The setting below is a formula. Calculate the resulting value, then
# uncomment it. Values should be expressed in kB, MB or GB. Note that
# it's seldom effective to set work_mem over 2GB, and that you cannot do
# so at all on 32-bit servers. Set for the maximum connections you
# expect to actually use if max_connections is higher than that.
# Most web applications should use the formula below, because their
# queries often require no work_mem.
# work_mem = ( AvRAM / max_connections ) ROUND DOWN to 2^x
# work_mem = 2MB # for 2GB server with 700 connections
# Formula for most BI/DW applications, or others running many complex
# queries:
# work_mem = ( AvRAM / ( 2 * max_connections ) ) ROUND DOWN to 2^x
# work_mem = 128MB # DW server with 32GB RAM and 40 connections
# eg. 384MB / ( 2 * 20 ) = 8MB
work_mem = 8MB
# maintenance_work_mem
# -------------------------
# This memory value sets the limit for the amount that autovacuum,
# manual vacuum, bulk index build and other maintenance routines are
# permitted to use. Setting it to a moderately high value will increase
# the efficiency of vacuum and other operations.
# Now that we have multiple autovacuum workers, you need to also
# consider the effect of multiple allocations of this memory. If
# you have some reason to raise the number of workers, decrease
# this setting proportionally.
# The setting below is a formula. Calculate the resulting value, then # uncomment it. Values should be expressed in kB, MB or GB.
# If you plan to run autovacuum (see below):
# maintenance_work_mem = ( AvRAM / 16 ) ROUND DOWN to 2^x
# maintenance_work_mem = 128MB #webserver with 2GB RAM
# 384MB / 16 = 24MB, round down to 16MB
maintenance_work_mem = 16MB
# If not running autovacuum:
# maintenance_work_mem = ( AvRAM / 8 ) ROUND DOWN to 2^x
# maintenance_work_mem = 2GB #DW server with 32GB RAM
# synchronous_commit
# -------------------------
# This boolean setting controls whether or not all of your transactions
# are guaranteed to be written to disk when they commit. If you are
# willing to lose up to 0.4 seconds of data in the event of an unexpected
# shutdown (as many web applications are), then you can gain substantial
# performance benefits by turning off synchronous commit. For most
# applications, however, this setting is better used on a per-session
# basis. synchronous_commit = off is essential for Amazon EC2/EBS.
synchronous_commit = on #most applications
# synchronous_commit = off #if speed is more important than data
# wal_buffers
# -------------------------
# this memory setting defines how much buffer space is available for
# the Write Ahead Log. Set too low, it can become a bottleneck on
# inserts and updates; there is no benefit to setting it high, however.
# To date, the maximum effective size has been 32MB.
# As with some of the other settings above, may require increasing
# some kernel parameters.
# wal_buffers = 32MB
# eg set low
wal_buffers = 8MB
# checkpoint_segments
# -------------------------
# This integer defines the maximum number of 8MB transaction log segments
# PostgreSQL will create before forcing a checkpoint. For most
# high-volume OTLP databases and DW you will want to increase this
# setting significantly. Alternately, just wait for checkpoint
# warnings in the log before increasing this.
#
# Increasing this setting can make recovery in the event of unexpected
# shutdown take longer.
#
# Maximum disk space required is (checkpoint_segments * 2 + 1) * 16MB
# plus extra for wal_keep_segments if doing replication.
# Make sure you have that much available before setting it.
# checkpoint_segments = 16 #normal small-medium database
# checkpoint_segments = 96 #high-volume OLTP database
# checkpoint_segments = 256 #heavy-ETL large database
# small demo server
checkpoint_segments = 8
# checkpoint_completion_target
# ----------------------------
# This setting controls whether checkpoints are done gradually or
# all-at-once. Older filesystems require all-at-once. Newer ones
# work better with gradually.
# if using ext2, ext3, UFS, HFS+ or NTFS
checkpoint_completion_target = 0.0
# if using ext4, XFS, ZFS or FreeBSD UFS
# checkpoint_completion_target = 0.9
# autovacuum
# ---------------------------
# autovacuum turns on a maintenance daemon which runs in the background,
# periodically cleaning up your tables and indexes. The only reason to turn
# autovacuum off is for large batch loads (ETL).
autovacuum = on #most databases
# autovacuum = off #large DW
# effective_cache_size
# --------------------------
# This memory setting tells the PostgreSQL query planner how much RAM
# is estimated to be available for caching data, in both shared_buffers and
# in the filesystem cache. This setting just helps the planner make good
# cost estimates; it does not actually allocate the memory.
# The setting below is a formula. Calculate the resulting value, then # uncomment it.
# effective_cache_size = ( AvRAM * 0.75 )
# 384MB * 0.75 = 288MB
effective_cache_size = 288MB
# replication
# ------------
# With 9.X, one of the things most users will want to set is to turn on
# replication. Note that you will also need to change settings in
# recovery.conf. The settings below should be applied on both the
# master and the standby.
# Streaming replication only
# wal_level = hot_standby
# max_wal_senders = 5
# wal_keep_segments = 8 #requires disk space
# hot_standby = on
# Log shipping replication, add as well:
# archive_mode = on
# archive_command = '/setup/postgres/archive_logs.sh %p %f'
# log_destination & logging settings
# --------------------------
# This list value determines where PostgreSQL's logs are sent. What
# setting to use really depends on your server room setup and the
# production status and OS of your server.
#
# Note that there are several dozen settings on what and how often
# to log; these will not be covered in detail in this quick
# configuration file. Instead, several common combinations are
# given.
# Syslog setup for centralized monitoring
# log_destination = 'syslog'
# syslog_facility = 'LOCAL0' #local syslog
# Windows
# log_destination = 'eventlog'
# Private PostgreSQL Log, rotates weekly
log_destination = 'stderr'
logging_collector = on
log_directory = '/var/log/postgresql/'
log_filename = 'postgresql-%a'
log_truncate_on_rotation = on
# query activity logging example
# for demo
#log_destination = 'csvlog'
#logging_collector = on
#log_directory = '/var/log/postgresql/'
#log_filename = 'activitylog-%a'
#log_truncate_on_rotation = on
#log_min_duration_statement = 0
#log_lock_waits = on
#log_connections = on
#log_disconnections = on
# CSV logging for pgbadger log analyzer
# summarizes all queries slower than 100ms
# Warning: may generate large log files
# log_destination = 'csvlog'
# logging_collector = on
# log_directory = '/path/to/log/dir'
# log_min_duration_statement = 100
# log_rotation_size = 1GB
# CSV logging for collecting multiple performance statistics.
# Warning: this much logging will generate many log
# files and affect performance.
# log_destination = 'csvlog'
# logging_collector = on
# log_directory = '/path/to/log/dir'
# log_temp_files = 8kB
# log_lock_waits = on
# deadlock_timeout = 500ms
# log_connections = on
# log_disconnections = on
# log_min_duration_statement = 0
# log_rotation_size = 1GB
# File Locations
# --------------
# These are specific to Ubuntu and required
# Do not change them if you are using the standard Ubuntu packages.
# and data directory locations. Note that even if you change the
# data directory location, you won't want to change the other items
data_directory = '/var/lib/postgresql/9.4/main'
hba_file = '/etc/postgresql/9.4/main/pg_hba.conf'
ident_file = '/etc/postgresql/9.4/main/pg_ident.conf'
external_pid_file = '/var/run/postgresql/9.4-main.pid'
unix_socket_directories = '/var/run/postgresql'
port = 5432
timezone = 'UTC'
log_timezone = 'UTC'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment