unused indices
Indices take space and also require time to update. Find out if any of them are unused:
SELECT
schemaname || '.' || relname AS table,
indexrelname AS index,
pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
idx_scan as index_scans
FROM pg_stat_user_indexes ui
JOIN pg_index i ON ui.indexrelid = i.indexrelid
WHERE NOT indisunique AND idx_scan < 50 AND pg_relation_size(relid) > 5 * 8192
ORDER BY pg_relation_size(i.indexrelid) / nullif(idx_scan, 0) DESC NULLS FIRST,
pg_relation_size(i.indexrelid) DESC;
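Once an index is confirmed unused, it can be dropped without blocking writes; a minimal sketch, with index_name as a placeholder (note that DROP INDEX CONCURRENTLY can't run inside a transaction block):
-- drop an unused index without taking a write-blocking lock
DROP INDEX CONCURRENTLY IF EXISTS index_name;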
approximate row count
Sometimes running SELECT COUNT(1) FROM table_name; is prohibitively expensive on big tables, but we can get an approximation with the query below.
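One common approach (not necessarily the exact query this note originally referred to) is to read the planner's estimate from pg_class, which ANALYZE and autovacuum keep reasonably current; table_name is a placeholder:
-- approximate row count from planner statistics (refresh with ANALYZE if stale)
SELECT reltuples::bigint AS approximate_row_count
FROM pg_class
WHERE relname = 'table_name';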
most time consuming queries
SELECT
left(query, 200) AS short_query,
round(total_time::numeric, 2) AS total_time,
calls, round(mean_time::numeric, 2) AS mean, round((100 * total_time / sum(total_time::numeric) OVER ())::numeric, 2) AS percentage_cpu
FROM
pg_stat_statements
ORDER BY
total_time DESC
LIMIT 20;
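Note: this query needs the pg_stat_statements extension installed (see the CREATE EXTENSION snippet further down); on Postgres 13 and newer the columns are named total_exec_time and mean_exec_time instead of total_time and mean_time.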

Connection pooling with pgbouncer
Postgres isn't great at dealing with many connections (performance issues start happening after 500+ connections).
The issue is especially noticeable around deployments (depending on the deployment strategy), when new servers come online while old servers are still up responding to requests.
If you're using something like blue/green deployments, you can effectively double your connection count when deployments happen.
One solution to this problem is to use a connection pooler: a program that sits in between your application and your db server and helps reuse/multiplex connections.
In general, the postgres community seems to overwhelmingly recommend pgbouncer to address this issue. Other (less popular) options are available as well.
Version 1.21 of PgBouncer added a long awaited feature: support for prepared statements inside of transaction mode. Prior to this, one had to choose between using prepared statements (a performance win) and using PgBouncer's transaction mode (also a large performance win). Now we can have our cake and eat it too 🎂

In Heroku, we can enable pgbouncer via Server Side connection pooling (managed by Heroku), which automatically adds a new database that we can connect to, named pgbouncer.

Monitoring
Heroku won't print any pgbouncer stats to the log drain, but it creates a new table named pgbouncer that includes several stats that we can capture to NewRelic/Datadog/etc.
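On a standalone PgBouncer install, the same kind of stats are available from PgBouncer's admin console; a sketch, assuming the default setup where PgBouncer listens on port 6432 and exposes the special pgbouncer admin database (connect with e.g. psql -p 6432 pgbouncer):
SHOW STATS;
SHOW POOLS;
SHOW STATS reports per-database traffic counters, and SHOW POOLS reports client/server connection counts per pool.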

rollback plan
Keep the previous database around in case we need to roll back; we can promote the old database at any moment:
heroku pg:promote <old db name (new alias)> -a app-name

after the upgrade, provision a new follower to keep H/A
It’s possible for the new follower creation to fail because the new primary is “too new”.
In that case, try again in a few hours.
heroku addons:create heroku-postgresql:standard-8 --follow DATABASE_URL -a app-name
after promotion, run ANALYZE on the db to recalculate statistics and ensure the query planner doesn't pick bad query plans and queries remain fast
Promoting a replica while there is activity on the primary database means that some of the data might not have a chance to replicate, so there could be data loss!
As part of the pg:upgrade process, Heroku Postgres runs ANALYZE on your database, which recalculates statistics to make sure the Postgres query planner has up-to-date information even after a version upgrade.
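If you ever need to run it manually (for example after promoting a replica outside of pg:upgrade), a minimal example from a psql session on the new primary:
-- recompute planner statistics for every table in the current database
ANALYZE VERBOSE;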
Promoting an RDS read replica
Before promoting an RDS replica, you need to make sure that no new data is being written to the primary instance.
A replica is ready to be promoted when there are no active transactions on the primary instance and the replica lag is zero.
Active transactions and connected clients
Look for any activity on the primary and ensure there is none:
-- active transactions (to run on primary)
SELECT * FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid() -- exclude this query itself
  AND (query LIKE '%INSERT%' OR query LIKE '%UPDATE%' OR query LIKE '%DELETE%');
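To also check for connected clients (as the heading above mentions), a quick sketch that counts sessions other than the current one:
-- connected clients (to run on primary)
SELECT count(*) AS connected_clients
FROM pg_stat_activity
WHERE pid <> pg_backend_pid();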
Replication lag
Ensure replication lag is zero, which means that the replica is caught up with the primary and can be promoted safely:
-- replication lag (to run on replica)
SELECT EXTRACT(EPOCH FROM NOW() - pg_last_xact_replay_timestamp()) AS replication_lag;
Running this on the primary instance returns null, since pg_last_xact_replay_timestamp() only returns a value while the server is in recovery (i.e. on a replica).
Once there’s no activity and replica lag is zero, the replica can be promoted.

Upgrading a postgres database in Heroku
Based on https://devcenter.heroku.com/articles/upgrading-heroku-postgres-databases#upgrading-with-pg-upgrade
With a database size of ~1 TB, this process takes about 15 minutes of downtime.
Provision a new follower database
heroku addons:create heroku-postgresql:premium-8 --follow DATABASE_URL -a app-name
heroku pg:wait -a app-name
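# stop background workers and put the app in maintenance mode so nothing writes during the upgrade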
heroku ps:scale worker=0 -a app-name
heroku maintenance:on -a app-name
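# confirm the follower is fully caught up ("Behind By: 0 commits") before upgrading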
heroku pg:info -a app-name
#=> HEROKU_POSTGRESQL_PINK_URL Plan: Premium 8 Status: available ...
#=> Following: HEROKU_POSTGRESQL_PURPLE_URL (DATABASE_URL)
#=> Behind By: 0 commits
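# upgrade the follower to the new Postgres version, then promote it to become the primary (DATABASE_URL)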
heroku pg:upgrade <NEW_FOLLOWER_DATABASE_URL> -a app-name
heroku pg:wait -a app-name
heroku pg:promote <NEW_FOLLOWER_DATABASE_URL> -a app-name
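# bring the app back out of maintenance and scale workers back up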
heroku maintenance:off -a app-name
heroku ps:scale worker=1 -a app-name

blocking queries
SELECT bl.pid AS blocked_pid,
ka.query AS blocking_statement,
now() - ka.query_start AS blocking_duration,
kl.pid AS blocking_pid,
a.query AS blocked_statement,
now() - a.query_start AS blocked_duration
FROM pg_catalog.pg_locks bl
JOIN pg_catalog.pg_stat_activity a
ON bl.pid = a.pid
JOIN pg_catalog.pg_locks kl
JOIN pg_catalog.pg_stat_activity ka
ON kl.pid = ka.pid
ON bl.transactionid = kl.transactionid AND bl.pid != kl.pid
WHERE NOT bl.granted;
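On Postgres 9.6 and newer, pg_blocking_pids() gives a simpler view of the same information; a minimal sketch:
-- sessions that are currently blocked, and the pids blocking them
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       query AS blocked_statement
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;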
terminate long running queries
/* FIND THE PID */
SELECT * FROM pg_stat_activity WHERE state = 'active';
/* KILL GRACEFULLY */
SELECT pg_cancel_backend(PID);
/* KILL FORCEFULLY */
SELECT pg_terminate_backend(PID);

long running queries
SELECT
pid,
now() - pg_stat_activity.query_start AS duration,
query AS query
FROM
pg_stat_activity
WHERE
pg_stat_activity.query <> ''::text
AND state <> 'idle'
AND now() - pg_stat_activity.query_start > interval '50 ms'
ORDER BY
now() - pg_stat_activity.query_start DESC;

which tables are currently autovacuuming
SELECT pid, query FROM pg_stat_activity WHERE query LIKE 'autovacuum: %';
pid │ query
───────┼──────────────────────────────────────────────────────────
29431 │ autovacuum: VACUUM public.scores (to prevent wraparound)
(1 row)
table bloat
WITH constants AS (
SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 4 AS ma
), bloat_info AS (
SELECT
ma,bs,schemaname,tablename,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
schemaname, tablename, hdr, ma, bs,
SUM((1-null_frac)*avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
) AS nullhdr
FROM pg_stats s, constants
GROUP BY 1,2,3,4,5
) AS foo
), table_bloat AS (
SELECT
schemaname, tablename, cc.relpages, bs,
CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
FROM bloat_info
JOIN pg_class cc ON cc.relname = bloat_info.tablename
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
), index_bloat AS (
SELECT
schemaname, tablename, bs,
COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols
FROM bloat_info
JOIN pg_class cc ON cc.relname = bloat_info.tablename
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = bloat_info.schemaname AND nn.nspname <> 'information_schema'
JOIN pg_index i ON indrelid = cc.oid
JOIN pg_class c2 ON c2.oid = i.indexrelid
)
SELECT
type, schemaname, object_name, bloat, pg_size_pretty(raw_waste) as waste
FROM
(SELECT
'table' as type,
schemaname,
tablename as object_name,
ROUND(CASE WHEN otta=0 THEN 0.0 ELSE table_bloat.relpages/otta::numeric END,1) AS bloat,
CASE WHEN relpages < otta THEN '0' ELSE (bs*(table_bloat.relpages-otta)::bigint)::bigint END AS raw_waste
FROM
table_bloat
UNION
SELECT
'index' as type,
schemaname,
tablename || '::' || iname as object_name,
ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::numeric END,1) AS bloat,
CASE WHEN ipages < iotta THEN '0' ELSE (bs*(ipages-iotta))::bigint END AS raw_waste
FROM
index_bloat) bloat_summary
ORDER BY raw_waste DESC, bloat DESC;
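Once bloat has been identified, plain VACUUM only marks the dead space as reusable; actually shrinking the files requires a rewrite. A minimal sketch, with table_name as a placeholder (tools like pg_repack can do the same rewrite without holding the lock for the whole duration):
-- rewrites the whole table and its indexes; takes an ACCESS EXCLUSIVE lock
VACUUM FULL table_name;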
vacuum stats
WITH table_opts AS (
SELECT
pg_class.oid, relname, nspname, array_to_string(reloptions, '') AS relopts
FROM
pg_class INNER JOIN pg_namespace ns ON relnamespace = ns.oid
), vacuum_settings AS (
SELECT
oid, relname, nspname,
CASE
WHEN relopts LIKE '%autovacuum_vacuum_threshold%'
THEN substring(relopts, '.*autovacuum_vacuum_threshold=([0-9.]+).*')::integer
ELSE current_setting('autovacuum_vacuum_threshold')::integer
END AS autovacuum_vacuum_threshold,
CASE
WHEN relopts LIKE '%autovacuum_vacuum_scale_factor%'
THEN substring(relopts, '.*autovacuum_vacuum_scale_factor=([0-9.]+).*')::real
ELSE current_setting('autovacuum_vacuum_scale_factor')::real
END AS autovacuum_vacuum_scale_factor
FROM
table_opts
)
SELECT
vacuum_settings.nspname AS schema,
vacuum_settings.relname AS table,
to_char(psut.last_vacuum, 'YYYY-MM-DD HH24:MI') AS last_vacuum,
to_char(psut.last_autovacuum, 'YYYY-MM-DD HH24:MI') AS last_autovacuum,
to_char(pg_class.reltuples, '9G999G999G999') AS rowcount,
to_char(psut.n_dead_tup, '9G999G999G999') AS dead_rowcount,
to_char(autovacuum_vacuum_threshold
+ (autovacuum_vacuum_scale_factor::numeric * pg_class.reltuples), '9G999G999G999') AS autovacuum_threshold,
CASE
WHEN autovacuum_vacuum_threshold + (autovacuum_vacuum_scale_factor::numeric * pg_class.reltuples) < psut.n_dead_tup
THEN 'yes'
END AS expect_autovacuum
FROM
pg_stat_user_tables psut INNER JOIN pg_class ON psut.relid = pg_class.oid
INNER JOIN vacuum_settings ON pg_class.oid = vacuum_settings.oid
ORDER BY 1;
cache hit
SELECT
'index hit rate' AS name,
(sum(idx_blks_hit)) / nullif(sum(idx_blks_hit + idx_blks_read),0) AS ratio
FROM pg_statio_user_indexes
UNION ALL
SELECT
'table hit rate' AS name,
sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read),0) AS ratio
FROM pg_statio_user_tables;
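A commonly cited rule of thumb is to aim for ratios around 0.99 or higher; consistently lower values usually mean the working set no longer fits in memory.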
sequential scans
Find which tables are doing sequential scans.
Sequential scans are not inherently bad as long as the table is small; on big tables, they are a performance killer.
SELECT relname AS name,
seq_scan as count
FROM
pg_stat_user_tables
ORDER BY seq_scan DESC;
total index size
SELECT
pg_size_pretty(sum(c.relpages::bigint*8192)::bigint) AS size
FROM
pg_class c
LEFT JOIN
pg_namespace n ON (n.oid = c.relnamespace)
WHERE
n.nspname NOT IN ('pg_catalog', 'information_schema')
AND n.nspname !~ '^pg_toast'
AND c.relkind='i';
The most time consuming queries query above requires the pg_stat_statements extension to be installed:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA <schema_name>;
index size per table
SELECT c.relname AS table,
pg_size_pretty(pg_indexes_size(c.oid)) AS index_size
FROM pg_class c
LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
AND n.nspname !~ '^pg_toast'
AND c.relkind='r'
ORDER BY pg_indexes_size(c.oid) DESC;
total table size
SELECT c.relname AS name,
pg_size_pretty(pg_total_relation_size(c.oid)) AS size
FROM pg_class c
LEFT JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
AND n.nspname !~ '^pg_toast'
AND c.relkind='r'
ORDER BY pg_total_relation_size(c.oid) DESC;
index usage
SELECT relname,
CASE idx_scan
WHEN 0 THEN 'Insufficient data'
ELSE (100 * idx_scan / (seq_scan + idx_scan))::text
END percent_of_times_index_used,
n_live_tup rows_in_table
FROM
pg_stat_user_tables
ORDER BY
n_live_tup DESC;
index size
SELECT
c.relname AS name,
pg_size_pretty(sum(c.relpages::bigint*8192)::bigint) AS size
FROM
pg_class c
LEFT JOIN
pg_namespace n ON (n.oid = c.relnamespace)
WHERE
n.nspname NOT IN ('pg_catalog', 'information_schema')
AND n.nspname !~ '^pg_toast'
AND c.relkind='i'
GROUP BY
c.relname
ORDER BY
sum(c.relpages) DESC;
Adding JSON constraints to JSON objects
I find it easy to think of JSONB columns as being open to anything – I’ll often create a JSONB ‘metadata’ column and use it as a sort of generic bucket to throw data into that I may or may not need later. They perform this job well.
While you can use JSON in a very dynamic, ‘seat of the pants’ fashion, you can also bring it into the structured world with queries, functions, and constraints.
For example, you might have a books table and be storing data about books within it as JSON documents
create table books(k serial primary key, doc jsonb not null);
insert into books(doc) values
('
{ "ISBN" : 4582546494267,
"title" : "Macbeth",
"author" :
{"given_name": "William",
"family_name": "Shakespeare"},
"year" : 1623
}
');
alter table books
add constraint books_doc_is_object
check(
jsonb_typeof(doc) is not null and
jsonb_typeof(doc) = 'object'
);
alter table books
add constraint books_doc_isbn_ok
check(
doc->>'ISBN' is not null and
jsonb_typeof(doc->'ISBN') = 'number' and
(doc->>'ISBN')::bigint > 0 and
length(doc->>'ISBN') = 13
);
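The same documents can then be queried in a structured way; a quick sketch against the table above:
-- find books by a given author using the jsonb containment operator
select doc->>'title' as title,
       doc->'author'->>'family_name' as author
from books
where doc @> '{"author": {"family_name": "Shakespeare"}}';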
`RANK` and `DENSE_RANK` for ranking rows
Let’s say we’ve been taking votes of people’s favorite database systems
CREATE TABLE databases (name TEXT, votes INT);
INSERT INTO databases VALUES
('Postgres', 10000),
('MySQL', 4522),
('SQLite', 9500),
('MongoDB', 4522),
('Oracle', 4580),
('Redis', 9500);
SELECT DENSE_RANK() OVER (ORDER by votes DESC), * FROM databases;
SELECT RANK() OVER (ORDER by votes DESC), * FROM databases;
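With this data, DENSE_RANK produces 1, 2, 2, 3, 4, 4 while RANK produces 1, 2, 2, 4, 5, 5: both give the tied 9500-vote entries rank 2, but RANK leaves a gap (it skips rank 3) because two rows already occupy rank 2, whereas DENSE_RANK does not.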
