A PostgreSQL extension that enforces optimal column alignment to minimize row padding waste.
- Warns on suboptimal
CREATE TABLEstatements during development - Enforces strict alignment in CI/CD pipelines
- Audits existing tables and generates optimized migration scripts
PostgreSQL stores each row as a sequence of bytes on disk. Column types have different sizes: a bigint takes 8 bytes, an integer takes 4, a boolean takes just 1. So far so simple.
The problem is that PostgreSQL can't just pack them back to back. The CPU reads memory most efficiently when values are naturally aligned; an 8-byte value should start at a position divisible by 8, a 4-byte value at a position divisible by 4, and so on. To guarantee this, PostgreSQL inserts invisible padding bytes between columns whenever needed.
Here's an example. Say you create a table like this:
CREATE TABLE bad_order (
active boolean, -- 1 byte
user_id bigint, -- 8 bytes
age integer -- 4 bytes
);In memory, each row looks like:
[active: 1B] [7B padding] [user_id: 8B] [age: 4B] → 20 bytes of column data
user_id needs to start at an 8-byte boundary, so PostgreSQL pads 7 bytes after active to get there. That's 7 wasted bytes per row.
Now reorder the columns largest-first:
CREATE TABLE good_order (
user_id bigint, -- 8 bytes
age integer, -- 4 bytes
active boolean -- 1 byte
);[user_id: 8B] [age: 4B] [active: 1B] → 13 bytes of column data
Zero padding. Same data, 35% smaller rows. Multiply that across millions of rows and dozens of columns and it will adds up fast. Optimal column order is free performance: zero runtime cost, just a smarter CREATE TABLE.
The extension sorts columns into these groups, largest alignment first:
- 8-byte aligned (
d):bigint,timestamptz,float8,interval - 4-byte aligned (
i):integer,float4,date,oid - 2-byte aligned (
s):smallint - 1-byte aligned (
c):boolean,char(1) - Variable-length (varlena):
text,varchar,numeric,jsonb,bytea- always last
Within each group, NOT NULL columns come first (minor CPU optimization for tuple deforming).
- PostgreSQL 14+
- Superuser or event trigger privileges (
rds_superuseron RDS,cloudsqlsuperuseron Cloud SQL)
Pure SQL/PL/pgSQL - no C, no compilation.
make installCREATE EXTENSION pg_column_tetris;Since there's no C code, the extension runs anywhere PostgreSQL does:
psql -d your_database -f pg_column_tetris--0.1.0.sqlThe extension has three modes (warn, strict, off) that cover different workflows.
The extension installs in warn mode. Any CREATE TABLE with suboptimal column order emits a NOTICE but still succeeds:
CREATE TABLE orders (
is_shipped boolean,
order_total numeric,
user_id bigint,
item_ct smallint,
order_dt timestamptz,
status smallint,
ship_dt timestamptz
);NOTICE: pg_column_tetris: suboptimal column alignment — 19 bytes of fixed-width padding wasted per row
Good for development — you see the problem without breaking anything.
In strict mode, CREATE TABLE with suboptimal column order is blocked and rolled back. The error message includes the optimal column order so you can fix it immediately:
SELECT column_tetris.set_mode('strict');
CREATE TABLE orders ( ... );
-- ERROR: suboptimal column alignment — 19 bytes of fixed-width padding wasted per row
-- HINT: Suggested order:
-- CREATE TABLE orders (
-- user_id bigint, -- 8-byte aligned
-- order_dt timestamptz, -- 8-byte aligned
-- ship_dt timestamptz, -- 8-byte aligned
-- item_ct smallint, -- 2-byte aligned
-- status smallint, -- 2-byte aligned
-- is_shipped boolean, -- 1-byte aligned
-- order_total numeric -- varlena (last)
-- );Use this in staging/production databases or CI pipelines to guarantee every new table has optimal alignment.
Use padding_wasted() to quickly check how many bytes a table wastes per row:
SELECT column_tetris.padding_wasted('orders');
-- Returns: 7 (bytes of avoidable padding per row)Pass 'total' to see the total waste across all rows in the table. Wrap with pg_size_pretty() for human-readable output:
SELECT pg_size_pretty(column_tetris.padding_wasted('orders', 'total'));
-- Returns: '458 MB'Find all tables with padding waste:
SELECT schemaname, tablename,
column_tetris.padding_wasted(schemaname || '.' || tablename) AS bytes_per_row
FROM pg_tables
WHERE schemaname = 'public'
AND column_tetris.padding_wasted(schemaname || '.' || tablename) > 0;Use check() for a detailed column-by-column layout report:
SELECT * FROM column_tetris.check('orders');Use suggest_rewrite() to generate a migration script that reorders the columns optimally. Caution - it renames the original table, creates a new one, and copies all data. This means exclusive locks, downtime for that table, and lost foreign keys/indexes/triggers/defaults that aren't part of the generated DDL. Always review the output and test on a copy first:
SELECT column_tetris.suggest_rewrite('orders');-- Generated output:
BEGIN;
ALTER TABLE public.orders RENAME TO orders_old;
CREATE TABLE public.orders ( ...optimal order... );
INSERT INTO public.orders SELECT ... FROM public.orders_old;
DROP TABLE public.orders_old;
COMMIT;You can use off mode if you want to disable the event trigger entirely and just use the analysis functions.
-- Check current mode
SELECT column_tetris.mode();
-- Exclude a table from validation (e.g., matching an external schema)
SELECT column_tetris.exclude('legacy_imports');
-- View excluded table-/s
SELECT * FROM column_tetris.exclusions;- CREATE TABLE statements are validated by the event trigger
- ALTER TABLE is deliberately skipped - you can't reorder existing columns, so warning would be noise
- Temp tables and system schemas (
pg_catalog,information_schema) are skipped - Tables in the
exclusionslist are skipped
MIT