
PostgreSQL Tuning Guide: PostgreSQL Architecture

Key takeaways

• PostgreSQL is often shipped with a configuration that isn't suitable for production. Due
to this, you should at least configure the PostgreSQL buffer settings, the number of
connections, and logging.
• The query planner needs to know about the memory situation and have good table
statistics in order to make accurate plans.
• Checkpoints need to be tuned carefully in order to limit crash recovery time, while not
impacting the rest of the system's performance.
• The autovacuum process is also critical to make sure the query planner has the right
information to work with, as well as to keep tables maintained properly.

Introduction

PostgreSQL is the world's most advanced open source database. It is designed for extensibility and customization, offers ANSI/ISO-compliant SQL support (it conforms closely to the ANSI SQL:2008 standard), and has been actively developed for more than 25 years. Furthermore, it is well known for its portability, reliability, scalability, and security.

PostgreSQL uses the client-server model, where the client and server programs are often on different hosts. Communication between the client and the server normally happens over TCP/IP or, when both run on the same host, over Unix-domain sockets.
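
The server side of this is controlled by a handful of settings in postgresql.conf; a minimal sketch (the values shown are illustrative, not recommendations from this guide):

listen_addresses = 'localhost'      # interfaces that accept TCP/IP connections; '*' listens on all
port = 5432                         # TCP port the server listens on
unix_socket_directories = '/tmp'    # where the Unix-domain socket file is created (platform dependent)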

Client process or program (frontend)

A frontend application can perform any database action on behalf of the user. The frontend can be a web server that needs to render a page or a command-line tool that performs maintenance tasks. PostgreSQL ships with frontend tools such as psql, createdb, dropdb, and createuser.

Server process (backend)

The server process manages the database files, accepts connections from client applications, and performs actions on behalf of the clients. The server process is named "postgres". The main server process forks a new process for each incoming connection; from then on, the client and its backend process communicate directly, without the intervention of the main server process. Each of these new processes lives only as long as its client connection: it is created when the connection is accepted and exits when the connection terminates.

The PostgreSQL server can be divided roughly into four subsystems, as follows:
• Process manager
• Query processor
• Utilities
• Storage manager

Process manager
Starting with the daemon process

The first process that is launched when we start PostgreSQL is /usr/local/pgsql/bin/postgres. This process has quite a few responsibilities, such as performing recovery, initializing shared data structures/memory space, and kicking off the mandatory and optional processes. These processes are also referred to as utility processes and include the bgwriter, checkpointer, autovacuum launcher, log writer, stats collector, and so on. Furthermore, the daemon process listens for connection requests from clients and spawns a server process for each client. The daemon itself is therefore a mandatory process that must be running for a user to connect to the database.

Backend Processes

Let's focus on the scenario of a user connecting and issuing commands. The following diagram walks you through how the daemon process receives a connection request and starts (forks) a backend process. On successful authentication, the backend process starts handling requests from that client:

PostgreSQL query request overview


The same procedure is repeated for every connection request (unless we hit the max_connections limit, in which case the connection attempt fails with an error).
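
To see how close a server is to that limit, you can compare the number of backends against max_connections; a minimal sketch using the built-in statistics view:

SELECT count(*) AS backends,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity;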

As a result, an active server, after a period of time, will have the processes that were there when
the server started, plus quite a few processes to serve client connections, as shown in the
following diagram:

PostgreSQL - forking the daemon process

Once a user is connected to a database, the user typically wants to read (SELECT) data or write
(UPDATE/DELETE/INSERT) data, not to mention making changes to table structure, adding
indexes, and so on.

However, when we consider that there might be thousands of users who want to do this, it gets a little more complex. Reading directly from the directories/files for every request would result in a miserably non-scalable system: every read or write would mean searching for many files, opening them, using fseek() to reach specific data records, locking, editing, and unlocking. To make this a lot more scalable and faster, the concept of shared buffers (a shared memory area) is introduced. The backend processes no longer read from and write to the files directly, but work against these buffers in RAM, with a significant improvement in performance.
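
The size of this memory area is set by shared_buffers in postgresql.conf. A minimal sketch, assuming a dedicated 16 GB database server (the 25%-of-RAM figure is a common rule of thumb, not a value taken from this guide):

shared_buffers = 4GB    # roughly 25% of RAM on a dedicated server; changing it requires a restart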
Query Processor: The Path of a Query

Now let's take a brief look at the stages a query passes through in order to produce its result.

1. A connection from an application program to the PostgreSQL server has to be established.
2. The parser stage checks the query transmitted by the application program for correct syntax and then creates a query tree.
3. The rewrite system takes the query tree created by the parser stage and looks for any rules (stored in the system catalogs) to apply to it, performing the transformations given in the rule bodies. One application of the rewrite system is the implementation of views: whenever a query is executed against a view (a virtual table), the system rewrites the query so that it accesses the base tables instead.
4. The planner/optimizer takes the (rewritten) query tree and creates a query plan that will be the input to the executor. It does so by first creating all possible paths that lead to the same result. For example, if there is an index on a relation to be scanned, there are two paths for the scan: a simple sequential scan, or a scan that uses the index. The cost of executing each path is estimated, the cheapest path is chosen, and that path is expanded into a complete plan the executor can use.
5. The executor recursively steps through the plan tree and retrieves rows in the way represented by the plan. It makes use of the storage system while scanning relations, performs sorts and joins, evaluates qualifications, and finally hands back the rows. (An EXPLAIN example of an executed plan follows this list.)
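
You can ask PostgreSQL to show the plan the planner chose, and what the executor actually did with it, using EXPLAIN. A minimal sketch (the table and column names are hypothetical):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

The output shows whether a sequential scan or an index scan was chosen, the estimated versus actual row counts, and the buffer usage, which is exactly the information needed to judge whether the planner is working with good statistics.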
Utility Processes

The utilities subsystem provides the means to maintain the database, such as reclaiming storage, updating statistics, exporting and importing data in a given format, and logging.

With a default configuration, we get the checkpointer process, the background writer process, the WAL writer process, the autovacuum launcher process, and the stats collector process. You will see more processes running if you turn on archiving or streaming replication. You might also get a process for writing the server log, depending on the configuration.

The background writer

The background writer is responsible for writing specific dirty buffers to disk based on an algorithm, whereas the checkpointer writes all dirty buffers. The process takes into account shared memory usage as well as information about which blocks have been used or accessed recently (a least-recently-used scheme). Its primary objective is to ensure that free buffers are available for use.
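
The background writer is driven by a few postgresql.conf settings; a minimal sketch showing the stock defaults, for orientation rather than as tuning advice:

bgwriter_delay = 200ms            # pause between background writer rounds
bgwriter_lru_maxpages = 100       # maximum number of buffers written per round
bgwriter_lru_multiplier = 2.0     # how far ahead of recent backend demand to clean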

Cleaning the shared buffers


Checkpoint

Checkpoint is a mandatory process. Once a user changes data (which has been loaded into the buffer), that buffer becomes dirty. The fact that a user has committed a change does not mean that the change has already been written to the data file. It is the checkpointer's job to do that: it writes all dirty (modified) pages to the table and index files and marks them as clean. It also marks the write-ahead log as applied up to that point.

Let's consider a server with a shared buffer of around 16 GB. If a significant proportion of the workload consists of writes, most of this 16 GB can become dirty within a few minutes. A low setting for max_wal_size (checkpoint_segments in versions before 9.5) causes the available WAL space to fill quickly and triggers frequent checkpoints. Similarly, a low setting for checkpoint_timeout also results in frequent checkpoints, and hence excessive disk throughput. On the other hand, if we set these values very high, checkpoints become infrequent; in a write-heavy system this can produce significant I/O spikes when a checkpoint does run, which hurts the performance of other queries.
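
The relevant knobs live in postgresql.conf; a minimal sketch (the values are illustrative, not recommendations from this guide):

checkpoint_timeout = 15min             # maximum time between automatic checkpoints
max_wal_size = 4GB                     # WAL volume that forces a checkpoint before the timeout
checkpoint_completion_target = 0.9     # spread checkpoint writes across this fraction of the interval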

The Checkpoint Process

WAL writer

As mentioned before, when we make changes to the data, the changes are not written to the data files immediately. Postgres applies these changes to the blocks in the shared buffers and, as soon as they are made, also writes them to the WAL buffer. The changes are flushed to the WAL segments when the statements are committed.
Now, let's also take a look at a few key parameters related to WAL. The fsync setting forces each transaction's WAL to be flushed to disk at commit. Turning it off will increase performance, especially for bulk-load operations, but disabling fsync can lead to serious problems, such as data corruption, in the case of a power failure or crash. A small max_wal_size can cause a performance penalty in write-heavy systems (through frequent checkpoints); on the other hand, a very high max_wal_size will increase crash recovery time.
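
As a minimal sketch of the durability-related settings (synchronous_commit is not covered above; it is mentioned only as the usual, safer alternative to switching fsync off):

fsync = on                  # keep on for any data you care about
synchronous_commit = on     # 'off' trades a small window of lost commits for speed, without risking corruption
wal_buffers = -1            # -1 sizes the WAL buffer automatically from shared_buffers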

The autovacuum launcher process

This is an optional process that is enabled by default. The vacuum process marks the space used by previously deleted (or updated) rows as available for reuse within the table. There is also a VACUUM command to do this manually. Plain vacuum does not take an exclusive lock, so it can run alongside normal reads and writes on the table.

VACUUM FULL, however, does more than mark the space as reusable: it rewrites the table, removing the dead rows and compacting the data, and this requires an exclusive lock on the table.
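
As a quick sketch of the difference (the table name is hypothetical):

VACUUM orders;            -- marks dead-row space reusable; runs alongside normal reads and writes
VACUUM FULL orders;       -- rewrites and compacts the table; requires an exclusive lock
VACUUM (ANALYZE) orders;  -- also refreshes the planner statistics for the table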

Autovacuum automates the vacuum process. Where the database is under heavy load for most of the day, manual vacuum runs can instead be scheduled during off-peak hours. Even if there are few or no deletes/updates in the cluster, routine vacuuming is useful because vacuuming (together with analyze) keeps the data statistics used by the planner up to date.
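
The most commonly adjusted autovacuum settings sit in postgresql.conf; a minimal sketch showing the stock defaults, for orientation rather than as tuning advice:

autovacuum = on                          # the launcher is enabled by default
autovacuum_naptime = 1min                # how often the launcher wakes up per database
autovacuum_vacuum_scale_factor = 0.2     # vacuum a table once ~20% of its rows are dead
autovacuum_analyze_scale_factor = 0.1    # analyze a table once ~10% of its rows have changed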

Stats collector

In short, the stats collector process keeps track of access to tables and indexes, in terms of both disk blocks and individual rows, in order to produce statistics about database activity. Keeping an eye on these statistics views helps in zooming in on frequently accessed tables, identifying columns that should be indexed, and providing input for configuring server parameters (such as shared_buffers and other memory areas).
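
For example, a quick way to spot tables that are read mostly by sequential scans, and may therefore be missing an index, is to query the statistics views; a minimal sketch:

SELECT relname, seq_scan, idx_scan, n_live_tup
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 10;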

Archiver

The archiver process is responsible for copying completed WAL files from the pg_wal directory (called pg_xlog before PostgreSQL 10) to the archive location.
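
A minimal sketch of the settings that enable archiving (the cp command and the /mnt/archive path are placeholders; substitute whatever fits your environment):

archive_mode = on                            # requires a server restart
archive_command = 'cp %p /mnt/archive/%f'    # %p = path of the finished WAL file, %f = its file name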

Logging collector

This is an optional process and the default setting is off. It is responsible for capturing the server's log messages and writing them to log files, according to the logging level set in the configuration file.

log_min_duration_statement = 0

This setting causes the duration of each completed statement to be logged if the statement ran for at least the specified number of milliseconds. The default value is -1, which disables this logging. Setting the value to 0 logs the duration of every statement. This approach, along with an analysis tool such as pgBadger, helps in tracking down slow queries quickly. However, in a system executing hundreds of queries every second, this setting can make the log files grow very quickly, so it is better to have a process in place that moves the log files to another location, where a tool such as pgBadger can mine them and generate user-friendly reports.
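
A minimal sketch of a logging configuration that captures slow statements (the 250 ms threshold is only an example):

logging_collector = on                  # capture server messages into log files
log_directory = 'log'                   # relative to the data directory
log_filename = 'postgresql-%a.log'      # one file per weekday
log_min_duration_statement = 250        # log statements running for 250 ms or longer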

Storage manager

The storage manager handles the memory cache, disk buffers, and storage allocation.

Summary

In this section, we covered the important processes and memory structures in PostgreSQL. We
also saw how some configuration settings can effectively be used to capture data about the
activities in the server, and then use this data to further optimize the database performance.

In the next part, we will look at the memory-related configuration changes you can make to improve PostgreSQL's performance.

Bibliography

• PostgreSQL 12 Official Documentation, https://www.postgresql.org/docs/
• PostgreSQL for Data Architects, Jayadevan Maymala, Packt Publishing, March 2015
• PostgreSQL 10 High Performance, Enrico Pirozzi, Gregory Smith, Ibrar Ahmed, April 2018
• Learning PostgreSQL 11, Andrey Volkov, Salahaldin Juba, January 2019
