PostgreSQL Replication Options

So, with all of the other articles that I have written about Postgresql, I figured that it was inevitable that we finally get one written about replication. What makes an article about Postgresql replication difficult is that unlike many of Postgresql’s competitors, PostgreSQL waited until 2008 to decide to implement any replication strategy into core PostgreSQL itself.

PostgreSQL’s core team made it clear that they wanted to allow development of competing solutions to allow multiple options “recognizing that there is no ‘one size fits all’ replication solution” (See: Core team statement on replication in PostgreSQL, Tom Lane, May 29, 2008 http://www.postgresql.org/message-id/26529.1212070375@sss.pgh.pa.us)
This led to a host of replication options being available then and still today for PostgreSQL outside of what is now available in the core product itself. These competing options include, but are not limited to: PgCluster, pgpool, slony, Bucardo, Londiste, Mammoth, rubyrep. See: http://wiki.postgresql.org/wiki/Clustering for more details on these options.

Finally in 2008, the core team decided to begin supporting their own native simple asynchronous single-master-multiple-slave replication to be available in version 8.4 as:
“Users who might consider PostgreSQL are choosing other database systems because our existing replication options are too complex to install and use for simple cases.” (See: Core team statement on replication in PostgreSQL, Tom Lane, May 29, 2008 http://www.postgresql.org/message-id/26529.1212070375@sss.pgh.pa.us)

PostgreSQL Replication in Version 9.X

Now today with versions 9.X of PostgreSQL, we call it “streaming replication”, and allows you to setup multiple standby servers using both synchronous and asynchronous slaves. Any type of multi-master replication, however, still requires that you use one of the other independent replication options.

Configure Simple Master / Slave Streaming Replication in PostgreSQL in 9.X

Inasmuch as I recently had to setup a simple PostgreSQL master slave configuration recently, I thought that I jot down the notes and share them here to help others that needed to perform the same task someday. These steps assume that you installed your PostgreSQL similar to steps in the previous article: CentOS Install PostgreSQL. The slave server will be overwritten. Any data stored on it will be gone.

Step 1 – Configure SSH Keys for the PostgreSQL User

When the WAL logs get switched out, we typically will want to be able to copy them over to our slave so that we have less risk of losing data in the event of a disaster. Rather than manually copying these files, we can automate the task. However, to automate the copying of these files, we need to create an SSH key on the master server that we will copy over to the slave so that later the master can copy its WAL files over after each switch. Because we already wrote an article on creating ssh keys here at uptimemadeeasy.com, I will refer you to steps 2-5 of the article automate-file-copy-machines-using-rsync-over-ssh to accomplish this first task. Follow those steps, creating the ssh keys as the postgres user on the master server and then copy the key over to the postgres user’s $HOME/.ssh/authorized_keys file on the slave server. When all of this is done, come back here and we will begin the next steps.

Step 2 – Create a PostgreSQL Replication User

Now that we have our ssh keys figured out, our next step is to create a userid that the slave servers can use to contact PostgreSQL on this server. This is pretty simple using the create user syntax:

create user <username> replication encrypted password '<password-here>';

Now, we need to edit the pg_hba.conf file to allow this new replication user access to the master PostgreSQL server. I did this by adding this line to my file:

host    replication     rep             my.ip.add.ress/32           md5

where my.ip.add.ress is the actual ip address of the slave server. After editing this pg_hba.conf file, I had to restart PostgreSQL to make it take.

Step 3 – Changes to the Master Server’s PostgreSQL.conf File

Now it is time to progress to the postgresql.conf file configuration. We have a list of configuration options that we need to change. I will list all of my changes in one spot and then discuss them a bit below:

listen_addresses = '*'   
wal_level = hot_standby   
wal_keep_segments = 10
archive_mode = on         
archive_command = '/usr/bin/rsync -qa %p postgres@prizm2.1on1.com:/data/pgsql/9.3/archive/%f' 
max_wal_senders = 1             
hot_standby = on

Let’s go through each of these:

listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.

Name	Description
listen_addresses	Lists all of the addresses that you want to have your PostgreSQL listen on. You can set this to a comma separated list or use wildcards. Be careful using wildcards as changing this value will allow different hosts to connect to it. Using this option as well as a firewall such as iptables is a good idea.
wal_keep_segments	This is the minimum number of 16MB WAL files to retain in the pg_xlog directory. More files may exist to handle a large checkpoint. If your slave server frequently loses connection and has to catch up on lots of data, setting this to a higher number can help. Remember that the you can also set your WAL files to be rsynced to the slave server using the archive_command option.
wal_level	Can be set to one of these options to determine the level of wal logging done: minimal, archive, or hot_standby. Must be set to archive or preferably hot_standby for streaming replication as minimal does not log enough information for replication.
archive_mode	Setting archive_mode to on enables the WAL logs.
archive_command	if archive_mode is set to “on” then this shell command is executed whenever a WAL file segment (default is 16MB) is completed. In the case above, we have it copying with rsync the file to the slave server. There are two variables that can be used in your shell command: %f – filename or %p – path.
max_wal_senders	Default is Zero. Max_wal_senders is the number of concurrent connections allowed from the slave servers. wal sender connections count against the total number of connections configured in max_connections. Be sure to have at least one connection configured in max_wal_senders per configured slave.
hot_standby	default is off. Determines if you can connect and run queries during times that your server is in archive recovery or standby mode. This is helpful to replication slaves.

Once you have your postgresql.conf file set on the master, restart the PostgreSQL server.

Step 4 – Changes to the Slave Server (PostgreSQL.conf and recovery.conf Files)

Next, we go to the slave server and configure its postgresql.conf file and the recovery.conf file.

Edit the postgresql.conf file and set:

hot_standby = on

Now, we create the recovery file in the postgresql data directory. On my repo installed postgresql server, I found a sample recovery file at: /usr/pgsql-9.3/share/recovery.conf.sample. This file tells PostgreSQL how to perform either an archive recovery of a database or how to connect to the master server in the event of replication like we are doing.

standby_mode = on
primary_conninfo = ‘host=pg1.uptimemadeeasy.com port=5432 user=rep password=<my rep user’s password>’
trigger_file = ‘/tmp/postgresql.trigger.5432’

standby_modedescription

Parameter	Description
standby_mode	When enabled, standby_mode causes PostgreSQL to work as a standby in a replication configuration
primary_conninfo	Specifies the connection information used to connect to the master server. Example: primary_conninfo = ‘host=host.domain.com port=5432 user=userid password = password’
trigger_file	While running as a standby, the slave will continue to restore the XLOG records from the primary server. The trigger file is what is used to “trigger” a server to leave standby mode and to begin being a primary server. You can configure this option by specifying the trigger file to look for.

Step 5 – Materialize the Slave

Before we can finalize the replication we need to get an initial copy of the database onto the slave machine. We can do this pretty easily by running the typical binary backup process (pg_start_backup and pg_stop_backup), but instead of backing up the PostgreSQL files, we are copying them to the slave server using rsync and the ssh key that we generated in step 1 above.

$ psql -c “select pg_start_backup(‘mybackup’);”
$ rsync -cva --inplace --exclude=*pg_xlog* /data/pgsql/9.3/data/ postgres@pg2.1on1.com:/data/pgsql/9.3/data/
$ psql -c "select pg_stop_backup();"

Restart everything and hope for the best!

This is where we are hoping that we have everything working and, of course, everything really should be up and working once we startup the slave server.

Start the Slave Server

# service postgresql-9.3 start

Watch the log file as it starts up to make sure that there are no errors. So, I will tail the log being written today:

# tail -f /data/pgsql/9.3/data/pg_log/postgresql-Tue.log
< 2014-03-24 23:15:47.670 MDT >LOG: started streaming WAL from primary at 5D9/87000000 on timeline 1
< 2014-03-24 23:17:20.233 MDT >LOG: consistent recovery state reached at 5D9/921BBF70
< 2014-03-24 23:17:20.234 MDT >LOG: database system is ready to accept read only connections

Great! Looks like everything is working. We can verify it is working by creating a test table and inserting data to see if they make it into the slave.

On the Master:

postgres=# create database joke;

postgres=# \c joke;

joke=# create table joketable ( jokeint int, jokechar varchar(40) );
CREATE TABLE
joke=# insert into joketable values ( 1, '2 Guys walk into a bar...' );
INSERT 0 1

On the Slave:

joke=# select * from joketable;
jokeint | jokechar
---------+---------------------------
1 | 2 Guys walk into a bar...
(1 row)

So, it appears to be working as our database, table, and row data made it to the slave.

Repairing Broken PostgreSQL Streaming Replication

So, I have already broken my PostgreSQL replication. Don’t stress! Take a deep breath. The best thing is that your slave is just a copy of the master’s data. We can just rebuild the slave. Here are the steps that I take to repair my slave when it is broken:

First – Shutdown the Slave and Copy the recovery.conf, postgresql.conf, and Any Other File You Wish to Keep, Then Rename the Data Directory

Shutdown the slave:

# service postgresql-9.3 stop

Make a copy of these important files:

$ cd /data/pgsql/9.3/data
$ cp recovery.conf $HOME
$ cp postgresql.conf $HOME
$ mv $ cd /data/pgsql/9.3/data $ cd /data/pgsql/9.3/data-old

Second, Rematerialize Your Slave From the Master’s Copy

On the Master perform these steps again:

$ psql -c “select pg_start_backup(‘mybackup’);”
$ rsync -cva --inplace --exclude=*pg_xlog* /data/pgsql/9.3/data/ postgres@pg2.1on1.com:/data/pgsql/9.3/data/
$ psql -c "select pg_stop_backup();"

Third, Copy Back the Recovery.conf and the Postgresql.conf Files and Restart the Slave

Remember when we backed up these files? We will put them back in place.

$ cp $HOME/recovery.conf   /data/pgsql/9.3/data
$ cp $HOME/postgresql.conf /data/pgsql/9.3/data

Startup the Slave PostgreSQL

# /sbin/service postgresql-9.3 start

Verify the logs to make sure that it is working.

Monitoring PostgreSQL Streaming Replication with Nagios

Now that you have your PostgreSQL Streaming Replication Setup you may want to monitor it with Nagios. I found a great Nagios plugin for monitoring streaming replication on Johnny Morano’s blog. His plugin / script with a couple of customizations for my environment work great. You can find his blog / plugin at: http://jmorano.moretrix.com/2013/08/postgresql-9-2-master-slave-monitoring/ Props to you Johnny Morano!

A quick note on running it, be sure to modify the retry_check_interval and max_check_attempts so that a brief moment of being behind doesn’t wake you up at night.

Bio
Latest Posts

Jeff Staten

Jeff has 20 years of professional IT experience, having done nearly everything in his roles of IT consultant, Systems Integrator, Systems Engineer, CNOC Engineer, Systems Administrator, Network Systems Administrator, and IT Director. If there is one thing he knows for sure, it is that there is always a simple answer to every IT problem and that downtime begins with complexity. Seasoned IT professional by day, Jeff hopes to help other IT professionals by blogging about his experiences at night on his blog: http://uptimemadeeasy.com. You can find Jeff on Google+ or LinkedIn at: LinkedIn or Twitter at: Twitter

Latest posts by Jeff Staten (see all)

Configure Your HP Procurve Switch with SNTP - May 5, 2015
Configuring HP Procurve 2920 Switches - May 1, 2015
Troubleshooting Sendmail - November 28, 2014

pg_current_xlog_insert_location()# select pg_current_xlog_insert_location(); pg_current_xlog_insert_location ——————————— 5DB/93301F20 (1 row)	master	Get the current transaction log insert location
pg_current_xlog_location()# select pg_current_xlog_location(); pg_current_xlog_location ————————– 5DB/93B442A8 (1 row)	master	Get the current transaction log write location.
pg_switch_xlog()# select pg_switch_xlog(); pg_switch_xlog —————- 5DB/93B6AC60 (1 row)	master	Force PostgreSQL to switch to a new transaction log file.
pg_is_in_recovery()# select pg_is_in_recovery(); pg_is_in_recovery ——————- t (1 row)	master(returns f)slave (returns t)	This is a boolean return type. t for true, f for false. This can help you identify if you are on the slave or master as the slave will return t.
pg_last_xlog_receive_location# select pg_last_xlog_receive_location(); pg_last_xlog_receive_location ——————————- 5DB/B9FC4880 (1 row)	slave	Find the last transaction log location received that was synced to disk through streaming replication. Will return a null if not on slave or if replication is not enabled.
pg_last_xlog_replay_location# select pg_last_xlog_replay_location(); pg_last_xlog_replay_location —————————— 5DB/B9FD61	slave	Find the last transaction log location replayed during a recovery. Returns NULL if replication is not enabled.
pg_last_xact_replay_timestamp()# select pg_last_xact_replay_timestamp(); pg_last_xact_replay_timestamp ——————————- 2014-03-25 15:34:29.867547-06 (1 row)	slave	Find the timestamp of the last transaction replayed during the recovery. Reflects the time from the WAL record commit or abort for the transaction.
pg_is_xlog_replay_paused()# select pg_is_xlog_replay_paused(); pg_is_xlog_replay_paused ————————– f (1 row)	slave	This is a boolean that returns true if the replay has been paused. Let’s you know if your slave has been paused or not.
pg_xlog_replay_pause()# select pg_xlog_replay_pause(); pg_xlog_replay_pause ———————-(1 row)	slave	Will pause a slave if run. Shows a NULL, but still works –Be careful.
pg_xlog_replay_resume()	slave	Will restart the slave when run. Shows a NULL, but still works to start.

PostgreSQL Replication

PostgreSQL Replication Options

PostgreSQL Replication in Version 9.X

Configure Simple Master / Slave Streaming Replication in PostgreSQL in 9.X

Step 1 – Configure SSH Keys for the PostgreSQL User

Step 2 – Create a PostgreSQL Replication User

Step 3 – Changes to the Master Server’s PostgreSQL.conf File

Step 4 – Changes to the Slave Server (PostgreSQL.conf and recovery.conf Files)

Step 5 – Materialize the Slave

Restart everything and hope for the best!

Start the Slave Server

On the Master:

On the Slave:

Repairing Broken PostgreSQL Streaming Replication

First – Shutdown the Slave and Copy the recovery.conf, postgresql.conf, and Any Other File You Wish to Keep, Then Rename the Data Directory

Second, Rematerialize Your Slave From the Master’s Copy

Third, Copy Back the Recovery.conf and the Postgresql.conf Files and Restart the Slave

Other Helpful PostgreSQL Streaming Replication Materials

Monitoring PostgreSQL Streaming Replication with Nagios

Jeff Staten

Latest posts by Jeff Staten (see all)

Leave a Reply Cancel reply

PostgreSQL Replication Options

PostgreSQL Replication in Version 9.X

Configure Simple Master / Slave Streaming Replication in PostgreSQL in 9.X

Step 1 – Configure SSH Keys for the PostgreSQL User

Step 2 – Create a PostgreSQL Replication User

Step 3 – Changes to the Master Server’s PostgreSQL.conf File

Step 4 – Changes to the Slave Server (PostgreSQL.conf and recovery.conf Files)

Step 5 – Materialize the Slave

Restart everything and hope for the best!

Start the Slave Server

On the Master:

On the Slave:

Repairing Broken PostgreSQL Streaming Replication

First – Shutdown the Slave and Copy the recovery.conf, postgresql.conf, and Any Other File You Wish to Keep, Then Rename the Data Directory

Second, Rematerialize Your Slave From the Master’s Copy

Third, Copy Back the Recovery.conf and the Postgresql.conf Files and Restart the Slave

Other Helpful PostgreSQL Streaming Replication Materials

Monitoring PostgreSQL Streaming Replication with Nagios

Jeff Staten

Latest posts by Jeff Staten (see all)

Related posts:

Post navigation

Leave a Reply Cancel reply