Today I learned how to fix a PostgreSQL error like this:
ERROR: canceling statement due to conflict with recovery DETAIL: User query might have needed to see row versions that must be removed.
It's about database replication. I have been getting that on read-only database which uses PostgreSQL Hot Standby. This error didn't happen always but happened occasionally.
I fixed that by setting both
max_standby_streaming_delay to a longer time (
300s) on the standby servers. The reason that this error occurs is query conflicts. This kind of error is not inevitable because of the nature of database replication, and in some cases, queries running on standby servers have to be canceled. Then the error shows up.
If my understanding is correct, it seems when this error occurs, a vacuum process on the primary (writable) server cleaned up some old row versions that are still needed by a running query on standby servers. Increasing
max_standby_streaming_delay gives queries on standby servers some more grace period. It will decrease the frequency of query cancellations.
Note that increasing these values means replication could delay a longer time when a query conflict happens. The default values of both
max_standby_streaming_delay are 30 seconds. In my case, 300 seconds delay was acceptable.
Another solution is setting
hot_standby_feedback, which prevents
VACUUM from cleaning up recently-dead rows. The official document says it could causes undesirable table bloat. I don't have enough time to try this solution so far. So maybe it's a better one.