Reposted from: https://www.percona.com/blog/2018/10/19/postgresql-building-enterprise-grade-setup-with-open-source/
Hello everyone, and thank you to those who attended our webinar on building an enterprise-grade PostgreSQL setup using open source tools last Wednesday. You’ll find the recording, as well as the slides we used during our presentation, here.
We received over forty questions during the webinar but were only able to tackle a handful in the time available, so most remained unanswered. We address the remaining ones below, grouped into categories for better organization. Thank you for sending them over! We have merged related questions and kept some of our answers concise, but please leave us a comment if you would like to see a particular point addressed further.
Backups
Q: In our experience, pg_basebackup with compression is slow due to single-threaded gzip compression. How can we speed up online compressed full backups?
Single-threaded operation is indeed a limitation of pg_basebackup, and it is not limited to compression. pgBackRest is an interesting alternative in this regard, as it supports parallel processing.
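To illustrate why parallel processing helps here, the sketch below compresses several files concurrently using only Python’s standard library. It is not how pgBackRest is implemented; the `compress_file` and `compress_all` helpers are hypothetical names, and the example only shows the general idea of spreading gzip work across several workers instead of one.

```python
import gzip
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def compress_file(path: Path) -> Path:
    """Gzip one file next to the original and return the new path."""
    target = path.with_name(path.name + ".gz")
    with open(path, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def compress_all(paths, workers=4):
    """Compress many files concurrently using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order in its results.
        return list(pool.map(compress_file, paths))
```

The worker count is the knob that matters: with a single worker this degrades to the one-file-at-a-time behavior described in the question.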
Q: Usually one sets up the database backup on the primary DB in an HA setup. Is it possible to automatically activate the backup on the new primary DB after a Patroni failover (or with other HA solutions)?
Yes. This can be done transparently by pointing your backup system to the “master-role” port in HAProxy rather than to a specific database server, or to the “replica-role” port; in fact, it’s more common to use standby replicas as the backup source.
Q: Do backups and WAL backups work with third-party backup managers such as NetBackup?
Yes, though as usual it depends on how good the vendor support is. NetBackup supports PostgreSQL, and so does Zmanda, to mention another one.
Security and auditing
Q: Do you know a TDE solution for PostgreSQL? Can you talk a little bit about the encryption-at-rest solution for Postgres PCI/PII applications from Percona’s standpoint?
At this point PostgreSQL does not provide native transparent data encryption (TDE) functionality, relying instead on the underlying file system for data-at-rest encryption. Encryption at the column level can be achieved through the pgcrypto module.
Moreover, other PostgreSQL security features relevant to PCI compliance are:
Row-level security
Host-based authentication (pg_hba.conf)
Encryption of data over the wire using SSL
Q: How can we prevent a superuser account from accessing raw data in Postgres? (…) The companies we encounter usually ask that even managed accounts cannot access the real data by any means.
It is fundamental to maintain a superuser account that is able to access any object in the database for maintenance activities. Having said that, it is currently not possible to deny a superuser direct access to the raw data found in tables. What you can do to protect sensitive data from superuser access is to store it encrypted. As mentioned above, pgcrypto offers the necessary functionality for achieving this.
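As a rough illustration of column-level encryption with pgcrypto, the sketch below wraps its pgp_sym_encrypt and pgp_sym_decrypt functions in parameterized statements. The customers table, its columns, and the helper names are hypothetical; the cursor is assumed to be a DB-API cursor that accepts %s placeholders (as psycopg2 does), and the pgcrypto extension is assumed to be installed in the database.

```python
# Hypothetical table: customers(id integer, card_number bytea).
ENCRYPT_SQL = (
    "INSERT INTO customers (id, card_number) "
    "VALUES (%s, pgp_sym_encrypt(%s, %s))"
)
DECRYPT_SQL = (
    "SELECT pgp_sym_decrypt(card_number, %s) "
    "FROM customers WHERE id = %s"
)


def store_card(cursor, customer_id, card_number, key):
    """Insert a row whose card_number is encrypted by pgcrypto on the server."""
    cursor.execute(ENCRYPT_SQL, (customer_id, card_number, key))


def read_card(cursor, customer_id, key):
    """Return the decrypted card number, or None if the row does not exist."""
    cursor.execute(DECRYPT_SQL, (key, customer_id))
    row = cursor.fetchone()
    return row[0] if row else None
```

Keep in mind that in this approach the key travels to the server as a query parameter, so it can end up in the server logs if statement logging is enabled; encrypting on the application side avoids that.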
Furthermore, avoiding connecting to the database as a superuser is a best practice. The set_user extension allows unprivileged users to escalate themselves to superuser on demand for maintenance tasks, while providing an additional layer of logging and control for better auditing. Also, as discussed in the webinar, it’s possible to implement segregation of users using roles and privileges. Remember that it’s best practice to grant a role, including application users, only the privileges essential to fulfill its duties. Additionally, password authentication should be enforced for superusers.
Q: How can you make audit logging in Postgres record DMLs while masking the data content in the recorded SQL?
To the best of our knowledge, there is currently no solution for applying query obfuscation to logs. Bind parameters are always included in both the audit and the logging of DMLs, and that is by design. If you would rather avoid logging bind parameters and only want to keep track of the statements executed, you can use the pg_stat_statements extension instead. Note that while pg_stat_statements provides overall statistics of the executed statements, it does not keep track of when each DML was executed.
Q: How do we set up database audit logging effectively when utilizing PgBouncer or Pgpool?
A key part of auditing is having separate user accounts in the database instead of a single, shared account. The connection to the database should be made by the appropriate user/application account. In PgBouncer we can have a separate pool for each user account. Every action by a connection from that pool will be audited against the corresponding user.
High availability and replication
Q: Is there anything like Galera for PostgreSQL?
The Galera replication library provides support for multi-master, active-active MySQL clusters based on synchronous replication, such as Percona XtraDB Cluster. PostgreSQL does support synchronous replication, but only in a single-active-master context.
There are, however, clustering solutions for PostgreSQL that address similar business requirements or problem domains such as scalability and high availability (HA). We presented one of them, Patroni, in our webinar; it focuses on HA and read scaling. For write scaling, there have long been sharding-based solutions, including Citus, and PostgreSQL 10 (and now 11!) bring substantial new features in the partitioning area. Finally, PostgreSQL-based solutions like Greenplum and Amazon Redshift address scalability for analytical processing, while TimescaleDB was conceived to handle large volumes of time-series data.
Q: Pgpool can load balance. What is the benefit of HAProxy over Pgpool?
No doubt Pgpool is feature rich, offering load balancing besides connection pooling, among other functionalities. It could be used in place of HAProxy and PgBouncer, yes. But features are just one of the criteria for selecting a solution. In our evaluation we gave more weight to lightweight, fast, and scalable solutions. HAProxy is well known for its lightweight connection routing, which does not consume much of the server’s resources.
Q: How can PgBouncer and Pgpool be combined to achieve transaction pooling and load balancing? Can you let me know which of the two scaling solutions is better, PgBouncer or Pgpool-II?
It depends, and must be analyzed on a case-by-case basis. If what we really need is just a connection pooler, PgBouncer will be our first choice because it is more lightweight than Pgpool. PgBouncer serves all its connections from a single event-driven process, while Pgpool is process-based: like PostgreSQL, it forks the main process for each inbound connection, which is a somewhat expensive operation. PgBouncer is more efficient on this front.
However, Pgpool’s relatively heavier footprint comes with a lot of features, including the capability to manage PostgreSQL replication and the ability to parse statements fired against PostgreSQL and redirect them to certain cluster nodes for load balancing. Also, when your application cannot differentiate between read and write requests, Pgpool can parse the individual SQL statements and redirect them to the master, if it is a write, or to a standby replica, if it is a read, as configured in your Pgpool setup. The demo application we used in our webinar setup was able to distinguish reads from writes and use multiple connection strings accordingly, so we employed HAProxy on top of Patroni.
We have seen environments where Pgpool was used for its load balancing capabilities while connection pooling duties were left to PgBouncer, but this is not a great combination. As described above, HAProxy is more efficient than Pgpool as a load balancer.
Finally, as discussed in the webinar, an external connection pooler like PgBouncer is required only if there is no proper application-layer connection pooler, or if the application-layer connection pooler is not doing a great job of maintaining a proper connection pool, resulting in frequent connections and disconnections.
Q: Is it possible for Postgres to have a built-in connection pool worker? Maybe merge PgBouncer into Postgres core? That would make it much easier to use advanced authentication mechanisms (e.g. LDAP).
A great thought. That would indeed be a better approach in many respects than employing an external connection pooler like PgBouncer. There were recent discussions among PostgreSQL contributors on this topic, as seen here. A few sample patches have been submitted by hackers, but nothing has been accepted yet. The PostgreSQL community is very keen to keep the server code lightweight and stable.
Q: Is rebooting the standby the only way to change the master in PostgreSQL?
A standby-to-master promotion does not involve any restart.
From the user’s perspective, a standby is promoted with the pg_ctl promote command or by creating a trigger file. During this operation, the replica stops the recovery-related processing and becomes a read-write database.
Once we have a new master, all the other standby servers need to start replicating from it. This involves changes to the recovery.conf parameters and, yes, a restart: the restart happens only on the standby side, when the master it replicates from has to be changed. PostgreSQL currently does not allow us to change this parameter using a SIGHUP.
Q: Are external connection pooling solutions (PgBouncer, Pgpool) compatible with Java Hibernate ORM?
External connection poolers like PgBouncer and Pgpool are compatible with regular PostgreSQL connections, so Hibernate ORM can treat PgBouncer as a regular PostgreSQL server running on a different port (or the same one, depending on how you configure it). An important point to remember is that they are complementary to connection pools that integrate well with ORM components. For example, c3p0 is a well-known connection pooler for Hibernate. If an ORM connection pooler can be tuned well enough to avoid frequent connections and disconnections, then external pooling solutions like PgBouncer or Pgpool become redundant and can, and should, be avoided.
Q: Regarding connection pools: I want to understand whether the connections are never closed, or whether there are settings to force a connection to close after some time.
There is no need to close a connection if it can be reused (recycled) again and again instead of having a new one created. That is the very purpose of the connection pooler. When an application “closes” a connection, the connection pooler will virtually release the connection from the application and return it to the pool of connections. On the next connection request, instead of establishing a new connection to the database, the connection pooler will pick a connection from the pool and “lend” it to the application. Furthermore, most connection poolers include a parameter to control the release of connections after a specified idle time.
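The lend-and-return cycle described above can be sketched in a few lines. This is a toy model and not how PgBouncer or any particular pooler is implemented: the SimplePool class, its max_idle_seconds parameter, and the injectable clock are all hypothetical names chosen for the illustration, and the class is not thread-safe.

```python
import time
from collections import deque


class SimplePool:
    """Toy connection pool: reuses connections and drops ones idle too long."""

    def __init__(self, connect, max_idle_seconds=300.0, clock=time.monotonic):
        self._connect = connect          # callable that opens a new connection
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._idle = deque()             # (connection, time it was returned)

    def acquire(self):
        """Lend a pooled connection, or open a new one if none is usable."""
        now = self._clock()
        while self._idle:
            conn, returned_at = self._idle.pop()
            if now - returned_at <= self._max_idle:
                return conn
            conn.close()                 # idle for too long: really close it
        return self._connect()

    def release(self, conn):
        """Called when the application 'closes' the connection."""
        self._idle.append((conn, self._clock()))
```

Here release() plays the role of the application’s “close”: the underlying connection stays open and is handed out again by the next acquire(), unless it has been idle for longer than max_idle_seconds.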
Q: Regarding Patroni: can we configure it not to fail over automatically and use Patroni only for manual failover/failback?
Yes, Patroni allows users to pause its automation process, leaving them to manually trigger operations such as failover. The actual procedure for achieving this would make an interesting blog post (we have put it on our to-do list).
Q: Where should we install PgBouncer, Patroni and HAProxy to fit the three-layer format: web frontends, app backends and DB servers? What about etcd?
Patroni must be installed on the database servers. etcd is typically installed there too, but it can also run on other servers, because the set of etcd instances just forms the distributed consensus store. HAProxy and PgBouncer can be installed on the application servers for simplicity, or optionally they can run on dedicated servers, especially when you run a large number of those. Having said that, HAProxy is very lightweight and can be maintained on each application server without noticeable impact. If you want to install PgBouncer on dedicated servers, just make sure to avoid a single point of failure (SPOF) by employing active-passive servers.
Q: How does HAProxy in your demo setup know how to route DML appropriately to the master and slaves (e.g. writes always go to the master and reads are load balanced between the replicas)?
HAProxy does not parse SQL statements in the intermediate layer in order to redirect them to the master or to one of the replicas; this must be done at the application level. To benefit from this traffic distribution, your application needs to send write requests to the appropriate HAProxy port, and likewise for read requests. In our demo setup, the application connected to two different ports, one for reads and another for writes (DML).
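On the application side, this can be as simple as choosing a connection string per request. The sketch below assumes a hypothetical HAProxy host and two listener ports (5000 for the master, 5001 for the replicas); these values are illustrative and are not taken from the webinar setup.

```python
# Assumed HAProxy listeners; host name and ports are illustrative only.
WRITE_DSN = "host=haproxy.example port=5000 dbname=app"
READ_DSN = "host=haproxy.example port=5001 dbname=app"


def dsn_for(readonly: bool) -> str:
    """Pick the connection string based on what the caller intends to do."""
    return READ_DSN if readonly else WRITE_DSN
```

The application, not HAProxy, decides which path a statement takes: a report query would call dsn_for(True), while anything that modifies data would call dsn_for(False).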
Q: How often does the cluster poll each node/slave? Is it tunable for poorly performing networks?
Patroni uses an underlying distributed consensus mechanism for all heartbeat checks. For example, etcd, which can be used for this, has a default heartbeat interval of 100 ms, but it is adjustable. Apart from this, there are tunable, TCP-like timeouts in every layer of the stack. For connection routing, HAProxy polls the Patroni API, which also allows further control over how the checks are done. Having said that, please keep in mind that poorly performing networks are often a bad choice for distributed services, with problems extending beyond timeout checks.
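To make the polling more concrete, here is a minimal sketch of the kind of check a load balancer performs against the Patroni REST API. It assumes Patroni’s API listens on port 8008 and answers HTTP 200 on /master only when the node is the leader and on /replica only when it is a running replica; verify both against your Patroni version and configuration. The helper names are hypothetical.

```python
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code                  # e.g. 503 when the role does not match
    except (urllib.error.URLError, OSError):
        return 0                         # connection refused, timeout, etc.


def node_role(host: str, port: int = 8008, fetch=http_status) -> str:
    """Classify a node from the status codes of its role endpoints."""
    if fetch(f"http://{host}:{port}/master") == 200:
        return "master"
    if fetch(f"http://{host}:{port}/replica") == 200:
        return "replica"
    return "unavailable"
```

The timeout argument is the part to tune on slow networks: too short and healthy nodes are marked unavailable, too long and a failed node keeps receiving traffic.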
Miscellaneous
Q: Hi Avinash/Nando/Jobin, maybe I wasn’t able to catch up with DDLs, but what’s the best way to handle DDLs? In MySQL, we can use pt-online-schema-change and avoid large replica lag. Is there a way to achieve the same in PostgreSQL without blocking/downtime, or does Percona have an equivalent tool for PostgreSQL? Looking forward to this!
Currently, PostgreSQL locks tables for DDLs. Some DDLs, such as creating triggers and indexes, may not block every activity on the table. There isn’t a tool like pt-online-schema-change for PostgreSQL yet. There is, however, an extension called pg_repack, which assists in rebuilding a table online. Additionally, adding the keyword CONCURRENTLY to a CREATE INDEX statement makes it gentle on the system and allows concurrent DMLs and queries to happen while the index is being built. Suppose you want to rebuild the index behind a primary key or unique key: a new index can be created independently, and the index behind the key can then be replaced under a momentary lock that may go unnoticed.
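The primary key case can be sketched as two statements: build a replacement unique index concurrently, then swap it in behind the constraint. The helper below only generates the SQL text; its name is hypothetical, it assumes the constraint follows PostgreSQL’s default <table>_pkey naming, and it expects trusted identifiers since they are interpolated directly.

```python
def rebuild_pkey_statements(table: str, column: str):
    """Return the SQL to rebuild a primary key index with minimal locking."""
    new_index = f"{table}_pkey_new"      # temporary name for the new index
    constraint = f"{table}_pkey"         # assumed default constraint name
    return [
        # Step 1: build the new index without blocking writes.
        f"CREATE UNIQUE INDEX CONCURRENTLY {new_index} ON {table} ({column})",
        # Step 2: swap the constraint onto the new index in one short DDL.
        f"ALTER TABLE {table} DROP CONSTRAINT {constraint}, "
        f"ADD CONSTRAINT {constraint} PRIMARY KEY USING INDEX {new_index}",
    ]
```

Note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so the first statement has to be executed in autocommit mode. The second statement will fail if foreign keys in other tables depend on the primary key being dropped.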
A lot of new features are added in this space with each new release. One of the extreme cases of extended locking is adding a NOT NULL column with a default value to a table. In most database systems this operation can hold a write lock on the table until it completes. The just-released PostgreSQL 11 makes it a brief operation irrespective of the size of the table: it is now achieved with a simple metadata change rather than a complete table rebuild. Because no table rewrite takes place, excessive I/O and other side effects like replication lag are avoided. As PostgreSQL continues to get better at handling DDLs, the scope for external tools is shrinking.
Q: What are the actions that can be performed by the parallelization option in PostgreSQL?
This is an area where PostgreSQL has improved significantly in the last few versions. The answer, then, depends on which version you are using. Parallelization was introduced in PostgreSQL 9.6, with more capabilities added in version 10. As of version 11, a wide range of operations can make use of parallelization, including index building. The more CPU cores your server has at its disposal, the more you will benefit from the latest versions of PostgreSQL, provided that it is properly tuned for parallel execution.
Q: Is there any flashback query or flashback database option in PostgreSQL?
If flashback queries are an application requirement, please consider using temporal tables to better visualize data from a specific time or period. If the application is handling time-series data (such as from IoT devices), then TimescaleDB may be an interesting option for you.
Flashback of the database can be achieved in multiple ways: either with the help of backup tools (and point-in-time recovery) or by using a delayed standby replica.
Q: Regarding pg_repack: we attempted to run pg_repack and for some reason it kept running forever. Can we simply cancel/abort its execution?
Yes, the execution of pg_repack can be aborted without harm. This is safe to do because the tool creates an auxiliary table and uses it to rearrange the data, swapping it with the original table at the end of the process. If its execution is interrupted before it completes, the swapping of tables just doesn’t take place. However, since it works online and doesn’t hold an exclusive lock on the target table, it might take considerable time to complete, depending on the table’s size and the changes made to it during the process. Please explore the parallel feature available with pg_repack.
Q: Will the monitoring tool from Percona be open source?
Percona Monitoring and Management (PMM) has already been released as an open source project, with its source code available on GitHub.
Q: It’s unfortunate that the master/slave terminology is still used on the slides. Why not use leader/follower or orchestrator node/node instead?
We agree with you, particularly regarding the use of “slave”: “replica” is a more generally accepted term (for good reason), with “standby” [server|replica] being more commonly used with PostgreSQL.
Patroni usually employs the terms “leader” and “followers”.
The use of “cluster” (and thus “node”) in PostgreSQL, however, contrasts with what is usually the norm (when we think about traditional Beowulf clusters, or even Galera and Patroni), as it denotes the set of databases running on a single PostgreSQL instance/server.