Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HAProxy + pgbouncer configuration and pgpool-II as an alternative #710

Open
rcknr opened this issue Jul 30, 2024 · 8 comments
Open

HAProxy + pgbouncer configuration and pgpool-II as an alternative #710

rcknr opened this issue Jul 30, 2024 · 8 comments

Comments

@rcknr
Copy link
Contributor

rcknr commented Jul 30, 2024

I have followed the most common scenario for deploying a cluster recently with HAProxy and pgbouncer.
After maintaining this setup for a while I came up with understanding of the following downsides:

  • pgbouncer is deployed to each cluster node which increases complexity of the setup and maintenance;
  • HAProxy has no awareness of database specifics and therefore configured to handle write/read operations through different ports. This means that load-balancing have to be handled by the client application;
  • Current confd template lists all database nodes for each frontend, relying on patroni health check. This means that some of the backends are always be down (e.g. primary node for read-only frontend).

The last point can be mitigated with a more involved template which also check for node role before including it in a particular frontend. @vitabaks please let me know if you'd be open to a contribution regarding this.

While researching I mistakingly assumed that pgpool-II was a functional equivalent of pgbouncer and the latter was chosen due to bigger popularity. Since then I have found that pgpool-II is actually so much more capable and can act as a load-balancer among other things. For now I'm testing it as a replacement for both HAProxy and pgbouncer and keep learning its other features like query caching.

@vitabaks
Copy link
Owner

vitabaks commented Jul 30, 2024

@rcknr Thanks for the comment.

I think we could implement the option to install pgbouncer on separate nodes, but by default, we install it locally and use a Unix socket to connect the pool to the database, as it is a more efficient version of the scheme.

P.S. We are not considering pgpool-II and it will most likely not be added to the main project, but you can experiment with your installation because postgresql_cluster provides flexibility in choosing a scheme.

@rcknr
Copy link
Contributor Author

rcknr commented Jul 30, 2024

For confd template I have the following proposal:

{{- range $members := gets "/members/*" -}}
  {{- $data := json $members.Value -}}
  {{- $hostname := index (split $members.Key "/") 2 -}}
  {{- $host := base (replace (index (split (index (split $data.conn_url ":") 1) "/") 2) "@" "/" -1) -}}
  {{- $port := index (split (index (split $data.api_url ":") 2) "/") 0 -}}
  {{- if eq $data.role "master" }}
  	server {{ $hostname }} {{ $host }}:6432 check port {{ $port }}
  {{ end -}}	
{{- end }}

This section refers to write frontend. For read frontend "master" is replaced with "replica".
This avoids listing replicas in write frontend and masters in read frontend respectively.

@vitabaks
Copy link
Owner

this is not necessary because HAProxy sends checks to the Patroni REST API and directs traffic only to servers with the required role.

image

@rcknr
Copy link
Contributor Author

rcknr commented Jul 30, 2024

Yes, I understand that. However, if you use any monitoring on HAProxy it will complain that some backends are down and you are going to need to get around that. One could argue that confd is not really necessary as you could just list all nodes in HAProxy with Ansible and fully rely on patroni healthchecks.

@vitabaks
Copy link
Owner

I monitor the availability of port 5000 and 5001, and not separately each host that is defined in the HAProxy configuration.

According to the same principle (using the Patroni REST API) I am implementing cloud load balancers (will be available in version 2.0).

@rcknr
Copy link
Contributor Author

rcknr commented Jul 30, 2024

HAProxy though checks each backend listed and appropriately marks some as being down under its metrics.
Again, in this scenario confd isn't doing much and could be taken from the equation without compromising reliability.
However, if you account for the role in the template it suddenly adds another layer of accountability with DCS.

@vitabaks
Copy link
Owner

vitabaks commented Dec 9, 2024

If you're interested in this feature, please consider becoming a sponsor.

The development of this feature requires sponsorship to fund developer efforts. If you're already a sponsor and are interested in this feature, please leave a comment here so we can prioritize this issue accordingly.

We also welcome contributions from those willing to dedicate their time to implement this feature on a voluntary basis. Please see the Contributing Guide for more information.

@joe-at-startupmedia
Copy link

  • HAProxy has no awareness of database specifics and therefore configured to handle write/read operations through different ports. This means that load-balancing have to be handled by the client application;

You can easily make HAProxy aware of database specifics (which replica to route master requests)
https://github.com/joe-at-startupmedia/happac/blob/743946594976bafc5cca752a71c8b6d4ef07da80/README.md?plain=1#L90

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants
@rcknr @joe-at-startupmedia @vitabaks and others