Safely pairing HA-Proxy with virtual network interface providers like Keepalived or Heartbeat
This is sort of a follow-up to the Deploying HA-Proxy + Keepalived with Mercurial for distributed config post.
During testing we ran into an issue where the HA-Proxy instance running on the slave member of our cluster would fail to bind some of its frontend proxies:
Starting haproxy: [ALERT] : Starting proxy Public-HTTPS: cannot bind socket
After some head scratching I noticed that the problem was only arising on those proxies that explicitly defined the IP address of a virtual interface that was being managed by Keepalived (or maybe Heartbeat for you).
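For illustration, a frontend of roughly this shape triggers the error. The bind address is the Keepalived-managed VIP from this setup; the port and backend name are invented for the example:

```
# Hypothetical excerpt from haproxy.cfg -- 172.16.61.150 is a VIP that
# only exists on the active node, so this bind fails on the slave
frontend Public-HTTPS
    bind 172.16.61.150:443
    default_backend web-servers
```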
This is because both of these High-Availability clustering systems use a rather simplistic design whereby the “shared” virtual IP is installed only on the active node in the cluster; the nodes in a dormant state (i.e. the slaves) do not actually have those virtual IPs assigned to them at all. It’s a sort of “IP address hot-swapping” design. I learnt this by executing a simple command, first from the master server:
$ ip a
<snipped stuff for brevity>
2: seth0: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:15:5d:28:7d:19 brd ff:ff:ff:ff:ff:ff
    inet 172.16.61.151/24 brd 172.16.61.255 scope global seth0
    inet 172.16.61.150/24 brd 172.16.61.255 scope global secondary seth0:0
    inet 172.16.61.159/24 brd 172.16.61.255 scope global secondary seth0:1
    inet6 fe80::215:5dff:fe28:7d19/64 scope link
       valid_lft forever preferred_lft forever
<snipped trailing stuff for brevity>
Then again, from the slave server:
$ ip a
<snipped stuff for brevity>
2: seth0: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:15:5d:2d:9c:11 brd ff:ff:ff:ff:ff:ff
    inet 172.16.61.152/24 brd 172.16.61.255 scope global seth0
    inet6 fe80::215:5dff:fe2d:9c11/64 scope link
       valid_lft forever preferred_lft forever
<snipped trailing stuff for brevity>
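For reference, the Keepalived configuration behind the two VIPs above might look something like this. This is only a sketch: the instance name, router id, priority, and advert interval are invented, while the interface and addresses come from the output above:

```
# Hypothetical vrrp_instance block in keepalived.conf
vrrp_instance VI_1 {
    state MASTER
    interface seth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        172.16.61.150/24
        172.16.61.159/24
    }
}
```

On the master these addresses appear as the `secondary` entries on seth0; on the slave they are absent until a failover promotes it.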
Unfortunately this behaviour can cause problems for programs like HA-Proxy that have been configured to expect specific IP addresses to exist on the server. I considered working around it by writing scripts that hook into the HA cluster’s state-change events to stop and start HA-Proxy as needed, but that approach seemed clunky and unintuitive. So I dug a little deeper and came across a bit of a gem hidden away in the depths of the Linux networking stack: a simple boolean setting called “net.ipv4.ip_nonlocal_bind”. It allows a program like HA-Proxy to create listening sockets bound to IP addresses that do not actually exist on the server. It was created specifically for this situation.
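The underlying failure is easy to reproduce outside HA-Proxy. The sketch below (assuming a Linux box with the default `net.ipv4.ip_nonlocal_bind = 0`) binds a socket to an address that is not assigned to any local interface; the RFC 5737 TEST-NET address and the port stand in for a VIP that currently lives on the other node:

```python
import errno
import socket

# 192.0.2.1 (TEST-NET-1) stands in for a Keepalived VIP that is NOT
# currently assigned to this machine.
vip = "192.0.2.1"

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # With net.ipv4.ip_nonlocal_bind=0 (the default) this raises
    # EADDRNOTAVAIL -- the same failure HA-Proxy reports as
    # "cannot bind socket" on the slave node.
    s.bind((vip, 8443))
    print("bind succeeded (non-local binds are allowed)")
except OSError as e:
    print("bind failed:", errno.errorcode.get(e.errno, e.errno))
finally:
    s.close()
```

Linux also offers a per-socket equivalent, the IP_FREEBIND socket option, but the sysctl covers every program on the host without touching their code, which is what we want here.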
So in the end the fix was as simple as adding/updating the /etc/sysctl.conf file to include the following key/value pair:

net.ipv4.ip_nonlocal_bind = 1

and then reloading the settings with sysctl -p (and since it lives in /etc/sysctl.conf, the change survives a reboot).
My previous experience of setting up these low-level High-Availability clusters was with Windows Server’s Network Load Balancing (NLB) feature, which works quite differently from Keepalived and Heartbeat. It relies upon some low-level ARP hacking/trickery and a sort of distributed time-slicing algorithm, but it does ensure that each node in the cluster (whether in a master or slave position) remains allocated the virtual IP address(es) at all times. I suppose there is always more than one way to crack an egg…