This is a quick post on how to use keepalived to set up high availability on two kamailio machines. In this setup there will be a “primary” and a “secondary” node. Each node will run kamailio and keepalived, and the two will share a “floating” IP address (sometimes called a “shared” IP address).
The idea is that each node runs health checks on itself and on the other node. If the secondary node detects that kamailio is down on the primary, it moves the “floating” IP address to itself and becomes the MASTER node.
After the IP address is moved, all SIP traffic will automatically be directed to the new MASTER node. Both kamailio nodes should run almost identical routing configs to ensure traffic is routed properly after fail-over. This setup was tested on Debian 9 with kamailio version 5.10.
We will use the following example IP address setup:
node 1 IP Address: 1.2.3.4
node 2 IP Address: 5.6.7.8
“floating” IP Address: 5.5.5.5
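In order for this example to work, you need to enable binding to a non-local IP address, because kamailio on both nodes listens on the floating IP while only one node holds it at a time. To enable this setting, run the following on each node: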
$ echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind
To permanently save this setting, add the following line to a file like /etc/sysctl.d/99-non-local-bind.conf
net.ipv4.ip_nonlocal_bind = 1
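The persistent setting above can be written and applied with a small helper. This is only a sketch: the function name write_nonlocal_bind_conf is made up for illustration, the default path is the example file name from this post, and the commands in the usage comment need to run as root.

```shell
#!/bin/bash
# Hypothetical helper (not part of any package): persist ip_nonlocal_bind
# in a sysctl drop-in file. The default path is the example used in this post.
write_nonlocal_bind_conf() {
    local conf_file
    conf_file="${1:-/etc/sysctl.d/99-non-local-bind.conf}"
    printf 'net.ipv4.ip_nonlocal_bind = 1\n' > "$conf_file"
}

# Usage (as root):
#   write_nonlocal_bind_conf
#   sysctl --system                    # reload all sysctl configuration files
#   sysctl net.ipv4.ip_nonlocal_bind   # print the value currently in effect
```

After reloading, the last command should report a value of 1 on both nodes.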
We want kamailio to bind to the floating IP address and to answer SIP OPTIONS pings from both nodes. You will want something like this in your kamailio.cfg:
#!substdef "!NODE01!1.2.3.4!g"
#!substdef "!NODE02!5.6.7.8!g"

# bind to local IP and "floating" IP
listen=udp:1.2.3.4:5060 # on node 2, use 5.6.7.8 here
listen=udp:5.5.5.5:5060

# main request routing logic
route {
    # allow pings from other node
    if (is_method("OPTIONS") && ($si == "NODE01" || $si == "NODE02")) {
        options_reply();
    }
    ...
}
Start kamailio and make sure it’s running.
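One way to confirm that kamailio is bound to both addresses is to look for the two UDP listeners in the output of ss. The sketch below assumes the example addresses from this post; the helper name has_udp_listener is hypothetical, and it only checks whether any whitespace-separated field of the listing equals the given address:port.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the listing on stdin (for example the
# output of `ss -lun`) contains a field equal to the given address:port.
has_udp_listener() {
    local addr_port
    addr_port="$1"
    awk -v want="$addr_port" '
        { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    '
}

# Usage on node 1 (on node 2, use 5.6.7.8 for the local address):
#   ss -lun | has_udp_listener 1.2.3.4:5060 && echo "local listener is up"
#   ss -lun | has_udp_listener 5.5.5.5:5060 && echo "floating listener is up"
```

If the floating listener is missing on the node that does not currently hold 5.5.5.5, the ip_nonlocal_bind setting from earlier is the first thing to check.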
Next we will set up keepalived. To install everything we need, run:
$ apt-get install libipset3 sipsak keepalived
We will be using sipsak with a custom bash script to report the “health” of each node. Add the following bash script to node 1 at /etc/keepalived/node01.sh
#!/bin/bash

node01=1.2.3.4
node02=5.6.7.8
return_code=0 # success

# check local instance
timeout 2 sipsak -s "sip:$node01:5060"
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
    echo "sip ping successful to node01 [$node01]"
    exit $return_code
fi

# local instance failed, check remote
timeout 2 sipsak -s "sip:$node02:5060"
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
    echo "sip ping successful to node02 [$node02]"
    return_code=1
fi

echo "return code [$return_code]"
exit $return_code
Make sure the file is executable:
$ chmod +x /etc/keepalived/node01.sh
The script on node 1 first checks the health of kamailio locally by sending a SIP OPTIONS ping. If that fails, it checks the health of node 2. If node 1 is healthy, it reports success to keepalived by returning 0. If node 1 is not healthy and node 2 is healthy, it reports a failure to keepalived by returning 1. If neither node answers, it still returns 0, so node 1 does not give up the floating IP when there is no healthy node to take it.
And add the following bash script to node 2 at /etc/keepalived/node02.sh
#!/bin/bash

node01=1.2.3.4
node02=5.6.7.8
return_code=1 # fail

# check remote instance first
timeout 2 sipsak -s "sip:$node01:5060"
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
    echo "sip ping successful to node01 [$node01]"
    exit $return_code
fi

# remote instance failed, check local
timeout 2 sipsak -s "sip:$node02:5060"
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
    echo "sip ping successful to node02 [$node02]"
    return_code=0
fi

echo "return code [$return_code]"
exit $return_code
Make sure the file is executable:
$ chmod +x /etc/keepalived/node02.sh
The script on node 2 first checks the health of kamailio on node 1 (remotely). If that fails, it checks the health of node 2 (locally). If node 1 is healthy, it reports a “local” failure to keepalived by returning 1. If node 1 is not healthy and node 2 is healthy, it reports a “success” to keepalived by returning 0. If neither node answers, it returns 1.
keepalived decides which node holds the MASTER role based on the return code reported by these scripts: while a node's script returns a non-zero code, keepalived treats the check as failed and that node does not hold the floating IP. You can get more creative and add more logic if you’d like, but for the purposes of this post, I kept it simple.
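The return codes of the two scripts above can be summarized in a small table. The sketch below does not send any SIP traffic; it only mirrors the branching of node01.sh and node02.sh. The function name health_return_code and the "up"/"down" arguments are made up for this illustration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the decision logic of node01.sh and node02.sh.
# Arguments: role ("node01" or "node02"), then "up" or "down" for node 1 and
# for node 2 (the result of each SIP ping). Prints the exit code that the
# corresponding script would return.
health_return_code() {
    local role node01_state node02_state
    role="$1"
    node01_state="$2"
    node02_state="$3"
    if [ "$node01_state" = down ] && [ "$node02_state" = up ]; then
        # the only case where node 1 fails and node 2 succeeds
        if [ "$role" = node01 ]; then echo 1; else echo 0; fi
    else
        # node 1 answers, or neither node answers: node 1 keeps the IP
        if [ "$role" = node01 ]; then echo 0; else echo 1; fi
    fi
}

# Print the full decision table
for n1 in up down; do
    for n2 in up down; do
        echo "node1=$n1 node2=$n2 -> node01.sh: $(health_return_code node01 "$n1" "$n2"), node02.sh: $(health_return_code node02 "$n1" "$n2")"
    done
done
```

The two scripts always return opposite codes, and only the combination “node 1 down, node 2 up” hands the success code to node 2.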
Last, we will configure keepalived by editing the following file on node 1: /etc/keepalived/keepalived.conf
vrrp_script check_sip {
    script "/etc/keepalived/node01.sh"
    interval 6 # check every 6 seconds
}

vrrp_instance VI_SBC {
    state MASTER
    interface eth0
    virtual_router_id 51
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    virtual_ipaddress {
        5.5.5.5 dev eth0
    }
    track_script {
        check_sip
    }
}
And on node 2: /etc/keepalived/keepalived.conf
vrrp_script check_sip {
    script "/etc/keepalived/node02.sh"
    interval 6 # check every 6 seconds
}

vrrp_instance VI_SBC {
    state BACKUP
    interface eth0
    virtual_router_id 51
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 11111
    }
    virtual_ipaddress {
        5.5.5.5 dev eth0
    }
    track_script {
        check_sip
    }
}
The settings above tell keepalived to run the health check script on each node every 6 seconds. When a node becomes MASTER, keepalived adds the floating IP (5.5.5.5) to eth0 on that node.
Start the keepalived service on each node and watch the syslog. As long as kamailio is running and responding to the SIP OPTIONS pings on node 1, node 1 will be MASTER. You can test by turning off kamailio on node 1 and watching the IP move to node 2. Use the following command to check for the IP address:
$ ip addr show
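To avoid reading the full listing by eye, the check can be scripted. This is a minimal sketch assuming the example floating IP and the eth0 interface from this post; the helper name holds_ip is hypothetical. It relies on the one-line output format of ip -o -4 addr show, where the fourth field is the address with its prefix length.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the `ip -o -4 addr show` output on stdin
# lists the given IPv4 address (the prefix length is ignored).
holds_ip() {
    local ip
    ip="$1"
    awk -v want="$ip" '
        { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit !found }
    '
}

# Usage on either node:
#   ip -o -4 addr show dev eth0 | holds_ip 5.5.5.5 \
#       && echo "this node holds the floating IP" \
#       || echo "this node does not hold the floating IP"
```

Running it on both nodes before and after stopping kamailio on node 1 shows which node currently owns 5.5.5.5.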
4 replies on “kamailio high availability using keepalived”
Will the setup work for failover of TCP stateful transactions to the standby box?
No, this is not designed to handle TCP stateful connections.
Hi Emmanuel,
What will happen to existing calls when a failover occurs? How do we ensure that the call states are replicated between the two HA nodes and transactions are handled appropriately?
I believe you need to implement something like anycast for that level of resilience.