kamailio high availability using keepalived

This is a quick post on how to use keepalived to set up high availability on two kamailio machines. In this setup there will be a “primary” and a “secondary” node. Each node runs kamailio and keepalived, and the two nodes share a “floating” (sometimes called “shared”) IP address.
The idea is that each node runs health checks on itself and the other node. If the checks show that kamailio is down on the node holding the floating IP address and healthy on the other node, keepalived moves the floating IP address to the healthy node, which becomes the MASTER node.
After the IP address is moved, all SIP traffic is automatically directed to the new MASTER node. Both kamailio nodes should run almost identical routing configs to ensure traffic is routed properly after fail-over. This setup was tested on Debian 9 with kamailio version 5.10.

We will use the following example IP address setup:

node 1 IP Address: 1.2.3.4
node 2 IP Address: 5.6.7.8
“floating” IP Address: 5.5.5.5

In order for this example to work, you need to enable binding to a non-local IP address. On each node, run the following as root to enable this setting:
$ echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind

To make this setting persist across reboots, add the following line to a file such as /etc/sysctl.d/99-non-local-bind.conf:
net.ipv4.ip_nonlocal_bind = 1
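
Before moving on, it can help to confirm that the setting is actually active. The following is a minimal sketch, not part of the original setup: the function name nonlocal_bind_enabled is made up for this example, and the optional file argument exists only so the check can be pointed at a different file.

```shell
#!/bin/bash
# Hypothetical helper (name invented for this sketch).
# nonlocal_bind_enabled [FILE]: succeeds when FILE contains the value 1.
# FILE defaults to the kernel's ip_nonlocal_bind switch.
nonlocal_bind_enabled() {
  local file=${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}
  [[ -r $file ]] || return 1   # unreadable or missing file counts as "not enabled"
  [[ $(<"$file") == 1 ]]       # $(<file) reads the file and drops the trailing newline
}

# Example use:
#   nonlocal_bind_enabled && echo "non-local bind is enabled"
```

If the check fails after you have created the file in /etc/sysctl.d, running sysctl --system as root reloads the sysctl configuration files without a reboot.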

We want kamailio to bind to the floating IP address and to answer SIP OPTIONS pings from both nodes. So you will want something like this in your kamailio.cfg:

#!substdef "!NODE01!1.2.3.4!g"
#!substdef "!NODE02!5.6.7.8!g"

# bind to local IP and "floating" IP
listen=udp:1.2.3.4:5060 # on node 2, use 5.6.7.8 here
listen=udp:5.5.5.5:5060

# main request routing logic
route {

  # answer OPTIONS pings from either node, then stop processing the request
  if (is_method("OPTIONS") && ($si == "NODE01" || $si == "NODE02")) {
    options_reply();
    exit;
  }

  ...

}

Start kamailio and make sure it’s running.

Next we will set up keepalived. To install everything we need, run:
$ apt-get install libipset3 sipsak keepalived

We will be using sipsak with a custom bash script to report the “health” of each node. Add the following bash script to node 1 at /etc/keepalived/node01.sh:

#!/bin/bash

node01=1.2.3.4
node02=5.6.7.8
return_code=0 # success

# check local instance
timeout 2 sipsak -s sip:$node01:5060
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
  echo "sip ping successful to node01 [$node01]"
  exit $return_code
fi

# local instance failed, check remote
timeout 2 sipsak -s sip:$node02:5060
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
  echo "sip ping successful to node02 [$node02]"
  return_code=1
fi

echo "return code [$return_code]"

exit $return_code

Make sure the file is executable:
$ chmod +x /etc/keepalived/node01.sh
The script on node 1 first checks the health of kamailio locally (by sending a SIP OPTIONS ping). If that check fails, it checks the health of node 2. If node 1 is healthy, the script reports success to keepalived by returning 0. If node 1 is not healthy and node 2 is healthy, it reports a failure to keepalived by returning 1. If neither node answers, it still returns 0, so keepalived does not mark node 1 as failed.

Then add the following bash script to node 2 at /etc/keepalived/node02.sh:

#!/bin/bash

node01=1.2.3.4
node02=5.6.7.8
return_code=1 # fail

# check remote instance first
timeout 2 sipsak -s sip:$node01:5060
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
  echo "sip ping successful to node01 [$node01]"
  exit $return_code
fi

# remote instance failed, check local
timeout 2 sipsak -s sip:$node02:5060
exit_status=$?
if [[ $exit_status -eq 0 ]]; then
  echo "sip ping successful to node02 [$node02]"
  return_code=0
fi

echo "return code [$return_code]"

exit $return_code

Make sure the file is executable:
$ chmod +x /etc/keepalived/node02.sh
The script on node 2 first checks the health of kamailio on node 1 (remotely). If that check fails, it checks the health of node 2 (locally). If node 1 is healthy, the script reports a “local” failure to keepalived by returning 1. If node 1 is not healthy and node 2 is healthy, it reports a “success” to keepalived by returning 0. If neither node answers, it returns 1.

keepalived will promote the node to MASTER based on the return code reported by the bash scripts. You can get more creative and add more logic if you’d like, but for the purposes of this post, I kept it simple.
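
The two scripts share the same decision logic with the roles swapped, which can be sketched as a single function. This is only an illustration under stated assumptions: the names sip_ping and health_code and the role labels primary and secondary are invented here, and the function prints the exit code rather than exiting so the logic is easy to inspect.

```shell
#!/bin/bash
# Hypothetical sketch; sip_ping and health_code are names made up for this example.

# Send one SIP OPTIONS ping with the same sipsak call the scripts above use.
# Returns 0 when the target answers within 2 seconds.
sip_ping() {
  timeout 2 sipsak -s "sip:$1:5060" >/dev/null 2>&1
}

# health_code ROLE LOCAL_IP REMOTE_IP
# Prints the code keepalived should receive: 0 = healthy, 1 = failed.
health_code() {
  local role=$1 local_ip=$2 remote_ip=$3
  if [[ $role == primary ]]; then
    # primary fails only when it is down and the other node is up
    if sip_ping "$local_ip"; then echo 0
    elif sip_ping "$remote_ip"; then echo 1
    else echo 0
    fi
  else
    # secondary succeeds only when the primary is down and it is up itself
    if sip_ping "$remote_ip"; then echo 1
    elif sip_ping "$local_ip"; then echo 0
    else echo 1
    fi
  fi
}
```

With this in place, a script on node 1 could end with exit "$(health_code primary 1.2.3.4 5.6.7.8)" and the one on node 2 with exit "$(health_code secondary 5.6.7.8 1.2.3.4)".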

Last, we will configure keepalived by editing the following file on node 1: /etc/keepalived/keepalived.conf

vrrp_script check_sip {
  script "/etc/keepalived/node01.sh"
  interval 6 # check every 6 seconds
}

vrrp_instance VI_SBC {
  state MASTER
  interface eth0
  virtual_router_id 51
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 11111
  }
  virtual_ipaddress {
    5.5.5.5 dev eth0
  }
  track_script {
    check_sip
  }
}

And on node 2: /etc/keepalived/keepalived.conf

vrrp_script check_sip {
  script "/etc/keepalived/node02.sh"
  interval 6 # check every 6 seconds
}

vrrp_instance VI_SBC {
  state BACKUP
  interface eth0
  virtual_router_id 51
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 11111
  }
  virtual_ipaddress {
    5.5.5.5 dev eth0
  }
  track_script {
    check_sip
  }
}

The settings above tell keepalived to run the health check script on each node every 6 seconds. When a node becomes MASTER, keepalived brings up the floating IP (5.5.5.5) on eth0 on that node.
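
The two keepalived.conf files differ only in the script path and the initial state, so one way to keep them in sync is to generate both from a single template. The following is a rough sketch of that idea. The function name render_keepalived_conf is invented for this example, and every other value is copied from the configs above.

```shell
#!/bin/bash
# Hypothetical helper (name invented for this sketch).
# render_keepalived_conf STATE SCRIPT: prints a keepalived.conf to stdout.
# STATE is MASTER or BACKUP; SCRIPT is the path of the health check script.
render_keepalived_conf() {
  local state=$1 script=$2
  cat <<EOF
vrrp_script check_sip {
  script "$script"
  interval 6 # check every 6 seconds
}

vrrp_instance VI_SBC {
  state $state
  interface eth0
  virtual_router_id 51
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass 11111
  }
  virtual_ipaddress {
    5.5.5.5 dev eth0
  }
  track_script {
    check_sip
  }
}
EOF
}
```

On node 1 this could be used as render_keepalived_conf MASTER /etc/keepalived/node01.sh > /etc/keepalived/keepalived.conf, and on node 2 with BACKUP and node02.sh.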

Start the keepalived service on each node and watch the syslog. As long as kamailio is running and responding to the SIP OPTIONS pings on node 1, node 1 will be MASTER. You can test this by stopping kamailio on node 1 and watching the IP move to node 2. Use the following command to check for the IP address:
$ ip addr show
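
If you want to check for the floating IP from a script rather than by eye, the one-line output mode of ip is easier to parse. This is a small sketch, assuming the output format of ip -o -4 addr show (interface name in the second column, address/prefix in the fourth). The function name holds_ip is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper (name invented for this sketch).
# holds_ip ADDRESS: reads `ip -o -4 addr show` output on stdin and succeeds
# when ADDRESS is assigned to any interface.
holds_ip() {
  awk -v ip="$1" '
    $3 == "inet" { split($4, a, "/"); if (a[1] == ip) found = 1 }  # strip the /prefix
    END { exit found ? 0 : 1 }
  '
}

# Example use:
#   ip -o -4 addr show | holds_ip 5.5.5.5 && echo "floating IP is on this node"
```

The address is compared as a whole string, so 5.5.5.5 does not accidentally match 5.5.5.50.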

4 replies on “kamailio high availability using keepalived”

will the setup work for tcp stateful transactions failover to standby box ?

no, this is not designed to handle tcp stateful connections.

HI Emmanuel,

What will happen to existing calls when a failover occurs? How do we ensure that the call states are replicated b/w the two HA nodes and transactions are handled appropriately?

i believe you need to implement something like anycast for that level of resilience.
