Thursday, February 23, 2012

SIP Load Sharing/Balancing and Failover using DNS SRV records

Note: It is assumed that you know the SIP protocol and how to configure SIP clients and SIP servers. [You can also check out my other blog posts on SIP for more info on SIP client and SIP server configuration]
Brief Introduction to DNS:
DNS (Domain Name System) is a hierarchical distributed naming system for computers, services or any resource connected to the internet or a private network. It translates queries for domain names into IP addresses for the purpose of locating computer services and devices worldwide. (Src: Wikipedia)

In layman's terms, it maps human-friendly names to IP addresses. For example, it is a lot easier to remember a host name than to remember the IP address of the www host of a domain.
A DNS server stores the DNS records for a domain name, such as Address (A) records, Name Server (NS) records, Mail Exchanger (MX) records, Service (SRV) records and more. It answers the DNS queries made by its clients from its database.


DNS Service (DNS SRV) record - RFC 2782: A type of DNS record/entry that specifies the location of a service available in a domain. It is typically used by clients to locate a service within a domain.

For example:
In an Active Directory environment, PCs on the domain rely on SRV records to locate the domain controllers they authenticate against within their domain.

In a SIP environment, SIP clients use SRV records to determine where to send an outgoing call.

Most importantly, DNS SRV records allow you to use the domain name rather than the full hostname of a server in the SIP address field of the client configuration.

An SRV record is written in the ZONE file as: _service._proto.name. TTL class SRV priority weight port target

_sip._udp.rapidtech.phones. 300 IN SRV 0 40 5060 sipserver1.rapidtech.phones.
_sip._udp.rapidtech.phones. 300 IN SRV 0 60 5060 sipserver2.rapidtech.phones.
_sip._udp.rapidtech.phones. 300 IN SRV 1 5 5060 sipserverbackup.rapidtech.phones.

service: name of the service, e.g. sip

proto: transport protocol of the service, e.g. tcp or udp

name: domain name that this record belongs to, e.g. rapidtech.phones

TTL: Time to Live value for this DNS record (how long a resolver may cache the record before it expires). Adjust this based on your environment. E.g. if you set it to 300 seconds, a client can cache the answer for up to 5 minutes before it has to query the DNS server again.

class: DNS class. This is always 'IN' here

Priority: Priority among multiple hosts offering the same service. This lets you define failover service hosts. A lower value means more preferred, e.g. sipserver1 and sipserver2 have priority 0 and thus act as primary servers, while sipserverbackup has priority 1 (greater than 0) and thus acts as the failover server.

Weight: A relative weight for records with the same priority. It is used for load sharing among the servers, e.g. sipserver1 and sipserver2 share priority 0 with weights 40 and 60, so sipserver1 will be used about 40% of the time and sipserver2 about 60% of the time. If all the servers with priority 0 are unavailable, the record with the next lowest priority value (the next most preferred one) will be chosen, which is sipserverbackup.rapidtech.phones, and it will be used 100% of the time as it is not sharing load with another server.

port: TCP or UDP port on which the service is available, e.g. 5060 for SIP

target: the canonical hostname of the server providing the service, e.g. sipserver1.rapidtech.phones., sipserver2.rapidtech.phones., sipserverbackup.rapidtech.phones.
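
To make the priority and weight rules above more concrete, here is a minimal Python sketch of how a client could order the three example records before trying them. It is a simplified take on the selection idea in RFC 2782, not the code of any particular SIP client; the names SrvRecord and order_srv_records are made up for this illustration.

```python
import random
from collections import namedtuple

# Hypothetical container for one SRV answer (name chosen for this sketch).
SrvRecord = namedtuple("SrvRecord", "priority weight port target")


def order_srv_records(records, rng=random):
    """Return the records in the order a client would try them.

    Lower priority values come first. Inside one priority group the
    order is drawn at random, with the chance of being picked next
    proportional to the record's weight.
    """
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            total = sum(r.weight for r in group)
            if total == 0:
                # Only zero-weight records are left: pick uniformly.
                pick = rng.choice(group)
            else:
                point = rng.random() * total  # 0 <= point < total
                running = 0
                pick = group[-1]
                for record in group:
                    running += record.weight
                    if point < running:
                        pick = record
                        break
            group.remove(pick)
            ordered.append(pick)
    return ordered


if __name__ == "__main__":
    example = [
        SrvRecord(0, 40, 5060, "sipserver1.rapidtech.phones."),
        SrvRecord(0, 60, 5060, "sipserver2.rapidtech.phones."),
        SrvRecord(1, 5, 5060, "sipserverbackup.rapidtech.phones."),
    ]
    for record in order_srv_records(example):
        print(record.priority, record.weight, record.target)
```

Run it a few times: sipserver1 or sipserver2 comes first (sipserver2 more often), and sipserverbackup is always last because it has the higher priority value.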

Ok great, now how do I implement it? Right!

DNS server:
Here I am going to use BIND/named (Linux based DNS server)
To run your own DNS: Install the bind, bind-devel, bind-utils, and caching-nameserver packages

Master Zone file:  rapidtech.phones

 #vi /var/lib/named/master/rapidtech.phones

$TTL 1800
@               IN SOA          dns1.dnsserver.phones.     root.dns1.dnsserver.phones. (
                                2012022302      ; serial
                                1800                    ; refresh time in seconds
                                600                      ; retry time in seconds
                                1w                        ; expiry time
                                1800 )                  ; minimum TTL

rapidtech.phones.        IN NS           dns1.dnsserver.phones.
sipserver1  IN A  
sipserver2  IN A  
NTP            IN A  
FTP             IN A  

_sip._udp.rapidtech.phones.   IN SRV 0 40 5060 sipserver1.rapidtech.phones.
_sip._udp.rapidtech.phones.   IN SRV 0 60 5060 sipserver2.rapidtech.phones.
_sip._udp.rapidtech.phones.   IN SRV 1 5 5060 sipserverbackup.rapidtech.phones.
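
Before loading the zone, it can help to double check that the SRV lines say what you think they say. The following Python sketch parses the three SRV lines from the zone file above and works out the expected share of traffic per server inside each priority group. The helper names parse_srv_line and weight_shares are my own for this example, and the parser only handles the simple one-line form used in this post.

```python
from collections import defaultdict

# The three SRV lines from the zone file in this post.
ZONE_SRV_LINES = """
_sip._udp.rapidtech.phones.   IN SRV 0 40 5060 sipserver1.rapidtech.phones.
_sip._udp.rapidtech.phones.   IN SRV 0 60 5060 sipserver2.rapidtech.phones.
_sip._udp.rapidtech.phones.   IN SRV 1 5 5060 sipserverbackup.rapidtech.phones.
"""


def parse_srv_line(line):
    """Parse 'owner [TTL] IN SRV priority weight port target' into a dict."""
    fields = line.split()
    idx = fields.index("SRV")  # the four numbers/names after SRV are the payload
    priority, weight, port = (int(x) for x in fields[idx + 1:idx + 4])
    return {
        "owner": fields[0],
        "priority": priority,
        "weight": weight,
        "port": port,
        "target": fields[idx + 4],
    }


def weight_shares(records):
    """Return {target: share of traffic within its own priority group}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for r in records:
        totals[r["priority"]] += r["weight"]
        counts[r["priority"]] += 1
    shares = {}
    for r in records:
        total = totals[r["priority"]]
        if total == 0:
            # All weights in the group are 0: treat them as equal.
            shares[r["target"]] = 1 / counts[r["priority"]]
        else:
            shares[r["target"]] = r["weight"] / total
    return shares


if __name__ == "__main__":
    parsed = [parse_srv_line(l) for l in ZONE_SRV_LINES.strip().splitlines()]
    for target, share in weight_shares(parsed).items():
        print(f"{target}: {share:.0%}")
```

For the records in this post that works out to 40/(40+60) for sipserver1, 60/(40+60) for sipserver2 and 5/5 for sipserverbackup.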

Define a zone in named.conf
#vi /etc/named.conf

logging {
        channel log_file {
                file "/var/log/bind.log" versions 3 size 100M;
                severity dynamic;
                print-time yes;
                print-severity yes;
                print-category yes;
        };

        category statistics { log_file; };
        category queries { log_file; };
        category xfer-in { log_file; };
        category xfer-out { log_file; };
        category default { log_file; };
};
# All DNS category activities will be logged in /var/lib/named/var/log/bind.log [ Warning!! They will NOT be logged in /var/log/bind.log ]

zone "rapidtech.phones" in {
        allow-transfer { any; };
        file "master/rapidtech.phones";
        type master;
};

Now restart the DNS service

#service named restart 
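
Once named is back up, you may want to confirm that it really answers the SRV query. Here is a small Python sketch that calls dig (from the bind-utils package) and parses its +short output, which prints one "priority weight port target" line per SRV record. The function names are my own, and the server address 127.0.0.1 is an assumption that you run the check on the DNS server itself; change it to your DNS server's address.

```python
import subprocess


def parse_dig_short(output):
    """Turn 'dig +short ... SRV' output into sorted (priority, weight, port, target) tuples."""
    records = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank lines and dig warnings
        try:
            priority, weight, port = (int(x) for x in parts[:3])
        except ValueError:
            continue  # not an SRV answer line
        records.append((priority, weight, port, parts[3]))
    return sorted(records)


def query_srv(name, server="127.0.0.1"):
    """Ask the given DNS server for the SRV records of name, using dig."""
    result = subprocess.run(
        ["dig", "+short", f"@{server}", name, "SRV"],
        capture_output=True,
        text=True,
        check=True,
        timeout=10,
    )
    return parse_dig_short(result.stdout)


if __name__ == "__main__":
    for record in query_srv("_sip._udp.rapidtech.phones"):
        print(record)
```

If the zone loaded correctly you should get back the three records from the zone file; an empty list usually means the zone did not load or the query went to a different DNS server.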

SIP clients:
SIP Proxy server setting: rapidtech.phones
This setting will allow SIP clients to make calls through the available SIP servers.

[ Did you notice that just by using the domain name, you are directed to an available server according to the rules of the SRV records? In this case, simply by using rapidtech.phones, the SIP client will use sipserver1, sipserver2 or sipserverbackup according to the SRV record rules ]
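
What happens behind that one setting can be sketched roughly like this in Python: the client builds the SRV query name from the domain, gets back an ordered list of servers, and walks down the list until one answers. This is only an illustration of the idea and not how any specific SIP client is implemented; srv_query_name, pick_server and the is_reachable callback are hypothetical names for this example.

```python
def srv_query_name(domain, service="sip", proto="udp"):
    """Build the SRV owner name a client looks up for a bare domain."""
    return f"_{service}._{proto}.{domain.rstrip('.')}."


def pick_server(candidates, is_reachable):
    """Return the first reachable (target, port) from an ordered candidate list.

    candidates must already be ordered by SRV priority and weight.
    is_reachable is a callback supplied by the caller (hypothetical here),
    e.g. something that sends a SIP OPTIONS request and waits for a reply.
    """
    for target, port in candidates:
        if is_reachable(target, port):
            return target, port
    return None  # every server in the list failed


if __name__ == "__main__":
    print(srv_query_name("rapidtech.phones"))

    ordered = [
        ("sipserver2.rapidtech.phones.", 5060),
        ("sipserver1.rapidtech.phones.", 5060),
        ("sipserverbackup.rapidtech.phones.", 5060),
    ]
    # Pretend both primary servers are down to see the failover.
    down = {"sipserver1.rapidtech.phones.", "sipserver2.rapidtech.phones."}
    print(pick_server(ordered, lambda target, port: target not in down))
```

With both primaries marked as down, the loop falls through to sipserverbackup, which is exactly the failover behaviour the priority field is there for.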

Good Luck!
