Monday, October 10, 2011

Configured a server with multiple NICs on different subnets and can't ping the IP address on the second NIC? Here's the solution

Let's say we have the following network problem: the user can't access 10.1.1.10 from the workstation, and a ping to 10.1.1.10 from the workstation fails.


Note: In Linux, usually NIC1 is presented as eth0 and NIC2 is presented as eth1

How to fix the problem associated with accessing 10.1.1.10 from workstation?

You must configure multiple default routes on the server.
You can achieve this in different ways; I prefer ip route and ip rule because they are easy to implement and understand.

Step 1: Create a new policy routing table
# echo "1 TenNetwork" >> /etc/iproute2/rt_tables

Routing tables are declared in rt_tables. Here we declare a table named TenNetwork with ID 1, because we are going to write a set of rules for the 10.x network. You can give it any name you want.
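Running the echo command twice leaves two identical lines in rt_tables. A minimal sketch of a guard against that, assuming a bash shell; declare_rt_table is a hypothetical helper name, not part of iproute2:

```shell
#!/bin/bash
# Hypothetical helper: append "<id> <name>" to an rt_tables-style file
# only if no non-comment line already declares that table name.
declare_rt_table() {
    local file="$1" id="$2" name="$3"
    # awk exits 0 when the name is already present in the second column.
    if awk -v name="$name" '$1 !~ /^#/ && $2 == name { found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s %s\n' "$id" "$name" >> "$file"
}
```

As root, it would be called as: declare_rt_table /etc/iproute2/rt_tables 1 TenNetwork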

Step 2: Define routes in the table
#ip route add 10.1.0.0/16 dev eth1 src 10.1.1.10 table TenNetwork

#ip route add default via 10.1.1.1 dev eth1 table TenNetwork

Here we simply declare that NIC2 (eth1) is attached to the 10.1.0.0/16 subnet and that its IP address is 10.1.1.10. We also define a default route via 10.1.1.1 on the eth1 interface. (This is the second default route. The first one is defined in the 'main' routing table and goes via 192.168.2.1 on the eth0 interface; the OS uses that one by default. You can check the main table by executing #ip route show or #netstat -anr.)

#ip rule show

Since we haven't defined any rule associated with the TenNetwork table yet, the table does not appear in the rules.

Step 3: Define the rules associated with the TenNetwork table

#ip rule add from 10.1.1.10/32 table TenNetwork
#ip rule add to 10.1.1.10/32 table TenNetwork

Here we define rules that say: if a packet is coming from or going to 10.1.1.10, look up the TenNetwork table.
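Steps 2 and 3 together come down to four commands per extra NIC. As a rough sketch, the helper below only prints those commands for a given interface so they can be reviewed before anything is changed; policy_routing_cmds and its argument order are my own invention for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the policy-routing commands
# for one secondary NIC.
# Arguments: interface, its IP address, its subnet (CIDR), gateway, table name.
policy_routing_cmds() {
    local dev="$1" addr="$2" subnet="$3" gw="$4" table="$5"
    # Routes that live in the extra table
    printf 'ip route add %s dev %s src %s table %s\n' "$subnet" "$dev" "$addr" "$table"
    printf 'ip route add default via %s dev %s table %s\n' "$gw" "$dev" "$table"
    # Rules that send traffic from/to the NIC's address to that table
    printf 'ip rule add from %s/32 table %s\n' "$addr" "$table"
    printf 'ip rule add to %s/32 table %s\n' "$addr" "$table"
}
```

With the values from this post the call is: policy_routing_cmds eth1 10.1.1.10 10.1.0.0/16 10.1.1.1 TenNetwork. The output can be piped to sh as root once it looks right.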

#ip rule show
#netstat -anr

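Now ip rule show lists the rules associated with the TenNetwork table as well. Note that netstat -anr prints only the main table; the routes in the new table can be listed with #ip route show table TenNetwork.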
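To check the result without leaving the server, ip route get can be asked which route the kernel would pick for a given source address. A minimal sketch, assuming the iproute2 output contains a "dev <name>" pair; egress_dev is a hypothetical helper name, and WORKSTATION_IP is a placeholder because the workstation's address is not given in this post:

```shell
#!/bin/bash
# Hypothetical helper: print the interface the kernel would use for a
# packet sent from source address $1 to destination $2.
# It parses the word that follows "dev" in the `ip route get` output.
egress_dev() {
    ip route get "$2" from "$1" |
        awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Example (placeholder destination):
# egress_dev 10.1.1.10 "$WORKSTATION_IP"
```

Once the rules from Step 3 are in place, this should print eth1 for traffic sourced from 10.1.1.10.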

You should be able to ping 10.1.1.10 from the workstation now (though I can't guarantee it). If you are lucky, it will work like a charm. If you are luckier, you will learn more about routing and fix the problem yourself. Good luck!

Run Wireshark on the server before and after applying the rules. You can visualize the problem and see how it is resolved. I love Wireshark. I think people can find and fix more than 90% of network problems using Wireshark.


Warning:
1. Restarting the server will lose the configuration
2. Restarting the network will lose the configuration

Let's solve the problem associated with restarting the server. We will write a startup script.

#vi /etc/init.d/TenNetwork
#!/bin/bash
#Copyright (c) 2011 DShah
# All rights reserved
#
#Author: DShah, 2011
# /etc/init.d/TenNetwork
#PLEASE READ /etc/init.d/skeleton to understand various parameters in startup scripts
#
### BEGIN INIT INFO
# Provides: TenNetwork
# Required-Start: $network
# Required-Stop:
# Default-Start: 3 5
# Default-Stop: 0 1 2 6
# Short-Description: Fixes 10 Network routing issue
### END INIT INFO

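# Log file for the routing state (a shell assignment takes no leading "$")
logFile=/var/log/ten-network-log
# Routes for the TenNetwork table
ip route add 10.1.0.0/16 dev eth1 src 10.1.1.10 table TenNetwork
ip route add default via 10.1.1.1 dev eth1 table TenNetwork
# Redirect stdout first, then point stderr at it, so both reach the log
ip route show >> "$logFile" 2>&1
# Rules that send traffic from/to 10.1.1.10 to the TenNetwork table
ip rule add from 10.1.1.10/32 table TenNetwork
ip rule add to 10.1.1.10/32 table TenNetwork
ip rule show >> "$logFile" 2>&1
ip route show >> "$logFile" 2>&1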

Save and close the file

#chmod 700 /etc/init.d/TenNetwork

The 'insserv' command inserts the script into the runlevels specified in the script's INIT INFO header (Default-Start).
# insserv TenNetwork

You can go to /etc/init.d/rc3.d and /etc/init.d/rc5.d and look at the startup order of TenNetwork.

Restart your server and see if it is working as you expected.
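If the startup script is run a second time by hand, ip rule add may either add a duplicate rule or fail with "File exists", depending on the kernel. A minimal sketch of a guard, assuming ip rule show prints rules in the usual "from <addr> lookup <table>" form; add_rule_once is a hypothetical helper, not an iproute2 command:

```shell
#!/bin/bash
# Hypothetical helper: add "ip rule <from|to> <addr>/32 table <table>"
# only when `ip rule show` does not already list a matching rule.
# Arguments: selector (from or to), address without prefix length, table name.
add_rule_once() {
    local selector="$1" addr="$2" table="$3"
    # ip rule show prints a /32 address without the prefix length.
    if ip rule show | grep -Fq "$selector $addr lookup $table"; then
        return 0
    fi
    ip rule add "$selector" "$addr/32" table "$table"
}

# In the startup script the two rule lines would become:
# add_rule_once from 10.1.1.10 TenNetwork
# add_rule_once to 10.1.1.10 TenNetwork
```

For the two route lines, ip route replace can be used in place of ip route add; it creates the route or overwrites an existing one.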


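Updated info on 03/28/13 [Easy fix]:

The multiple-NIC routing issue can also be resolved by modifying sysctl.conf. The setting that matters here is net.ipv4.conf.all.rp_filter, which controls reverse-path (route) verification; the other lines are general settings from my file.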

/etc/sysctl.conf
# Disable response to broadcasts.
# You don't want yourself becoming a Smurf amplifier.
net.ipv4.icmp_echo_ignore_broadcasts = 1
# Disable route verification on all interfaces
net.ipv4.conf.all.rp_filter = 0
# enable ipV6 forwarding
#net.ipv6.conf.all.forwarding = 1
# increase the number of possible inotify(7) watches
fs.inotify.max_user_watches = 65536
# avoid deleting secondary IPs on deleting the primary IP
net.ipv4.conf.default.promote_secondaries = 1
net.ipv4.conf.all.promote_secondaries = 1



#sysctl -p   (to reload the changes done on the sysctl config)
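After reloading, it is worth looking at the per-interface values too: for rp_filter the kernel applies the higher of the "all" value and the interface's own value, so an interface still set to 1 keeps strict verification. A small sketch that lists them; show_rp_filter is a hypothetical helper, and the directory is a parameter only so the function can be tried on a copy:

```shell
#!/bin/bash
# Hypothetical helper: print "<interface> <rp_filter value>" for every
# entry under a conf directory (default: /proc/sys/net/ipv4/conf).
show_rp_filter() {
    local conf_dir="${1:-/proc/sys/net/ipv4/conf}"
    local f
    for f in "$conf_dir"/*/rp_filter; do
        # Skip the unexpanded pattern when nothing matches.
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Example: show_rp_filter
```

If eth1 shows 1 here, setting net.ipv4.conf.eth1.rp_filter = 0 in sysctl.conf as well is one way to relax it.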



Reference:
http://www.policyrouting.org/PolicyRoutingBook/ONLINE/TOC.html

2 comments:

Michael Mittelman said...

I've done this on two Ubuntu VMs on an esxi but every once in a while the NICs both seem to seize up. I can access some other vms on the box, but those two are locked up. Thoughts?

Devendra said...

I haven't had that problem with my SLES VMs and CentOS VMs on an esxi. I would definitely look at the system log file (/var/log/messages) to find out the root cause of the problem. Good Luck!