[This page originally lived at http://www.natecarlson.com/linux/advanced-routing-in-out.php. I am working on migrating all content over to WordPress, which is why this post exists. This document is mostly up-to-date; please leave a comment with any changes!]
One of my tasks at work has been to set up Nagios to monitor all of our critical services. In the process of setting this up, I’ve run into a very interesting issue related to the way Linux does ARP with a “strange” routing table. This article details the problem I ran into and what I did to resolve it with Advanced Routing.
Last modified: 11/21/2005 Nate Carlson
As an aside, this article could also be very useful for people who have two separate ISPs, with a separate IP range from each ISP. The gist of what I end up doing is setting up source-based routing rules to guarantee that traffic goes back out the interface it came in on, which can be necessary to get the expected behavior out of your network.
First of all, I need to explain a bit about our network layout. For each of our public-facing boxes, we have two network interfaces – “front” and “back”. Let’s call the front interface eth0, and the back interface eth1. Front is used to serve actual data to the world, and back is supposed to be used for management purposes. Assume that 10.100.0.0/16 is our front network, and 10.101.0.0/16 is our back network. Our routing table looks something like this:
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.0.0        10.101.0.254    255.255.255.0   UG    0      0        0 eth1
10.101.0.0      0.0.0.0         255.255.0.0     U     0      0        0 eth1
10.100.0.0      0.0.0.0         255.255.0.0     U     0      0        0 eth0
0.0.0.0         10.100.0.1      0.0.0.0         UG    0      0        0 eth0
10.100.0.254 and 10.101.0.254 are the uplink “internal” routers; 10.100.0.1 is the load balancer that these boxes are behind. 10.0.0.0/24 is a management network at our main office, which is where the Nagios server that monitors this box is located. Let’s say that the local IPs on this box are 10.100.0.100 and 10.101.0.100.
On the Nagios server, I am only monitoring the 10.100.0.100 (front) address at this point. I should probably be monitoring both, but hadn’t set that up yet; this is rather fortunate, because if I were monitoring both interfaces, I wouldn’t have seen the strange behavior. What is this behavior, you ask? In times of low load (i.e., no traffic going to or from the box besides the Nagios monitoring), the box would occasionally become unreachable. I could verify this by trying to ping its address on the 10.100.0.0 network: I wasn’t able to reach it. However, the second I pinged its address on the 10.101.0.0 network, the 10.100.0.0 address became reachable again.
I worked with the network guy on and off for a few weeks to try to figure out what was causing this behavior, and we finally figured out that it’s the way the Linux kernel sends ARP requests. What happens is that the ARP entry for 10.101.0.254 times out on the Linux box (because of the lack of traffic), and the box tries to re-resolve it. However, since the address the Nagios server is connecting to is in the 10.100.0.0 network, the reply packet that triggers the lookup has a source address of 10.100.0.100, so the Linux box sends an ARP request out the eth1 interface that looks like:
“Who has 10.101.0.254? Tell 10.100.0.100”
The Cisco router we’re using denies this request, as the IP asking for the ARP entry is not part of the network it’s asking for. In the ARP debug logs on the Cisco, we got an error like:
“IP ARP req filtered src 10.100.0.100
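To make the router's objection concrete, here is a minimal sketch of the kind of check it appears to be applying: is the sender address of the ARP request inside the subnet of the interface it arrived on? The function names `ip_to_int` and `in_subnet` are my own, and the real Cisco logic is certainly more involved; this only illustrates why 10.100.0.100 fails a test against 10.101.0.0/16.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeed (exit 0) if address $1 lies inside network $2 with prefix length $3.
in_subnet() (
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
)

# The ARP request arrives on the router's 10.101.0.0/16 interface,
# but carries the front address as its sender.
if in_subnet 10.100.0.100 10.101.0.0 16; then
    echo "sender 10.100.0.100 is on-link: router answers"
else
    echo "sender 10.100.0.100 is off-link: router filters the request"
fi
```

Swapping in 10.101.0.100 as the sender makes the check pass, which is the request the router would have been happy to answer.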
So, what can we do to get around this problem? I can see three solutions, any of which would work:
1) Add a static ARP entry for the router on the Linux box
2) Set up advanced routing on the Linux box, so traffic will go back out the same interface it came in
3) Figure out a way to get the router to answer the filtered ARP requests, and/or mangle the ARP request (with something like arptables, since iptables itself does not handle ARP) to “appear” to come from the right IP.
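For reference, here is a rough sketch of what options 1 and 3 might look like on the Linux side. None of this is from my actual setup: the MAC address is a made-up placeholder, and using the kernel's `arp_announce` sysctl is only one possible reading of option 3 (it asks the kernel to pick an ARP sender address that fits the target's subnet, rather than mangling packets after the fact). The script only prints the commands unless you set APPLY=1, since running them needs root.

```shell
#!/bin/sh
# Print each command instead of running it, unless APPLY=1 is set.
# Running for real needs root.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

ROUTER_IP=10.101.0.254
ROUTER_MAC=00:11:22:33:44:55   # hypothetical: substitute the router's real MAC
BACK_IF=eth1

show_alternatives() {
    # Option 1: pin the router's neighbor entry so it never expires.
    run ip neigh replace "$ROUTER_IP" lladdr "$ROUTER_MAC" dev "$BACK_IF" nud permanent

    # Option 3, one possible reading: have the kernel choose an ARP sender
    # address that belongs to the subnet of the target.
    run sysctl -w net.ipv4.conf.all.arp_announce=2
}

show_alternatives
```

The static entry has to be updated by hand if the router's hardware ever changes, which is a big part of why it is unappealing.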
I really didn’t like either #1 or #3, so I went with #2. Here’s what the rules I added end up looking like:
## Table 100 – Traffic in/out of eth0, front
$ ip route add table 100 10.0.0.0/24 via 10.100.0.254 dev eth0
$ ip route add table 100 default via 10.100.0.1 dev eth0
## Table 101 – Traffic in/out of eth1, back
$ ip route add table 101 10.0.0.0/24 via 10.101.0.254 dev eth1
## Main table; default routes. Default to using the “back” interface for comms to HQ.
$ ip route add table main 10.0.0.0/24 via 10.101.0.254 dev eth1
$ ip route add table main 172.16.4.0/24 via 10.101.0.254 dev eth1
## Make our traffic follow these rules
$ ip rule add from 10.100.0.0/16 lookup 100
$ ip rule add from 10.101.0.0/16 lookup 101
With these rules in place, everything’s working great: traffic flows in and out of the interfaces as expected. Now, when the box replies to traffic that hit it at 10.100.0.100, the reply goes back out the eth0 interface, and the box ARPs for 10.100.0.254, which works just fine. All by the wonders of source-based routing.
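If you want to convince yourself of what the rules do, the following is a toy model of the lookup, not the kernel's real algorithm: it maps a source address to the table the two `ip rule` entries would select, and then to the interface and gateway that table uses to reach the 10.0.0.0/24 management network. The function names are my own invention. The commented commands at the end are the ones I would expect to confirm the same thing on a live box; 10.0.0.10 is a hypothetical address for the Nagios server.

```shell
#!/bin/sh
# Toy model of the two "ip rule" entries: map a source address to the
# routing table consulted first. Anything else falls through to main.
table_for_source() {
    case "$1" in
        10.100.*) echo 100 ;;
        10.101.*) echo 101 ;;
        *)        echo main ;;
    esac
}

# Toy model of the 10.0.0.0/24 route in each table: which interface and
# gateway a packet to the management network leaves through.
egress_to_hq() {
    case "$(table_for_source "$1")" in
        100) echo "eth0 via 10.100.0.254" ;;
        *)   echo "eth1 via 10.101.0.254" ;;
    esac
}

for src in 10.100.0.100 10.101.0.100; do
    echo "from $src: table $(table_for_source "$src"), $(egress_to_hq "$src")"
done

# On the real box, the kernel's own answer can be compared with the model
# (10.0.0.10 is a hypothetical Nagios address):
#   ip rule show
#   ip route show table 100
#   ip route get 10.0.0.10 from 10.100.0.100
```

Replies sourced from the front address leave through eth0, and everything else headed to the management network still uses the back interface, which is the split the tables above were meant to produce.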
If you have any comments on this document, please feel free to drop me an e-mail at: email@example.com