Thursday, December 19, 2013

In Pursuit of Port Forwarding

On one of my recent projects I needed stable port forwarding through an AWS NAT instance acting as the front end.  At first I reached for the quickest tool in my arsenal, the SSH port forward:
ssh -fgL 80:192.168.1.100:8884 localhost sleep 3600

This creates a simple port forwarding rule which will close on its own in an hour's time.  Depending on the network you're connecting across, it's also nice that the forwarding happens over a secure tunnel.  You can also drop the -f option and stay attached to the SSH session, which keeps the forward working until you exit the connection.  This is great as a quick tool and for temporary access, but I just don't trust the connection to be stable enough to consider it permanent.
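For reference, the foreground variant is the same command without -f and without the sleep; the forward then lasts exactly as long as the interactive session does:

ssh -gL 80:192.168.1.100:8884 localhost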

Enter iptables.  I've used it before, and it definitely has its place in the Linux networking toolbox, but it can be a bit cumbersome, especially if you're not using it regularly.  Furthermore, when you go looking for help on the Internet, everyone has a different example of rules that worked for them.  I eventually ended up with this configuration working for me:
sudo /sbin/iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 80 -j DNAT --to 192.168.1.100:8884
sudo /sbin/iptables -t nat -A POSTROUTING -p tcp -j MASQUERADE

At first I fooled myself by testing from the same instance I was doing the forwarding on, but once I realized that mistake, the above rules worked great for LAN and WAN traffic originating outside the NAT instance. While I was willing to accept that as a fair compromise, since what I really needed to work was working, I just didn't like the feel of it. Not to mention this still left me setting up the iptables-save / iptables-restore mechanism to make sure the rules would survive a reboot. It took a long time of researching and testing to get the iptables configuration working, only for me to decide in the end that I didn't really like it.
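In hindsight the local-testing failure makes sense: locally generated packets never traverse the PREROUTING chain. A rough sketch of the workaround for that, plus the persistence setup mentioned above (hedged; not exactly what I ran):

# Local tests would need a matching rule in the nat OUTPUT chain; as written
# this catches ALL outbound port 80 traffic from the instance, so in practice
# you would scope it with a -d match:
sudo /sbin/iptables -t nat -A OUTPUT -p tcp --dport 80 -j DNAT --to 192.168.1.100:8884

# DNAT only forwards packets onward if IP forwarding is enabled
# (an AWS NAT instance normally has this on already):
sudo sysctl -w net.ipv4.ip_forward=1

# One common Debian-style way to make the rules survive a reboot:
sudo sh -c '/sbin/iptables-save > /etc/iptables.rules'
# ...and restore them at boot, e.g. via /etc/network/interfaces:
#   pre-up iptables-restore < /etc/iptables.rules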

This is when xinetd occurred to me: a service made to accept connections and manage services. In this case I didn't need the service management, just some port forwarding, and there's a configuration option for exactly that! You can place a config like this in your /etc/xinetd.d directory:
service port-forward-80
{
  type = UNLISTED
  socket_type = stream
  protocol = tcp
  wait = no
  user = root
  bind = 0.0.0.0
  port = 80
  only_from = 0.0.0.0
  redirect = 192.168.1.100 8884
}

This way, once you restart xinetd with
sudo /etc/init.d/xinetd restart (or: sudo service xinetd restart)
it will listen on the "port" and forward to the "redirect" IP address and port. The "type = UNLISTED" tells xinetd that this service has no entry in /etc/services, which is why the port is given explicitly and no existing service needs to be tied to this config block. This is simple and easy to read; it's managed by a service (which is likely already running if you've got xinetd installed), so it easily survives reboots; and the configuration is simple to keep in version control and automate with a tool like Puppet. Additionally, this instantly worked for LAN and WAN access as well as for traffic originating on the instance itself, one more reason this approach beat iptables. For those interested in monitoring, please note: while the xinetd process is listening, a port check will return true even when the remote port is not available.
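Because of that, a health check should exercise the whole path rather than just the listener. A rough sketch (nat-host below is just a stand-in for the NAT instance's address):

# A bare TCP connect succeeds as long as xinetd itself is listening:
nc -zv nat-host 80
# ...so test end to end instead, e.g. with an HTTP request that the
# backend on 192.168.1.100:8884 has to answer:
curl -sf -o /dev/null http://nat-host/ && echo backend OK || echo backend DOWN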

There's another project called redir, and while I've not actually tried it, the syntax looks quite simple:
redir --laddr=192.168.1.1 --lport=80 --caddr=192.168.1.100 --cport=8884

But again, having seen how robust xinetd is and its ability to do the job as a managed service, I decided to stick with it rather than one of these more temporary (or less than perfect) solutions.

Wednesday, January 23, 2013

Adventures in Ethernet

It's always rough when you follow directions and something doesn't turn out, even more so when you're familiar with what you're trying to do. I believe I've discovered a bug somewhere in Debian's Ethernet bonding configuration (or the ifenslave-2.6 package) during a new server setup on December 14, 2012. Once I'm finished writing this post I'm off to find where to submit an issue, to see if I can save someone else from similar madness.

I was setting up a new install (Debian 6.0.6 i686) at work and was struggling to get Ethernet bonding going. I've done it in the past, and newer versions of Debian have made it easier than ever to configure, so I was really stumped as to why it wasn't working.
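For context, the bonding stanza in /etc/network/interfaces looked roughly like this (reconstructed from the dmesg output and addresses shown below; the exact option spellings may have differed):

auto bond0
iface bond0 inet static
  address 192.168.0.215
  netmask 255.255.255.0
  gateway 192.168.0.1
  slaves eth0 eth1
  bond-mode active-backup
  bond-miimon 100
  bond-updelay 200
  bond-downdelay 200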

Entries like these from dmesg tell me it's working, but ping clearly illustrates that something is wrong:

Dec 14 14:20:09 ferrari kernel: [    4.595988] bonding: bond0: setting mode to active-backup (1).
Dec 14 14:20:09 ferrari kernel: [    4.596045] bonding: bond0: Setting MII monitoring interval to 100.
Dec 14 14:20:09 ferrari kernel: [    4.596087] bonding: bond0: Setting up delay to 200.
Dec 14 14:20:09 ferrari kernel: [    4.596121] bonding: bond0: Setting down delay to 200.
Dec 14 14:20:09 ferrari kernel: [    4.658073] bonding: bond0: doing slave updates when interface is down.
Dec 14 14:20:09 ferrari kernel: [    4.658079] bonding: bond0: Adding slave eth0.
Dec 14 14:20:09 ferrari kernel: [    4.658082] bonding bond0: master_dev is not up in bond_enslave
Dec 14 14:20:09 ferrari kernel: [    4.676526] tg3 0000:03:06.0: firmware: requesting tigon/tg3_tso.bin
Dec 14 14:20:09 ferrari kernel: [    4.923645] bonding: bond0: enslaving eth0 as a backup interface with a down link.
Dec 14 14:20:09 ferrari kernel: [    4.934060] bonding: bond0: doing slave updates when interface is down.
Dec 14 14:20:09 ferrari kernel: [    4.934066] bonding: bond0: Adding slave eth1.
Dec 14 14:20:09 ferrari kernel: [    4.934069] bonding bond0: master_dev is not up in bond_enslave
Dec 14 14:20:09 ferrari kernel: [    4.956523] tg3 0000:03:08.0: firmware: requesting tigon/tg3_tso.bin
Dec 14 14:20:09 ferrari kernel: [    5.208291] bonding: bond0: enslaving eth1 as a backup interface with a down link.
Dec 14 14:20:09 ferrari kernel: [    5.212315] ADDRCONF(NETDEV_UP): bond0: link is not ready
Dec 14 14:20:11 ferrari kernel: [    7.813163] tg3 0000:03:08.0: eth1: Link is up at 1000 Mbps, full duplex
Dec 14 14:20:11 ferrari kernel: [    7.813167] tg3 0000:03:08.0: eth1: Flow control is on for TX and on for RX
Dec 14 14:20:11 ferrari kernel: [    7.912012] bonding: bond0: link status up for interface eth1, enabling it in 0 ms.
Dec 14 14:20:11 ferrari kernel: [    7.912016] bonding: bond0: link status definitely up for interface eth1.
Dec 14 14:20:11 ferrari kernel: [    7.912020] bonding: bond0: making interface eth1 the new active one.
Dec 14 14:20:11 ferrari kernel: [    7.912044] bonding: bond0: first active interface up!
Dec 14 14:20:11 ferrari kernel: [    7.912172] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
Dec 14 14:20:11 ferrari kernel: [    8.148079] tg3 0000:03:06.0: eth0: Link is up at 1000 Mbps, full duplex
Dec 14 14:20:11 ferrari kernel: [    8.148084] tg3 0000:03:06.0: eth0: Flow control is on for TX and on for RX
Dec 14 14:20:11 ferrari kernel: [    8.212012] bonding: bond0: link status up for interface eth0, enabling it in 200 ms.
Dec 14 14:20:11 ferrari kernel: [    8.412010] bonding: bond0: link status definitely up for interface eth0.

While trying to figure this out I noticed some strange entries in both the routing table and the output of /sbin/ifconfig.

/sbin/route:
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
192.168.0.0     *               255.255.255.0   U     0      0        0 bond0
default         192.168.0.1     0.0.0.0         UG    0      0        0 bond0

As you can see, for some reason eth0 still has an entry in the routing table, and in the ifconfig output below you'll see that eth0, while "RUNNING SLAVE", still has the old IP address it had before it was reassigned to bond0. Seeing the routing entry as a problem, I tried to delete it, with no success.
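For the record, deleting a route like that is normally a one-liner along these lines (shown for illustration; in my case the entry either refused to go or came right back):

sudo /sbin/route del -net 192.168.0.0 netmask 255.255.255.0 dev eth0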

/sbin/ifconfig:
bond0     Link encap:Ethernet  HWaddr 00:0b:db:e2:ce:db
          inet addr:192.168.0.215  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::20b:dbff:fee2:cedb/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:220 errors:0 dropped:0 overruns:0 frame:0
          TX packets:30 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:23522 (22.9 KiB)  TX bytes:2028 (1.9 KiB)

eth0      Link encap:Ethernet  HWaddr 00:0b:db:e2:ce:db
          inet addr:192.168.0.215  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:142 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:15158 (14.8 KiB)  TX bytes:1344 (1.3 KiB)
          Interrupt:28

eth1      Link encap:Ethernet  HWaddr 00:0b:db:e2:ce:db
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:78 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8364 (8.1 KiB)  TX bytes:684 (684.0 B)
          Interrupt:29

At first I thought it was maybe just odd notation for the bonded interfaces, but the more I thought about it the more I felt it was wrong. After some searching I came across http://www.kernel.org/doc/Documentation/networking/bonding.txt and found that "Section 8.1 Adventures in Routing" described exactly the issue I was having. For reasons unknown to me I was not able to delete the route I wanted to delete. In the end what worked was getting my bonded connection set up and then rebooting. Only then did the eth0 entry disappear from the routing table, along with the stale IP address on eth0 as reported by /sbin/ifconfig.

I went through some trials using ifup and ifdown to get rid of the eth0 entry in the routing table, and I even put a short line in the interfaces file:

iface eth0 inet manual

Bringing eth0 down and up removed the errant entries, but restarting networking brought them back, even with eth0 removed from the interfaces file aside from the slaves line.
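In shell terms, the cycle that temporarily cleaned things up versus the restart that undid it looked something like this:

# Cycling the interface cleared the stray route and address...
sudo ifdown eth0 && sudo ifup eth0
# ...but restarting networking brought them straight back:
sudo /etc/init.d/networking restart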

So far my only success has been a reboot, after which the bonding works fine.

Update:

Bug reported: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=698797