Rackspace Cloud Integration Consultant Evan Callicoat has been busy deploying Rackspace Cloud: Private Edition (RCPE) for OpenStack customers, and in this two-part blog series he shares insight into some of the issues he encountered with bridged networks in Linux and how to make routing and bridging play nice in a Linux environment.
Lately I’ve been busy working on OpenStack technologies and deploying RCPE for our customers. Along the way, I’ve run into a variety of fascinating issues with bridged networking in Linux, and I wanted to share how I solved them, because the behavior I encountered was very strange and what documentation exists is best-effort at most. First, I’m going to give a bit of an introduction to the networking situation for those who aren’t intimately familiar with how routing and bridging work in a cloudy Linux environment, so bear with me.
What part of “virtual” do you not understand?
In a cloud infrastructure based on a hypervisor like Xen or KVM, when you spin up an instance a virtual networking interface (vif) is created, which is how the virtual machine’s (VM) network stack communicates with the host’s network stack, as though it were a directly-connected physical device. However, since it’s not a physical device, if you want the VM to communicate with anything else besides the host, you have basically two options: routing and/or bridging.
Routing is pretty straightforward most of the time. If we’re not bridging the vif with another interface, we have to use the routing engine on the host to send traffic where we want it to go. That means putting an IP on the vif for the VM to use as its gateway address, and that IP has to be in the same network range as the VM’s own address so the VM can reach it directly. This is necessary because interfaces are not bridged together by default, despite being on the same machine. Instead, MAC addresses are learned by ARPing on each link, and frames are sent to those MACs carrying destination IPs that may not be on the host, in which case the host can route them out some other interface.
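The routed setup described above can be sketched with iproute2 roughly as follows. This is a minimal sketch under stated assumptions, not the configuration from any particular deployment: the interface name vif1.0, the address 192.168.100.1/24, and the helper names run and setup_routed_vif are all made up for illustration. The script only prints the commands unless APPLY=1 is set (and applying them needs root). Note that the host also needs IP forwarding enabled before it will route the VM’s traffic.

```shell
#!/bin/sh
# Sketch: give a vif a gateway address so the host can route for the VM.
# vif1.0 and 192.168.100.1/24 are hypothetical values, not from the post.

# Print each command by default; execute it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

setup_routed_vif() {
    vif="$1"        # host-side end of the VM's virtual interface
    gateway="$2"    # address/prefix the VM will use as its default gateway

    run ip addr add "$gateway" dev "$vif"    # put the gateway IP on the vif
    run ip link set dev "$vif" up            # bring the vif up
    run sysctl -w net.ipv4.ip_forward=1      # let the host forward packets
}

# Dry run with the hypothetical values
setup_routed_vif vif1.0 192.168.100.1/24
```

Inside the VM, the default route would then point at whatever address was placed on the vif.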
Another approach is to bridge vifs with the interfaces you want them to be able to communicate out of, which in Linux creates another kind of virtual interface — the bridge — and you add other interfaces into it as though they were “ports” of a physical bridge.
Bridging is a Layer 2 activity, and does not actually require an IP address to be on the bridge, or on the (host-side) physical/virtual interface(s), unless the traffic needs the host to make a routing decision (Layer 3) in which case you need an IP on something host-side to point traffic at. Barring that, just bridging the interfaces is sufficient to allow frames to pass through the bridge without the host’s upper networking stack seeing or caring about them, which is often quite handy in virtual environments.
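As a rough illustration of the pure Layer 2 case, the sketch below bridges a physical interface and a vif without putting an IP address on anything. The names br0, eth1 and vif1.0, and the helpers run and setup_bridge, are assumptions for the example. It uses iproute2; with the older bridge-utils package the equivalent steps are brctl addbr and brctl addif. As written it only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: bridge a physical NIC and a vif at Layer 2, with no IP addresses.
# br0, eth1 and vif1.0 are hypothetical names, not from the post.

# Print each command by default; execute it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

setup_bridge() {
    bridge="$1"     # name of the bridge interface to create
    shift           # remaining arguments are the "ports" to add

    run ip link add name "$bridge" type bridge          # create the bridge
    for port in "$@"; do
        run ip link set dev "$port" master "$bridge"    # attach the port
        run ip link set dev "$port" up                  # bring the port up
    done
    run ip link set dev "$bridge" up                    # bring the bridge up
}

# Dry run with the hypothetical names
setup_bridge br0 eth1 vif1.0
```

If the host itself needs to route for that network, an address would go on the bridge rather than on the member ports.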
All this is well and good, and works fine when you have a physical interface for every logical network you want to communicate with, but not only does that waste physical ports, it’s very inflexible. That’s where Virtual LANs (VLANs) come into play.
A VLAN is a logically isolated Layer 2 network, which is implemented by inserting numeric VLAN tags into each frame entering/exiting the logical network. In Linux, this is handled by the 8021q kernel module (named after IEEE 802.1Q, the standard that specifies VLAN tagging). From user space, you create a sub-interface (subif) of a main interface (mainif) for each VLAN number that you want the networking stack to tag/untag frames for. In most distros, it looks something like this:
    # modprobe 8021q
    # vconfig add eth0 100        [1]
    Added VLAN with VID == 100 to IF -:eth0:-
    # ip li s dev eth0.100 up     [2]
This creates a tagged subif named eth0.100. Both eth0 and eth0.100 are treated as independent interfaces in almost every way, except when you mix tagging with bridging in a particular way, and that’s what we’re going to dive into now.
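For readers on systems without vconfig, the same subif can be created with iproute2 alone. The sketch below is one way to do it, not the method used in this post. The helper names run and add_vlan_subif are invented for the example, and the range check reflects the 802.1Q rule that IDs 0 and 4095 are reserved. It only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: create a tagged sub-interface with iproute2 instead of vconfig.
# The helper names here are hypothetical; eth0 and 100 match the post.

# Print each command by default; execute it only when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

add_vlan_subif() {
    mainif="$1"     # parent interface, e.g. eth0
    vid="$2"        # VLAN ID to tag/untag frames for

    # Reject anything that is not a plain decimal number.
    case "$vid" in
        ''|*[!0-9]*) echo "invalid VLAN ID: $vid" >&2; return 1 ;;
    esac
    # 802.1Q reserves IDs 0 and 4095, so accept 1-4094 only.
    if [ "$vid" -lt 1 ] || [ "$vid" -gt 4094 ]; then
        echo "invalid VLAN ID: $vid" >&2
        return 1
    fi

    run modprobe 8021q
    run ip link add link "$mainif" name "${mainif}.${vid}" type vlan id "$vid"
    run ip link set dev "${mainif}.${vid}" up
}

# Dry run for the interface used in this post
add_vlan_subif eth0 100
```

Once the subif exists, ip -d link show dev eth0.100 should report its VLAN details.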
Tune back in tomorrow for the second part of Evan’s post, which examines bridging VLANs, brouting and more.
[1] In Debian-based distros the vconfig tool is provided by the vlan package, and the 8021q module almost always ships with the stock Linux kernel regardless of distro.
[2] Subifs and bridges are created in the “down” state, so it kind of helps to bring them up before using them, or so I hear.