Which technology is required when switched networks are designed to include redundant links?

The short answer is the Spanning Tree Protocol (STP). To see why it’s needed, consider what happens without it:

While it makes very good sense to include redundant physical links in a network, connecting switches in loops without taking the appropriate measures will wreak havoc on the network. A switch floods broadcast frames out of all of its ports except the one the frame arrived on, so in a looped topology those frames circulate endlessly. The result is a broadcast storm, in which broadcast frames are flooded through every switch until all available bandwidth is consumed and every network device receives more inbound frames than it can process.

Originally, this challenge arose with bridges. Though switches have replaced bridges in most organizational networks, the solution is the same: Radia Perlman’s Spanning Tree Algorithm (STA), devised to eliminate bridge loops, works just as well for switch-based networks:

STA allows redundant physical links while logically blocking the paths that would create loops, which lets a network planner design and install redundancy without loops forming. The basic steps in setting up STA are:


  1. Plan the network design and installation: Carefully document how the network is going to be designed and installed. Specifically note each link between switches.

  2. Enable STA on the switches: Many vendor switches have STA turned on by default.

  3. Select the root switch: The root switch sits at the logical center of the network. All other switches recognize the root switch, and each selects a single path back to it.

  4. Confirm convergence and operation: Each of the switches will now identify forwarding and blocking ports appropriately. The root switch sends out a BPDU (Bridge Protocol Data Unit) message every two seconds to inform all of the connected switches that everything is still okay. When a topology change takes place, such as a link failure, the affected switches recalculate their new best path back to the root switch.

When you are designing a network, the root switch is the central connecting point for all of the connected switches. The administrator may designate which switch is to be the root or may leave the decision up to the switches.

To designate the root switch, give it the lowest Bridge Priority, and therefore the lowest Bridge ID, in the network. The root switch communicates with the other switches by sending its Bridge ID in a multicast IEEE 802.1D BPDU. By comparing its own Bridge ID with the one in each arriving BPDU, a switch can identify the root switch and the port it uses to reach it.

The Bridge ID is a combination of the Bridge Priority and the switch’s MAC address. At times, such as when new switches are installed with default settings, more than one switch may share the lowest Bridge Priority in the network. In that case, the lowest MAC address among those switches breaks the tie, and the switch with the lowest resulting Bridge ID becomes the root. The election may take up to 30 seconds to settle.
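
The comparison is easy to mirror in a few lines of Python. The sketch below is illustrative only (the hostnames, priorities, and MAC addresses are made-up examples), not an implementation of 802.1D, but it picks the root the same way the switches do: lowest Bridge Priority first, lowest MAC address as the tie-breaker.

```python
# Illustrative sketch: root-switch election by lowest Bridge ID.
# A Bridge ID sorts by (priority, MAC address), so ordinary tuple
# comparison mirrors the tie-breaking rule described above.

switches = [
    # (hostname, bridge priority, MAC address) -- example values only
    ("access-1", 32768, "00:1a:2b:3c:4d:5e"),
    ("access-2", 32768, "00:1a:2b:3c:4d:10"),
    ("core-1",    4096, "00:1a:2b:3c:4d:99"),  # priority deliberately lowered
]

def bridge_id(switch):
    name, priority, mac = switch
    # Convert the MAC to an integer so ties on priority are broken
    # by the numerically lowest MAC address.
    return (priority, int(mac.replace(":", ""), 16))

root = min(switches, key=bridge_id)
print("Root switch:", root[0])  # -> core-1, because 4096 < 32768
```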

Leaving the root selection to the switches is less desirable. It runs the risk that a switch in a far corner of the network becomes the root, and if that happens, traffic takes suboptimal paths and network performance suffers.

What is network redundancy?

Network redundancy is the process of providing multiple paths for traffic so that data can keep flowing even in the event of a failure. Put simply: more redundancy equals more reliability. It also helps with managing distributed sites. The idea is that if one device fails, another can automatically take over. By adding a little complexity, we reduce the probability that a failure will take the network down.

But complexity is also an enemy of reliability. The more complex something is, the harder it is to understand, the greater the chance of human error, and the greater the chance of a software bug introducing a new failure mode. So, when designing a network, it’s important to balance redundancy against complexity.

What are the different types of network redundancy?

There are two main forms that network redundancy can take. The first is fault tolerance, which uses full hardware redundancy: at least one complete duplicate of the system hardware runs side by side with the primary system. Should one system fail, the other takes over immediately, with no loss of service.

The second type of network redundancy is high availability. In this model, rather than duplicating all of the physical hardware, a cluster of servers runs together. The servers monitor one another and have failover capabilities, so if there is a problem on one server, a backup can take over.

If you’re curious about the benefits and drawbacks of each, consider this: fault-tolerant systems deliver next to zero downtime but are expensive to implement, while high-availability infrastructure is less expensive to implement but carries a risk of minor service impacts during an outage.
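
A rough back-of-the-envelope calculation shows why either form of redundancy pays off. Assuming the two members of a redundant pair fail independently (a simplification, and the availability figure below is just an example), the pair is unavailable only when both members are down at the same time:

```python
# Back-of-the-envelope availability math for a redundant pair.
# Assumes the two components fail independently -- a simplification.

def redundant_pair_availability(a: float) -> float:
    """Availability of two parallel components, each with availability a."""
    return 1 - (1 - a) ** 2

single = 0.99            # one device at 99%: about 87.6 hours of downtime per year
pair = redundant_pair_availability(single)

hours_per_year = 24 * 365
print(f"Single device:  {single:.4%} -> {(1 - single) * hours_per_year:.1f} h/yr down")
print(f"Redundant pair: {pair:.4%} -> {(1 - pair) * hours_per_year:.1f} h/yr down")
```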

Designing for redundancy

There are useful network redundancy protocols at many different OSI layers. The first thing to think about is what happens at each layer if you lose any individual link or piece of equipment.

If you’re new to this, I suggest creating detailed Layer 1, Layer 2, and Layer 3 network diagrams showing every box and every link. Put your pencil or your mouse on each line or box in succession and ask these questions for each element (a small scripted version of this exercise follows the list):

  • What happens at Layer 1 if this box or link goes down? Do you still have connectivity?
  • What happens at Layer 2? Do you still have continuity of all VLANs throughout the network?
  • What happens at Layer 3? Do you still have a default gateway on each segment?
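
Here is what a scripted version of the Layer 1 question might look like: model the topology as a graph, remove each link in turn, and check whether every device can still reach every other. The device names and links are hypothetical, and the traversal is an ordinary breadth-first search.

```python
# Sketch: does any single link failure split the network?
# Device names and links are hypothetical examples.
from collections import deque

links = [
    ("core-1", "core-2"),
    ("core-1", "access-1"), ("core-2", "access-1"),
    ("core-1", "access-2"), ("core-2", "access-2"),
]
devices = {d for link in links for d in link}

def connected(edges, nodes):
    """True if every node in `nodes` can reach every other node over `edges`."""
    graph = {n: set() for n in nodes}
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, queue = set(), deque([next(iter(nodes))])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph[node] - seen)
    return seen == nodes

for failed in links:
    remaining = [link for link in links if link != failed]
    status = "still connected" if connected(remaining, devices) else "SPLIT"
    print(f"Link {failed[0]} <-> {failed[1]} down: {status}")
```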

There are a lot of different redundancy protocols around, not all of which are equally robust. You’ll need to choose appropriate protocols for your equipment and network, but here are the ones I generally use.

At Layer 1 and 2, I like to use Link Aggregation Control Protocol (LACP) for link redundancy. This includes multi-chassis LACP variations like Cisco’s Virtual Port Channel (VPC) technology, available on all Nexus switches. Note, however, that most multi-chassis link aggregation protocols have serious limitations. HP’s Distributed Trunking, for example, is best used for providing redundant connectivity for servers, and can have strange behavior when interconnecting pairs of switches.
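
LACP itself is negotiated between the devices, but the load-sharing idea behind a link bundle is easy to picture: each frame’s addresses are hashed and the hash selects a member link, so a given flow stays on one link and the surviving links absorb the traffic when a member fails. Here is a rough sketch of that idea; the hash inputs, link names, and flows are examples, not any vendor’s actual algorithm.

```python
# Rough sketch of hash-based load sharing across an aggregated link bundle.
# Real switches hash their own choice of fields (MACs, IPs, ports); this only shows the idea.
import zlib

member_links = ["eth1", "eth2", "eth3", "eth4"]

def pick_link(src, dst, links):
    """Hash the address pair and map it onto one of the active member links."""
    key = f"{src}-{dst}".encode()
    return links[zlib.crc32(key) % len(links)]

flows = [("host-a", "server-1"), ("host-b", "server-1"), ("host-c", "server-2")]

for src, dst in flows:
    print(src, "->", dst, "uses", pick_link(src, dst, member_links))

# If eth4 fails, the same function simply redistributes flows over the survivors:
surviving = member_links[:-1]
for src, dst in flows:
    print(src, "->", dst, "now uses", pick_link(src, dst, surviving))
```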

The other important Layer 2 protocol to use is Spanning Tree Protocol (STP). I prefer the modern, fast-converging STP variants, RSTP and MSTP. (I’ve written about Spanning Tree Protocol before.)

At Layer 3, your redundancy mechanisms need to keep the routing function available when a device fails. The choice of protocol here depends on many factors. If the devices on the network segment are mostly end devices, such as servers or workstations, then I prefer a protocol that lets the default gateway function jump to a backup device if the primary fails. The best choices for this are Cisco’s proprietary Hot Standby Router Protocol (HSRP) and the open-standard Virtual Router Redundancy Protocol (VRRP).
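
To illustrate the first-hop redundancy idea (this is not actual HSRP or VRRP packet logic, and the router names, priorities, and addresses are made up): the routers share a virtual gateway address, the highest-priority live router answers for it, and a backup takes over when the active router stops responding.

```python
# Sketch of first-hop redundancy: one virtual gateway IP, several routers,
# and the highest-priority router that is still alive "owns" the address.

VIRTUAL_GATEWAY = "10.1.1.1"

routers = [
    {"name": "rtr-a", "priority": 110, "alive": True},
    {"name": "rtr-b", "priority": 100, "alive": True},
]

def active_router(candidates):
    """Pick the live router with the highest priority (ties broken by name here;
    VRRP, for comparison, breaks ties with the highest IP address)."""
    live = [r for r in candidates if r["alive"]]
    return max(live, key=lambda r: (r["priority"], r["name"])) if live else None

print(VIRTUAL_GATEWAY, "answered by", active_router(routers)["name"])  # rtr-a

routers[0]["alive"] = False  # rtr-a fails, e.g. its hello timers expire
print(VIRTUAL_GATEWAY, "answered by", active_router(routers)["name"])  # rtr-b takes over
```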

If the segment is being used primarily to interconnect network devices, then it might make more sense to use a dynamic routing protocol such as OSPF, EIGRP, or BGP. I don’t advise using the older RIP protocol because it has serious limitations in both convergence time and network size. However, I strongly advise against running both types of protocols for the same job on the same segment, such as deploying HSRP alongside OSPF there; doing so can lead to network instability, particularly where multicast traffic is involved.
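
For the interconnect case, the key property of a link-state protocol such as OSPF is that every router recomputes its shortest paths when the topology changes. The toy sketch below (made-up router names and costs, and nothing protocol-specific such as LSAs or areas) shows that recomputation step using Dijkstra’s algorithm.

```python
# Toy shortest-path recomputation, the way a link-state protocol reacts to a failure.
# Router names and link costs are hypothetical.
import heapq

def shortest_paths(links, source):
    """Dijkstra over an undirected weighted graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

links = {("r1", "r2"): 10, ("r2", "r3"): 10, ("r1", "r3"): 50}

print("Before failure:", shortest_paths(links, "r1"))  # r3 reached via r2 at cost 20
del links[("r1", "r2")]                                # the r1-r2 link goes down
print("After failure: ", shortest_paths(links, "r1"))  # r3 now reached directly at cost 50
```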

For physical box redundancy, the exact technology will dictate the best choice. For firewalls, which need to maintain massive tables of state information for every connection, there are no viable open standards. In these cases, you really need to use the vendor’s proprietary hardware redundancy mechanisms.

Similarly, stackable switches are generally very simple to deploy, usually requiring almost no special configuration to achieve switch-level redundancy. The only thing to bear in mind is that you need to be careful about how you distribute connections among the stack members.

For switch (and router) redundancy, it makes sense to combine a Layer 1/2 protocol with a Layer 3 protocol from those discussed above. Be careful, though: make sure the same device is the “master” at every layer. For example, at any given moment your Layer 3 default gateway should be the same physical device as the spanning tree root bridge.
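
A quick way to keep yourself honest about that alignment is to compare, per VLAN, which device is the spanning tree root and which is the active gateway. In the sketch below the two tables are typed in by hand as made-up examples; in practice you would pull this data from the switches themselves.

```python
# Sketch: flag VLANs where the spanning tree root and the active gateway
# are not the same physical device. Both tables are made-up examples.

stp_root = {10: "core-a", 20: "core-a", 30: "core-b"}
hsrp_active = {10: "core-a", 20: "core-b", 30: "core-b"}

for vlan in sorted(stp_root):
    root, gateway = stp_root[vlan], hsrp_active.get(vlan)
    if root == gateway:
        print(f"VLAN {vlan}: root and gateway agree on {root}")
    else:
        print(f"VLAN {vlan}: STP root is {root} but active gateway is {gateway} -- realign one of them")
```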

In all cases, make sure you thoroughly understand the implementation guidelines for each technology you’ll be using and follow them carefully. If you don’t understand a technology, trying it out on a production network can be career-limiting. And for goodness’ sake, keep network backups in place for emergencies!

DDoS attacks and network redundancy

Distributed Denial of Service (DDoS) attacks are cyberattacks no network admin wants to deal with. The goal of such an attack is to render a network or service inoperable. Luckily, network redundancy can help mitigate the impact of a DDoS attack, because it gives you alternate paths and providers for routing traffic around the attack.

By using multiple ISPs, data centers can reroute network services in the event of an attempted DDoS attack. That’s why it is crucial to have redundant networks with flexible internet access. Businesses can’t operate if the network is down—continuous internet connections and functioning technology are essential these days. If your network lacks redundancy, especially a redundant internet connection, the failure of a single device could result in hours of downtime for the entire network.

Network redundancy and infrastructure

The terms ‘fail’ and ‘failure’ have come up a lot so far, because when planning network redundancy it’s important to think about all of the ways a network can fail. Beyond the software issues, think about the physical and environmental factors that can impact the performance of a device.

The enemies of device uptime, such as heat, water, and power failures, are all things to think about when planning network redundancy. Ensure that you’re using redundant electrical supplies, including UPSs and possibly even backup generators; have redundant cooling systems in place; and have redundant environmental sensors to warn you if the physical environment is becoming less than ideal for network devices.

After all, logical network redundancy requires the backup devices and backup data paths to be physically operational!

Tips for achieving minimal complexity in network redundancy

Here’s a list of things to keep in mind for implementing network redundancy while minimizing complexity. Consider these some network design best practices.

Identical systems with identical connections

I like to provide redundancy by implementing exact duplicate systems in key spots in the network. For example, a core switch will be two identical switches. When I say identical, I mean they should be the same model, running the same software, and they should have the same connections, as much as possible. The easiest way to do this with switches is to use stackable switches. Then there’s really nothing to do—connect up the stacking cable and you have redundancy out of the box.

Simple redundancy protocols

There are a lot of ways to implement network redundancy. The most reliable ones involve the simplest configuration on the fewest devices. For example, if I need a highly available firewall, I’ll implement a pair of devices. And I’ll always use the vendor’s fail-over mechanisms. Then I don’t need to worry about making the firewall take part in any routing protocols. Unless there’s a compelling reason for the firewall to run a routing protocol, it only introduces unnecessary complexity. Always use the simplest configuration that meets the requirements!

Keep everything parallel

One thing that often trips people up is how to connect successive layers of redundant devices. The trick is to keep it all parallel. Create an A path and a B path with a cross-over connection at each layer. The idea is that any one device can fail completely without disrupting the end-to-end path.

For example, suppose I have a pair of access switches, a pair of core switches, and a pair of firewalls. I’d connect access switch A to core switch A, which also connects to firewall A. Similarly, access switch B connects to core switch B, which connects to firewall B. I’d also connect the two access switches to one another and the two core switches to one another.
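
That wiring plan is easy to sanity-check if you model it as a list of links and simulate the failure of each device in turn, using the same breadth-first-search idea as the earlier link check. The names below mirror the example, and the end-to-end test asks whether an access switch can still reach at least one firewall of the pair.

```python
# Sketch: verify the A/B design above survives the loss of any single device.
# Device names and links mirror the example in the text; the check is a simple BFS.
from collections import deque

links = [
    ("access-a", "core-a"), ("core-a", "fw-a"),      # the A path
    ("access-b", "core-b"), ("core-b", "fw-b"),      # the B path
    ("access-a", "access-b"), ("core-a", "core-b"),  # the cross-over connections
]
devices = {d for link in links for d in link}

def reachable_from(start, failed, edges):
    """Set of devices reachable from `start` once device `failed` is down."""
    graph = {d: set() for d in devices if d != failed}
    for a, b in edges:
        if failed not in (a, b):
            graph[a].add(b)
            graph[b].add(a)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph[node] - seen)
    return seen

for failed in sorted(devices):
    src = "access-b" if failed == "access-a" else "access-a"
    seen = reachable_from(src, failed, links)
    # The end-to-end path is intact if at least one firewall of the pair is reachable.
    intact = any(fw in seen for fw in ("fw-a", "fw-b") if fw != failed)
    print(f"{failed} down: access-to-firewall path {'OK' if intact else 'BROKEN'}")
```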

In this example, you may be tempted to further connect access switch A to core switch B and access switch B to core switch A. It’s certainly a common configuration, but as soon as you do this, you need to know what you’re doing in terms of link aggregation and spanning tree. That could add considerable complexity if you’re new to network design.

Never do more than you need to

As the previous example suggests, it’s easy to go further in implementing redundancy than is absolutely required. In many cases the extra redundancy is warranted and could provide additional functionality. But carefully consider every piece of equipment, every link, and every protocol. For each one, ask whether it’s providing enough additional functionality to warrant the additional complexity.

Finally, it’s extremely useful to follow a standard model when implementing your networks. If you have multiple data centers, make them as nearly identical as possible in terms of topology.

Similarly, make your access switches as nearly identical as possible. Use common VLAN assignments everywhere, and use a common IP addressing scheme that works everywhere. Make the default gateway on every segment follow a common rule, such as the first or the last IP address. If you use redundancy protocols like HSRP, use them everywhere, and configure them the same way everywhere.
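
If you adopt the ‘first or last address’ rule, the gateway for any segment can be derived (and audited) mechanically. Here is a small sketch using Python’s standard ipaddress module, with example subnets:

```python
# Sketch: derive the default gateway from a common rule -- here, the first
# (or last) usable address of each subnet. The subnets are example values.
import ipaddress

segments = ["10.1.10.0/24", "10.1.20.0/24", "192.168.5.0/26"]

def gateway(subnet: str, rule: str = "first") -> str:
    net = ipaddress.ip_network(subnet)
    if rule == "first":
        return str(net.network_address + 1)
    return str(net.broadcast_address - 1)  # the "last" usable address

for seg in segments:
    print(f"{seg}: gateway {gateway(seg)} (or {gateway(seg, 'last')} with the 'last' rule)")
```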

All this similarity helps limit the possibility of human error. Maybe the new engineer has never looked at this particular device before. But if it’s exactly the same as every other device performing a similar function, then it’s much less likely that he or she will miss some obscure bit of protocol magic that was implemented on this device and only this device.

Maximum availability with minimum complexity

The goal is maximum availability with minimum complexity, so it’s vitally important to keep the configuration simple. Don’t implement multiple redundancy mechanisms that try to accomplish the same logical function, or troubleshooting the network will become very difficult.

When it comes to routing protocols in particular, think about whether you can get away with a static route pointing to an HSRP default gateway. Routing protocols have to distribute a lot of information among a lot of devices, and that always takes time. HSRP and VRRP are both faster and simpler so you should use them if you can.

If you have stacked switches, think about what happens to upstream and downstream connections if one stack member fails. Where possible, you should distribute these links among the various stack members.

Above all, remember that building a real-world network is not a test where you have to demonstrate your deep understanding of every redundancy mechanism. Points won’t be deducted for using static routes and trivial default configurations. Keep it simple.