CASE STUDY: HSRP INSTABILITY


TOPICS

CASE STUDY

SCENARIO DETAILS

REASON 1 PHYSICAL LAYER ISSUE

REASON 2 SPANNING TREE ISSUE

REASON 3 HIGH CPU UTILIZATION ISSUE

REASON 4 EXCESSIVE TRAFFIC FROM A PARTICULAR VLAN

REASON 5 HSRP MULTICAST ADDRESS IS BLOCKED IN ACCESS-LIST

REFERENCE LINK


CASE STUDY

HSRP INSTABILITY IN A SWITCHING ENVIRONMENT


 

SCENARIO DETAILS

Member routers in an HSRP group are flapping from Active to Standby and back to Active.

LOGS

Jan 9 08:00:42.623: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Standby -> Active

Jan 9 08:00:56.011: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Active -> Speak

Jan 9 08:01:03.011: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Speak -> Standby

Jan 9 08:01:29.427: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Standby -> Active

Jan 9 08:01:36.808: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Active -> Speak

Jan 9 08:01:43.808: %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Speak -> Standby

Note

From the output above, it is clear that the HSRP state of the router is continuously cycling from Active to Speak to Standby and back to Active.
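The flapping pattern in these logs can also be counted by a script instead of by eye. The following is a minimal sketch, not a vendor tool: the function names count_transitions and active_entries are made up for this example, and the regular expression assumes the message format shown in the logs above.

```python
import re
from collections import Counter

# Matches the state-change part of an HSRP syslog message, for example:
# "... %STANDBY-6-STATECHANGE: Standby: 49: Vlan149 state Standby -> Active"
STATE_CHANGE = re.compile(
    r"%STANDBY-6-STATECHANGE:.*?(?P<intf>\S+) state (?P<old>\w+) -> (?P<new>\w+)"
)

# Log lines copied from the scenario above (wrapped the way a terminal may wrap them).
SAMPLE_LOG = """
Jan 9 08:00:42.623: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Standby -> Active
Jan 9 08:00:56.011: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Active -> Speak
Jan 9 08:01:03.011: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Speak -> Standby
Jan 9 08:01:29.427: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Standby -> Active
Jan 9 08:01:36.808: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Active -> Speak
Jan 9 08:01:43.808: %STANDBY-6-STATECHANGE: Standby: 49:
Vlan149 state Speak -> Standby
"""


def count_transitions(log_text):
    """Return a Counter keyed by (interface, old_state, new_state)."""
    # A message may be wrapped onto two lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return Counter(
        (m["intf"], m["old"], m["new"]) for m in STATE_CHANGE.finditer(flat)
    )


def active_entries(log_text, intf):
    """Count how many times the given interface entered the Active state."""
    return sum(
        n
        for (name, _old, new), n in count_transitions(log_text).items()
        if name == intf and new == "Active"
    )


if __name__ == "__main__":
    print(active_entries(SAMPLE_LOG, "Vlan149"))  # prints 2
```

Two entries into Active within about a minute, as seen here, is a strong hint that the group is unstable rather than reacting to a single failover.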


 

REASON 1 PHYSICAL LAYER ISSUE

In this case, Layer 2 connectivity between the HSRP member routers is down, so the hello packets that they exchange are lost.

Without hellos, the HSRP members cannot communicate and cannot compare their configuration.

As a result, each router considers itself alone in the HSRP group and elects itself Active, which produces an Active/Active condition.

LOGS

Router_1#show standby

Vlan10 - Group 10

Local state is Active, priority 110, may preempt

Hellotime 3 holdtime 10

Hot standby IP address is 192.168.10.100 configured

Active router is local

Standby router is unknown expired

Standby virtual mac address is 0000.0c07.ac0a

12 state changes, last state change 00:00:48
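The two lines that give the problem away in this output are "Local state is Active" and "Standby router is unknown". A minimal sketch of that check follows; the function name looks_isolated is hypothetical, and the match strings assume the output format shown above, which can differ between software versions.

```python
# Output copied from the "show standby" example above.
SAMPLE_OUTPUT = """\
Vlan10 - Group 10
Local state is Active, priority 110, may preempt
Hellotime 3 holdtime 10
Hot standby IP address is 192.168.10.100 configured
Active router is local
Standby router is unknown expired
Standby virtual mac address is 0000.0c07.ac0a
12 state changes, last state change 00:00:48
"""


def looks_isolated(show_standby_text):
    """Return True if the router is Active and reports no known standby peer."""
    lines = [line.strip().lower() for line in show_standby_text.splitlines()]
    is_active = any(line.startswith("local state is active") for line in lines)
    no_standby = any(line.startswith("standby router is unknown") for line in lines)
    return is_active and no_standby


if __name__ == "__main__":
    print(looks_isolated(SAMPLE_OUTPUT))  # prints True
```

If the same check returns True on both HSRP members at the same time, the routers are not hearing each other's hellos, which points to the physical or Layer 2 path between them.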

 

Solution 1

Check the physical port status. To confirm, use the command: show interface status

 

Solution 2

Check the cable connectivity between the HSRP member routers.

Solution 3

Check whether traffic is passing through the connected physical ports. To confirm, use the command: show interface summary

 

Solution 4

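Check for a unidirectional link. A unidirectional link occurs whenever traffic that a local device transmits over a link is received by the neighbor, but traffic that the neighbor transmits is not received by the local device. UniDirectional Link Detection (UDLD) can detect this condition.

On CatOS, enable and verify UDLD as follows: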

Switch_1> (enable) set udld enable

UDLD enabled globally

Console> (enable) set udld aggressive-mode enable 1/1-2

Aggressive UDLD enabled on ports 1/1-2.

Console> (enable) show udld

UDLD    : enabled  

Message Interval : 15 seconds

Console> (enable) show udld port 1

UDLD : enabled

On a router, if only one side of a link can see its neighbor device, replace the cable between the devices and check for faulty interfaces.

 

Solution 5

Check for mismatched VTP modes on the HSRP group members.


REASON 2 SPANNING TREE ISSUE

The HSRP process uses the multicast address 224.0.0.2 to exchange hello packets with the other HSRP routers. If connectivity is lost, or if an HSRP router with a higher priority is added to the network, the HSRP states can start flapping as shown above. On certain router platforms, when a higher-priority router is added to the network, the HSRP state of the lower-priority router changes from Active to Speak, and a link-state change occurs. The switch port detects this link-state change and a spanning tree transition takes place. The port takes approximately 30 seconds to pass through the Listening and Learning states before it reaches Forwarding. This period exceeds the default HSRP holdtime, so the lower-priority router, after it reaches the Standby state, becomes Active because it received no hello packets from the Active router.

Since the routers do not see each other's HSRP hello packets, they both become Active. When the switch ports transition to the Learning state, the switch can see the same virtual MAC address on two different ports.

 

Solution 1

Configure the switch with the set spantree portfast enable command, which allows the port to bypass the spanning tree states and go straight into the Forwarding state.

If the router is configured to bridge packets on this interface/port, then this workaround cannot be used, because immediate forwarding on such a link could make the network prone to a forwarding loop outage.

Note: This restriction also applies to switch ports that are connected to other switches or bridges.


 

Solution 2

Change the HSRP timers so that the spanning tree forward delay (default of 15 seconds) is less than half the HSRP holdtime (default of 10 seconds).

We suggest an HSRP holdtime of 40 seconds.

Note: Increasing the HSRP holdtime makes HSRP slower in detecting that the Active router is down and making the Standby router active.
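The timer rule in this solution is simple arithmetic and can be checked before a holdtime is changed. The sketch below only encodes the rule as stated here (forward delay less than half the holdtime) with the default values quoted above; the function names are made up for this example.

```python
STP_FORWARD_DELAY_DEFAULT = 15  # seconds, default quoted in the text above
HSRP_HOLDTIME_DEFAULT = 10      # seconds, default quoted in the text above


def holdtime_satisfies_rule(holdtime_s, forward_delay_s=STP_FORWARD_DELAY_DEFAULT):
    """Apply the rule above: forward delay must be less than half the holdtime."""
    return forward_delay_s < holdtime_s / 2


def min_holdtime(forward_delay_s=STP_FORWARD_DELAY_DEFAULT):
    """Smallest whole-second holdtime that satisfies the rule."""
    return 2 * forward_delay_s + 1


if __name__ == "__main__":
    print(holdtime_satisfies_rule(HSRP_HOLDTIME_DEFAULT))  # prints False
    print(holdtime_satisfies_rule(40))                     # prints True
    print(min_holdtime())                                  # prints 31
```

With the defaults, 15 seconds is not less than 5 seconds, so the rule fails. The suggested holdtime of 40 seconds passes with some margin over the 31-second minimum.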

Solution 3

Configure the standby use-bia command, which forces the HSRP active router to use the burned-in address.

This accomplishes two things. Since HSRP no longer needs to change (or add) a unicast MAC address in the MAC address filter list, the Ethernet interface does not get reset. It also keeps the switch from learning the same address on two different ports. Refer to "What is the standby use-bia Command and How Does It Work?" for more information.

Unless HSRP is configured on a Token Ring interface, only use the standby use-bia command in special circumstances. This command tells the router to use its BIA instead of the virtual HSRP MAC address for the HSRP group. On a Token Ring network, if source-route bridging (SRB) is in use, the standby use-bia command allows the new active router to update the host Routing Information Field (RIF) cache with a gratuitous ARP. But, not all of the host implementations handle the gratuitous ARP correctly. Another caveat for the standby use-bia command involves proxy ARP. A standby router cannot cover for the lost proxy ARP database of the failed active router.


 

REASON 3 HIGH CPU UTILIZATION ISSUE

If the error message is due to high CPU utilization, put a sniffer on the network and trace the system that causes the high CPU utilization.


 

REASON 4 EXCESSIVE TRAFFIC FROM A PARTICULAR VLAN

You can tune or increase the SPD (Selective Packet Discard) and hold queue sizes to overcome the input queue drop problem.

In order to increase the Selective Packet Discard (SPD) size, go to the configuration mode and execute these commands on the Cat6500 switches:

(config)# ip spd queue max-threshold 600

!--- Hidden Command

(config)# ip spd queue min-threshold 500

!--- Hidden Command


 

REASON 5 HSRP MULTICAST ADDRESS IS BLOCKED IN ACCESS-LIST

Verify that no access list filters the multicast address that is used to send traffic to all of the routers on a subnet (224.0.0.2). Also, verify that UDP traffic destined for the HSRP port 1985 is not filtered. HSRP uses this address and port to send hello packets between peers. Issue the show access-lists command as a quick reference to the access lists that are configured on the router. Here is an example:

Router_1#show access-lists

Standard IP access list 77

deny   167.19.0.0, wildcard bits 0.0.255.255

permit any

Extended IP access list 144

deny pim 238.0.10.0 0.0.0.255 any

permit ip any any (58 matches)
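To reason about whether a given access list would drop HSRP hellos, it can help to model the first-match logic explicitly. The sketch below is a simplified model, not a parser for real router output: the Ace class, the permits_hsrp function, and the peer address 192.168.10.2 are assumptions made for this example, wildcard masks are written as prefix lengths (0.0.0.255 becomes /24), and source ports and other match options are ignored.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

HSRP_GROUP = ipaddress.ip_address("224.0.0.2")  # all-routers multicast address
HSRP_PORT = 1985                                # UDP port used by HSRP hellos
ANY = ipaddress.ip_network("0.0.0.0/0")


@dataclass
class Ace:
    """One simplified extended access list entry."""
    action: str                     # "permit" or "deny"
    protocol: str                   # "ip" matches every protocol
    src: ipaddress.IPv4Network
    dst: ipaddress.IPv4Network
    dst_port: Optional[int] = None  # None means any destination port


def permits_hsrp(acl, peer_ip):
    """Return True if the first matching entry permits an HSRP hello from peer_ip."""
    src = ipaddress.ip_address(peer_ip)
    for ace in acl:
        if ace.protocol not in ("ip", "udp"):
            continue  # for example, a pim entry never matches a UDP hello
        if src not in ace.src or HSRP_GROUP not in ace.dst:
            continue
        if ace.dst_port not in (None, HSRP_PORT):
            continue
        return ace.action == "permit"
    return False  # implicit deny at the end of every access list


# Access list 144 from the example above, rewritten in this simplified model.
ACL_144 = [
    Ace("deny", "pim", ipaddress.ip_network("238.0.10.0/24"), ANY),
    Ace("permit", "ip", ANY, ANY),
]

# A hypothetical list that would break HSRP.
ACL_BLOCKING = [
    Ace("deny", "udp", ANY, ipaddress.ip_network("224.0.0.2/32"), HSRP_PORT),
    Ace("permit", "ip", ANY, ANY),
]

if __name__ == "__main__":
    print(permits_hsrp(ACL_144, "192.168.10.2"))       # prints True
    print(permits_hsrp(ACL_BLOCKING, "192.168.10.2"))  # prints False
```

In this model, access list 144 from the example does not interfere with HSRP, because its only deny entry matches PIM traffic and the hello falls through to permit ip any any.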


 

REFERENCE LINK

http://www.cisco.com/c/en/us/support/docs/ip/hot-standby-router-protocol-hsrp/13782-8.html#diag

http://www.cisco.com/c/en/us/support/docs/ip/hot-standby-router-protocol-hsrp/10583-62.html#t1


 

 
