Cisco APIC – Fabric Provisioning

This article describes how to provision a Cisco ACI fabric. It starts with the APIC script install and ends with automatic fabric discovery, the final step of fabric provisioning. We use the topology below during the fabric setup:

topology_1

APIC Controller Setup

After all the ACI fabric devices are connected, it is time to set up the system and make it operational. We begin with the APIC controller; Cisco also calls this activity a script install. Power on your APIC server and access the CIMC interface for console access. After a few minutes you will see the initial setup dialog on the console, as shown below.

apic_setup_01

apic_setup_02

Note that I left several parameters at their defaults. One important parameter above is the infra VLAN. Make sure you choose a VLAN ID that will not be needed for any other purpose in future operation.

Fabric Discovery

Once the APIC is ready, it is time to register all fabric switches (spine and leaf) with the APIC controller. Make sure all the switches in the fabric are physically connected. On the menu bar, navigate to Fabric → Inventory. In the Navigation pane, click Fabric Membership. In the Work pane, the Fabric Membership table displays a single leaf switch with an ID of 0. This is the leaf switch that is connected to apic1. The APIC uses LLDP to discover its neighbor devices.

fabric_discovery_01

Configure the ID by double-clicking the leaf switch row, and performing the following actions:

  • In the Node ID field, add the appropriate ID (for example, leaf 1 is ID 101 and leaf 2 is ID 102). The ID must be a number greater than 100 because the first 100 IDs are for APIC appliance nodes. In this lab I am using 201 for leaf 1, 202 for leaf 2 and 101 for spine 1.
  • In the Node Name field, add the name of the switch and click UPDATE.
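Before typing IDs into the Fabric Membership table, it can help to sanity-check the numbering plan. The sketch below is only an illustration in Python: the helper name and the dictionary layout are my own assumptions, and the only rule it encodes is the one stated above (switch node IDs must be greater than 100), plus a duplicate check.

```python
# Hypothetical helper, not part of any Cisco tool: checks a planned
# node ID assignment against the rule quoted in the text above.
def validate_node_ids(plan):
    """plan maps switch name -> node ID; returns a list of problem strings."""
    problems = []
    seen = {}
    for name, node_id in plan.items():
        if node_id <= 100:
            problems.append(f"{name}: ID {node_id} is not greater than 100")
        if node_id in seen:
            problems.append(f"{name}: ID {node_id} already used by {seen[node_id]}")
        else:
            seen[node_id] = name
    return problems


# Numbering used in this lab (names follow the CLI output later in the post).
lab_plan = {"SPINE-101": 101, "LEAF-201": 201, "LEAF-202": 202}
lab_problems = validate_node_ids(lab_plan)
```

An empty list means the plan passes both checks.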

fabric_discovery_02

After the information has been updated, the switch is assigned an IP address. When that is done, another switch appears. In my case, since I only have one connection from the APIC to a leaf switch, the spine switch and then the second leaf are discovered in order, one after another.

fabric_discovery_03

Repeat the above procedure for each other switch that shows up in the Work pane.

fabric_discovery_04

To verify that all the fabric switches and the APIC controller are connected to each other, navigate to the Inventory pane and click Topology.

fabric_topology

Now your ACI fabric is ready. Happy labbing!!!

Contributor:

Wahyu Herdyanto
F5 BIG-IP Specialist
wahyu.herdyanto@gmail.com

Wendra Pesliko
Network Datacenter Specialist
pesliko@gmail.com

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCE, VCP6-DCV
nantoyudi@gmail.com

 

Cisco APIC – Fabric OS Upgrade

ACI Fabric OS Upgrade Overview

This article describes how to upgrade the operating systems of the devices in a Cisco ACI fabric. It demonstrates the upgrade process step by step on the spine switch, the leaf switches and the APIC controller. Our environment has only one spine switch, two leaf switches and one APIC controller server.

According to Cisco, the high-level steps to upgrade the ACI fabric are as follows:

  • The procedure/steps for upgrade and downgrade are the same unless stated otherwise in the release notes of a specific release.
  • Download the ACI Controller image (APIC image) into the repository.
  • Download the ACI switch image into the repository.
  • Upgrade the ACI controller cluster (APICs).
  • Verify the fabric is operational.
  • Divide the switches into multiple groups. For example, divide them into two groups, red and blue.
  • Upgrade the red group of switches.
  • Verify the fabric is operational.
  • Upgrade the blue group of switches.
  • Verify the fabric is operational.
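The "divide the switches into groups" step can be sketched in a few lines of Python. This is an illustration under my own assumptions, not a Cisco procedure: the function name is made up, and the example uses a hypothetical four-switch fabric (two spines, two leafs), whereas the lab in this article has only one spine. The idea is simply to spread each role across the groups in turn.

```python
from collections import defaultdict


def split_into_groups(nodes, group_names=("red", "blue")):
    """nodes is a list of (node_id, role) pairs.

    Each role is distributed round-robin over the groups, so every
    group ends up with a share of the spines and a share of the leafs.
    """
    groups = {name: [] for name in group_names}
    by_role = defaultdict(list)
    for node_id, role in nodes:
        by_role[role].append(node_id)
    for role in sorted(by_role):
        for index, node_id in enumerate(sorted(by_role[role])):
            groups[group_names[index % len(group_names)]].append(node_id)
    return groups


# Hypothetical fabric used only for this example.
example_nodes = [(101, "spine"), (102, "spine"), (201, "leaf"), (202, "leaf")]
example_groups = split_into_groups(example_nodes)
```

With this input, the red group holds one spine and one leaf and the blue group holds the other pair, which matches the red/blue split described in the list above.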

Pre Upgrade Verifications

Before we jump into the step-by-step ACI fabric upgrade, we need to verify the operating system currently running on the fabric. In the APIC web UI, go to the user menu at the top right. Click the username and a pop-up menu appears. Select About to display the current APIC operating system version.

APIC_Software_Info_Before.png

If you need more information about the operating systems on the fabric switches (spine and leaf), navigate to Admin → Firmware → Fabric Node Firmware in the APIC UI. The easiest way to gather all of this information is the CLI on the APIC controller, as shown below.

APIC-01# show  version 
 Role        Id          Name                      Version              
 ----------  ----------  ------------------------  -------------------- 
 controller  1           APIC-01                   2.1(2g)              
 spine       101         SPINE-101                 n9000-12.1(2g)       
 leaf        201         LEAF-201                  n9000-12.1(2g)       
 leaf        202         LEAF-202                  n9000-12.1(2g)
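If you save this output to a text file, a short script can confirm that every switch runs the same release before you start. The following is a minimal Python sketch; the function names are my own, and the parser only assumes the four-column layout shown in the output above.

```python
SHOW_VERSION = """\
 Role        Id          Name                      Version
 ----------  ----------  ------------------------  --------------------
 controller  1           APIC-01                   2.1(2g)
 spine       101         SPINE-101                 n9000-12.1(2g)
 leaf        201         LEAF-201                  n9000-12.1(2g)
 leaf        202         LEAF-202                  n9000-12.1(2g)
"""


def parse_show_version(text):
    """Return one dict per data row; header and separator rows are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[1].isdigit():
            role, node_id, name, version = fields
            rows.append({"role": role, "id": int(node_id),
                         "name": name, "version": version})
    return rows


def switch_versions(rows):
    """Set of distinct versions running on the spine and leaf switches."""
    return {row["version"] for row in rows if row["role"] != "controller"}


rows = parse_show_version(SHOW_VERSION)
versions = switch_versions(rows)
```

A set with a single element means all switches are on the same release.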

Step-by-Step Upgrade

This section describes the step-by-step upgrade process for the Cisco ACI fabric (spine switch, leaf switches and APIC controller).

Upload images

After you obtain the OS images from Cisco, it is time to upload them to the APIC server. I used an HTTP download to transfer the images to the controller. I set up my PC as an HTTP server with directory listing enabled, as shown below, so it serves a list of the files in a directory rather than a web page.

http_server.png
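If you do not have an HTTP server at hand, Python's standard library can serve a directory listing. This is a sketch of one possible setup, not necessarily the server used for the screenshot above; the function name, the default port 8000 and the example path are assumptions for illustration.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_listing_server(directory, host="0.0.0.0", port=8000):
    """Return an HTTP server that serves `directory`.

    SimpleHTTPRequestHandler answers a request for a directory with a
    generated file listing when the directory has no index.html.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)


# Usage (blocks until interrupted with Ctrl-C); the path is hypothetical:
#   make_listing_server("/path/to/aci-images").serve_forever()
```

The firmware download task can then point at the PC's address and port, with the image file name as the path.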

Now, from the APIC web UI, navigate to Admin → Firmware → Download Tasks → ACTION → Create Firmware Download Task. Fill in the dialog box with the required information and submit when you are done.

download_task_apic.png

Repeat the above procedure for the switch OS image. Navigate to the Operational tab to view the progress.

download_task_progress.png

Once it is done, verify the result using the Firmware Repository tab on the left panel. If you want to verify it from the CLI, you may use the command below.

APIC-01# show firmware repository 
 Name                                      Type        Version        Size(MB)   
 ----------------------------------------  ----------  -------------  ---------- 
 aci-apic-dk9.2.2.2j.bin                   controller  2.2(2j)        2966.320   
 aci-catalog-dk9.2.2.2j.bin                catalog     2.2(2j)        0.040      
 aci-n9000-dk9.12.2.2j.bin                 switch      12.2(2j)       1132.285
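In the three rows above, the version string can be read straight from the image file name. The Python sketch below encodes that pattern; it is inferred only from these three file names, so treat it as an assumption rather than a documented naming rule.

```python
import re


def version_from_image(filename):
    """Map e.g. 'aci-apic-dk9.2.2.2j.bin' to '2.2(2j)'.

    Pattern inferred from the repository listing above; returns None
    for names that do not match it.
    """
    match = re.search(r"dk9\.(\d+)\.(\d+)\.(\w+)\.bin$", filename)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return f"{major}.{minor}({patch})"


repository = {
    "aci-apic-dk9.2.2.2j.bin": "2.2(2j)",
    "aci-catalog-dk9.2.2.2j.bin": "2.2(2j)",
    "aci-n9000-dk9.12.2.2j.bin": "12.2(2j)",
}
mismatches = [name for name, version in repository.items()
              if version_from_image(name) != version]
```

An empty `mismatches` list means every file name agrees with the version reported by the repository.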

Controller Upgrade

The next step is upgrading the APIC controller. Hover your mouse over Controller Firmware, right-click and select Controller Upgrade. Provide all the information requested in the dialog box.

Notice that the dialog reports one major fault, which it recommends we resolve before continuing with the controller upgrade. We navigated to System → Controllers → Faults and double-clicked the fault entry, which displays the list of faults in that domain. Double-click any fault you want to know more about. Below is ours.

According to the information, one port on the controller is down. In this case we are good to go with the upgrade, since I have another port working. From the CLI you can see the output below.

APIC-01# show faults controller
Code            : F0103
Severity        : major
Last Transition : 2017-08-06T23:47:33.420+07:00
Lifecycle       : raised
DN              : topology/pod-1/node-1/sys/cphys-[eth1/2]/fault-F0103
Description     : Physical Interface eth1/2 on Node 1 is now down
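The fault record is a simple list of "key : value" lines, so it is easy to turn into a dictionary for a scripted pre-check. This Python sketch uses a made-up function name, and the list of severities treated as blocking is my own assumption.

```python
FAULT = """\
Code            : F0103
Severity        : major
Last Transition : 2017-08-06T23:47:33.420+07:00
Lifecycle       : raised
DN              : topology/pod-1/node-1/sys/cphys-[eth1/2]/fault-F0103
Description     : Physical Interface eth1/2 on Node 1 is now down
"""


def parse_fault(text):
    """Split each line at the first ' : ' so colons inside values survive."""
    record = {}
    for line in text.splitlines():
        key, separator, value = line.partition(" : ")
        if separator:
            record[key.strip()] = value.strip()
    return record


fault = parse_fault(FAULT)
# Assumption for this sketch: review critical and major faults before upgrading.
needs_review = fault["Severity"] in {"critical", "major"}
```

Here `needs_review` is true, which matches the warning shown in the upgrade dialog.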

If you consider this fault unimportant, you may tick it in the list and click the gear icon to apply Hide Acked Fault.

Click Submit in the Controller Upgrade dialog box and your APIC server starts the upgrade process. During the upgrade, you might lose connectivity to the APIC server until it finishes.

Fabric Switches Upgrade

For the switches, Cisco recommends running the upgrade in groups. For example, if we have two spines and two leafs, we can put spine one and leaf one in group one, and spine two and leaf two in group two. The group definition determines which switches upgrade their OS together. Upgrading in groups keeps downtime for the endpoint group devices to a minimum, because at least one spine and one leaf switch remain available to carry data-plane traffic.
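The reasoning behind the grouping can be checked with a small Python sketch. The function name is hypothetical; for each group it tests whether the switches outside that group still include at least one spine and one leaf.

```python
def groups_keep_fabric_up(roles, groups):
    """roles maps node ID -> 'spine' or 'leaf'; groups maps name -> node IDs.

    Returns {group name: bool}. True means that while this group is
    upgrading, the remaining switches still include a spine and a leaf.
    """
    result = {}
    for name, members in groups.items():
        remaining = {role for node, role in roles.items() if node not in members}
        result[name] = {"spine", "leaf"} <= remaining
    return result


# Hypothetical two-spine fabric: both groups leave a working pair behind.
two_spine = groups_keep_fabric_up(
    {101: "spine", 102: "spine", 201: "leaf", 202: "leaf"},
    {"group-1": [101, 201], "group-2": [102, 202]},
)

# The lab in this article has a single spine (101).
lab = groups_keep_fabric_up(
    {101: "spine", 201: "leaf", 202: "leaf"},
    {"Group-1": [101, 201], "Group-2": [202]},
)
```

For the single-spine lab the check fails for the group that contains spine 101, so a data-plane interruption should be expected while that group upgrades.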

Navigate to Admin → Firmware → Fabric Node Firmware → Firmware Groups, right click and select Create Firmware Group.

The two important fields to fill in are Target Firmware Version and Group Node Ids. The group node IDs define which devices will use the selected firmware as their target OS. Click Submit when you are done.

Below is the CLI output from the APIC.

APIC-01# show running-config firmware switch-group 
# Command: show running-config firmware switch-group
# Time: Sat Sep  2 22:24:08 2017
  firmware
    switch-group OS-12.2.2j
      switch 101
      switch 201
      switch 202
      firmware-version aci-n9000-dk9.12.2.2j.bin
      exit
    exit

Now navigate to Maintenance Groups, right-click and select Create POD Maintenance Group. Fill in the required information. In this group I want to put spine 101 and leaf 201 together as Group-1.

I left Run Mode and Scheduler as they are. You can schedule an upgrade using the Scheduler drop-down menu.

Click Submit when you are done. Repeat the above procedure for the other leaf switch. You will then see the summary of your upgrade groups.

With the pre-upgrade activity finished, we can now start upgrading the fabric switches. From the Maintenance Groups tree on the left panel, right-click Group-1 and select Upgrade Now to initiate the upgrade process.

Once the upgrade of Group-1 is done, repeat the same procedure for Group-2. You can see the upgrade status by clicking each group name on the left panel under Maintenance Groups.

From the CLI you may use the command below to track the upgrade status of each fabric switch.

APIC-01# show firmware upgrade status 
  Node-Id     Current-Firmware      Target-Firmware       Status                     Upgrade-Progress(%)  
 ----------  --------------------  --------------------  -------------------------  -------------------- 
 1           apic-2.2(2j)          apic-2.2(2j)          success                    100                  
 101         n9000-12.2(2j)        n9000-12.2(2j)        success                    100                  
 201         n9000-12.2(2j)        n9000-12.2(2j)        success                    100                  
 202         n9000-12.2(2j)        n9000-12.2(2j)        success                    100
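To turn this table into a single pass/fail answer, the rows can be parsed and checked in Python. This is a sketch with made-up function names; it assumes the five-column layout shown above and treats the upgrade as complete only when every node reports success, 100 percent, and matching current and target firmware.

```python
UPGRADE_STATUS = """\
 Node-Id     Current-Firmware      Target-Firmware       Status      Upgrade-Progress(%)
 ----------  --------------------  --------------------  ----------  --------------------
 1           apic-2.2(2j)          apic-2.2(2j)          success     100
 101         n9000-12.2(2j)        n9000-12.2(2j)        success     100
 201         n9000-12.2(2j)        n9000-12.2(2j)        success     100
 202         n9000-12.2(2j)        n9000-12.2(2j)        success     100
"""


def parse_upgrade_status(text):
    """Return one dict per node row; header and separator rows are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0].isdigit():
            rows.append({"node": int(fields[0]), "current": fields[1],
                         "target": fields[2], "status": fields[3],
                         "progress": int(fields[4])})
    return rows


def upgrade_complete(rows):
    """True only if there is at least one row and every row is finished."""
    return bool(rows) and all(
        row["status"] == "success"
        and row["progress"] == 100
        and row["current"] == row["target"]
        for row in rows
    )


status_rows = parse_upgrade_status(UPGRADE_STATUS)
done = upgrade_complete(status_rows)
```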

That’s all, happy labbing!!!

Source:

Operating Cisco Application Centric Infrastructure

Contributor:
Wahyu Herdyanto
F5 BIG-IP Specialist
wahyu.herdyanto@gmail.com

Wendra Pesliko
Network Datacenter Specialist
pesliko@gmail.com

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCE, VCP6-DCV
nantoyudi@gmail.com

Cisco Nexus 7000 VDC Administration

This article describes how to manage Virtual Device Contexts (VDCs) on the Cisco Nexus 7010. Cisco’s VDC feature virtualizes a single physical device into one or more logical devices. In this lab we are going to cover the following scenarios:

  • Configuring Admin VDC
  • Configuring VDC Resource and Templates
  • Managing VDCs

Configuring Admin VDC

Admin VDC Without Migrate Option

You can create an Admin VDC in one of the following ways:

  • After a fresh switch bootup.
  • Enter the system admin-vdc command after bootup. All the nonglobal configuration in the default VDC is lost after you enter this command.
  • Use the system admin-vdc migrate new-vdc-name command to migrate the nonglobal configuration of the default VDC to a new VDC.

In this section we focus on creating the admin VDC after the switch has booted. In this lab environment I am using a Nexus 7010 chassis with SUP2E and NX-OS 6.2(16).

Now let’s verify our default vdc configuration.

N7K-ADMIN# show run vdc
...
version 6.2(16)
no system admin-vdc
vdc N7K-ADMIN id 1
  limit-resource module-type m1 m1xl m2xl f2e 
  cpu-share 5
  allocate interface ethernet1/1-12
  limit-resource vlan minimum 16 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                       state               mac                 type        lc      
------  --------                       -----               ----------          ---------   ------  
1       N7K-ADMIN                      active              40:55:39:0e:43:41   Ethernet    m1 f1 m1xl m2xl
N7K-ADMIN# show interface description 

-------------------------------------------------------------------------------
Interface                Description                                            
-------------------------------------------------------------------------------
mgmt0                    ***Management_Interface***

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth1/1        eth    1000    --
Eth1/2        eth    1000    --
Eth1/3        eth    1000    --
Eth1/4        eth    1000    --
Eth1/5        eth    1000    --
Eth1/6        eth    1000    --
Eth1/7        eth    1000    --
Eth1/8        eth    1000    --
Eth1/9        eth    1000    --
Eth1/10       eth    1000    --
Eth1/11       eth    1000    --
Eth1/12       eth    1000    --

Now configure your system to use admin vdc.

N7K-ADMIN(config)# system admin-vdc 
All non-global configuration from the default vdc will be removed, Are you sure you want to continue? (yes/no) [no] yes
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None
N7K-ADMIN# show interface description 

-------------------------------------------------------------------------------
Interface                Description                                            
-------------------------------------------------------------------------------
mgmt0                    ***Management_Interface***

Notice that we now have no interfaces from the physical modules; only the management interface is allowed on the admin VDC. Now let’s try to allocate some interfaces to it and see how the admin VDC responds.

N7K-ADMIN(config)# vdc N7K-ADMIN  
N7K-ADMIN(config-vdc)# allocate interface ethernet1/1-12
Moving ports will cause all config associated to them in source vdc to be removed. Are you sure you want to move the ports (y/n)?  [yes] 
ERROR: 1 or more interfaces are from a module of type not supported by this vdc

According to the error message, it is clear that the admin VDC only allows the management interface. Given its purpose as an administrative VDC, the management interface is the only one it needs.

Admin VDC With Migrate Option

Another method to create the admin VDC is to add the migrate option. This method allows you to keep your default VDC configuration. This option is recommended for existing deployments where the default VDC is used for production traffic whose downtime must be minimized.

Let’s verify our default VDC configuration before we make any changes.

N7K-ADMIN# sh run vdc
...
version 6.2(16)
no system admin-vdc
vdc N7K-ADMIN id 1
  limit-resource module-type m1 m1xl m2xl f2e 
  cpu-share 5
  limit-resource vlan minimum 16 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Ethernet    m1 m1xl m2xl f2e
N7K-ADMIN# show interface description 

-------------------------------------------------------------------------------
Interface                Description                                            
-------------------------------------------------------------------------------
mgmt0                    --

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth1/1        eth    1000    --
Eth1/2        eth    1000    --
Eth1/3        eth    1000    --
Eth1/4        eth    1000    --
Eth1/5        eth    1000    --
Eth1/6        eth    1000    --
Eth1/7        eth    1000    --
Eth1/8        eth    1000    --
Eth1/9        eth    1000    --
Eth1/10       eth    1000    --
Eth1/11       eth    1000    --
Eth1/12       eth    1000    --

Now we will configure the admin VDC on our system and add a new VDC (N7K-DEV) that receives the migrated default VDC configuration.

N7K-ADMIN(config)# system admin-vdc migrate N7K-DEV
All non-global configuration from the default vdc will be removed, Are you sure you want to continue? (yes/no) [no] yes
Note: Interface mgmt0 will not have its ip address migrated to the new vdc
Note: During migration some configuration may not be migrated. Example: VTP will need to be reconfigured in the new vdc if it was enabled. Please refer to configuration guide for details
Please wait, this may take a while
Note: Ctrl-C has been temporarily disabled for the duration of this command
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e
N7K-ADMIN# show run vdc
...
version 6.2(16)
system admin-vdc
vdc N7K-ADMIN id 1
  cpu-share 5
  limit-resource vlan minimum 16 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12
vdc N7K-DEV id 2
  limit-resource module-type m1 m1xl m2xl f2e 
  cpu-share 5
  allocate interface Ethernet1/1-12
  boot-order 1
  limit-resource vlan minimum 16 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12

vdc resource template admin-vdc-migrate-template
  limit-resource vlan minimum 16 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12

After the admin VDC is created, you will also have a VDC resource template (admin-vdc-migrate-template) that was created along with it. We will cover VDC resource templates in the next section. Now log in to the new VDC using switchto vdc vdc_name and verify that it has the same interface membership as the original default VDC.

N7K-ADMIN# switchto vdc N7K-DEV 
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2016, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
N7K-ADMIN-N7K-DEV#
N7K-ADMIN-N7K-DEV# sh interface description 

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth1/1        eth    1000    --
Eth1/2        eth    1000    --
Eth1/3        eth    1000    --
Eth1/4        eth    1000    --
Eth1/5        eth    1000    --
Eth1/6        eth    1000    --
Eth1/7        eth    1000    --
Eth1/8        eth    1000    --
Eth1/9        eth    1000    --
Eth1/10       eth    1000    --
Eth1/11       eth    1000    --
Eth1/12       eth    1000    --

Configuring VDC Resource and Templates

VDC resource templates set the minimum and maximum limits for shared physical device resources when you create the VDC. The Cisco NX-OS software reserves the minimum limit for the resource to the VDC. Any resources allocated to the VDC beyond the minimum are based on the maximum limit and availability on the device.
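The minimum and maximum rule can be pictured with a small calculation. The Python sketch below is a simplified model of the paragraph above, not the actual NX-OS allocator; the function name and the `free_on_device` parameter (unreserved units left on the chassis) are assumptions for illustration.

```python
def additional_units(minimum, maximum, used, free_on_device):
    """How many more units of a resource a VDC could still obtain.

    Simplified model: the part of the minimum not yet used is reserved
    for the VDC, anything beyond that comes from the device's free pool,
    and the total can never exceed the maximum.
    """
    reserved_left = max(minimum - used, 0)
    headroom_to_max = maximum - used
    return max(min(headroom_to_max, reserved_left + free_on_device), 0)


# VLAN limits of 20 to 4094 with 5 VLANs in use:
vlan_no_free_pool = additional_units(20, 4094, 5, free_on_device=0)
vlan_large_pool = additional_units(20, 4094, 5, free_on_device=10000)
# Equal minimum and maximum (96 to 96) makes the allocation fixed:
route_mem = additional_units(96, 96, 1, free_on_device=1000)
```

When the free pool is empty, the VDC can still grow to its reserved minimum; when minimum equals maximum, availability elsewhere on the device makes no difference.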

Below is an example of a VDC resource template that we are using.

N7K-ADMIN(config)# vdc resource template new_vdc_template
N7K-ADMIN(config-vdc-template)# limit-resource vlan minimum 20 maximum 4094
N7K-ADMIN(config-vdc-template)#   limit-resource monitor-session minimum 0 maximum 2
N7K-ADMIN(config-vdc-template)#   limit-resource monitor-session-erspan-dst minimum 0 maximum 23
N7K-ADMIN(config-vdc-template)#   limit-resource vrf minimum 2 maximum 4096
N7K-ADMIN(config-vdc-template)#   limit-resource port-channel minimum 0 maximum 768
N7K-ADMIN(config-vdc-template)#   limit-resource u4route-mem minimum 96 maximum 96
N7K-ADMIN(config-vdc-template)#   limit-resource u6route-mem minimum 24 maximum 24
N7K-ADMIN(config-vdc-template)#   limit-resource m4route-mem minimum 58 maximum 58
N7K-ADMIN(config-vdc-template)#   limit-resource m6route-mem minimum 8 maximum 8
N7K-ADMIN(config-vdc-template)#   limit-resource monitor-session-inband-src minimum 0 maximum 1
N7K-ADMIN(config-vdc-template)#   limit-resource anycast_bundleid minimum 0 maximum 16
N7K-ADMIN(config-vdc-template)#   limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
N7K-ADMIN(config-vdc-template)#   limit-resource monitor-session-extended minimum 0 maximum 12
N7K-ADMIN# show run vdc
...
vdc resource template new_vdc_template
  limit-resource vlan minimum 20 maximum 4094
  limit-resource monitor-session minimum 0 maximum 2
  limit-resource monitor-session-erspan-dst minimum 0 maximum 23
  limit-resource vrf minimum 2 maximum 4096
  limit-resource port-channel minimum 0 maximum 768
  limit-resource u4route-mem minimum 96 maximum 96
  limit-resource u6route-mem minimum 24 maximum 24
  limit-resource m4route-mem minimum 58 maximum 58
  limit-resource m6route-mem minimum 8 maximum 8
  limit-resource monitor-session-inband-src minimum 0 maximum 1
  limit-resource anycast_bundleid minimum 0 maximum 16
  limit-resource monitor-session-mx-exception-src minimum 0 maximum 1
  limit-resource monitor-session-extended minimum 0 maximum 12

Now create a new VDC based on our VDC resource template.

N7K-ADMIN(config)# vdc N7K-PROD template new_vdc_template 
Note:  Creating VDC, one moment please ...
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          active              40:55:39:0e:43:43   Ethernet    m1 m1xl m2xl f2e
N7K-ADMIN# show vdc N7K-PROD detail 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc id: 3
vdc name: N7K-PROD
vdc state: active
vdc mac address: 40:55:39:0e:43:43
vdc ha policy: RESTART
vdc dual-sup ha policy: SWITCHOVER
vdc boot Order: 1
CPU Share: 5
CPU Share Percentage: 33%
vdc create time: Fri Apr  7 09:25:56 2017
vdc reload count: 0
vdc uptime: 0 day(s), 0 hour(s), 1 minute(s), 32 second(s)
vdc restart count: 0
vdc type: Ethernet
vdc supported linecards: m1 m1xl m2xl f2e
N7K-ADMIN# show vdc N7K-PROD resource

     Resource                   Min       Max       Used      Unused    Avail    
     --------                   ---       ---       ----      ------    -----    
     vlan                       20        4094      5         15        4089     
     monitor-session            0         2         0         0         2        
     monitor-session-erspan-dst 0         23        0         0         23       
     vrf                        2         4096      2         0         4090     
     port-channel               0         768       0         0         768      
     u4route-mem                96        96        1         95        95       
     u6route-mem                24        24        1         23        23       
     m4route-mem                58        58        1         57        57       
     m6route-mem                8         8         1         7         7        
     monitor-session-inband-src 0         1         0         0         1        
     anycast_bundleid           0         16        0         0         16       
     monitor-session-mx-excepti 0         1         0         0         1        
     monitor-session-extended   0         12        0         0         12
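In the rows shown above, the Unused column equals the reserved minimum that is not yet in use, that is, Min minus Used, never below zero. The Python sketch below checks that relationship on a few of the rows. It is an observation about this particular output, not a formula taken from Cisco documentation, and the function names are my own.

```python
SHOW_VDC_RESOURCE = """\
     Resource                   Min       Max       Used      Unused    Avail
     --------                   ---       ---       ----      ------    -----
     vlan                       20        4094      5         15        4089
     vrf                        2         4096      2         0         4090
     port-channel               0         768       0         0         768
     u4route-mem                96        96        1         95        95
"""


def parse_resources(text):
    """Return {resource name: {min, max, used, unused, avail}}."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6 and all(f.isdigit() for f in fields[1:]):
            minimum, maximum, used, unused, avail = map(int, fields[1:])
            rows[fields[0]] = {"min": minimum, "max": maximum, "used": used,
                               "unused": unused, "avail": avail}
    return rows


def unused_matches_reservation(row):
    """Check Unused == max(Min - Used, 0) for one parsed row."""
    return row["unused"] == max(row["min"] - row["used"], 0)


resources = parse_resources(SHOW_VDC_RESOURCE)
all_consistent = all(unused_matches_reservation(r) for r in resources.values())
```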

As you can see in the resource output above, our VDC was assigned the limits from our configuration template. Let’s allocate some interfaces from the line card and verify it.

N7K-ADMIN(config-vdc)# allocate interface ethernet1/13-18
Moving ports will cause all config associated to them in source vdc to be removed. Are you sure you want to move the ports (y/n)?  [yes]
N7K-ADMIN# show vdc N7K-PROD membership 
Flags : b - breakout port
---------------------------------

vdc_id: 3 vdc_name: N7K-PROD interfaces:
        Ethernet1/13          Ethernet1/14          Ethernet1/15          
        Ethernet1/16          Ethernet1/17          Ethernet1/18
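A quick way to confirm that the membership output covers the whole range you allocated is to expand the range and compare. The Python sketch below uses a hypothetical helper that handles only the simple single-slot form used in this article (for example ethernet1/13-18).

```python
import re

MEMBERSHIP = """\
vdc_id: 3 vdc_name: N7K-PROD interfaces:
        Ethernet1/13          Ethernet1/14          Ethernet1/15
        Ethernet1/16          Ethernet1/17          Ethernet1/18
"""


def expand_interface_range(spec):
    """Expand 'ethernet1/13-18' into ['Ethernet1/13', ..., 'Ethernet1/18']."""
    match = re.fullmatch(r"(?i)ethernet(\d+)/(\d+)-(\d+)", spec)
    if match is None:
        raise ValueError(f"unsupported interface range: {spec!r}")
    slot, first, last = map(int, match.groups())
    return [f"Ethernet{slot}/{port}" for port in range(first, last + 1)]


expected = expand_interface_range("ethernet1/13-18")
reported = re.findall(r"Ethernet\d+/\d+", MEMBERSHIP)
missing = sorted(set(expected) - set(reported))
```

An empty `missing` list means every port in the requested range shows up in the VDC membership.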

Log in to VDC N7K-PROD and verify that it already has its interfaces. Since this is a new VDC, it behaves like a new switch. It will ask you to set up a password and perform other administrative tasks, just as when you access a switch after a fresh bootup.

N7K-ADMIN# switchto vdc N7K-PROD 


         ---- System Admin Account Setup ----


Do you want to enforce secure password standard (yes/no) [y]: no

  Enter the password for "admin": 
  Confirm the password for "admin": 

         ---- Basic System Configuration Dialog VDC: 3 ----

This setup utility will guide you through the basic configuration of
the system. Setup configures only enough connectivity for management
of the system.

Please register Cisco Nexus7000 Family devices promptly with your
supplier. Failure to register may affect response times for initial
service calls. Nexus7000 devices must be registered to receive 
entitled support services.

Press Enter at anytime to skip a dialog. Use ctrl-c at anytime
to skip the remaining dialogs.

Would you like to enter the basic configuration dialog (yes/no): no
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2016, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
N7K-ADMIN-N7K-PROD#
N7K-ADMIN-N7K-PROD# show interface description 

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth1/13       eth    1000    --
Eth1/14       eth    1000    --
Eth1/15       eth    1000    --
Eth1/16       eth    1000    --
Eth1/17       eth    1000    --
Eth1/18       eth    1000    --

Managing VDCs

Reloading VDCs

After we create VDCs, we can modify their parameters according to the needs of our network environment. In this subsection we focus on some administrative tasks for your VDCs. Before we execute any command, let’s verify all the VDCs we have on our Nexus 7000.

N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          active              40:55:39:0e:43:43   Ethernet    f2
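If you want to track VDC states from a script, the show vdc table can be parsed by splitting each row on runs of two or more spaces, which keeps a multi-word state in one field. The Python sketch below uses made-up names; the sample text combines rows that appear in the outputs of this section, including the "resume in progress" state seen during a reload.

```python
import re

SHOW_VDC = """\
vdc_id  vdc_name                          state               mac                 type        lc
------  --------                          -----               ----------          ---------   ------
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e
3       N7K-PROD                          resume in progress  40:55:39:0e:43:43   Ethernet    f2
"""


def parse_show_vdc(text):
    """Return {vdc name: details}; header and separator rows are skipped."""
    vdcs = {}
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) == 6 and fields[0].isdigit():
            vdc_id, name, state, mac, vdc_type, linecards = fields
            vdcs[name] = {"id": int(vdc_id), "state": state, "mac": mac,
                          "type": vdc_type, "lc": linecards.split()}
    return vdcs


vdcs = parse_show_vdc(SHOW_VDC)
not_active = [name for name, info in vdcs.items() if info["state"] != "active"]
```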

Now we will pick vdc N7K-PROD as a target for this test.

N7K-ADMIN# reload vdc N7K-PROD 
Are you sure you want to reload this vdc (y/n)?  [no] yes
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          resume in progress  40:55:39:0e:43:43   Ethernet    f2

To measure how long the reload takes, I ran a continuous ping to the management interface of the N7K-PROD VDC. After about 30 seconds, the N7K-PROD VDC was back online.

N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          active              40:55:39:0e:43:43   Ethernet    f2
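
The reload time above was measured by hand with a continuous ping. As a rough sketch of the same idea, the Python snippet below derives the outage length from a list of timestamped probe results. The function name longest_outage and the sample data are illustrative assumptions, not captured from the lab.

```python
from typing import Iterable, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Longest gap, in seconds, between the last reply before an outage
    and the first reply after it."""
    longest = 0.0
    last_ok = None   # timestamp of the most recent successful reply
    lost = False     # True once a probe has failed since last_ok
    for timestamp, ok in samples:
        if ok:
            if lost and last_ok is not None:
                longest = max(longest, timestamp - last_ok)
            last_ok = timestamp
            lost = False
        else:
            lost = True
    return longest


# Hypothetical one-second probes: replies stop after t=1 and resume at t=31.
samples = [(0.0, True), (1.0, True)]
samples += [(float(t), False) for t in range(2, 31)]
samples += [(31.0, True), (32.0, True)]
print(longest_outage(samples))
```

The result is only as precise as the probe interval, so a one-second ping gives roughly one-second resolution.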

Suspending VDCs

Having reloaded the VDC, we will now suspend it.

N7K-ADMIN(config)# vdc N7K-PROD suspend 
This command will suspend the VDC. (y/n)? [no] yes
Note: Suspending vdc N7K-PROD
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          suspended           40:55:39:0e:43:43   Ethernet    f2

To measure how long the VDC takes to come back online, we use the same continuous ping as before. Now let’s resume the VDC.

N7K-ADMIN(config)# no vdc N7K-PROD suspend 
Note: Resuming vdc N7K-PROD

After about 30 seconds, we can see that VDC N7K-PROD is active again.

N7K-ADMIN# show vdc

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          active              40:55:39:0e:43:43   Ethernet    f2

Managing VDC Interfaces

When you create a VDC, you can allocate I/O interfaces to it. An important recommendation for port allocation is to allocate all ports in the same port group to the same VDC. Beginning with Cisco NX-OS Release 5.2(1) for Nexus 7000 series devices, all members of a port group are automatically allocated to the VDC when you allocate an interface.

In this lab I am using two line cards: an N7K-M148GT-11 in slot 1 and an N7K-F248XP-25 in slot 2. Let’s see how to verify the port groups on each line card.

Module N7K-M148GT-11

M1_Port_Group

Module N7K-F248XP-25

F2_Port_Group

We omitted the rest of the output because this portion is enough to understand how port groups are allocated on each line card. The interface number is listed in the FP port column, and the port ASIC number is listed in the MAC_0 column. In the example above, this means that in slot 1 interfaces 1 through 12 share the same port ASIC (0), and in slot 2 interfaces 1 through 4 share the same port ASIC (0).

When interfaces in different VDCs share the same port ASIC, reloading the VDC (with the reload vdc command) or provisioning interfaces to the VDC (with the allocate interface command) might cause short traffic disruptions (of 1 to 2 seconds) for these interfaces. If such behavior is undesirable, make sure to allocate all interfaces on the same port ASIC to the same VDC.
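
To check an allocation plan against this recommendation, a small Python sketch such as the one below can flag port groups that are split across VDCs. It assumes that port groups are contiguous blocks of equal size (12 ports per group on the slot 1 card and 4 on the slot 2 card, extrapolated from the first group shown above). The VDC names and the allocation are hypothetical.

```python
from typing import Dict, List, Set


def port_group(port: int, ports_per_group: int) -> int:
    """Zero-based port-group index for a 1-based front-panel port."""
    if port < 1 or ports_per_group < 1:
        raise ValueError("port and ports_per_group must be positive")
    return (port - 1) // ports_per_group


def split_groups(allocation: Dict[str, List[int]],
                 ports_per_group: int) -> Dict[int, Set[str]]:
    """Return the port groups whose ports belong to more than one VDC."""
    owners: Dict[int, Set[str]] = {}
    for vdc, ports in allocation.items():
        for port in ports:
            group = port_group(port, ports_per_group)
            owners.setdefault(group, set()).add(vdc)
    return {group: vdcs for group, vdcs in owners.items() if len(vdcs) > 1}


# Hypothetical plan for a card with 4 ports per group:
# port 5 lands in the same group as ports 6-8, which belong to another VDC.
allocation = {"VDC-A": [1, 2, 3, 4, 5], "VDC-B": [6, 7, 8]}
print(split_groups(allocation, ports_per_group=4))
```

An empty result means every port group in the plan belongs to a single VDC.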

VDC Boot Order

Imagine you have one VDC connected to web servers, another connected to application servers, and a third connected to database servers. If the switch reloads because of a power outage or another force majeure incident, you may want specific VDCs to come up first so that the application tiers can communicate properly. Boot order is another VDC setting you can adjust for this: the boot order value controls which VDC comes up first.

Use the commands below to adjust the boot order value. By default, the boot order is 1.

N7K-ADMIN(config)# vdc N7K-DEV
N7K-ADMIN(config-vdc)# boot-order 2
N7K-ADMIN# show vdc detail 
....
vdc id: 2
vdc name: N7K-DEV
vdc state: active
vdc mac address: 40:55:39:0e:43:42
vdc ha policy: BRINGDOWN
vdc dual-sup ha policy: SWITCHOVER
vdc boot Order: 2
CPU Share: 5
CPU Share Percentage: 33%
vdc create time: Tue Apr 11 11:53:32 2017
vdc reload count: 0
vdc uptime: 0 day(s), 0 hour(s), 52 minute(s), 25 second(s)
vdc restart count: 0
vdc type: Ethernet
vdc supported linecards: m1 m1xl m2xl f2e 
...

You cannot modify the boot order of the admin/default VDC. An error occurs when you try to modify it.

N7K-ADMIN(config)# vdc N7K-ADMIN 
N7K-ADMIN(config-vdc)# boot-order 1
ERROR: Default vdc boot order cannot be changed

Now let’s test this by reloading the box and watching the progress of each VDC.

N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                type        lc      
------  --------                          -----               ----------         ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41  Admin       None    
2       N7K-DEV                           create pending      40:55:39:0e:43:42  Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          create pending      40:55:39:0e:43:43  Ethernet    f2
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           create in progress  40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          create pending      40:55:39:0e:43:43   Ethernet    f2
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          create in progress  40:55:39:0e:43:43   Ethernet    f2
N7K-ADMIN# show vdc 

Switchwide mode is m1 f1 m1xl f2 m2xl f2e f3 

vdc_id  vdc_name                          state               mac                 type        lc      
------  --------                          -----               ----------          ---------   ------  
1       N7K-ADMIN                         active              40:55:39:0e:43:41   Admin       None    
2       N7K-DEV                           active              40:55:39:0e:43:42   Ethernet    m1 m1xl m2xl f2e 
3       N7K-PROD                          active              40:55:39:0e:43:43   Ethernet    f2

As the output above shows, the VDCs become active one after another. Without a boot order configured, all VDCs start to become active at the same time.
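
The behaviour can be pictured as grouping VDCs into start-up "waves" by boot order value. The Python sketch below is only a model of that idea, not NX-OS logic. It assumes that lower values start first, which matches N7K-DEV (boot order 2) coming up before N7K-PROD in the output above. The value 3 for N7K-PROD is an assumption, since its boot order is not shown.

```python
from itertools import groupby
from typing import Dict, List


def boot_waves(boot_order: Dict[str, int]) -> List[List[str]]:
    """Group VDC names into waves; VDCs sharing a value start together."""
    ordered = sorted(boot_order.items(), key=lambda item: (item[1], item[0]))
    return [[name for name, _ in group]
            for _, group in groupby(ordered, key=lambda item: item[1])]


# Assumed values: N7K-DEV is 2 (from the config above), N7K-PROD is hypothetical.
print(boot_waves({"N7K-DEV": 2, "N7K-PROD": 3}))

# Default boot order of 1 everywhere: a single wave, all VDCs start together.
print(boot_waves({"N7K-DEV": 1, "N7K-PROD": 1}))
```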

VDC Hostname

You can change the format of the CLI prompt for nondefault VDCs. By default, the prompt is a combination of the default VDC name and the nondefault VDC name. You can change the prompt to contain only the nondefault VDC name by using no vdc combined-hostname. You can apply this either to a specific nondefault VDC or to all VDCs. Let’s verify the nondefault VDC hostname before we change it.

N7K-ADMIN# switchto vdc N7K-PROD 
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2016, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
N7K-ADMIN-N7K-PROD#

Apply the config and see the change.

N7K-ADMIN(config)# no vdc combined-hostname
N7K-ADMIN# switchto vdc N7K-PROD 
Cisco Nexus Operating System (NX-OS) Software
TAC support: http://www.cisco.com/tac
Copyright (c) 2002-2016, Cisco Systems, Inc. All rights reserved.
The copyrights to certain works contained in this software are
owned by other third parties and used and distributed under
license. Certain components of this software are licensed under
the GNU General Public License (GPL) version 2.0 or the GNU
Lesser General Public License (LGPL) Version 2.1. A copy of each
such license is available at
http://www.opensource.org/licenses/gpl-2.0.php and
http://www.opensource.org/licenses/lgpl-2.1.php
N7K-PROD#

Now the nondefault VDC prompt no longer includes the name of the admin/default VDC.

VDC Management Interface

The Nexus SUP2E module has one physical port for management. As mentioned earlier, this management interface belongs to the admin/default VDC.

N7K-ADMIN# show interface description 

-------------------------------------------------------------------------------
Interface                Description                                            
-------------------------------------------------------------------------------
mgmt0                    --

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth1/1        eth    1000    --
Eth1/2        eth    1000    --
Eth1/3        eth    1000    --
Eth1/4        eth    1000    --
Eth1/5        eth    1000    --
Eth1/6        eth    1000    --
Eth1/7        eth    1000    --
Eth1/8        eth    1000    --
Eth1/9        eth    1000    --
Eth1/10       eth    1000    --
Eth1/11       eth    1000    --
Eth1/12       eth    1000    --

When we create VDCs in addition to the default VDC, the management interface is shared across the VDCs. You can configure each VDC with an IP address in the same subnet as the management addresses of the other VDCs. For example, I configured the N7K-ADMIN VDC with 10.10.10.1/24, N7K-DEV with 10.10.10.2/24, N7K-PROD with 10.10.10.3/24, and the management PC with 10.10.10.100/24. Note that on a nondefault VDC, the management interface is not shown when you execute show interface description until it has been configured.

N7K-PROD# show interface description 


-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth2/1        eth    10G     --
Eth2/2        eth    10G     --
Eth2/3        eth    10G     --
Eth2/4        eth    10G     --

You can configure the management interface just as you would on the admin/default VDC.

N7K-PROD(config)# interface mgmt 0
N7K-PROD(config-if)# vrf member management 
N7K-PROD(config-if)# ip address 10.10.10.3/24
N7K-PROD(config-if)# Description ***Management_Link***
N7K-PROD(config-if)# no shut
N7K-PROD# show interface description 

-------------------------------------------------------------------------------
Interface                Description                                            
-------------------------------------------------------------------------------
mgmt0             ***Management_Link***

-------------------------------------------------------------------------------
Port          Type   Speed   Description
-------------------------------------------------------------------------------
Eth2/1        eth    10G     --
Eth2/2        eth    10G     --
Eth2/3        eth    10G     --
Eth2/4        eth    10G     --
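
Because every VDC gets its own management address in one shared subnet, it is worth checking the addressing plan for overlaps before configuring it. The Python sketch below does that with the standard ipaddress module, using the addresses from this lab. The function name and the "management-pc" label are illustrative.

```python
import ipaddress
from typing import Dict


def same_subnet_no_duplicates(plan: Dict[str, str]) -> bool:
    """True if all addresses share one subnet and no host address repeats."""
    interfaces = [ipaddress.ip_interface(addr) for addr in plan.values()]
    networks = {iface.network for iface in interfaces}
    hosts = {iface.ip for iface in interfaces}
    return len(networks) == 1 and len(hosts) == len(interfaces)


# Management addressing used in this lab.
mgmt_plan = {
    "N7K-ADMIN": "10.10.10.1/24",
    "N7K-DEV": "10.10.10.2/24",
    "N7K-PROD": "10.10.10.3/24",
    "management-pc": "10.10.10.100/24",
}
print(same_subnet_no_duplicates(mgmt_plan))
```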

Contributor:

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCE, VCP6-DCV
nantoyudi@gmail.com

Nexus 7000 SUP2E Compact Flash Failure Recovery

This article describes one procedure for recovering from a compact flash failure on a Cisco Nexus 7000 with SUP2E. Cisco has published this issue as bug ID CSCus22805 (CCO account required) in its bug documentation. Before showing the procedure and the CLI output from the recovery, we will summarize how the Cisco documentation explains the issue.

Background

According to the documentation, each N7K Supervisor 2/2E is equipped with two eUSB flash devices in a RAID1 configuration, one primary and one mirror. Together they provide non-volatile repositories for boot images, startup configuration and persistent application data. Over a period of months or years in service, one of these devices may be disconnected from the USB bus, causing the RAID software to drop the device from the configuration. The supervisor can still function normally with one of the two devices. However, when the second device also drops out of the array, the bootflash is remounted as read-only, meaning you cannot save configuration or files to the bootflash, or allow the standby to sync to the active in the event it is reloaded.

Symptoms

  • Compact flash diagnostic failure
    N7K-SUP2E# show diagnostic result module 1
    
     Current bootup diagnostic level: complete
     Module 5: Supervisor module-2  (Standby)
    
             Test results: (. = Pass, F = Fail, I = Incomplete,
             U = Untested, A = Abort, E = Error disabled)
    
              1) ASICRegisterCheck-------------> .
              2) USB---------------------------> .
              3) NVRAM-------------------------> .
              4) RealTimeClock-----------------> .
              5) PrimaryBootROM----------------> .
              6) SecondaryBootROM--------------> .
              7) CompactFlash------------------> F  <=====
              8) ExternalCompactFlash----------> U
              9) PwrMgmtBus--------------------> U
             10) SpineControlBus---------------> .
             11) SystemMgmtBus-----------------> U
             12) StatusBus---------------------> U
             13) StandbyFabricLoopback---------> .
             14) ManagementPortLoopback--------> .
             15) EOBCPortLoopback--------------> .
             16) OBFL--------------------------> .
  • Unable to perform ‘copy run start’
    N7K-SUP2E# copy running-config startup-config
     [########################################] 100%
     Configuration update aborted: request was aborted
  • eUSB becomes read-only or is non-responsive
  • ISSU failures, usually when trying to failover to the standby supervisor

Problem Analysis

To diagnose the current state of the compact flash cards, use the internal commands that Cisco provides in the documentation: show system internal raid | grep -A 1 "Current RAID status info" and show system internal file /proc/mdstat. If you have more than one supervisor, you can check the other one by adding slot x before the internal command, where x is the slot of the SUP2/2E. Note that because these commands are internal, you may need to type them out in full; Tab completion does not work for them. Below is the output of these internal commands for my case.

N7K-SUP2E# show system internal raid | grep -A 1 "Current RAID status info"
 Current RAID status info:
 RAID data from CMOS = 0xa5 0xc3

In this output, look at the value next to 0xa5, which here is 0xc3. This value tells you whether the primary compact flash, the secondary compact flash, or both have failed. The output above shows 0xc3, which means that both the primary and the secondary compact flash have failed. The reference table below lists the possible values.

RAID Status Info   Description
0xf0               No failures reported
0xe1               Primary flash failed
0xd2               Alternate (or mirror) flash failed
0xc3               Both primary and alternate failed
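
The lookup in this table is easy to script when you need to check many supervisors. The Python sketch below pulls the second byte out of the "RAID data from CMOS" line and maps it to the descriptions above. The function name is illustrative, and the regular expression assumes the line format shown in this article.

```python
import re

# Status byte -> meaning, copied from the reference table above.
RAID_STATUS = {
    0xF0: "No failures reported",
    0xE1: "Primary flash failed",
    0xD2: "Alternate (or mirror) flash failed",
    0xC3: "Both primary and alternate failed",
}

CMOS_RE = re.compile(
    r"RAID data from CMOS\s*=\s*0x[0-9a-fA-F]+\s+(0x[0-9a-fA-F]+)"
)


def decode_raid_status(cli_output: str) -> str:
    """Describe the RAID status byte found in 'show system internal raid'."""
    match = CMOS_RE.search(cli_output)
    if match is None:
        raise ValueError("no 'RAID data from CMOS' line found")
    code = int(match.group(1), 16)
    return RAID_STATUS.get(code, "Unknown status 0x%02x" % code)


sample = " Current RAID status info:\n RAID data from CMOS = 0xa5 0xc3\n"
print(decode_raid_status(sample))
```
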
N7K-SUP2E# show system internal file /proc/mdstat
Personalities : [raid1]
md6 : active raid1 sdb6[2](F) sdc6[1]
      77888 blocks [2/1] [_U]
      
md5 : active raid1 sdb5[2](F) sdc5[1]
      78400 blocks [2/1] [_U]
      
md4 : active raid1 sdb4[2](F) sdc4[1]
      39424 blocks [2/1] [_U]
      
md3 : active raid1 sdb3[2](F) sdc3[1]
      1802240 blocks [2/1] [_U]

In this scenario you can see that the primary compact flash is not up ([_U]). A healthy output shows all arrays as [UU]. Below is a sample of a healthy compact flash from my secondary SUP2E.

N7K-SUP2E# slot 2 show system internal file /proc/mdstat
Personalities : [raid1] 
md6 : active raid1 sdc6[0] sdb6[1]
      77888 blocks [2/2] [UU]
      
md5 : active raid1 sdc5[0] sdb5[1]
      78400 blocks [2/2] [UU]
      
md4 : active raid1 sdc4[0] sdb4[1]
      39424 blocks [2/2] [UU]
      
md3 : active raid1 sdc3[0] sdb3[1]
      1802240 blocks [2/2] [UU]
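
Reading the [UU] and [_U] flags by eye is fine for one box. For repeated checks, a parser along the lines of the Python sketch below can do it. It assumes the two-line layout per array seen in the outputs above (the "mdN : ..." line followed by the "blocks" line). The function names and shortened sample text are illustrative.

```python
import re
from typing import Dict

# Matches the "mdN : ..." line plus the "blocks [x/y] [flags]" line after it.
ARRAY_RE = re.compile(
    r"^(md\d+)\s*:.*\n\s*\d+ blocks \[\d+/\d+\] \[([U_]+)\]",
    re.MULTILINE,
)


def array_states(mdstat: str) -> Dict[str, str]:
    """Map each md array name to its member flags, e.g. {'md6': '_U'}."""
    return dict(ARRAY_RE.findall(mdstat))


def is_healthy(mdstat: str) -> bool:
    """True only if at least one array was found and none has a '_' member."""
    states = array_states(mdstat)
    return bool(states) and all("_" not in flags for flags in states.values())


degraded = """Personalities : [raid1]
md6 : active raid1 sdb6[2](F) sdc6[1]
      77888 blocks [2/1] [_U]

md3 : active raid1 sdb3[2](F) sdc3[1]
      1802240 blocks [2/1] [_U]
"""
healthy = """Personalities : [raid1]
md6 : active raid1 sdc6[0] sdb6[1]
      77888 blocks [2/2] [UU]
"""
print(array_states(degraded), is_healthy(degraded), is_healthy(healthy))
```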

Scenarios

To help you determine which situation you are facing, Cisco defines several scenarios, each identified by a letter. Use the commands from the “Problem Analysis” section to match your system to one of the scenario letters below.

Single supervisor:

Scenario Letter   Active Supervisor   Active Supervisor Code
A                 1 Fail              0xe1 or 0xd2
B                 2 Fail              0xc3

Dual supervisor:

Scenario Letter   Active Supervisor   Standby Supervisor   Active Supervisor Code   Standby Supervisor Code
C                 0 Fail              1 Fail               0xf0                     0xe1 or 0xd2
D                 1 Fail              0 Fail               0xe1 or 0xd2             0xf0
E                 1 Fail              1 Fail               0xe1 or 0xd2             0xe1 or 0xd2
F                 2 Fail              0 Fail               0xc3                     0xf0
G                 0 Fail              2 Fail               0xf0                     0xc3
H                 2 Fail              1 Fail               0xc3                     0xe1 or 0xd2
I                 1 Fail              2 Fail               0xe1 or 0xd2             0xc3
J                 2 Fail              2 Fail               0xc3                     0xc3

Scenario F in the table above is the one this article follows, because it is the scenario we faced when we performed this recovery for a client.
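
The two tables reduce to a lookup on the number of failed flashes per supervisor. The Python sketch below encodes them directly. The function and dictionary names are illustrative, and a result of None simply means that no scenario in the tables applies (for example, when nothing has failed).

```python
from typing import Optional

# RAID status code -> number of failed flash devices (from the tables above).
FAILED_FLASHES = {0xF0: 0, 0xE1: 1, 0xD2: 1, 0xC3: 2}

SINGLE_SUP = {1: "A", 2: "B"}
DUAL_SUP = {
    (0, 1): "C", (1, 0): "D", (1, 1): "E", (2, 0): "F",
    (0, 2): "G", (2, 1): "H", (1, 2): "I", (2, 2): "J",
}


def scenario(active_code: int, standby_code: Optional[int] = None) -> Optional[str]:
    """Return the scenario letter, or None if no scenario matches."""
    if active_code not in FAILED_FLASHES:
        raise ValueError("unknown RAID status code: %#x" % active_code)
    active = FAILED_FLASHES[active_code]
    if standby_code is None:          # single-supervisor chassis
        return SINGLE_SUP.get(active)
    if standby_code not in FAILED_FLASHES:
        raise ValueError("unknown RAID status code: %#x" % standby_code)
    return DUAL_SUP.get((active, FAILED_FLASHES[standby_code]))


# The case in this article: active reports 0xc3, standby reports 0xf0.
print(scenario(0xC3, 0xF0))
```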

Recovery Procedure

Cisco has published a procedure for every scenario listed in the document. For scenario F, a non-impacting recovery is possible. Below is a summary of the procedure for scenario F:

  • Back up the running configuration for all VDCs externally. You can use the logging facility of your SSH terminal to capture the output of the “show running-config vdc-all” command.
  • Compare the running configuration (show running-config vdc-all) with the startup configuration (show startup-config vdc-all). Check for any configuration that is missing from the running configuration.
  • Perform a supervisor switchover using “system switchover”.
  • The new standby supervisor will begin rebooting. During this time, add any missing configuration back to the new active supervisor.
  • The new standby should reach the “ha-standby” state. Use the “show module” command to verify this; alternatively, use “show redundancy status” to confirm that all states under “Other supervisor” are “HA standby”.
  • If the new standby comes up in a “powered-up” state, you will need to manually bring it back online. This can be done by issuing the following commands, where “x” is the standby module stuck in a “powered-up” state:
    (config)# out-of-service module x
    (config)# no poweroff module x
  • If you see that the standby keeps getting stuck in the powered-up state and ultimately keeps power cycling after the steps above, this is likely due to the active reloading the standby for not coming up in time. To resolve this, configure the following, using “x” for the standby slot that is stuck in powered-up:
    (config)# system standby manual-boot
    (config)# reload module x force-dnld
  • Once the standby is back online in an “ha-standby” state, run the recovery tool to ensure that the recovery is complete. The tool can be downloaded at the following link:
    recovery tool
  • After you have unzipped the recovery tool and uploaded it to the bootflash of the box, execute the following command: “load bootflash:n7000-s2-flash-recovery-tool.10.0.2.gbin”
  • Check the recovery status with the “show system internal file /proc/mdstat” command.
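
For the configuration comparison step in the list above, one option is to save both outputs from your terminal log and diff them offline. The Python sketch below uses the standard difflib module for that. The two short configuration snippets are made up for illustration, and the function name is an assumption.

```python
import difflib
from typing import List


def config_diff(startup: str, running: str) -> List[str]:
    """Unified diff between saved startup and running configuration text."""
    return list(difflib.unified_diff(
        startup.splitlines(),
        running.splitlines(),
        fromfile="startup-config",
        tofile="running-config",
        lineterm="",
    ))


# Hypothetical snippets: the description line exists only in the running config.
startup = "interface Ethernet2/1\n  no shutdown\n"
running = "interface Ethernet2/1\n  description uplink\n  no shutdown\n"
for line in config_diff(startup, running):
    print(line)
```

Lines prefixed with "+" exist only in the running configuration and lines prefixed with "-" exist only in the startup configuration, so both directions of drift are visible in one pass.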

Procedure Output

OK, let’s move on to the execution. To avoid any confusion about supervisor status, I will name the supervisors as follows: Sup1Active means supervisor 1 in the active state, and Sup2Standby means supervisor 2 in the standby state. The state of each supervisor changes during the procedure, so keep that in mind.

Switchover Supervisor

On Sup1Active, perform the supervisor switchover. Sup1 will start to reboot and become Sup1Standby.

N7K-SUP2E# system switchover 
N7K-SUP2E# 
User Access Verification
N7K-SUP2E login: 
>>>
>>>
>>>
NX7k SUP BIOS version ( 2.11 ) : Build - 01/09/2013 18:16:20
PM FPGA Version : 0x00000024 
Power sequence microcode revision - 0x00000009 : card type - 10156EEA0
Booting Spi Flash : Primary 
  CPU Signature - 0x000106e4: Version - 0x000106e0 
  CPU - 2 : Cores - 4 : HTEn - 1 : HT - 2 : Features - 0xbfebfbff 
  FSB Clk - 532 Mhz :  Freq - 2143 Mhz - 2128 Mhz 
  MicroCode Version : 0x00000002 
  Memory - 32768 MB : Frequency - 1067 MHZ 
  Loading Bootloader: Done 
  IO FPGA Version   : 0x1000d 
  PLX Version       : 861910b5
Bios digital signature verification - Passed
USB bootflash status : [1-1:0-0]
...

Below is the output from Sup2Active, previously Sup2Standby.

N7K-SUP2E(standby)# 2017 Apr 22 01:58:02  %$ VDC-1 %$ Apr 22 01:58:02 %KERN-2-SYSTEM_MSG: [18173381.026292] Switchover started by redundancy driver - kernel
2017 Apr 22 01:58:02  %$ VDC-1 %$ %SYSMGR-2-HASWITCHOVER_PRE_START: This supervisor is becoming active (pre-start phase).
2017 Apr 22 01:58:02  %$ VDC-1 %$ %SYSMGR-2-HASWITCHOVER_START: Supervisor 2 is becoming active.
2017 Apr 22 01:58:02  %$ VDC-1 %$ %SYSMGR-2-SWITCHOVER_OVER: Switchover completed.
N7K-SUP2E# show module
Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    0      Supervisor module-2                                    powered-up
2    0      Supervisor module-2                 N7K-SUP2E          active *
3    48     1000 Mbps Optical Ethernet XL Modul N7K-M148GS-11L     ok
4    24     10 Gbps Ethernet Module             N7K-M224XP-23L     ok
...

In my case, Sup1Standby was not able to come back online. When you see the RAID and bootflash mount errors below during the bootup process, it is a sign that the Sup has failed to boot, and it will end up in switch boot mode.

...
RAID assembly failed. Stopping all RAID partitions...
Trying to mount bootflash /dev/sdd3...
mount: block device /dev/sdd3 is write-protected, mounting read-only
mount: wrong fs type, bad option, bad superblock on /dev/sdd3,
       or too many mounted file systems
/dev/sdd3 mount failed, trying /dev/sdc3...
/dev/sdc3: Input/output error
mount: block device /dev/sdc3 is write-protected, mounting read-only
/dev/sdc3: Input/output error
mount: /dev/sdc3 is not a valid block device
Cannot find any valid bootflash partitions.
....
switch(boot)#

Even in switch boot mode you are not able to load the kickstart image, because the Sup is not aware of any flash storage that contains the kickstart image and the operating system image.

switch(boot)# dir 

Usage for bootflash: filesystem 
   98643968 bytes used
  320786432 bytes free
  419430400 bytes total

Hence, we need to move on to the next step of the procedure to bring Sup1Standby online. On Sup2Active, run the commands below.

N7K-SUP2E(config)# out-of-service module 1
N7K-SUP2E(config)# 2017 Apr 22 02:00:46  %$ VDC-1 %$ %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number )
2017 Apr 22 02:00:46 N7K-SUP2E-VDC-4 %$ VDC-4 %$ %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number )
2017 Apr 22 02:00:46 N7K-SUP2E-VDC-2 %$ VDC-2 %$ %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number )
2017 Apr 22 02:00:46 N7K-SUP2E-VDC-3 %$ VDC-3 %$ %PLATFORM-2-MOD_PWRDN: Module 1 powered down (Serial number )
N7K-SUP2E(config)# no poweroff module 1

From the Sup1Standby console, you will see it begin to boot. When you see the bootflash being mounted and re-initialized, as in the lines below, it is a sign that the Sup is in a good state.

...
Trying to mount bootflash /dev/sdd3...
Mounted primary /dev/sdd3 as /bootflash
Existing bootflash found, saving files...
Saving n7000-s2-dk9-npe.6.1.1.bin
Saving n7000-s2-dk9.6.1.2.bin
Saving n7000-s2-kickstart-npe.6.1.1.bin
Saving n7000-s2-kickstart.6.1.2.bin
Initializing the system...
Unmounting file systems...
Making partitions on physical devices...
Initializing RAID services...
Initializing startup-config and licenses...
mke2fs 1.35 (28-Feb-2004)
Checking for bad blocks (read-only test): done                        
mke2fs 1.35 (28-Feb-2004)
Checking for bad blocks (read-only test): done                        
Formatting PSS:
mke2fs 1.35 (28-Feb-2004)
Checking for bad blocks (read-only test): done                        
Formatting bootflash...
mke2fs 1.35 (28-Feb-2004)
Checking for bad blocks (read-only test): done                        
Fri Jan 3 19:04:29 2017: RAIDMON: Data(0x0) provided saved successfully to CMOS
Initialization completed - No reinit of CMOS/NVRAM
Copying saved files back to bootflash...
Checking obfl filesystem.
Checking all filesystems..... done.
Warning: switch is starting up with default configuration
Loading system software
/bootflash//n7000-s2-dk9.6.1.2.bin read done
System image digital signature verification successful.
Uncompressing system image: bootflash:/n7000-s2-dk9.6.1.2.bin Fri Jan 3 19:06:12 UTC 2017
blogger: nothing to do.

..done Fri Jan 3 19:06:15 UTC 2017
Load plugins that defined in image conf: /isan/plugin_img/img.conf
Loading plugin 0: core_plugin...
num srgs 1
0: swid-core-sup2dc3, swid-core-sup2dc3
num srgs 1
0: swid-sup2dc3-ks, swid-sup2dc3-ks
INIT: Entering runlevel: 3



User Access Verification
N7K-SUP2E(standby) login:

Now we need to wait until Sup1Standby reaches the “ha-standby” state. In this situation we prefer the “show redundancy status” command over the “show module” command on Sup2Active, because it shows the progress of Sup1Standby until it reaches the “ha-standby” state.

N7K-SUP2E# show redundancy status 
Redundancy mode
---------------
      administrative:   HA
         operational:   None

This supervisor (sup-2)
-----------------------
    Redundancy state:   Active
    Supervisor state:   Active
      Internal state:   Active with HA standby

Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby

    Supervisor state:   Unknown
      Internal state:   Other
...
N7K-SUP2E# show redundancy status 
Redundancy mode
---------------
      administrative:   HA
         operational:   None

This supervisor (sup-2)
-----------------------
    Redundancy state:   Active
    Supervisor state:   Active
      Internal state:   Active with HA standby

Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby

    Supervisor state:   HA standby
      Internal state:   HA synchronization in progress
...
N7K-SUP2E# show redundancy status 
Redundancy mode
---------------
      administrative:   HA
         operational:   HA

This supervisor (sup-2)
-----------------------
    Redundancy state:   Active
    Supervisor state:   Active
      Internal state:   Active with HA standby

Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby

    Supervisor state:   HA standby
      Internal state:   HA standby
...
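
If you are capturing this output in a script rather than watching it by hand, the "Other supervisor" block can be parsed as in the Python sketch below. It assumes the field labels shown in the outputs above. The function names and the trimmed sample strings are illustrative.

```python
import re
from typing import Dict

FIELD_RE = re.compile(
    r"^[ \t]*(Redundancy state|Supervisor state|Internal state):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def other_supervisor_state(output: str) -> Dict[str, str]:
    """Return the state fields listed under the 'Other supervisor' heading."""
    _, found, tail = output.partition("Other supervisor")
    if not found:
        raise ValueError("no 'Other supervisor' section in output")
    return dict(FIELD_RE.findall(tail))


def standby_ready(output: str) -> bool:
    """True once both supervisor and internal state read 'HA standby'."""
    state = other_supervisor_state(output)
    return (state.get("Supervisor state") == "HA standby"
            and state.get("Internal state") == "HA standby")


in_progress = """Other supervisor (sup-1)
------------------------
    Redundancy state:   Standby

    Supervisor state:   HA standby
      Internal state:   HA synchronization in progress
"""
ready = in_progress.replace("HA synchronization in progress", "HA standby")
print(standby_ready(in_progress), standby_ready(ready))
```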

Sup1Standby is the problematic Sup with the flash failure. Once the login prompt appears, log in to Sup1Standby and execute “show system internal file /proc/mdstat” to see the recovery progress on this Sup. (We do not need to load the recovery tool on Sup1Standby; the reload procedure automatically recovers its flash.)

N7K-SUP2E(standby)#  show system internal file /proc/mdstat
Personalities : [raid1] 
md6 : active raid1 sdd6[2] sdc6[1]
      77888 blocks [2/1] [_U]
        resync=DELAYED
      
md5 : active raid1 sdd5[2] sdc5[1]
      78400 blocks [2/1] [_U]
        resync=DELAYED
      
md4 : active raid1 sdd4[2] sdc4[1]
      39424 blocks [2/1] [_U]
        resync=DELAYED
      
md3 : active raid1 sdd3[2] sdc3[1]
      1802240 blocks [2/1] [_U]
      [=========>...........]  recovery = 45.4% (819648/1802240) finish=1.2min speed=13142K/sec

Repeat the command above until you see a result like the one below. When you do, Sup1Standby is ready.

N7K-SUP2E(standby)#  show system internal file /proc/mdstat
Personalities : [raid1] 
md6 : active raid1 sdd6[0] sdc6[1]
      77888 blocks [2/2] [UU]
      
md5 : active raid1 sdd5[0] sdc5[1]
      78400 blocks [2/2] [UU]
      
md4 : active raid1 sdd4[0] sdc4[1]
      39424 blocks [2/2] [UU]
      
md3 : active raid1 sdd3[0] sdc3[1]
      1802240 blocks [2/2] [UU]
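
While the arrays are rebuilding, the "recovery = N%" figure is the useful number to track. The Python sketch below extracts it per array from captured mdstat text. It assumes the progress line follows the array it belongs to, as in the output above. The function name and the shortened sample are illustrative.

```python
import re
from typing import Dict, Optional


def recovery_progress(mdstat: str) -> Dict[str, float]:
    """Map each rebuilding md array to its reported recovery percentage."""
    progress: Dict[str, float] = {}
    current: Optional[str] = None   # array the following lines belong to
    for line in mdstat.splitlines():
        header = re.match(r"(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        percent = re.search(r"recovery\s*=\s*([\d.]+)%", line)
        if percent and current is not None:
            progress[current] = float(percent.group(1))
    return progress


rebuilding = """md4 : active raid1 sdd4[2] sdc4[1]
      39424 blocks [2/1] [_U]
        resync=DELAYED

md3 : active raid1 sdd3[2] sdc3[1]
      1802240 blocks [2/1] [_U]
      [=========>...........]  recovery = 45.4% (819648/1802240) finish=1.2min
"""
print(recovery_progress(rebuilding))
```

Arrays marked resync=DELAYED report no percentage yet, so they are absent from the result until their own rebuild starts.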

Execute Recovery Tool

Because we are following scenario F, it should not be necessary to execute the recovery tool on Sup2Active, since Sup1Standby is the only problematic Sup with a flash failure. In our case, however, after the supervisor switchover we found that the RAID status of Sup2Active was not [UU], even though the RAID status info showed 0xf0. You can still save the configuration to startup in this state.

N7K-SUP2E# show system internal raid 
Current RAID status info:
RAID data from CMOS = 0xa5 0xf0
RAID data from driver disks 0 bad 0 name 
Bootflash: /dev/sdc
Mirrorflash: /dev/sdd

Current RAID status:
Personalities : [raid1] 
md6 : active raid1 sdc6[0]
      77888 blocks [2/1] [U_]
      
md5 : active raid1 sdc5[0]
      78400 blocks [2/1] [U_]
      
md4 : active raid1 sdc4[0]
      39424 blocks [2/1] [U_]
      
md3 : active raid1 sdc3[0]
      1802240 blocks [2/1] [U_]

Hence we need to execute the recovery tool. When you execute the tool, it automatically copies itself to the standby Sup if you have redundant Sups. Note in the output that, because Sup1Standby has already recovered, the tool does not attempt any recovery action on it. Execute the command below to run the tool.

N7K-SUP2E# load bootflash:n7000-s2-flash-recovery-tool.10.0.2.gbin
Loading plugin version 10.0(2)
###############################################################
  Warning: debug-plugin is for engineering internal use only!
  For security reason, plugin image has been deleted.
###############################################################
INFO: Running on active slot 2, checking if a ha-standby is available...
INFO: Standby present in slot 1. Copying the recovery tool...
###############################################################
  Warning: debug-plugin is for engineering internal use only!
  For security reason, plugin image has been deleted.
###############################################################
INFO: Running on the standby in slot 1, Checking RAID status...
INFO: Both disks are found to be healthy.
INFO: Verifying RAID configuration. Got primary=sdb Secondary=sdd
INFO: RAID device md3 is healthy.
INFO: RAID device md4 is healthy.
INFO: RAID device md5 is healthy.
INFO: RAID device md6 is healthy.
INFO: No recovery was attempted on module 1. All flashes left intact.
INFO: A detailed copy of the this log was saved as volatile:flash_repair_log_mod1.tgz.
INFO: Recovery procedures complete on module 1.
INFO: Please check for any errors in previous messages.
INFO: Run 'show system internal file /proc/mdstat' and check 'up status' [UU] for all disks.
INFO: Run 'show diagnostic result module ' on all available supervisor slots.
INFO: And restart CompactFlash test (7) instances if not in running state.
Loading plugin version 10.0(2)
INFO: Now starting the flash recovery procedures on active.
INFO: Primary=sdc(sdc) Secondary=sdd(sdd) Working=sdc
WARNING: Attempting recovery of secondary device sdd
INFO: Removing /dev/sdd from RAID configuration...
INFO: Resetting secondary flash...
INFO: Found secondary device sdd in 9 seconds.
INFO: Running health checks on the recovered device /dev/sdd...
INFO: Basic I/O tests passed. /dev/sdd looks healthy and responsive.
INFO: Verifying RAID configuration. Got primary=sdc Secondary=sdd
INFO: sdc3 is already a part of md3.
INFO: Adding sdd3 back into md3 RAID configuration...
INFO: sdc4 is already a part of md4.
INFO: Adding sdd4 back into md4 RAID configuration...
INFO: sdc5 is already a part of md5.
INFO: Adding sdd5 back into md5 RAID configuration...
INFO: sdc6 is already a part of md6.
INFO: Adding sdd6 back into md6 RAID configuration...
INFO: Resetting RAID status in CMOS...
WARNING: Flash recovery attempted on module 2.
INFO: A detailed copy of the this log was saved as volatile:flash_repair_log_mod2.tgz.
INFO: Recovery procedures complete on module 2.
INFO: Please check for any errors in previous messages.
INFO: Run 'show system internal file /proc/mdstat' and check 'up status' [UU] for all disks.
INFO: Run 'show diagnostic result module ' on all available supervisor slots.
INFO: And restart CompactFlash test (7) instances if not in running state.
N7K-SUP2E# show system internal file /proc/mdstat
Personalities : [raid1] 
md6 : active raid1 sdd6[2] sdc6[0]
      77888 blocks [2/1] [U_]
        resync=DELAYED
      
md5 : active raid1 sdd5[2] sdc5[0]
      78400 blocks [2/1] [U_]
        resync=DELAYED
      
md4 : active raid1 sdd4[2] sdc4[0]
      39424 blocks [2/1] [U_]
        resync=DELAYED
      
md3 : active raid1 sdd3[2] sdc3[0]
      1802240 blocks [2/1] [U_]
      [==>..................]  recovery = 14.7% (265984/1802240) finish=2.0min speed=12665K/sec

Wait until all blocks have recovered and every array shows [UU]. All your flashes are now working again.
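
If you capture the mdstat output as text, the [UU] check can also be scripted. The following is a minimal Python sketch, not part of the Cisco tool: the function name degraded_arrays is hypothetical, and the md5 line in the sample is a made-up healthy entry added only for contrast.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status still shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line looks like "77888 blocks [2/1] [U_]".
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# md6 is copied from the output above; md5 is a hypothetical healthy array.
sample = """\
Personalities : [raid1]
md6 : active raid1 sdd6[2] sdc6[0]
      77888 blocks [2/1] [U_]
md5 : active raid1 sdd5[1] sdc5[0]
      78400 blocks [2/2] [UU]
"""
print(degraded_arrays(sample))
```

An empty list means every array in the captured text reports all members up.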

N7K-SUP2E# show diagnostic result module 2

Current bootup diagnostic level: complete
Module 2: Supervisor module-2  (Active)

        Test results: (. = Pass, F = Fail, I = Incomplete,
        U = Untested, A = Abort, E = Error disabled)

         1) ASICRegisterCheck-------------> .
         2) USB---------------------------> .
         3) NVRAM-------------------------> .
         4) RealTimeClock-----------------> .
         5) PrimaryBootROM----------------> .
         6) SecondaryBootROM--------------> .
         7) CompactFlash------------------> .
         8) ExternalCompactFlash----------> U
         9) PwrMgmtBus--------------------> .
        10) SpineControlBus---------------> .
        11) SystemMgmtBus-----------------> .
        12) StatusBus---------------------> .
        13) StandbyFabricLoopback---------> .
        14) ManagementPortLoopback--------> .
        15) EOBCPortLoopback--------------> .
        16) OBFL--------------------------> .

Don’t forget to save the configuration of all VDCs to the startup config.

N7K-SUP2E# copy running-config startup-config vdc-all
[########################################] 100%
Copy complete.

Source:

Nexus 7000 Supervisor 2/2E Compact Flash Failure Recovery

Contributor:
Muhammad Benny
Network Engineer 

Dirga Bramantyo
Network Engineer - CCNP

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCE, VCP6-DCV
nantoyudi@gmail.com

Cisco DMVPN Dual Hub Single Topology

Following up on our previous article on DMVPN, we are going to implement another DMVPN deployment model: dual hub, single topology. We have prepared the topology below as guidance.

DMVPN_Dual_Hub_Single_Topology

The dual hub, single topology layout is fairly easy to set up. The idea is to have a single DMVPN “cloud”, with all hubs and all spokes connected to this single subnet (“cloud”). In the topology above, each spoke has two static tunnels, one to each hub (R1-HUB and R5-HUB). Since the spoke routers are routing neighbors with both hub routers over the same mGRE tunnel interface, you cannot use link or interface differences (like metric, cost, delay or bandwidth) to modify the dynamic routing protocol metric to prefer one hub over the other when they are both up. If this preference is needed, you can use routing protocol features to engineer the traffic flow.

Router Configuration

Since there are no major differences in the configuration, we will only show the additional configuration needed for this setup. For the full configuration, refer to my previous post on DMVPN.

R1-HUB

router eigrp 100
 network 10.155.145.1 0.0.0.0
 network 192.168.123.1 0.0.0.0

R5-HUB

crypto isakmp policy 10
 encr aes
 authentication pre-share
 group 14
crypto isakmp key cisco123 address 10.155.0.0     
!
!
crypto ipsec transform-set MYTRANSFORMSET esp-aes esp-sha-hmac 
 mode tunnel
!
crypto ipsec profile MYPROFILE
 set security-association lifetime seconds 900
 set transform-set MYTRANSFORMSET 
 
interface Tunnel0
 ip address 192.168.123.5 255.255.255.0
 no ip redirects
 ip mtu 1440
 no ip next-hop-self eigrp 100
 no ip split-horizon eigrp 100
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp map 192.168.123.1 10.155.16.1
 ip nhrp map multicast 10.155.16.1
 ip nhrp network-id 1
 ip nhrp nhs 192.168.123.1
 tunnel source GigabitEthernet0/1
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE
 
router eigrp 100
 network 10.155.145.5 0.0.0.0
 network 192.168.123.5 0.0.0.0

I made two changes on R1-HUB: I removed the loopback IP from the EIGRP process and advertised the datacenter segment instead. The configuration for R5-HUB is basically the same as the R1-HUB configuration, with the appropriate IP address changes. The one main difference is that R5-HUB is also a spoke (or client) of R1-HUB, making R1-HUB the primary hub and R5-HUB the secondary hub.

R2-Spoke

interface Tunnel0
 ip address 192.168.123.2 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp map 192.168.123.1 10.155.16.1
 ip nhrp map multicast 10.155.16.1
 ip nhrp map 192.168.123.5 10.155.56.5
 ip nhrp map multicast 10.155.56.5
 ip nhrp network-id 1
 ip nhrp nhs 192.168.123.1
 ip nhrp nhs 192.168.123.5
 tunnel source GigabitEthernet0/2
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE

R3-Spoke

interface Tunnel0
 ip address 192.168.123.3 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp map 192.168.123.1 10.155.16.1
 ip nhrp map multicast 10.155.16.1
 ip nhrp map 192.168.123.5 10.155.56.5
 ip nhrp map multicast 10.155.56.5
 ip nhrp network-id 1
 ip nhrp nhs 192.168.123.1
 ip nhrp nhs 192.168.123.5
 tunnel source GigabitEthernet0/1
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE

Remember that by defining the static NHRP mapping and NHS on a spoke router for a hub, you are going to run the dynamic routing protocol over this tunnel. This defines the hub and spoke routing or neighbor network.
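
The per-hub lines on a spoke follow a fixed pattern: one NHRP map, one multicast map and one NHS entry per hub. As an illustration only, here is a small Python sketch that renders those lines from a list of hubs. The helper name spoke_nhrp_lines is hypothetical; the addresses are the ones used in the configuration above.

```python
def spoke_nhrp_lines(hubs):
    """Build the per-hub NHRP lines for a spoke tunnel interface.

    hubs: list of (tunnel_ip, nbma_ip) pairs, one per hub.
    """
    lines = []
    for tunnel_ip, nbma_ip in hubs:
        # Static mapping of the hub tunnel address to its NBMA (underlay) address.
        lines.append(f" ip nhrp map {tunnel_ip} {nbma_ip}")
        # Send multicast (routing protocol hellos) to this hub as well.
        lines.append(f" ip nhrp map multicast {nbma_ip}")
    for tunnel_ip, _ in hubs:
        # Register with every hub as a next hop server.
        lines.append(f" ip nhrp nhs {tunnel_ip}")
    return lines


hubs = [("192.168.123.1", "10.155.16.1"), ("192.168.123.5", "10.155.56.5")]
print("\n".join(spoke_nhrp_lines(hubs)))
```

Adding a third hub would then only mean adding one more pair to the list, which is why this layout scales by configuration on the spokes only.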

Verifications

From the R1-HUB perspective, its peer R5-HUB was discovered through a dynamic tunnel. R1-HUB treats R5-HUB as just another spoke router, because R5-HUB has to define R1-HUB statically as its “next hop server”. Each spoke has two next hop servers, both reached over the same tunnel interface, Tunnel0.

R1-HUB#show dmvpn 
----------------output omitted for brevity-----------------

Interface: Tunnel0, IPv4 NHRP Details 
Type:Hub, NHRP Peers:3, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.26.2       192.168.123.2    UP 18:41:39     D
     1 10.155.36.3       192.168.123.3    UP 18:41:39     D
     1 10.155.56.5       192.168.123.5    UP 17:59:49     D
R5-HUB#sh dmvpn 
----------------output omitted for brevity-----------------

Interface: Tunnel0, IPv4 NHRP Details 
Type:Hub/Spoke, NHRP Peers:3, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP 18:01:04     S
     1 10.155.26.2       192.168.123.2    UP 18:01:00     D
     1 10.155.36.3       192.168.123.3    UP 18:01:00     D
R2-SPOKE#sh ip nhrp 
192.168.123.1/32 via 192.168.123.1
   Tunnel0 created 5d22h, never expire 
   Type: static, Flags: used 
   NBMA address: 10.155.16.1 
192.168.123.5/32 via 192.168.123.5
   Tunnel0 created 2d08h, never expire 
   Type: static, Flags: used 
   NBMA address: 10.155.56.5
R2-SPOKE#sh ip nhrp nhs 
Legend: E=Expecting replies, R=Responding, W=Waiting
Tunnel0:
192.168.123.1  RE priority = 0 cluster = 0
192.168.123.5  RE priority = 0 cluster = 0

In this dual hub, single topology layout, dynamic spoke-to-spoke tunnels still work. As we demonstrated in the earlier article, a dynamic spoke-to-spoke tunnel is created after we trigger traffic from one spoke to another.

R2-SPOKE#sh dmvpn 
----------------output omitted for brevity-----------------

Interface: Tunnel0, IPv4 NHRP Details 
Type:Spoke, NHRP Peers:2, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP    2d06h     S
     1 10.155.56.5       192.168.123.5    UP    1d15h     S
R2-SPOKE#traceroute 10.150.3.3
Type escape sequence to abort.
Tracing the route to 10.150.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.1 2 msec
    192.168.123.3 3 msec *
R2-SPOKE#traceroute 10.150.3.3
Type escape sequence to abort.
Tracing the route to 10.150.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.3 4 msec *  1 msec
R2-SPOKE#show dmvpn 
----------------output omitted for brevity-----------------

Interface: Tunnel0, IPv4 NHRP Details 
Type:Spoke, NHRP Peers:3, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP    2d06h     S
     1 10.155.36.3       192.168.123.3    UP 00:00:10     D
     1 10.155.56.5       192.168.123.5    UP    1d15h     S

Once all adjacencies are established, traffic from the spokes to the datacenter is load balanced across R1-HUB and R5-HUB. This is the default behaviour, since both hubs sit in the same DMVPN cloud and advertise the datacenter prefix with the same metric.

R2-SPOKE#sh ip route 10.150.4.4
Routing entry for 10.150.4.4/32
  Known via "eigrp 100", distance 90, metric 27008256, type internal
  Redistributing via eigrp 100
  Last update from 192.168.123.1 on Tunnel0, 00:00:24 ago
  Routing Descriptor Blocks:
  * 192.168.123.5, from 192.168.123.5, 00:00:24 ago, via Tunnel0
      Route metric is 27008256, traffic share count is 1
      Total delay is 55010 microseconds, minimum bandwidth is 100 Kbit
      Reliability 255/255, minimum MTU 1440 bytes
      Loading 43/255, Hops 2
    192.168.123.1, from 192.168.123.1, 00:00:24 ago, via Tunnel0
      Route metric is 27008256, traffic share count is 1
      Total delay is 55010 microseconds, minimum bandwidth is 100 Kbit
      Reliability 255/255, minimum MTU 1440 bytes
      Loading 1/255, Hops 2
R2-SPOKE#sh ip eigrp topology 10.150.4.4 255.255.255.255
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.150.2.2) for 10.150.4.4/32
  State is Passive, Query origin flag is 1, 2 Successor(s), FD is 27008256
  Descriptor Blocks:
  192.168.123.1 (Tunnel0), from 192.168.123.1, Send flag is 0x0
      Composite metric is (27008256/130816), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.4.4
  192.168.123.5 (Tunnel0), from 192.168.123.5, Send flag is 0x0
      Composite metric is (27008256/130816), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 43/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.4.4
R2-SPOKE#traceroute          
Protocol [ip]: 
Target IP address: 10.150.4.4
Source address: 
Numeric display [n]: 
Timeout in seconds [3]: 
Probe count [3]: 4
Minimum Time to Live [1]: 
Maximum Time to Live [30]: 
Port Number [33434]: 
Loose, Strict, Record, Timestamp, Verbose[none]: V
Loose, Strict, Record, Timestamp, Verbose[V]: 
Type escape sequence to abort.
Tracing the route to 10.150.4.4
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.1 1 msec
    192.168.123.5 11 msec
    192.168.123.1 3 msec
    192.168.123.5 6 msec
  2 10.155.145.4 1 msec *  1 msec *

The same goes for traffic from the datacenter to each spoke: it is load balanced across R1-HUB and R5-HUB. When this happens, you can end up with asymmetric routing or per-packet load balancing across the links to the two hubs, which leads to another problem: out-of-order packet delivery.

R4-DATACENTER#sh ip route eigrp 
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area 
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override

Gateway of last resort is not set

      10.0.0.0/8 is variably subnetted, 7 subnets, 2 masks
D        10.150.2.2/32 
           [90/27008256] via 10.155.145.5, 1d15h, GigabitEthernet0/2
           [90/27008256] via 10.155.145.1, 1d15h, GigabitEthernet0/2
D        10.150.3.3/32 
           [90/27008256] via 10.155.145.5, 1d15h, GigabitEthernet0/2
           [90/27008256] via 10.155.145.1, 1d15h, GigabitEthernet0/2
D     192.168.123.0/24 
           [90/26880256] via 10.155.145.5, 1d15h, GigabitEthernet0/2
           [90/26880256] via 10.155.145.1, 1d15h, GigabitEthernet0/2
R4-DATACENTER#sh ip eigrp topology 10.150.2.2 255.255.255.255
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.150.4.4) for 10.150.2.2/32
  State is Passive, Query origin flag is 1, 2 Successor(s), FD is 27008256
  Descriptor Blocks:
  10.155.145.1 (GigabitEthernet0/2), from 10.155.145.1, Send flag is 0x0
      Composite metric is (27008256/27008000), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.2.2
  10.155.145.5 (GigabitEthernet0/2), from 10.155.145.5, Send flag is 0x0
      Composite metric is (27008256/27008000), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 58/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.2.2
R4-DATACENTER#traceroute 
Protocol [ip]: 
Target IP address: 10.150.2.2
Source address: 
Numeric display [n]: 
Timeout in seconds [3]: 
Probe count [3]: 4
Minimum Time to Live [1]: 
Maximum Time to Live [30]: 
Port Number [33434]: 
Loose, Strict, Record, Timestamp, Verbose[none]: V
Loose, Strict, Record, Timestamp, Verbose[V]: 
Type escape sequence to abort.
Tracing the route to 10.150.2.2
VRF info: (vrf in name/id, vrf out name/id)
  1 10.155.145.1 1 msec
    10.155.145.5 1 msec
    10.155.145.1 7 msec
    10.155.145.5 1 msec
  2 192.168.123.2 4 msec *  8 msec *

To mitigate this behaviour, we will steer the traffic flow from the spokes to the datacenter and vice versa. In this experiment, we engineered the traffic to use R5-HUB as the primary path. We accomplish this at the routing protocol level, with EIGRP offset lists on R1-HUB that make its advertisements slightly less attractive.

ip access-list standard OFFSET-BRANCH
 permit 10.150.2.2
 permit 10.150.3.3
ip access-list standard OFFSET-DC
 permit 10.150.4.4
!
router eigrp 100
 network 10.155.145.1 0.0.0.0
 network 192.168.123.1 0.0.0.0
 offset-list OFFSET-DC out 500 Tunnel0
 offset-list OFFSET-BRANCH out 500 GigabitEthernet0/3
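
To see why an offset of 500 is enough to break the tie, it helps to redo the metric arithmetic. The Python sketch below uses the classic EIGRP composite formula with default K values and the bandwidth and delay figures from the show output. It is a simplification: the offset is modelled as a plain addition to the composite metric (on the router it shows up as a slightly higher delay), and the function name is hypothetical.

```python
def eigrp_classic_metric(min_bandwidth_kbps, total_delay_usec, offset=0):
    """Classic EIGRP composite metric with default K values (K1 = K3 = 1)."""
    bandwidth_term = 10**7 // min_bandwidth_kbps  # inverse bandwidth, scaled
    delay_term = total_delay_usec // 10           # delay in tens of microseconds
    return 256 * (bandwidth_term + delay_term) + offset


# Minimum bandwidth 100 Kbit and total delay 55010 microseconds, as reported on R2-SPOKE.
via_r5 = eigrp_classic_metric(100, 55010)
via_r1 = eigrp_classic_metric(100, 55010, offset=500)
print(via_r5, via_r1)  # 27008256 27008756
```

Because the path through R1-HUB now has the higher metric, only the path through R5-HUB qualifies as successor, while R1-HUB stays in the topology table as the backup.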

Now let’s verify route information from the spokes and the datacenter. Make sure it uses R5-HUB as the gateway.

R2-SPOKE#sh ip eigrp topology 10.150.4.4 255.255.255.255
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.150.2.2) for 10.150.4.4/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 27008256
  Descriptor Blocks:
  192.168.123.5 (Tunnel0), from 192.168.123.5, Send flag is 0x0
      Composite metric is (27008256/130816), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 43/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.4.4
  192.168.123.1 (Tunnel0), from 192.168.123.1, Send flag is 0x0
      Composite metric is (27008756/131316), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55029 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.4.4
R2-SPOKE#sh ip route 10.150.4.4
Routing entry for 10.150.4.4/32
  Known via "eigrp 100", distance 90, metric 27008256, type internal
  Redistributing via eigrp 100
  Last update from 192.168.123.5 on Tunnel0, 00:02:10 ago
  Routing Descriptor Blocks:
  * 192.168.123.5, from 192.168.123.5, 00:02:10 ago, via Tunnel0
      Route metric is 27008256, traffic share count is 1
      Total delay is 55010 microseconds, minimum bandwidth is 100 Kbit
      Reliability 255/255, minimum MTU 1440 bytes
      Loading 43/255, Hops 2
R2-SPOKE#traceroute 
Protocol [ip]: 
Target IP address: 10.150.4.4
Source address: 
Numeric display [n]: 
Timeout in seconds [3]: 
Probe count [3]: 4
Minimum Time to Live [1]: 
Maximum Time to Live [30]: 
Port Number [33434]: 
Loose, Strict, Record, Timestamp, Verbose[none]: V
Loose, Strict, Record, Timestamp, Verbose[V]: 
Type escape sequence to abort.
Tracing the route to 10.150.4.4
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.5 4 msec 5 msec 0 msec 3 msec
  2 10.155.145.4 2 msec *  1 msec * 

Also verify route information from the datacenter to the spokes. Make sure it uses R5-HUB as the gateway.

R4-DATACENTER#sh ip eigrp topology 10.150.2.2 255.255.255.255
EIGRP-IPv4 Topology Entry for AS(100)/ID(10.150.4.4) for 10.150.2.2/32
  State is Passive, Query origin flag is 1, 1 Successor(s), FD is 27008256
  Descriptor Blocks:
  10.155.145.5 (GigabitEthernet0/2), from 10.155.145.5, Send flag is 0x0
      Composite metric is (27008256/27008000), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55010 microseconds
        Reliability is 255/255
        Load is 58/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.2.2
  10.155.145.1 (GigabitEthernet0/2), from 10.155.145.1, Send flag is 0x0
      Composite metric is (27008756/27008500), route is Internal
      Vector metric:
        Minimum bandwidth is 100 Kbit
        Total delay is 55029 microseconds
        Reliability is 255/255
        Load is 1/255
        Minimum MTU is 1440
        Hop count is 2
        Originating router is 10.150.2.2
R4-DATACENTER#sh ip route 10.150.2.2
Routing entry for 10.150.2.2/32
  Known via "eigrp 100", distance 90, metric 27008256, type internal
  Redistributing via eigrp 100
  Last update from 10.155.145.5 on GigabitEthernet0/2, 00:00:42 ago
  Routing Descriptor Blocks:
  * 10.155.145.5, from 10.155.145.5, 00:00:42 ago, via GigabitEthernet0/2
      Route metric is 27008256, traffic share count is 1
      Total delay is 55010 microseconds, minimum bandwidth is 100 Kbit
      Reliability 255/255, minimum MTU 1440 bytes
      Loading 58/255, Hops 2
R4-DATACENTER#traceroute 
Protocol [ip]: 
Target IP address: 10.150.2.2            
Source address: 
Numeric display [n]: 
Timeout in seconds [3]: 
Probe count [3]: 4
Minimum Time to Live [1]: 
Maximum Time to Live [30]: 
Port Number [33434]: 
Loose, Strict, Record, Timestamp, Verbose[none]: V
Loose, Strict, Record, Timestamp, Verbose[V]: 
Type escape sequence to abort.
Tracing the route to 10.150.2.2
VRF info: (vrf in name/id, vrf out name/id)
  1 10.155.145.5 0 msec 1 msec 1 msec 1 msec
  2 192.168.123.2 1 msec *  2 msec *

High Availability Test

To build a resilient network, high availability is a must in production. In the first test, we shut down the Tunnel0 interface on R5-HUB to force the traffic from the spokes through R1-HUB. After we shut down the interface, we saw several timeouts.

R2-SPOKE#ping 10.150.4.4 repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 10.150.4.4, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!......!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!*Dec 31 06:09:32.676: %DUAL-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor 192.168.123.5 (Tunnel0) is down: holding time expired!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 99 percent (994/1000), round-trip min/avg/max = 1/5/16 ms
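
The summary line gives a rough feel for how long the failover took. The Python sketch below extracts the counters and multiplies the lost probes by the ping timeout. This is only an estimate, assuming the losses were consecutive and each lost probe waited the full 2 second timeout; the function name is hypothetical.

```python
import re


def ping_outage_estimate(summary_line, timeout_seconds=2):
    """Return (lost probes, rough outage in seconds) from an IOS ping summary line."""
    match = re.search(r"\((\d+)/(\d+)\)", summary_line)
    if not match:
        raise ValueError("no (received/sent) counter found")
    received, sent = int(match.group(1)), int(match.group(2))
    lost = sent - received
    return lost, lost * timeout_seconds


line = "Success rate is 99 percent (994/1000), round-trip min/avg/max = 1/5/16 ms"
print(ping_outage_estimate(line))  # (6, 12)
```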

In the second test, we brought interface Tunnel0 back up. At the same time, we ran a ping to measure how long Tunnel0 takes to become operational.

R2-SPOKE#ping 192.168.123.5 repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 192.168.123.5, timeout is 2 seconds:
....!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 99 percent (996/1000), round-trip min/avg/max = 1/4/66 ms
R2-SPOKE#ping 10.150.4.4 repeat 6000
Type escape sequence to abort.
Sending 6000, 100-byte ICMP Echos to 10.150.4.4, timeout is 2 seconds:
---------------------output omitted for brevity-----------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*Dec 31 06:49:08.164: %DUAL-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor 192.168.123.5 (Tunnel0) is up: new adjacency!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
---------------------output omitted for brevity-----------------------
Success rate is 100 percent (6000/6000), round-trip min/avg/max = 1/4/66 ms

As you can see from the output above, no packets were lost from the spoke to the datacenter, even though the ping to Tunnel0 on R5-HUB timed out several times before the tunnel came back online.

Contributor:

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCSA, VCP6-DCV
nantoyudi@gmail.com

Cisco ISR 4331 Throughput Capacity

This article describes an issue we faced in the past regarding Cisco router throughput capacity. The issue is quite interesting, since I didn’t know that some Cisco routers are delivered with a throughput license feature.

At the beginning, we received a report from our client that they were experiencing slow data transfers when the link reached 90 – 95Mbps. You can see the throughput graph in the picture below.

Sentraya_before

As a basic troubleshooting step, we checked the router CPU processes and saw that they were all normal. There weren’t any packet drops on the interface either. One key finding was that the slowness only happened for traffic going through the congested link (I say congested because our customer has a 1Gbps link, but it never reaches 100Mbps). Another question that came to mind was how slow it actually was, and what evidence there was to call it slow. Our customer sent a ping comparison: when the traffic was about 40-60 Mbps, a ping through the router had an average delay of around 2-3ms. When the congestion occurred, a ping through the device had an average delay of around 40-44ms. According to the graph above the traffic never even reached 90Mbps, but when we verified it from the CLI it did.

After several tests on the network, we started to dig for more information in the Cisco documentation. According to Cisco, the aggregate throughput handled by the ISR 4331 is 100Mbps to 300Mbps. By default the router runs with 100Mbps of throughput, and you can increase it to a maximum of 300Mbps using a throughput license. The picture below summarises the throughput information for each ISR 4000 series model.

ISR4331_throughput

At this point we cannot increase the router throughput capacity unless we buy the throughput license. Fortunately, Cisco ships the router with a trial license, so we can apply a temporary remediation that lets the current traffic use more bandwidth.

Before we start to activate the temporary license, let’s do some verification on the license status.

Current Throughput Level

ISR4331#show platform hardware throughput level 
The current throughput level is 100000 kb/s
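
To put this number next to the traffic our customer reported, here is a small Python sketch that parses the line above and computes how much of the licensed level is in use. The function name and the 95 Mbps sample value are illustrative assumptions; the real limit applies to aggregate throughput, so treat the result as a rough indicator only.

```python
import re


def throughput_headroom(show_output, observed_mbps):
    """Return (licensed level in Mbps, percent of that level in use)."""
    match = re.search(r"current throughput level is (\d+) kb/s", show_output)
    if not match:
        raise ValueError("throughput level not found")
    licensed_mbps = int(match.group(1)) / 1000
    return licensed_mbps, round(100 * observed_mbps / licensed_mbps, 1)


output = "The current throughput level is 100000 kb/s"
print(throughput_headroom(output, 95))  # (100.0, 95.0)
```

With the 300000 kb/s level, the same 95 Mbps of traffic would sit at roughly a third of the licensed capacity.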

Current License Status

ISR4331#sh license feature  
Feature name             Enforcement  Evaluation  Subscription   Enabled  RightToUse 
!
!output omitted for brevity
!
throughput               yes          yes         no             no       yes        
internal_service         yes          no          no             no       no
ISR4331#show license 
!
!output omitted for brevity
!
Index 7 Feature: throughput                     
        Period left: Not Activated
        Period Used: 0  minute  0  second  
        License Type: EvalRightToUse
        License State: Active, Not in Use, EULA not accepted
        License Count: Non-Counted
        License Priority: None

Now let’s enable the temporary throughput license on the router. It will be available for the next 60 days. Don’t forget to save your configuration and reload the chassis for the change to take effect.

ISR4331(config)#platform hardware throughput level 300000
         Feature Name:throughput
 
PLEASE  READ THE  FOLLOWING TERMS  CAREFULLY. INSTALLING THE LICENSE OR
LICENSE  KEY  PROVIDED FOR  ANY CISCO  PRODUCT  FEATURE  OR  USING SUCH
PRODUCT  FEATURE  CONSTITUTES  YOUR  FULL ACCEPTANCE  OF  THE FOLLOWING
TERMS. YOU MUST NOT PROCEED FURTHER IF YOU ARE NOT WILLING TO  BE BOUND
BY ALL THE TERMS SET FORTH HEREIN.
 
Use of this product feature requires  an additional license from Cisco,
together with an additional  payment.  You may use this product feature
on an evaluation basis, without payment to Cisco, for 60 days. Your use
of the  product,  including  during the 60 day  evaluation  period,  is
subject to the Cisco end user license agreement
http://www.cisco.com/en/US/docs/general/warranty/English/EU1KEN_.html
If you use the product feature beyond the 60 day evaluation period, you
must submit the appropriate payment to Cisco for the license. After the
60 day  evaluation  period,  your  use of the  product  feature will be
governed  solely by the Cisco  end user license agreement (link above),
together  with any supplements  relating to such product  feature.  The
above  applies  even if the evaluation  license  is  not  automatically
terminated  and you do  not receive any notice of the expiration of the
evaluation  period.  It is your  responsibility  to  determine when the
evaluation  period is complete and you are required to make  payment to
Cisco for your use of the product feature beyond the evaluation period.
 
Your  acceptance  of  this agreement  for the software  features on one
product  shall be deemed  your  acceptance  with  respect  to all  such
software  on all Cisco  products  you purchase  which includes the same
software.  (The foregoing  notwithstanding, you must purchase a license
for each software  feature you use past the 60 days evaluation  period,
so  that  if you enable a software  feature on  1000  devices, you must
purchase 1000 licenses for use past  the 60 day evaluation period.)   
 
Activation  of the  software command line interface will be evidence of
your acceptance of this agreement.

ACCEPT? (yes/[no]): yes

Now let’s verify router status after we enable the temporary throughput license.

ISR4331#show license feature 
Feature name             Enforcement  Evaluation  Subscription   Enabled  RightToUse 
!
!output omitted for brevity
!        
throughput               yes          yes         no             yes      yes        
internal_service         yes          no          no             no       no
ISR4331#show license         
!
!output omitted for brevity
!                         
Index 7 Feature: throughput                     
        Period left: 8  weeks 4  days 
        Period Used: 0  day  0 hours 
        License Type: EvalRightToUse
        License State: Active, In Use
        License Count: Non-Counted
        License Priority: Low
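
The “Period left” value matches the 60 day evaluation period from the EULA: 8 weeks and 4 days is 8 x 7 + 4 = 60 days. If you want to track the remaining time with a script, a minimal Python sketch could look like this. The function name is hypothetical, and only the week and day formats shown above are handled.

```python
import re


def period_left_days(period_text):
    """Convert an IOS 'Period left' string such as '8  weeks 4  days' to days."""
    weeks = re.search(r"(\d+)\s+weeks?", period_text)
    days = re.search(r"(\d+)\s+days?", period_text)
    total = 0
    if weeks:
        total += 7 * int(weeks.group(1))
    if days:
        total += int(days.group(1))
    return total


print(period_left_days("Period left: 8  weeks 4  days"))  # 60
```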

Finally, let me show you the throughput graph after we enabled the temporary throughput license.

Sentraya_after

Contributor:

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCSA, VCP6-DCV
nantoyudi@gmail.com

Cisco DMVPN Single Hub

This article describes how to configure DMVPN using a single hub. We are using below topology for our lab test.

DMVPN_Single_Hub

Following our previous discussion on DMVPN, we will configure a tunnel interface on each router: each spoke router will have only one static tunnel to the hub, and the hub is configured with only one dynamic (multipoint) tunnel to communicate with its spoke routers. We will also verify the dynamic spoke-to-spoke tunnel between the spoke routers.

Connectivity Verification

Before you configure DMVPN on your network, make sure connectivity between all routers that participate in DMVPN is well established. I will do a ping test from R1-HUB to the other routers.

R1-HUB#tclsh
R1-HUB(tcl)#foreach ip {
+>(tcl)#10.155.26.2
+>(tcl)#10.155.36.3
+>(tcl)#} {ping $ip
+>(tcl)#}
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.155.26.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/16 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.155.36.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/7 ms

Using tclsh on Cisco IOS, the output above shows that R1-HUB can reach both spoke routers.
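If you collect this ping output off-box, the check can be automated. The following is a minimal Python sketch, assuming the output has been saved as text; the function name ping_success_rates is hypothetical and the sample text is copied from the R1-HUB output above.

```python
import re

# Sample text copied from the R1-HUB ping output above.
PING_OUTPUT = """\
Sending 5, 100-byte ICMP Echos to 10.155.26.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/16 ms
Sending 5, 100-byte ICMP Echos to 10.155.36.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/7 ms
"""


def ping_success_rates(text):
    """Map each pinged address to its reported success percentage."""
    rates = {}
    target = None
    for line in text.splitlines():
        sent = re.search(r"ICMP Echos to (\d+\.\d+\.\d+\.\d+)", line)
        if sent:
            target = sent.group(1)  # remember which address this ping is for
            continue
        rate = re.search(r"Success rate is (\d+) percent", line)
        if rate and target is not None:
            rates[target] = int(rate.group(1))
            target = None
    return rates


rates = ping_success_rates(PING_OUTPUT)
print(rates)
print("all reachable:", all(pct == 100 for pct in rates.values()))
```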

Configuration

This subsection has three parts of configuration: crypto, tunnel, and routing protocol.

Crypto Configuration

All routers
crypto isakmp policy 10
 encr aes
 authentication pre-share
 group 14

crypto isakmp key cisco123 address 10.155.0.0  
   
crypto ipsec transform-set MYTRANSFORMSET esp-aes esp-sha-hmac 
 mode tunnel

crypto ipsec profile MYPROFILE
 set security-association lifetime seconds 900
 set transform-set MYTRANSFORMSET
Notes: This is the basic configuration required to add IPsec protection to the tunnels. You may use your own parameter settings for the lab experiment.

Tunnel Configuration

R1-HUB
interface Tunnel0
 ip address 192.168.123.1 255.255.255.0
 no ip redirects
 ip mtu 1440
 no ip next-hop-self eigrp 100
 no ip split-horizon eigrp 100
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp network-id 1
 tunnel source GigabitEthernet0/1
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE
Notes:
  • By default, EIGRP sets itself as the IP next hop for the routes it advertises. This behavior is controlled by the ip next-hop-self eigrp command in interface configuration mode.
  • With “no ip next-hop-self eigrp 100” configured, the hub keeps the originating spoke as the next hop, so spoke-to-spoke traffic does not need to use the hub as its gateway. We will see this in the verification section.
  • Because of the split horizon rule, a spoke router will not receive the other spokes' prefixes from the hub unless you disable it with “no ip split-horizon eigrp 100”.
R2-SPOKE
interface Tunnel0
 ip address 192.168.123.2 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp map 192.168.123.1 10.155.16.1
 ip nhrp map multicast 10.155.16.1
 ip nhrp network-id 1
 ip nhrp nhs 192.168.123.1
 tunnel source GigabitEthernet0/2
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE
Notes: Because the spoke maps the hub statically, the key parts of the spoke tunnel configuration are the NHRP map and NHS statements.
R3-SPOKE
interface Tunnel0
 ip address 192.168.123.3 255.255.255.0
 no ip redirects
 ip mtu 1440
 ip nhrp authentication cisco123
 ip nhrp map multicast dynamic
 ip nhrp map 192.168.123.1 10.155.16.1
 ip nhrp map multicast 10.155.16.1
 ip nhrp network-id 1
 ip nhrp nhs 192.168.123.1
 tunnel source GigabitEthernet0/1
 tunnel mode gre multipoint
 tunnel key 12345
 tunnel protection ipsec profile MYPROFILE
Notes: Because the spoke maps the hub statically, the key parts of the spoke tunnel configuration are the NHRP map and NHS statements.
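The effect of the two EIGRP commands on the hub tunnel (split horizon and next-hop-self) can be illustrated with a small Python model. This is a toy sketch, not how EIGRP is implemented: it ignores metrics, timers and neighbor state, and the function name routes_seen_by_spoke is hypothetical. The addresses are taken from the lab configuration.

```python
HUB_TUNNEL_IP = "192.168.123.1"  # R1-HUB Tunnel0 address from the lab config


def routes_seen_by_spoke(spoke_ip, learned, split_horizon=True, next_hop_self=True):
    """Return (prefix, next_hop) pairs the hub passes on to one spoke.

    learned maps a prefix to the tunnel IP of the spoke that advertised it.
    Both keyword defaults mirror the EIGRP defaults (both features enabled).
    """
    if split_horizon:
        # Routes learned on Tunnel0 are not advertised back out of Tunnel0.
        return []
    routes = []
    for prefix, origin in sorted(learned.items()):
        if origin == spoke_ip:
            continue  # the spoke already owns this prefix
        # With next-hop-self the hub rewrites the next hop to itself.
        next_hop = HUB_TUNNEL_IP if next_hop_self else origin
        routes.append((prefix, next_hop))
    return routes


learned = {
    "10.150.2.2/32": "192.168.123.2",  # R2-SPOKE loopback
    "10.150.3.3/32": "192.168.123.3",  # R3-SPOKE loopback
}

# What R2-SPOKE receives under three hub settings.
print(routes_seen_by_spoke("192.168.123.2", learned))
print(routes_seen_by_spoke("192.168.123.2", learned, split_horizon=False))
print(routes_seen_by_spoke("192.168.123.2", learned,
                           split_horizon=False, next_hop_self=False))
```

Only the last case, with both features disabled on the hub, gives R2-SPOKE a route to 10.150.3.3/32 whose next hop is R3-SPOKE itself, which is what the lab configuration aims for.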

Routing Protocol Configuration

R1-HUB
router eigrp 100
 network 10.150.1.1 0.0.0.0
 network 192.168.123.1 0.0.0.0
Notes: Only the networks of the Tunnel0 and Loopback0 interfaces participate in EIGRP.
R2-SPOKE
router eigrp 100
 network 10.150.2.2 0.0.0.0
 network 192.168.123.2 0.0.0.0
Notes: Only the networks of the Tunnel0 and Loopback0 interfaces participate in EIGRP.
R3-SPOKE
router eigrp 100
 network 10.150.3.3 0.0.0.0
 network 192.168.123.3 0.0.0.0
Notes: Only the networks of the Tunnel0 and Loopback0 interfaces participate in EIGRP.

Tunnel Verification

R1-HUB

R1-HUB#show dmvpn 
Legend: Attrb --> S - Static, D - Dynamic, I - Incomplete
        N - NATed, L - Local, X - No Socket
        T1 - Route Installed, T2 - Nexthop-override
        C - CTS Capable
        # Ent --> Number of NHRP entries with same NBMA peer
        NHS Status: E --> Expecting Replies, R --> Responding, W --> Waiting
        UpDn Time --> Up or Down Time for a Tunnel
==========================================================================

Interface: Tunnel0, IPv4 NHRP Details 
Type:Hub, NHRP Peers:2, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.26.2       192.168.123.2    UP 00:22:26     D
     1 10.155.36.3       192.168.123.3    UP 00:22:26     D

R2-SPOKE

R2-SPOKE#show dmvpn 
!
! output omitted for brevity
!
Interface: Tunnel0, IPv4 NHRP Details 
Type:Spoke, NHRP Peers:1, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP 00:20:41     S

R3-SPOKE

R3-SPOKE#sh dmvpn 
!
! output omitted for brevity
!
Interface: Tunnel0, IPv4 NHRP Details 
Type:Spoke, NHRP Peers:1, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP 00:00:15     S

On the hub router, the output shows two tunnels, one to R2 and one to R3, which the hub learned dynamically (attribute D). On the spoke routers, the hub is mapped statically, so each spoke shows only one tunnel, to the hub, marked as static (attribute S).
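To check the peer table programmatically, the rows of the show dmvpn output can be parsed with a regular expression. The sketch below is one possible approach in Python, assuming the output was captured as text; parse_dmvpn_peers is a hypothetical helper and the sample rows are copied from the R1-HUB output above.

```python
import re

# Peer table copied from the R1-HUB "show dmvpn" output above.
SHOW_DMVPN = """\
 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.26.2       192.168.123.2    UP 00:22:26     D
     1 10.155.36.3       192.168.123.3    UP 00:22:26     D
"""

# One peer row: entry count, NBMA address, tunnel address, state, timer, attribute.
PEER_ROW = re.compile(
    r"^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+\.\d+\.\d+\.\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def parse_dmvpn_peers(text):
    """Return one dict per peer row; header and separator lines are skipped."""
    peers = []
    for line in text.splitlines():
        match = PEER_ROW.match(line)
        if match:
            _, nbma, tunnel, state, updn, attrb = match.groups()
            peers.append({"nbma": nbma, "tunnel": tunnel,
                          "state": state, "updn": updn, "attrb": attrb})
    return peers


for peer in parse_dmvpn_peers(SHOW_DMVPN):
    kind = "dynamic" if "D" in peer["attrb"] else "static"
    print(peer["tunnel"], peer["nbma"], peer["state"], kind)
```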

Route Verification

R1-HUB

R1#sh ip route eigrp 
!
! output omitted for brevity
!
      10.0.0.0/8 is variably subnetted, 7 subnets, 2 masks
D        10.150.2.2/32 [90/27008000] via 192.168.123.2, 1w4d, Tunnel0
D        10.150.3.3/32 [90/27008000] via 192.168.123.3, 1w4d, Tunnel0

R2-SPOKE

R2#show ip route eigrp 
!
! output omitted for brevity
!
      10.0.0.0/8 is variably subnetted, 5 subnets, 2 masks
D        10.150.1.1/32 [90/27008000] via 192.168.123.1, 1w4d, Tunnel0
D        10.150.3.3/32 [90/28288000] via 192.168.123.3, 1w4d, Tunnel0

R3-SPOKE

R3#sh ip route eigrp 
!
! output omitted for brevity
!
      10.0.0.0/8 is variably subnetted, 7 subnets, 2 masks
D        10.150.1.1/32 [90/27008000] via 192.168.123.1, 1w4d, Tunnel0
D        10.150.2.2/32 [90/28288000] via 192.168.123.2, 1w4d, Tunnel0

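The routing table on each router shows that every router learns the other routers' prefixes through EIGRP. Note that on the spokes, the next hop for the other spoke's prefix is that spoke's tunnel address (for example, R2 reaches 10.150.3.3/32 via 192.168.123.3), not the hub.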
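The next hops in these routing tables are what make direct spoke-to-spoke forwarding possible, so they are worth checking explicitly. A minimal Python sketch follows, assuming the show ip route eigrp output is available as text; eigrp_next_hops is a hypothetical helper and the sample lines are copied from the R2-SPOKE output above.

```python
import re

# EIGRP routes copied from the R2-SPOKE routing table above.
R2_ROUTES = """\
D        10.150.1.1/32 [90/27008000] via 192.168.123.1, 1w4d, Tunnel0
D        10.150.3.3/32 [90/28288000] via 192.168.123.3, 1w4d, Tunnel0
"""

# "D" code, prefix, [administrative distance/metric], next hop up to the comma.
ROUTE = re.compile(r"^D\s+(\S+)\s+\[(\d+)/(\d+)\]\s+via\s+(\S+?),")


def eigrp_next_hops(text):
    """Map each EIGRP-learned prefix to its next-hop address."""
    result = {}
    for line in text.splitlines():
        match = ROUTE.match(line)
        if match:
            prefix, _distance, _metric, next_hop = match.groups()
            result[prefix] = next_hop
    return result


next_hops = eigrp_next_hops(R2_ROUTES)
print(next_hops)
# The R3-SPOKE loopback should point at R3-SPOKE's tunnel address, not the hub.
print(next_hops["10.150.3.3/32"] == "192.168.123.3")
```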

Connectivity Test

From the Hub router, make sure you have full connectivity to the network behind the spoke routers.

R1-HUB#tclsh
R1(tcl)#foreach ip {
+>(tcl)#10.150.2.2
+>(tcl)#10.150.3.3
+>(tcl)#} {ping $ip   
+>(tcl)#}
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.150.2.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/6/16 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.150.3.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 3/5/9 ms

The first traceroute from one spoke router to another goes through the hub.

R2-SPOKE#traceroute 10.150.3.3 source loopback 0
Type escape sequence to abort.
Tracing the route to 10.150.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.1 2 msec 4 msec 9 msec
  2 192.168.123.3 18 msec *  1 msec

This first traceroute triggers a DMVPN session between R2-SPOKE and R3-SPOKE, creating a dynamic spoke-to-spoke tunnel.

R2-SPOKE#show dmvpn 
!
! output omitted for brevity
!
Interface: Tunnel0, IPv4 NHRP Details 
Type:Spoke, NHRP Peers:2, 

 # Ent  Peer NBMA Addr Peer Tunnel Add State  UpDn Tm Attrb
 ----- --------------- --------------- ----- -------- -----
     1 10.155.16.1       192.168.123.1    UP 00:00:48     S
     1 10.155.36.3       192.168.123.3    UP 00:00:31     D

Repeat the traceroute command and you will see that packets destined for R3-SPOKE are now sent to it directly. If “ip next-hop-self eigrp” is enabled on the hub, spoke-to-spoke traffic goes through the hub. To mitigate this you may enable “ip nhrp shortcut” on the tunnel interface of each router (typically together with “ip nhrp redirect” on the hub).

R2-SPOKE#traceroute 10.150.3.3 source loopback 0
Type escape sequence to abort.
Tracing the route to 10.150.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 192.168.123.3 2 msec *  1 msec
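The difference between the two traceroutes can be checked mechanically by looking at the intermediate hops. This is a small Python sketch under the assumption that the hop lines were captured as text; hops and via_hub are hypothetical helper names, and the hop lines are copied from the two R2-SPOKE traceroutes above.

```python
import re

HUB_TUNNEL_IP = "192.168.123.1"  # R1-HUB Tunnel0 address

# Hop lines copied from the first and second traceroute on R2-SPOKE.
FIRST_TRACE = """\
  1 192.168.123.1 2 msec 4 msec 9 msec
  2 192.168.123.3 18 msec *  1 msec
"""
SECOND_TRACE = """\
  1 192.168.123.3 2 msec *  1 msec
"""


def hops(text):
    """Return the responding address of each hop line, in order."""
    found = []
    for line in text.splitlines():
        match = re.match(r"^\s*\d+\s+(\d+\.\d+\.\d+\.\d+)\b", line)
        if match:
            found.append(match.group(1))
    return found


def via_hub(text):
    """True if the hub appears as an intermediate hop (not the final one)."""
    return HUB_TUNNEL_IP in hops(text)[:-1]


print("first trace via hub:", via_hub(FIRST_TRACE))
print("second trace via hub:", via_hub(SECOND_TRACE))
```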

Don’t forget to run the ping test to the hub site and the other spoke router.

R2-SPOKE#tclsh
R2-SPOKE(tcl)#foreach ip {
+>(tcl)#10.150.1.1
+>(tcl)#10.150.3.3
+>(tcl)#} {ping $ip
+>(tcl)#}
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.150.1.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/3/7 ms
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.150.3.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/5 ms

More on Verification

Next Hop Resolution Protocol (NHRP)

R2-SPOKE#sh ip nhrp nhs detail
Legend: E=Expecting replies, R=Responding, W=Waiting
Tunnel0:
192.168.123.1  RE priority = 0 cluster = 0  req-sent 432  req-failed 0  repl-recv 429 (00:19:31 ago)
R2-SPOKE#show ip nhrp detail 
192.168.123.1/32 via 192.168.123.1
   Tunnel0 created 1w4d, never expire 
   Type: static, Flags: used 
   NBMA address: 10.155.16.1
R2-SPOKE#show ip nhrp detail   
192.168.123.1/32 via 192.168.123.1
   Tunnel0 created 1w4d, never expire 
   Type: static, Flags: used 
   NBMA address: 10.155.16.1 
192.168.123.2/32 via 192.168.123.2
   Tunnel0 created 00:00:04, expire 01:59:55
   Type: dynamic, Flags: router unique local 
   NBMA address: 10.155.26.2 
    (no-socket) 
  Requester: 192.168.123.3 Request ID: 13
192.168.123.3/32 via 192.168.123.3
   Tunnel0 created 00:00:05, expire 01:59:55
   Type: dynamic, Flags: router nhop 
   NBMA address: 10.155.36.3
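The NHRP cache above distinguishes the static hub mapping from the dynamic entry learned for the other spoke. As an illustration, the Python sketch below pulls the type and NBMA address of each entry out of captured text. parse_nhrp is a hypothetical helper, and the sample is an excerpt of the second show ip nhrp detail output above (the local entry is left out for brevity).

```python
import re

# Excerpt of the second "show ip nhrp detail" output on R2-SPOKE.
SHOW_NHRP = """\
192.168.123.1/32 via 192.168.123.1
   Tunnel0 created 1w4d, never expire
   Type: static, Flags: used
   NBMA address: 10.155.16.1
192.168.123.3/32 via 192.168.123.3
   Tunnel0 created 00:00:05, expire 01:59:55
   Type: dynamic, Flags: router nhop
   NBMA address: 10.155.36.3
"""


def parse_nhrp(text):
    """Return one dict per NHRP entry with tunnel address, type and NBMA address."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = re.match(r"^(\d+\.\d+\.\d+\.\d+)/\d+ via ", line)
        if start:
            # A new entry begins with "<address>/<mask> via ...".
            current = {"tunnel": start.group(1)}
            entries.append(current)
        elif current is not None and line.startswith("Type:"):
            current["type"] = line.split(":", 1)[1].split(",")[0].strip()
        elif current is not None and line.startswith("NBMA address:"):
            current["nbma"] = line.split(":", 1)[1].strip()
    return entries


for entry in parse_nhrp(SHOW_NHRP):
    print(entry["tunnel"], entry["type"], entry["nbma"])
```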

Crypto ISAKMP Security Association

R2-SPOKE#show crypto isakmp sa 
IPv4 Crypto ISAKMP SA
dst             src             state          conn-id status
10.155.16.1     10.155.26.2     QM_IDLE           1029 ACTIVE
R2-SPOKE#show crypto engine connections active 
Crypto Engine Connections

   ID  Type    Algorithm           Encrypt  Decrypt LastSeqN IP-Address
  621  IPsec   AES+SHA                   0       13       13 10.155.26.2
  622  IPsec   AES+SHA                  13        0        0 10.155.26.2
 1029  IKE     SHA+AES                   0        0        0 10.155.26.2

The output above was taken before the spoke-to-spoke session was established. It contains only one IKE phase 1 SA and two IKE phase 2 (IPsec) SAs, one for incoming and one for outgoing traffic from the R2-SPOKE perspective.

 
R2-SPOKE#show crypto isakmp sa 
IPv4 Crypto ISAKMP SA
dst             src             state          conn-id status
10.155.36.3     10.155.26.2     QM_IDLE           1048 ACTIVE
10.155.16.1     10.155.26.2     QM_IDLE           1029 ACTIVE
10.155.26.2     10.155.36.3     QM_IDLE           1047 ACTIVE
R2-SPOKE#show crypto engine connections active
Crypto Engine Connections

   ID  Type    Algorithm           Encrypt  Decrypt LastSeqN IP-Address
  621  IPsec   AES+SHA                   0       41       41 10.155.26.2
  622  IPsec   AES+SHA                  42        0        0 10.155.26.2
  625  IPsec   AES+SHA                   0        0        0 10.155.26.2
  626  IPsec   AES+SHA                   0        0        0 10.155.26.2
  627  IPsec   AES+SHA                   0        0        0 10.155.26.2
  628  IPsec   AES+SHA                   0        0        0 10.155.26.2
 1029  IKE     SHA+AES                   0        0        0 10.155.26.2
 1047  IKE     SHA+AES                   0        0        0 10.155.26.2
 1048  IKE     SHA+AES                   0        0        0 10.155.26.2

Once the spoke-to-spoke session is established, show crypto isakmp sa lists two more entries, and the crypto engine shows two more IKE phase 1 SAs and four more IKE phase 2 (IPsec) SAs.
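The before and after counts can be confirmed by tallying the Type column of show crypto engine connections active. The following Python sketch is one way to do that on captured text; count_connections is a hypothetical helper, and both samples are copied from the R2-SPOKE outputs above.

```python
import re
from collections import Counter

# Rows copied from R2-SPOKE before the spoke-to-spoke session exists.
BEFORE = """\
  621  IPsec   AES+SHA                   0       13       13 10.155.26.2
  622  IPsec   AES+SHA                  13        0        0 10.155.26.2
 1029  IKE     SHA+AES                   0        0        0 10.155.26.2
"""

# Rows copied from R2-SPOKE after the spoke-to-spoke session is established.
AFTER = """\
  621  IPsec   AES+SHA                   0       41       41 10.155.26.2
  622  IPsec   AES+SHA                  42        0        0 10.155.26.2
  625  IPsec   AES+SHA                   0        0        0 10.155.26.2
  626  IPsec   AES+SHA                   0        0        0 10.155.26.2
  627  IPsec   AES+SHA                   0        0        0 10.155.26.2
  628  IPsec   AES+SHA                   0        0        0 10.155.26.2
 1029  IKE     SHA+AES                   0        0        0 10.155.26.2
 1047  IKE     SHA+AES                   0        0        0 10.155.26.2
 1048  IKE     SHA+AES                   0        0        0 10.155.26.2
"""


def count_connections(text):
    """Count rows per connection type (IKE or IPsec)."""
    counts = Counter()
    for line in text.splitlines():
        match = re.match(r"^\s*\d+\s+(IPsec|IKE)\s", line)
        if match:
            counts[match.group(1)] += 1
    return counts


before, after = count_connections(BEFORE), count_connections(AFTER)
print("IKE added:", after["IKE"] - before["IKE"])
print("IPsec added:", after["IPsec"] - before["IPsec"])
```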

Crypto IPsec Security Association

R2-SPOKE#show crypto ipsec sa | i encaps|decaps|endpt|local|transform|Status
    Crypto map tag: Tunnel0-head-0, local addr 10.155.26.2
   local  ident (addr/mask/prot/port): (10.155.26.2/255.255.255.255/47/0)
    #pkts encaps: 123, #pkts encrypt: 123, #pkts digest: 123
    #pkts decaps: 108, #pkts decrypt: 108, #pkts verify: 108
     local crypto endpt.: 10.155.26.2, remote crypto endpt.: 10.155.16.1
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)
R2-SPOKE#show crypto ipsec sa | i encaps|decaps|endpt|local|transform|Status
    Crypto map tag: Tunnel0-head-0, local addr 10.155.26.2
   local  ident (addr/mask/prot/port): (10.155.26.2/255.255.255.255/47/0)
    #pkts encaps: 1, #pkts encrypt: 1, #pkts digest: 1
    #pkts decaps: 1, #pkts decrypt: 1, #pkts verify: 1
     local crypto endpt.: 10.155.26.2, remote crypto endpt.: 10.155.36.3
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)
   local  ident (addr/mask/prot/port): (10.155.26.2/255.255.255.255/47/0)
    #pkts encaps: 123, #pkts encrypt: 123, #pkts digest: 123
    #pkts decaps: 108, #pkts decrypt: 108, #pkts verify: 108
     local crypto endpt.: 10.155.26.2, remote crypto endpt.: 10.155.16.1
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)
        transform: esp-aes esp-sha-hmac ,
        Status: ACTIVE(ACTIVE)

The second output was taken after the spoke-to-spoke session was established. It adds an entry for the source and destination spoke routers (10.155.26.2 to 10.155.36.3). It also shows that packets crossing the WAN are encrypted, as expected.
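To confirm that traffic is being encrypted towards each peer, the encaps and decaps counters can be grouped by remote crypto endpoint. The Python sketch below assumes the filtered show crypto ipsec sa output was captured as text; counters_by_peer is a hypothetical helper, and the sample lines are copied from the second output above.

```python
import re

# Counter and endpoint lines copied from the second filtered output on R2-SPOKE.
SHOW_IPSEC_SA = """\
    #pkts encaps: 1, #pkts encrypt: 1, #pkts digest: 1
    #pkts decaps: 1, #pkts decrypt: 1, #pkts verify: 1
     local crypto endpt.: 10.155.26.2, remote crypto endpt.: 10.155.36.3
    #pkts encaps: 123, #pkts encrypt: 123, #pkts digest: 123
    #pkts decaps: 108, #pkts decrypt: 108, #pkts verify: 108
     local crypto endpt.: 10.155.26.2, remote crypto endpt.: 10.155.16.1
"""


def counters_by_peer(text):
    """Map each remote crypto endpoint to its encaps/decaps packet counters."""
    result = {}
    pending = {}
    for line in text.splitlines():
        encaps = re.search(r"#pkts encaps: (\d+)", line)
        decaps = re.search(r"#pkts decaps: (\d+)", line)
        peer = re.search(r"remote crypto endpt\.: (\d+\.\d+\.\d+\.\d+)", line)
        if encaps:
            pending["encaps"] = int(encaps.group(1))
        elif decaps:
            pending["decaps"] = int(decaps.group(1))
        elif peer:
            # The endpoint line follows its counters, so it closes the record.
            result[peer.group(1)] = pending
            pending = {}
    return result


for peer, counters in counters_by_peer(SHOW_IPSEC_SA).items():
    print(peer, counters)
```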

Contributor:

Ananto Yudi Hendrawan
Network Engineer - CCIE Service Provider #38962, RHCSA, VCP6-DCV
nantoyudi@gmail.com