Test Lab – Day 4: Xsigo Redundancy Testing with ESXi
Today I tested Xsigo redundancy capabilities within the ESXi test environment.
So far I have built up an environment with four ESXi 4.1 hosts, each with a single VM, and two Xsigo VP780s.
Each VM is pretty much idle for this test; however, tomorrow I plan to introduce some heavier IP and NFS traffic and re-run the tests below.
I used a laptop and the ESXi console in Tech Support Mode to capture the results.
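For the record, the capture method was nothing fancy. From the laptop it was continuous pings logged to files, and from the ESXi console the storage path can be checked with vmkping. A minimal sketch (the IPs here are placeholders, not my actual lab addresses):

    # From the Windows laptop: continuous ping to a target, logged to a file
    ping -t 192.168.1.11 > esx1_mgmt_ping.log

    # From the ESXi console (Tech Support Mode): ping sourced from a
    # VMkernel interface, which is the path the NFS traffic actually uses
    vmkping 192.168.1.50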
Keep in mind this deployment is a SINGLE-site scenario.
This means both Xsigo devices are considered to be at the same site, and each ESXi host is connected to both the A and B Xsigo.
Note: This test procedure is simply the pattern I used to test my environment. I’m not stating this is the right way to test an environment but simply the way it was done. I don’t recommend you use this pattern to test your systems or use it for validation. These are simply my notes, for my personal records, and nothing more.
Reminder:
XNA and XNB are the Xsigo Network vNICs on Xsigo device A and B, respectively, and are meant for IP network traffic.
XSA and XSB are the Xsigo Storage (NFS) vNICs on Xsigo device A and B, respectively, and are meant for NFS data traffic.
Test 1 – LIVE I/O Card Replacement in Bay 10 for IP Networking
Summary –
Xsigo A sent a message to Xsigo support stating the I/O Module had an issue. Xsigo support contacted me and mailed out a replacement module.
The affected module carries the IP network traffic (VM, Management, vMotion).
Usually, an I/O Module going out is bad news. However, this is a POC (Proof of Concept), so I used this “blip” to our advantage and captured the test results.
Device – Xsigo A
Is the module to be affected currently active? Yes
Pre-Procedure –
Validate via the Xsigo CLI – ‘show vnics’ to confirm the vNICs are in the up state – OKAY
Ensure the ESXi hosts’ vNICs are in Active mode and not Standby – OKAY
Ensure each ESXi host’s network configuration is set up with XNA and XNB in Active mode – OKAY (see the sketch below)
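For reference, here is a minimal sketch of those pre-checks. The Xsigo command is the one named above; the ESXi commands are the standard Tech Support Mode tools in 4.1 (I confirmed the Active/Standby teaming order itself in the vSphere Client):

    # On each Xsigo VP780: confirm all vNICs report the up state
    show vnics

    # On each ESXi host: confirm the uplinks backing XNA/XNB are listed and up
    esxcfg-nics -l

    # List vSwitches and port groups to confirm both uplinks are attached
    esxcfg-vswitch -l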
Procedure –
Follow the replacement procedure supplied with the I/O replacement module
Basic steps supplied by Xsigo –
- Press the Eject button for 5 seconds to gracefully shut down the I/O card
- Wait for the LED to go solid blue
- Remove the card
- Insert the new card
- Wait for the I/O card to come online; the LED will go from blue to yellow/green
- The Xsigo VP780 will update the card as needed (firmware and attached vNICs)
- Once the card is online, you’re ready to go
Expected results –
All active IP traffic for ESXi (including VMs) will continue to pass through XNB
All active IP traffic for ESXi (including VMs) might see a brief drop, depending on which XN# is active
vCenter Server should show XNA as unavailable until the new I/O Module is online
The I/O Module should take about 5 minutes to come online
How I will quantify results –
All active IP traffic for ESXi (including VMs) will continue to pass through XNB
- Active PING to the ESXi hosts (Management Network, VMs) and other devices to ensure they stay up
All active IP traffic for ESXi (including VMs) might see a brief drop, depending on which XN# is active
- Active PING to the ESXi hosts (Management Network, VMs)
vCenter Server should show XNA as unavailable until the new I/O Module is online
- In vCenter Server, under Network Configuration, check whether XNA goes down and comes back to active
The I/O Module should take about 5 minutes to come online
- I will monitor the I/O Module to see how long it takes to come online (a simple loop makes this easy; see the sketch below)
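For the timing piece, a simple timestamped loop in Tech Support Mode is enough to bracket when the uplinks drop and come back. A sketch (the 10-second interval is arbitrary):

    # Print the time and the uplink link states every 10 seconds
    # while the I/O card is being swapped
    while true; do
      date
      esxcfg-nics -l
      sleep 10
    done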
Actual Results –
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | One Ping Loss | No Ping Loss
External Laptop | ESX Host 1 | ESX | One Ping Loss | One Ping Loss
External Laptop | ESX Host 2 | ESX | One Ping Loss | One Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | One Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XNA status showed as down on all ESXi hosts during the module removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
I/O Module Online –
The I/O Module took about 4 minutes to come online.
Test 1 Summary –
All results were as expected. There was only very minor ping loss, which for us is nothing to worry about.
Test 2 – Remove Fibre 10Gb Links on Bay 10 for IP Networking
Summary –
This test will simulate fibre connectivity going down for the IP network traffic.
I will simulate the outage by disconnecting the fibre connection from Xsigo A, measure/record the results, return the environment to normal, and then repeat for Xsigo B.
Device – Xsigo A and B
Is this device currently active? Yes
Pre-Procedure –
Validate via the Xsigo CLI – ‘show vnics’ to confirm the vNICs are in the up state
- Xsigo A and B are reporting both I/O Modules as functional
Ensure the ESXi hosts’ vNICs are in Active mode and not Standby
- vCenter Server is reporting all communication is normal
Procedure –
Remove the fibre connection from the I/O Module in Bay 10 – Xsigo A
Measure results via ping and vCenter Server
Replace the cable, ensure the system is stable, and repeat for the Xsigo B device
Expected results –
All active IP traffic for ESXi (including VMs) will continue to pass through the redundant XN# adapter
All active IP traffic for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
vCenter Server should show the affected XN# as unavailable until the fibre is reconnected
How I will quantify results –
All active IP traffic for ESXi (including VMs) will continue to pass through the redundant XN# adapter
- Active PING to the ESXi hosts (Management Network, VMs) and other devices to ensure they stay up
All active IP traffic for ESXi (including VMs) might see a brief drop, depending on which XN# is active
- Active PING to the ESXi hosts (Management Network, VMs)
vCenter Server should show the affected XN# as unavailable until the fibre is reconnected
- In vCenter Server, under Network Configuration, check whether the XN# goes down and comes back to active
Actual Results –
Xsigo A Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | No Ping Loss | One Ping Loss
External Laptop | ESX Host 1 | ESX | One Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | One Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | One Ping Loss
External Laptop | ESX Host 4 | ESX | One Ping Loss | One Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XNA status showed as down on all ESXi hosts during the fibre removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
Xsigo B Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | One Ping Loss | One Ping Loss
External Laptop | vCenter Server | VM | No Ping Loss | No Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | One Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XNB status showed as down on all ESXi hosts during the fibre removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
Test 2 Summary –
All results were as expected. There was only very minor ping loss, which for us is nothing to worry about.
Test 3 – Remove Fibre 10Gb Links to NFS
Summary –
This test will simulate fibre connectivity going down for the NFS network.
I will simulate the outage by disconnecting the fibre connection from Xsigo A, measure/record the results, return the environment to normal, and then repeat for Xsigo B.
Device – Xsigo A and B
Is this device currently active? Yes
Pre-Procedure –
Validate via the Xsigo CLI – ‘show vnics’ to confirm the vNICs are in the up state
- Xsigo A and B are reporting both I/O Modules as functional
Ensure the ESXi hosts’ vNICs are in Active mode and not Standby
- vCenter Server is reporting all communication is normal
Procedure –
Remove the fibre connection from the I/O Module in Bay 11 – Xsigo A
Measure results via ping and vCenter Server, and check for any VM GUI hesitation
Replace the cable, ensure the system is stable, and repeat for the Xsigo B device
Expected results –
All active NFS traffic for ESXi (including VMs) will continue to pass through the redundant XS# adapter
All active NFS traffic for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
vCenter Server should show the affected XS# as unavailable until the fibre is reconnected
I don’t expect ESXi to take any of the NFS datastores offline
How I will quantify results –
All active NFS traffic for ESXi (including VMs) will continue to pass through the redundant XS# adapter
- Active PING to the ESXi hosts (Management Network, VMs) and other devices to ensure they stay up
All active NFS traffic for ESXi (including VMs) might see a brief drop, depending on which XS# is active
- Active PING to the ESXi hosts (Storage, Management Network, VMs)
vCenter Server should show the affected XS# as unavailable until the fibre is reconnected
- In vCenter Server, under Network Configuration, check whether the XS# goes down and comes back to active
I don’t expect ESXi to take any of the NFS datastores offline
- In vCenter Server, under Storage, I will check whether the datastore goes offline (see the sketch below)
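For the datastore check, a couple of commands from the ESXi console back up what vCenter shows. A minimal sketch (the NFS server IP is a placeholder):

    # List the NFS datastores and whether each is currently mounted
    esxcfg-nas -l

    # Verify the storage path itself from the VMkernel interface
    vmkping 192.168.1.50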
Actual Results –
Xsigo A Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | No Ping Loss | No Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | Two Ping Loss
From vCenter Server –
XSA & XSB status showed as down on all ESXi hosts during the fibre removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
No VM GUI Hesitation reported
Xsigo B Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | No Ping Loss | No Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XSB status showed as down on all ESXi hosts during the fibre removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
No VM GUI Hesitation reported
Test 3 Summary –
All results were as expected. There was only very minor ping loss, which for us is nothing to worry about.
Test 4 – Remove InfiniBand Cables from the ESXi HBAs
Summary –
During this test, I will remove the InfiniBand cables (four of them, one per host) from the ESXi HBAs.
I will disconnect the InfiniBand connections to Xsigo A first, measure/record the results, return the environment to normal, and then repeat for Xsigo B.
Pre-Procedure –
Validate via the Xsigo CLI – ‘show vnics’ to confirm the vNICs are in the up state
- Xsigo A and B are reporting both I/O Modules as functional
Ensure the ESXi hosts’ vNICs are in Active mode and not Standby
- vCenter Server is reporting all communication is normal
Procedure –
Remove the InfiniBand cable from each ESXi host attached to Xsigo A
Measure results via ping and vCenter Server, and check for any VM GUI hesitation
Replace the cables, ensure the system is stable, and repeat for the Xsigo B device
Expected results –
ALL active traffic (IP or NFS) for ESXi (including VMs) will continue to pass through the redundant XNB or XSB accordingly.
All active traffic (IP or NFS) for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
vCenter Server should show XNA and XSA as unavailable until the cables are reconnected
I don’t expect ESXi to take any of the NFS datastores offline
How I will quantify results –
ALL active traffic (IP or NFS) for ESXi (including VMs) will continue to pass through the redundant XNB or XSB accordingly.
- Active PING to the ESXi hosts (Management Network, VMs) and other devices to ensure they stay up
All active traffic (IP or NFS) for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
- Active PING to the ESXi hosts (Storage, Management Network, VMs) (see the sketch after this list)
vCenter Server should show XNA and XSA as unavailable until the cables are reconnected
- In vCenter Server, under Network Configuration, check whether the affected XN#/XS# adapters go down and come back to active
I don’t expect ESXi to take any of the NFS datastores offline
- In vCenter Server, under Storage, I will check whether the datastore goes offline
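To catch even a one-packet blip on the storage path while the cables are out, a continuous VMkernel ping can be left running on one of the hosts. A sketch, assuming vmkping’s -c count option (the NFS server IP is a placeholder):

    # One VMkernel ping per second to the NFS server; any timeouts in the
    # output bracket the failover window (Ctrl-C to stop)
    while true; do
      vmkping -c 1 192.168.1.50
      sleep 1
    done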
Actual Results –
Xsigo A Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | Two Ping Loss
External Laptop | vCenter Server | VM | No Ping Loss | No Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XSA & XNA status showed as down on all ESXi hosts during the InfiniBand cable removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
No VM GUI Hesitation reported
NFS Storage did not go offline
Xsigo B Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | One Ping Loss | One Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | One Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | One Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XNB & XSB status showed as down on all ESXi hosts during the InfiniBand cable removal
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
NFS Storage did not go offline
Test 4 Summary –
All results were as expected. There was only very minor ping loss, which for us is nothing to worry about.
Test 5 – Pull Power on an Active Xsigo VP780
Summary –
During this test, I will remove all the power cords from the Xsigo chassis.
I will disconnect the power cords from Xsigo A first, measure/record the results, return the environment to normal, and then repeat for Xsigo B.
Pre-Procedure –
Validate via the Xsigo CLI – ‘show vnics’ to confirm the vNICs are in the up state
- Xsigo A and B are reporting both I/O Modules as functional
Ensure the ESXi hosts’ vNICs are in Active mode and not Standby
- vCenter Server is reporting all communication is normal
Procedure –
Remove the power cables from Xsigo A
Measure results via ping and vCenter Server, and check for any VM GUI hesitation
Replace the cables, ensure the system is stable, and repeat for the Xsigo B device
Expected results –
ALL active traffic (IP or NFS) for ESXi (including VMs) will continue to pass through the redundant XNB or XSB accordingly.
All active traffic (IP or NFS) for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
vCenter Server should show XNA and XSA as unavailable until power is restored
I don’t expect ESXi to take any of the NFS datastores offline
How I will quantify results –
ALL active traffic (IP or NFS) for ESXi (including VMs) will continue to pass through the redundant XNB or XSB accordingly.
- Active PING to the ESXi hosts (Management Network, VMs) and other devices to ensure they stay up
All active traffic (IP or NFS) for ESXi (including VMs) might see a brief drop if its traffic is flowing through the affected adapter.
- Active PING to the ESXi hosts (Storage, Management Network, VMs)
vCenter Server should show XNA and XSA as unavailable until power is restored
- In vCenter Server, under Network Configuration, check whether the affected XN#/XS# adapters go down and come back to active
I don’t expect ESXi to take any of the NFS datastores offline
- In vCenter Server, under Storage, I will check whether the datastore goes offline
Actual Results –
Xsigo A Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | No Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | One Ping Loss | One Ping Loss
External Laptop | ESX Host 1 | ESX | One Ping Loss | One Ping Loss
External Laptop | ESX Host 2 | ESX | One Ping Loss | One Ping Loss
External Laptop | ESX Host 3 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 4 | ESX | No Ping Loss | One Ping Loss
ESX Host | Iomega Storage | NFS | No Ping Loss | No Ping Loss
From vCenter Server –
XSA & XNA status showed as down on all ESXi hosts during the power pull
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
No VM GUI Hesitation reported
NFS Storage did not go offline
Xsigo B Results…
Pings –
From Device | Destination Device | Type | Result During | Result Coming Online / After
External Laptop | Windows 7 VM | VM | One Ping Loss | No Ping Loss
External Laptop | vCenter Server | VM | One Ping Loss | One Ping Loss
External Laptop | ESX Host 1 | ESX | No Ping Loss | No Ping Loss
External Laptop | ESX Host 2 | ESX | No Ping Loss | One Ping Loss
External Laptop | ESX Host 3 | ESX | One Ping Loss | One Ping Loss
External Laptop | ESX Host 4 | ESX | One Ping Loss | No Ping Loss
ESX Host | Iomega Storage | NFS | One Ping Loss | No Ping Loss
From vCenter Server –
XNB & XSB status showed as down on all ESXi hosts during the power pull
vCenter Server triggered the ‘Network uplink redundancy lost’ alarm
No VM GUI Hesitation reported
NFS Storage did not go offline
Test 5 Summary –
All results were as expected. There was only very minor ping loss, which for us is nothing to worry about.
It took about 10 minutes for the Xsigo to come back up and online, from the point I pulled the power cords to the point ESXi reported the vNICs were online.
Overall Thoughts…
Under very low load, the Xsigo performed as expected with ESXi. So far the redundancy testing is going well.
Tomorrow I plan to place a pretty hefty load on the Xsigo and IOMega to see how they will perform under the same conditions.
I’m looking forward to seeing if the Xsigo can perform just as well under load.
Trivia Question…
How do you know if someone has rebooted and watched an Xsigo boot?
A very cool logo comes up on the boot screen! Now that’s old school and very cool!