VMware VCSA 6: FSCK Failed on Boot

This past weekend I decided to do some rewiring of my home lab and accidentally pulled the power to the host that my VCSA was running on. While my VCSA 6 was booting back up I received the following error:

fsck failed. Please repair manually and reboot. The root file system is currently mounted read-only. To remount it read-write do:
bash# mount -n -o remount,rw

VCSA Boot Error
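
For context, the prompt is asking for a manual fsck of the failing partition. A minimal sketch (/dev/sda3 is only an example device name; use the partition the fsck output above the prompt actually reported):

# run a repair pass against the reported partition, answering yes to fixes
fsck -y /dev/sda3
# then restart the appliance
reboot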

Read more…

VMware Guest Customization – Could not parse or process the unattend answer file

While deploying a virtual machine from a template I received an error when guest customization ran:

Windows could not parse or process the unattend answer file for pass [specialize]. The settings specified in the answer file cannot be applied. The error was detected while processing settings for components [Microsoft-Windows-Shell-Setup].

VMware Guest Customization Error

Per VMware KB 2008221, a ‘wrong product key’ can cause this. Coincidentally, I had created and used a brand new Guest Customization for this deployment. It turned out I had used a Server 2012 Datacenter KMS key instead of the Server 2012 R2 Datacenter KMS key (Microsoft KMS Keys). I corrected the Guest Customization, deleted the VM, and redeployed, which solved the problem!

A General System Error Occurred: Cannot get user info

After a fresh deployment of the vCenter Server Appliance 6 (VCSA) I got the error below when using the “Use Windows session credentials” check box in both the thick and web clients. After some searching I found VMware KB 2050701, which states this is a known issue affecting vCenter Server Appliance 5.1, 5.5, and 6.0.

1 VCSA 6 - Cannot get user info Error

Read more…

Lost connectivity to the device backing the boot filesystem

One of our HP Gen 9 blades had the following configuration error:

Lost connectivity to the device mpx.vmhba32:C0:T0:L0 backing the boot filesystem /vmfs/devices/disk/mpx.vmhba32:C0:T0:L0. As a result, host configuration changes will not be saved to persistent storage.

The boot filesystem is on an internal 8 GB SD Card. I logged into iLO, went to Diagnostics, and found the system no longer saw the SD Card as mounted:

iLO 4 - Issue with media
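
You can also confirm what the host thinks of the device from the ESXi shell; the Status field in the output shows whether the device is still alive (the device ID below is the one from our error, so substitute your own):

esxcli storage core device list -d mpx.vmhba32:C0:T0:L0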

We evacuated the host and reseated the SD Card. It mounted without issue, but to be safe we went ahead and installed a replacement SD Card. The issue reoccurred anyway; it turned out to be caused by the version of iLO firmware running on the server.

iLO 4 - Issue Resolved

EDIT (9/14/2015): There seems to be an issue with iLO firmware version 2.20 causing this behavior. The firmware should be updated to version 2.22. HP should be releasing version 2.30 in the next few weeks, which will hopefully be a full fix for this issue! Big thanks to Mike B. for reporting his findings from HP!!!

EDIT (10/2/2015): The new HP Service Pack for ProLiant (SPP) (version 2015.10.0) contains the updated iLO firmware 2.30! We are rolling this through our environment over the next week.

EDIT (12/4/2015): The 5+ blades we upgraded to iLO 2.30 haven’t had the issue repeat.

EDIT (12/10/2015): Multiple reports of the issue reoccurring on iLO 2.30 firmware have been received. Though the 2.30 firmware seems to have helped, it isn’t a permanent fix. iLO firmware 2.22 should be used until we hear back from HP.

EDIT (2/5/2016): Bjorn reports HP gave him iLO 2.40 firmware to load. I just checked HP Downloads and it isn’t available yet. I’m waiting to read the release notes to see if it specifically addresses this issue.

EDIT (4/20/2016): HP released iLO 2.40 firmware on April 1st. The release notes do not mention this issue specifically, only IPv6 enhancements. We still have not had the issue reoccur on any of our blades running iLO 2.30 firmware. We are going to stay on 2.30, as the enhancements in 2.40 aren’t worth the gamble of a new version.

Deprecated VMFS volume(s) found on the host

One of my ESXi 6 hosts had the configuration issue message stating: “Deprecated VMFS volume(s) found on the host. Please consider upgrading volume(s) to the latest version.” I only have two LUNs presented to my hosts and both are VMFS5:

Storage is VMFS
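
If you’d rather confirm the VMFS versions from the ESXi shell than from the client, this lists each mounted datastore with its filesystem type and version (e.g. VMFS-5):

esxcli storage filesystem list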

After some brief searching I found VMware KB article 2109735, which states the possible cause:

This issue occurs because at the time of initial detection, the version of the filesystem is not known. Therefore, comparing it against the list of valid filesystems does not return a match.

Looking through the hostd log I did find the entry mentioned in the KB article. It appears that after I renamed my StarWind LUN, it did not report the filesystem to the host fast enough, which caused the error to occur:

/var/log/hostd.log 

HOSTD Log Entry
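
To find the entry without paging through the whole log, a grep along these lines works (StarWind is my LUN’s name and is only an example; search for your own volume name):

grep -i "StarWind" /var/log/hostd.log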

The resolution was to restart the management agents on the host. I put the host in maintenance mode, then restarted the management agents:

services.sh restart

services.sh restart
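
For reference, both steps can be done from the shell. A sketch (flag syntax per the builds I’ve used, so verify on yours; note that host-side maintenance mode will not evacuate running VMs, so vMotion them off first):

# enter maintenance mode, restart the agents, then exit
esxcli system maintenanceMode set --enable true
services.sh restart
esxcli system maintenanceMode set --enable false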

After the management services restarted the error cleared:

Error cleared

Leave comments below if you received this message and this KB article resolved your issue!

KB2109735:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2109735&src=vmw_so_vex_sbori_1079

Unable to kill DCUI – ESXi 5.1

One of our ESXi 5.1 hosts entered a disconnected state with the reason unknown. When I logged into the console of the server and checked the network settings, everything checked out. Then I went into the ESXi Shell to see what the network interfaces looked like, and here lay the problem… the management vmkernel interface was not enabled.

esxcli network ip interface list

1 ESXi Shell - Interface List

If you need to see what IP address is assigned to each vmk, run this command:

esxcli network ip interface ipv4 get

ESXi Shell - Get Interface

Okay, no big deal… that is an easy fix! But when I tried to enable it I received an “Unable to kill DCUI” error.

esxcli network ip interface set -e true -i vmk0

2 ESXi Shell - Enabled Interface - failed

I could not find any information about this error anywhere. With this production ESXi host disconnected and roughly 40 virtual machines still running on it, I admitted defeat and opened a support case with VMware.

The VMware Support Engineer referenced internal KB article 2052878 with the fix below:

First we need to find the process ID of the DCUI.

ps | grep -i DCUI

3 ESXi Shell - Find DCUI process ID

Note: The number to the right of the “Unable to kill DCUI” error is NOT the PID. Use the command above.

Now kill that PID; the command will not return anything if successful.

kill -9 PID

4 ESXi Shell - kill the dcui pid

This should now let you enable the vmk interface:

esxcli network ip interface set -e true -i vmk0

5 ESXi Shell - Enable interface success

Perform an interface list and the previously disabled vmk should now show as enabled. Check to see if your host is pingable again.

esxcli network ip interface list

6 ESXi Shell - Interface List with vmk0 enabled
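
From the shell you can also verify the management interface actually answers on the network; vmkping sources its traffic from a vmkernel interface (the address below is a stand-in for your gateway):

vmkping 192.168.1.1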

If anyone else received this error, please comment with your scenario and results!

Not able to see the SD Card on HP Gen 9 during ESXi 5.5 Installation

When trying to install the HP-customized ESXi 5.5 image on an HP Gen 9 BL460 blade server, I was not able to see the SD card to install to. After trying a few things I found that switching the USB 3.0 Mode to Auto instead of the default of Enabled allowed the ESXi installer to see the SD card. The internal card reader goes over the USB bus, so this appears to be a bug; hopefully HP will have an update soon to fix it. Below is what I did to get through this issue:

ESXi 5.5U2 Installation – If you're not seeing the SD card, go ahead and reboot:

ESXi Install - Not seeing SD card

Hit F9 on the boot screen to enter System Utilities:

HP Gen 9 Boot Post

Hit Enter on the System Configuration:

HP Gen 9 - F9 Menu 1

Hit Enter on the BIOS/Platform Configuration:

HP Gen 9 - F9 Menu 2

Hit Enter on System Options:

HP Gen 9 - F9 Menu 3

Hit Enter on USB Options:

HP Gen 9 - F9 Menu 4

On USB 3.0 Mode, change the default value of Enabled to Auto. Now hit F10 to save and reboot.

HP Gen 9 - F9 Menu 5

Now on ESXi installer screen you should see the SD card:

ESXi Install - SD Card Ready for Installation

Guest customization runs on every boot

In our vSphere 5.1 Update 2 environment we found that some of our Windows virtual machines were running guest customization every time they booted. This was causing them to lose their static IP address and take an additional 5 minutes to boot.

vmware-image-customization

To fix this issue, boot into Windows and open regedit. Navigate to the following location:

HKLM\System\CurrentControlSet\Control\Session Manager\

Regedit Key

Edit the BootExecute value and remove all sysprepDecrypter.exe lines. We had some VMs with up to 10 entries!

Regedit - edit key

Your key should now look like this:

Regedit - correct key

Click OK. Now the VM will not perform the guest customization on every boot.
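
If you have a pile of VMs to clean up, the same edit can be scripted from an elevated command prompt. A sketch, assuming the only entry you want left is the Windows default of autocheck autochk *:

rem inspect the current BootExecute entries first
reg query "HKLM\System\CurrentControlSet\Control\Session Manager" /v BootExecute

rem reset the value to the Windows default, dropping the sysprepDecrypter.exe lines
reg add "HKLM\System\CurrentControlSet\Control\Session Manager" /v BootExecute /t REG_MULTI_SZ /d "autocheck autochk *" /f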

To prevent future VMs you deploy from hitting this issue, update your vSphere environment to at least 5.1 Update 5.

Authorize Exception Error when logging into vCenter

If you get the following error when logging into vCenter (5.1 U5): “A general system error occurred: Authorize Exception”

vCenter - Authorize Exception error

Restarting the Single Sign On service on your vCenter server should resolve the issue. In my case the cause was the LDAP connection pool being exhausted. To confirm this, check the ssoAdminServer.log found here:

C:\Program Files\VMware\Infrastructure\SSOServer\log

Do a search for the following error: No ManagedConnections available within configured blocking timeout
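
From a command prompt on the vCenter server, findstr can do the search for you (findstr ships with Windows; log path as above):

findstr /i /c:"No ManagedConnections available" "C:\Program Files\VMware\Infrastructure\SSOServer\log\ssoAdminServer.log"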

If you find that error around the time you were logging in, then your LDAP connection pool was exhausted. This issue is resolved in vSphere 5.5.
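
To bounce the service without opening the Services console, net stop/start works against the service’s display name. A sketch, assuming the 5.1 display name (verify the exact name in services.msc, as it varies by version):

net stop "vCenter Single Sign On"
net start "vCenter Single Sign On"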

The VMware KB article for this issue is 2055448:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2055448