Converting vSphere Custom Attributes to Categories and Tags
In vSphere 5.1, "Tags" replace the old custom attributes as the way of adding metadata to vSphere objects. Tags are organised into categories, which "define how the tags can be applied to inventory objects". The easiest way to think of the difference is that custom attributes are "free text", whereas tags are statically defined properties.
There is a wizard for converting custom attributes to tags, but it is confusing and pretty poor - let me explain. We use four custom attributes in my current environment: CreatedBy, CreatedOn, Owner and ServiceType. CreatedBy contains the user ID of the person who created the VM, CreatedOn is the timestamp of when the VM was created, Owner is the Business Unit that owns the server and ServiceType is the type of service - e.g. Active Directory, or SQL.
Site to Site VPN Tunnel traffic flow problems
Firewalls being used – Sonicwall 3500 & Cisco 506e
Several months ago we relocated, and it was then necessary to set up a Site to Site VPN tunnel with another network. (In this instance the other network was not directly managed by us.)
Once the tunnel was created and traffic tests had succeeded, all looked well. However, after several hours (or less in some cases) traffic stopped flowing, yet both firewalls reported the tunnel as “up”. We reviewed the Phase 1 and Phase 2 settings and tweaked the Sonicwall VPN settings in the hope of remedying the problem.
Options on the Sonicwall such as “Enable IKE Dead Peer Detection” & “Enable Keep Alive” were enabled and disabled to try and find a fix for the VPN traffic flow problem.
Interestingly, during the troubleshooting process we found that if we manually restarted the VPN tunnel, traffic would resume with no issue, but this was hardly a practical fix.
Liaising with the other site, we also experimented with the Phase 1 and Phase 2 lifetime settings, with no success.
It was then that we had a small eureka moment: we decided to check the time servers each firewall referenced. It transpired that the time server referenced by the Cisco firewall was out of sync (it was an internally hosted NTS).
After the offending NTS had been re-synced, we completely recreated the VPN tunnel, double-checking the settings as we went along. The VPN tunnel came up with no issues and has been stable ever since.
If we encounter a problem like this again, I would simply point both firewalls to the same NTS, but as one of the firewalls in this case was managed by a third party this was not an option.
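When you cannot point both firewalls at the same time source, a quick comparison of their clocks can at least confirm or rule out drift. The following is only a rough sketch in Python: it assumes you have already read the current time from each firewall (for example from its CLI) at roughly the same moment, and the function names and the 60-second tolerance are hypothetical choices, not values taken from either vendor.

```python
from datetime import datetime, timedelta


def clock_skew(reading_a, reading_b):
    """Return the absolute difference between two clock readings."""
    return abs(reading_a - reading_b)


def clocks_in_sync(reading_a, reading_b, tolerance=timedelta(seconds=60)):
    """True if the two readings differ by no more than the tolerance.

    The 60-second default is an illustrative assumption only.
    """
    return clock_skew(reading_a, reading_b) <= tolerance


if __name__ == "__main__":
    # Hypothetical readings noted down from each firewall at the same moment.
    sonicwall_time = datetime(2012, 1, 1, 12, 0, 0)
    cisco_time = datetime(2012, 1, 1, 12, 5, 0)
    print("Skew:", clock_skew(sonicwall_time, cisco_time))
    print("In sync:", clocks_in_sync(sonicwall_time, cisco_time))
```

If the skew is more than a few seconds, the time source of the drifting device is worth checking before touching any VPN settings.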
vSphere HA agent for host [Host's Name] has an error in [Cluster's Name] in [Datacenter's Name]: vSphere HA agent cannot be correctly installed or configured
Here's a lesson in checking the basics! I added a new ESXi 5 host to a cluster today and spent a good couple of hours troubleshooting the error:
vSphere HA agent for host [Host's Name] has an error in [Cluster's Name] in [Datacenter's Name]: vSphere HA agent cannot be correctly installed or configured
After a few basic checks (migrating the host in and out of the cluster, and rebooting), I headed off to Google and began troubleshooting.
Cannot install the vSphere HA (FDM) agent on an ESXi host - this article suggests that the host is in lockdown mode. This is unlikely since we don't use lockdown mode, but I checked anyway:
Get-vmhost esxi001.definit.co.uk | select Name,@{N="LockDown";E={$_.Extensiondata.Config.adminDisabled}} | ft -auto Name,LockDown
This returned false - no lockdown.
To exit lockdown mode, you can use:
(get-vmhost esx001.definit.co.uk | get-view).ExitLockdownMode()
I spent a good amount of time going through the list on Troubleshooting VMware High Availability (HA) in vSphere which isn't entirely ESXi relevant but has some good pointers nonetheless.
I finally got to Reconfiguring HA (FDM) on a cluster fails with the error: Operation timed out, with the following gem of info:
This issue occurs if the vSphere High Availability Agent service on the ESXi host is stopped.
*Facepalm* - I checked the services and set the service to start and stop automatically. HA is now happily configured.
No matter how much you know, you gotta check the basics!
vMA 5: Cannot initialize property ‘ vami.DNS0.vSphere_Management_Assistant_(vMA)’
Just a quick post regarding the vSphere Management Assistant 5 - when deploying the vMA with a static IP address, you might see the following error:
Power On virtual machine <VM name> Cannot initialize property ' vami.DNS0.vSphere_Management_Assistant_(vMA)' , since network '<network name>' has no associated IP pool configuration.
To fix it, edit the vMA virtual machine's properties, go to Options, vApp Options and select Disable. Acknowledge the warning and click OK to close the VM properties.
The vMA booted fine after that - the solution comes from this VMware Communities post.
VMware ESXi Maximum paths includes local storage
If you are close to the VMware ESXi storage path limit of 1024 paths per host, you may want to consider the following: local storage, including CD-ROMs, is counted in your total paths.
Simply because of the size and age of the environment, some of our production clusters have now reached the limit (including local paths). When that happens, you see this message in the logs:
[2012-08-20 01:48:52.256 77C3DB90 info 'ha-eventmgr'] Event 2003 : The maximum number of supported paths of 1024 has been reached. Path vmhba3:C0:T4:L0 could not be added.
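If you have collected host logs and want to see which paths were rejected, a small parser can pull them out. This is a minimal sketch in Python: the regular expression is built from the single sample line above, so the exact wording is assumed to be the same in your logs, and the function name is hypothetical.

```python
import re

# Pattern derived from the one 'ha-eventmgr' sample line quoted above;
# other ESXi builds may word the event differently.
PATH_LIMIT_RE = re.compile(
    r"The maximum number of supported paths of (?P<limit>\d+) has been reached\. "
    r"Path (?P<path>\S+) could not be added\."
)


def find_rejected_paths(log_lines):
    """Return a (limit, path) tuple for every path-limit event found."""
    hits = []
    for line in log_lines:
        match = PATH_LIMIT_RE.search(line)
        if match:
            hits.append((int(match.group("limit")), match.group("path")))
    return hits


if __name__ == "__main__":
    sample = [
        "[2012-08-20 01:48:52.256 77C3DB90 info 'ha-eventmgr'] Event 2003 : "
        "The maximum number of supported paths of 1024 has been reached. "
        "Path vmhba3:C0:T4:L0 could not be added.",
    ]
    for limit, path in find_rejected_paths(sample):
        print(f"Limit {limit} reached, rejected path: {path}")
```

Running it over the logs from each host gives a quick list of the paths that never made it in, which helps when deciding what to unpresent.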
Upgrading to ESXi 5.0 Update 1 using VMware Update Manager
I'm currently updating a very small 4-host cluster built for a specific application within our datacentre; the hosts are IBM HS22 blades. Since we have the VMware Update Manager infrastructure in place already, I downloaded the IBM ESXi 5.0 Update 1 ISO and imported it into Update Manager, created a baseline and then applied it to the cluster. I scanned the cluster with the baseline and was issued this warning for each host:
That's fine - there is an option to remove those modules when you remediate the host.
Installing a TMG Enterprise Management Server and Migrating an Existing Standalone Array: Part 1
This is my current scenario: there are two existing servers in a stand-alone array - TMG01 and TMG02 - and over in a DR site there is a new server (TMG03) that is in the process of being built. To comply with DR, all 3 servers must have their configurations up to date; however, there is no direct communication allowed between the two DMZs, so simply adding the new server as an array member is not possible.
Fortunately, IPSec is allowed between each DMZ and the management DMZ, so the plan is to configure IPSec between a new Enterprise Management Server in the Management DMZ (we'll call it EMS01) and each of the three TMG servers.
SCOM 2007 R2: Daily Health Check Script v2
A couple of months ago I posted the first version of my SCOM 2007 R2 Daily Health Check Script - here is version 2. It's more than a little motivated by some friendly competition with a Microsoft PFE for SCOM, and hopefully you'll agree it's a big improvement on the last version.
Updated for this version
- Formatting changed to make it more readable and more compatible
- Added "Report generated on <server>" to the top of the report
- Management Server states reported as one section
- Default MP check moved to beneath the Management servers
- Agents in pending states moved to be with the Agent health states
- Clarified "Unresponsive Agents" and "Agents reporting errors"
- Management server alerts streamlined
- Added top 10 alerts for the last 7 days, and added top alerters for each
SCOM 2007 R2: Daily Health Check Script
An updated version of this script has been released: http://www.definit.co.uk/2012/05/scom-2007-r2-daily-health-check-script-v2/
I've been working with a Microsoft SCOM PFE (Premier Field Engineer) for the last few months, and part of the engagement is an environment health check for the SCOM setup. Based on this, Microsoft recommend a series of health checks for the environment that should be carried out every day. These are summarised as follows:
- Check the health of all Management Servers and Gateways
- Check the RMS is not in maintenance mode
- Review Outstanding Alerts
- Review Agent's Health Status
- Review Backup Status
- Review any Management Group Alerts
- Review the Pending Management status
- Review Database Sizes (Operations, Data warehouse, ACS)
- Review Volume of Alerts
- Review Alert Latency
- Document any changes




