I've been playing about with a compact SRM install in my lab. Since I have limited resources and only one site, I wanted to create a run-through so that anyone learning SRM can do it in their own lab too. I am creating two sites on the same IP subnet (pretend it's a stretched LAN across two sites) and will be protecting a single, tiny Linux web server using vSphere Replication. I'm aiming to cover SAN-based replication in a later post.
Below is the list of hosts and VMs running for this exercise:
- ESXi-01 - my "Protected Site" - this is running DC-01, VC-01, SRM-01 and VRA-01 (to be installed later)
- ESXi-02 - my "Recovery Site" - this is running VC-02, SRM-02 and VRA-02 (to be installed later)
- DC-01 - this is my domain controller. I’m only going to use one DC for both “sites” as I don’t have the compute resource available to run a second. This is also my Certificate Authority.
- VC-01 - this is my primary vCenter Server, running on Windows Server 2012 R2. It is managing ESXi-01.
- VC-02 - this is my “recovery site” vCenter Server, a vCenter Server Appliance (VCSA). It is managing ESXi-02.
- SRM-01 - “protected site” SRM server, base install of Windows Server 2012 at this point
- SRM-02 - “recovery site” SRM server, base install of Windows Server 2012 at this point
- WEB-01 - this is a really, really, basic Ubuntu web server I've deployed from a template to use for testing.
Right - without further ado, let's get stuck in!
This had me scratching my head: what seemed to be a common problem wasn’t fixed by the common solution. It was actually my fault – I was too familiar with the product and was setting things up too quickly to test.
I installed a VCSA 5.5 instance in my lab as a secondary site for some testing and during the process found I couldn’t log on to the web client – it failed with the error:
Failed to connect to VMware Lookup Service https://vCVA_IP_address:7444/lookupservice/sdk - SSL certificate verification failed.
I had a closer look at the generated certificate and noticed that the Subject Name was truncated: “CN=vc-02.definit.loca”. That led me to the network config of the VCSA. I’d entered the FQDN into the “host name” field, which was then passed to the certificate generation and truncated, producing the SSL error. Changing the FQDN back to the host name “VC-02” and regenerating the certificate resolved the issue.
If you do have to follow that process, remember to disable the SSL certificate regeneration after it’s fixed – otherwise you’ll suffer slow boot times!
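If you want to check the Subject Name from a shell rather than a browser, here is a minimal sketch using openssl. The helper names `cert_subject` and `subject_cn` are my own, and the host name in the usage line is an assumption based on the truncated CN above; port 7444 comes from the error message.

```shell
# Hypothetical helper: print the subject of the certificate served on host:port.
# Needs the openssl client and network access to the appliance.
cert_subject() {
    # $1 = host:port
    openssl s_client -connect "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -subject
}

# Hypothetical helper: read a subject line on stdin and print only the CN value.
# Handles both "subject= /CN=name" and "subject=CN = name, O = ..." layouts.
subject_cn() {
    sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}
```

Something like `cert_subject vc-02.definit.local:7444 | subject_cn` should then print the CN, so a clipped value such as “vc-02.definit.loca” stands out straight away.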
I’ll put that one down to over-familiarity with the product!
After having a play with Virtual Flash and Host Caching on one of my lab hosts I wanted to re-use the SSD drive, but couldn’t seem to get vFlash to release the drive. I disabled flash usage on all VMs and disabled the Host Cache, then went to the Virtual Flash Resource Management page to click the “Remove All” button. That failed with errors:
“Host’s virtual flash resource is inaccessible.”
“The object or item referred to could not be found.”
In order to reclaim the SSD you need to erase the proprietary vFlash File System partition using some command line kung fu. SSH into your host and list the disks:
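The exact listing command isn't reproduced in this post. A minimal sketch, assuming the disks are exposed under `/vmfs/devices/disks` (the path used by the partedUtil commands further down), would be the following. The function names and the `DISK_DIR` variable are my own.

```shell
# Assumption: ESXi exposes disks and partitions as entries under this path.
DISK_DIR="${DISK_DIR:-/vmfs/devices/disks}"

# List every disk and partition entry, one per line.
list_disks() {
    ls -1 "$DISK_DIR"
}

# Hypothetical helper: keep only partition entries, which end in ":<number>".
list_partitions() {
    list_disks | grep ':[0-9][0-9]*$'
}
```

Running `list_partitions` on the host narrows the listing to the partition entries only.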
In the listing you should see the disk ID “t10.ATA_____M42DCT032M4SSD3__________________________00000000121903600F1F” and, below it, the same ID with “:1” appended, which is partition 1 on the disk. This is the partition that I need to delete. I then use partedUtil to delete it, using the format below:
partedUtil delete "/vmfs/devices/disks/<disk ID>" <partition number>
partedUtil delete "/vmfs/devices/disks/t10.ATA_____M42DCT032M4SSD3__________________________00000000121903600F1F" 1
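Since deleting a partition is destructive, it can help to build the command from the listing entry and read it back before running it. This is only a sketch: `delete_cmd_for` is a hypothetical helper that prints the partedUtil command (the binary name is case-sensitive) and does not execute anything.

```shell
# Hypothetical helper: turn a partition entry such as "<disk ID>:1" into the
# matching partedUtil delete command. It only prints the command.
delete_cmd_for() {
    entry="$1"
    disk_id="${entry%:*}"      # everything before the last colon
    part_num="${entry##*:}"    # everything after the last colon
    printf 'partedUtil delete "/vmfs/devices/disks/%s" %s\n' "$disk_id" "$part_num"
}
```

Splitting on the last colon means disk IDs that contain colons themselves are still handled.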
The command produces no output.
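Because the delete is silent, one way to confirm it worked is to read the partition table back with `partedUtil getptbl` and check that partition 1 is gone. The sketch below assumes the usual getptbl layout (label type on line 1, disk geometry on line 2, one line per partition after that). `partition_listed` is a hypothetical helper.

```shell
# Hypothetical helper: read `partedUtil getptbl` output on stdin and succeed
# only if the given partition number is still listed.
# Assumption: lines 1 and 2 are the label and geometry; each later line
# starts with a partition number.
partition_listed() {
    awk -v part="$1" 'NR > 2 && $1 == part { found = 1 } END { exit !found }'
}
```

For example, `partedUtil getptbl "/vmfs/devices/disks/<disk ID>" | partition_listed 1 || echo "partition 1 removed"`.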
Now I can go and reclaim the SSD as a VMFS volume as required.
Hope that helps!
For those of you unaware, VMware recently released the VMware vSphere Mobile Watchlist.
What does it do?
"VMware vSphere Mobile Watchlist allows you to monitor the virtual machines you care about in your vSphere infrastructure remotely on your phone. Discover diagnostic information about any alerts on your VMs using VMware Knowledge Base Articles and the web. Remediate problems from your phone by using power operations or delegate the problem to someone on your team back at the datacenter."
- REMEDIATE REMOTELY
Use power operations to remediate many situations remotely from your device.
- VMS AT A GLANCE
Review the status of these VMs from your device including: their state, health, console and related objects.
I have been using it for a day or so and have found it very useful. At present I have it installed on my Android phone and tablet.
If you use it in conjunction with a VPN, or whatever your preferred secure method is for connecting to your work LAN when you are "out and about", it's a great way to quickly take a look at any problematic VMs without needing to fire up your laptop.
It's available on Android and iOS and is well worth a quick look.
In my post yesterday (vexpert.me/hS) I talked about how to recover from an expired default SSO administrator password – this prompted a discussion on Twitter with Anthony Spiteri (@anthonyspiteri) and Grant Orchard (@grantorchard) about the defaults for expiration and how to mitigate the risk.
The first solution is to modify the password expiration policy for SSO. I’m not necessarily advocating this – I think that expiring passwords ensure that you change them regularly and increase the overall security of your SSO solution. However, I can envisage situations (similar to mine) where the SSO administrator account is not used for a long time and its password expires, which causes headaches.
To modify the SSO password policy, log onto the vSphere Web Client as the SSO admin (admin@system-domain for 5.1 or administrator@vsphere.local for 5.5) and select Administration, then Sign-On and Discovery > Configuration. Select the Policies tab, where you should see the default config.
Click Edit and set the password policy as required. This only applies to SSO users (i.e. those in the System-Domain or vSphere.local domains). To set the password to never expire, set the Maximum Lifetime to 0. If you choose to do that, I’d beef up the complexity of your password policy to include upper, lower, numeric and special characters and increase the length from 8 to 13.
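To get a feel for what that stricter policy demands, here is a rough sketch that checks a candidate password against it: at least 13 characters, with upper, lower, numeric and special characters. `meets_policy` is a hypothetical helper of mine, not anything SSO provides, and SSO's own rules may differ in detail.

```shell
# Hypothetical helper: succeed only if the password is at least 13 characters
# long and contains an upper-case letter, a lower-case letter, a digit and a
# special (non-alphanumeric) character.
meets_policy() {
    pw="$1"
    [ "${#pw}" -ge 13 ] || return 1
    case "$pw" in *[[:upper:]]*) ;; *) return 1 ;; esac
    case "$pw" in *[[:lower:]]*) ;; *) return 1 ;; esac
    case "$pw" in *[[:digit:]]*) ;; *) return 1 ;; esac
    case "$pw" in *[![:alnum:]]*) ;; *) return 1 ;; esac
    return 0
}
```

For example, `meets_policy 'Str0ng!Passw0rd' && echo ok` passes, while an 8-character password fails on length alone.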
Similarly, you can edit the lockout policy, which by default locks the account after 3 failed attempts within 24 seconds. The lockout lasts 15 minutes. Setting the lockout time to 0 forces a manual unlock by an SSO admin.
The second option seems preferable to me (and Anthony and Grant) – that is to add some AD users or groups to the SSO administrators group. To do this, again log in as an SSO admin and select Administration, then Access > SSO Users and Groups, then the Groups tab. Select “__Administrators__” and click on the add principals button below. Select your AD domain in the Identity Source field and search for your required user or group. Add them and click OK. Now those users or group members have the ability to log on and reset or unlock the SSO admin account. AD accounts are obviously subject to your AD password policy, but can be reset independently of SSO and therefore don’t require you to use some command-line kung-fu to unlock.