DefinIT

vCAC 6.0 build-out to distributed model – Part 1: Certificates

This is the first article in a series about how to build out a simple vCAC 6 installation into a distributed model.


Simple vCAC deployment

In a simple installation you have the Identity Appliance, the vCAC appliance (which includes a vPostgres DB and vCenter Orchestrator instance) and an IaaS server. The distributed model still has a single Identity Appliance but clusters 2 or more vCAC appliances behind a load balancer, backed by a separate vPostgres database appliance. The IaaS components are installed on 2 or more IaaS Windows servers and are load balanced, backed by an external MSSQL database. Additionally, the vCenter Orchestrator appliance is used in a failover cluster, backed by the external vPostgres database appliance.

The distributed model can improve availability, redundancy, disaster recovery and performance. However, it is more complex to install and manage, and there are still single points of failure. For example, the vPostgres database is not highly available and, although protected by vSphere HA, could still be the cause of an outage. Clustering the database would improve availability but may not be supported by VMware. Similarly, the Identity Appliance is currently a single point of failure, although there are options for high availability there too.
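As a rough illustration of the point about single points of failure, the topology can be modelled as a list of roles with a node count for each. This is only a sketch: the `Component` class and `single_points_of_failure` function are hypothetical names, and a count of 2 stands in for "2 or more". The external MSSQL database is left out because its availability is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    instances: int  # number of nodes serving this role


def single_points_of_failure(components):
    """Return the names of roles that are served by fewer than two nodes."""
    return [c.name for c in components if c.instances < 2]


# Node counts taken from the distributed model described above (assumed minimums).
distributed = [
    Component("Identity Appliance", 1),
    Component("vCAC appliance", 2),
    Component("vPostgres database appliance", 1),
    Component("IaaS server", 2),
    Component("vCenter Orchestrator appliance", 2),
]

print(single_points_of_failure(distributed))
```

The two roles it flags are the ones called out above: the Identity Appliance and the vPostgres database appliance.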

An overview of the steps required is below:

  • Issue and install certificates
  • Deploy an external vPostgres appliance and migrate the vCAC database
  • Configure load balancing
  • Deploy a second vCAC appliance and configure clustering
  • Install and configure additional IaaS server
  • Deploy vCenter Orchestrator Appliance cluster


vSphere Security: Advanced SSH Configurations

There are different schools of thought as to whether you should have SSH enabled on your hosts. VMware recommend it is disabled. With SSH disabled there is no possibility of attack over SSH, so that’s the “most secure” option. Of course, in the real world there’s a balance between “most secure” and “usability” (e.g. the most secure host is powered off and physically isolated from the network, but you can’t run any workloads). My preferred route is to have it enabled but locked down.

Note: VMware use the term “ESXi Shell” where most of us would say “SSH”. The two are used interchangeably in this article, although there is a slight difference: you can have the ESXi Shell enabled but SSH disabled, which means you can access the shell via the DCUI. For the sake of this article, assume ESXi Shell and SSH are the same.

Trouble with SCOM 2007 R2 Certificates? Validate the entire PKI path!

I learned something new today: SCOM 2007 R2 certificate-based communication checks not only the validity of the certificate you use, but also that of the CA that issued it. Let me expand:

As in many organisations, there is a root CA (we’ll call it ROOTCA01) and a subordinate CA (we’ll call that SUBCA01). OPSMGM01 has a certificate to identify itself and has certificates for ROOTCA01 and SUBCA01 in its Trusted Root Certification Authorities store.

The certificate that secures the connection between the OpsMgr Gateway (OPSGW01) and the OpsMgr Management Server (OPSMGM01) is issued by SUBCA01 and is installed on OPSGW01. To validate the certificate chain, SUBCA01’s certificate is also installed in the Trusted Root Certification Authorities store. Opening OPSGW01’s certificate and examining the Certificate Path tab shows the certificate is valid all the way up to the issuing CA, SUBCA01.

The connection will not work – OPSGW01 logs the following events:

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:28
Event ID:      21016
Level:         Error
Computer:      opsgw01.definit.co.uk
Description:   OpsMgr was unable to set up a communications channel to opsmgm01.definit.co.uk and there are no failover hosts.  Communication will resume when opsmgm01.definit.co.uk is available and communication from this computer is allowed.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:25
Event ID:      20070
Level:         Error
Computer:      opsgw01.definit.co.uk
Description:   The OpsMgr Connector connected to opsmgm01.definit.co.uk, but the connection was closed immediately after authentication occurred.  The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration.  Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:24
Event ID:      21002
Level:         Warning
Computer:      opsgw01.definit.co.uk
Description:   The OpsMgr Connector could not accept a connection from xxx.xxx.xxx.xxx:5723 because mutual authentication failed.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:24
Event ID:      20067
Level:         Warning
Computer:      opsgw01.definit.co.uk
Description:   A device at IP xxx.xxx.xxx.xxx:5723 attempted to connect but the certificate presented by the device was invalid.  The connection from the device has been rejected.  The failure code on the certificate was 0x800B0109 (A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider.).

It was the last event that led me to check the certificate chain for the SUBCA01 certificate, which was installed and trusted but did not validate up the chain to ROOTCA01. Installing the ROOTCA01 certificate resolved the issue.
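The idea of validating the entire path can be sketched as a toy model in Python. This is not how Windows actually builds or verifies chains. It only follows issuer names until it reaches a self-signed certificate and checks whether that root is trusted. The function `chain_to_trusted_root` and the dictionary layout are assumptions made for illustration.

```python
def chain_to_trusted_root(subject, issuers, trusted_roots):
    """Follow issuer links from `subject` through the installed certificates.

    `issuers` maps each installed certificate's name to the name of its issuer.
    Returns the path if it ends at a self-signed certificate that is in
    `trusted_roots`, otherwise None.
    """
    path = [subject]
    current = subject
    while True:
        issuer = issuers.get(current)
        if issuer is None:
            return None  # certificate not installed, so the chain cannot be completed
        if issuer == current:
            # Self-signed: this is a root, which is only acceptable if trusted.
            return path if current in trusted_roots else None
        if issuer in path:
            return None  # guard against a loop in the issuer links
        path.append(issuer)
        current = issuer


# Before the fix: SUBCA01 is installed and trusted, but ROOTCA01 is missing.
before = {"OPSGW01": "SUBCA01", "SUBCA01": "ROOTCA01"}
print(chain_to_trusted_root("OPSGW01", before, {"SUBCA01"}))

# After the fix: ROOTCA01 is installed as a self-signed, trusted root.
after = dict(before, ROOTCA01="ROOTCA01")
print(chain_to_trusted_root("OPSGW01", after, {"SUBCA01", "ROOTCA01"}))
```

In this model the first call returns `None` because the walk stops at the missing ROOTCA01, even though SUBCA01 itself is trusted. The second call returns the full path up to the root.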