Lab Notes – vCloud Director 9.1 for Service Providers – Part 1: Pre-requisites

| 08/02/2019 |

This series was originally going to be a more polished endeavour, but unfortunately time got in the way. A prod from James Kilby (@jameskilbynet) has convinced me to publish as is, as a series of lab notes. Maybe one day I’ll loop back and finish them…



Because I’m backing my vCloud Director installation with NSX-T, I will be using my existing Tier-0 router, which interfaces with my physical router via BGP. The Tier-0 router will be connected to the Tier-1 router, the NSX-T logical switches will be connected to the Tier-1, and the IP networks advertised to the Tier-0 (using NSX-T’s internal routing mechanism) and up via eBGP to the physical router.

The Tier-1 router will be created in Active-Standby mode because it will also provide the load balancing services later.

Tier1 vCloud Director

Tier 1 vCloud Director advertised routes

Logical Switches

I want to build vCloud Director as many Service Provider customers do, with different traffic types separated by logical switches. I will be subnetting into some smaller /27 networks to avoid wasting IPs (a typical Service Provider requirement). To that end, I am deploying four NSX-T logical switches:

  • vCloud Director API/UI/Console
  • vCloud Director SQL
  • vCloud Director NFS
  • vCloud Director RabbitMQ/Orchestrator

The four logical switches have been connected to the Tier-1 router created for vCloud Director, and each has a router port configured in the correct subnet.

vCD Router Ports

Load Balancing

There are various load balancing requirements for the full vCloud Director installation, which will be fulfilled by the NSX-T Logical Load Balancer on the Tier-1 router:

  • vCloud Director API/UI
  • vCloud Director Console
  • vCloud Director RabbitMQ
  • vRealize Orchestrator

The actual load balancer configuration will be done later on, when I have the components deployed.


All the VMs that are part of the vCloud Director installation will require A and PTR (forward and reverse) DNS lookup records.

Required DNS Records

Notice that the VCD cells have two IPs per VM, one for the UI/API and one for the Console traffic. Two records are also created for the load balancer URLs for vRealize Orchestrator and RabbitMQ.
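As a sketch, the forward records might look like the following BIND-style zone fragment. The hostnames and IPs here are hypothetical placeholders (using the 192.0.2.0/24 documentation range), not the addresses used in my lab; each A record needs a matching PTR in the reverse zone:

```
vcd-cell-1      IN A    192.0.2.10   ; cell UI/API interface
vcd-cell-1-con  IN A    192.0.2.11   ; cell Console proxy interface
vcd-sql-1       IN A    192.0.2.20
vcd-nfs-1       IN A    192.0.2.30
vcd-rmq-1       IN A    192.0.2.40
vcd-rmq-2       IN A    192.0.2.41
vcd-rmq         IN A    192.0.2.42   ; RabbitMQ load balancer VIP
vcd-vro-1       IN A    192.0.2.50
vcd-vro-2       IN A    192.0.2.51
vcd-vro         IN A    192.0.2.52   ; vRO load balancer VIP
```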

VM Sizing

The vCloud Director cells, PostgreSQL database and RabbitMQ will be deployed using a standard CentOS7 template. vRealize Orchestrator is deployed as an appliance. The open-vm-tools package is installed on the template.

vcd-sql-1 – 2 vCPU, 4 GB RAM, 40 GB disk


All VMs have been updated using yum update -y


All VMs are configured to use a default NTP source:

yum install -y ntp

systemctl enable ntpd

systemctl start ntpd


Replace SELINUX=enforcing with SELINUX=disabled in /etc/selinux/config and reboot

sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config && cat /etc/selinux/config && reboot

Lab Notes – vCloud Director 9.1 for Service Providers – Part 4: RabbitMQ Cluster Installation

| 13/07/2018 |


RabbitMQ for vCloud Director

RabbitMQ High Availability and Load Balancing

The vCloud Architecture Toolkit states

RabbitMQ scales up to thousands of messages per second, which is much more than vCloud Director is able to publish. Therefore, there is no need to load balance RabbitMQ nodes for performance reasons.

Therefore, I am deploying RabbitMQ in cluster mode for high availability rather than scaling out resources. This means that I can use a RabbitMQ cluster with two nodes, configure replication for the vCloud Director queue, and then load balance the two nodes.

  • When you configure a highly available queue, one node is elected the Master, and the other(s) become Slave(s)
  • If you target a node hosting a Slave, RabbitMQ will route you to the queue on the Master node
  • If the queue’s Master node becomes unavailable, a Slave node will be elected as Master

In order to provide a highly available RabbitMQ queue for vCloud Director extensibility, the load balancer will target the queue’s Master node and send traffic there. In the event that the node with the Master queue becomes unavailable, the load balancer will redirect traffic to the second node, which will have been elected as Master.

Both vCloud Director and vRealize Orchestrator will access the queue via the load balancer.

  • vCloud Director will publish messages to the load balancer
  • vRealize Orchestrator will subscribe as a consumer to the load balancer

RabbitMQ HA Cluster


I’ve deployed two CentOS7 VMs from my standard template, and configured the pre-requisites as per my pre-requisites post. Updates, NTP, DNS and SELinux have all been configured.

RabbitMQ needs exactly the same Erlang version installed on each node; the easiest way to achieve this is to enable the EPEL repository:

yum install epel-release -y
yum install erlang -y

vCloud Director 9.1 supports RabbitMQ 3.6, so locate and download the correct RPM from the GitHub releases page.


To trust the downloaded package I need to import the RabbitMQ public signing certificate:

rpm --import

Finally, let's open the host firewall ports required for RabbitMQ:

firewall-cmd --zone=public --permanent --add-port=4369/tcp
firewall-cmd --zone=public --permanent --add-port=25672/tcp
firewall-cmd --zone=public --permanent --add-port=5671-5672/tcp
firewall-cmd --zone=public --permanent --add-port=15672/tcp
firewall-cmd --zone=public --permanent --add-port=61613-61614/tcp
firewall-cmd --zone=public --permanent --add-port=1883/tcp
firewall-cmd --zone=public --permanent --add-port=8883/tcp
firewall-cmd --reload
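For reference, these map to the RabbitMQ default ports (the STOMP and MQTT ports are only needed if those plugins are enabled):

```
4369         epmd (Erlang node discovery)
25672        inter-node and CLI tool communication
5671-5672    AMQP, with and without TLS
15672        HTTP management interface (and health check API)
61613-61614  STOMP (plugin)
1883, 8883   MQTT without/with TLS (plugin)
```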

RabbitMQ Installation

The following steps should be completed on BOTH RabbitMQ nodes

Install the RabbitMQ RPM

yum install rabbitmq-server-3.6.16-1.el7.noarch.rpm -y

Enable and start the RabbitMQ server:

systemctl enable rabbitmq-server

systemctl start rabbitmq-server

Enable the management interface, fix ownership of the RabbitMQ data directory, and restart the server for the changes to take effect:

rabbitmq-plugins enable rabbitmq_management
chown -R rabbitmq:rabbitmq /var/lib/rabbitmq/

systemctl restart rabbitmq-server

Finally, add an administrative user for vCloud Director:

sudo rabbitmqctl add_user vcloud 'VMware1!'
sudo rabbitmqctl set_user_tags vcloud administrator
sudo rabbitmqctl set_permissions -p / vcloud ".*" ".*" ".*"

Validate that the RabbitMQ admin page is accessible on http://vcd-rmq-1.definit.local:15672

RabbitMQ Admin Interface

Clustering RabbitMQ nodes

Now that I have two independent, stand-alone RabbitMQ nodes running, it's time to cluster them. First, the Erlang cookie needs to be copied from the first node to the second, which allows them to join the same cluster.

IMPORTANT: Make sure both nodes can resolve each other using their short names (e.g. vcd-rmq-1 and vcd-rmq-2). If they cannot, create entries in the HOSTS file to ensure that they can.

On the first node only (vcd-rmq-1)

Read the Erlang cookie from the file:

cat /var/lib/rabbitmq/.erlang.cookie

Copy the cookie contents (e.g. “FAPNMJZLNOCUTWXTNJOG”) to the clipboard.

On the second node only (vcd-rmq-2)

Stop the RabbitMQ service:

systemctl stop rabbitmq-server

Then replace the existing cookie file with the cookie from the first node

echo "FAPNMJZLNOCUTWXTNJOG" > /var/lib/rabbitmq/.erlang.cookie
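One gotcha worth noting: RabbitMQ will refuse to start unless the cookie file is owned by the rabbitmq user and readable only by its owner. After overwriting the file, restore ownership and permissions (a sketch, using the paths from the CentOS RPM layout):

```shell
# The Erlang cookie must be owned by rabbitmq with restrictive permissions
chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
chmod 400 /var/lib/rabbitmq/.erlang.cookie
```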

Start the RabbitMQ service

systemctl start rabbitmq-server

Stop the RabbitMQ app and reset the configuration:

rabbitmqctl stop_app

rabbitmqctl reset

Join the second node to the first node:

rabbitmqctl join_cluster rabbit@vcd-rmq-1

Then start the RabbitMQ app:

rabbitmqctl start_app

Validate the cluster status, using rabbitmqctl cluster_status, or by refreshing the management interface:
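On RabbitMQ 3.6 the cluster_status output is an Erlang term; a healthy two-node cluster looks roughly like the following (illustrative output only, your node and cluster names will differ). Both nodes should appear under nodes and running_nodes, with no partitions:

```
Cluster status of node 'rabbit@vcd-rmq-1' ...
[{nodes,[{disc,['rabbit@vcd-rmq-1','rabbit@vcd-rmq-2']}]},
 {running_nodes,['rabbit@vcd-rmq-2','rabbit@vcd-rmq-1']},
 {cluster_name,<<"rabbit@vcd-rmq-1.definit.local">>},
 {partitions,[]},
 {alarms,[{'rabbit@vcd-rmq-2',[]},{'rabbit@vcd-rmq-1',[]}]}]
```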

RabbitMQ Cluster Status

Configuring RabbitMQ for vCloud Director

Queue HA Policy

Now that the RabbitMQ nodes are clustered, we can configure queue mirroring with an HA policy. The command below creates a policy called "ha-all" which applies to all queues (the empty pattern "" matches everything), sets the ha-mode to "all" (replicate to all nodes in the cluster) and the ha-sync-mode to "automatic" (if a new node joins, sync automatically). You can read more about HA configuration in the RabbitMQ documentation.

rabbitmqctl set_policy ha-all "" '{"ha-mode":"all","ha-sync-mode":"automatic"}'

Create a Topic Exchange

Using the RabbitMQ management interface, log on with the “vcloud” user created earlier and select the “Exchanges” tab. Expand the “Add a new exchange” box and enter a name for the exchange. The remaining settings can be left at default. Once the new Exchange has been created, you can see that the “ha-all” policy has applied to it.

Create Topic Exchange

Configuring the RabbitMQ Load Balancer

The final configuration step is to load balance the two RabbitMQ nodes in the cluster – as described in the opening of this post, this will steer the publisher (vCloud Director) and subscriber (vRealize Orchestrator) to the node with the active queue.

I will be configuring an NSX-T load balancer, on the Tier-1 router that all the vCloud Director components are connected to. However, the basic configuration should apply across most load balancer vendors. The load balancer should direct all traffic to vcd-rmq-1, unless the health check API does not return the expected status.

  • Virtual Server
      • IP address: (vcd-rmq.definit.local)
      • Layer 4 – TCP 5672
  • Server Pool
      • Algorithm: Round Robin (though in reality, it's active/standby)
      • Member 1 (vcd-rmq-1): TCP 5672, Weight 1, Enabled
      • Member 2 (vcd-rmq-2): TCP 5672, Weight 1, Enabled, Backup Member (used only if the other member goes down)
  • Health Check
      • Request URL: /api/healthchecks/node
      • HTTP 15672
      • Header: basic authorisation header
      • Response status: 200
      • Response body: {"status":"ok"}
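The health monitor needs the basic authorisation header pre-computed, since it cannot prompt for credentials. The header value is just the base64 of user:password – for example, for the "vcloud" user created earlier (assuming the same lab password):

```shell
# Base64-encode "user:password" for the HTTP Basic Authorization header
AUTH=$(printf 'vcloud:VMware1!' | base64)
echo "Authorization: Basic ${AUTH}"
# The monitor then sends GET /api/healthchecks/node on port 15672 with this
# header, and expects HTTP 200 with body {"status":"ok"} from a healthy node
```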

Next Steps

Later, once the vCloud Director installation is completed, I will configure vCloud Director to send notifications to this RabbitMQ cluster.

Lab Notes – vCloud Director 9.1 for Service Providers – Part 3: NFS Server Installation



I’ve deployed a CentOS7 VM from my standard template, and configured the prerequisites as per my prerequisites post. Updates, NTP, DNS and SELinux have all been configured. I have added a 200GB disk to the base VM, which has then been partitioned, formatted and mounted to /nfs/data – this will be the share used for vCloud Director.

Install and enable the NFS server

Installing and configuring an NFS share is a pretty common admin task, so it doesn’t require a lot of explanation (I hope!)

Install the packages:

yum install nfs-utils rpcbind

Enable and start the services:

systemctl enable nfs-server

systemctl enable rpcbind

systemctl enable nfs-lock

systemctl enable nfs-idmap

systemctl start rpcbind

systemctl start nfs-server
systemctl start nfs-lock
systemctl start nfs-idmap

Configure the NFS Export (Share)

Once the services have been configured, I add a configuration line to /etc/exports to export the mount (/nfs/data) and allow access from the NFS subnet, with the required settings for vCloud Director.

echo "/nfs/data,sync,no_root_squash,no_subtree_check)" >> /etc/exports

The following command will load the /etc/exports configuration:

exportfs -a

Finally, open the firewall ports to allow NFS clients to connect:

firewall-cmd --permanent --zone=public --add-service=nfs
firewall-cmd --permanent --zone=public --add-service=mountd
firewall-cmd --permanent --zone=public --add-service=rpc-bind
firewall-cmd --reload
firewall-cmd –reload
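Before moving on, it's worth test-mounting the export from a client on the NFS subnet. A quick sketch (the server hostname here follows my lab naming convention and is an assumption; adjust to suit):

```shell
# Temporarily mount the export, check it is writeable, then clean up
mkdir -p /mnt/nfs-test
mount -t nfs vcd-nfs-1.definit.local:/nfs/data /mnt/nfs-test
touch /mnt/nfs-test/.write-test && rm /mnt/nfs-test/.write-test
umount /mnt/nfs-test
```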

Next Steps

Now that the NFS share is in place, I can move on to the next supporting service for vCloud Director – RabbitMQ. The NFS share will be mounted to the vCloud Director cells when they are installed later.

Lab Notes – vCloud Director 9.1 for Service Providers – Part 5: vRealize Orchestrator Cluster



  • PostgreSQL server deployed and configured
  • Two vRO 7.4 appliances deployed

Before powering them on, add an additional network card on the vcd-sql network.

Power on the VMs and wait until they boot, then log onto the VAMI interface (https://vcd-vro-[1-2]:5480) and configure the eth1 interface with an IP address on the vcd-sql subnet.

vRO eth1 interface

Configure the NTP server

vRO NTP Server

Configuring the first vRO node

Log onto the Control Centre for the first node at https://vcd-vro-1.definit.local:8283/vco-controlcenter

Select the deployment type as standalone, and configure the load balancer name.

vRO Install type

Select the vSphere authentication provider, and accept the certificate.

vRO vSphere Authentication

Enter credentials to register with vSphere

vRO Auth Credentials

Select an Administrators group to manage vRO.

vRO Admin Group

Configure the remote database connection

vRO Remote PostgreSQL DB

After a couple of minutes, the vRO server will have restarted and I can progress to the second node – check this has happened by going to the Validate Configuration page and waiting for all the green ticks!

vRO Configuration Validation

Configuring the second vRO node

Select Clustered Orchestrator from the deployment page, and enter the details of the first vRO node

vRO Clustered Orchestrator

Wait for the second node to restart its services (~2 minutes again) to apply the configuration. Once the configuration has been applied, you should see both nodes on the Orchestrator Cluster Management page.

vRO Cluster completed

Load Balancing the vRealize Orchestrator Cluster

I will be configuring an NSX-T load balancer, on the Tier-1 router that all the vCloud Director components are connected to. However, the basic configuration should apply across most load balancer vendors.

Virtual Servers


  • IP address: (vcd-vro.definit.local)
  • Port: Layer 7 HTTPS 8281
  • SSL Offload


  • IP address: (vcd-vro.definit.local)
  • Port: Layer 7 HTTPS 8283
  • SSL Offload

Server Pool

  • Members:,
  • Algorithm: Round Robin

Health Check


  • URL:  /vco/api/healthstatus
  • Port: 8281
  • Response: HTTP 200


  • URL:  /vco-controlcenter/docs/
  • Port: 8283
  • Response: HTTP 200
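A quick way to see what the health monitors will see is to query the health status endpoint on each node directly (using -k because the lab appliances have self-signed certificates):

```shell
# Expects an HTTP 200 and a JSON body reporting the node's health
curl -k https://vcd-vro-1.definit.local:8281/vco/api/healthstatus
curl -k https://vcd-vro-2.definit.local:8281/vco/api/healthstatus
```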

Next Steps

Later, once the vCloud Director installation is completed, vRealize Orchestrator will be configured for “XaaS” extensibility, as well as being hooked in as a subscriber to the vCloud notifications on the RabbitMQ cluster.

Lab Notes – vCloud Director 9.1 for Service Providers – Part 2: PostgreSQL Installation

| 10/07/2018 |


Installing PostgreSQL 10 Server

The base OS for the PostgreSQL server is CentOS7, deployed from the same template and with the same preparation as detailed in the prerequisites post.

Install PostgreSQL and configure

Add the correct repository for the base VM's OS and processor architecture – for my CentOS7 64-bit installation, I used the following command from the PostgreSQL web site:

rpm -Uvh

Install PostgreSQL server and client tools:

yum install -y postgresql10-server postgresql10

Change the default postgres user password

passwd postgres

Then initialise PostgreSQL

/usr/pgsql-10/bin/postgresql-10-setup initdb

Finally, start, enable and validate the service:

systemctl start postgresql-10

systemctl enable postgresql-10

systemctl status postgresql-10

Create the vCloud Director and vRO Database

To create a database for vCloud Director to use, switch to the postgres user and open the psql command line:

sudo -u postgres -i

psql
Then create the databases and users required – one for vCloud Director, and one for the vRealize Orchestrator cluster:

create user vcloud;
alter user vcloud password 'VMware1!';
alter role vcloud with login;
create database vcloud;
grant all privileges on database vcloud to vcloud;

create user vro;
alter user vro password 'VMware1!';
alter role vro with login;
create database vro;
grant all privileges on database vro to vro;

Quit psql with \q, then exit back to the root prompt.

Configure remote PostgreSQL access

In order to allow remote access from the vCloud Director Cells, and vRealize Orchestrator, we need to add some configuration to the PostgreSQL configuration files.

These two commands add a line to the pg_hba.conf file, allowing the user vcloud access to the database vcloud, and the user vro to access the database vro from the vcd-sql subnet. You could specify individual hosts to increase security, but I’m going to be using the NSX distributed firewall to secure these connections too, so the subnet will suffice.

echo "host vcloud vcloud md5" >> /var/lib/pgsql/10/data/pg_hba.conf

echo "host vro vro md5" >> /var/lib/pgsql/10/data/pg_hba.conf

By default, PostgreSQL listens only on its internal loopback address. To configure PostgreSQL to listen on all addresses, the following lines need to be added to the postgresql.conf file:

echo "listen_addresses = '*'" >> /var/lib/pgsql/10/data/postgresql.conf
echo "port = 5432" >> /var/lib/pgsql/10/data/postgresql.conf

Finally, open the host-based firewall to allow in-bound connections from the same two IP subnets:

firewall-cmd --permanent --zone=trusted --add-source=
firewall-cmd --permanent --zone=trusted --add-port=5432/tcp
firewall-cmd --reload

Restart PostgreSQL

systemctl restart postgresql-10

Configure PostgreSQL Performance Tuning

For production deployments, there are some recommended tuning settings specified in a VMware KB article. These settings are specifically tuned for the size of PostgreSQL server deployed in my lab, so I have implemented them.

Testing Remote Access

In order to validate the PostgreSQL configuration, database setup, network, and firewall configuration, connect to the PostgreSQL database from one of the vCloud Director cell VMs to ensure access.
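For example, using the psql client from a cell VM (the server hostname here follows my lab naming convention and is an assumption; the password is the one set earlier):

```shell
# Connect over the network as the vcloud user to the vcloud database
psql -h vcd-sql-1.definit.local -p 5432 -U vcloud -d vcloud -c "SELECT version();"
```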

vRealize Lifecycle Manager 1.2 VC data collection fails when NSX-T hostswitches are in use

| 18/04/2018 |

When vRealize Lifecycle Manager 1.2 was released recently, I was keen to get it installed in my lab, since I maintain several vRealize Automation deployments for development and testing, as well as performing upgrades. With vRLCM I can reduce the administrative overhead of managing the environments, as well as easily migrate content between environments (I’ll be blogging on some of these cool new features soon).

However, I hit a snag when I began to import my existing environment – I couldn’t get the vCenter data collection to run.

Data Collection Failed

Three Tier App for vRealize Automation

One question I’m asked quite a lot is what I use for a 3-tier application when I’m testing things like NSX micro-segmentation with vRealize Automation. The simple answer is that I used to make something up as I went along, deploying components by hand and generally repeating myself a lot. I had some cut/paste commands in my notes application that sped things up a little, but nothing well developed. I’ve been meaning to rectify this for a while, and this is the result!

A lot of this is based on the excellent blog posts published on the VMware HOL blog by Doug Baer. Doug wrote five parts on creating his application on Photon OS and they’re well worth a read (start at part 1, here). I have changed a few things for my vRA Three Tier App, and some things are the same:

  • I’m using CentOS7, as that’s what I see out in the wild with customers (RHEL7) and I am most familiar with
  • The app itself is the PHP MySQL CRUD Application from Tutorial Republic
  • The DB tier uses MariaDB (MySQL) not SQLite
  • The App tier is an Apache/PHP server
  • The Web tier is still NGINX as a reverse proxy
  • I am including NSX on-demand load balancers in my blueprint, but you don’t actually need them for single-VM tiers
  • Finally, I want to be able to deploy my 3-tier application using vRA Software Components (though you can also use startup scripts in the customisation spec)

Based on this, my final application will look something like the image below, with clients connecting to the NSX load balancer on HTTPS/443, multiple NGINX reverse proxy servers communicating with the NSX load balancer on HTTP/8080, which is in front of multiple Apache web servers running the PHP application, which all talk to the MySQL database back end over MySQL/3306.

Three Tier App
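As a sketch, the web-tier NGINX reverse proxy described above might look something like the following. The upstream hostname and listen port are hypothetical assumptions, on the basis that the app-tier NSX load balancer listens on HTTP/8080:

```nginx
# Web tier: accept client traffic and proxy it to the app-tier load balancer
server {
    listen 80;
    location / {
        proxy_pass http://app-lb.corp.local:8080;  # hypothetical app-tier LB VIP
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```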

When in use, the application looks like this:


NSX 6.x Network Communications Diagram

| 26/01/2018 |

There are a few NSX Communications network diagrams floating around, but none have really displayed the info in a way I found to be clear or complete enough. To that end, I have been working on a diagram that covers as much of the communications between NSX Components as I can. I’ve currently only covered single site NSX (not Cross vCenter) but I’ll publish an updated version soon including that.


vRealize Automation 7.3 and NSX – Micro-segmentation strategies

vRealize Automation and NSX integration has introduced the ability to deploy multi-tiered applications with network services included. The current integration also enables a method to deploy micro-segmentation out of the box, based on dynamic Security Group membership and the Service Composer. This method does have some limitations, and can be inflexible for the on-going management of deployed applications. It requires in-depth knowledge and understanding of NSX and the Distributed Firewall, as well as access to the Networking and Security manager that is hosted by vCenter Server.

For customers who have deployed a private cloud solution using vRealize Automation, an alternative is to develop a “Firewall-as-a-Service” approach, using automation to allow authorised end users to configure micro-segmentation. This can be highly flexible, and allow the delegation of firewall management to the application owners who have intimate knowledge of the application. There are disadvantages to this approach, including significantly increased effort to author and maintain the automation workflows.

This blog post describes two possible micro-segmentation strategies for vRealize Automation with NSX and compares the two approaches against a common set of requirements.

This post was written based on the following software versions

Software Component Version (Build)
vRealize Automation 7.3 (5604410)
NSX 6.3.5 (7119875) – 6.4
vSphere 6.5 Update 1d (7312210)
ESXi 6.5 Update 1 (5969303)

These are some generic considerations when deploying micro-segmentation with vRealize Automation.

  • An application blueprint is designed to be deployed multiple times from vRealize Automation, the automation shouldn’t break any micro-segmentation or firewall policy when that happens.
  • vRealize Automation blueprints can scale in and out – this should be accommodated within the micro-segmentation strategy to ensure that required micro-segmentation is the same as implemented micro-segmentation.
  • vRealize Automation is a shared platform, so the micro-segmentation of one deployment should be limited in scope, but should also consider intra-deployment communications between applications, for example, of the same business group or tenant.

Application XYZ requirements

For illustration purposes, an example 3-tier application deployment is shown below “Application XYZ“. It consists of a Web, App and DB tier and a load balancer for the Web and App tiers.

Application XYZ Allowed Flows
