Creating a Windows Server 2008 Microsoft Cluster Service SQL Active/Passive Cluster on a single ESXi 4.0 Server
Configuring the Virtual Environment and Virtual Machines
Note – this configuration only works on ESXi 4 and later, because Server 2008 MSCS requires persistent SCSI-3 reservations.
The first step is to create a new vSwitch for the host-only cluster heartbeat network. Don’t assign any network adaptors to the switch, as it’s going to be local only.
Create a new virtual machine with a single hard disk. For the purposes of this test, I’ve assigned 2 vProcessors, 1GB RAM, 30GB drive for the OS, 1 vNIC in the default vSwitch0.
Add a second vNIC and assign it to the cluster network vSwitch created in step 1.
Install Windows Server 2008 R2 Enterprise and all the Windows Updates. For this example I’ve named it SQLCluster01.
Clone the server and rename the new one to SQLCluster02. In ESXi you can’t clone, so shut down the first server, copy the files to a new folder and right click the VMX file to add it to the inventory. When you boot it the first time VMware will ask if it’s been moved or copied – select copied.
Create a disk for use as the Quorum. This needs to be shared, and since I’m using ESXi with local storage only it must be “eagerzeroedthick”. To do this I have to use the unsupported mode in ESXi (Alt+F1, type unsupported and then your root password) and use the vmkfstools command to create it (vmkfstools -c <size> -d eagerzeroedthick -a lsilogic /vmfs/volumes/<datastore>/<folder>/<disk>.vmdk).
Add the new disk to SQLCluster01 using a new SCSI virtual controller (different from the current controller, e.g. my first HD is on SCSI 0:1, the Quorum is on SCSI 1:0)
Check that the new SCSI controller is set to LSILogic (it is for Server 2008 by default) and set the SCSI Bus Sharing to Virtual.
Add the Quorum disk to the second virtual machine, using the same settings.
Edit the .vmx file for both servers, adding in the following lines (edit for your SCSI controller):
scsi1:0.mode = "independent-persistent"
scsi1:0.shared = "TRUE"
Create some shared storage for the cluster too. It will be needed for the DTC application as well as SQL Server. In a production environment you may want to separate logs and data, but for my test I’m just adding another two 10GB disks. Use the same process as for creating the Quorum disk.
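If you'd rather not hand-edit the .vmx files, the two shared-disk lines can be added with a small script. This is only a minimal sketch in Python, not part of the original procedure: the function name add_shared_disk_settings is made up for illustration, and it assumes the shared disk sits on scsi1:0 as in the example above. Work on a copy of the file with the VM powered off.

```python
def add_shared_disk_settings(vmx_text, controller=1, unit=0):
    """Return the .vmx text with the two shared-disk lines set for one disk.

    Hypothetical helper: it only rewrites text, it does not touch ESXi.
    """
    prefix = f"scsi{controller}:{unit}"
    wanted = {
        f"{prefix}.mode": "independent-persistent",
        f"{prefix}.shared": "TRUE",
    }
    kept = []
    for line in vmx_text.splitlines():
        key = line.split("=", 1)[0].strip()
        # Drop any existing value for the keys we are about to set.
        if key not in wanted:
            kept.append(line)
    kept.extend(f'{key} = "{value}"' for key, value in wanted.items())
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'scsi1.present = "TRUE"\nscsi1:0.mode = "persistent"\n'
    print(add_shared_disk_settings(sample), end="")
```

Pass a different controller or unit number if your Quorum disk is on another SCSI address.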
Syncing email, calendar and tasks over a laptop, desktop and iPhone
In the past, I would often say to my wife, “if it’s not in Outlook, it isn’t going to happen”. Increasingly it’s “if it’s not on my iPhone, it’s not going to happen”. The fact is that I can’t actually remember all the things that I need to do each day, I need reminding!
I spend perhaps 8 hours a day at my work PC, maybe 2 hours a day on my home laptop and my iPhone is with me pretty much 24/7 – all of which are both data sources, and data endpoints. They all remind me to do things. To add a bit more complication to the mix, some things are personal, some things are work related.
So, to summarise, I want email, calendaring and to-do/tasks on my desktop, laptop and iPhone, and I want to be able to add/edit/delete for any of them.
Step 1 – Email
My personal email is downloaded by POP3 to Gmail from my ISP’s (GoDaddy) email server. I use Outlook 2007 on both my Laptop and Desktop to connect via IMAP to my Gmail account. Both use my ISP’s SMTP server to send email. I also configured Gmail to send via GoDaddy’s SMTP server, which allows me to send from my personal address rather than my Gmail address. Email is accessible from my iPhone via the Exchange server protocol (Gmail Sync). Since all of these access email in the Gmail storage, when an email is deleted, moved or replied to on any platform, everything stays up to date.
Step 2 – Calendar
Once again Google is the central repository for the data, using Google Calendar Sync to synchronise my calendar on both my Laptop and Desktop Outlook. On my Desktop, Google Calendar Sync updates the corporate Exchange account. Again the iPhone calendar syncs over the Gmail Sync/Exchange protocol to Google directly.
Step 3 – To-do/Tasks
This one is the most difficult and I’ve not yet resolved it fully. Google do have a Tasks app, but it doesn’t have a sync tool. My corporate Exchange server has tasks, but I have no way of syncing it with my Laptop. At the moment I am using the Exchange tasks which is obviously sync’d with my Desktop Outlook. I’m also using a free app called IMLite on the iPhone to access the tasks on the Exchange, but it’s read only.
It’s easier to view a diagram!
Other things to note
- All the connections are over SSL, so they’re secure – that’s really important because it’s personal information and you don’t need just anyone getting it!
- I chose Gmail over other online hosts because of the storage (over 7.4GB and growing), because it hosts my calendar and tasks, and is easier to set up to SEND email from my SMTP server.
- I know Gmail is ad supported – but if you access via IMAP/Exchange protocols, you’ll never see them.
- I’d like to be using Google tasks and sync them with my Outlook, but as yet I’ve not found a way to do this (c’mon Google, release the app!)
- My iPhone is sync’d to my Laptop via iTunes, but only for media and contacts.
Finally, I’m looking at my options for photo sync (or online storage) but it’s got to be high res, I’m also looking at document sync, but I’m pretty sure Google has that nailed too. I much prefer having it all under one roof.
Any comments, ideas, suggestions, drop a comment below!
Cisco Qualified!
As is normally the case when I’m studying, I haven’t had time to post much on here lately. I’ve been studying to pass the ICND1 exam (snappily titled “Interconnecting Cisco Network Devices Part 1”)
I’m really pleased to say that neglecting this site paid off, or rather the study did – I passed with a score of 930! It was a LOT harder than I had expected, I thought I’d walk out after 20m! It does now mean that I am CCENT. I’ll be taking the ICND2 exam early in the new year which will move me up to CCNA.
Also in the exams category, I’m taking a beta exam “PRO: Design & Deploy Messaging Solutions with Microsoft Exchange Server 2010”. Another snappy title and another bundle of fun!
Sam
Migrating VMware Virtual Infrastructure 3 HA Cluster to vSphere 4 – Stage 1: vCenter Upgrade
I'm currently in the process of migrating a 2-host High Availability cluster of ESX 3.5u4 servers to vSphere 4. This is going to come in 3 distinct stages: Stage 1 is to upgrade VirtualCenter Server 2.5 to vCenter 4, which I am going to cover today. Stage 2 is to upgrade each host, and will be covered as I do it. Stage 3 is the upgrade of the Virtual Machines to the latest VMware Tools and then the new VM hardware.
So to start, I'll outline the process:
- Download the vSphere vCenter 4 installer from VMware (~1.8GB).
- Download your updated licensing for vSphere.
- Back up your VirtualCenter server.
- Run the installation.
I'm not going to run through the download of the installer or licensing; if you're not sure how to do that, it's probably best not to attempt the rest.
Backing up VirtualCenter Server
My VirtualCenter server is installed on a Virtual Machine, so this makes things a lot simpler – I'll just take a snapshot to start. Being a belt-and-braces kind of situation (live HA cluster), I'm also going to do the database and configuration backup too.
Databases - I'm using SQL Server 2005 Express, which is supported for vSphere vCenter, so there will be no database upgrade; however, the schema will be changed. First off, I've connected to SQL with SQL Management Studio and run a full backup. As I have VMware Update Manager installed too, I'm backing up that database as well.
Configuration file – Make a copy of your vpxd.cfg file, which is stored in the C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter folder.
SSL Certificates – In the same folder as the vpxd.cfg file there's a folder called SSL, which you'll want to back up too.
If you're not using integrated authentication for the database access, you need to ensure you have the user name and password for the DB access.
Once all that's gathered together and safely backed up, you can move on to the installation.
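The file copies above can be scripted if you prefer. The sketch below is an illustration only and assumes Python is available on the VirtualCenter server, which is not something the original setup relied on. The names vpxd.cfg and SSL come from the steps above; the function name and the backup destination are hypothetical. It does not back up the databases.

```python
import shutil
from pathlib import Path


def backup_vcenter_config(config_dir, backup_dir):
    """Copy vpxd.cfg and the SSL folder from config_dir into backup_dir."""
    config_dir = Path(config_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the file timestamps, which is handy when comparing later.
    shutil.copy2(config_dir / "vpxd.cfg", backup_dir / "vpxd.cfg")
    # copytree raises an error if backup_dir/SSL already exists, so an
    # earlier backup is never silently overwritten.
    shutil.copytree(config_dir / "SSL", backup_dir / "SSL")
    return backup_dir
```

On the server described above it would be called with the VMware VirtualCenter folder as config_dir and any empty folder of your choosing as backup_dir.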
Installing vSphere vCenter
Open services.msc and stop the VMware VirtualCenter Server service.
Insert your vCenter installation CD, the installer pops up:
Notice it's detected the earlier version of vCenter server and is going to upgrade.
Enter DB user details, or leave blank if you're using integrated authentication like me.
If you have any plug-ins installed (e.g. VirtualCenter Update Manager, or Converter) it will let you know that they need to be up to date too.
Select to upgrade the vCenter Server database, and tick that you've backed up the database and SSL folders.
Select the account that you want to use to run the vCenter Server service.
Configure some ports, I've left them as defaults.
Finally, install.
It will run a DB upgrade, and various other uninstall/upgrades.
At this point I sat and waited…and waited…and waited. SQL server was chewing 70-80% processor, it was progressing, just slowly.
Eventually, it finished and the server settled down. I ran through the upgrade of Update Manager and Converter Enterprise, all click and go.
Stage 1 complete!
MOSS 2007 – Alternate Access Mapping authentication fails
If you have an Alternate Access Mapping configured for a MOSS 2007 site with Integrated Authentication, you might find that you get prompted for the DOMAIN\UserName and Password. After 3 attempts you get an HTTP 401 error.
This can be resolved by following the steps in MS KB 896861
HTH,
Sam
The requested Storage VMotion would move a virtual machine’s disks without assigning the virtual machine a new home, but such a move is not supported on the source host
I'm migrating some hosts off an older storage LUN, but when I drag the disk to the new Datastore with the SVMotion plug-in the job fails with the error quoted in the title of this post.
The error occurs because the virtual disk cannot be moved without moving the source files, the .vmx, .vswap etc. Simply drag the entire VM, rather than the virtual disk to the new Datastore.
If you're trying to move a 2nd, 3rd or nth disk and you get this error, drag the entire VM as per above over to the new Datastore, once that's completed, go back in to SVMotion and drag the whole VM across again, only this time before you apply, drag the nth disk back to the new Datastore.
ESX 3.5 snapshots of disks on different storage are stored with the VM files
A.K.A Why not to use snapshots
I ran into a slightly confusing problem today - our SQL servers are all created with 4 disks on 4 separate LUNs (System, Swap, SQL Data and SQL Logs). When viewing the server through Virtual Center I couldn't see all of the LUNs, just the System LUN. It's not a major problem as the VM can see the storage, but a little annoying when you have to remember what LUN the disks are on.
Slightly more distressing was the fact that the System-LUN was running out of space - fast. A LUN that should have had about 150GB free was running dangerously low. On investigation I found various snapshot files were being stored on the System-LUN, which is where the VM's VMX, vswap etc. are situated. These were the snapshot delta files of the additional disks, which were on other storage! This isn't apparent at first as the disk snapshots have been named sequentially by ESX, so a VM with 4 disks on separate LUNs will in fact create 4 snapshot files on the SYSTEM-LUN named VM01-00001.vmdk, VM01-00002.vmdk, VM01-00003.vmdk and VM01-00004.vmdk. 00001 is for the System disk, 00002 is for the Swap disk, and so on. This means that the IO on that LUN has been multiplied, and the storage space is shrinking very rapidly.
A little more digging and it seems that this is by design - snapshots are not meant to be kept for very long, and I think VMware made a deliberate decision to make it difficult to do so. Any virtual disks created for a VM, let's call it VM01, were named VM01.vmdk. When additional virtual disks were created through vCenter on a different LUN, they were still named VM01.vmdk - there's no conflict because they're in different locations. However, when vCenter takes a snapshot it places the delta files with the original disk, and because each has the same name as the existing disk it starts to enumerate them.
This is bad for a number of reasons - most prominent of which is that if the snapshot file grows large, vCenter does not handle the commit well. In fact, neither does ESX, but I'll get to that. vCenter will time out on any operation that takes more than 15 minutes, so a commit of a 10GB snapshot will look to all intents and purposes in vCenter like it's failed. On top of that, the enumeration of snapshot delta files can cause confusion as to which disk each one actually belongs to, and if that happens, committing
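To spot this quickly, you can filter a directory listing for names that follow the sequential pattern described above. This is a rough sketch in Python, not a VMware tool: the function name find_snapshot_disks is made up, and it assumes snapshot disks are named <VM>-<number>.vmdk as in the example filenames. A VM whose own name ends in a hyphen and digits would be matched by mistake.

```python
import re

# Matches names such as VM01-00001.vmdk but not VM01.vmdk or VM01-flat.vmdk.
SNAPSHOT_DISK = re.compile(r"^(?P<vm>.+)-(?P<index>\d+)\.vmdk$")


def find_snapshot_disks(filenames):
    """Return the snapshot-style .vmdk names, ordered by their number."""
    found = []
    for name in filenames:
        match = SNAPSHOT_DISK.match(name)
        if match:
            found.append((int(match.group("index")), name))
    return [name for _index, name in sorted(found)]


if __name__ == "__main__":
    listing = ["VM01.vmdk", "VM01.vmx", "VM01-00002.vmdk", "VM01-00001.vmdk"]
    for name in find_snapshot_disks(listing):
        print(name)
```

Feed it the output of a directory listing of the folder that holds the VMX file; anything it returns is a candidate for cleaning up.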
We all know snapshots are performance killers, but the functionality they provide is not insignificant, and as with most things a balance has to be struck between the functionality and the performance.
So the headlines
- VMs created with disks on multiple LUNs in vCenter use the SAME DISK NAME (e.g. for VM01 the disks were created as /vmfs/volumes/SYSTEM-LUN/VM01.vmdk, /vmfs/volumes/SWAP-LUN/VM01.vmdk and so on).
- Mitigate this by creating disks using the vmkfstools and adding them to the VM or renaming the existing disks (see below).
- Snapshots cause ALL disk delta files to be written to the "system" LUN (i.e. where your VMX file is stored). This is bad because a) it multiplies your I/O on that LUN and b) you negate the benefits of storing disks on multiple LUNs.
- Mitigate this by deleting your snapshots. There's no other way*, don't try manually moving them or you will have problems.
- Committing large snapshots takes time - LOTS of time - and can have a big performance hit on your server.
- Mitigate this by shutting down your VM first and committing the disk using vmware-cmd out of business hours. You can also merge the old disk and snapshots into a "new" disk, then shut down the VM and boot with the "new" disk.
- vCenter has a hard coded 15m timeout.
- If you are doing an operation that will take longer than that, do it via the console!
* when I say there's no other way, I mean, there's no other practical way. There are methods to move the snapshot files to another LUN but they bring some serious problems with them.
Create a vmdk (virtual disk) using vmkfstools
- Log in to your server console.
- Type su - (to log in as root, enter root password, note the "-" to load the root user environment variables)
- Navigate to the storage that you wish to use. E.g. cd /vmfs/volumes/System-LUN/
- Create a new folder for the virtual disk: mkdir VM01
- Navigate to the folder: cd VM01
- Create the disk: vmkfstools -c <size> -a <buslogic|lsilogic> <filename>.vmdk
- For help just type vmkfstools
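When several disks need creating, it can help to generate the command lines on a workstation and paste them into the console. The sketch below, in Python, only builds the argument list and runs nothing. The function name and the example path are hypothetical; the -c, -a and -d options are the ones used in the commands above.

```python
def vmkfstools_create_args(path, size, adapter="lsilogic", disk_format=None):
    """Build the argument list for creating a virtual disk with vmkfstools."""
    if adapter not in ("buslogic", "lsilogic"):
        raise ValueError(f"unsupported adapter type: {adapter}")
    args = ["vmkfstools", "-c", size, "-a", adapter]
    if disk_format:
        # For example "eagerzeroedthick" for a disk shared by a cluster.
        args += ["-d", disk_format]
    # The path to the new .vmdk goes last.
    args.append(path)
    return args


if __name__ == "__main__":
    # Hypothetical disk name on the System-LUN used in the steps above.
    print(" ".join(vmkfstools_create_args(
        "/vmfs/volumes/System-LUN/VM01/VM01_data.vmdk", "10g")))
```

Giving each disk a distinct file name here avoids the duplicate VM01.vmdk names that cause the snapshot enumeration problem.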
Rename vmdk files using vmkfstools
- Shut down the VM in vCenter.
- Edit the VM settings and remove the disk you wish to change. Do not delete the file!
- Log in to your server console.
- Type su - (to log in as root, enter root password, note the "-" to load the root user environment variables)
- Use the command: vmkfstools -E /vmfs/volumes/<LUN>/<Server Name>/<Disk Name>.vmdk /vmfs/volumes/<LUN>/<Server Name>/<New Disk Name>.vmdk
- Go back to vCenter and re-add the disk, using the new name.
Commit your snapshots using vmware-cmd
- Log in to your server console.
- Type su - (to log in as root, enter root password, note the "-" to load the root user environment variables).
- Use the vmware-cmd -l command to list your VMs. Note the path to the VM you want to deal with.
- Remove all snapshots for a VM: vmware-cmd /path/to/vm/VM01.vmx removesnapshots
Teaming NICs with ESX 3.5 and Cisco Switches in an aggregate.
Here's the setup. We have a core switch of 2 Cisco 3750s, connected together for fault tolerance as a single logical switch; we also have several ESX 3.5 hosts with 4 Gigabit Ethernet NICs installed each. The Virtual Machines will all be on VLAN 8 (reserved for internal servers) and the VMKernel will be on VLAN 107 (reserved for VMKernel traffic like VMotion). I want to create a load balanced, fault tolerant aggregate of these four NICs over the Core Switch.
Configure ESX server's vSwitch
Configuring the vSwitch is actually pretty simple, but there are a couple of gotchas, so don't skip this bit! First thing to note is that if you are making changes to the vSwitch and the Service Console is on that vSwitch you can quite easily lock yourself out. Make sure you configure this correctly, first time! In this setup, I am adding all 4 NICs to vSwitch0, which will be the only vSwitch. I'll then use Port Groups to assign VLANs and Active/Passive configurations to the VMKernel/Service Console.
First things first then - assign the four NICs to the vSwitch. This is done in the Configuration Tab in VMware Infrastructure Client, then the Networking page. Edit the properties of your vSwitch, then select the Network Adaptor tab. Add all the NICs you wish to team in there (they may already be in there, depending on your setup). You should end up with something that looks like this (note that I've not assigned any VLAN yet):

Now you need to configure the NIC teaming, so edit the vSwitch Properties and under the Ports tab select the vSwitch. Click edit, and then go to the NIC teaming tab. Configure the teaming options like this:

That's the easy part over and done with! Time to move onto the Cisco!
Configuring the Cisco Core Switch
Firstly, we need to log on to the switch and enter enable mode; I'm going to assume you know how to do this - if not, you really shouldn't be attempting this setup!
Determine the switch's trunk load balancing setup by using the command "show etherchannel load-balance". It should look something like this:

If the protocol is NOT src-dst-ip, then you won't be able to establish a trunk connection with the ESX server. If your protocol is not src-dst-ip, change it with the command "port-channel load-balance src-dst-ip". This now matches the "Route based on IP hash" setting you configured in ESX. Although ESX has a setting for MAC based hashing, as does the Cisco, I was unable to get it to work.
Moving on. You need to create a Port-Channel interface for the trunk (this is a virtual interface that binds the 4 GigabitEthernet interfaces together). As I've got other Port-channels in use for connections to other switches, I'm setting up port-channel 40. Move to config mode (conf t) and then enter the setup:
interface Port-channel40
description VMTEST01 Aggregate
switchport trunk encapsulation dot1q
switchport trunk native vlan 8
switchport mode trunk
switchport nonegotiate
spanning-tree portfast trunk
end
Description simply adds a description. "switchport trunk encapsulation dot1q" sets the encapsulation of the trunk to 802.1Q. "switchport trunk native vlan 8" means that any traffic without a VLAN tag will be automatically assigned to VLAN 8. "switchport mode trunk" designates that we want a trunk, rather than an access port. "switchport nonegotiate" stops the port sending DTP negotiation frames, so it stays a static trunk (LACP and PAgP negotiation for the aggregate itself is controlled separately, by the channel-group mode on the member ports). "spanning-tree portfast trunk" causes a trunk port to enter the forwarding state immediately, bypassing the listening and learning states (i.e. if the link goes down and then comes back up, it will do so quickly).
With the Port-channel configured, you now need to edit your GigabitEthernet ports and assign them to the Port-channel. For each port in the trunk, enter the following config (this example is port 8 on the master switch in my stack, hence 1/0/8):
interface GigabitEthernet1/0/8
description VMTEST01 VMNIC1
switchport trunk encapsulation dot1q
switchport trunk native vlan 8
switchport mode trunk
switchport nonegotiate
channel-group 40 mode on
spanning-tree portfast trunk
end
The difference between that and the Port-channel setup? "channel-group 40 mode on" simply assigns the port to port-channel 40 in static mode, with no LACP or PAgP negotiation.
Once all four NICs are assigned you might have to wait a few minutes for every layer of the connection to settle down before the trunk comes up. To check the status of the etherchannel you can use the command "show etherchannel 40 summary", replacing 40 with whichever number you assigned to your port-channel.
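Since the same block has to be entered for every member port, generating it can save some typing and typos. This is a minimal sketch in Python that only produces text for pasting into config mode. The function name is made up, and the second to fourth interface names in the usage example are hypothetical; only GigabitEthernet1/0/8 comes from the setup above.

```python
def member_port_config(interface, description, channel_group=40, native_vlan=8):
    """Return the config lines for one member port of the static aggregate."""
    return "\n".join([
        f"interface {interface}",
        f"description {description}",
        "switchport trunk encapsulation dot1q",
        f"switchport trunk native vlan {native_vlan}",
        "switchport mode trunk",
        "switchport nonegotiate",
        f"channel-group {channel_group} mode on",
        "spanning-tree portfast trunk",
    ])


if __name__ == "__main__":
    # Only the first port is from the original setup; the rest are examples.
    ports = [
        ("GigabitEthernet1/0/8", "VMTEST01 VMNIC1"),
        ("GigabitEthernet1/0/9", "VMTEST01 VMNIC2"),
        ("GigabitEthernet2/0/8", "VMTEST01 VMNIC3"),
        ("GigabitEthernet2/0/9", "VMTEST01 VMNIC4"),
    ]
    print("\n".join(member_port_config(name, desc) for name, desc in ports))
    # A single "end" leaves config mode after the last interface.
    print("end")
```

The generated stanzas leave out the per-interface "end", because "end" drops you out of config mode and would stop the remaining interfaces being applied when pasted in one go.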
I hope this helps navigate the minefield that I found to be setting up the NIC teaming!
DCDIAG /TEST:DNS fails with errors regarding root hint servers
I recently resolved an ongoing DNS issue where the Active Directory Integrated DNS was loaded in both the Domain and the DomainDNSZones partition of AD - this is a separate issue and should be resolved differently. My problem came when I tried to verify that the fixed DNS setup had propagated around my domain controllers, DC01 and DC02. DC01 kept failing "DCDIAG /TEST:DNS" with errors regarding the root hint servers. Googling about, it was clear that a lot of people were suffering the same issue, but no article I read had correctly identified the solution.
The error looked something like this:
P:\>dcdiag /test:dns
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: SITE\DC01
Starting test: Connectivity
......................... DC01 passed test Connectivity
Doing primary tests
Testing server: SITE\DC01
DNS Tests are running and not hung. Please wait a few minutes...
Running partition tests on : ForestDnsZones
Running partition tests on : DomainDnsZones
Running partition tests on : Schema
Running partition tests on : Configuration
Running partition tests on : DOMAIN
Running enterprise tests on : DOMAIN.com
Starting test: DNS
Test results for domain controllers:
DC: DC01.DOMAIN.COM
Domain: DOMAIN.com
TEST: Forwarders/Root hints (Forw)
Error: Root hints list has invalid root hint server: a.root-servers.net. (198.41.0.4)
Error: Root hints list has invalid root hint server: b.root-servers.net. (192.228.79.201)
Error: Root hints list has invalid root hint server: c.root-servers.net. (192.33.4.12)
Error: Root hints list has invalid root hint server: d.root-servers.net. (128.8.10.90)
Error: Root hints list has invalid root hint server: e.root-servers.net. (192.203.230.10)
Error: Root hints list has invalid root hint server: f.root-servers.net. (192.5.5.241)
Error: Root hints list has invalid root hint server: g.root-servers.net. (192.112.36.4)
Error: Root hints list has invalid root hint server: h.root-servers.net. (128.63.2.53)
Error: Root hints list has invalid root hint server: i.root-servers.net. (192.36.148.17)
Error: Root hints list has invalid root hint server: j.root-servers.net. (192.58.128.30)
Error: Root hints list has invalid root hint server: k.root-servers.net. (193.0.14.129)
TEST: Dynamic update (Dyn)
Warning: Dynamic update is enabled on the zone but not secure DOMAIN.com.
Summary of test results for DNS servers used by the above domain controllers:
DNS server: 128.63.2.53 (h.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 128.63.2.53
DNS server: 128.8.10.90 (d.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 128.8.10.90
DNS server: 192.112.36.4 (g.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.112.36.4
DNS server: 192.203.230.10 (e.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.203.230.10
DNS server: 192.228.79.201 (b.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.228.79.201
DNS server: 192.33.4.12 (c.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.33.4.12
DNS server: 192.36.148.17 (i.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.36.148.17
DNS server: 192.5.5.241 (f.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.5.5.241
DNS server: 192.58.128.30 (j.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 192.58.128.30
DNS server: 193.0.14.129 (k.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 193.0.14.129
DNS server: 198.41.0.4 (a.root-servers.net.)
1 test failure on this DNS server
This is not a valid DNS server. PTR record query for the 1.0.0.127.in-addr.arpa. failed on the DNS server 198.41.0.4
Summary of DNS test results:
Auth Basc Forw Del Dyn RReg Ext
________________________________________________________________
Domain: DOMAIN.com
DC01 PASS PASS FAIL PASS WARN PASS n/a
......................... DOMAIN.com failed test DNS
It looks pretty horrific - DNS is failing at a basic level! It turns out that the actual issue is an old version of DCDIAG.EXE. After several hours and a lot of head scratching I checked the versions of the DCDIAG.EXE (normally c:\Program Files\Support Tools\dcdiag.exe) and "Lo! And Behold!" the version was different. I downloaded the Windows Server 2003 Support Tools R2, uninstalled the old version (v5.2.3790.1800) and installed the new one (v5.2.3790.3959).
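If you have several domain controllers to check, comparing the version strings by eye gets tedious. The snippet below is a small Python sketch of a dotted-version comparison, using the two version numbers quoted above. The function names are made up, and treating 5.2.3790.3959 as the minimum good version is an assumption based only on it having fixed the problem here.

```python
def parse_version(text):
    """Turn a string such as 'v5.2.3790.1800' into a tuple of integers."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


def is_outdated(installed, known_good="5.2.3790.3959"):
    """Return True if the installed version is older than the known good one."""
    # Tuples compare element by element, so 1800 < 3959 decides the result.
    return parse_version(installed) < parse_version(known_good)


if __name__ == "__main__":
    for version in ("v5.2.3790.1800", "v5.2.3790.3959"):
        print(version, "outdated" if is_outdated(version) else "ok")
```

Comparing integer tuples rather than raw strings avoids mistakes where, for example, "10" sorts before "9" as text.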
Et voila! The working test...
P:\>dcdiag /test:dns
Domain Controller Diagnosis
Performing initial setup:
Done gathering initial info.
Doing initial required tests
Testing server: SITE\DC01
Starting test: Connectivity
......................... DC01 passed test Connectivity
Doing primary tests
Testing server: SITE\DC01
DNS Tests are running and not hung. Please wait a few minutes...
Running partition tests on : ForestDnsZones
Running partition tests on : DomainDnsZones
Running partition tests on : Schema
Running partition tests on : Configuration
Running partition tests on : DOMAIN
Running enterprise tests on : DOMAIN.com
Starting test: DNS
Test results for domain controllers:
DC: DC01.DOMAIN.COM
Domain: DOMAIN.com
TEST: Dynamic update (Dyn)
Warning: Dynamic update is enabled on the zone but not secure DOMAIN.com.
Summary of DNS test results:
Auth Basc Forw Del Dyn RReg Ext
________________________________________________________________
Domain: DOMAIN.com
DC01 PASS PASS PASS PASS WARN PASS n/a
......................... DOMAIN.com passed test DNS
Multi-homed Domain controller logs Event ID 1030 and 1058
I recently had an issue where a hosting environment was logging a lot of Userenv Event 1030/1058 errors, with machines unable to find the Group Policy objects or download them. In this example, the server DC is the domain controller for DOMAIN.LCL.
Event Type: Error
Event Source: Userenv
Event Category: None
Event ID: 1030
Date: 10/09/2009
Time: 06:24:29
User: NT AUTHORITY\SYSTEM
Computer: DC
Description:
Windows cannot query for the list of Group Policy objects. Check the event log for possible messages previously logged by the policy engine that describes the reason for this. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Event Type: Error
Event Source: Userenv
Event Category: None
Event ID: 1058
Date: 10/09/2009
Time: 06:24:29
User: NT AUTHORITY\SYSTEM
Computer: DC
Description:
Windows cannot access the file gpt.ini for GPO CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=System,DC=DOMAIN,DC=LCL. The file must be present at the location <\\DOMAIN.LCL\sysvol\DOMAIN.LCL\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\gpt.ini>. (Windows cannot find the network path. Verify that the network path is correct and the destination computer is not busy or turned off. If Windows still cannot find the network path, contact your network administrator. ). Group Policy processing aborted. For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
On the affected machines, when navigating to \\DOMAIN.LCL there were no shares available, however navigating to \\DC showed the NETLOGON and SYSVOL shares. Pinging DOMAIN.LCL and then DC showed that the two names did not resolve to the same IP address, as I had expected them to: DOMAIN.LCL was resolving to the backup network, whereas DC was resolving to the server's LAN IP.
I checked the DNS records for the server, which were correct. Investigating the adaptor binding settings under Control Panel > Network Connections > Advanced > Advanced Settings showed that the backup network's adaptor was first in the list. I moved the adaptor for the LAN to the top of the list and OK'd my way out. I restarted the NETLOGON service and the issue was solved.
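The ping comparison can be turned into a quick check that flags any address the domain name resolves to that isn't a LAN address you expect. This is a hedged sketch in Python, not something used in the original troubleshooting: the function name and the IP addresses in the usage example are hypothetical, and it relies only on the standard library's socket.gethostbyname_ex.

```python
import socket


def unexpected_addresses(domain, expected_lan_ips, lookup=socket.gethostbyname_ex):
    """Return addresses the domain resolves to that are not expected LAN IPs."""
    # gethostbyname_ex returns (hostname, alias list, IP address list).
    _name, _aliases, addresses = lookup(domain)
    return sorted(set(addresses) - set(expected_lan_ips))


if __name__ == "__main__":
    # Stand-in for a real lookup so the example runs anywhere; the
    # addresses are invented for illustration.
    def fake_lookup(_domain):
        return ("domain.lcl", [], ["192.168.1.10", "10.0.0.10"])

    print(unexpected_addresses("DOMAIN.LCL", ["192.168.1.10"], fake_lookup))
```

Leave out the lookup argument to query real DNS from an affected machine; an empty list means the domain name only resolves to the addresses you listed.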
Windows servers have never been particularly good at being multi-homed, especially domain controllers. My advice comes from some bitter experience...
- If you have multiple network adaptors for extra bandwidth/redundancy/resilience, then I would strongly recommend using teamed adaptors; most of the major manufacturers' drivers and management software support it. This will eliminate any issues with multi-homing because, as far as the server is concerned, it has one adaptor.
- If you have multiple network adaptors for different network segments and you're using RRAS to route between them, I would strongly suggest not using a Domain Controller at all for this purpose. Better yet, buy a hardware router.
- If you have multiple network adaptors for different purpose networks (e.g. a LAN, a backup network and an iSCSI network) then make sure you do the following:
  - Disable "File and Printer Sharing for Microsoft Networks" and "Client for Microsoft Networks" on all but the LAN adaptor.
  - Ensure that your LAN adaptor is the FIRST adaptor in the bindings in the advanced network settings.
Hope that helps!
