DefinIT

SCOM 2007 R2: Daily Health Check Script v2

A couple of months ago I posted the first version of my SCOM 2007 R2 Daily Health Check Script. Here is version 2. It’s more than a little motivated by some friendly competition with a Microsoft PFE for SCOM, and hopefully you’ll agree it’s a big improvement on the last version.

Updated for this version

  • Formatting changed to make it more readable and more compatible
  • Added “Report generated on <server>” to the top of the report
  • Management Server states reported as one section
  • Default MP check moved to beneath the Management servers
  • Agents in pending states moved to be with the Agent health states
  • Clarified “Unresponsive Agents” and “Agents reporting errors”
  • Management server alerts streamlined
  • Added top 10 alerts for the last 7 days, and added top alerters for each

SCOM 2007 R2: Daily Health Check Script

An updated version of this script has been released: https://www.definit.co.uk/2012/05/scom-2007-r2-daily-health-check-script-v2/

I’ve been working with a Microsoft SCOM PFE (Premier Field Engineer) for the last few months, and part of the engagement is an environment health check for the SCOM setup. Based on this, Microsoft recommends a series of health checks for the environment that should be carried out every day, summarised as follows:

  1. Check the health of all Management Servers and Gateways
  2. Check the RMS is not in maintenance mode
  3. Review Outstanding Alerts
  4. Review Agent Health Status
  5. Review Backup Status
  6. Review any Management Group Alerts
  7. Review the Pending Management status
  8. Review Database Sizes (Operations, Data warehouse, ACS)
  9. Review Volume of Alerts
  10. Review Alert Latency
  11. Document any changes 

Overriding the OpsMgr Exchange 2007 Test MAPI Connectivity Monitor for Recovery Storage Groups

05/01/2012

The Test MAPI Connectivity monitor in the Exchange 2007 management pack will automatically generate a critical error for any Recovery Storage Groups you have on monitored Exchange Mailbox Roles. As these are generally temporary Storage Groups created for a recovery and then removed, you don’t want an alert, but manually adding an override every time is not a great use of your time either.

Trouble with SCOM 2007 R2 Certificates? Validate the entire PKI path!

I learned something new today: SCOM 2007 R2 certificate-based communication checks not only the validity of the certificate you use, but also that of the CA that issued it. Let me expand:

As in many organisations, there is a root CA (we’ll call it ROOTCA01) and a subordinate CA (we’ll call that SUBCA01). OPSMGM01 has a certificate to identify itself and has certificates for ROOTCA01 and SUBCA01 in its Trusted Root Certification Authorities.

The certificate that secures the connection between the OpsMgr Gateway (OPSGW01) and the OpsMgr Management Server (OPSMGM01) is issued by SUBCA01 and is installed on OPSGW01. To validate the certificate chain, SUBCA01’s certificate is also installed in the Trusted Root Certification Authorities. Opening OPSGW01’s certificate and examining the Certificate Path tab shows the certificate is valid all the way up to the issuing CA, SUBCA01.

The connection will not work – OPSGW01 logs the following events:

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:28
Event ID:      21016
Level:         Error
Computer:      opsgw01.definit.co.uk
Description:   OpsMgr was unable to set up a communications channel to opsmgm01.definit.co.uk and there are no failover hosts.  Communication will resume when opsmgm01.definit.co.uk is available and communication from this computer is allowed.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:25
Event ID:      20070
Level:         Error
Computer:      opsgw01.definit.co.uk
Description:   The OpsMgr Connector connected to opsmgm01.definit.co.uk, but the connection was closed immediately after authentication occurred.  The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration.  Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:24
Event ID:      21002
Level:         Warning
Computer:      opsgw01.definit.co.uk
Description:   The OpsMgr Connector could not accept a connection from xxx.xxx.xxx.xxx:5723 because mutual authentication failed.

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          05/01/2012 10:18:24
Event ID:      20067
Level:         Warning
Computer:      opsgw01.definit.co.uk
Description:   A device at IP xxx.xxx.xxx.xxx:5723 attempted to connect but the certificate presented by the device was invalid.  The connection from the device has been rejected.  The failure code on the certificate was 0x800B0109 (A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider.).

It’s the last event that led me to check the certificate chain for the SUBCA01 certificate, which was installed and trusted but did not validate up the chain to ROOTCA01. Installing the ROOTCA01 certificate resolved this issue.
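One way to catch this before SCOM does is to build the chain in PowerShell and check that it ends in a self-signed root rather than stopping at an intermediate CA. This is a minimal sketch, not part of the original troubleshooting: the function names `Test-IsSelfSigned` and `Test-ChainReachesRoot` are hypothetical, and it assumes that comparing Subject and Issuer is a good enough test for a root certificate.

```powershell
# Hypothetical helper: a root CA certificate is self-signed,
# so its Subject and Issuer are the same.
function Test-IsSelfSigned {
    param($Certificate)
    return ($Certificate.Subject -eq $Certificate.Issuer)
}

# Hypothetical helper: build the chain for a certificate and report
# whether the top of the chain is a self-signed root.
function Test-ChainReachesRoot {
    param([System.Security.Cryptography.X509Certificates.X509Certificate2] $Certificate)

    $chain = New-Object System.Security.Cryptography.X509Certificates.X509Chain
    # Build() returns $false for an untrusted chain; the elements it
    # found are still available for inspection afterwards.
    [void] $chain.Build($Certificate)

    $elements = @($chain.ChainElements)
    if ($elements.Count -eq 0) { return $false }

    # Element 0 is the certificate itself; the last element is the top of the chain.
    $top = $elements[$elements.Count - 1].Certificate
    return (Test-IsSelfSigned $top)
}

# Example: list every certificate in the computer's Personal store
# whose chain stops short of a root CA.
# Get-ChildItem Cert:\LocalMachine\My |
#     Where-Object { -not (Test-ChainReachesRoot $_) } |
#     Select-Object Subject, Issuer, Thumbprint
```

Run on the gateway, a check like this should flag a certificate whose chain ends at SUBCA01, although how Windows builds the chain depends on which certificates are in the local stores.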

SCOM 2007 DFS Backlog Monitoring – Distributing a RunAs account to only DFS replication members

The DFS monitoring tool in SCOM 2007 has some great features, which will replace many a custom VB script running in enterprises. As with a lot of Management Packs, to get the most out of it you need to have a dedicated RunAs account with local admin permissions on the servers you are monitoring (e.g. for the Backlogged Files reporting).

The easy (and wrong) option here is to go with the less secure option and distribute a RunAs account to ALL servers. There are lots of reasons why you wouldn’t want to distribute the credentials to every server in your SCOM installation – but just from a security standpoint, you shouldn’t do it! Selecting the “More Secure” option and distributing credentials only to servers which will require them is a much safer bet.

You can view the members of the DFS discovered inventory in the SCOM Console by going to the “Discovered Inventory” view and changing the target type to “Replication Member” – which is great: you can see all the Servers involved in the DFS replication topology. But there’s no easy way to add these to a RunAs credential to distribute.

To narrow it down to a short list, you can open an Operations Manager Shell prompt and list any monitoring classes which have “DFS” in the name (there are about 6 or so):

Get-MonitoringClass | where {$_.Name -match "DFS"}

The one that matches my SCOM console view is “Microsoft.Windows.DfsReplication.ReplicationGroupMember” so I want to select all the monitoring-objects that match this discovery and export the “Path” (server name) to a csv file:

Get-MonitoringClass | where {$_.Name -match "Microsoft.Windows.DfsReplication.ReplicationGroupMember"} | get-monitoringobject | select-object Path | export-csv c:\DFS-Members.csv

I’ve not yet figured out how to add these to the RunAs account credential distribution via PowerShell, so I’m afraid it’s a manual process from here. To make it easier I opened the csv in Excel and filtered out duplicates (for servers with multiple DFS shares) before pasting the servers in individually to the distribution dialogue.
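The Excel step can also be done in the shell. This is a small sketch rather than part of the original process: `Get-UniqueDfsMember` is a hypothetical helper name, and it assumes the CSV has the single `Path` column produced by the export above.

```powershell
# Hypothetical helper: reduce exported rows to a sorted list of
# unique server names (Sort-Object -Unique ignores case by default).
function Get-UniqueDfsMember {
    param([object[]] $Rows)
    $Rows |
        Select-Object -ExpandProperty Path |
        Sort-Object -Unique
}

# Example, using the file exported above:
# Get-UniqueDfsMember -Rows (Import-Csv C:\DFS-Members.csv)
```

The output is one server name per line, ready to paste into the distribution dialogue.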

Once the RunAs account has been downloaded by the Agents, and if you’ve added it correctly to your “DFS Replication Monitoring Account” profile, you should start to see the Backlog Monitoring view beginning to populate.

Remote Installation of SCOM 2007 R2 Agent on Threat Management Gateway Servers

Getting a SCOM 2007 R2 agent onto TMG is a useful way of monitoring it, especially with the SCOM TMG Management Pack. It’s not exactly “out-of-the-box” functionality though: many sources I’ve read simply state that it can’t be done. There are some half-working solutions I’ve seen, but nothing that worked for me.

The process simply involves opening the correct ports and protocols between the TMG servers and the SCOM management servers, which I found after a few attempts at watching the live logs.

Using System Center Operations Manager 2007 R2 Audit Collection Services for remote, DMZ or workgroup servers

SCOM 2007 R2’s Audit Collection Services (ACS from now on) is very useful for meeting compliance (e.g. Sarbanes-Oxley) and security audit requirements, and working with financial companies often requires such compliance. It’s pretty simple to install in a domain environment: you run the installer to create a collection server, then activate the forwarder on the client servers.

When it comes to servers you really want to audit, those that are by definition more at risk from security breach because they are publicly accessible, it’s not so straightforward. Take for example that web server, or FTP host in your DMZ, certainly not domain joined and probably bombarded by daily brute force password attacks. Select the SCOM agent in the console and enable Audit Collection Services?

Installing System Center Operations Manager 2007 R2 CU2

This should be a simple update of some hotfixes, but there were a few tripping points along the way that I had to stumble past. For reference I used the CU2 update page and a Kevin Holman TechNet article.

So, I’m going to assume that a) you’re installing the update for a reason, like one of the bugs it fixes and b) you have taken a backup of your OpsManager databases.

Requesting SCOM 2007 Gateway or Agent Certificates for Server 2008 from a Server 2003 Enterprise Certificate Authority

This is a pretty specific set of instructions for a specific environment:

  • you are using Microsoft System Center Operations Manager 2007, and
  • you have a Microsoft Certificate Services 2003 Certificate Authority on your domain, and
  • you have non-domain Windows Server 2008 servers you wish to monitor or set up as a gateway server.

     

    Getting a certificate for either a Gateway Server or a remotely monitored server can be a touch vexing. If you’re installing on the same domain as the SCOM management server, the security settings take care of themselves. That is not so for non-domain servers, which require mutual certificate authentication. The Gateway must trust the Domain CA and identify itself as trusted to the Management Server. I have bashed my head against this several times now, so I thought I’d make a precise blog post to cover the steps required!

    In this scenario we have two servers: CA01, the Windows 2003 Certificate Authority, and Gateway01, the SCOM 2007 gateway. The certificate template for Operations Manager has been created on CA01 as per the documentation and is called “OperationsManagerCert”. On Gateway01 I have copied the Gateway installer to c:\SCOM\Gateway and the SCOM Tools to c:\SCOM\Tools. SCOM01 is our SCOM management server.

    CA01: Navigate to https://ca01/certsrv and download the CA Certificate.

    Gateway01: Copy the CA Certificate to the c:\SCOM folder by whatever means you have. Open mmc.exe and add the Certificates Snap-in for the local computer account. Right click the Trusted Root Certification Authorities store and Import the CA01 CA certificate.

    Gateway01: Open notepad and create a new certificate request file with the contents below. Name the file Gateway01.inf and save in c:\SCOM

    [NewRequest]
    Subject="CN=<FQDN of Gateway01>"
    Exportable=TRUE
    KeyLength=2048
    KeySpec=1
    KeyUsage=0xf0
    MachineKeySet=TRUE
    [EnhancedKeyUsageExtension]
    OID=1.3.6.1.5.5.7.3.1
    OID=1.3.6.1.5.5.7.3.2
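If you need requests for several servers, the same file can be generated rather than typed. The sketch below only fills the server's FQDN into the settings listed above; the function name `New-ScomCertRequestInf` and the example host name are hypothetical.

```powershell
# Hypothetical helper: return the request file contents for one server,
# using exactly the settings from the .inf file shown above.
function New-ScomCertRequestInf {
    param([string] $Fqdn)
    $lines = @(
        '[NewRequest]',
        ('Subject="CN={0}"' -f $Fqdn),
        'Exportable=TRUE',
        'KeyLength=2048',
        'KeySpec=1',
        'KeyUsage=0xf0',
        'MachineKeySet=TRUE',
        '[EnhancedKeyUsageExtension]',
        'OID=1.3.6.1.5.5.7.3.1',
        'OID=1.3.6.1.5.5.7.3.2'
    )
    return ($lines -join "`r`n")
}

# Example (hypothetical host name):
# New-ScomCertRequestInf -Fqdn 'gateway01.example.local' |
#     Set-Content C:\SCOM\Gateway01.inf
```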

    Gateway01: Open a command prompt as administrator and navigate to c:\SCOM, use certreq.exe to generate a certificate request:

    certreq -new -f Gateway01.inf Gateway01.req

    Gateway01: Open Gateway01.req in notepad and copy the contents to clipboard.

    CA01: Open https://ca01/certsrv and start a new advanced certificate request, create the certificate request using a base64 encoded CMC. Paste the data from Gateway01.req into the “Saved Request” box. Select your SCOM certificate template and click next. Save the response as a Base 64 encoded certificate.

    Gateway01: Copy the certificate file over to c:\SCOM on Gateway01 by whatever method you have available. Open a command prompt with admin rights and accept the new certificate with certreq:

    certreq -accept Gateway01.cer

    Check that the certificate has been imported into the Computer/Personal store using mmc.exe.
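The same check can be made from PowerShell instead of the MMC snap-in. This is a sketch under assumptions: `Select-CertificateForHost` is a hypothetical helper, the host name in the example is made up, and it matches on the `CN=` value used in the request file.

```powershell
# Hypothetical helper: pick out certificates issued to a given FQDN and
# show the properties worth checking (private key present, expiry date).
function Select-CertificateForHost {
    param([object[]] $Certificates, [string] $Fqdn)
    $Certificates |
        Where-Object { $_.Subject -like "CN=$Fqdn*" } |
        Select-Object Subject, Issuer, Thumbprint, HasPrivateKey, NotAfter
}

# Example, reading the local computer's Personal store:
# Select-CertificateForHost -Certificates (Get-ChildItem Cert:\LocalMachine\My) `
#     -Fqdn 'gateway01.example.local'
```

A certificate accepted with certreq should show HasPrivateKey as True; if it does not, the request was probably generated on a different machine.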

    SCOM01: At this point you can install either the SCOM agent or the Gateway Server on Gateway01. If you are installing the Gateway Server, as I am, you first need to approve the Gateway using the Gateway Approval Tool. Open a command prompt as administrator and navigate to “c:\Program Files\System Center Operations Manager 2007” or wherever your SCOM install is. Copy Microsoft.EnterpriseManagement.GatewayApproval.Tool.exe from the Support Tools folder into that installation folder (it requires DLLs found there), then run it:

    Microsoft.EnterpriseManagement.GatewayApproval.Tool.exe /ManagementServerName=SCOM01 /GatewayName=Gateway01

    Gateway01: Run the Gateway Server installer and enter the details of the Management Server and Management Group name. When that’s finished, you need to tell SCOM which certificate to use with the MOMCertImport.exe tool located in c:\SCOM\Tools

    MOMCertImport /SubjectName Gateway01.Domain.Lcl

    Give it a few minutes and you should be able to see the new gateway under Management Servers in the Administration console for SCOM. Remember to right-click, properties, security and allow the server to act as a proxy if it’s reporting for other servers.

    I use the same procedure to install Agents in my DMZ that don’t have access to the certificate services – likewise our production web servers isolated in their hosting environment.

    I hope this helps you. I know this is an article that I will be referring back to time and time again!