MEMCM – Recovery

As a member of the Truesec incident team, I frequently find myself recovering Microsoft Endpoint Manager Configuration Manager (MEMCM). As the product has evolved over time, so have the options for recovering the environment.

In my time as a consultant, I’ve seen Hyper-V replicas, SQL Always On, and VMware snapshots defined as the MEMCM recovery plan. However, what happens when those full VM backups are corrupted? What are the minimum requirements to bring Configuration Manager back online, and how do you successfully plan and execute a recovery?

Requirements

Let’s start by understanding the bare minimum required for a successful recovery.

  • SQL Backup of the MEMCM Database
  • File Level Backup of the CD.Latest Directory

It’s a short list, but it is the honest minimum you need to START the recovery of your environment. You’ll also want a copy of your content sources. If you run the built-in MEMCM site maintenance task that performs a backup, you’ll find these two things are essentially what the backup task creates, plus some other site settings. However, the built-in backup does suffer from some shortcomings: it tends to run longer than a SQL backup job, and it tends to impact server availability while it runs.

Obtaining these two items is simple. If you are running the Ola Hallengren maintenance solution in your MEMCM environment, you can use the included SQL backup job, and backing up the CD.Latest directory is as simple as copy and paste.
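
If you would rather schedule the file-level copy than do it by hand, a minimal Python sketch could look like the one below. It copies the directory into a new timestamped folder and compares file counts as a basic sanity check. The function names and both paths in the usage comment are assumptions for illustration; substitute the install path and backup share from your own environment.

```python
import shutil
from datetime import datetime
from pathlib import Path


def count_files(root: Path) -> int:
    """Count regular files under root, recursively."""
    return sum(1 for p in root.rglob("*") if p.is_file())


def backup_cd_latest(source: Path, backup_root: Path) -> Path:
    """Copy source into a new timestamped folder under backup_root."""
    if not source.is_dir():
        raise FileNotFoundError(f"Source directory not found: {source}")

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{source.name}-{stamp}"

    # copytree creates missing parent folders and fails if the
    # destination already exists, so an older copy is never overwritten.
    shutil.copytree(source, destination)

    # Basic sanity check: the copy should hold as many files as the source.
    expected = count_files(source)
    copied = count_files(destination)
    if copied != expected:
        raise RuntimeError(f"Copied {copied} files, expected {expected}")
    return destination


# Example call (both paths are hypothetical):
# backup_cd_latest(
#     Path(r"C:\Program Files\Microsoft Configuration Manager\cd.latest"),
#     Path(r"\\backupserver\MEMCM"),
# )
```

A file count is only a coarse check. If you need stronger assurance, comparing file hashes between source and destination would be the natural next step.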

Starting Recovery

When you start recovery, remember that the name of the server matters: the recovered primary site server must use the same name as the original. Additionally, it’s ideal to restore the SQL database to an instance of the same version. While you can restore to a newer version of SQL, the better practice is to install the same version of SQL, restore your database, and then upgrade SQL. Some common gotchas include:

  • System Management ADSI Permissions
  • Discovery Configurations
  • SQL Certificate Issues (Dependent on environment config)
  • Site Server Permissions to Distribution Points (If not also being recovered)
  • Client Communications – (Certificates)
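
The server name and SQL version requirements are easy to check before you begin the restore. The sketch below is one hypothetical way to express that pre-flight check in Python. It assumes you recorded the original host name and SQL version string (for example in your recovery documentation), compares host names on the short name only and case-insensitively, and compares SQL versions on the major version only. The function names and sample values are illustrative, not part of any MEMCM tooling.

```python
from typing import List


def major_version(version: str) -> int:
    """Return the major part of a dotted version string such as '15.0.2000.5'."""
    return int(version.split(".")[0])


def preflight_issues(original_host: str, current_host: str,
                     original_sql: str, current_sql: str) -> List[str]:
    """Return a list of problems to resolve before starting the restore."""
    issues = []

    # Compare short host names, ignoring case and any DNS suffix.
    if original_host.split(".")[0].lower() != current_host.split(".")[0].lower():
        issues.append(
            f"Server name mismatch: expected {original_host}, found {current_host}"
        )

    old_major = major_version(original_sql)
    new_major = major_version(current_sql)
    if new_major < old_major:
        # SQL Server cannot restore a backup taken on a newer major version.
        issues.append("SQL instance is older than the backup; the restore will fail")
    elif new_major > old_major:
        issues.append(
            "SQL instance is newer than the backup; prefer installing the "
            "same version, restoring, then upgrading"
        )
    return issues


# Example call (all values are hypothetical):
# import socket
# print(preflight_issues("CM01", socket.gethostname(), "15.0.2000.5", "15.0.2000.5"))
```

An empty list means neither of these two checks found a problem. It says nothing about the other gotchas in the list above, which still need to be reviewed by hand.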

While doing a recovery is scary, it’s also an opportunity to fix things in your environment that you otherwise wouldn’t get the chance to fix, such as splitting apart the locations where the SQL files are stored.

When you perform a recovery, start the process the same way you would a new build of Configuration Manager: begin at the bottom and work your way up. Rebuild SQL first, ensure it’s stable and that all maintenance is implemented, and then move on to recovering the site itself.

Finishing Recovery

This sounds self-explanatory. However, it’s important to recognize when the recovery effort is over. During recovery you’ll be making a large number of changes, and these changes often can’t wait for things like change control or manager approval. You need to act swiftly and with confidence. As a general rule, you can call recovery complete when:

  • Basic OSD works again – don’t try to be fancy
  • Basic Application installs work – don’t stress about complex apps
  • Basic Software Update Management works
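
Writing these exit criteria down before you start makes it harder for the recovery to drift into a rebuild of everything at once. As a small illustration, the sketch below tracks the three criteria above in Python. The names and the dictionary layout are assumptions for this example only; how you actually test each item (a bare task sequence, a simple application, a single update deployment) is up to you.

```python
from typing import Dict, List

# The three exit criteria from the list above.
EXIT_CRITERIA = (
    "Basic OSD",
    "Basic application install",
    "Basic software update management",
)


def remaining_criteria(results: Dict[str, bool]) -> List[str]:
    """Return the exit criteria that have not passed yet."""
    return [name for name in EXIT_CRITERIA if not results.get(name, False)]


def recovery_complete(results: Dict[str, bool]) -> bool:
    """Recovery is complete once every exit criterion has passed."""
    return not remaining_criteria(results)


# Example: OSD and application installs verified, updates still pending.
status = {"Basic OSD": True, "Basic application install": True}
print(remaining_criteria(status))
print(recovery_complete(status))
```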

Recovery is a Marathon

Whatever the reason your environment needs to be recovered, remember it’s a marathon, not a sprint. You are attempting to reassemble what was likely years of work in a matter of hours or days. Having a plan for how you intend to run the marathon, and understanding when you can sprint and when you can coast, is vital.


By Jordan Benzing

Jordan Benzing is a consultant for TrueSec and a Microsoft MVP. Jordan has spoken at several user groups, including the TrueSec summit in Stockholm and the Midwest Management Summit in Minneapolis, on subjects such as reporting, patching, and that wonderful thing no one likes doing: documentation. In the past, Jordan has worked in several ConfigMgr environments, including as the ConfigMgr team lead for an organization with 170K endpoints. Jordan is known for his "love" of patching, his Power BI dashboard templates, and various other contributions to the ConfigMgr community.

When he’s not working, you can usually find him hanging out on Summoner’s Rift in League of Legends. Jordan has been playing since season one and helped out with beta testing the original gameplay.
