Like any organization with a good IT department, InterLinked has an Active Directory domain and an automated solution for deploying computers: in this case, the excellent Microsoft Deployment Toolkit (MDT), the free alternative to SCCM (System Center Configuration Manager, or whatever Microsoft is calling it this week). Unlike most organizations with a good IT department, we're primarily still running Windows 7, for reasons that have already earned a blog post of their own.
The last MDT server that I set up happened to be on a Windows Server 2019 machine that was handy for that workload at the time. MDT, along with WDS (Windows Deployment Services), has a pretty small footprint, so it's easy to add on to a multipurpose Windows Server that's already doing other things. For a while, this worked fine for deploying Windows (Windows 7, in this case) to both physical machines (mostly Dell OptiPlexes) and virtual machines (on Hyper-V, running on that same Windows Server 2019).
However, recently I began provisioning an HP ProLiant DL380p that had been procured for use as a new file server, but was juiced up enough (with 64 GB of RAM and 12 CPUs) that using it as only a file server seemed like a waste. It was going to run Windows Server, for Active Directory integration for file share permissions, so adding Hyper-V virtual machines to it made perfect sense to take advantage of the compute power available on the machine.
Unlike the aforementioned Windows Server, which was not really running on a "server" per se, but just an OptiPlex tower, I installed Windows Server 2008 R2 on this server, rather than Windows Server 2019. There were a few reasons for this. Chief among them was that I wanted to ensure there was good integration between the SAS RAID controller in the server and the operating system - since the server was released in 2014, Windows Server 2008 R2 support was a given but newer Windows Server support wasn't as apparent. Additionally, Windows Server 2008 R2 remains my preferred Windows Server OS for much the same reason that Windows 7 remains my preferred workstation OS: it's fast, stable, and bloat-free, which are important on servers. The irony was not lost on me that the domain controller running on a virtualized Windows Server 2008 R2 VM on the Server 2019 host was blazing fast while the host machine was eternally sluggish.
Server 2019 has been problematic in other ways as well. Initial attempts to set up a domain controller on Server 2019 had failed for some reason, which at the time led me to just set up another Server 2008 R2 VM and promote it to a domain controller. Likewise, Server 2019 did not want to cooperate when I was setting up Routing and Remote Access for dial-up access; Server 2003, however, had no such issues. Needless to say, over the years, I've turned a cold shoulder to newer versions of Windows Server while embracing the tried and true Windows Server OSes from the years when Microsoft was releasing its best software. As Windows and Windows Server are only used internally here (with the vast majority of InterLinked's public-facing services powered by Linux), security has never been a major consideration in continuing to use out-of-support software in production internally (shameless plug for our W2K child site, for all the retrocomputers reading this).
(To its credit, Server 2019 is still good for some things... recently, I connected a parallel POS printer to a Debian 12 server, which wouldn't let me access the parallel port. I had no such issues on Windows-based systems, so that printer is now connected to a Windows server (with the Linux server using smbclient to access it, since Windows doesn't natively support hosting a raw print server on port 9100... ah well, you win some, you lose some...))
So, the new server is up and running with Server 2008 R2 and working great. What next? Well, my first step was to deploy a Windows 7 VM on the machine as a proof of concept. Surely, this was a low bar for such a test. Well, it ended up opening a can of worms that took several days to resolve, and indirectly led to this blog post. Naturally, I planned to use MDT to deploy the VM. I PXE booted the VM, and it started booting Windows PE. As soon as the "modern" Windows logo appeared, indicating it was now booting the WDS boot image, the VM BSOD'ed (blue screen of death). How anticlimactic.
While the stop error text was entirely unhelpful, it wasn't hard to guess what the issue was. A quick search confirmed that Server 2008 R2, for all its virtues, can't run anything much newer than Windows 7 or 8 era operating systems, which makes perfect sense. Since I had set up MDT initially on Server 2019, I naturally had used the most recent version of MDT and the most recent Windows 10 ADK at the time (which, for a time, was afflicted by a major bug that caused any BIOS machine to be misdetected as UEFI and fail deployment). Well, it didn't take much thinking to conclude that the boot image generated using the Windows 10 ADK would probably not load correctly in a VM running on Server 2008 R2.
This didn't faze me too much. The obvious solution seemed to be to set up a new MDT deployment share running on Server 2008 R2 and, more importantly, using an older version of MDT and, in particular, of the Windows ADK/AIK. Microsoft has not made this easy, as all the downloads for old versions of MDT are, of course, long purged from their website. Some days, if it weren't for the Internet Wayback Archive, I don't know how I would survive. I eventually found some links for MDT 2012 Update 1 for which Wayback was able to regurgitate installers. The WAIK for Windows 7 is, amazingly, at the time of this writing, still available for download directly from Microsoft. We'll see how long that lasts.
Incidentally, you might be thinking that, even if an older ADK/AIK was required, there was no need to use such an old version of MDT. While that might be true (there probably is a cutoff, since modern MDT seems to require a relatively current Windows 10 ADK), there were other reasons for selecting 2012. I had tried to deploy Windows XP in the past using recent MDT and found, to my dismay, that Windows XP was no longer supported by current MDT releases. MDT 2012, on the other hand, will happily let you deploy Windows XP, and while neither Windows 10 nor Windows XP sees much usage around here, there is slightly more of a use case for deploying the latter than the former, on occasion.
There were no surprises getting MDT and the AIK installed. Afterwards is where the fun began. MDT, like most software, accounts for upgrades from older versions of MDT to newer ones. It doesn't account for downgrades from newer versions to older ones because, well, who would be crazy enough to do that, right? I proceeded cautiously, since it wasn't clear what the incompatibilities might be from taking a modern MDT deployment share and backporting it to a decade-older version. I first created a new deployment share, then closed Deployment Workbench and copied over applications, operating system files, and drivers, and then most of the files in the Control directory (which is where the task sequences reside). I proceeded to fire up Deployment Workbench and was immediately met by this angry message:
Luckily, I had noticed there was a Version file in the Control directory, so I figured that was probably the culprit. While MDT actually happens to back up most of the XML files in the Control directory when the deployment share is first created, the Version file, inconveniently, is not one of the files it backs up. Fortunately, I had backed up all the Control files before overwriting them, so I restored the original Version file from when the deployment share was created, reopened Deployment Workbench, and was able to view the contents without issue.
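For anyone repeating this, the "back up first, then copy everything except the Version file" routine can be scripted. What follows is only a minimal sketch of that idea, not anything MDT provides: the function name `copy_control`, the directory arguments, and the rule of skipping any entry whose base name is "version" (with or without an extension) are all my own assumptions, so check what the file is actually called in your share before relying on it.

```python
import shutil
from pathlib import Path


def is_version_file(path: Path) -> bool:
    # Assumption: the file to protect is named "Version" (any extension, any case).
    return path.stem.lower() == "version"


def copy_control(src: Path, dst: Path, backup: Path) -> list:
    """Back up dst, then copy src's entries into dst, leaving dst's Version file alone.

    Returns the names of the top-level entries that were copied.
    """
    # Snapshot the freshly created Control directory first.
    # copytree refuses to overwrite an existing backup, which is deliberate.
    shutil.copytree(dst, backup)

    copied = []
    for item in sorted(src.iterdir()):
        if is_version_file(item):
            continue  # keep the destination share's own version marker
        target = dst / item.name
        if item.is_dir():
            # Task sequence folders and the like; merge into any existing folder.
            shutil.copytree(item, target, dirs_exist_ok=True)
        else:
            shutil.copy2(item, target)
        copied.append(item.name)
    return copied
```

Run it with Deployment Workbench closed, pointing `src` at the Control directory of the newer share and `dst` at the Control directory of the newly created one. If something still goes sideways, the snapshot in `backup` has every original file, including the ones MDT doesn't back up on its own.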
This is where things began to get interesting. At this point, I went ahead and tried redeploying a Hyper-V VM running on Server 2008 (actually a sibling VM, as I had set up the new, er, older version of MDT in another VM running on the same server). No surprises, it worked. However, somewhere between updating the references in the deployment share from the old hostname to the new one and updating the deployment share and regenerating the boot images, I tried deploying another Windows 7 VM and it failed almost as soon as the task sequence started running. The first sign that something was wrong was when the task sequence lingered on this screen for several minutes, when usually it takes a fraction of a second:
Eventually, I looked back at the VM and was greeted by the sight of every deployment admin's worst nightmare:
While the above sort of result is expected when developing and testing, I had just migrated a production deployment share, so how had this happened? The errors didn't reveal much:
Naturally, I looked these up to see what the likely cause was. MDT deployment issues are so common that there are forums upon forums full of newbie sysadmins wondering how to decipher various errors spit out in the Deployment Summary window. Unfortunately, this error didn't yield many relevant results. The ones that were similar all had some specificity that wasn't present here.
Next, I took a look at the SMSTS log that the deployment had put in the deployment logs share, using CMTrace:
ZTIDiskpart.wsf exited with code 424... how much help was that? None, really. However, I hadn't initially realized that the full BDD log wouldn't be written to the log share until I actually closed the error window, at which time, the BDD log was dumped in full, and I was able to analyze that, which fortunately had much more detail:
Okay, now we were getting somewhere. It was clear that for some reason, immediately after running diskpart, MDT was unable to find the drive, or something like that. Still, this seemed to be a very niche issue. Practically nobody, it seemed, had ever encountered this issue before. Since ZTIDiskpart.wsf is a VBScript, I opened it up and looked at the part of the code where it was failing, but this didn't provide much insight, either.
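If you'd rather filter these logs with a script than scroll through them in CMTrace, something like the following can pull out just the warnings and errors. This is a rough sketch, and it assumes the logs use the usual CMTrace-style entry layout (a `<![LOG[...]LOG]!>` message followed by `time`, `date`, `component`, `context`, and `type` attributes, where type 2 is a warning and type 3 is an error). The names `parse_log` and `problems` are mine, not part of MDT.

```python
import re
from typing import NamedTuple

# Assumed CMTrace-style entry:
#   <![LOG[message]LOG]!><time="..." date="..." component="..." context="" type="1" ...>
LOG_ENTRY = re.compile(
    r'<!\[LOG\[(?P<message>.*?)\]LOG\]!>'
    r'<time="(?P<time>[^"]*)"\s+date="(?P<date>[^"]*)"\s+'
    r'component="(?P<component>[^"]*)"\s+context="[^"]*"\s+'
    r'type="(?P<type>\d)"',
    re.DOTALL,  # messages can span several lines
)

SEVERITY = {"1": "info", "2": "warning", "3": "error"}


class Entry(NamedTuple):
    component: str
    severity: str
    message: str


def parse_log(text: str) -> list:
    """Turn raw log text into a list of Entry tuples, in file order."""
    return [
        Entry(m["component"], SEVERITY.get(m["type"], "unknown"), m["message"].strip())
        for m in LOG_ENTRY.finditer(text)
    ]


def problems(entries: list) -> list:
    """Keep only the entries that are not plain informational lines."""
    return [e for e in entries if e.severity != "info"]
```

To use it, read the log file into a string (opening it with `errors="replace"` avoids choking on odd bytes), pass it to `parse_log`, and print whatever `problems` returns. Running the same thing against a log from a known-good deployment makes the two easier to compare.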
Fortunately, I still had logs on the old MDT server from previous deployments there, so I opened those up side by side to compare. Here's the same place in a successful deployment:
From this, it's clear that MDT was able to find the drive here and it wasn't in the other case. It does seem a bit odd that it had assigned drive letter E at this point, but on other machines, it was C, as expected:
Since the community (in this case, some cursory searches on the World Wide Web) didn't have anything to add here, troubleshooting to pinpoint the cause of the problem was really all that could be done. I tried changing some of the references from the new MDT server back to the old one in case that had been it. I tried deploying on a Server 2019 Hyper-V instance which, unsurprisingly, failed in the same manner. I deleted the new deployment share and recreated it in a slightly different way. And, just as a sanity check, I tried deploying from the Server 2019 deployment share - a modern Hyper-V machine got past this stage with no issues while a Server 2008 R2 hosted VM again bombed to a BSOD in no time.
At some point, just before I was about to nuke the deployment share again and start over, I thought it was worthwhile to create a new simple task sequence and see if that worked. It did. This immediately narrowed the issue down to the task sequence, specifically the part that had to do with formatting the disks. Here is the original task sequence, as ported over from the newer MDT system, which was causing issues:
While it would appear that both the BIOS and UEFI steps are run, only one or the other is run, depending on whether the machine was detected to be BIOS or UEFI:
Given the BIOS/UEFI detection bug that had plagued newer versions of MDT for some time, I had wondered if this might be a similar issue. However, looking at the logs in CMTrace, that was quickly ruled out - it was indeed executing the BIOS step, not the UEFI step, and this was correct, given a Server 2008 R2 generation Hyper-V VM (and for that matter, all the machines that I deploy to are old enough to be BIOS, not UEFI).
However, the difference was clear when looking at the format step in the simple (but working) task sequence:
Of course, UEFI wasn't supported in this edition of MDT, so it makes sense that it would be intended for BIOS only without qualifying it as such. However, since the BIOS step had actually been the one getting executed, the difference couldn't be chalked up simply to that. Indeed, the actual format settings are also different. Disabling the original BIOS/UEFI-specific format steps in the production task sequence and replacing them with one that mirrored the settings of the default one here proved to be the fix.
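Eyeballing two task sequence editors side by side works, but the same comparison can be done against the task sequence XML directly. The sketch below assumes each task sequence is stored as XML in which a step is a `<step name="...">` element whose settings are nested `<variable name="...">` elements. That matches what I'd expect to find under the Control directory, but treat it as an assumption and open your own file to confirm. The helper names `step_variables` and `diff_variables` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def step_variables(ts_xml: str, step_name: str) -> dict:
    """Return {variable name: value} for the first step with the given name.

    Assumes steps are <step name="..."> elements containing <variable name="...">
    children somewhere beneath them.
    """
    root = ET.fromstring(ts_xml)
    for step in root.iter("step"):
        if step.get("name") == step_name:
            return {v.get("name"): (v.text or "") for v in step.iter("variable")}
    raise KeyError(f"no step named {step_name!r}")


def diff_variables(a: dict, b: dict) -> dict:
    """Return {name: (value in a, value in b)} for settings that differ.

    None means the setting is absent on that side.
    """
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }
```

Feed `step_variables` the text of the ported task sequence and of the freshly created one, using whatever each format step is actually named in your share, then hand both results to `diff_variables`. Extra partitions on one side should show up as settings that exist there and are `None` on the other.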
With this minor modification of not adding a System Reserved or Recovery partition, the deployment succeeded without a single issue. Why this has worked for years and suddenly was causing issues in an older environment, I don't know and honestly, I don't really care, as long as it's working. Workstations don't need a recovery partition anyways since they can simply be reimaged.
One small amendment to the screenshot above though: when you click "Edit" on the partition in the "Volume" section at the bottom, the "Partition Name" is what you'll actually see as the name of the C: drive post-deployment, so don't call this "OSDisk" as it defaults to, but call it "Windows" to avoid triggering the OCD in your inner IT admin. Ask me how I learned the hard way.
In this article, I've detailed a couple of gotchas when porting a deployment share from MDT (with no version number, like many Microsoft products today) to MDT 2012 Update 1. I haven't analyzed who would want to do this and why, but you folks know who you are. For my part, I'm happy with the new (old) deployment share. While I've given up Windows 10 support, I've gained Windows XP support, so I think I've come out ahead.
Of course, in all seriousness, the major loss from this downgrade has been UEFI support, which as I mentioned, is not something that I need currently or that I anticipate will be needed anytime soon, since the entire fleet here supports BIOS (and generally BIOS only). The exception would be Hyper-V Generation 2 VMs running on Server 2019, but as I anticipate phasing out that Server 2019's duties progressively and relying on Server 2008 R2 more, Generation 2 VMs won't even be an option anyways.
In hindsight, this was a relatively simple issue, but the few problems I encountered in downgrading a deployment share to MDT 2012 were anything but obvious from the errors received. Given the default configurations of both environments, anyone performing a similar procedure would likely run into the same issue, so I'm writing this up so that there's at least one reference on the possible snags and snafus you may encounter when doing something like this, because hey, one reference is a lot better than none!
All right, back to the console now... the Hyper-V machines I just deployed using MDT are all finished now, so it's time to keep on rolling!