It’s strange… but I’ve managed to finish reading a book cover to cover that isn’t one of my kids’ books. It’s hard for me to get hooked on books and to find the time to read them. I think the one I finished before this might have been “10 PRINT”, and that was over a year ago. This time, though, I completed the book on the story behind the creation of Windows NT: “Showstopper!”, to use its short title, written by G. Pascal Zachary.

Reading the story of how Windows NT came to be was entertaining, as it is a story of the system itself and of the dynamics between Dave Cutler, the original designer and lead of NT, and the other people involved in the project. I was just shy of 10 years old when Windows NT launched, and I didn’t comprehend what was going on in the operating systems world or why this release was such a big deal. The book taught me various new things about the development process and the role of Microsoft in that era, and it allowed me to settle some questions I’ve had over the years.

This article is a mixture of a book review and a collection of thoughts and reflections that the book evoked. Let’s begin because we have a lot of ground to cover.

Let’s create a brand new OS

Operating systems are pretty much a commodity these days—but you couldn’t take them for granted before. There was a time during the 1980s and 1990s when a multitude of computer architectures and operating systems thrived, each offering vastly different feature sets and levels of stability from the others. Yes, DOS and Windows were the most popular choices for personal computers, but there were other contenders such as OS/2, BeOS, QNX… as well as the myriad different commercial Unix derivatives. The open-source BSDs and Linux had just seen the light of day.

Microsoft’s 1988 bet to build NT, a brand new OS written from scratch, was bold. There were good reasons for taking the risk, which centered on the desire to modernize DOS and to have first-class support for networks, but investing multiple years of R&D in a massive project like this was risky: others had tried and failed.

At the end of the day, Microsoft succeeded with NT. The development process took longer than anticipated, but the system finally shipped in 1993. Now, more than 30 years later, NT is behind almost all desktop and laptop computers in the world. The book claims that Cutler repeatedly told his team that they’d remember this time period as “the good old days”, and I think he was damn well right. As tough as that period must have been for everyone involved, they were writing history and setting the direction of personal computing for years to come.

Development pressure

Contrast this to the open-source BSDs and Linux, which were already a reality by NT’s launch in 1993. Linux and the original 386BSD were “hobby projects” for their creators. I’m convinced that, at the time, these people and their contributors didn’t imagine their toys as anything that would influence the world—even if, in retrospect, they kinda have: Linux is the basis for almost all mobile devices and servers, and macOS is derived from those original BSDs.

What I’m trying to get at is that there is something to be said about the different thrill of working on these projects. On the one hand, NT was in a race to ship the next revolutionary OS to power all personal computers, with intense pressure from OS/2 and other contenders, and with a design that went against established practice. On the other hand, Linux was in no such rush: its developers worked on it for their own entertainment, probably not grasping what was ahead of them.

Having been part of the NT team and of that era of computing must have been incredibly exciting. However, it was not all roses for this team. Even though the work was exciting and the potential for impact was massive, the book also paints a clear picture of a really toxic culture: leaders screaming at their peers and reports; normalized long hours and destroyed families; cross-team distrust and shaming… probably not something you’d want to be involved in.

Dogfooding

Creating a new OS is fun but, unless you use the OS regularly, you will have little incentive to discover and fix sharp usability edges or certain classes of bugs. This is where dogfooding, the practice of “eating your own dog food” by using your own creations, becomes key. And due to the importance of this, the book has a whole chapter on dogfooding NT.

Windows NT 3.1 boot screen. Courtesy of WinWorld.

The mandate to dogfood NT after it passed a certain functionality threshold came from Cutler himself and sounds awesome… and frustrating at the same time. I’ve previously dogfooded operating systems—I ran NetBSD and FreeBSD CURRENT for years while I was a contributor—and the infrequent crashes were annoying. Note, however, that these two systems were already past a certain stability point by the time I used them, so, in general, they worked. I wonder what level of functionality the first dogfood versions of NT had but, based on what I grasp from the book, the answer is “not a lot”, which means the experience may have been quite a nightmare.

Note that these days, it seems silly to talk about “how this OS is more stable than this other one” because all modern OSes implement similar process and file protections, but stability couldn’t be taken for granted back then. The DOS-based Windows editions really were quite unstable, and OS/2 and Unix systems put them to shame. The design of NT was going to fix these issues for the Windows world, but it had to be stabilized first too.

The build lab

Windows releases are often referred to by specific build numbers, which always sounded weird to me: as a regular FreeBSD and NetBSD user, I churned out multiple new builds of those whole OSes per day, and doing so was trivial and inconsequential. What was so special about Windows that individual build numbers were important? Was it really true that Windows NT 3.51, numbered build 1057, was the 1057th time that the codebase had been assembled together? And the answer is… yes.

The book talks a lot about the build lab: the place—laptops and remote work were definitely not a thing in the early 1990s—where a few engineers took the changes that everyone else made and assembled the whole system into something that could boot. This system was then distributed to the whole team for dogfooding on a daily basis.

This sounds ridiculous, right? Why wasn’t there a CI/CD pipeline to build the system every few hours and publish the resulting image to a place where engineers could download it? Ha, ha. Things weren’t that rosy back then. Automated tests, reproducible builds, CI/CD systems, unattended nightly builds, or even version control… these are all pretty recent “inventions”. Quality control had to be done by hand, as did integrating the various pieces that formed the system.

Regardless, even designing a CI/CD system for an OS today isn’t as trivial as what you need for your run-of-the-mill GitHub project. Validating the built OS against different hardware, assessing the behavior of fundamental system components like the process scheduler, and accepting that any validation run can crash the machine and require a physical power cycle… none of it is trivial. Even relying on VMs like we do today isn’t sufficient, because you do want to test the OS against real hardware.

Portability and design cleanliness

These days we take x86-based personal computers as the one and only possibility, which wasn’t true back in the day—and may not be true in the near future either. As I mentioned earlier, the computer world was full of different architectures during the 1980s and early 1990s.

In my journey to move away from Windows, which started with OS/2 Warp 3 and was followed by Linux and the BSDs, I was lured by NetBSD and stuck with it for years. The main reason I ended up settling on NetBSD was its claims of being “cleanly designed”, and a big reason for needing and having a clean design was supporting tens of different architectures out of the same codebase.

Which is interesting, because this is precisely how NT started. From reading the book, I learned that Cutler had the same mentality for his OS and, in fact, the system wasn’t ported to x86 until late in its development. He wanted developers to target non-x86 machines first to prevent sloppy, non-portable coding, precisely because x86 machines were already the primary desktops that developers had and would otherwise default to. In fact, even as x86 increasingly became the primary target platform for NT, the team maintained a MIPS build at all times, and having most tests passing on this platform was a requirement for launching.
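The book doesn’t show any code, but the kind of sloppiness that targeting only x86 invites is easy to illustrate. Here is a minimal, hypothetical C sketch (not from the book and not NT code) that reads a 32-bit value stored on disk in little-endian order: the careless version happens to work on x86 yet returns garbage on a big-endian MIPS or PowerPC machine, while the portable version behaves the same everywhere.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Careless: reinterprets the raw bytes through the host's own memory
 * layout, silently assuming the host is little-endian like the on-disk
 * format. Happens to work on x86; breaks on big-endian MIPS or PowerPC. */
static uint32_t read_u32_sloppy(const unsigned char *disk_bytes)
{
    uint32_t value;
    memcpy(&value, disk_bytes, sizeof(value));
    return value;
}

/* Portable: assembles the value byte by byte, independent of host order. */
static uint32_t read_u32_portable(const unsigned char *disk_bytes)
{
    return (uint32_t)disk_bytes[0]
         | (uint32_t)disk_bytes[1] << 8
         | (uint32_t)disk_bytes[2] << 16
         | (uint32_t)disk_bytes[3] << 24;
}

int main(void)
{
    const unsigned char on_disk[4] = { 0x4E, 0x54, 0x46, 0x53 };
    printf("sloppy:   0x%08X\n", (unsigned)read_u32_sloppy(on_disk));   /* host-dependent */
    printf("portable: 0x%08X\n", (unsigned)read_u32_portable(on_disk)); /* always 0x5346544E */
    return 0;
}
```

Keeping a MIPS build green at all times is exactly the kind of forcing function that catches the first variant before it spreads across a codebase.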

Windows NT 3.51 running on a PowerPC machine. From the "The lost history of PReP: Windows NT 3.5x and the RS/6000 40p" article at Virtually Fun.

On this topic, I vaguely remember some contemporary articles in computer magazines describing how the PowerPC was going to be “the next big thing” and how NT ran on it. I did not grasp what a big deal this was at the time because, based on the screenshots they printed, Windows NT on PowerPC looked exactly the same as Windows 3.11 on my i386 PC. Well, I guess I really just didn’t understand what either PowerPC or NT was at all.

Dialing difficulty up to 11 with NTFS

This golden era also saw the creation of many new file systems. If you were in DOS and Windows 3.x land, your choices were limited, but if you were playing with Linux, you saw new file systems pop up “every week”. Things settled around the early/mid 2010s, except maybe for the uneventful launch of APFS in 2017.

Creating a new OS from scratch is no easy feat, but wanting to create a brand new file system to go with it is making things difficult just because. While writing a file system is already tricky enough, stabilizing it is a whole different story. But, as the book describes, the team felt that they needed a new file system to support the reliability and networking needs of NT, so it had to be done. FAT was nowhere close to offering what they needed.

And NTFS is very interesting. NTFS was a really advanced file system for its time thanks to journaling, support for multi-TB volumes, optional per-directory and per-file compression, detailed ACLs… none of the Unix file systems of the era had these features. For example, ext3, which extended ext2 with journaling support, did not exist until 2001, and it has been replaced by ext4 and btrfs since. But NTFS is still chugging along in Windows 11, so it is impressive that they could make it happen on time for the OS release. (The same cannot be said about WinFS… which was supposed to ship with Vista and didn’t.)
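The book stays away from programming details, but some of those NTFS features are still directly exposed to applications. As one small example (the path below is made up for illustration), this is roughly how a Win32 program can ask NTFS to transparently compress a single file through the documented FSCTL_SET_COMPRESSION control code:

```c
#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

int main(void)
{
    /* Hypothetical path; any file that lives on an NTFS volume will do. */
    HANDLE file = CreateFileA("C:\\temp\\example.dat",
                              GENERIC_READ | GENERIC_WRITE,
                              0, NULL, OPEN_EXISTING,
                              FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 1;
    }

    /* Ask the file system to store this one file compressed from now on. */
    USHORT format = COMPRESSION_FORMAT_DEFAULT;
    DWORD returned = 0;
    if (!DeviceIoControl(file, FSCTL_SET_COMPRESSION,
                         &format, sizeof(format),
                         NULL, 0, &returned, NULL)) {
        fprintf(stderr, "FSCTL_SET_COMPRESSION failed: %lu\n", GetLastError());
        CloseHandle(file);
        return 1;
    }

    printf("NTFS will now keep this file compressed on disk.\n");
    CloseHandle(file);
    return 0;
}
```

This is the same lever that the “Compress contents to save disk space” checkbox in Explorer flips.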

Now, I know you will complain that NTFS is slow… but you know what? It almost certainly is not. Most of the problems with NTFS performance stem from file system filters, which intercept all I/O operations to do heavy work like virus scanning. NTFS on its own is fine, but you need to write your applications with knowledge of the system you target: if you treat something as Unix when it isn’t, bad things happen.

Correctness first, performance second

Speaking of NTFS performance, the book also touches upon system performance more broadly. Unsurprisingly, Cutler wanted to make the system stable and well-designed first, leaving performance for later. Performance was pushed to the side for a while, but the team was confident that they could improve it and reduce resource consumption later. This was a good strategy, although it didn’t play out well for a while.

What I found interesting is that the book claims that Bill Gates was obsessed with performance and routinely asked about it, because the prospective user base of NT did not have the means to buy high-end computers. Which is great… but then, I just don’t know where performance has gone wrong at Microsoft because that doesn’t seem to be the focus anymore.

Regardless, and even after the performance work, NT was not the fastest at launch. When the initial reviews were published, the system was deemed “too big, too slow”, with hardware requirements that were too steep. This seemed to affect Cutler. Right after the initial 3.1 launch, when the team deserved a break, they were back at work right away to improve these areas in preparation for the 3.51 release.

Huge migrations at Microsoft

Apple is touted as the expert in huge migrations: they moved their hardware lineup from 68k to PowerPC, from PowerPC to x86 and, most recently, from x86 to Apple Silicon (ARM). They also moved from 32 bits to 64 bits, dropping 32-bit support later, and from Mac OS Classic to Mac OS X. Apple has indeed executed these huge migrations well, but at each step, they have left apps and developers behind because Apple has never been big on backwards compatibility. Somehow its customers have accepted that.

If we peek under the covers, Microsoft has also pushed similar humongous migrations. At each step, however, Microsoft has preserved backwards compatibility as much as possible, and I think this is why these migrations seem less of a big deal than they really were.

What am I talking about? For starters: the jump from the DOS / Windows 3.x world to Windows 95. Windows 95 unified these two systems by making DOS applications work well under Windows—something that wasn’t true in 3.x. To make this possible, Windows 95 carried tons of application-specific tweaks to remain bug-for-bug compatible and, while gross, that’s what users needed. We know how big of a splash Windows 95 made.

But the other huge migration was the jump from Windows 9x to Windows NT. Maintaining two separate operating systems with the same UI for a few years with support for roughly the same apps is a big feat, but it’s an even bigger feat to unify these two tracks into one with Windows XP. And, with that release, they were finally able to drop the Windows 9x line for good.

Putting the Windows in NT

NT started as a joint project between Microsoft and IBM. The goal was to create the next OS for PC systems, which was to ship under the OS/2 name. So, while NT was designed from the ground up to support multiple personalities (or… subsystems, like WSL 1), the original plan was to make NT support DOS and OS/2 applications.

Microsoft OS/2 1.3 logo. Courtesy of OS/2 World.

The Windows personality—the thing that gives NT its ability to run Windows applications—wasn’t in the cards at first. As the book explains, the project was originally called NT, not Windows NT, and was supposed to become OS/2. It wasn’t until later that it became clear within Microsoft that NT had to run Windows applications to not be dead on arrival, and thus Gates pushed for NT to prioritize compatibility with them. Later on, the system was renamed Windows NT to emphasize this point, and all ties with IBM were eventually severed.

One thing that struck me as interesting was how the team struggled to fix compatibility bugs late in the development cycle, because each fixed bug surfaced a new batch of compatibility issues. It wasn’t until later in the project that a key developer had the insight to create a tracer that recorded the system and library calls an app made. This recording could then be used by developers, without the help of testers, to trivially observe how a sequence of operations behaved on NT and adjust the system accordingly. This was the magic piece that provided a path towards stability and being able to ship.
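The book doesn’t explain how that tracer was built, so take the following only as a sketch of the general technique: interpose on a library call, log it, and forward it to the real implementation. This is a tiny, hypothetical illustration on a modern Unix system using LD_PRELOAD, with open() as a stand-in for whatever calls you care about; it is obviously not the NT tool.

```c
/* trace_open.c -- build: cc -shared -fPIC trace_open.c -o trace_open.so -ldl
 * run:   LD_PRELOAD=./trace_open.so some_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/stat.h>

/* Interpose open(): record the call, then forward it to the real libc. */
int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...) = NULL;
    if (real_open == NULL)
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    /* open() only takes a mode argument when O_CREAT is present. */
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }

    fprintf(stderr, "trace: open(\"%s\", %#x)\n", path, flags);
    return real_open(path, flags, mode);
}
```

Point something like this at an application, capture the log, and a developer can replay the same sequence of operations without a tester in the loop, which is roughly the workflow the book describes.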

From this work, NT also gained the ability to define “compatibility flags” for certain applications. This was necessary to work around cases where the applications did the wrong thing—yet compatibility had to be kept. And this is why Windows 9x and NT had (and still have) such a great compatibility story: the system is full of levers to appease weird behaviors in random apps, even if the apps themselves do something “wrong”. Most developers would refuse to do this, claiming that those are bugs in the apps and that the apps deserve to break. Apple has taken this path. Open source tries to take this path. Microsoft didn’t.

The UI matters

Another topic that the book touches upon is the animosity between Cutler and the graphics team. Cutler did not think that graphics were important, which meant they were neglected for a long while. And even when a UI team was put together, Cutler was not particularly happy with or impressed by them.

However, graphics are critical and they were difficult to get right. Performance was a problem, as were the many bugs that plagued the primary UI apps of the OS. And… you might have the fastest, most portable, best-designed OS in the world, but if its shell is neither usable nor stable… nobody will care. The UI makes the OS as much as the kernel makes the OS.

Eventually, due to the desire to support primarily Windows applications in NT, the team adopted the UI that had been developed for OS/2 and for Windows 3.x. This was critical to make the OS seem familiar, although as I briefly touched upon earlier, this also made it difficult to see how 3.x and NT differed to the untrained eye. Later in life, I had a hard time seeing how the Windows NT 4 that ran in our high school computer lab differed from the Windows 95 I had at home: to me, it just seemed slower and heavier.

Windows NT 4’s initial desktop, to emphasize how it looks exactly the same as Windows 95.

In any case, the UI is crucial to the way people appreciate an OS. And, for NT specifically, I’m sad that there is no choice: Microsoft keeps “advancing” the UI in ways that seem detrimental. In particular, modern versions of Windows feel incredibly slow on hardware that is barely 10 years old… which is sad because you can’t do anything about it other than stay on the hardware upgrade treadmill. I want to like Windows, but recent UI changes have slowly pushed me away.

And finally, about the book

To conclude this look into the past, let’s actually talk a tiny bit about the book that sparked it.

The book is definitely entertaining to read; I guess the fact that I now live in the area where this all happened helped make it a tiny bit more so. It’s easy to read. It’s fun. It makes me wish I had been born just a few years earlier to understand everything that was going on in more detail—and maybe to have had a chance to become part of it. And as I mentioned in the introduction, it is engaging enough that I was able to stick with it for about three months.

But one thing that surprised me is that the book carries a handful of really obvious, painful editing mistakes. A couple of paragraphs are completely unreadable due to broken grammar, punctuation, and capitalization. I’m not sure how that happened. Luckily, the unreadable parts are limited to one or two pages… so you don’t miss much.

Hope you enjoyed this article. And if you did, I’m sure you’ll enjoy the book much more. Click on the picture to buy and read it!

"Showstopper!: The Breakneck Race to Create Windows NT and the Next Generation at Microsoft", by G. Pascal Zachary.