Pwning Windows Kernel - Unkillable User-Mode Process

Windows 10 kernel deadlock bug and our experience submitting it to Microsoft Security Response Center.

Preface

During our routine tinkering with Windows 10 in late 2019, Rbmm and I discovered what appeared to be a kernel bug. At the time I was debugging a program that I was developing, and during my testing I managed to deadlock that process. Upon later inspection it turned out that this wasn't just a simple deadlock in my program, the kind where you kill the process with Task Manager and you're done. This was a nasty deadlock that eventually required a hard reset of my computer just to kill the process in question.

When I showed this to Rbmm, he tried it on his end and confirmed my findings. He later traced it to a deadlock in one of the kernel synchronization mechanisms, called a push-lock, which is a special type of kernel primitive with an important characteristic:

Push locks cannot be acquired recursively. If the caller already has acquired the push lock for exclusive or shared access, the thread will hang.
- FltAcquirePushLockExclusive function

And that is exactly what we inadvertently managed to do from our user-mode process. (I'm assuming that the reader understands that the ability of a user-mode process to deadlock a kernel object is not good, and is a serious bug.)
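
As a rough user-mode analogy only (this is not the kernel bug, and it uses no Windows API), the non-recursive behavior quoted above can be sketched with Python's `threading.Lock`, which is also non-recursive. The helper name `try_reacquire` is made up for this illustration, and a timeout is used so that the sketch reports the hang instead of actually hanging.

```python
import threading

def try_reacquire(timeout=0.2):
    """Acquire a non-recursive lock, then try to take it again."""
    lock = threading.Lock()   # non-recursive, like a push lock in this one respect
    lock.acquire()            # first acquisition succeeds

    # Second acquisition on the SAME thread. Without the timeout this call
    # would block forever, which is the self-deadlock described above.
    got_it_again = lock.acquire(timeout=timeout)

    # Any other thread that wants the same lock is stuck as well.
    other_result = []
    worker = threading.Thread(
        target=lambda: other_result.append(lock.acquire(timeout=timeout))
    )
    worker.start()
    worker.join()

    lock.release()
    return got_it_again, other_result[0]

if __name__ == "__main__":
    print(try_reacquire())
```

Both values come back as `False`: neither the owning thread nor a second thread can take the lock while it is held. In user mode the process can simply be killed at that point; the difference in our case is that the stuck thread sits inside the kernel.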

What follows is a description of our experience of submitting a bug report to Microsoft Security Response Center (MSRC), their response, and our overall (mis-)adventure of going through the whole process. This tale may be a good lesson for any security researchers who find bugs of their own and who want to be "good citizens" of the net by submitting them to the vendor, i.e. Microsoft. It may save you a few steps.

The Bug

Because this kernel bug is still not fixed on the majority of Windows 10 systems in use today, we will not be sharing the exact specifics in this post. I hope you understand. (I may update this post later, when Microsoft fixes this bug in all production versions of Windows 10.) For now, this bug is fixed only in the Insider Preview builds of Windows 10.

Instead, I'll explain the consequences of this bug. (I also recorded a short video that demonstrates what happens.)

In short, this bug can be triggered from any non-admin user-mode process. Or, in other words, from any program. During one of the kernel syscalls the calling thread deadlocks on the kernel push-lock, as I explained above. As a result, any attempt to acquire the affected push-lock again, say, while terminating that process, returns an unconditional access-denied error. Neither elevating to administrator nor any of the conventional ways of terminating such a process will help. The only thing that will kill that process is a reboot. But that also comes with a catch ...

Because of the way Windows reboots, if we pick the "right" process to deadlock, rebooting the OS while that kernel push-lock is in the "acquired" state may deadlock the reboot process as well. Yes, you heard me right. You may not even be able to reboot your system after running such a malicious program, and a hard reset may be the only way to return the system to normal. (As a reminder, a "hard reset" is that radical action when you either pull the power plug, or press and hold the power button for a few seconds to turn off your PC. This is generally something that you would do only as a last resort to restore your system.)

I don't think I have to say that such consequences of just running a normal user-mode process are bad!

Consequences Of The Bug

Such a deadlock has several consequences. To name just a few, off the top of my head:

The user-mode thread that is deadlocked in the kernel in this way prevents the process that it runs in from being terminated. This, in turn, will not allow that user-mode process to run either. So it will basically hang forever (or until the operating system goes through a hard reset).
Any other process that tries to acquire the affected push-lock will hang indefinitely.
In some special circumstances, the reboot itself may not be able to complete and may hang the system indefinitely.

In other words, all these things make this bug a good candidate for a denial-of-service weapon. And that is the kind of PoC that we decided to write, to demonstrate all these consequences to the vendor, that is, to Microsoft.

Proof-Of-Concept (PoC)

To make our PoC more convincing we decided to do the following:

Pick a situation (or a push-lock) such that by deadlocking it in our innocuous user-mode process we would also deadlock some important Windows functions. Interestingly enough, doing it to any UWP app would deadlock some internal RPC infrastructure that Microsoft uses to "broker" communication between UWP apps. No doubt this infrastructure was designed with the isolation of UWP app sandboxes in mind, but it played into our hands and let us create a really "evil" PoC that could deadlock not only our test process, but also the Start menu, Edge browser, Cortana, or any other UWP app.
PoC - "Unkillable User Process" with kernel deadlock activated.
Deadlock the restart of the operating system itself, which was obviously the main point of this PoC. You may want to know why that would be bad. Well, it is undesirable for any autonomous system, such as a virtualized operating system running in a shared environment like Azure. Because our PoC test app did not require any elevation, an attacker could upload it to any virtualized platform, trigger the bug and deadlock that VM.
Deadlocked restart screen.

We could probably have spent more time coming up with other "evil" ways in which this bug could be abused, but we thought that the two major points listed above were enough to convince the vendor.

Bug Submission Process

The process of submitting this bug in 2019 was pretty straightforward. They have since changed it. Today, one needs to create an account to submit a bug report. But back then we had to email all the details to secure@microsoft.com. (Don't try to email it now, because your email may not get anywhere, and you'll wait forever for a response.)

So I wrote a short PDF with the details of the bug and an exact description of how to reproduce it. Rbmm provided his kernel stack for the bug, and we made a short Win32 C++ app that demonstrated it. After that I zipped the source code files, along with the binary PoC file, and sent it all to the aforementioned email address. This was on December 1, 2019.

To their credit, MSRC's automated system replied immediately:

SUBJECT: "MSRC - Acknowledgement Notification VULN-******* CRM:077*******"

Hello,

Thank you for contacting the Microsoft Security Response Center (MSRC).

Your report has been received and you should receive a follow-up message from the case manager once your case has been fully reviewed.

Please reply to this email thread to add details to your report.

Thanks,

MSRC

The next day they assigned a case number to our submission. And we received this email:

SUBJECT: "MSRC Case ****** CRM:077********"

Thank you for contacting the Microsoft Security Response Center (MSRC). I've opened a case for this issue:

MSRC Case ******

Please use this email thread--do not modify or remove the "CRM" token in the subject line--for all communication about this case. Also, please keep the details of this case confidential during our investigation. For more information, please see below:

Microsoft's Approach to Coordinated Vulnerability Disclosure
https://www.microsoft.com/en-us/msrc/cvd

Microsoft Bug Bounty Program
https://www.microsoft.com/en-us/msrc/bounty

Acknowledgements
https://portal.msrc.microsoft.com/en-us/security-guidance/acknowledgments

Online Service Acknowledgements
https://portal.msrc.microsoft.com/en-us/security-guidance/researcher-acknowledgments-online-services

MSRC

After that message all communication stopped for a few days ...

Resolution

Seven days after we received the automated email above, we got an email that was personally addressed to me and that appeared to have been written by a human, although I'm not sure. So you decide:

SUBJECT: "MSRC Case ****** CRM:077********"

Hi Dennis,

Thank you for your submission. We determined your finding is valid but does not meet our bar for immediate servicing. For more information, please see the Microsoft Security Servicing Criteria for Windows (https://aka.ms/windowscriteria).

However, we’ve marked your finding for future review as an opportunity to improve our products. I do not have a timeline for this review and will not provide updates moving forward. As no further action is required at this time, I am closing this case. You will not receive further correspondence regarding this submission.

Thank you for helping us protect our customers! For more information about our Security Development Lifecycle, please visit https://www.microsoft.com/en-us/sdl/default.aspx.

**Name-Redacted**
MSRC

😆 OK. Good deal!

So evidently, unless you have a remote code execution that can lead to a worm that can spread laterally through a network using the SMB protocol, Microsoft will not care that much about your submission. And obviously our kernel deadlock, which locks up vital Windows 10 functionality and may prevent a soft reboot, is "valid but does not meet [their] bar for immediate servicing".

And that is, by the way, literally all we got for about a week's worth of work: preparing the PoC, writing a white paper with an explanation and suggestions on how to fix the bug, going through the submission process, etc.

Thank you for helping us protect our customers!
- MSRC

You're welcome, MSRC!

So the question that you, a security researcher, have to ask yourself at this point is this: is it worth going through all the hassle of submitting your bug report just to get an almost-human response in an email thanking you for protecting their customers? You be the judge.

And I'm sure it's obvious to whoever has read this far that there was no bounty reward for our submission. This is despite their claims to the contrary:

"General Awards" list from MSRC page.

There wasn't even an acknowledgement of our work for the year 2019 or 2020.

Conclusion

To conclude, let me say that despite our submitting this bug report, and Microsoft acknowledging it at the end of 2019, as of today, September 7, 2020, this kernel push-lock deadlock vulnerability is still not fixed in all production versions of Windows 10 that millions of people use daily.

That is also the reason why we're choosing not to provide any technical details of this bug. I tried to remain as nebulous about this vulnerability as I could. I apologize to the readers for that. With time, and if Microsoft fixes this bug in the majority of their operating systems, we may reveal more technical details.

The reason we're posting the overview of this bug now is the glimmer of hope that we noticed with the release of the latest "Insider Preview" builds of Windows 10. This bug appears to have been fixed there. Yay, Microsoft! That took 9 months (plus whatever time it will take for them to push that "Insider Preview" build to most users' PCs). But at least there's some progress.

Epilogue

On an interesting side note, MSRC seems to have a slight vulnerability of their own in their bug reporting platform. If someone submits several bug reports within a certain time frame, they can notice that the MSRC Case numbers that Microsoft assigns are sequentially incremented. It won't expose anything sensitive that I know of, but it gave us an interesting metric of how many cases they may receive and process.

For instance, we submitted several vulnerability reports back in 2019 and in 2020. Judging by the case numbers that were assigned to them, I can tell that:

December 2, 2019: Base line
December 4, 2019, 10:34 AM: +66 submissions
December 4, 2019, 10:41 AM: +1 submission (ours)
December 4, 2019, 11:20 AM: +2 submissions (ours + someone else's)
August 25, 2020: +5530 submissions
September 4, 2020: +149 submissions

As you can see from our very unscientific chart, they could be receiving roughly one submission per hour. And that is either a lot of bugs, or a lot of eager security researchers out there. 🤨
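
The back-of-the-envelope rate can be reproduced from the list above. This is only a sketch: the times of day for December 2, August 25 and September 4 are not known, so midnight is assumed for them, and the function name `submissions_per_hour` is made up for the illustration. It also assumes that every case number in between corresponds to one submission.

```python
from datetime import datetime

# (timestamp, case-number increase since the previous observation),
# copied from the list above. Midnight is an ASSUMPTION where no time is given.
observations = [
    (datetime(2019, 12, 2), 0),            # base line
    (datetime(2019, 12, 4, 10, 34), 66),
    (datetime(2019, 12, 4, 10, 41), 1),
    (datetime(2019, 12, 4, 11, 20), 2),
    (datetime(2020, 8, 25), 5530),
    (datetime(2020, 9, 4), 149),
]

def submissions_per_hour(obs):
    """Average increase in case numbers per hour over the whole period."""
    total = sum(delta for _, delta in obs[1:])
    hours = (obs[-1][0] - obs[0][0]).total_seconds() / 3600
    return total / hours

if __name__ == "__main__":
    print(f"{submissions_per_hour(observations):.2f} submissions per hour")
```

Under these assumptions the average comes out a little under one per hour, which agrees with the rough estimate above.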

Lastly, you may ask, why is it bad to have their case numbers predictable in such a way?

An attacker could hypothetically spoof the email messages that MSRC uses for communication. Note the wording about not modifying the "CRM" token. This suggests that their automatic parser simply greps all incoming messages for that token and directs them to a specific case. So, if an attacker can predict case numbers by simply incrementing them, they can flood their service with fake emails in a denial-of-service attack, or disrupt an active investigation with a form of phishing message.

We haven't tried it, but say, if you just send them an email with a spoofed "CRM" token in the subject line and ask for some sensitive details of the investigation, will they reply without verifying whether the email matches the original sender? I hope not!

Either way, this may be a potential vulnerability in their own reporting system.
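
The sender check that I'm hoping for could look roughly like the sketch below. Everything in it is hypothetical: the real token format is redacted in the emails above, so a plain `CRM:` prefix followed by digits is assumed, and the `route` function, the `cases` table and the example addresses are made up. We have no knowledge of how MSRC's parser actually works.

```python
import re

# ASSUMED token format: "CRM:" followed by digits. The real format is redacted.
TOKEN_RE = re.compile(r"CRM:(\d+)")

def route(subject, sender, cases):
    """Return the token of the matching case, or None if the mail is rejected.

    `cases` maps a token to the address of the researcher who opened the case.
    A guessed (sequential) token is not enough: the sender must match too.
    """
    match = TOKEN_RE.search(subject)
    if match is None:
        return None                      # no token in the subject line
    token = match.group(1)
    expected_sender = cases.get(token)
    if expected_sender is None:
        return None                      # unknown case
    if expected_sender.lower() != sender.lower():
        return None                      # token is valid, but sender is not the reporter
    return token

if __name__ == "__main__":
    cases = {"0771234": "researcher@example.com"}
    print(route("RE: MSRC Case 1 CRM:0771234", "researcher@example.com", cases))
    print(route("RE: MSRC Case 1 CRM:0771234", "attacker@example.net", cases))
```

Keep in mind that the sender address in an email can itself be forged, so a check like this is only meaningful on top of mail that has been authenticated in some way.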

Video Overview

And lastly, here's a video demonstration of the kernel bug that I described above:


Technical Details

Technical details of this bug are available in part 2, where Rbmm explains what caused this deadlock.
