Preface
The following is a technical description of the Windows 10 kernel deadlock bug. Please read part one first. It will demonstrate that bug in action.
Note that this bug, along with the demonstrated PoC, was submitted to Microsoft on December 1, 2019, following responsible disclosure guidelines. This blog post was made public only after the reported vulnerability had been fixed.
Research
While creating a child process whose parent belonged to a frozen job object, my process ground to a halt.
When I checked the wait reason for the stalled thread, I saw the following:
The thread was waiting in a kernel push lock. I then checked the call stack:
RSP Return Address Function Name
FFFF818E795AC7B0 FFFFF80232C00C26 ntoskrnl.exe!SwapContext + 4b7
FFFF818E795AC8F0 FFFFF80232AEA540 ntoskrnl.exe!KiSwapContext + 76
FFFF818E795AC9A0 FFFFF80232AE9A6F ntoskrnl.exe!KiSwapThread + 500
FFFF818E795ACA40 FFFFF80232AE9313 ntoskrnl.exe!KiCommitThreadWait + 14f
FFFF818E795ACB30 FFFFF80232A14320 ntoskrnl.exe!KeWaitForSingleObject + 233
FFFF818E795ACBE0 FFFFF80232AE58F2 ntoskrnl.exe!ExfAcquirePushLockExclusiveEx + 1a0
FFFF818E795ACC20 FFFFF80232DED4E8 ntoskrnl.exe!ExAcquirePushLockExclusiveEx + 1a2
FFFF818E795ACC50 FFFFF80232E886A3 ntoskrnl.exe!PsFreezeProcess + 7c
FFFF818E795ACD10 FFFFF80232E893CF ntoskrnl.exe!PspChangeProcessExecutionState + 14f
FFFF818E795ACD50 FFFFF80232E89DA0 ntoskrnl.exe!PspImplicitAssignProcessToJob + 13f
FFFF818E795ACDD0 FFFFF80232E22B21 ntoskrnl.exe!PspInsertProcess + 7c
FFFF818E795ADA90 FFFFF80232C0B875 ntoskrnl.exe!NtCreateUserProcess + 901
000000D9F26FE2E8 00007FFCE448D6B4 ntoskrnl.exe!KiSystemServiceCopyEnd + 25
000000D9F26FE2F0 00007FFCE1EC876C ntdll.dll!ZwCreateUserProcess + 14
000000D9F26FF800 00007FFCE1EC6AB6 KERNELBASE.dll!CreateProcessInternalW + fcc
000000D9F26FF870 00007FFCE2E0CB54 KERNELBASE.dll!CreateProcessW + 66
000000D9F26FF8D0 00007FF6A7E811A5 KERNEL32.DLL!CreateProcessW + 54
000000D9F26FFC20 00007FF6A7E813BC FreezePoc.exe! + 11a5
000000D9F26FFCB0 00007FFCE2E06FD4 FreezePoc.exe! + 13bc
000000D9F26FFCE0 00007FFCE443CEC1 KERNEL32.DLL!BaseThreadInitThunk + 14
000000D9F26FFD60 0000000000000000 ntdll.dll!RtlUserThreadStart + 21
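The return address FFFFF80232DED4E8 lies inside PsFreezeProcess, so I looked at the disassembly of that function for the affected thread: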
; PsFreezeProcess function
FFFFF80232DED46C mov qword ptr [rsp+10h],rbx
FFFFF80232DED471 mov qword ptr [rsp+18h],rbp
FFFFF80232DED476 mov qword ptr [rsp+20h],rsi
FFFFF80232DED47B push rdi
FFFFF80232DED47C sub rsp,20h
FFFFF80232DED480 mov eax,dword ptr [rcx+464h]
FFFFF80232DED486 mov bl,dl
FFFFF80232DED488 mov rdi,rcx
FFFFF80232DED48B test al,8
FFFFF80232DED48D jne FFFFF80232DED57D
FFFFF80232DED493 call KeFreezeProcess(FFFFF80232A11A48)
FFFFF80232DED498 mov ecx,dword ptr [rdi+464h]
FFFFF80232DED49E test cl,8
FFFFF80232DED4A1 jne FFFFF80232DED575
FFFFF80232DED4A7 mov rsi,qword ptr gs:[188h]
FFFFF80232DED4B0 test eax,eax
FFFFF80232DED4B2 jne FFFFF80232DED4C4
FFFFF80232DED4B4 test dword ptr [rdi+87Ch],80000h
FFFFF80232DED4BE jne FFFFF80232FE52EC
FFFFF80232DED4C4 test bl,bl
FFFFF80232DED4C6 je FFFFF80232DED55D
FFFFF80232DED4CC or rbx,0FFFFFFFFFFFFFFFFh
FFFFF80232DED4D0 add word ptr [rsi+1E4h],bx
FFFFF80232DED4D7 lea rbp,[rdi+438h]
FFFFF80232DED4DE xor edx,edx
FFFFF80232DED4E0 mov rcx,rbp
FFFFF80232DED4E3 call ExAcquirePushLockExclusiveEx(FFFFF80232AE5750)
FFFFF80232DED4E8 mov cl,1
FFFFF80232DED4EA call KiQueryUnbiasedInterruptTime(FFFFF80232AA15C4)
FFFFF80232DED4EF mov qword ptr [rdi+8B0h],rax
FFFFF80232DED4F6 lock xadd qword ptr [rbp],rbx
FFFFF80232DED4FC and bl,6
FFFFF80232DED4FF cmp bl,2
FFFFF80232DED502 je FFFFF80232DED581
FFFFF80232DED504 mov rcx,rbp
FFFFF80232DED507 call KeAbPostRelease(FFFFF80232AE8900)
It looked like during creation of the process the system immediately tried to freeze it. But before that it did this:
KeEnterCriticalRegion(); //Inlined by compiler
ExAcquirePushLockExclusiveEx(&Process->ProcessLock);
If you look at the assembly code above, the RDI register before the ExAcquirePushLockExclusiveEx call points to the process (EPROCESS struct), and at offset 0x438 is the push lock:
struct _EPROCESS {
/*0000*/ _KPROCESS Pcb;
/*0438*/ _EX_PUSH_LOCK ProcessLock;
/*0440*/ void * UniqueProcessId;
/*0448*/ _LIST_ENTRY ActiveProcessLinks;
/*0458*/ _EX_RUNDOWN_REF RundownProtect;
/*0460*/ ULONG Flags2;
/*0460*/ ULONG JobNotReallyActive : 01; // 0x00000001;
/*0460*/ ULONG AccountingFolded : 01; // 0x00000002;
/*0460*/ ULONG NewProcessReported : 01; // 0x00000004;
/*0460*/ ULONG ExitProcessReported : 01; // 0x00000008;
/*0460*/ ULONG ReportCommitChanges : 01; // 0x00000010;
/*0460*/ ULONG LastReportMemory : 01; // 0x00000020;
/*0460*/ ULONG ForceWakeCharge : 01; // 0x00000040;
/*0460*/ ULONG CrossSessionCreate : 01; // 0x00000080;
/*0460*/ ULONG NeedsHandleRundown : 01; // 0x00000100;
/*0460*/ ULONG RefTraceEnabled : 01; // 0x00000200;
/*0460*/ ULONG PicoCreated : 01; // 0x00000400;
/*0460*/ ULONG EmptyJobEvaluated : 01; // 0x00000800;
/*0460*/ ULONG DefaultPagePriority : 03; // 0x00007000;
/*0460*/ ULONG PrimaryTokenFrozen : 01; // 0x00008000;
/*0460*/ ULONG ProcessVerifierTarget : 01; // 0x00010000;
/*0460*/ ULONG RestrictSetThreadContext : 01; // 0x00020000;
/*0460*/ ULONG AffinityPermanent : 01; // 0x00040000;
/*0460*/ ULONG AffinityUpdateEnable : 01; // 0x00080000;
/*0460*/ ULONG PropagateNode : 01; // 0x00100000;
/*0460*/ ULONG ExplicitAffinity : 01; // 0x00200000;
/*0460*/ ULONG ProcessExecutionState : 02; // 0x00c00000;
/*0460*/ ULONG EnableReadVmLogging : 01; // 0x01000000;
/*0460*/ ULONG EnableWriteVmLogging : 01; // 0x02000000;
/*0460*/ ULONG FatalAccessTerminationRequested : 01; // 0x04000000;
/*0460*/ ULONG DisableSystemAllowedCpuSet : 01; // 0x08000000;
/*0460*/ ULONG ProcessStateChangeRequest : 02; // 0x30000000;
/*0460*/ ULONG ProcessStateChangeInProgress : 01; // 0x40000000;
/*0460*/ ULONG InPrivate : 01; // 0x80000000;
/*0464*/ ULONG Flags;
/*0464*/ ULONG CreateReported : 01; // 0x00000001;
/*0464*/ ULONG NoDebugInherit : 01; // 0x00000002;
/*0464*/ ULONG ProcessExiting : 01; // 0x00000004;
/*0464*/ ULONG ProcessDelete : 01; // 0x00000008;
/*0464*/ ULONG ManageExecutableMemoryWrites : 01; // 0x00000010;
/*0464*/ ULONG VmDeleted : 01; // 0x00000020;
/*0464*/ ULONG OutswapEnabled : 01; // 0x00000040;
/*0464*/ ULONG Outswapped : 01; // 0x00000080;
/*0464*/ ULONG FailFastOnCommitFail : 01; // 0x00000100;
/*0464*/ ULONG Wow64VaSpace4Gb : 01; // 0x00000200;
/*0464*/ ULONG AddressSpaceInitialized : 02; // 0x00000c00;
/*0464*/ ULONG SetTimerResolution : 01; // 0x00001000;
/*0464*/ ULONG BreakOnTermination : 01; // 0x00002000;
/*0464*/ ULONG DeprioritizeViews : 01; // 0x00004000;
/*0464*/ ULONG WriteWatch : 01; // 0x00008000;
/*0464*/ ULONG ProcessInSession : 01; // 0x00010000;
/*0464*/ ULONG OverrideAddressSpace : 01; // 0x00020000;
/*0464*/ ULONG HasAddressSpace : 01; // 0x00040000;
/*0464*/ ULONG LaunchPrefetched : 01; // 0x00080000;
/*0464*/ ULONG Background : 01; // 0x00100000;
/*0464*/ ULONG VmTopDown : 01; // 0x00200000;
/*0464*/ ULONG ImageNotifyDone : 01; // 0x00400000;
/*0464*/ ULONG PdeUpdateNeeded : 01; // 0x00800000;
/*0464*/ ULONG VdmAllowed : 01; // 0x01000000;
/*0464*/ ULONG ProcessRundown : 01; // 0x02000000;
/*0464*/ ULONG ProcessInserted : 01; // 0x04000000;
/*0464*/ ULONG DefaultIoPriority : 03; // 0x38000000;
/*0464*/ ULONG ProcessSelfDelete : 01; // 0x40000000;
/*0464*/ ULONG SetTimerResolutionLink : 01; // 0x80000000;
/*0468*/ _LARGE_INTEGER CreateTime;
/*0470*/ ULONGLONG ProcessQuotaUsage[0x2];
/*0480*/ ULONGLONG ProcessQuotaPeak[0x2];
/*0490*/ ULONGLONG PeakVirtualSize;
/*0498*/ ULONGLONG VirtualSize;
/*04a0*/ _LIST_ENTRY SessionProcessLinks;
/*04b0*/ void * ExceptionPortData;
/*04b0*/ ULONGLONG ExceptionPortValue;
/*04b0*/ ULONGLONG ExceptionPortState : 03; // 0x0000000000000007;
/*04b8*/ _EX_FAST_REF Token;
/*04c0*/ ULONGLONG MmReserved;
/*04c8*/ _EX_PUSH_LOCK AddressCreationLock;
/*04d0*/ _EX_PUSH_LOCK PageTableCommitmentLock;
/*04d8*/ _ETHREAD * RotateInProgress;
/*04e0*/ _ETHREAD * ForkInProgress;
/*04e8*/ _EJOB * volatile CommitChargeJob;
/*04f0*/ _RTL_AVL_TREE CloneRoot;
/*04f8*/ ULONGLONG volatile NumberOfPrivatePages;
/*0500*/ ULONGLONG volatile NumberOfLockedPages;
/*0508*/ void * Win32Process;
/*0510*/ _EJOB * volatile Job;
/*0518*/ void * SectionObject;
/*0520*/ void * SectionBaseAddress;
/*0528*/ ULONG Cookie;
/*0530*/ _PAGEFAULT_HISTORY * WorkingSetWatch;
/*0538*/ void * Win32WindowStation;
/*0540*/ void * InheritedFromUniqueProcessId;
/*0548*/ ULONGLONG volatile OwnerProcessId;
/*0550*/ _PEB * Peb;
/*0558*/ _MM_SESSION_SPACE * Session;
/*0560*/ void * Spare1;
/*0568*/ _EPROCESS_QUOTA_BLOCK * QuotaBlock;
/*0570*/ _HANDLE_TABLE * ObjectTable;
/*0578*/ void * DebugPort;
/*0580*/ _EWOW64PROCESS * WoW64Process;
/*0588*/ void * DeviceMap;
/*0590*/ void * EtwDataSource;
/*0598*/ ULONGLONG PageDirectoryPte;
/*05a0*/ _FILE_OBJECT * ImageFilePointer;
/*05a8*/ UCHAR ImageFileName[0xf];
/*05b7*/ UCHAR PriorityClass;
/*05b8*/ void * SecurityPort;
/*05c0*/ _SE_AUDIT_PROCESS_CREATION_INFO SeAuditProcessCreationInfo;
/*05c8*/ _LIST_ENTRY JobLinks;
/*05d8*/ void * HighestUserAddress;
/*05e0*/ _LIST_ENTRY ThreadListHead;
/*05f0*/ ULONG volatile ActiveThreads;
/*05f4*/ ULONG ImagePathHash;
/*05f8*/ ULONG DefaultHardErrorProcessing;
/*05fc*/ LONG LastThreadExitStatus;
/*0600*/ _EX_FAST_REF PrefetchTrace;
/*0608*/ void * LockedPagesList;
/*0610*/ _LARGE_INTEGER ReadOperationCount;
/*0618*/ _LARGE_INTEGER WriteOperationCount;
/*0620*/ _LARGE_INTEGER OtherOperationCount;
/*0628*/ _LARGE_INTEGER ReadTransferCount;
/*0630*/ _LARGE_INTEGER WriteTransferCount;
/*0638*/ _LARGE_INTEGER OtherTransferCount;
/*0640*/ ULONGLONG CommitChargeLimit;
/*0648*/ ULONGLONG volatile CommitCharge;
/*0650*/ ULONGLONG volatile CommitChargePeak;
/*0680*/ _MMSUPPORT_FULL Vm;
/*07c0*/ _LIST_ENTRY MmProcessLinks;
/*07d0*/ ULONG ModifiedPageCount;
/*07d4*/ LONG ExitStatus;
/*07d8*/ _RTL_AVL_TREE VadRoot;
/*07e0*/ void * VadHint;
/*07e8*/ ULONGLONG VadCount;
/*07f0*/ ULONGLONG volatile VadPhysicalPages;
/*07f8*/ ULONGLONG VadPhysicalPagesLimit;
/*0800*/ _ALPC_PROCESS_CONTEXT AlpcContext;
/*0820*/ _LIST_ENTRY TimerResolutionLink;
/*0830*/ _PO_DIAG_STACK_RECORD * TimerResolutionStackRecord;
/*0838*/ ULONG RequestedTimerResolution;
/*083c*/ ULONG SmallestTimerResolution;
/*0840*/ _LARGE_INTEGER ExitTime;
/*0848*/ _INVERTED_FUNCTION_TABLE * InvertedFunctionTable;
/*0850*/ _EX_PUSH_LOCK InvertedFunctionTableLock;
/*0858*/ ULONG ActiveThreadsHighWatermark;
/*085c*/ ULONG LargePrivateVadCount;
/*0860*/ _EX_PUSH_LOCK ThreadListLock;
/*0868*/ void * WnfContext;
/*0870*/ _EJOB * ServerSilo;
/*0878*/ UCHAR SignatureLevel;
/*0879*/ UCHAR SectionSignatureLevel;
/*087a*/ _PS_PROTECTION Protection;
/*087b*/ UCHAR HangCount : 03; // 0x07;
/*087b*/ UCHAR GhostCount : 03; // 0x38;
/*087b*/ UCHAR PrefilterException : 01; // 0x40;
/*087c*/ ULONG Flags3;
/*087c*/ ULONG Minimal : 01; // 0x00000001;
/*087c*/ ULONG ReplacingPageRoot : 01; // 0x00000002;
/*087c*/ ULONG Crashed : 01; // 0x00000004;
/*087c*/ ULONG JobVadsAreTracked : 01; // 0x00000008;
/*087c*/ ULONG VadTrackingDisabled : 01; // 0x00000010;
/*087c*/ ULONG AuxiliaryProcess : 01; // 0x00000020;
/*087c*/ ULONG SubsystemProcess : 01; // 0x00000040;
/*087c*/ ULONG IndirectCpuSets : 01; // 0x00000080;
/*087c*/ ULONG RelinquishedCommit : 01; // 0x00000100;
/*087c*/ ULONG HighGraphicsPriority : 01; // 0x00000200;
/*087c*/ ULONG CommitFailLogged : 01; // 0x00000400;
/*087c*/ ULONG ReserveFailLogged : 01; // 0x00000800;
/*087c*/ ULONG SystemProcess : 01; // 0x00001000;
/*087c*/ ULONG HideImageBaseAddresses : 01; // 0x00002000;
/*087c*/ ULONG AddressPolicyFrozen : 01; // 0x00004000;
/*087c*/ ULONG ProcessFirstResume : 01; // 0x00008000;
/*087c*/ ULONG ForegroundExternal : 01; // 0x00010000;
/*087c*/ ULONG ForegroundSystem : 01; // 0x00020000;
/*087c*/ ULONG HighMemoryPriority : 01; // 0x00040000;
/*087c*/ ULONG EnableProcessSuspendResumeLogging : 01; // 0x00080000;
/*087c*/ ULONG EnableThreadSuspendResumeLogging : 01; // 0x00100000;
/*087c*/ ULONG SecurityDomainChanged : 01; // 0x00200000;
/*087c*/ ULONG SecurityFreezeComplete : 01; // 0x00400000;
/*087c*/ ULONG VmProcessorHost : 01; // 0x00800000;
/*087c*/ ULONG VmProcessorHostTransition : 01; // 0x01000000;
/*087c*/ ULONG AltSyscall : 01; // 0x02000000;
/*087c*/ ULONG TimerResolutionIgnore : 01; // 0x04000000;
/*087c*/ ULONG DisallowUserTerminate : 01; // 0x08000000;
/*0880*/ LONG DeviceAsid;
/*0888*/ void * SvmData;
/*0890*/ _EX_PUSH_LOCK SvmProcessLock;
/*0898*/ ULONGLONG SvmLock;
/*08a0*/ _LIST_ENTRY SvmProcessDeviceListHead;
/*08b0*/ ULONGLONG LastFreezeInterruptTime;
/*08b8*/ _PROCESS_DISK_COUNTERS * DiskCounters;
/*08c0*/ void * PicoContext;
/*08c8*/ void * EnclaveTable;
/*08d0*/ ULONGLONG EnclaveNumber;
/*08d8*/ _EX_PUSH_LOCK EnclaveLock;
/*08e0*/ ULONG HighPriorityFaultsAllowed;
/*08e8*/ _PO_PROCESS_ENERGY_CONTEXT * EnergyContext;
/*08f0*/ void * VmContext;
/*08f8*/ ULONGLONG SequenceNumber;
/*0900*/ ULONGLONG CreateInterruptTime;
/*0908*/ ULONGLONG CreateUnbiasedInterruptTime;
/*0910*/ ULONGLONG TotalUnbiasedFrozenTime;
/*0918*/ ULONGLONG LastAppStateUpdateTime;
/*0920*/ ULONGLONG LastAppStateUptime : 61; // 0x1fffffffffffffff;
/*0920*/ ULONGLONG LastAppState : 03; // 0xe000000000000000;
/*0928*/ ULONGLONG volatile SharedCommitCharge;
/*0930*/ _EX_PUSH_LOCK SharedCommitLock;
/*0938*/ _LIST_ENTRY SharedCommitLinks;
/*0948*/ ULONGLONG AllowedCpuSets;
/*0950*/ ULONGLONG DefaultCpuSets;
/*0948*/ ULONGLONG * AllowedCpuSetsIndirect;
/*0950*/ ULONGLONG * DefaultCpuSetsIndirect;
/*0958*/ void * DiskIoAttribution;
/*0960*/ void * DxgProcess;
/*0968*/ ULONG Win32KFilterSet;
/*096c*/ USHORT Machine;
/*096e*/ USHORT Spare0;
/*0970*/ _PS_INTERLOCKED_TIMER_DELAY_VALUES volatile ProcessTimerDelay;
/*0978*/ ULONG volatile KTimerSets;
/*097c*/ ULONG volatile KTimer2Sets;
/*0980*/ ULONG volatile ThreadTimerSets;
/*0988*/ ULONGLONG VirtualTimerListLock;
/*0990*/ _LIST_ENTRY VirtualTimerListHead;
/*09a0*/ _WNF_STATE_NAME WakeChannel;
/*09a0*/ _PS_PROCESS_WAKE_INFORMATION WakeInfo;
/*09d0*/ ULONG MitigationFlags;
struct {
/*09d0*/ ULONG ControlFlowGuardEnabled : 01; // 0x00000001;
/*09d0*/ ULONG ControlFlowGuardExportSuppressionEnabled : 01; // 0x00000002;
/*09d0*/ ULONG ControlFlowGuardStrict : 01; // 0x00000004;
/*09d0*/ ULONG DisallowStrippedImages : 01; // 0x00000008;
/*09d0*/ ULONG ForceRelocateImages : 01; // 0x00000010;
/*09d0*/ ULONG HighEntropyASLREnabled : 01; // 0x00000020;
/*09d0*/ ULONG StackRandomizationDisabled : 01; // 0x00000040;
/*09d0*/ ULONG ExtensionPointDisable : 01; // 0x00000080;
/*09d0*/ ULONG DisableDynamicCode : 01; // 0x00000100;
/*09d0*/ ULONG DisableDynamicCodeAllowOptOut : 01; // 0x00000200;
/*09d0*/ ULONG DisableDynamicCodeAllowRemoteDowngrade : 01; // 0x00000400;
/*09d0*/ ULONG AuditDisableDynamicCode : 01; // 0x00000800;
/*09d0*/ ULONG DisallowWin32kSystemCalls : 01; // 0x00001000;
/*09d0*/ ULONG AuditDisallowWin32kSystemCalls : 01; // 0x00002000;
/*09d0*/ ULONG EnableFilteredWin32kAPIs : 01; // 0x00004000;
/*09d0*/ ULONG AuditFilteredWin32kAPIs : 01; // 0x00008000;
/*09d0*/ ULONG DisableNonSystemFonts : 01; // 0x00010000;
/*09d0*/ ULONG AuditNonSystemFontLoading : 01; // 0x00020000;
/*09d0*/ ULONG PreferSystem32Images : 01; // 0x00040000;
/*09d0*/ ULONG ProhibitRemoteImageMap : 01; // 0x00080000;
/*09d0*/ ULONG AuditProhibitRemoteImageMap : 01; // 0x00100000;
/*09d0*/ ULONG ProhibitLowILImageMap : 01; // 0x00200000;
/*09d0*/ ULONG AuditProhibitLowILImageMap : 01; // 0x00400000;
/*09d0*/ ULONG SignatureMitigationOptIn : 01; // 0x00800000;
/*09d0*/ ULONG AuditBlockNonMicrosoftBinaries : 01; // 0x01000000;
/*09d0*/ ULONG AuditBlockNonMicrosoftBinariesAllowStore : 01; // 0x02000000;
/*09d0*/ ULONG LoaderIntegrityContinuityEnabled : 01; // 0x04000000;
/*09d0*/ ULONG AuditLoaderIntegrityContinuity : 01; // 0x08000000;
/*09d0*/ ULONG EnableModuleTamperingProtection : 01; // 0x10000000;
/*09d0*/ ULONG EnableModuleTamperingProtectionNoInherit : 01; // 0x20000000;
/*09d0*/ ULONG RestrictIndirectBranchPrediction : 01; // 0x40000000;
/*09d0*/ ULONG IsolateSecurityDomain : 01; // 0x80000000;
/*09d4*/
} MitigationFlagsValues;
/*09d4*/ ULONG MitigationFlags2;
struct {
/*09d4*/ ULONG EnableExportAddressFilter : 01; // 0x00000001;
/*09d4*/ ULONG AuditExportAddressFilter : 01; // 0x00000002;
/*09d4*/ ULONG EnableExportAddressFilterPlus : 01; // 0x00000004;
/*09d4*/ ULONG AuditExportAddressFilterPlus : 01; // 0x00000008;
/*09d4*/ ULONG EnableRopStackPivot : 01; // 0x00000010;
/*09d4*/ ULONG AuditRopStackPivot : 01; // 0x00000020;
/*09d4*/ ULONG EnableRopCallerCheck : 01; // 0x00000040;
/*09d4*/ ULONG AuditRopCallerCheck : 01; // 0x00000080;
/*09d4*/ ULONG EnableRopSimExec : 01; // 0x00000100;
/*09d4*/ ULONG AuditRopSimExec : 01; // 0x00000200;
/*09d4*/ ULONG EnableImportAddressFilter : 01; // 0x00000400;
/*09d4*/ ULONG AuditImportAddressFilter : 01; // 0x00000800;
/*09d4*/ ULONG DisablePageCombine : 01; // 0x00001000;
/*09d4*/ ULONG SpeculativeStoreBypassDisable : 01; // 0x00002000;
/*09d4*/ ULONG CetUserShadowStacks : 01; // 0x00004000;
/*09d4*/ ULONG AuditCetUserShadowStacks : 01; // 0x00008000;
/*09d4*/ ULONG AuditCetUserShadowStacksLogged : 01; // 0x00010000;
/*09d4*/ ULONG UserCetSetContextIpValidation : 01; // 0x00020000;
/*09d4*/ ULONG AuditUserCetSetContextIpValidation : 01; // 0x00040000;
/*09d4*/ ULONG AuditUserCetSetContextIpValidationLogged : 01; // 0x00080000;
/*09d4*/ ULONG CetUserShadowStacksStrictMode : 01; // 0x00100000;
/*09d4*/ ULONG BlockNonCetBinaries : 01; // 0x00200000;
/*09d4*/ ULONG BlockNonCetBinariesNonEhcont : 01; // 0x00400000;
/*09d4*/ ULONG AuditBlockNonCetBinaries : 01; // 0x00800000;
/*09d4*/ ULONG AuditBlockNonCetBinariesLogged : 01; // 0x01000000;
/*09d4*/ ULONG XtendedControlFlowGuard : 01; // 0x02000000;
/*09d4*/ ULONG AuditXtendedControlFlowGuard : 01; // 0x04000000;
/*09d4*/ ULONG PointerAuthUserIp : 01; // 0x08000000;
/*09d4*/ ULONG AuditPointerAuthUserIp : 01; // 0x10000000;
/*09d4*/ ULONG AuditPointerAuthUserIpLogged : 01; // 0x20000000;
/*09d4*/ ULONG CetDynamicApisOutOfProcOnly : 01; // 0x40000000;
/*09d4*/ ULONG UserCetSetContextIpValidationRelaxedMode : 01; // 0x80000000;
/*09d8*/
} MitigationFlags2Values;
/*09d8*/ void * PartitionObject;
/*09e0*/ ULONGLONG SecurityDomain;
/*09e8*/ ULONGLONG ParentSecurityDomain;
/*09f0*/ void * CoverageSamplerContext;
/*09f8*/ void * MmHotPatchContext;
/*0a00*/ _KE_IDEAL_PROCESSOR_ASSIGNMENT_BLOCK IdealProcessorAssignmentBlock;
/*0b18*/ _RTL_AVL_TREE DynamicEHContinuationTargetsTree;
/*0b20*/ _EX_PUSH_LOCK DynamicEHContinuationTargetsLock;
/*0b28*/ _PS_DYNAMIC_ENFORCED_ADDRESS_RANGES DynamicEnforcedCetCompatibleRanges;
/*0b40*/
};
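To make the cross-referencing between the disassembly and the structure dump concrete, here is a minimal sketch that maps the displacement and mask operands seen in PsFreezeProcess to field names. The offsets and masks are copied from the dump above and apply only to that kernel build. The FieldEntry table and the FieldAt helper are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// One row per (offset, mask) pair taken from the _EPROCESS dump above.
// mask == 0 means the whole field rather than a single bit-field.
struct FieldEntry {
    std::uint32_t offset;
    std::uint32_t mask;
    const char*   name;
};

// Hypothetical lookup table; the values are valid only for the dumped build.
static const FieldEntry kFields[] = {
    {0x438, 0x00000000, "ProcessLock"},
    {0x464, 0x00000008, "Flags.ProcessDelete"},
    {0x87c, 0x00080000, "Flags3.EnableProcessSuspendResumeLogging"},
    {0x8b0, 0x00000000, "LastFreezeInterruptTime"},
};

// Returns the field name for a memory operand such as [rdi+464h] tested
// against an immediate mask, or "unknown" if the table has no such row.
std::string FieldAt(std::uint32_t offset, std::uint32_t mask = 0) {
    for (const FieldEntry& e : kFields) {
        if (e.offset == offset && e.mask == mask) return e.name;
    }
    return "unknown";
}

// Self-check against instructions from the PsFreezeProcess listing.
inline void CheckFieldTable() {
    assert(FieldAt(0x438) == "ProcessLock");               // lea rbp,[rdi+438h]
    assert(FieldAt(0x464, 0x8) == "Flags.ProcessDelete");  // test al,8
    assert(FieldAt(0x8b0) == "LastFreezeInterruptTime");   // mov [rdi+8B0h],rax
}
```

Read this way, the two early exits in the listing (test al,8 and test cl,8) appear to skip the freeze when the process is already being deleted.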
To quote MSDN:
Unlike ERESOURCE structures, push locks cannot be acquired recursively. If the caller already has acquired the push lock for exclusive or shared access, the thread hangs.
So it seems that during creation of the process the thread had already acquired Process->ProcessLock, and a repeated call inside PsFreezeProcess led to a recursive acquisition of the same push lock, which in turn caused the deadlock.
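The recursion problem can be illustrated with a toy lock. The sketch below is a user-mode model, not the kernel's implementation: ToyPushLock and its methods are hypothetical, and it reports failure where the real ExAcquirePushLockExclusiveEx would block. As modeled here, the lock stores no owner, so it cannot tell a second acquisition by the same thread apart from ordinary contention.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy, user-mode model of a non-recursive exclusive lock (hypothetical type).
class ToyPushLock {
public:
    // Returns true if the lock was free and is now held exclusively.
    // Returns false where a real push lock would put the caller to sleep.
    bool TryAcquireExclusive() {
        std::uint64_t expected = 0;
        return state_.compare_exchange_strong(expected, kLocked);
    }

    void ReleaseExclusive() { state_.store(0); }

private:
    static constexpr std::uint64_t kLocked = 1;
    std::atomic<std::uint64_t> state_{0};
};

// Mirrors the sequence suspected in the text: the lock is taken once during
// process creation, then requested again further down the same call path.
// Returns true if the nested request would have to wait.
inline bool NestedAcquireWouldBlock() {
    ToyPushLock process_lock;
    const bool outer = process_lock.TryAcquireExclusive();   // first acquisition
    assert(outer);
    (void)outer;
    const bool inner = process_lock.TryAcquireExclusive();   // same thread, nested
    process_lock.ReleaseExclusive();
    return !inner;
}
```

In the kernel the nested request does not return false. The thread waits for a release that only it could perform, which is the hang observed here.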
Consequences
But it didn't end with deadlocking the new (not fully created) process. I noticed that the Start button had also stopped working, along with Windows Store UWP apps such as Calculator.
My guess was that something else was trying to acquire the same Process->ProcessLock that our thread was stuck on, and hung as a result. But that "something" must also have had the WrPushLock wait reason while hanging. So I decided to look for such threads in the system, enumerating them with the NtQuerySystemInformation function:
// Prints every thread in the system whose wait reason is WrPushLock.
void DumpWrPushLock()
{
    // One buffer position viewed as a process entry, a raw pointer, or bytes.
    union {
        PSYSTEM_PROCESS_INFORMATION pspi;
        PVOID pv;
        PBYTE pb;
    };
    NTSTATUS status;
    ULONG cb = 0x10000;
    do
    {
        status = STATUS_INSUFFICIENT_RESOURCES;
        // Retry with a larger buffer while the call reports STATUS_INFO_LENGTH_MISMATCH.
        if (PVOID buf = LocalAlloc(0, cb += PAGE_SIZE))
        {
            if (0 <= (status = NtQuerySystemInformation(SystemProcessInformation, buf, cb, &cb)))
            {
                pv = buf;
                ULONG NextEntryOffset = 0;
                PSYSTEM_PROCESS_INFORMATION _pspi = 0; // last process already printed
                do
                {
                    pb += NextEntryOffset; // advance to the next process entry
                    if (ULONG NumberOfThreads = pspi->NumberOfThreads)
                    {
                        // TH is the thread array that trails each process entry.
                        SYSTEM_THREAD_INFORMATION* TH = pspi->TH;
                        do
                        {
                            if (TH->ThreadState == StateWait && TH->WaitReason == WrPushLock)
                            {
                                // Print the process line once per process.
                                if (_pspi != pspi)
                                {
                                    _pspi = pspi;
                                    DbgPrint("%x %wZ\n", (ULONG)(ULONG_PTR)pspi->UniqueProcessId, &pspi->ImageName);
                                }
                                DbgPrint("\t%x\n", (ULONG)(ULONG_PTR)TH->ClientId.UniqueThread);
                            }
                        } while (TH++, --NumberOfThreads);
                    }
                } while ((NextEntryOffset = pspi->NextEntryOffset) != 0); // 0 marks the last entry
            }
            LocalFree(buf);
        }
    } while (status == STATUS_INFO_LENGTH_MISMATCH);
}
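The core of DumpWrPushLock is the walk over variable-length records chained by NextEntryOffset. Below is a portable sketch of that walk over a synthetic buffer, so it can be run without the native API. The ToyProcessEntry and ToyThreadEntry layouts, the kToyWrPushLock value and both helper functions are simplified stand-ins, not the real SYSTEM_PROCESS_INFORMATION definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-ins for the native structures (hypothetical layouts).
struct ToyThreadEntry  { std::uint32_t thread_id; std::uint32_t wait_reason; };
struct ToyProcessEntry { std::uint32_t next_entry_offset; std::uint32_t thread_count; };

constexpr std::uint32_t kToyOther      = 0;  // placeholder values, not the
constexpr std::uint32_t kToyWrPushLock = 1;  // real KWAIT_REASON numbers

// Packs each process as a header followed by its threads. Every header stores
// the distance to the next record; the last header stores 0.
std::vector<std::uint8_t> BuildBuffer(const std::vector<std::vector<ToyThreadEntry>>& processes) {
    std::vector<std::uint8_t> buf;
    for (std::size_t i = 0; i < processes.size(); ++i) {
        const std::vector<ToyThreadEntry>& threads = processes[i];
        const std::size_t thread_bytes = threads.size() * sizeof(ToyThreadEntry);
        const std::size_t record_size = sizeof(ToyProcessEntry) + thread_bytes;
        ToyProcessEntry header{};
        header.next_entry_offset =
            (i + 1 == processes.size()) ? 0 : static_cast<std::uint32_t>(record_size);
        header.thread_count = static_cast<std::uint32_t>(threads.size());
        const std::size_t start = buf.size();
        buf.resize(start + record_size);
        std::memcpy(buf.data() + start, &header, sizeof header);
        if (thread_bytes != 0) {
            std::memcpy(buf.data() + start + sizeof header, threads.data(), thread_bytes);
        }
    }
    return buf;
}

// Walks the chain and collects the IDs of threads with the given wait reason.
// Like the original, it trusts the offsets stored in the buffer.
std::vector<std::uint32_t> FindThreadsWaitingOn(const std::vector<std::uint8_t>& buf,
                                                std::uint32_t reason) {
    std::vector<std::uint32_t> hits;
    if (buf.size() < sizeof(ToyProcessEntry)) return hits;
    std::size_t pos = 0;
    for (;;) {
        ToyProcessEntry header;
        std::memcpy(&header, buf.data() + pos, sizeof header);
        const std::uint8_t* thread_bytes = buf.data() + pos + sizeof header;
        for (std::uint32_t i = 0; i < header.thread_count; ++i) {
            ToyThreadEntry t;
            std::memcpy(&t, thread_bytes + i * sizeof t, sizeof t);
            if (t.wait_reason == reason) hits.push_back(t.thread_id);
        }
        if (header.next_entry_offset == 0) break;  // last record in the chain
        pos += header.next_entry_offset;
    }
    return hits;
}
```

The memcpy reads avoid alignment assumptions that the real structures satisfy by construction.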
During normal operation, a random check should find no such threads at all. But I was able to find one in the following process:
svchost.exe -k DcomLaunch -p
FYI: DcomLaunch is one of the critically important processes in the system, responsible for the COM & DCOM functionality.
300 9 00007FF7801E4E80 00000044B63B7000 UserRequest
394 8 00007FFCE44020E0 00000044B63C7000 UserRequest
3C0 8 00007FFCE44020E0 00000044B63CB000 WrQueue
170 8 00007FFCDD458D60 00000044B63D7000 UserRequest
318 9 00007FFCE44020E0 00000044B63DD000 WrAlertByThreadId
474 9 00007FFCE44020E0 00000044B63E5000 WrLpcReply
DE0 9 00007FFCDF48E290 00000044B63E7000 WrAlertByThreadId
BC8 8 00007FFCE44020E0 00000044B63E9000 WrQueue
AE0 9 00007FFCE44020E0 00000044B6200000 WrPushLock
1574 8 00007FFCE44020E0 00000044B6206000 WrQueue
988 9 00007FFCE44020E0 00000044B620A000 WrAlertByThreadId
At that time I immediately noticed a thread with the WrLpcReply wait reason. This meant that the thread had sent a request to an ALPC port and was waiting for a response from the server, most certainly an RPC call (RPC over LPC).
Under normal circumstances the reply arrives very quickly, and we would never see WrLpcReply in our debugging tool. Seeing it consistently meant that something was very wrong with the system.
I also noticed threads with the WrAlertByThreadId wait reason. That reason usually means that the thread is waiting for a critical section or an SRW lock.
Having listed such threads, I noticed that there were quite a few of them, which was very unusual, as it is a very transient wait state as well:
At this point it was clear to me that something was very wrong with the DcomLaunch process. Knowing its significance for the system, I continued my research with the thread with ID 0xAE0 that was stuck in WrPushLock.
DcomLaunch
I checked the call stack of the affected thread:
RSP Return Address Function Name
FFFF818E78B088F0 FFFFF80232C00C26 ntoskrnl.exe!SwapContext + 4b7
FFFF818E78B08A30 FFFFF80232AEA540 ntoskrnl.exe!KiSwapContext + 76
FFFF818E78B08AE0 FFFFF80232AE9A6F ntoskrnl.exe!KiSwapThread + 500
FFFF818E78B08B80 FFFFF80232AE9313 ntoskrnl.exe!KiCommitThreadWait + 14f
FFFF818E78B08C70 FFFFF80232A14563 ntoskrnl.exe!KeWaitForSingleObject + 233
FFFF818E78B08D20 FFFFF80232AE5A72 ntoskrnl.exe!ExfAcquirePushLockSharedEx + 1b3
FFFF818E78B08D70 FFFFF80232EF11B1 ntoskrnl.exe!ExAcquirePushLockSharedEx + 162
FFFF818E78B08DC0 FFFFF80232EF2820 ntoskrnl.exe!PsGetNextProcessThread + 51
FFFF818E78B08E10 FFFFF80232E97CBE ntoskrnl.exe!PspBoostJobIoPriorityCallback + 70
FFFF818E78B08E60 FFFFF80232E97BCC ntoskrnl.exe!PspCallJobHierarchyCallbacks + ba
FFFF818E78B08EF0 FFFFF80232E78630 ntoskrnl.exe!PspEnumJobsAndProcessesInJobHierarchy + 1cc
FFFF818E78B08F30 FFFFF80232F0CE1A ntoskrnl.exe!PspSetBackgroundJobTree + a0
FFFF818E78B09B00 FFFFF80232C0B875 ntoskrnl.exe!NtSetInformationJobObject + 11ca
00000044B657D8E8 00007FFCE448F094 ntoskrnl.exe!KiSystemServiceCopyEnd + 25
00000044B657D8F0 00007FFCDF63E7DD ntdll.dll!ZwSetInformationJobObject + 14
00000044B657DA10 00007FFCDF63E737 psmsrv.dll!PsmpEnableDisableBackgroundMode + 5d
00000044B657DA50 00007FFCDF631BC4 psmsrv.dll!PsmpEnableDisablePrioritySandbox + cb
00000044B657DA80 00007FFCDF6392E2 psmsrv.dll!PsmpEnableDisableSandbox + a4
00000044B657DBE0 00007FFCDF633E22 psmsrv.dll!PsmpChangeApplicationState + 482
00000044B657DC30 00007FFCDF633C07 psmsrv.dll!PsmpEndWorkItemExecution + 86
00000044B657DC90 00007FFCDF249612 psmsrv.dll!PsmDereferenceBrokeredExecution + 27
00000044B657DCD0 00007FFCDF23F057 bisrv.dll!BipTaskInstanceStop + ea
00000044B657DF90 00007FFCDF24988D bisrv.dll!BipTaskCompletionCallback + 337
00000044B657E060 00007FFCE31B2033 bisrv.dll!virtual long CBackgroundWorkItemInstanceRemote::C...
00000044B657E000 00007FFCE321B1D3 RPCRT4.dll!Invoke + 73
00000044B657E760 00007FFCE315B3F9 RPCRT4.dll!long Ndr64StubWorker(void *,void *,_RPC_MESSAGE ...
00000044B657E7C0 00007FFCE38675D0 RPCRT4.dll!NdrStubCall3 + c9
00000044B657E800 00007FFCE319A3AB combase.dll!CStdStubBuffer_Invoke + 60
00000044B657E830 00007FFCE37F5EE3 RPCRT4.dll!CStdStubBuffer_Invoke + 3b
00000044B657E890 00007FFCE37F5C6E combase.dll!long DefaultStubInvoke(bool,IServerCall *,IRpcC...
00000044B657E9F0 00007FFCE386AB76 combase.dll!long DefaultStubInvoke(bool,IServerCall *,IRpcC...
00000044B657EA30 00007FFCE37EF9B3 combase.dll!virtual long SyncServerCall::StubInvoke(IRpcCha...
00000044B657ED80 00007FFCE386354D combase.dll!virtual long ServerCall::ContextInvoke(tagRPCOL...
00000044B657EDD0 00007FFCE37DED68 combase.dll!long DefaultInvokeInApartment(tagRPCOLEMESSAGE
00000044B657F1B0 00007FFCE38A77D8 combase.dll!long ComInvokeWithLockAndIPID(ServerCall *,tagI...
00000044B657F4D0 00007FFCE37E0E29 combase.dll!long ThreadInvokeWorker(_RPC_MESSAGE *) + 7c4
00000044B657F500 00007FFCE3196708 combase.dll!void ThreadInvoke(_RPC_MESSAGE *) + 9
00000044B657F550 00007FFCE3179196 RPCRT4.dll!DispatchToStubInCNoAvrf + 18
00000044B657F630 00007FFCE3178DC6 RPCRT4.dll!long RPC_INTERFACE::DispatchToStubWorker(_RPC_ME...
00000044B657F6D0 00007FFCE318714F RPCRT4.dll!long RPC_INTERFACE::DispatchToStubWithObject(_RP...
00000044B657F7A0 00007FFCE3186708 RPCRT4.dll!long LRPC_SCALL::DispatchRequest(int *) + 16f
00000044B657F8B0 00007FFCE3185CF1 RPCRT4.dll!void LRPC_SCALL::HandleRequest(_PORT_MESSAGE *,_...
00000044B657F950 00007FFCE318575E RPCRT4.dll!void LRPC_ADDRESS::HandleRequest(_PORT_MESSAGE *...
00000044B657FA90 00007FFCE3189CE2 RPCRT4.dll!void LRPC_ADDRESS::ProcessIO(void *) + 89e
00000044B657FB30 00007FFCE442F220 RPCRT4.dll!void LrpcIoComplete(_TP_CALLBACK_INSTANCE *,void...
00000044B657FBB0 00007FFCE4402536 ntdll.dll!TppAlpcpExecuteCallback + 260
And what did we see there? A familiar push lock acquisition, this time through ExAcquirePushLockSharedEx:
; PsGetNextProcessThread function
FFFFF80232EF1160 mov qword ptr [rsp+10h],rbx
FFFFF80232EF1165 mov qword ptr [rsp+18h],rbp
FFFFF80232EF116A mov qword ptr [rsp+20h],rsi
FFFFF80232EF116F push rdi
FFFFF80232EF1170 push r12
FFFFF80232EF1172 push r13
FFFFF80232EF1174 push r14
FFFFF80232EF1176 push r15
FFFFF80232EF1178 sub rsp,20h
FFFFF80232EF117C mov r12,qword ptr gs:[188h]
FFFFF80232EF1185 lea r13,[rcx+5E0h]
FFFFF80232EF118C xor ebx,ebx
FFFFF80232EF118E mov rdi,rdx
FFFFF80232EF1191 mov r15d,ebx
FFFFF80232EF1194 mov r14d,ebx
FFFFF80232EF1197 dec word ptr [r12+1E4h]
FFFFF80232EF11A0 lea rbp,[rcx+438h]
FFFFF80232EF11A7 xor edx,edx
FFFFF80232EF11A9 mov rcx,rbp
FFFFF80232EF11AC call ExAcquirePushLockSharedEx(FFFFF80232AE5910)
FFFFF80232EF11B1 test rdi,rdi
FFFFF80232EF11B4 je FFFFF80232EF123B
FFFFF80232EF11BA mov rsi,qword ptr [rdi+4E8h]
FFFFF80232EF11C1 cmp rsi,r13
FFFFF80232EF11C4 je FFFFF80232EF11E4
FFFFF80232EF11C6 lea r15,[rsi-4E8h]
FFFFF80232EF11CD mov edx,6E457350h
FFFFF80232EF11D2 mov rcx,r15
FFFFF80232EF11D5 call ObReferenceObjectSafeWithTag(FFFFF80232AE8B50)
FFFFF80232EF11DA test al,al
FFFFF80232EF11DC je FFFFF80232EF124B
FFFFF80232EF11DE mov r14d,1
FFFFF80232EF11E4 mov eax,11h
FFFFF80232EF11E9 lock cmpxchg qword ptr [rbp],rbx
FFFFF80232EF11EF jne FFFFF80232EF1241
FFFFF80232EF11F1 mov rcx,rbp
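The mov eax,11h / lock cmpxchg pair at the end of this listing looks like a fast-path release of the shared lock. The sketch below models that reading with std::atomic. It assumes the _EX_PUSH_LOCK bit layout known from public symbols (bit 0 is Locked, the share count starts at bit 4), which is not shown in this post. The function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed layout of the lock word (from public symbols, not from this post).
constexpr std::uint64_t kLockedBit = 0x1;   // bit 0: Locked
constexpr std::uint64_t kOneSharer = 0x10;  // share count of 1 (count starts at bit 4)

// Models: mov eax,11h / lock cmpxchg qword ptr [rbp],rbx  (with rbx == 0).
// Succeeds only when the word is exactly "locked, one sharer, no waiter bits".
// Any other value corresponds to the jne branch, i.e. a slower release path.
bool TryFastReleaseShared(std::atomic<std::uint64_t>& lock_word) {
    std::uint64_t expected = kLockedBit | kOneSharer;  // 0x11
    return lock_word.compare_exchange_strong(expected, 0);
}

// Small self-check of the two outcomes.
inline void CheckFastRelease() {
    std::atomic<std::uint64_t> uncontended{0x11};
    assert(TryFastReleaseShared(uncontended) && uncontended.load() == 0);

    std::atomic<std::uint64_t> with_waiter{0x13};  // assumed Waiting bit also set
    assert(!TryFastReleaseShared(with_waiter) && with_waiter.load() == 0x13);
}
```

The stuck DcomLaunch thread never reached this release. Its return address, FFFFF80232EF11B1, is the instruction right after the ExAcquirePushLockSharedEx call, so it was still waiting to acquire.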
That thread was servicing an RPC call (which meant that the thread on the other end of the RPC was also hanging!)
It also looked like the RPC call ended up in ZwSetInformationJobObject, which was operating on the job that our deadlocked process was stalling in.
(That is also where we got the acquisition of the push lock in Process->ProcessLock through the job object.)
At this point nothing could communicate with that process, since its creation had not finished yet.
(And it wouldn't even be visible in the process list, i.e. it wouldn't be included in PsActiveProcessHead.)
And that is how that thread got to be hanging. But what called it?
backgroundTaskHost
I searched through the PORT_MESSAGE stack and found the process and thread ID of the caller. It turned out to be a thread from this process:
"C:\Windows\system32\backgroundTaskHost.exe" -ServerName:CortanaUI.AppX3bn25b6f886wmg6twh46972vprk9tnbf.mca
ntdll.dll!ZwAlpcSendWaitReceivePort + 14
rpcrt4.dll!long LRPC_BASE_CCALL::DoSendReceive(void) + 112
rpcrt4.dll!virtual long LRPC_CCALL::SendReceive(_RPC_MESSAGE *) + 51
rpcrt4.dll!I_RpcSendReceive + 6f
combase.dll!long CMessageCall::RpcSendRequestReceiveResponse(_GUID const &) + b5
combase.dll!virtual long CSyncClientCall::SendReceive2(tagRPCOLEMESSAGE *,unsigned long *) + 248
combase.dll!long DefaultSendReceive(CSyncClientCall *,tagRPCOLEMESSAGE *,unsigned long *) + 64
combase.dll!virtual long CSyncClientCall::SendReceive(tagRPCOLEMESSAGE *,unsigned long *) + 18b
combase.dll!virtual long CClientChannel::SendReceive(tagRPCOLEMESSAGE *,unsigned long *) + 84
combase.dll!void NdrExtpProxySendReceive(void *,_MIDL_STUB_MESSAGE *) + 4e
rpcrt4.dll!NdrpClientCall3 + 3a4
combase.dll!ObjectStublessClient + 138
combase.dll!ObjectStubless + 42
biwinrt.dll!void Windows::ApplicationModel::Background::CBackgroundTaskInstance::NotifyRemoteInstanceCompleted(long,unsigned char) + 4b
biwinrt.dll!virtual long Windows::ApplicationModel::Background::CBackgroundTaskInstance::Run(Windows::ApplicationModel::Background::IBackgro...
twinapi.appcore.dll!void Windows::ApplicationModel::Core::BackgroundTaskWrapper::Run(_TP_CALLBACK_INSTANCE *) + 6f
twinapi.appcore.dll!static void Windows::ApplicationModel::Core::BackgroundTaskWrapper::ThreadProc(_TP_CALLBACK_INSTANCE *,void *,_TP_WORK *...
ntdll.dll!TppWorkpExecuteCallback + 130
ntdll.dll!TppWorkerThread + 68a
kernel32.dll!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
It was obvious that this thread was brokering some other call and that "something" was also hanging. Unfortunately we couldn't trace it using this call stack, as it looked like the original caller had submitted a work item using SubmitThreadpoolWork and gone into its waiting loop. (Or it could've been QueueUserWorkItem.) In other words, the trace had been lost there.
So I decided to go back to DcomLaunch and started looking at the thread with the WrLpcReply wait reason.
WrLpcReply & LPC Port Trace
Here's the call stack for it:
ntdll.dll!ZwAlpcSendWaitReceivePort + 14
RPCRT4.dll!virtual long LRPC_BASE_CCALL::SendReceive(_RPC_MESSAGE *) + 12f
RPCRT4.dll!I_RpcSendReceive + 6f
RPCRT4.dll!NdrSendReceive + 36
RPCRT4.dll!NdrpClientCall3 + 5d2
RPCRT4.dll!NdrClientCall3 + f0
RMCLIENT.dll!HamCreateActivityEx + 12f
bisrv.dll!long Execution::BmWorkItem::StartActivity(void *,unsigned short const *,unsigned long,unsigned long) + 180
bisrv.dll!virtual long Execution::BackgroundManager::TryAcquireResourceSetInternal(Execution::BMWorkItem *,RmResourceSetType,int,unsi...
bisrv.dll!virtual long Execution::BackgroundManager::TryAcquireResourceSetEx(_GUID const *,RmResourceSetType,int,unsigned long) + 8b
bisrv.dll!virtual long Execution::BackgroundManager::TryAcquireResourceSet(_GUID const *,RmResourceSetType,int) + 16
bisrv.dll!virtual long Execution::BackgroundPolicies::BackgroundPolicyBase<Execution::BackgroundPolicies::Common::DefaultWorkItemEntr...
bisrv.dll!virtual long Execution::BackgroundPolicies::BackgroundPolicyBase<Execution::BackgroundPolicies::Common::DefaultWorkItemEntr...
bisrv.dll!virtual long Execution::BackgroundManager::EvaluateActivationActionWithPolicy(Execution::BmWorkItem *) + 48c
bisrv.dll!static void ThreadPool::WorkerCallback(_TP_CALLBACK_INSTANCE *,void *,_TP_WORK *) + 14a
ntdll.dll!TppWorkpExecuteCallback + 130
ntdll.dll!TppWorkerThread + 68a
KERNEL32.DLL!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
Again, it looked like a worker thread that had been started by SubmitThreadpoolWork, so its original context (that is, how it was initially created by a call inside DcomLaunch) was now lost. In other words, I couldn't see the originator in the call stack shown above.
Additionally, a thread that was supposed to process requests on this LPC port was also hanging. But who owns that port?
We can take the thread context and use it to determine the value of the HANDLE for that port. Then, using the handle, we can find the pointer to the ALPC_PORT object. After that, we can follow CommunicationInfo->ServerCommunicationPort->OwnerProcess->UniqueProcessId to deduce the PID. That is how we can determine the process that was servicing the server side of the port.
struct _ALPC_PORT {
/*0000*/ _LIST_ENTRY PortListEntry;
/*0010*/ _ALPC_COMMUNICATION_INFO * CommunicationInfo;
/*0018*/ _EPROCESS * OwnerProcess;
/*0020*/ void * CompletionPort;
/*0028*/ void * CompletionKey;
/*0030*/ _ALPC_COMPLETION_PACKET_LOOKASIDE * CompletionPacketLookaside;
/*0038*/ void * PortContext;
/*0040*/ _SECURITY_CLIENT_CONTEXT StaticSecurity;
/*0088*/ _EX_PUSH_LOCK IncomingQueueLock;
/*0090*/ _LIST_ENTRY MainQueue;
/*00a0*/ _LIST_ENTRY LargeMessageQueue;
/*00b0*/ _EX_PUSH_LOCK PendingQueueLock;
/*00b8*/ _LIST_ENTRY PendingQueue;
/*00c8*/ _EX_PUSH_LOCK DirectQueueLock;
/*00d0*/ _LIST_ENTRY DirectQueue;
/*00e0*/ _EX_PUSH_LOCK WaitQueueLock;
/*00e8*/ _LIST_ENTRY WaitQueue;
/*00f8*/ _KSEMAPHORE * Semaphore;
/*00f8*/ _KEVENT * DummyEvent;
/*0100*/ _ALPC_PORT_ATTRIBUTES PortAttributes;
/*0148*/ _EX_PUSH_LOCK ResourceListLock;
/*0150*/ _LIST_ENTRY ResourceListHead;
/*0160*/ _EX_PUSH_LOCK PortObjectLock;
/*0168*/ _ALPC_COMPLETION_LIST * CompletionList;
/*0170*/ _CALLBACK_OBJECT * CallbackObject;
/*0178*/ void * CallbackContext;
/*0180*/ _LIST_ENTRY CanceledQueue;
/*0190*/ LONG SequenceNo;
/*0194*/ LONG ReferenceNo;
/*0198*/ _PALPC_PORT_REFERENCE_WAIT_BLOCK * ReferenceNoWait;
union {
struct {
/*01a0*/ ULONG Initialized : 01; // 0x00000001;
/*01a0*/ ULONG Type : 02; // 0x00000006;
/*01a0*/ ULONG ConnectionPending : 01; // 0x00000008;
/*01a0*/ ULONG ConnectionRefused : 01; // 0x00000010;
/*01a0*/ ULONG Disconnected : 01; // 0x00000020;
/*01a0*/ ULONG Closed : 01; // 0x00000040;
/*01a0*/ ULONG NoFlushOnClose : 01; // 0x00000080;
/*01a0*/ ULONG ReturnExtendedInfo : 01; // 0x00000100;
/*01a0*/ ULONG Waitable : 01; // 0x00000200;
/*01a0*/ ULONG DynamicSecurity : 01; // 0x00000400;
/*01a0*/ ULONG Wow64CompletionList : 01; // 0x00000800;
/*01a0*/ ULONG Lpc : 01; // 0x00001000;
/*01a0*/ ULONG LpcToLpc : 01; // 0x00002000;
/*01a0*/ ULONG HasCompletionList : 01; // 0x00004000;
/*01a0*/ ULONG HadCompletionList : 01; // 0x00008000;
/*01a0*/ ULONG EnableCompletionList : 01; // 0x00010000;
/*01a4*/
} s1;
/*01a0*/ ULONG State;
/*01a4*/
} u1;
/*01a8*/ _ALPC_PORT * TargetQueuePort;
/*01b0*/ _ALPC_PORT * TargetSequencePort;
/*01b8*/ _KALPC_MESSAGE * CachedMessage;
/*01c0*/ ULONG MainQueueLength;
/*01c4*/ ULONG LargeMessageQueueLength;
/*01c8*/ ULONG PendingQueueLength;
/*01cc*/ ULONG DirectQueueLength;
/*01d0*/ ULONG CanceledQueueLength;
/*01d4*/ ULONG WaitQueueLength;
/*01d8*/
};
struct _ALPC_COMMUNICATION_INFO {
/*0000*/ _ALPC_PORT * ConnectionPort;
/*0008*/ _ALPC_PORT * ServerCommunicationPort;
/*0010*/ _ALPC_PORT * ClientCommunicationPort;
/*0018*/ _LIST_ENTRY CommunicationList;
/*0028*/ _ALPC_HANDLE_TABLE HandleTable;
/*0040*/ _KALPC_MESSAGE * CloseMessage;
/*0048*/
};
Check above for the declaration of the _EPROCESS struct.
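To make the pointer walk concrete, here is a minimal user-mode sketch of it. The Mock* types and the ServerPidFromClientPort helper are hypothetical names of mine: they keep only the fields used in the walk, so their offsets do not match the real kernel layouts shown above, and in practice you would follow these pointers in a kernel debugger rather than in code like this.

```cpp
#include <cstdint>

// Simplified stand-ins for _EPROCESS, _ALPC_COMMUNICATION_INFO and _ALPC_PORT.
// Only the fields needed for the walk are kept (assumption: illustration only,
// the layouts do NOT match the real kernel structures).
struct MockEprocess {
    std::uintptr_t UniqueProcessId;
};

struct MockAlpcPort;

struct MockAlpcCommunicationInfo {
    MockAlpcPort* ConnectionPort;
    MockAlpcPort* ServerCommunicationPort;
    MockAlpcPort* ClientCommunicationPort;
};

struct MockAlpcPort {
    MockAlpcCommunicationInfo* CommunicationInfo;
    MockEprocess* OwnerProcess;
};

// Follows CommunicationInfo->ServerCommunicationPort->OwnerProcess->UniqueProcessId.
// Returns 0 if any link in the chain is missing.
std::uintptr_t ServerPidFromClientPort(const MockAlpcPort* clientPort) {
    if (clientPort == nullptr || clientPort->CommunicationInfo == nullptr) {
        return 0;
    }
    const MockAlpcPort* server = clientPort->CommunicationInfo->ServerCommunicationPort;
    if (server == nullptr || server->OwnerProcess == nullptr) {
        return 0;
    }
    return server->OwnerProcess->UniqueProcessId;
}
```

Both ends of a connection share one communication-info block, which is why starting from the client's port object is enough to reach the server's owner process.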
So, knowing the PID of the process, I was able to trace it to our DcomLaunch.
But why was the LPC request not processed and the reply not dispatched back? Evidently, "something" on the other end that was supposed to provide a reply had hung. And the multitude of threads with the WrAlertByThreadId wait state hinted at where the deadlock was.
I looked at their call stacks and the picture was similar to this:
ntdll.dll!ZwWaitForAlertByThreadId + 14
ntdll.dll!RtlAcquireSRWLockExclusive + 165
bisrv.dll!BipAcquireGlobalLock + 79
bisrv.dll!void BipWorkItemStartCallback(void *,_CBROKERED_EVENT_ID, void *,void *,unsigned long, unsigned long,_BI_ACTIVATION_STATUS *)...
EventAggregation.dll!EaiSignalAggregateEvent + 178
EventAggregation.dll!EaiSignalCallback + e7
EventAggregation.dll!EaiProcessNotification + 1a3
EventAggregation.dll!WnfEventCallback + 528
EventAggregation.dll!AggregateEventWnfCallback + 38
ntdll.dll!RtlpWnfWalkUserSubscriptionList + 249
ntdll.dll!RtlpWnfProcessCurrentDescriptor + 105
ntdll.dll!RtlpWnfNotificationThread + 80
ntdll.dll!TppExecuteWaitCallback + a4
ntdll.dll!TppWorkerThread + 456
KERNEL32.DLL!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
ntdll.dll!ZwWaitForAlertByThreadId + 14
ntdll.dll!RtlAcquireSRWLockExclusive + 165
psmserviceexthost.dll!void RmpAcquireGlobalLock(void) + 56
psmserviceexthost.dll!long RmpCpuSetManagerAudioCpuWnfCallback(_WNF_STATE_NAME,unsigned long,_WNF_TYPE_ID *,void *,void const *,unsig..
ntdll.dll!RtlpWnfWalkUserSubscriptionList + 249
ntdll.dll!RtlpWnfProcessCurrentDescriptor + 105
ntdll.dll!RtlpWnfNotificationThread + 80
ntdll.dll!TppExecuteWaitCallback + a4
ntdll.dll!TppWorkerThread + 456
KERNEL32.DLL!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
ntdll.dll!ZwWaitForAlertByThreadId + 14
ntdll.dll!RtlAcquireSRWLockShared + 153
psmsrv.dll!PsmpQueryProcessListForUser + 488
psmserviceexthost.dll!void HampHostCheckNotifyZeroProcess(_RM_HAM_ACTIVITY_HOST *) + db
psmserviceexthost.dll!void ExecutionModel::ResourceManager::OnApplicationTerminated(_RM_PSM_HOST_INFO *) + 9f
psmserviceexthost.dll!virtual void ExecutionModel::ResourceManager::HostStateNotification(unsigned short const *,_RM_SID *,unsigned l...
psmserviceexthost.dll!void ModernResourceEnforcer::ProcessWorkItem(ModernResourceEnforcer::ModernResourceEnforcerWorkItem *) + 9b
psmserviceexthost.dll!static void ModernResourceEnforcer::NotificationWorkCallback(_TP_CALLBACK_INSTANCE *,void *,_TP_WORK *) + e2
ntdll.dll!TppWorkpExecuteCallback + 130
ntdll.dll!TppWorkerThread + 68a
KERNEL32.DLL!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
It looked like all those threads were hanging on different SRW locks. But how was that possible?
Resolution - Cascade Of Deadlocks
Everything started from a thread that hung in the kernel with the WrPushLock wait state. Evidently, that thread was also holding at least one SRW lock. (It could have called BipAcquireGlobalLock before it deadlocked in the kernel; there are multiple references to bisrv.dll in the call stacks shown above.) Then all other threads that called BipAcquireGlobalLock hung as well. One of them, having already called RmpAcquireGlobalLock, kept another SRW lock held. And so on and so forth.
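The chain described above can be reproduced in miniature. The following is only a sketch under my own assumptions: std::timed_mutex stands in for the SRW locks, a std::promise stands in for the kernel wait that never completes, and the 50 ms timeouts exist only so that the demo can finish (the real threads waited forever). SimulateCascade and CascadeResult are hypothetical names.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

struct CascadeResult {
    bool secondGotLockA;  // expected false: A is held by the stuck thread
    bool thirdGotLockB;   // expected false: B is held by the second thread
};

CascadeResult SimulateCascade() {
    using namespace std::chrono_literals;
    std::timed_mutex lockA, lockB;

    std::promise<void> release;  // fulfilled only when we end the simulated kernel wait
    std::shared_future<void> released = release.get_future().share();
    std::promise<void> aHeld, bHeld;
    std::promise<bool> aAttempt;
    std::future<void> aHeldReady = aHeld.get_future();
    std::future<void> bHeldReady = bHeld.get_future();
    std::future<bool> aAttemptDone = aAttempt.get_future();

    // Thread 1: takes lock A, then blocks "in the kernel" while still holding it.
    std::thread stuck([&, released] {
        std::lock_guard<std::timed_mutex> guard(lockA);
        aHeld.set_value();
        released.wait();
    });
    aHeldReady.wait();

    // Thread 2: takes lock B, then needs lock A and stalls behind thread 1.
    std::thread second([&, released] {
        std::lock_guard<std::timed_mutex> guard(lockB);
        bHeld.set_value();
        bool got = lockA.try_lock_for(50ms);
        if (got) lockA.unlock();
        aAttempt.set_value(got);
        released.wait();  // keeps holding B, as a truly blocked thread would
    });
    bHeldReady.wait();

    // This thread: needs lock B and stalls behind thread 2.
    bool gotB = lockB.try_lock_for(50ms);
    if (gotB) lockB.unlock();
    bool gotA = aAttemptDone.get();

    release.set_value();  // end the simulated kernel wait so the demo can exit
    stuck.join();
    second.join();
    return CascadeResult{gotA, gotB};
}
```

Neither waiter ever touches the lock that the first thread is really stuck on, yet both stall: each lock that is held across a blocking wait extends the chain by one more link.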
And what happened to the LPC clients that never received a reply and hung indefinitely with the WrLpcReply wait state? I found several such threads in the explorer.exe (Windows Explorer) process:
ntdll.dll!ZwAlpcSendWaitReceivePort + 14
rpcrt4.dll!virtual long LRPC_BASE_CCALL::SendReceive(_RPC_MESSAGE *) + 12f
rpcrt4.dll!I_RpcSendReceive + 6f
rpcrt4.dll!NdrSendReceive + 36
rpcrt4.dll!NdrpClientCall3 + 5d2
rpcrt4.dll!NdrClientCall3 + f0
rmclient.dll!HamCreateActivityForProcess + c9
twinapi.appcore.dll!long COSTaskCompletionHam::BeginTaskByHandleInternal(void *,PLM_TASKCOMPLETION_CATEGORY_FLAGS,PLM_TASKCOMPLETIO...
twinapi.appcore.dll!long COSTaskCompletionHam::BeginTask(unsigned long, PLM_TASKCOMPLETION_CATEGORY_FLAGS) + b6
twinui.pcshell.dll!Microsoft::WRL::RuntimeClass<Microsoft::WRL::RuntimeClassFlags<2>,Windows::Foundation::IAsyncOperationCompletedH...
SHCore.dll!void WorkThreadManager::CThread::ThreadProc(void) + 27e
SHCore.dll!?s_ExecuteThreadProc@CThread@WorkThreadManager@@CA?AV?$unique_any_t@V?$unique_storage@U?$resource_policy@PEAUHINSTANCE__
SHCore.dll!static unsigned long <lambda_9844335fc14345151eefcc3593dd6895>::<lambda_invoker_cdecl>(void *) + 11
kernel32.dll!BaseThreadInitThunk + 14
ntdll.dll!RtlUserThreadStart + 21
Something called HamCreateActivityForProcess, but again all we can see in the call stack is the thread-pool worker, so it's hard to tell what exactly started the thread itself. Quite probably, that "something" had hung as well.
And that created a cascade of deadlocking threads throughout the entire system. It affected DcomLaunch especially badly (practically fatally): the process was "alive" but in a deep "coma".
Then every new click on the Start button only increased the number of hanging threads in the system.
What is really amazing to me is that this entire cascade started from just a deadlock in some insignificant process (our PoC in this case). Through the bug in the job object, a thread in a critical process such as DcomLaunch happened to be affected by it as well, and that set off the entire cascade of failures.
Conclusion
The moral of the story is this: always try to avoid making system calls while you're holding a critical section! If possible, always move them out of its scope. There is always a way to avoid doing it; it just requires a little bit more work.
So again, if that didn't stick from reading my post above:
It is always advisable to minimize the number of system calls inside an exclusive mutex scope, or even better, not to make them there at all!
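As a rough illustration of that advice, here is a sketch of one common way to do it: copy what you need while holding the lock, release it, and only then make the call that may block. ActivityRegistry and its methods are hypothetical names of mine, and the blockingCall callback merely stands in for a system call or an RPC.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class ActivityRegistry {
public:
    void Add(std::string name) {
        std::lock_guard<std::mutex> guard(lock_);
        names_.push_back(std::move(name));
    }

    std::size_t Size() {
        std::lock_guard<std::mutex> guard(lock_);
        return names_.size();
    }

    // Risky shape: blockingCall runs while lock_ is held. If it never returns,
    // every other caller of Add() or Size() hangs behind it.
    void NotifyAllHoldingLock(const std::function<void(const std::string&)>& blockingCall) {
        std::lock_guard<std::mutex> guard(lock_);
        for (const std::string& name : names_) {
            blockingCall(name);
        }
    }

    // Safer shape: take a snapshot under the lock, release it, then make the calls.
    void NotifyAllOutsideLock(const std::function<void(const std::string&)>& blockingCall) {
        std::vector<std::string> snapshot;
        {
            std::lock_guard<std::mutex> guard(lock_);
            snapshot = names_;
        }
        for (const std::string& name : snapshot) {
            blockingCall(name);  // lock_ is not held here
        }
    }

private:
    std::mutex lock_;
    std::vector<std::string> names_;
};
```

The price is that the snapshot can be slightly stale by the time the calls are made, which is the "little bit more work" mentioned above: the code has to tolerate that, but a stalled call can no longer freeze everyone who needs the lock.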
And that's what happened with the kernel deadlock that we reported to MSRC earlier.