This article contains functions and features that are not documented by the original manufacturer. If you follow the advice in this article, you do so at your own risk. The methods presented here may rely on internal implementation details and may not work in the future.
Intro
This post will be more of a vlog post than anything else. I've spent over a week screen-recording myself while coding a Windows driver in Visual Studio. The videos show how one can inject a test DLL into all running processes on Windows 10. I recorded myself coding it from start to finish, so it should be a somewhat comprehensive demo ... or, a snoozefest. 😁
In the process I also learned that coding complex algorithms and talking at the same time isn't easy, and what comes out isn't always what I would want to say had I had a quiet time to think about it 😂 So, I misspoke in a few places there. Thus, please be lenient with me if you watch it all.
And, if you are not into reading blog posts and want to start watching the video tutorial itself, check the playlist below. Additionally, you can just download the source code alone.
Finally, let me say that if you mess up your production OS by misapplying what I showed here, it will be entirely on you. Don't blame me later!
Credit
First and foremost, I want to show my appreciation to Rbmm for sharing his original code that my solution is based on. Please give him props at his GitHub repo. He is the original author of most of the concepts that I will outline in my long video presentation here.
Quick Overview
I am not going to delve into all the nitty-gritty details in this blog post that I covered when recording my tutorial. But just to recap, here is how the process of injection into all running processes in Windows works:
- We'll write a kernel driver to install our callback that will be invoked when a module (or DLL) is mapped into a process. We can do it using the PsSetLoadImageNotifyRoutine function.
- We know the sequence in which DLLs are loaded in Windows: first comes ntdll.dll, which loads into any user-mode process, followed by kernel32.dll, which loads into all non-native processes. Thus, if we intercept in our callback the moment when kernel32.dll is being loaded, we can inject our own DLL before it.
Just for fun, we will call the DLL that we will be injecting into all processes FAKE.DLL. And to signify its bitness, the actual file will be named FAKE64.DLL or FAKE32.DLL. It won't do much, except write into a log file the date & time and the process that it was injected into.
The way we will be injecting it puts a constraint on our FAKE.DLL: it cannot rely on imports from any DLLs except for ntdll.dll. This rules out the C-Runtime (or CRT) and most of the C++ standard libraries.
- To get past security mitigations in Windows, and to streamline the loading of our injected DLL, we will first create a KnownDll section out of our FAKE.DLL. This way we will be able to load our FAKE.DLL from user-mode without raising alarms from "Code Integrity Guard" (CIG) or from "Arbitrary Code Guard" (ACG).
Note that this does not count as a bypass of the security mitigations in Windows, since we're employing a kernel driver for our solution.
- The injection itself will be done through a series of Asynchronous Procedure Calls (APC) that will be initiated from the kernel mode. The sequence will go as such:
- We will open our FAKE.DLL and create a KnownDll section out of it in the callback to the PsSetLoadImageNotifyRoutine function. We need to keep in mind that the callback will be executing from within a critical section, and thus we can't do much from it. Thus we will only quickly queue a kernel APC, using the KeInitializeApc / KeInsertQueueApc functions.
- From our APC callbacks, we will skip the KernelRoutine routine because it will be executing at the APC_LEVEL IRQL.
- But from within the NormalRoutine routine (that will be running at the PASSIVE_LEVEL IRQL) we will map our special base-independent shell-code into the target process, and queue a user-mode APC that will invoke it.
We will write our shell-code in Assembly language, which will enable it to be base-independent, meaning that it will not require relocations and can run from any address in memory.
- The shell-code will execute two simple function calls from the address space of the target process:
C++ pseudo-code:
```cpp
// Counted string with the DLL name: Length excludes the terminating null, MaximumLength includes it
UNICODE_STRING uS = { sizeof(L"FAKE.DLL") - sizeof(WCHAR), sizeof(L"FAKE.DLL"), L"FAKE.DLL" };
HANDLE h;
LdrLoadDll(NULL, 0, &uS, &h);

//BaseAddress = base address of this module
NtUnmapViewOfSection(NtCurrentProcess(), BaseAddress);
```
- After that our FAKE.DLL will be injected into the target process, which we can verify through its DllMain function that will do some basic logging into a file for us.
This is a quick overview of the injection technique, where I omitted the peculiarities of dealing with the WOW64 processes (or 32-bit processes running on the 64-bit operating system) and other important details that I covered in detail in my video overview.
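To make the kernel32.dll check from the overview a bit more concrete, here is a minimal sketch of a case-insensitive suffix test on an image path. It is portable user-mode C++ and not the driver code from the videos: the names lowerAscii and endsWithNoCase, the sample paths, and the choice to include the leading backslash in the suffix are all assumptions made for illustration. A real driver would apply the same idea to the UNICODE_STRING that the load-image callback receives.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: lower-cases ASCII letters only, leaves everything else as-is
constexpr wchar_t lowerAscii(wchar_t c)
{
    return (c >= L'A' && c <= L'Z') ? static_cast<wchar_t>(c - L'A' + L'a') : c;
}

// Hypothetical helper (not the driver's own function):
// returns true if 'path' ends with 'suffix', ignoring ASCII case
constexpr bool endsWithNoCase(std::wstring_view path, std::wstring_view suffix)
{
    if (suffix.size() > path.size())
        return false;

    // Compare only the tail of the path that has the same length as the suffix
    const std::size_t offset = path.size() - suffix.size();
    for (std::size_t i = 0; i < suffix.size(); ++i)
    {
        if (lowerAscii(path[offset + i]) != lowerAscii(suffix[i]))
            return false;
    }
    return true;
}

// Sample paths are made up for this sketch
static_assert(endsWithNoCase(L"\\Device\\HarddiskVolume2\\Windows\\System32\\KERNEL32.DLL",
                             L"\\kernel32.dll"), "kernel32.dll should match in any case");
static_assert(!endsWithNoCase(L"\\Device\\HarddiskVolume2\\Windows\\System32\\ntdll.dll",
                              L"\\kernel32.dll"), "ntdll.dll should not match");
```

Keeping the backslash in the suffix is one way to avoid matching a file such as mykernel32.dll. Note that a name check alone does not tell you which directory the file came from.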
The video tutorial also covers the aspects of testing the driver in a VM, and creating a separate test C++ project to debug the injected FAKE.DLL.
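One detail of the shell-code pseudo-code in the overview that is easy to miss is how the two size fields of the counted string are derived: sizeof of a wide string literal counts the terminating null, so the length in bytes is that size minus one character. The sketch below reproduces that arithmetic in portable C++. CountedString and makeCounted are hypothetical stand-ins and not the Windows declarations, and char16_t is used so that each character takes 2 bytes, as WCHAR does on Windows.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in that mirrors the three fields of UNICODE_STRING
struct CountedString
{
    std::uint16_t Length;         // bytes in use, excluding the terminating null
    std::uint16_t MaximumLength;  // bytes available, including the terminating null
    const char16_t* Buffer;       // pointer to the characters
};

// Hypothetical helper: builds a counted string from a literal.
// N is the array size of the literal, so it already counts the terminating null.
template <std::size_t N>
constexpr CountedString makeCounted(const char16_t (&literal)[N])
{
    return CountedString{
        static_cast<std::uint16_t>((N - 1) * sizeof(char16_t)),
        static_cast<std::uint16_t>(N * sizeof(char16_t)),
        literal
    };
}

constexpr CountedString kFakeDll = makeCounted(u"FAKE.DLL");
static_assert(kFakeDll.Length == 16, "8 characters * 2 bytes each");
static_assert(kFakeDll.MaximumLength == 18, "plus 2 bytes for the terminating null");
```

Both fields are byte counts and not character counts, which is why the pseudo-code subtracts sizeof(WCHAR) and not 1.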
Video Playlist
Note that the following is a playlist of multiple consecutive videos where I will show you the coding process from start to finish. I would recommend watching them in sequence and playing them full-screen to make sure that you can see the code:
Video Timecodes
Or, the following are time-coded segments of the tutorial that will open in a YouTube player:
- Installing & Setting Up Tools, Basic Concepts:
- 1:31 - Setting up virtual machines to run driver tests in.
- 4:22 - Setting up Visual Studio components needed to code our project.
- 7:00 - Setting up tools in a VM:
- 7:44 - Process Hacker - to view running processes & modules.
- 9:36 - DebugView - to view debugging output from our driver.
- 11:16 - WinObj - to view kernel space objects.
- 11:55 - PEInternals - to statically view PE files.
- 13:11 - WERSetup - to set up Windows Error Reporting to catch user-mode process crashes.
- 15:19 - WinAPI Search - to check Imports/Exports from PE files and to search for error codes.
- 16:53 - Driver Loader/Unloader - to register, start, stop and unregister our driver.
- 17:37 - Putting the Operating System in a VM into a test signing mode to be able to run our driver.
- 19:52 - Creating a snapshot in the VM in case we mess up the operating system during our driver testing.
- 21:20 - Quick overview of: physical/virtual memory, and of DLLs/modules/"sections" in the kernel space.
- 30:34 - Overview of DLL injection with the PsSetLoadImageNotifyRoutine function.
- 31:13 - Basic overview how we can inject our DLL into every process.
- Starting Windows Driver C++ Project:
- 0:29 - Credit to Rbmm.
- 1:01 - Recap of how we'll be injecting our FAKE.DLL into all processes: ntdll.dll, kernel32.dll, no CRT, use CFG, kernel APC.
- 9:38 - Starting to code: Creating solution, named "InjectAll".
- 11:03 - Starting WDM Windows driver project, named "Drv".
- 12:26 - Adding DrvMain.cpp.
- 13:41 - Adding DrvTypes.h.
- 15:55 - Adding SharedDefs.h.
- 17:14 - Adding CFunc class.
- 19:38 - Adding DriverEntry function.
- 21:12 - Installing the correct Windows SDK & WDK.
- 24:04 - Installing (fighting with) Spectre-mitigated libraries for Visual Studio.
- 26:25 - Solution to missing Spectre-mitigated libraries.
- 28:49 - Fixing initial issues with building a driver solution.
- 31:25 - (Erroneously) Removing test signing from building a driver.
- 34:01 - Coding DbgPrintLine macro.
- 38:11 - Coding DriverUnload routine.
- 39:59 - Testing our first build of the driver.
- 43:15 - Adding test signing back for building a driver in Visual Studio.
- 45:02 - Was able to start and stop our first build of the driver!
- Beginning to Code Windows Driver:
- 0:55 - Coding basic driver entry objects.
- 2:43 - Setting up PsSetLoadImageNotifyRoutine callback.
- 8:10 - Setting up OnLoadImage callback.
- 11:15 - Coding FreeResources() function.
- 15:30 - Coding the statement to catch kernel32.dll being loaded.
- 19:50 - Coding CFunc::IsSuffixedUnicodeString() function.
- 25:41 - Defining STATIC_UNICODE_STRING macro.
- 30:01 - Coding CFunc::IsMappedByLdrLoadDll() function.
- 40:03 - Coding CFunc::IsSpecificProcessW() function.
- 1:10:45 - Determining if we got a WOW64 process, IoIs32bitProcess.
- 1:12:57 - Running another driver test of what we built so far.
- Coding Windows Driver: Creating Section:
- 0:39 - Quick review of what we've done so far.
- 3:09 - Setting up CSection class.
- 4:37 - Setting up DLL_STATS struct.
- 6:07 - Declaring SECTION_TYPE enum.
- 10:25 - Coding CSection::Initialize() function.
- 12:04 - Coding CSection::GetSection() singleton function using RtlRunOnceBeginInitialize / RtlRunOnceComplete functions.
- 32:03 - Explanation of Code Integrity Guard (CIG) and how it may affect our DLL injection.
- 35:26 - Lowdown on KnownDlls.
- 37:48 - Using PsInitialSystemProcess to attach to system process.
- 45:15 - Defining the debugging TAG macro for kernel functions.
- 47:39 - Continuing to code CSection::GetSection() function.
- Coding Windows Driver: Creating Section - KnownDlls:
- 1:24 - Fixing previous bug in the CSection::GetSection() function.
- 3:44 - Coding CSection::FreeSection() function.
- 9:49 - Adding DBG_VERBOSE_DRV preprocessor directive for verbose debugging output.
- 13:51 - Adding code to call CSection::FreeSection() function.
- 17:10 - Starting to code CSection::CreateKnownDllSection() function.
- 20:27 - Setting up to "steal" security descriptor from the existing KnownDll - kernel32.dll.
- 21:22 - Opening existing kernel32.dll section.
- 30:58 - Testing current build of the driver.
- 34:14 - Adding code to call CSection::GetSection() function.
- 39:17 - Testing again the current build of the driver.
- 41:21 - Going back to coding CSection::CreateKnownDllSection() function.
- 42:20 - Retrieving security descriptor from kernel32.dll section with ZwQuerySecurityObject.
- 47:22 - Description of the OBJ_PERMANENT section object.
- 49:48 - Differentiation of our Fake.dll section names for KnownDlls.
- 57:22 - Allocating memory for the security descriptor from the kernel32.dll section.
- Coding Injected FAKE.DLL:
- 1:18 - Adding new C++ project - FAKE.dll.
- 3:03 - Review of restrictions of injection of our DLL into a process: ntdll.dll, kernel32.dll.
- 9:11 - Adding new DllTypes.h file.
- 12:15 - Removing C-Run-Time (CRT) from our FAKE.dll for the 64-bit build.
- 15:54 - Adding Exports.def file.
- 16:41 - Adding loadcfg.c file to enable Control Flow Guard (CFG) for our FAKE.dll.
- 19:54 - Adding loadcfg64.asm file and x64 Assembly into it for CFG.
- 25:29 - Removing C-Run-Time (CRT) from our FAKE.dll for the 32-bit build.
- 28:48 - Coding loadcfg32.asm file with x86 Assembly into it for CFG.
- 36:13 - Adding LogToFile() function using native functions from ntdll.dll.
- 51:46 - Adding LogToFileFmt() function.
- 59:39 - Adding code in DllMain() to run when our DLL is injected into a process.
- Coding Injected FAKE.DLL - TestConsole Project:
- 1:02 - Creating TestConsole project.
- 1:45 - Writing test code to call DllMain in our FAKE.DLL.
- 4:36 - Ways to debug a DLL using TestConsole project.
- 11:52 - Adding code to get pointer to TEB in DllMain.
- 13:33 - Coding Get_TEB() function.
- 17:30 - Coding Get_PEB() function.
- 18:36 - Adding code to our DllMain for debugging output: process ID, process image path, current time with ntdll.dll only.
- 28:33 - Testing our FAKE.DLL in a TestConsole with debugging output.
- 30:57 - Explanation why we need to adjust security descriptor for the InjectAll folder for access from any process.
- 32:37 - Adding SetDS_InjectAllFolder() debugging function.
- 43:28 - Running our TestConsole with the SetDS_InjectAllFolder() function to adjust security descriptor on the InjectAll folder.
- Coding Windows Driver: Creating Section - KnownDlls (continued):
- 0:36 - Continuing to code CSection::CreateKnownDllSection() function.
- 3:16 - Opening our FAKE.DLL file using ZwOpenFile.
- 13:09 - Creating a section from our FAKE.DLL using ZwCreateSection.
- 17:57 - Filling in our DLL_STATS with created section info.
- 18:22 - Getting our section object pointer with ObReferenceObjectByHandleWithTag.
- 24:49 - Adjusting CSection::FreeSection() function to remove our section.
- 27:28 - Adjusting CSection::CreateKnownDllSection() function to close permanent section correctly in case of an error.
- 30:46 - Testing current build of the driver and two bitnesses of FAKE.DLL in a test VM.
- 34:36 - Dealing with the error 0xC0000035 during testing.
- 37:09 - Fixing a bug with missing CSection::Initialize() function call.
- 48:01 - Adjusting sectionType debugging output to be more readable after a change by doing some refactoring.
- 51:06 - Checking that security descriptor is set up correctly on the InjectAll folder.
- Coding Windows Driver: DLL Injection via Kernel APC:
- 0:52 - Adding version resource to our FAKE.DLL.
- 2:41 - Explanation why we need to use Asynchronous Procedure Calls (APC) from our driver callback.
- 7:00 - Quick lowdown on kernel APC KernelRoutine, NormalRoutine, RundownRoutine.
- 10:44 - Adding CSection::InjectDLL() function.
- 14:55 - Quick lowdown on why we need to allocate from NonPagedPool when queuing KAPC.
- 18:00 - Coding of queuing of the kernel APC with KeInitializeApc.
- 23:38 - Using reference count on our driver object and the section object to prevent problems when queuing APC.
- 27:42 - Inserting kernel APC with KeInsertQueueApc.
- 33:29 - Explanation of how to dereference driver object from APC routines correctly. Why I'm coding it using JMP instruction from Assembly language.
- 41:21 - Adding asm64.asm and asm32.asm files for APC callback stubs.
- 43:21 - Coding RundownRoutine APC callback stub in x64 Assembly.
- 44:44 - Coding RundownRoutine_Proc() callback procedure in C++.
- 51:58 - Lowdown on the use of the __imp_ prefix on imported function calls from the Assembly code.
- 58:00 - Coding KernelRoutine APC callback stub in x64 Assembly.
- 1:01:11 - Coding KernelRoutine_Proc() callback procedure in C++.
- 1:13:06 - Explanation of forwarding function call parameters on the stack inside KernelRoutine function written in x64 Assembly.
- 1:18:04 - Coding NormalRoutine APC callback stub in x64 Assembly.
- 1:19:17 - Coding NormalRoutine_Proc() callback procedure in C++.
- Coding Windows Driver: DLL Injection via Kernel APC (continued):
- 0:28 - Recap of what we've coded in x64 Assembly so far.
- 3:16 - Starting to code asm32.asm x86 Assembly file.
- 4:00 - Coding RundownRoutine APC callback stub in x86 Assembly.
- 7:24 - Explanation of forwarding function call parameters on the stack inside RundownRoutine function written in x86 Assembly.
- 16:05 - Coding KernelRoutine APC callback stub in x86 Assembly.
- 18:31 - Explanation of forwarding function call parameters on the stack inside KernelRoutine function written in x86 Assembly.
- 22:52 - Coding NormalRoutine APC callback stub in x86 Assembly.
- Coding Windows Driver: DLL Injection - ShellCode x64:
- 1:22 - Reasons for using APC to code DLL injection from our OnLoadImage kernel callback.
- 8:05 - Coding RundownRoutine_Proc() callback.
- 11:59 - Coding KernelRoutine_Proc() callback.
- 14:50 - Coding NormalRoutine_Proc() callback.
- 19:21 - Explanation of two types of code that we will put into our FAKE.DLL: Shell-code and DllMain.
- 22:50 - Adding dll_asm64.asm file with the base-independent x64 Assembly shell-code to the FAKE.DLL project.
- 24:33 - Coding UserModeNormalRoutine function shell-code in base-independent x64 Assembly.
- 29:57 - Explanation why we can't use imports from external DLLs to call system functions in our base-independent shell-code.
- 31:45 - Coding getProcAddrForMod function to resolve exported function address from a module in base-independent x64 Assembly.
- 1:01:49 - Finishing to code UserModeNormalRoutine function in base-independent x64 Assembly.
- Coding Windows Driver: DLL Injection - ShellCode x86:
- 1:07 - Adding dll_asm32.asm file with the base-independent x86 Assembly shell-code to the FAKE.DLL project.
- 2:04 - Recap of UserModeNormalRoutine function from x64 Assembly code.
- 4:31 - Coding getProcAddrForMod function to resolve exported function address from a module in base-independent x86 Assembly.
- 25:55 - Coding UserModeNormalRoutine function in base-independent x86 Assembly.
- 30:58 - Coding getStr_LdrLoadDll() function to obtain pointer to a base-independent static string.
- 47:59 - Coding getStr_NtUnmapViewOfSection() function to obtain pointer to a base-independent static string.
- 59:54 - Setting up UserModeNormalRoutine function to be exported as the ordinal 1 in Exports.def.
- 1:02:33 - Explanation how to mark UserModeNormalRoutine function to bypass Export Suppression from CFG.
- 1:05:00 - Coding exported stub function f1() to include CFG conformance for the UserModeNormalRoutine function.
- Coding Windows Driver: DLL Injection - Finishing up:
- 1:13 - Adding SEARCH_TAG_W struct to keep static signature in our fake.dll.
- 7:00 - Modifying our dummy exported function f1() to include static signature in SEARCH_TAG_W struct.
- 13:36 - Coding CFunc::FindStringByTag() function.
- 20:29 - Adjusting CSection::CreateKnownDllSection() function to retrieve info from our FAKE.DLL section: ZwMapViewOfSection, resolving ordinal 1 for UserModeNormalRoutine, calling CFunc::FindStringByTag and ZwQuerySection.
- 43:06 - Adding new members into DLL_STATS with additional info about our section.
- Coding Windows Driver: Mapping Shell-Code & FAKE.DLL:
- 1:21 - Review of DLL_STATS struct members.
- 2:22 - Diagram of mapping FAKE.DLL into a process: shell-code and DllMain functions, PreferredAddress when mapping.
- 16:07 - Creating CSection::MapSectionForShellCode() function that maps our shell-code.
- 37:05 - Writing code to map section for shell-code in NormalRoutine_Proc() callback.
- 42:52 - Coding CFunc::debugGetCurrentProcName() to get current process image name.
- Coding Windows Driver: Invoking Shell-Code & Loading FAKE.DLL:
- 0:40 - Recap of how our Shell-code will run from the UserModeNormalRoutine() function.
- 5:24 - Diagram with explanation of invoking kernel APCs to run our Shell-code in user-mode.
- 14:15 - Finishing up writing kernel APC callbacks: KernelRoutine_Proc(), NormalRoutine_Proc().
- 37:19 - Adding code to inject DLL into OnLoadImage() callback via our CSection::InjectDLL() function.
- 40:32 - Building and testing our injection project with the notepad.exe process only.
- 50:17 - Example of dealing with a crash in a user-mode process (notepad.exe), collecting crash dumps with WERSetup.
- 52:40 - Adjusting NormalRoutine_Proc() to handle injection into WOW64 processes with PsWrapApcWow64Thread.
- 56:23 - Testing injection into WOW64 notepad.exe process.
- Final Testing:
- Testing Driver On Windows 7, Crash Dump Analysis, Bug Fixes:
- 1:25 - Fixing a small bug.
- 3:24 - Overview of how I used PE Internals tool.
- 5:55 - Testing our driver on Windows 7 Pro, 64-bit OS.
- 10:28 - Dealing with the Blue Screen Of Death (BSOD), or BugCheck on Windows 7.
- 14:26 - Opening a crash dump file memory.dmp in WinDbg to analyze OS crash: run !analyze -v.
- 20:17 - Fixing the issue with the crash to make our driver backward compatible with Windows 7.
- 21:32 - Testing updated driver on Windows 7 to inject our FAKE.DLL into all running processes.
- 28:15 - Conclusion.
Downloads
If you are interested in the source code for what I've been coding in the tutorial above:
- You can download the source code here as the Visual Studio 2019 solution.