Trusted Platform Module in Windows - Part 1 - How to use TPM to encrypt and decrypt small blobs of data programmatically.

Intro

TPM, or Trusted Platform Module, has been a part of our computers for a while now. In a nutshell, a TPM is a small dedicated microchip (on the motherboard, or purchased separately) that performs dedicated cryptographic functions. This blog post will not be a review of the TPM features, though. Instead, we'll focus on just one thing that can be useful for Windows developers.

Rbmm did the bulk of the initial research on how to encrypt (and decrypt) small chunks of data programmatically using the TPM and Windows APIs. In this article I will expand on his findings with some additional research of my own.

Why TPM?

I was skeptical at first when I saw Rbmm's findings on how to encrypt and decrypt blobs of bytes using the TPM. To me it looked very similar to what could be done with a basic hashing algorithm followed by symmetric encryption, such as HMAC & AES. After all, the TPM encryption function accepts an arbitrary array of bytes and an optional PIN, and gives back an encrypted blob.

So why do we need a TPM then?

Upon further inspection (and reverse-engineering) of the implementation of the TPM encryption in Windows I was able to clearly see the reasons why it was beneficial to use TPM. To name just a few that come to mind:

TPM encryption and decryption are performed by the TPM hardware, which is much more secure than software-implemented crypto.
TPM crypto is very slow (compared to CPU clock speeds, of course). This could be an intentional feature to hinder brute-force attacks.
TPM functions can be blocked, or locked out, after repeated misuse.
TPM works with very small amounts of data. The thinking here is that you will use it to protect the encryption key itself instead of your entire data, and then encrypt the data with that key using another form of encryption.
TPM protects encrypted data from being copied to and decrypted on another computer, even if the encryption PIN or password is known.

Let's review some of these points with the following code sample.

TPM Encryption & Decryption In Code

I'll take the liberty of rewriting Rbmm's original function that performs TPM crypto. This is a higher-level way to encrypt or decrypt a small blob of data using the TPM:

C++

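#include <assert.h>
#include <limits.h>      //INT_MAX
#include <stdint.h>      //uint8_t
#include <vector>

#include <windows.h>     //Base Windows types, include before ncrypt.h and bcrypt.h

#include <ncrypt.h>
#pragma comment(lib, "Ncrypt.lib")

#include <bcrypt.h>
#pragma comment(lib, "Bcrypt.lib")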


/// <summary>
/// Encrypt or decrypt a small blob of bytes using the built-in TPM.
/// </summary>
/// <param name="bEncrypt">true - to encrypt, false - to decrypt</param>
/// <param name="pInput">Pointer to the byte array to encrypt or decrypt</param>
/// <param name="ncbSzInput">Size of the 'pInput' array in bytes. Maximum allowed size is 128 bytes.</param>
/// <param name="pPIN">Pointer to an optional PIN to use for encryption or decryption. Can be any arbitrary byte array</param>
/// <param name="ncbSzPIN">Size of 'pPIN' in bytes, or 0 not to use PIN. Maximum allowed size is 1024 bytes for TPM 2.0.</param>
/// <param name="result">Receives the resulting byte array if success</param>
/// <returns>ERROR_SUCCESS if operation succeeds, otherwise an error status code.</returns>
NTSTATUS TPM_encrypt_decrypt(
	__in bool bEncrypt,
	__in const void* pInput,
	__in size_t ncbSzInput,
	__in_opt const void* pPIN,
	__in size_t ncbSzPIN,
	__out std::vector<uint8_t>& result
	)
{
	assert(pInput);

	NTSTATUS status;

	//Check sizes for the ULONG overflow that is used internally by MSFT
	if(ncbSzInput <= INT_MAX &&
		ncbSzPIN <= INT_MAX)
	{
		//Open a handle to the CNG key storage provider
		NCRYPT_PROV_HANDLE hProv;
		status = NCryptOpenStorageProvider(&hProv,
			MS_PLATFORM_KEY_STORAGE_PROVIDER,               //"Microsoft Platform Crypto Provider"
			0);

		if(status == ERROR_SUCCESS)
		{
			//Open TPM provider handle
			NCRYPT_KEY_HANDLE hNcKey;
			status = NCryptOpenKey(hProv, 
				&hNcKey, 
				TPM_RSA_SRK_SEAL_KEY,       //"MICROSOFT_PCP_KSP_RSA_SEAL_KEY_3BD1C4BF-004E-4E2F-8A4D-0BF633DCB074"
				0,
				NCRYPT_SILENT_FLAG);

			//Close provider handle
			NCryptFreeObject(hProv);

			if(status == ERROR_SUCCESS)
			{
				BCryptBuffer bc_buff;
				BCryptBufferDesc bc_desc;

				//Do we have a PIN?
				if(ncbSzPIN)
				{
					//Fill out additional data for it
					assert(pPIN);

					bc_buff.cbBuffer = (ULONG)ncbSzPIN;
					bc_buff.BufferType = NCRYPTBUFFER_TPM_SEAL_PASSWORD;
					bc_buff.pvBuffer = (PVOID)pPIN;

					bc_desc.ulVersion = BCRYPTBUFFER_VERSION;
					bc_desc.cBuffers = 1;
					bc_desc.pBuffers = &bc_buff;
				}


				PBYTE pb = nullptr;
				ULONG cb = 0;

				while(true)
				{
					//Encrypt/decrypt ... but first get the needed buffer size
					status = (bEncrypt ? NCryptEncrypt : NCryptDecrypt)(
						hNcKey,
						(PBYTE)pInput,
						(DWORD)ncbSzInput,
						ncbSzPIN ? &bc_desc : nullptr,
						pb,
						cb,
						&cb,
						NCRYPT_SEALING_FLAG);

					if(status != ERROR_SUCCESS)
					{
						//Failed
						break;
					}

					//Did we allocate the buffer already?
					if(pb)
					{
						if(result.size() >= cb)
						{
							//Did we get less data?
							if(result.size() > cb)
							{
								result.resize(cb);
							}

							break;
						}
						else
						{
							//Why is the API giving us different sizes?
							assert(false);
						}
					}

					//Allocate needed buffer
					assert(cb > 0);
					result.resize(cb);
					pb = result.data();
				}

				//Close TPM key
				NCryptFreeObject(hNcKey);
			}
		}
	}
	else
	{
		//Size overflow
		status = STATUS_INTEGER_OVERFLOW;
	}

	//Clear the result if we failed
	if(status != ERROR_SUCCESS)
	{
		result.clear();
	}

	return status;
}

And then, if you want to use it, you can do something similar to this:

C++

#include <iostream>

//Key to encrypt
#define INPUT_2_ENCRYPT "{235CFBED-1089-4F76-8BB3-419C9BB4590F}"

//PIN to use
#define TPM_PIN "7331"  //Can be set to "" for no PIN

std::vector<uint8_t> enc_data;

//Encrypt our input first
NTSTATUS status = TPM_encrypt_decrypt(true, 
	INPUT_2_ENCRYPT, 
	sizeof(INPUT_2_ENCRYPT) - 1, 
	TPM_PIN, 
	sizeof(TPM_PIN) - 1, 
	enc_data);

if(status == ERROR_SUCCESS)
{
	std::vector<uint8_t> dec_data;

	//Then decrypt it
	status = TPM_encrypt_decrypt(false, 
		enc_data.data(), 
		enc_data.size(), 
		TPM_PIN, 
		sizeof(TPM_PIN) - 1,
		dec_data);

	if(status == ERROR_SUCCESS)
	{
		//All good!
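		//'dec_data' is not null-terminated, so write it out with an explicit length
		std::cout << "Result: ";
		std::cout.write((const char*)dec_data.data(), dec_data.size());
		std::cout << std::endl;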
	}
	else
	{
		//Failed
		std::cout << "ERROR: (0x" << std::hex << status << ") TPM decryption" << std::endl;
	}
}
else
{
	//Failed
	std::cout << "ERROR: (0x" << std::hex << status << ") TPM encryption" << std::endl;
}

These two function calls would normally be used separately: the encryption logic runs before the encrypted data is serialized to persistent storage, and the decryption logic converts the deserialized data back into plaintext. The PIN should preferably come from the user. It should not be stored anywhere other than temporarily in RAM.
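To illustrate keeping the PIN in RAM only for as long as it is needed, here is a minimal sketch of a holder that zeroes its copy of the PIN when it goes out of scope. The names WipeBytes and ScopedPin are hypothetical and are not part of any Windows API. On Windows, SecureZeroMemory is the documented way to do the wipe itself. The volatile loop below is a portable stand-in.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

//Hypothetical helper: overwrite a buffer with zeros through a volatile
//pointer, so that the compiler is less likely to optimize the writes away.
inline void WipeBytes(void* p, size_t cb)
{
	volatile uint8_t* pb = static_cast<volatile uint8_t*>(p);
	for(size_t i = 0; i < cb; i++)
	{
		pb[i] = 0;
	}
}

//Hypothetical RAII holder: keeps a private copy of the PIN bytes and
//wipes that copy when the object is destroyed, or when Wipe() is called.
//The caller remains responsible for wiping the buffer the PIN came from.
class ScopedPin
{
public:
	ScopedPin(const void* pPIN, size_t ncbSzPIN)
		: m_bytes(static_cast<const uint8_t*>(pPIN),
			static_cast<const uint8_t*>(pPIN) + ncbSzPIN)
	{
	}

	~ScopedPin()
	{
		Wipe();
	}

	//Non-copyable, so that no extra copies of the PIN are made by accident
	ScopedPin(const ScopedPin&) = delete;
	ScopedPin& operator=(const ScopedPin&) = delete;

	const uint8_t* data() const { return m_bytes.data(); }
	size_t size() const { return m_bytes.size(); }

	//Zero the stored bytes and then release them
	void Wipe()
	{
		if(!m_bytes.empty())
		{
			WipeBytes(m_bytes.data(), m_bytes.size());
			m_bytes.clear();
		}
	}

private:
	std::vector<uint8_t> m_bytes;
};
```

With something like this, pin.data() and pin.size() can be passed as the pPIN and ncbSzPIN parameters of TPM_encrypt_decrypt, and the copy disappears as soon as the enclosing scope ends.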

Specifics of TPM Crypto

There are a few things to point out about the TPM encryption and decryption:

Both functions, i.e. NCryptEncrypt and NCryptDecrypt with MS_PLATFORM_KEY_STORAGE_PROVIDER and TPM_RSA_SRK_SEAL_KEY, are really slow. So don't call them in a performance-critical part of your code.
The size of the data chunk, or the pInput parameter of my TPM_encrypt_decrypt function, is limited to 128 bytes (inclusive). Thus, ideally you will use it only to encrypt (and decrypt) a symmetric encryption key, which can later be used to perform the needed crypto on a larger set of data.
The PIN for my TPM_encrypt_decrypt function can be omitted by setting the ncbSzPIN parameter to 0. If you decide to use a PIN, it can be any sequence of bytes. (In the TPM documentation, Microsoft calls it a "password".) Additionally, the size of the PIN is limited to 1024 bytes for TPM v.2.0 and doesn't seem to have that limit for TPM v.1.2.
If either the input data or the PIN is too long for the TPM crypto functions, they will return the error code 0x80090027, or NTE_INVALID_PARAMETER.
Interestingly enough, if you search by the error code 0x80090027, the results that come up indicate that some Windows users that enabled PIN logins receive that error. (Ref 1, ref 2.)
Thus if you're one of the people who found this blog post using the Google search by that error code, I would suggest following steps outlined here to reset your TPM.
If you provide an incorrect PIN for the decryption function, the NCryptDecrypt API will return the error code 0x80090010, or NTE_PERM:
Access denied.

You need to be careful about how many wrong PINs you feed into the TPM module.

Note that if you call the TPM decryption function 5 times in a row with an incorrect PIN, the call to NCryptDecrypt will return 0x80280013, or TPM_E_NOTSEALED_BLOB:
TPM 1.2: An encrypted blob is invalid or was not created by this TPM.

After that the TPM will be locked out and any attempt to call NCryptEncrypt or NCryptDecrypt again will return the error code 0x80280921, or TPM_20_E_LOCKOUT:

TPM 2.0: Authorization for objects subject to DA protection are not allowed at this time because the TPM is in DA lockout mode.

To unlock a locked TPM one may need to reboot the computer. Read here for more details about TPM lockouts.
If the hardware does not have a TPM module, the NCryptOpenStorageProvider API will return the error code 0x80090030, or NTE_DEVICE_NOT_READY:
The device that is required by this cryptographic provider is not ready for use.
The input data blob that was encrypted with a TPM on one computer will be computationally infeasible to decrypt on another computer, even if the PIN for decryption is known.
TPM versions prior to v.1.2 do not support encrypting (or decrypting) data, and the call will return a hardcoded error code 0x80004001, or E_NOTIMPL.
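The error codes listed above can be folded into a small classifier, so that the calling code can decide what to tell the user. This is only a sketch: the enum, the function names and the suggested actions are my own, and the mapping reflects the behavior described in this post rather than any documented contract. Codes such as NTE_INVALID_PARAMETER and NTE_PERM may also be returned for other reasons.

```cpp
#include <cstdint>

//Status codes quoted in this post
constexpr uint32_t kNteInvalidParameter = 0x80090027;   //NTE_INVALID_PARAMETER
constexpr uint32_t kNtePerm             = 0x80090010;   //NTE_PERM
constexpr uint32_t kTpmNotSealedBlob    = 0x80280013;   //TPM_E_NOTSEALED_BLOB
constexpr uint32_t kTpm20Lockout        = 0x80280921;   //TPM_20_E_LOCKOUT
constexpr uint32_t kNteDeviceNotReady   = 0x80090030;   //NTE_DEVICE_NOT_READY
constexpr uint32_t kENotImpl            = 0x80004001;   //E_NOTIMPL

//Hypothetical classification of what went wrong
enum class TpmOutcome
{
	Success,
	BadSize,            //Input or PIN too long
	WrongPin,           //Incorrect PIN for decryption
	LastAttemptFailed,  //Seen on the 5th wrong PIN in a row, lockout follows
	LockedOut,          //TPM is in lockout mode
	NoTpm,              //No TPM hardware
	NotSupported,       //TPM too old for this operation
	Other
};

inline TpmOutcome ClassifyTpmStatus(uint32_t status)
{
	switch(status)
	{
	case 0:                    return TpmOutcome::Success;
	case kNteInvalidParameter: return TpmOutcome::BadSize;
	case kNtePerm:             return TpmOutcome::WrongPin;
	case kTpmNotSealedBlob:    return TpmOutcome::LastAttemptFailed;
	case kTpm20Lockout:        return TpmOutcome::LockedOut;
	case kNteDeviceNotReady:   return TpmOutcome::NoTpm;
	case kENotImpl:            return TpmOutcome::NotSupported;
	default:                   return TpmOutcome::Other;
	}
}

//Hypothetical hint for the caller, not an official recommendation
inline const char* SuggestedAction(TpmOutcome outcome)
{
	switch(outcome)
	{
	case TpmOutcome::Success:           return "none";
	case TpmOutcome::BadSize:           return "check input and PIN sizes";
	case TpmOutcome::WrongPin:          return "ask for the PIN again";
	case TpmOutcome::LastAttemptFailed: return "stop retrying";
	case TpmOutcome::LockedOut:         return "stop and ask the user to restart";
	case TpmOutcome::NoTpm:             return "fall back to another method";
	case TpmOutcome::NotSupported:      return "fall back to another method";
	case TpmOutcome::Other:             return "report the raw status code";
	}
	return "";
}
```

A caller would pass in the value returned by TPM_encrypt_decrypt, cast to uint32_t.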

Uses For The TPM Crypto Functions

One of the primary ways Microsoft uses TPM cryptography is to facilitate encryption and decryption for Windows user logins that are done with a PIN:

Windows user login with a PIN.

Since the PIN is not stored anywhere on the computer, a user becomes a perfect source for it. It acts as a knowledge factor in the login process.

Thus, a user-provided PIN during the login phase from the credential provider is fed into the TPM decryption function to receive the necessary login credentials for the user. In this case, the TPM hardware provides the added protection, such as locking out the TPM in case of multiple incorrect attempts to enter the PIN:

You've entered an incorrect PIN too many times. To try again, restart your device.

So ideally, the PIN for the TPM encryption or decryption functions should be provided by the user and can be a very simple sequence of numbers (or letters) that a human can remember. The strength of such a PIN is ensured by the TPM hardware that will block repeated attempts to brute-force it.
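Since a run of wrong PINs ends in a hardware lockout (five in a row in my tests above), an app may want to keep its own tally and stop asking before that point. Below is a minimal sketch of such a guard. The class name and the idea of an app-side limit are my own assumptions. The tally is only a courtesy to the user: it knows nothing about the TPM's internal counter, which other software on the same machine may also be affecting.

```cpp
//Hypothetical app-side counter of consecutive wrong PINs
class PinAttemptGuard
{
public:
	//'maxWrong' should be lower than the number of wrong PINs that
	//triggers the TPM lockout on the target hardware.
	explicit PinAttemptGuard(unsigned maxWrong)
		: m_maxWrong(maxWrong)
		, m_wrong(0)
	{
	}

	//true if the app may still prompt for a PIN
	bool CanTry() const
	{
		return m_wrong < m_maxWrong;
	}

	//Call when decryption failed because of an incorrect PIN
	void RecordWrongPin()
	{
		if(m_wrong < m_maxWrong)
		{
			m_wrong++;
		}
	}

	//Call when decryption succeeded. This resets the app's own tally only,
	//it makes no claim about the counter inside the TPM.
	void RecordSuccess()
	{
		m_wrong = 0;
	}

	//Number of prompts left before the app stops asking
	unsigned Remaining() const
	{
		return m_maxWrong - m_wrong;
	}

private:
	unsigned m_maxWrong;
	unsigned m_wrong;
};
```

For example, a guard created with a limit of 3 would let the app refuse a fourth prompt and leave a margin before the TPM itself locks out.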

On the flip side, do not use a PIN that can be stored unencrypted anywhere in persistent storage. In that case it is better not to use any PIN at all, if your requirement is to employ TPM encryption.

TPM Implementation Internals

And to wrap up this topic, let me share a few internals of the TPM implementation in Windows.

The NCryptEncrypt and NCryptDecrypt functions that I demoed above are a higher-level method of using TPM encryption. They route the execution into the PCPKsp.dll module, if you specify the TPM_RSA_SRK_SEAL_KEY name (or "MICROSOFT_PCP_KSP_RSA_SEAL_KEY_3BD1C4BF-004E-4E2F-8A4D-0BF633DCB074") in a call to NCryptOpenKey.

PCPKsp.dll is a module that has the following description: "Microsoft Platform Key Storage Provider for Platform Crypto Provider".
It's a relatively new key storage provider in Windows.

The code flow inside the NCryptEncrypt function that invokes TPM encryption goes as follows:

PCPKsp!PCPKspEncrypt function is invoked first.
It then calls into PCPStorageProvider::Seal(uchar *,ulong,uchar *,ulong,uchar *,ulong,ulong *,ulong)
That redirects into ProviderSeal(void *,uchar *,ulong,uchar *,ulong,uchar *,ulong,ulong *,ulong)

Which calls into either one of the following functions (depending on the model of the TPM hardware available):

TpmProvider20::Seal

virtual HRESULT TpmProvider20::Seal(
	unsigned char* pInput,        //Input bytes to encrypt
	unsigned int ncbSzInput,      //Length of 'pInput' in bytes    - max is 0x80 (inclusive)
	unsigned char* pPin,          //PIN, or NULL if none
	unsigned int ncbSzPin,        //Length of the 'pPin' in bytes  - max is 0x400 (inclusive)
	unsigned char* pOutput,       //Optional encryption output buffer
	unsigned int ncbSzOutput,     //Size of 'pOutput' in bytes
	unsigned int* ncbSzUsed,      //Address of a variable that receives the number of bytes used in 'pOutput'
	unsigned int                  //Unknown - some flags?
	);

And then, there's an almost identical function for TPM v.1.2, TpmProvider12::Seal, with the difference that the ncbSzPin parameter is not checked against the 0x400 upper limit.
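Because the sealing calls are slow, it may be worth mirroring these size checks on the caller's side and failing fast before touching the TPM. The sketch below assumes the limits that I observed (0x80 bytes for the input, 0x400 bytes for the PIN on TPM 2.0 only). The names are hypothetical, and the limits could differ in other Windows builds.

```cpp
#include <cstddef>

//Hypothetical TPM version selector for the check below
enum class TpmVersion
{
	V12,    //TPM 1.2
	V20     //TPM 2.0
};

//Limits as observed in TpmProvider20::Seal (both inclusive)
constexpr size_t kMaxSealInput = 0x80;      //128 bytes
constexpr size_t kMaxSealPin20 = 0x400;     //1024 bytes, TPM 2.0 only

//Returns true if the sizes are within the observed limits.
//For TPM 1.2 the PIN size is not checked, matching what was seen in
//TpmProvider12::Seal.
inline bool SealSizesLookValid(TpmVersion ver, size_t ncbSzInput, size_t ncbSzPIN)
{
	if(ncbSzInput > kMaxSealInput)
	{
		return false;
	}

	if(ver == TpmVersion::V20 &&
		ncbSzPIN > kMaxSealPin20)
	{
		return false;
	}

	return true;
}
```

A failed check here corresponds to the case where the real call would have returned 0x80090027, or NTE_INVALID_PARAMETER.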

To convert an HRESULT error code returned by many functions in the PCPKsp.dll module to SECURITY_STATUS, it uses an internal function, called SecurityStatusFromHResult.
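I did not reproduce SecurityStatusFromHResult itself, but when reading these codes it helps to split an HRESULT into its standard fields: the severity bit, the 11-bit facility and the 16-bit code. The sketch below does just that. The struct and function names are hypothetical.

```cpp
#include <cstdint>

//Hypothetical container for the standard HRESULT fields
struct HResultParts
{
	bool failure;       //Bit 31: set for failure codes
	uint16_t facility;  //Bits 16-26: which subsystem produced the code
	uint16_t code;      //Bits 0-15: the error code within that facility
};

inline HResultParts SplitHResult(uint32_t hr)
{
	HResultParts parts;
	parts.failure = (hr >> 31) != 0;
	parts.facility = static_cast<uint16_t>((hr >> 16) & 0x7FF);
	parts.code = static_cast<uint16_t>(hr & 0xFFFF);
	return parts;
}
```

For instance, 0x80280921 (TPM_20_E_LOCKOUT) splits into facility 0x28, which is FACILITY_TPM_SERVICES, and code 0x921, while 0x80090027 (NTE_INVALID_PARAMETER) splits into facility 9, which is FACILITY_SSPI, and code 0x27.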

TPM Native Implementation

If you prefer to approach the TPM from an even lower level, you may try to use "TPM Base Services" for that. These are not the easiest set of APIs to call. But overall you will need to create a context with a call to Tbsi_Context_Create, then use the Tbsip_Submit_Command function to pass commands to the TPM, and finally remember to invoke Tbsip_Context_Close, among other functions.

Alternatively, if you want to go even lower:

Tbsi_Context_Create function internally communicates with the TPM driver using the "\\??\\TPM" kernel symbolic link by opening a file object using the native NtCreateFile call. It then uses the undocumented TpmDeviceIoControl function to perform the following sequence:
- NtCreateEvent creates a temporary event to communicate with the driver.
- NtDeviceIoControlFile passes an IOCTL request to the driver.
- NtWaitForSingleObject waits for the event, that will be signaled by the driver when the IO request is completed.
- NtClose cleans up the event.
The Tbsip_Submit_Command function operates in a similar manner:
- It uses the file object handle that was previously opened by a call to Tbsi_Context_Create.
- It then uses a similar sequence of calls to NtCreateEvent, NtDeviceIoControlFile, NtWaitForSingleObject and NtClose to communicate with the driver.

Overall, considering the built-in inefficiency of the TPM crypto implementation, I would not recommend programming it from a very low level.

Conclusion

A TPM chip is a good helper for the security needs of your app. Unfortunately, though, it may not be present in every hardware configuration that your Windows app runs on. So make sure to check the error codes returned by the Windows functions and be able to pivot to another solution if a TPM is not available.
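Besides checking error codes after the fact, one way to find out up front whether a TPM is present is the TPM Base Services call Tbsi_GetDeviceInfo, which I did not use elsewhere in this post. The sketch below is a hedged example: the wrapper name QueryTpmVersion is hypothetical, and the code is guarded with _WIN32 so that on other platforms it compiles to a stub that reports no TPM.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#include <tbs.h>
#pragma comment(lib, "Tbs.lib")
#endif

//Hypothetical wrapper.
//Returns true and sets 'version' to 1 (TPM 1.2) or 2 (TPM 2.0) if TBS
//reports a TPM. Otherwise returns false and sets 'version' to 0.
inline bool QueryTpmVersion(uint32_t& version)
{
	version = 0;

#ifdef _WIN32
	TPM_DEVICE_INFO info = {};
	if(Tbsi_GetDeviceInfo(sizeof(info), &info) != TBS_SUCCESS)
	{
		//No TPM, or TBS is not available
		return false;
	}

	if(info.tpmVersion == TPM_VERSION_12)
	{
		version = 1;
		return true;
	}

	if(info.tpmVersion == TPM_VERSION_20)
	{
		version = 2;
		return true;
	}
#endif

	//Unknown version, or not running on Windows
	return false;
}
```

A positive answer here only says that TBS sees a TPM. The NCrypt calls can still fail for other reasons, so their return codes need checking regardless.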

One much less secure workaround, if a TPM is not available, is to use the CryptProtectData and CryptUnprotectData functions.
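The "pivot" logic can be sketched independently of the actual crypto calls. In the example below both protection routines are passed in as callbacks, so the first one could wrap TPM_encrypt_decrypt and the second one CryptProtectData. All names here are hypothetical, and treating only NTE_DEVICE_NOT_READY and E_NOTIMPL as "TPM unavailable" is my assumption based on the error codes described earlier.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

//Hypothetical callback type: takes plaintext, fills in the protected blob,
//returns 0 on success or a status code on failure.
typedef std::function<uint32_t(const std::vector<uint8_t>&,
	std::vector<uint8_t>&)> ProtectFn;

//Which method produced the blob. Store it next to the blob, so that the
//decryption side knows which path to take.
enum class ProtectedBy : uint8_t
{
	None = 0,
	Tpm = 1,
	Fallback = 2
};

//Assumption: only these two codes mean that the TPM cannot be used at all
inline bool IsTpmUnavailable(uint32_t status)
{
	return status == 0x80090030 ||      //NTE_DEVICE_NOT_READY
		status == 0x80004001;           //E_NOTIMPL
}

inline ProtectedBy ProtectWithFallback(
	const ProtectFn& tpmProtect,
	const ProtectFn& fallbackProtect,
	const std::vector<uint8_t>& input,
	std::vector<uint8_t>& output)
{
	uint32_t status = tpmProtect(input, output);
	if(status == 0)
	{
		return ProtectedBy::Tpm;
	}

	//Fall back only if the TPM is missing or too old. Errors such as a
	//lockout should not silently downgrade the protection.
	if(IsTpmUnavailable(status) &&
		fallbackProtect(input, output) == 0)
	{
		return ProtectedBy::Fallback;
	}

	output.clear();
	return ProtectedBy::None;
}
```

The design choice worth noting is that the fallback is not taken for every failure, since that would let a locked-out TPM quietly turn into weaker protection.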
