Blog Post

Secure Programming Practices - Serialization

Example of secure binary serialization and de-serialization in C++.

Intro

Binary serialization, or more precisely insecure binary de-serialization, has been the cause of many software vulnerabilities. In this blog post I will demonstrate my approach to writing a secure method for serializing and, more importantly, de-serializing binary data.

Note that if a well-established library or framework that you are working with provides methods for serialization, always use them! DO NOT attempt to write your own serialization, especially if you've never done it before.

This blog post is designed primarily for people whose job it is to write serialization (and de-serialization) logic for such libraries.

You can download the BinSerialize project with the code samples that I will be using in this post from my GitHub.

This post will include video screencasts of me coding a small POC app to demonstrate my approach.

Test Classes

To be able to demonstrate a de-serialization algorithm, I need to write a serialization part first. For that I will create two fictional classes to test it with.

My Student class will hold information about students in my fictional school:

C++
struct Student
{
	//Age of the person, or 0 if not known
	//[MIN_ALLOWED_AGE - MAX_ALLOWED_AGE] acceptable range
	int nAge = 0;

	//Student's given name - must be provided
	//MAX_NAME_LEN_1 - max length
	std::string strGivenName;

	//Optional
	//MAX_NAME_LEN_1 - max length
	std::string strSecondName;

	//Optional
	//MAX_NAME_LEN_1 - max length
	std::string strThirdName;

	//Type of the person's attendance
	AttendanceType attendance = AttendanceType::Unknown;

	//true if the student is currently suspended
	bool bSuspended = false;

	//Student's performance score
	double fPerformanceScore = 0.0;

	//Internal notes about the student
	std::string strNotes;
};	

AttendanceType is declared as my custom enum.

Advice: When coding a class, make sure to include the range of acceptable values in the comments for each member variable. That way it will be easier to write the de-serialization logic for them.

The Student class is then used in MyClass, which holds information about my fictional class.

C++
struct MyClass
{
	//Year the class was established, 0 if unknown
	//[MIN_ALLOWED_YEAR - MAX_ALLOWED_YEAR] acceptable range
	int nYearEstablished = 0;

	//Name of the class, must be provided
	//MAX_NAME_LEN_2 - max length
	std::string strName;

	//People associated with a class
	std::vector<Student> students;

	//Internal notes about the class
	std::string strNotes;	
};

These two classes contain different data types, which will let us study the pitfalls and nuances of binary serialization.

Serialization

I will put the serialization logic into two functions named toByteArray, one for each of the classes that I showed above.

Serialization is the process of converting data from an internal structure, specific to your app, into an external data format that can be saved in a file or sent over a network. That format could be binary, text (JSON, XML), or anything else.

I usually follow these rules during serialization:

  • Place the serialization logic into separate functions inside each of the classes that need to be serialized. This way it will be easier to update it if any of the class members change.
  • I didn't show it in my example, but it is also wise to place a version number into your serialized data. If a future version of your app needs to add or change class members, the version number will let you tell the older layout of the serialized data apart from the newer one.
  • Try to place the minimum required amount of information into your serialized data. Don't duplicate information. For instance, when encoding a string, use either a null-terminated string, or a size followed by an array of characters of that size, but not both. This way you will avoid ambiguities when de-serializing.
  • Try to define common range values in a single location, such as a preprocessor definition or a global constant, instead of hardcoding them throughout the code. This way you will avoid refactoring your project if such a value changes. Examples of those are my MAX_NAME_LEN_1 and STR_CHAR.
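
The version number from the second rule could be handled along these lines. This is only a minimal sketch and not code from the BinSerialize project: SERIALIZED_VERSION, write_version and read_version are hypothetical names, and the sketch assumes that the version is stored as a 32-bit integer at the very start of the serialized data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical constant: bump it whenever the serialized layout changes
constexpr uint32_t SERIALIZED_VERSION = 1;

// Writes the version at the start of the buffer.
// Returns the number of bytes written, or 0 if the buffer is too small.
inline size_t write_version(uint8_t* pBuff, size_t szcbBuff)
{
	if(!pBuff || szcbBuff < sizeof(SERIALIZED_VERSION))
		return 0;

	std::memcpy(pBuff, &SERIALIZED_VERSION, sizeof(SERIALIZED_VERSION));
	return sizeof(SERIALIZED_VERSION);
}

// Reads the version back and rejects versions that this build does not know.
// Returns the number of bytes consumed, or 0 on failure.
inline size_t read_version(const uint8_t* pData, size_t szcbData, uint32_t& version)
{
	//Check the available size before reading anything
	if(!pData || szcbData < sizeof(version))
		return 0;

	uint32_t v;
	std::memcpy(&v, pData, sizeof(v));

	//Only accept the versions that this build knows how to parse
	if(v < 1 || v > SERIALIZED_VERSION)
		return 0;

	version = v;
	return sizeof(v);
}
```

The de-serialization side can then branch on the returned version and refuse anything that it does not recognize, instead of guessing the layout.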

To illustrate this concept, this is my version of the serialization function for the Student class:

Student::toByteArray
/// <summary>
/// Serializes this struct by converting it to a byte array
/// </summary>
/// <param name="pBuff">if not 0, pointer to the buffer to fill out</param>
/// <param name="szcbBuff">Size of provided 'pBuff' in bytes</param>
/// <returns>Size of the filled (or needed to fill) buffer in bytes, or 0 if error</returns>
size_t toByteArray(void* pBuff = nullptr, size_t szcbBuff = 0) const
{
	size_t szcbRet = 0;

	//Determine the size needed
	size_t szcbData = 
		aligned(sizeof(nAge)) +
		aligned_sizeof_str(strGivenName) +
		aligned_sizeof_str(strSecondName) +
		aligned_sizeof_str(strThirdName) +
		aligned(sizeof(attendance)) +
		aligned(sizeof(bSuspended)) +
		aligned(sizeof(fPerformanceScore)) +
		aligned_sizeof_str(strNotes);

	//Was the buffer provided?
	if(pBuff)
	{
		//Compare the size provided
		if(szcbBuff >= szcbData)
		{
			//Fill out the buffer
			uint8_t* pD = (uint8_t*)pBuff;

			//Clear provided buffer
			memset(pD, 0, szcbData);

			copy_aligned(pD, nAge);

			copy_aligned_str(pD, strGivenName);
			copy_aligned_str(pD, strSecondName);
			copy_aligned_str(pD, strThirdName);

			copy_aligned(pD, attendance);
			copy_aligned(pD, bSuspended);
			copy_aligned(pD, fPerformanceScore);

			copy_aligned_str(pD, strNotes);


			//Sanity check
			if(pD - (uint8_t*)pBuff == szcbData)
			{
				//All done!
				szcbRet = szcbData;
			}
			else
			{
				//Overflow
				assert(false);

#ifdef _WIN32
				//Microsoft specific code
				__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
				//General case
				abort();
#endif
			}
		}
		else
			assert(false);
	}
	else
	{
		//Only needs the size
		szcbRet = szcbData;
	}

	return szcbRet;
}

I personally start by writing the serialization logic for the innermost class. In this case, it's my Student class.

And for the MyClass class:

MyClass::toByteArray
/// <summary>
/// Serializes this struct by converting it to a byte array
/// </summary>
/// <param name="pBuff">if not 0, pointer to the buffer to fill out</param>
/// <param name="szcbBuff">Size of provided 'pBuff' in bytes</param>
/// <returns>Size of the filled (or needed to fill) buffer in bytes, or 0 if error</returns>
size_t toByteArray(void* pBuff = nullptr, size_t szcbBuff = 0) const
{
	size_t szcbRet = 0;

	//Determine the size needed
	size_t szcbData = 
		aligned(sizeof(nYearEstablished)) +
		aligned_sizeof_str(strName) +
		aligned(sizeof(size_t)) +               //Count of elements in the 'students' array
		aligned_sizeof_str(strNotes);

	std::vector<Student>::const_iterator itr = students.begin();
	const std::vector<Student>::const_iterator itrEnd = students.end();

	for(; itr != itrEnd; itr++)
	{
		szcbData += itr->toByteArray();
	}

	//Was the buffer provided?
	if(pBuff)
	{
		//Compare the size provided
		if(szcbBuff >= szcbData)
		{
			//Fill out the buffer
			uint8_t* pD = (uint8_t*)pBuff;
			uint8_t* pEnd = pD + szcbData;

			//Clear provided buffer
			memset(pD, 0, szcbData);

			copy_aligned(pD, nYearEstablished);

			copy_aligned_str(pD, strName);

			//Students array
			size_t szCntStudents = students.size();
			copy_aligned(pD, szCntStudents);

			for(itr = students.begin(); itr != itrEnd; itr++)
			{
				size_t szcb = itr->toByteArray(pD, pEnd - pD);
				if(!szcb)
				{
					//Failed
					assert(false);

#ifdef _WIN32
					//Microsoft specific code
					__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
					//General case
					abort();
#endif
				}

				pD += szcb;
			}
			
			//Add notes
			copy_aligned_str(pD, strNotes);


			//Sanity check
			if(pD - (uint8_t*)pBuff == szcbData)
			{
				//All done!
				szcbRet = szcbData;
			}
			else
			{
				//Overflow
				assert(false);

#ifdef _WIN32
				//Microsoft specific code
				__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
				//General case
				abort();
#endif
			}
		}
		else
			assert(false);
	}
	else
	{
		//Only needs the size
		szcbRet = szcbData;
	}

	return szcbRet;
}

Both functions are very similar. There are several things that I need to point out:

  1. I'm using my global ALIGN_BY preprocessor definition for the alignment to be used for serialization and de-serialization.
    C++
    #define ALIGN_BY (sizeof(void*))    //Align by this number of bytes, or comment out to remove alignment

    If you comment out that line, the code logic will not use any alignment.

    The alignment is not strictly necessary for the Intel and AMD x86 family of CPUs. But it may be required for an ARM architecture.
  2. Notice that I'm placing most of the repeating serialization logic into templates. This will help to eliminate mistakes, mostly in copy-and-pasting similar chunks of code.

    For instance, I'm using the following template to preserve the alignment, defined in the ALIGN_BY macro:

    aligned template
    /// <summary>
    /// Template that returns an aligned size for 'n'
    /// </summary>
    /// <param name="n">Size to align</param>
    /// <returns>Aligned size</returns>
    template<class T>
    inline T aligned(T n)
    {
    #ifdef ALIGN_BY
    	//Use alignment
    
    	static_assert(ALIGN_BY > 0, "Alignment must be positive");
    	static_assert(std::popcount(ALIGN_BY) == 1, "Alignment must be done by a size the power of 2");
    
    	return n + ((-(intptr_t)n) & (ALIGN_BY - 1));
    
    #else
    	//No alignment
    	return n;
    #endif
    }

    Another template is used to determine the size needed to store an STL string, considering the alignment:

    aligned_sizeof_str template
    /// <summary>
    /// Calculates size of STL string taking alignment into account
    /// </summary>
    /// <param name="s">STL string</param>
    /// <returns>Aligned size</returns>
    template<class T>
    inline size_t aligned_sizeof_str(T& s)
    {
    	/*
    	size_t length;
    	char[] str;
    	*/
    
    	return aligned(sizeof(size_t)) +
    		aligned(s.size() * sizeof(STR_CHAR));
    }

    Or the following template helps to copy a primitive data type into a memory byte array:

    copy_aligned template
    /// <summary>
    /// Copy primitive type into a memory location
    /// </summary>
    /// <param name="p">Pointer to the memory location. It will be incremented by the sizeof 's'</param>
    /// <param name="s">Primitive variable to copy</param>
    template<class T>
    inline void copy_aligned(uint8_t*& p, T s)
    {
    	*(T*)p = s;
    	p += aligned(sizeof(s));
    }	

    And, a similar template to copy an STL string:

    copy_aligned_str template
    /// <summary>
    /// Copy STL string into a memory location
    /// </summary>
    /// <param name="p">Pointer to the memory location. It will be incremented by the sizeof string</param>
    /// <param name="s">STL string to copy</param>
    template<class T>
    inline void copy_aligned_str(uint8_t*& p, T& s)
    {
    	size_t szStr = s.size();
    
    	*(size_t*)p = szStr;
    	p += aligned(sizeof(szStr));
    
    	memcpy(p, s.c_str(), szStr * sizeof(STR_CHAR));
    	p += aligned(szStr * sizeof(STR_CHAR));
    }

    The beauty of these templates is that a modern C++ compiler will try to optimize everything by inlining them into the function they are called from, making the resulting binary code quite efficient while preserving good readability and safety of the source code.

  3. At the end of each serialization routine I'm checking the final position of my moving pointer pD in the conditional statement that I labeled as the "Sanity check". This if-statement detects a buffer overrun that may be caused by a fault in my internal logic. If that is the case, I crash the process with a call to the Microsoft-specific __fastfail function, or with the more generic abort.
    By crashing the process in that case we ensure that the operation does not continue with possible memory corruption, and we also let developers know of a serious fault in the serialization logic.
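
Because toByteArray returns the required size when it is called without a buffer, the calling side can use it in two passes: ask for the size first, then fill out a buffer of exactly that size. The following is a minimal sketch of that pattern. Sample is a hypothetical stand-in struct that only exists so that the snippet compiles on its own, and serialize_to_vector is a hypothetical helper that is not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Student or MyClass that follows the same
// convention: no buffer = return the needed size, 0 = error
struct Sample
{
	int32_t nValue = 0;

	size_t toByteArray(void* pBuff = nullptr, size_t szcbBuff = 0) const
	{
		const size_t szcbData = sizeof(nValue);
		if(!pBuff)
			return szcbData;        //Only the size is needed

		if(szcbBuff < szcbData)
			return 0;               //Provided buffer is too small

		std::memcpy(pBuff, &nValue, sizeof(nValue));
		return szcbData;
	}
};

// Serializes 'obj' into a byte vector, or returns an empty vector on error
template<class T>
std::vector<uint8_t> serialize_to_vector(const T& obj)
{
	std::vector<uint8_t> data;

	//First call: ask for the required size
	const size_t szcbNeeded = obj.toByteArray();
	if(szcbNeeded == 0)
		return data;

	//Second call: fill out a buffer of exactly that size
	data.resize(szcbNeeded);
	if(obj.toByteArray(data.data(), data.size()) != szcbNeeded)
		data.clear();

	return data;
}
```

Letting the class report its own size keeps the size calculation and the copying logic in one place, so the caller never has to guess how large the buffer should be.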

Screencast - Serialization

Watch this video demonstration of how I would write my serialization logic in C++:

Video Timecodes

The following time-coded segments will open in a YouTube player:

  • 1:56 - What is serialization & de-serialization.
  • 6:38 - Starting to code serialization example in C++.
  • 7:57 - Coding two sample classes: MyClass and Student.
  • 30:53 - Writing code to initialize our test data for MyClass.
  • 37:12 - Starting to write serialization code.
  • 38:13 - Writing serialization logic for the Student class: toByteArray() function.
  • 1:23:59 - Writing serialization logic for MyClass class: toByteArray() function.
  • 1:38:18 - Writing code to test our serialization logic.
  • 1:46:42 - Trying to optimize efficiency of our alignment code/template: checking optimized Assembly language code with IDA free.

Note that in the last part of my video demonstration I optimized the final version of the alignment template after analyzing the Assembly language code that the compiler produced for the production build. I replaced:

C++
	return n + ((n % ALIGN_BY) != 0 ? ALIGN_BY - (n % ALIGN_BY) : 0);

With:

C++
	return n + ((-(intptr_t)n) & (ALIGN_BY - 1));

That is one way to gauge the efficiency of your production code.
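
When swapping one formula for another like this, it is worth confirming that both produce the same results. Here is a small sketch of such a check. It assumes an 8-byte alignment through the hypothetical constant kAlignBy (instead of the ALIGN_BY macro), and the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: 8-byte alignment (a power of 2)
constexpr size_t kAlignBy = 8;

// Original version: uses the modulo operator
constexpr size_t aligned_mod(size_t n)
{
	return n + ((n % kAlignBy) != 0 ? kAlignBy - (n % kAlignBy) : 0);
}

// Optimized version: the negation and the bitwise AND compute the same
// padding, but only when the alignment is a power of 2
constexpr size_t aligned_mask(size_t n)
{
	return n + ((-(intptr_t)n) & (kAlignBy - 1));
}

// Checks that both versions agree for every size in [0, limit)
inline bool alignments_agree(size_t limit)
{
	for(size_t n = 0; n < limit; n++)
	{
		if(aligned_mod(n) != aligned_mask(n))
			return false;
	}
	return true;
}

static_assert(aligned_mask(0) == 0, "0 needs no padding");
static_assert(aligned_mask(1) == 8, "1 rounds up to 8");
static_assert(aligned_mask(8) == 8, "8 is already aligned");
static_assert(aligned_mask(13) == 16, "13 rounds up to 16");
```

The bitwise version relies on the alignment being a power of 2, which is exactly what the static_assert with std::popcount in the aligned template enforces.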

De-serialization

To follow my naming pattern, I will code the de-serialization logic in the fromByteArray functions in each of my classes.

De-serialization, as you might have concluded, is the opposite of serialization. It converts the app data from an external format (be it binary or a text format, such as XML or JSON) into an internal data structure for the app.
Because of that conversion, the de-serialization logic may expose the internal app data to input from the outside world. That is why the de-serialization code is one of the most attacked surfaces of any application, especially if it's web-facing software, such as a web browser, or if it's a kernel component.

Incorrectly written or buggy de-serialization can be the reason why an evil font could pwn your computer via Adobe Reader, how your PHP server can execute arbitrary attacker-controlled code on your behalf, why someone can gain code execution via an Adobe automation service, or how a threat actor could take over your iPhone by sending you a specially crafted iMessage, and more.

There's even a special category, CWE-502 (Deserialization of Untrusted Data), created just for that class of vulnerabilities.

Because of the heightened security considerations I usually follow these rules for de-serialization:

  • Place your de-serialization logic next to the serialization one to minimize the possibility of mistakes or omissions.
  • When parsing external input always check everything for correctness, going step-by-step. Pay extra attention to all details.
  • Before reading data, even for primitive types, always check available buffer for overflows.
  • When receiving sizes of data arrays from an external source, always check them for overflows or underflows.
  • To minimize the risk of a denial-of-service (DoS) attack on your app, always check that externally supplied buffer sizes adhere to the allowed ranges. Don't let them be too large.
  • Make sure to distinguish between signed and unsigned variables, especially when performing range checks. Keep in mind that an attacker can supply a huge unsigned value that turns negative when it is interpreted as a signed one, and vice versa.
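
To illustrate the rules about sizes and overflows, here is a minimal sketch of a bounds check for an externally supplied element count. The names can_read_array and MAX_ELEMENTS are hypothetical and only serve this example. The point is that the check compares sizes and never computes a pointer or a product that could wrap around.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical limit for this sketch
constexpr size_t MAX_ELEMENTS = 1024;

// Returns true if 'count' elements of 'szcbElement' bytes each fit into
// the unread part of the buffer [p, pEnd) and 'count' is within the limit.
inline bool can_read_array(const uint8_t* p, const uint8_t* pEnd,
	size_t count, size_t szcbElement)
{
	if(!p || !pEnd || p > pEnd || szcbElement == 0)
		return false;

	//Reject sizes outside of the allowed range first
	if(count > MAX_ELEMENTS)
		return false;

	//Compare sizes instead of computing 'p + count * szcbElement':
	//the multiplication or the pointer addition could wrap around
	const size_t szcbLeft = (size_t)(pEnd - p);
	return count <= szcbLeft / szcbElement;
}
```

A count of (size_t)-1, which is what a "negative" size looks like in an unsigned variable, is rejected by the range check before any arithmetic is done with it.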

To demonstrate some of these techniques, let me show you the de-serialization logic for my Student class:

Student::fromByteArray
/// <summary>
/// De-serializes byte array into this struct
/// </summary>
/// <param name="pData">Byte array to convert</param>
/// <param name="szcbData">Size of 'pData' in bytes</param>
/// <returns>[1 and up) if success, for amount of bytes used, 0 if error - in this case this struct will be reset</returns>
size_t fromByteArray(const void* pData, size_t szcbData)
{
	while(true)
	{
		//Do we have a pointer to data?
		if(!pData)
			break;

		//Check overall data size provided
		if((intptr_t)szcbData <= 0)
			break;

		const uint8_t* pS = (const uint8_t*)pData;
		const uint8_t* pEnd = pS + szcbData;
		assert(pEnd > pS);

		//Check 'nAge'
		if(!read_aligned(pS, pEnd, nAge))
			break;

		if(nAge != 0)
		{
			if(nAge < MIN_ALLOWED_AGE ||
				nAge > MAX_ALLOWED_AGE)
			{
				break;
			}
		}

		//Check 'strGivenName'
		if(!read_aligned_str(pS, pEnd, strGivenName, MAX_NAME_LEN_1))
			break;

		if(strGivenName.empty())
			break;

		//Check 'strSecondName'
		if(!read_aligned_str(pS, pEnd, strSecondName, MAX_NAME_LEN_1))
			break;


		//Check 'strThirdName'
		if(!read_aligned_str(pS, pEnd, strThirdName, MAX_NAME_LEN_1))
			break;

		//Check 'attendance'
		if(!read_aligned(pS, pEnd, attendance))
			break;

		if(attendance < AttendanceType::Unknown ||
			attendance >= AttendanceType::MaxCount)
			break;

		//Check 'bSuspended'
		if(!read_aligned(pS, pEnd, bSuspended))
			break;

		if(bSuspended != true &&
			bSuspended != false)
			break;

		//Check 'fPerformanceScore'
		if(!read_aligned_double(pS, pEnd, fPerformanceScore))
			break;


		//Check 'strNotes'
		if(!read_aligned_str(pS, pEnd, strNotes, 0))
			break;

		//Sanity check
		if(pS <= pEnd)
		{
			//Success!
			return pS - (const uint8_t*)pData;
		}
		else
		{
			//Overflow
			assert(false);

#ifdef _WIN32
			//Microsoft specific code
			__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
			//General case
			abort();
#endif
		}

		break;
	}

	//Failure to de-serialize

	//Reset this struct
	*this = Student();

	return 0;
}

Then let's review the same but for MyClass:

MyClass::fromByteArray
/// <summary>
/// De-serializes byte array into this struct
/// </summary>
/// <param name="pData">Byte array to convert</param>
/// <param name="szcbData">Size of 'pData' in bytes</param>
/// <returns>[1 and up) if success, for amount of bytes used, 0 if error - in this case this struct will be reset</returns>
size_t fromByteArray(const void* pData, size_t szcbData)
{
	while(true)
	{
		//Do we have a pointer to data?
		if(!pData)
			break;

		//Check overall data size provided
		if((intptr_t)szcbData <= 0)
			break;

		const uint8_t* pS = (const uint8_t*)pData;
		const uint8_t* pEnd = pS + szcbData;
		assert(pEnd > pS);

		//Check 'nYearEstablished'
		if(!read_aligned(pS, pEnd, nYearEstablished))
			break;

		if(nYearEstablished != 0)
		{
			if(nYearEstablished < MIN_ALLOWED_YEAR ||
				nYearEstablished > MAX_ALLOWED_YEAR)
				break;
		}

		//Check 'strName'
		if(!read_aligned_str(pS, pEnd, strName, MAX_NAME_LEN_2))
			break;

		if(strName.empty())
			break;

		//Check 'students'
		size_t szCntStudents;
		if(!read_aligned(pS, pEnd, szCntStudents))
			break;

		if((intptr_t)szCntStudents < 0)
			break;

		//Get all students
		bool bReadStudentsOK = true;
		students.clear();

		Student st;

		for(size_t s = 0; s < szCntStudents; s++)
		{
			size_t szcb = st.fromByteArray(pS, pEnd - pS);
			if(!szcb)
			{
				//Failed
				bReadStudentsOK = false;

				break;
			}

			//Add student to the list
			students.push_back(st);

			pS += szcb;
		}

		if(!bReadStudentsOK)
			break;

		//Check 'strNotes'
		if(!read_aligned_str(pS, pEnd, strNotes, 0))
			break;


		//Sanity check
		if(pS <= pEnd)
		{
			//Success!
			return pS - (const uint8_t*)pData;
		}
		else
		{
			//Overflow
			assert(false);

#ifdef _WIN32
			//Microsoft specific code
			__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
			//General case
			abort();
#endif
		}

		break;
	}

	//Failure to de-serialize

	//Reset this struct
	*this = MyClass();

	return 0;
}

Let me expand on some of the techniques:

  • Notice that, as in the serialization logic, I'm using templates for repeated checks. This will not only help with readability of the code, but will also help eliminate copy-and-paste bugs for repeated logic.

    For instance, this is how I check and read primitive data types from the external byte array:

    read_aligned template
    /// <summary>
    /// Read primitive type from memory, by checking for overruns
    /// </summary>
    /// <param name="p">Pointer to byte array to read from</param>
    /// <param name="pEnd">End of the byte array, exclusive</param>
    /// <param name="s">Primitive type to set</param>
    /// <returns>true if success, false if failed</returns>
    template<class T>
    inline bool read_aligned(const uint8_t*& p, const uint8_t* pEnd, T& s)
    {
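    	intptr_t szcb = aligned(sizeof(s));
    	if(pEnd - p < szcb)
    	{
    		//Overrun (sizes are compared so that 'p + szcb' is never computed past the buffer)
    		return false;
    	}
    
    	//memcpy avoids an unaligned access if 'p' is not suitably aligned for T
    	memcpy(&s, p, sizeof(s));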
    	p += szcb;
    
    	return true;
    }

    Note that I check the available size first and only then read the data.

    It is important to repeat this for every step on the way!
  • Checking more complex data structures requires additional attention. For instance, this is how I read an STL string from the external memory array in my other template:
    read_aligned_str template
    /// <summary>
    /// Read STL string from memory, by checking for overruns
    /// </summary>
    /// <param name="p">Pointer to byte array to read from</param>
    /// <param name="pEnd">End of the byte array, exclusive</param>
    /// <param name="s">STL string to set</param>
    /// <param name="szchMaxLen">if not 0, maximum allowed length of 's' in characters</param>
    /// <returns>true if success, false if failed</returns>
    template<class T>
    inline bool read_aligned_str(const uint8_t*& p, const uint8_t* pEnd, T& s, size_t szchMaxLen)
    {
    	/*
    	size_t length;
    	char[] str;
    	*/
    
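    	size_t sz;
    	if(!read_aligned(p, pEnd, sz))
    	{
    		return false;
    	}
    
    	//Enforce the length limit before touching the character data
    	if(szchMaxLen > 0 && sz > szchMaxLen)
    	{
    		return false;
    	}
    
    	//Compare sizes instead of pointers: 'p + sz' could wrap around for a huge 'sz'.
    	//The aligned size is checked too, since 'p' is advanced by it below.
    	size_t szcbLeft = pEnd - p;
    	if((intptr_t)sz < 0 ||
    		sz > szcbLeft / sizeof(STR_CHAR) ||
    		aligned(sz * sizeof(STR_CHAR)) > szcbLeft)
    	{
    		//Overrun
    		return false;
    	}
    
    	s.assign((const STR_CHAR*)p, sz);
    	p += aligned(sz * sizeof(STR_CHAR));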
    
    	return true;
    }

    Same logic applies here as well, except that I also need to perform additional checks of the size of the character array that follows. It is paramount to check for negative array sizes.

  • Another important step is to clear the class that is being de-serialized if the process fails. This will prevent faulty, or deliberately crafted data from entering the app's memory. I do this at the "Failure to de-serialize" comment in my logic above.
  • Lastly, I perform a similar sanity check, as I did for my serialization logic, and deliberately crash the app if it fails.
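
One more detail that may be worth considering for bool and enum members: the compiler is allowed to assume that a bool only ever holds true or false, so a check that is performed after the raw bytes were already copied into the bool may be optimized away. A possible way around it is to read the raw integer first, validate it, and only then convert it. The sketch below illustrates that idea under a few assumptions: the bool is stored as one byte, the enum is stored as an int32_t, and the alignment padding is left out. Attendance, read_bool and read_attendance are hypothetical names that are not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical enum for this sketch
enum class Attendance : int32_t { Unknown = 0, FullTime, PartTime, MaxCount };

// Reads a bool that was stored as one byte, accepting only 0 or 1
inline bool read_bool(const uint8_t*& p, const uint8_t* pEnd, bool& b)
{
	if(pEnd - p < (std::ptrdiff_t)sizeof(uint8_t))
		return false;

	const uint8_t raw = *p;
	if(raw > 1)
		return false;           //Not a valid bool representation

	b = (raw == 1);
	p += sizeof(uint8_t);
	return true;
}

// Reads an enum that was stored as int32_t, validating the raw number
inline bool read_attendance(const uint8_t*& p, const uint8_t* pEnd, Attendance& a)
{
	int32_t raw;
	if(pEnd - p < (std::ptrdiff_t)sizeof(raw))
		return false;

	std::memcpy(&raw, p, sizeof(raw));
	if(raw < (int32_t)Attendance::Unknown ||
		raw >= (int32_t)Attendance::MaxCount)
	{
		return false;           //Out of range
	}

	a = (Attendance)raw;
	p += sizeof(raw);
	return true;
}
```

In both functions the output variable and the read pointer are only modified after the raw value has passed validation.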

Screencast - De-serialization

Watch this video demonstration of how I would write my de-serialization logic in C++:

Video Timecodes

The following time-coded segments will open in a YouTube player:

  • 1:24 - Why de-serialization poses a greater security risk to your app.
  • 5:03 - Starting coding de-serialization logic.
  • 6:35 - Writing de-serialization logic for the Student class: fromByteArray() function.
  • 33:06 - Example of why writing a secure de-serialization code is hard - I put a bug in my code. (I'll correct it in part 3.)
  • 48:07 - Another example why it's important to pay attention to what you are writing during an important sequence of code: I made another bug. (I will correct it later.)
  • 53:26 - Writing de-serialization logic for MyClass class: fromByteArray() function.
  • 1:11:36 - Writing code to test our de-serialization logic.
  • 1:15:38 - Writing a fuzzer to brute-force my de-serialization logic.

Note that de-serialization logic is so tricky that it is virtually guaranteed that you will make a mistake. To prove my point, I inadvertently recorded myself introducing several bugs in my demonstration. The following correction explains them and fixes everything.
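
Since mistakes are so likely, a brute-force fuzzer is a cheap way to catch some of them. The following is a minimal sketch of the idea and not the fuzzer from my video. parse_string is a hypothetical stand-in for a fromByteArray function, so that the snippet compiles on its own, and fuzz_parse_string is a hypothetical helper that feeds it random buffers and checks a few invariants.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for a fromByteArray function: parses a
// length-prefixed string. Returns bytes used, or 0 on error (and resets 's').
inline size_t parse_string(const uint8_t* pData, size_t szcbData, std::string& s)
{
	uint32_t len;
	if(pData && szcbData >= sizeof(len))
	{
		std::memcpy(&len, pData, sizeof(len));
		if(len <= szcbData - sizeof(len))
		{
			s.assign((const char*)pData + sizeof(len), len);
			return sizeof(len) + len;
		}
	}

	s.clear();
	return 0;
}

// Feeds 'iterations' random buffers into the parser and verifies that it
// never claims to have used more bytes than it was given.
inline bool fuzz_parse_string(size_t iterations, uint32_t seed)
{
	std::mt19937 rng(seed);
	std::uniform_int_distribution<int> byteDist(0, 255);
	std::uniform_int_distribution<size_t> sizeDist(0, 64);

	for(size_t i = 0; i < iterations; i++)
	{
		std::vector<uint8_t> data(sizeDist(rng));
		for(uint8_t& b : data)
			b = (uint8_t)byteDist(rng);

		std::string s = "stale";
		const size_t used = parse_string(data.data(), data.size(), s);

		if(used > data.size())
			return false;       //Claimed more than was available
		if(used == 0 && !s.empty())
			return false;       //Failure must reset the output
		if(used != 0 && s.size() != used - sizeof(uint32_t))
			return false;       //Parsed length does not match bytes used
	}
	return true;
}
```

A loop like this is most useful when the test build is compiled with a memory checker, such as AddressSanitizer, so that out-of-bounds reads are reported even when they don't crash the process.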

Screencast - De-serialization (Correction)

This is a short follow-up with corrections to the previous part:

Video Timecodes

The following time-coded segments will open in a YouTube player:

  • 0:28 - Correction for the incorrect std::string size check during de-serialization.
  • 3:25 - Adding a check if a double type refers to a valid number during de-serialization.
  • 7:32 - Final check if de-serialization logic produced the same structure as the original.
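
The Student::fromByteArray function above calls read_aligned_double, which is not listed in this post. A helper of that kind could look roughly like the sketch below. The name read_double_checked is hypothetical, the alignment padding is left out to keep it short, and the version in the BinSerialize project may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sketch only: reads a double and accepts it only if it is a finite number
inline bool read_double_checked(const uint8_t*& p, const uint8_t* pEnd, double& f)
{
	//Check the available size first
	if(pEnd - p < (std::ptrdiff_t)sizeof(double))
		return false;

	double v;
	std::memcpy(&v, p, sizeof(v));

	//Reject NaN and +/- infinity
	if(!std::isfinite(v))
		return false;

	f = v;
	p += sizeof(v);
	return true;
}
```

If the member has a documented range, such as a minimum and a maximum score, a range check would naturally follow the std::isfinite test.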

This will conclude my demonstration of the serialization and de-serialization coding techniques.

Conclusion

By sharing my approach to coding the binary serialization logic I was not trying to impose my coding style or technique. There are many ways to code that. I shared it, along with the explanations, to demonstrate the multitude of possible pitfalls in coding secure serialization and, most importantly, de-serialization logic. I base my approach on years of experience, and on the mistakes and bad examples that I've witnessed as a spectator, as well as a reverse-engineer.

Don't be the author of the next software vulnerability. Heed my advice.

Let's write secure software!
