Intro
Binary serialization, or more precisely insecure binary de-serialization, has been the cause of many software vulnerabilities. In this blog post I will demonstrate my approach to writing a secure method for serializing and, more importantly, de-serializing binary data.
Note that if a well-established library or framework that you are working with provides methods for serialization, always use them! DO NOT attempt to write your own serialization, especially if you've never done it before. This blog post is designed primarily for people whose job it is to write serialization (and de-serialization) logic for such libraries.
You can download the BinSerialize project with the code samples that I will be using in this post from my GitHub.
This post will include video screencasts of me coding a small POC app to demonstrate my approach.
Test Classes
To be able to demonstrate a de-serialization algorithm, I need to write the serialization part first. For that I will create two fictional classes to test it with.
My Student class will hold information about students in my fictional school:
struct Student
{
//Age of the person, or 0 if not known
//[MIN_ALLOWED_AGE - MAX_ALLOWED_AGE] acceptable range
int nAge = 0;
//Student's given name - must be provided
//MAX_NAME_LEN_1 - max length
std::string strGivenName;
//Optional
//MAX_NAME_LEN_1 - max length
std::string strSecondName;
//Optional
//MAX_NAME_LEN_1 - max length
std::string strThirdName;
//Type of the person's attendance
AttendanceType attendance = AttendanceType::Unknown;
//true if the student is currently suspended
bool bSuspended = false;
//Student's performance score
double fPerformanceScore = 0.0;
//Internal notes about the student
std::string strNotes;
};
The AttendanceType is declared in my custom enum.
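The enum declaration itself is not listed in this post. A minimal sketch of what it might look like is shown below. Only Unknown and MaxCount appear in the code of this post; the other members, the fixed underlying type and the is_valid_attendance helper are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical declaration: only 'Unknown' and 'MaxCount' are used in the post,
// the remaining members are made up for illustration.
enum class AttendanceType : int32_t
{
    Unknown = 0,
    FullTime,       // assumed member
    PartTime,       // assumed member
    Remote,         // assumed member
    MaxCount        // must stay last: one past the largest valid value
};

// Hypothetical helper: true if 'v' is one of the declared values
constexpr bool is_valid_attendance(AttendanceType v)
{
    return v >= AttendanceType::Unknown && v < AttendanceType::MaxCount;
}
```

Keeping a MaxCount sentinel as the last member means that a range check does not need to change when a new attendance type is added before it.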
Advice: When coding a class make sure to include the ranges of acceptable values in the comments for each member variable. That way it will be easier to write the de-serialization logic for them.
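The range names used in those comments are not defined anywhere in this post. One way they might be declared is sketched below. All numeric values are placeholders that I picked for the example, and zero_or_in_range is a hypothetical helper that mirrors the "0 if not known, otherwise within the range" rule from the comments.

```cpp
#include <cassert>
#include <cstddef>

// All values below are placeholders chosen for this sketch, not the project's real limits
constexpr int MIN_ALLOWED_AGE = 5;
constexpr int MAX_ALLOWED_AGE = 120;
constexpr int MIN_ALLOWED_YEAR = 1900;
constexpr int MAX_ALLOWED_YEAR = 2100;
constexpr std::size_t MAX_NAME_LEN_1 = 64;
constexpr std::size_t MAX_NAME_LEN_2 = 128;

// Hypothetical helper: accepts 0 ("not known") or a value within [lo, hi]
constexpr bool zero_or_in_range(int v, int lo, int hi)
{
    return v == 0 || (v >= lo && v <= hi);
}
```

With the limits named in one place, the comment on a member and the check in the de-serialization code can both refer to the same constant.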
Then the Student class is used in MyClass. It holds the information about my fictional class.
struct MyClass
{
//Year the class was established, 0 if unknown
//[MIN_ALLOWED_YEAR - MAX_ALLOWED_YEAR] acceptable range
int nYearEstablished = 0;
//Name of the class, must be provided
//MAX_NAME_LEN_2 - max length
std::string strName;
//People associated with a class
std::vector<Student> students;
//Internal notes about the class
std::string strNotes;
};
These two classes contain different data types that will let us study the pitfalls and nuances of binary serialization.
Serialization
I will encode the serialization logic into two functions named toByteArray, one for each of the classes that I showed above.
Serialization is the process of converting data from an internal structure, specific to your app, into an external data format that can be saved in a file or sent over a network. That format could be binary, text (JSON, XML) or anything else.
I usually follow these rules during serialization:
- Place the serialization logic into separate functions inside of each of the classes that need to be serialized. This way it will be easier to update it if any of the class members change.
- I didn't show it in my example, but it is also wise to place a version number into your serialized data. If a future version of your app adds or changes class members, the version number will let you tell the older layout of the serialized data apart from the newer one.
- Try to place the minimum required amount of information into your serialized data. Don't create duplicate information. For instance, when encoding a string either use a null-terminated string, or a size of string and an array of characters to match that size. This way you will avoid ambiguities when de-serializing.
- Try to place common range values into a single location, like a preprocessor definition or a global constant, instead of hardcoding them in code. This way you will avoid refactoring your project if such a value changes. Examples of those could be my MAX_NAME_LEN_1, or STR_CHAR.
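The version-number rule from the list above is not part of the sample project, so here is a minimal sketch of how it could be done. The SERIALIZED_VERSION constant and the write_version and read_version functions are hypothetical names, and the 32-bit version field is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical format version, bumped whenever the serialized layout changes
constexpr uint32_t SERIALIZED_VERSION = 1;

// Writes the version at the start of 'pBuff'.
// Returns the number of bytes written, or 0 if the buffer is too small.
inline std::size_t write_version(void* pBuff, std::size_t szcbBuff)
{
    if(!pBuff || szcbBuff < sizeof(SERIALIZED_VERSION))
        return 0;
    std::memcpy(pBuff, &SERIALIZED_VERSION, sizeof(SERIALIZED_VERSION));
    return sizeof(SERIALIZED_VERSION);
}

// Reads the version from the start of 'pData' into 'version'.
// Returns the number of bytes consumed, or 0 if the data is truncated
// or the version is not one that this code knows how to parse.
inline std::size_t read_version(const void* pData, std::size_t szcbData, uint32_t& version)
{
    if(!pData || szcbData < sizeof(version))
        return 0;
    uint32_t v;
    std::memcpy(&v, pData, sizeof(v));
    if(v == 0 || v > SERIALIZED_VERSION)
        return 0;   // unknown or future layout: refuse to parse
    version = v;
    return sizeof(v);
}
```

A reader that sees an unsupported version stops before it interprets any of the bytes that follow, which is the safe default when the layout is unknown.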
To illustrate this concept, this is my version of the serialization function for the Student class:
/// <summary>
/// Serializes this struct by converting it to a byte array
/// </summary>
/// <param name="pBuff">if not 0, pointer to the buffer to fill out</param>
/// <param name="szcbBuff">Size of provided 'pBuff' in bytes</param>
/// <returns>Size of the filled (or needed to fill) buffer in bytes, or 0 if error</returns>
size_t toByteArray(void* pBuff = nullptr, size_t szcbBuff = 0) const
{
size_t szcbRet = 0;
//Determine the size needed
size_t szcbData =
aligned(sizeof(nAge)) +
aligned_sizeof_str(strGivenName) +
aligned_sizeof_str(strSecondName) +
aligned_sizeof_str(strThirdName) +
aligned(sizeof(attendance)) +
aligned(sizeof(bSuspended)) +
aligned(sizeof(fPerformanceScore)) +
aligned_sizeof_str(strNotes);
//Was the buffer provided?
if(pBuff)
{
//Compare the size provided
if(szcbBuff >= szcbData)
{
//Fill out the buffer
uint8_t* pD = (uint8_t*)pBuff;
//Clear provided buffer
memset(pD, 0, szcbData);
copy_aligned(pD, nAge);
copy_aligned_str(pD, strGivenName);
copy_aligned_str(pD, strSecondName);
copy_aligned_str(pD, strThirdName);
copy_aligned(pD, attendance);
copy_aligned(pD, bSuspended);
copy_aligned(pD, fPerformanceScore);
copy_aligned_str(pD, strNotes);
//Sanity check
if(pD - (uint8_t*)pBuff == szcbData)
{
//All done!
szcbRet = szcbData;
}
else
{
//Overflow
assert(false);
#ifdef _WIN32
//Microsoft specific code
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
//General case
abort();
#endif
}
}
else
assert(false);
}
else
{
//Only needs the size
szcbRet = szcbData;
}
return szcbRet;
}
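The function above follows a two-call convention: called without a buffer it only reports the size it needs, and called with a buffer it fills it. A minimal sketch of a caller is shown below. The Sample struct is a hypothetical stand-in with the same toByteArray contract (it returns 0 for a buffer that is too small), and serialize_to_vector is a helper name that I made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Student: same toByteArray() contract, trivial payload
struct Sample
{
    int32_t nValue = 0;

    std::size_t toByteArray(void* pBuff = nullptr, std::size_t szcbBuff = 0) const
    {
        const std::size_t szcbData = sizeof(nValue);
        if(!pBuff)
            return szcbData;            // size query only
        if(szcbBuff < szcbData)
            return 0;                   // buffer too small
        std::memcpy(pBuff, &nValue, sizeof(nValue));
        return szcbData;
    }
};

// Two-pass call: ask for the size, allocate, then fill the buffer.
// Returns an empty vector if serialization fails.
template<class T>
std::vector<uint8_t> serialize_to_vector(const T& obj)
{
    const std::size_t szcbNeeded = obj.toByteArray();
    std::vector<uint8_t> buff(szcbNeeded);
    if(szcbNeeded == 0 ||
        obj.toByteArray(buff.data(), buff.size()) != szcbNeeded)
    {
        buff.clear();
    }
    return buff;
}
```

Comparing the second return value with the size from the first call catches the case when the object changed, or the size calculation and the fill logic went out of sync, between the two calls.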
I personally start by writing the serialization logic for the innermost class. In this case, it's my Student class.
And for the MyClass class:
/// <summary>
/// Serializes this struct by converting it to a byte array
/// </summary>
/// <param name="pBuff">if not 0, pointer to the buffer to fill out</param>
/// <param name="szcbBuff">Size of provided 'pBuff' in bytes</param>
/// <returns>Size of the filled (or needed to fill) buffer in bytes, or 0 if error</returns>
size_t toByteArray(void* pBuff = nullptr, size_t szcbBuff = 0) const
{
size_t szcbRet = 0;
//Determine the size needed
size_t szcbData =
aligned(sizeof(nYearEstablished)) +
aligned_sizeof_str(strName) +
aligned(sizeof(size_t)) + //Count of elements in the 'students' array
aligned_sizeof_str(strNotes);
std::vector<Student>::const_iterator itr = students.begin();
const std::vector<Student>::const_iterator itrEnd = students.end();
for(; itr != itrEnd; itr++)
{
szcbData += itr->toByteArray();
}
//Was the buffer provided?
if(pBuff)
{
//Compare the size provided
if(szcbBuff >= szcbData)
{
//Fill out the buffer
uint8_t* pD = (uint8_t*)pBuff;
uint8_t* pEnd = pD + szcbData;
//Clear provided buffer
memset(pD, 0, szcbData);
copy_aligned(pD, nYearEstablished);
copy_aligned_str(pD, strName);
//Students array
size_t szCntStudents = students.size();
copy_aligned(pD, szCntStudents);
for(itr = students.begin(); itr != itrEnd; itr++)
{
size_t szcb = itr->toByteArray(pD, pEnd - pD);
if(!szcb)
{
//Failed
assert(false);
#ifdef _WIN32
//Microsoft specific code
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
//General case
abort();
#endif
}
pD += szcb;
}
//Add notes
copy_aligned_str(pD, strNotes);
//Sanity check
if(pD - (uint8_t*)pBuff == szcbData)
{
//All done!
szcbRet = szcbData;
}
else
{
//Overflow
assert(false);
#ifdef _WIN32
//Microsoft specific code
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
//General case
abort();
#endif
}
}
else
assert(false);
}
else
{
//Only needs the size
szcbRet = szcbData;
}
return szcbRet;
}
Both functions are very similar. There are several things that I need to point out:
- I'm using my global ALIGN_BY preprocessor definition for the alignment to be used for serialization and de-serialization.
#define ALIGN_BY (sizeof(void*)) //Align by this number of bytes, or comment out to remove alignment
If you comment out that line, the code logic will not use any alignment.
The alignment is not strictly necessary for the Intel and AMD x86 family of CPUs. But it may be required for an ARM architecture.
- Notice that I'm placing most of the repeating serialization logic into templates. This will help to eliminate mistakes, mostly in copy-and-pasting similar chunks of code.
For instance, I'm using the following template to preserve the alignment, defined in the ALIGN_BY macro:
/// <summary>
/// Template that returns an aligned size for 'n'
/// </summary>
/// <param name="n">Size to align</param>
/// <returns>Aligned size</returns>
template<class T>
inline T aligned(T n)
{
#ifdef ALIGN_BY
    //Use alignment
    static_assert(ALIGN_BY > 0, "Alignment must be positive");
    static_assert(std::popcount(ALIGN_BY) == 1, "Alignment must be done by a size the power of 2");
    return n + ((-(intptr_t)n) & (ALIGN_BY - 1));
#else
    //No alignment
    return n;
#endif
}
Another template is used to determine the size needed to store an STL string, considering the alignment:
/// <summary>
/// Calculates size of STL string taking alignment into account
/// </summary>
/// <typeparam name="s">STL string</typeparam>
/// <returns>Aligned size</returns>
template<class T>
inline size_t aligned_sizeof_str(T& s)
{
    /*
        size_t length;
        char[] str;
    */
    return aligned(sizeof(size_t)) + aligned(s.size() * sizeof(STR_CHAR));
}
Or the following template helps to copy a primitive data type into a memory byte array:
/// <summary>
/// Copy primitive type into a memory location
/// </summary>
/// <param name="p">Pointer to the memory location. It will be incremented by the sizeof 's'</param>
/// <param name="s">Primitive variable to copy</param>
template<class T>
inline void copy_aligned(uint8_t*& p, T s)
{
    *(T*)p = s;
    p += aligned(sizeof(s));
}
And, a similar template to copy an STL string:
/// <summary>
/// Copy STL string into a memory location
/// </summary>
/// <param name="p">Pointer to the memory location. It will be incremented by the sizeof string</param>
/// <param name="s">STL string to copy</param>
template<class T>
inline void copy_aligned_str(uint8_t*& p, T& s)
{
    size_t szStr = s.size();
    *(size_t*)p = szStr;
    p += aligned(sizeof(szStr));
    memcpy(p, s.c_str(), szStr * sizeof(STR_CHAR));
    p += aligned(szStr * sizeof(STR_CHAR));
}
The beauty of these templates is that a modern C++ compiler will try to inline them into the function that calls them, making the resulting binary code quite efficient while preserving good readability and safety of the source code.
- At the end of each serialization routine I check, in the conditional statement that I labeled as the "Sanity check", that my moving pointer pD has advanced by exactly the number of bytes that I calculated for the buffer. This if-statement ensures that I don't have a buffer overrun that may be caused by a fault in my internal logic. If that is the case, I crash the process with a call to the Microsoft-specific __fastfail function, or with the more generic abort.
By crashing the process in that case we ensure that the operation does not continue with possible memory corruption, and we let developers know of a serious fault in the serialization logic.
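To make the alignment arithmetic from the templates above easier to follow, here is a small worked sketch. It assumes an alignment of 8 bytes (the value of sizeof(void*) on a 64-bit build) and one-byte characters, since the definition of STR_CHAR is not shown in this post. The align_up and serialized_str_size names are hypothetical; align_up uses the common round-up-and-mask formula, which gives the same results as aligned() for a power-of-2 alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Rounds 'n' up to the next multiple of 'align' (which must be a power of 2).
// Same result as the aligned() template, with the alignment passed as a parameter.
constexpr std::size_t align_up(std::size_t n, std::size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

// Serialized size of a string of one-byte characters:
// an aligned length field followed by the aligned character data
inline std::size_t serialized_str_size(const std::string& s, std::size_t align)
{
    return align_up(sizeof(std::size_t), align) + align_up(s.size(), align);
}

static_assert(align_up(0, 8) == 0, "zero stays zero");
static_assert(align_up(1, 8) == 8, "1 byte is padded to 8");
static_assert(align_up(8, 8) == 8, "already aligned values are unchanged");
static_assert(align_up(9, 8) == 16, "9 bytes are padded to 16");
```

With these assumptions a 3-character name takes 16 bytes in the buffer: 8 for the length field and 8 for the characters plus padding. An empty optional name still takes 8 bytes for its length field.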
Screencast - Serialization
Watch this video demonstration of how I would write my serialization logic in C++:
Video Timecodes
The following time-coded segments will open in a YouTube player:
- 1:56 - What is serialization & de-serialization.
- 6:38 - Starting to code serialization example in C++.
- 7:57 - Coding two sample classes: MyClass and Student.
- 30:53 - Writing code to initialize our test data for MyClass.
- 37:12 - Starting to write serialization code.
- 38:13 - Writing serialization logic for the Student class: toByteArray() function.
- 1:23:59 - Writing serialization logic for MyClass class: toByteArray() function.
- 1:38:18 - Writing code to test our serialization logic.
- 1:46:42 - Trying to optimize efficiency of our alignment code/template: checking optimized Assembly language code with IDA free.
Note that in the last installment of my video demonstration I optimized my final version of the alignment template by analyzing a resulting production Assembly language code after compilation, by replacing:
With:
That is one way to gauge the efficiency of your production code.
De-serialization
To follow my naming pattern, I will code the de-serialization logic in the fromByteArray functions in each of my classes.
A de-serialization, as you might have concluded, is an operation opposite to serialization. It converts the app data from an external format (be it a binary or a text format, such as XML or JSON) into an internal data structure for the app.
Because de-serialization converts data that comes from the outside world into the app's internal structures, it exposes those internals to external input. That is why the de-serialization code is one of the most attacked surfaces of any application, especially if it's web-facing software, such as a web browser, or if it's a kernel component.
Incorrectly written or buggy de-serialization can be the reason why an evil font could pwn your computer via Adobe Reader; or how your PHP server can execute arbitrary attacker-controlled code on your behalf; or why someone can gain code execution via Adobe automation service; or how a threat actor could take over your iPhone by sending you a specially crafted iMessage, and more.
There's even a special category, CWE-502, coined just for that class of vulnerabilities.
Because of the heightened security considerations I usually follow these rules for de-serialization:
- Place your de-serialization logic next to the serialization one to minimize the possibility of mistakes or omissions.
- When parsing external input always check everything for correctness, going step-by-step. Pay extra attention to all details.
- Before reading data, even for primitive types, always check that the remaining buffer is large enough, to prevent reading past its end.
- When receiving sizes of data arrays from an external source, always check them for overflows or underflows.
- To minimize the risk of a denial-of-service (DoS) attack on your app, always check external sizes of buffers for adherence to allowed ranges. Don't let them be too large.
- Make sure to pay attention and distinguish between signed and unsigned variables, especially when performing range checks. Keep in mind that an attacker can crowbar a negative value into an unsigned variable.
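Two of the rules above, checking external sizes for overflows and watching the signedness in range checks, can be sketched as follows. Both helper names are hypothetical and are not part of the sample project.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if 'count' elements of 'szcbElem' bytes each
// fit into 'szcbLeft' remaining bytes. Division is used so that
// 'count * szcbElem' is never computed and therefore cannot wrap around.
constexpr bool fits_in_remaining(std::size_t count, std::size_t szcbElem, std::size_t szcbLeft)
{
    return szcbElem != 0 && count <= szcbLeft / szcbElem;
}

// Hypothetical helper: range check for a count received from outside.
// A "negative" value stored in an unsigned variable shows up as a huge number,
// so a single upper-bound comparison rejects it as well.
constexpr bool count_in_range(std::size_t count, std::size_t maxAllowed)
{
    return count <= maxAllowed;
}
```

As an example of why the multiplication is avoided: with 2-byte elements, a count of SIZE_MAX / 2 + 1 multiplies out to exactly 0 after wrapping, so a naive "count * size <= remaining" check would accept it.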
To demonstrate some of these techniques, let me show you the de-serialization logic for my Student class:
/// <summary>
/// De-serializes byte array into this struct
/// </summary>
/// <param name="pData">Byte array to convert</param>
/// <param name="szcbData">Size of 'pData' in bytes</param>
/// <returns>[1 and up) if success, for amount of bytes used, 0 if error - in this case this struct will be reset</returns>
size_t fromByteArray(const void* pData, size_t szcbData)
{
while(true)
{
//Do we have a pointer to data?
if(!pData)
break;
//Check overall data size provided
if((intptr_t)szcbData <= 0)
break;
const uint8_t* pS = (const uint8_t*)pData;
const uint8_t* pEnd = pS + szcbData;
assert(pEnd > pS);
//Check 'nAge'
if(!read_aligned(pS, pEnd, nAge))
break;
if(nAge != 0)
{
if(nAge < MIN_ALLOWED_AGE ||
nAge > MAX_ALLOWED_AGE)
{
break;
}
}
//Check 'strGivenName'
if(!read_aligned_str(pS, pEnd, strGivenName, MAX_NAME_LEN_1))
break;
if(strGivenName.empty())
break;
//Check 'strSecondName'
if(!read_aligned_str(pS, pEnd, strSecondName, MAX_NAME_LEN_1))
break;
//Check 'strThirdName'
if(!read_aligned_str(pS, pEnd, strThirdName, MAX_NAME_LEN_1))
break;
//Check 'attendance'
if(!read_aligned(pS, pEnd, attendance))
break;
if(attendance < AttendanceType::Unknown ||
attendance >= AttendanceType::MaxCount)
break;
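//Check 'bSuspended'
//Read it as a raw byte first: loading any value other than 0 or 1 into a 'bool' is undefined behavior
static_assert(sizeof(bool) == sizeof(uint8_t), "bool is expected to occupy one byte");
uint8_t nSuspended = 0;
if(!read_aligned(pS, pEnd, nSuspended))
break;
if(nSuspended > 1)
break;
bSuspended = nSuspended != 0;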
//Check 'fPerformanceScore'
if(!read_aligned_double(pS, pEnd, fPerformanceScore))
break;
//Check 'strNotes'
if(!read_aligned_str(pS, pEnd, strNotes, 0))
break;
//Sanity check
if(pS <= pEnd)
{
//Success!
return pS - (const uint8_t*)pData;
}
else
{
//Overflow
assert(false);
#ifdef _WIN32
//Microsoft specific code
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
//General case
abort();
#endif
}
break;
}
//Failure to de-serialize
//Reset this struct
*this = Student();
return 0;
}
Then let's review the same for MyClass:
/// <summary>
/// De-serializes byte array into this struct
/// </summary>
/// <param name="pData">Byte array to convert</param>
/// <param name="szcbData">Size of 'pData' in bytes</param>
/// <returns>[1 and up) if success, for amount of bytes used, 0 if error - in this case this struct will be reset</returns>
size_t fromByteArray(const void* pData, size_t szcbData)
{
while(true)
{
//Do we have a pointer to data?
if(!pData)
break;
//Check overall data size provided
if((intptr_t)szcbData <= 0)
break;
const uint8_t* pS = (const uint8_t*)pData;
const uint8_t* pEnd = pS + szcbData;
assert(pEnd > pS);
//Check 'nYearEstablished'
if(!read_aligned(pS, pEnd, nYearEstablished))
break;
if(nYearEstablished != 0)
{
if(nYearEstablished < MIN_ALLOWED_YEAR ||
nYearEstablished > MAX_ALLOWED_YEAR)
break;
}
//Check 'strName'
if(!read_aligned_str(pS, pEnd, strName, MAX_NAME_LEN_2))
break;
if(strName.empty())
break;
//Check 'students'
size_t szCntStudents;
if(!read_aligned(pS, pEnd, szCntStudents))
break;
if((intptr_t)szCntStudents < 0)
break;
//Get all students
bool bReadStudentsOK = true;
students.clear();
Student st;
for(size_t s = 0; s < szCntStudents; s++)
{
size_t szcb = st.fromByteArray(pS, pEnd - pS);
if(!szcb)
{
//Failed
bReadStudentsOK = false;
break;
}
//Add student to the list
students.push_back(st);
pS += szcb;
}
if(!bReadStudentsOK)
break;
//Check 'strNotes'
if(!read_aligned_str(pS, pEnd, strNotes, 0))
break;
//Sanity check
if(pS <= pEnd)
{
//Success!
return pS - (const uint8_t*)pData;
}
else
{
//Overflow
assert(false);
#ifdef _WIN32
//Microsoft specific code
__fastfail(FAST_FAIL_FATAL_APP_EXIT);
#else
//General case
abort();
#endif
}
break;
}
//Failure to de-serialize
//Reset this struct
*this = MyClass();
return 0;
}
Let me expand on some of the techniques:
- Notice that, as in the serialization logic, I'm using templates for repeated checks. This will not only help with readability of the code, but will also help eliminate copy-and-paste bugs for repeated logic.
For instance, this is how I check and read primitive data types from the external byte array:
/// <summary>
/// Read primitive type from memory, by checking for overruns
/// </summary>
/// <param name="p">Pointer to byte array to read from</param>
/// <param name="pEnd">End of the byte array, exclusive</param>
/// <param name="s">Primitive type to set</param>
/// <returns>true if success, false if failed</returns>
template<class T>
inline bool read_aligned(const uint8_t*& p, const uint8_t* pEnd, T& s)
{
    const size_t szcb = aligned(sizeof(s));
    //Compare sizes, not pointers: 'p + szcb' could point past the end of the buffer
    if(p > pEnd ||
        (size_t)(pEnd - p) < szcb)
    {
        //Overrun
        return false;
    }
    s = *(const T*)p;
    p += szcb;
    return true;
}
Note that I check the available size first and only then read the data.
It is important to repeat this for every step on the way!
- Checking more complex data structures requires additional attention. For instance, this is how I read an STL string from the external memory array in my other template:
/// <summary>
/// Read STL string from memory, by checking for overruns
/// </summary>
/// <param name="p">Pointer to byte array to read from</param>
/// <param name="pEnd">End of the byte array, exclusive</param>
/// <param name="s">STL string to set</param>
/// <param name="szchMaxLen">if not 0, maximum allowed length of 's' in characters</param>
/// <returns>true if success, false if failed</returns>
template<class T>
inline bool read_aligned_str(const uint8_t*& p, const uint8_t* pEnd, T& s, size_t szchMaxLen)
{
    /*
        size_t length;
        char[] str;
    */
    size_t sz;
    if(!read_aligned(p, pEnd, sz))
    {
        return false;
    }
    //Reject oversized strings before touching any character data
    if(szchMaxLen > 0 &&
        sz > szchMaxLen)
    {
        return false;
    }
    //Compare sizes, not pointers: 'p + sz' may wrap around for a hostile 'sz'
    const size_t szcbLeft = (size_t)(pEnd - p);
    if((intptr_t)sz < 0 ||
        sz > szcbLeft / sizeof(STR_CHAR))
    {
        //Overrun
        return false;
    }
    //The padding that follows the characters must fit into the buffer as well
    const size_t szcbStr = aligned(sz * sizeof(STR_CHAR));
    if(szcbStr > szcbLeft)
    {
        return false;
    }
    s.assign((const STR_CHAR*)p, sz);
    p += szcbStr;
    return true;
}
Same logic applies here as well, except that I also need to perform additional checks of the size of the character array that follows. It is paramount to check for negative array sizes.
- Another important step is to clear the class that is being de-serialized if the process fails. This will prevent faulty, or deliberately crafted data from entering the app's memory. I do this at the "Failure to de-serialize" comment in my logic above.
- Lastly, I perform a similar sanity check, as I did for my serialization logic, and deliberately crash the app if it fails.
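The "check first, then read" rule and the "leave nothing behind on failure" rule from the list above can be demonstrated on a tiny self-contained example. This is only a sketch: read_checked and parse_two_values are hypothetical names, the alignment padding is left out for brevity, and memcpy is used instead of a pointer cast so that the source does not have to be aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, unaligned variant of the read_aligned idea:
// verifies that sizeof(T) bytes are available before copying them out.
template<class T>
inline bool read_checked(const uint8_t*& p, const uint8_t* pEnd, T& out)
{
    if(p > pEnd || static_cast<std::size_t>(pEnd - p) < sizeof(T))
        return false;                   // not enough data left: nothing is read
    std::memcpy(&out, p, sizeof(T));    // memcpy is safe for any alignment of 'p'
    p += sizeof(T);
    return true;
}

// Parses two consecutive 32-bit values; returns bytes consumed, or 0 on failure.
// The outputs are written only after both reads succeeded.
inline std::size_t parse_two_values(const void* pData, std::size_t szcbData, int32_t& a, int32_t& b)
{
    if(!pData)
        return 0;
    const uint8_t* pS = static_cast<const uint8_t*>(pData);
    const uint8_t* pEnd = pS + szcbData;
    int32_t v1 = 0, v2 = 0;
    if(!read_checked(pS, pEnd, v1) || !read_checked(pS, pEnd, v2))
        return 0;
    a = v1;
    b = v2;
    return static_cast<std::size_t>(pS - static_cast<const uint8_t*>(pData));
}
```

If the input is even one byte short, the second read fails and the caller's variables keep their previous values. Parsing into temporaries and committing at the end is an alternative to resetting the struct after a failure; both keep half-parsed data out of the app.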
Screencast - De-serialization
Watch this video demonstration of how I would write my de-serialization logic in C++:
Video Timecodes
The following time-coded segments will open in a YouTube player:
- 1:24 - Why de-serialization poses a greater security risk to your app.
- 5:03 - Starting coding de-serialization logic.
- 6:35 - Writing de-serialization logic for the Student class: fromByteArray() function.
- 33:06 - Example of why writing a secure de-serialization code is hard - I put a bug in my code. (I'll correct it in part 3.)
- 48:07 - Another example why it's important to pay attention to what you are writing during an important sequence of code: I made another bug. (I will correct it later.)
- 53:26 - Writing de-serialization logic for MyClass class: fromByteArray() function.
- 1:11:36 - Writing code to test our de-serialization logic.
- 1:15:38 - Writing a fuzzer to brute-force my de-serialization logic.
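The fuzzer from the video is not listed in this post, so here is a minimal sketch of the general idea: take a valid serialized buffer, damage it at random, and feed every variant to the parser. The parse_blob function is a hypothetical parser used only as a target, and the mutation strategy (random truncation plus a few overwritten bytes) is my assumption, not necessarily what the video does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <random>
#include <string>
#include <vector>

// Hypothetical parser under test: [uint32 length][length bytes of text]
inline bool parse_blob(const uint8_t* pData, std::size_t szcbData, std::string& out)
{
    uint32_t len = 0;
    if(!pData || szcbData < sizeof(len))
        return false;
    std::memcpy(&len, pData, sizeof(len));
    if(len > szcbData - sizeof(len))
        return false;
    out.assign(reinterpret_cast<const char*>(pData + sizeof(len)), len);
    return true;
}

// Mutates a valid input 'iterations' times and feeds every variant to the parser.
// Returns how many variants were accepted. A crash or a sanitizer report during
// the run, rather than the return value, is what signals a bug.
inline std::size_t fuzz_parse_blob(const std::vector<uint8_t>& valid, std::size_t iterations, uint32_t seed)
{
    std::mt19937 rng(seed);     // fixed seed keeps any failure reproducible
    std::size_t accepted = 0;
    for(std::size_t i = 0; i < iterations; i++)
    {
        std::vector<uint8_t> data = valid;
        // Truncate to a random length in [0, size]
        data.resize(std::uniform_int_distribution<std::size_t>(0, data.size())(rng));
        // Overwrite up to 4 random bytes
        const int nFlips = std::uniform_int_distribution<int>(0, 4)(rng);
        for(int f = 0; f < nFlips && !data.empty(); f++)
        {
            const std::size_t pos = std::uniform_int_distribution<std::size_t>(0, data.size() - 1)(rng);
            data[pos] = static_cast<uint8_t>(std::uniform_int_distribution<int>(0, 255)(rng));
        }
        std::string out;
        if(parse_blob(data.data(), data.size(), out))
        {
            accepted++;
            // Invariant: accepted text never exceeds the input it came from
            if(out.size() + sizeof(uint32_t) > data.size())
                std::abort();
        }
    }
    return accepted;
}
```

A loop like this is most useful when the build has runtime checks enabled, for instance a debug build with assertions or a memory-error detector, so that an out-of-bounds read is reported instead of passing silently.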
Note that de-serialization logic is so tricky that it is virtually guaranteed that you will make a mistake. To prove my point, I inadvertently recorded myself introducing several bugs in my demonstration. The following correction will explain them and will also fix everything.
Screencast - De-serialization (Correction)
This is a short follow-up with corrections to the previous part:
Video Timecodes
The following time-coded segments will open in a YouTube player:
- 0:28 - Correction for the incorrect std::string size check during de-serialization.
- 3:25 - Adding a check if a double type refers to a valid number during de-serialization.
- 7:32 - Final check if de-serialization logic produced the same structure as the original.
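The read_aligned_double function that the Student de-serialization calls is not listed in this post. A minimal sketch of the kind of check mentioned in the timecodes above might look like the following. The read_double_checked name is hypothetical, the alignment padding is left out, and treating NaN and infinities as invalid is my assumption about what "a valid number" means here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical sketch (the post's own read_aligned_double is not listed):
// reads a double after a bounds check and rejects NaN and infinities.
// Alignment padding is left out to keep the example short.
inline bool read_double_checked(const uint8_t*& p, const uint8_t* pEnd, double& out)
{
    if(p > pEnd || static_cast<std::size_t>(pEnd - p) < sizeof(double))
        return false;               // truncated input
    double v;
    std::memcpy(&v, p, sizeof(v));
    if(!std::isfinite(v))
        return false;               // NaN or +/-infinity is not a valid score
    out = v;
    p += sizeof(v);
    return true;
}
```

On failure neither the output value nor the read pointer is changed, so the caller can simply stop parsing.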
This concludes my demonstration of the serialization and de-serialization coding techniques.
Conclusion
By sharing my approach to coding the binary serialization logic I was not trying to impose my coding style or technique. There are many ways to code that. I shared it, with the explanations, to demonstrate the multitude of possible pitfalls in the process of coding secure serialization and, most importantly, de-serialization logic. I base my approach on years of experience, mistakes and bad examples that I've witnessed as a spectator, as well as a reverse-engineer.
Don't be the author of the next software vulnerability. Heed my advice.
Let's write secure software!