I recently consulted for a client whose program intermittently deadlocked while the process was exiting. The issue was hard to debug and diagnose, but now that I have found a solution (with help from Rbmm), I would like to share it with my readers in the hope that it will save you time if you have to deal with a similar bug.
Deadlock In a Synchronization Lock
When I received one of the crash dumps from the client, I could see in the call stack that the deadlock was happening in the WaitForSingleObject function while it was waiting for a kernel object that was used as a synchronization lock. What made this finding puzzling was that WaitForSingleObject was called from a very small class that employed the RAII technique for acquiring and releasing the lock, so the programming logic seemed solid.
Another peculiar finding was that the deadlocked WaitForSingleObject call was invoked from a hook that the company had installed on one of the system APIs, which, in turn, was invoked from a destructor of one of their global classes.
After digging around in the crash dump, I found that the entire sequence of events that led to the deadlock had started with a call to the ExitProcess function somewhere inside the CRT code for that process. This gave us a good clue.
When a process terminates itself with a call to the ExitProcess function, one of the consequences is that the process enters an intermediate state in which the system terminates all threads in that process, except for the thread that called ExitProcess, before the process actually exits.
This could have dire consequences if the remaining running thread attempts to enter a poorly designed synchronization lock that was held by some other thread that was forcefully terminated before it had a chance to release the lock. In this case, the remaining thread that tries to acquire such a lock will enter an infinite waiting loop because the other thread that was holding the lock was killed and can no longer release it.
To make matters worse, such a deadlock does not happen every time. It occurs only when another thread happens to be holding the lock at the moment it is terminated, which can be quite rare.
So what was happening there?
The problem is that when the logic inside the ExitProcess function terminates all threads in the process except one, any synchronization locks that the terminated threads had acquired remain locked indefinitely.
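The effect of an abandoned lock can be sketched in a few lines. This is only a single-threaded simulation under stated assumptions: HomeMadeLock is a hypothetical stand-in for the client's event-based lock (it uses an atomic flag instead of a kernel event so that it compiles anywhere), and the "terminated" owner is modeled simply by never calling Release. The wait is bounded here so that the sketch returns instead of hanging the way the real process did.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical home-made lock. A single flag plays the role of the event
// that the real code waited on with WaitForSingleObject.
class HomeMadeLock {
public:
    // Tries to take the lock and gives up after `timeout`.
    // A real lock of this kind would usually wait forever.
    bool TryAcquireFor(std::chrono::milliseconds timeout) {
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        while (held_.exchange(true, std::memory_order_acquire)) {
            if (std::chrono::steady_clock::now() >= deadline) {
                return false;  // still held by somebody else
            }
            std::this_thread::yield();
        }
        return true;
    }

    void Release() { held_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> held_{false};
};

// Models the sequence from the crash dump: an owner takes the lock, is
// terminated before it can release it, and the surviving thread then
// tries to enter the same lock.
bool SurvivorCanAcquireAfterOwnerWasKilled() {
    HomeMadeLock lock;

    // The "worker" acquires the lock...
    const bool worker_got_it = lock.TryAcquireFor(std::chrono::milliseconds(10));
    assert(worker_got_it);
    (void)worker_got_it;

    // ...and is terminated here, so Release() is never called.

    // The surviving thread now waits for a release that cannot happen.
    return lock.TryAcquireFor(std::chrono::milliseconds(50));
}
```

With an infinite wait in place of the 50 ms timeout, the second acquisition would never return, which matches what the crash dump showed.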
But how can we resolve it?
Well, if you're coding using a built-in synchronization primitive, such as a critical section, it has a built-in mechanism to address this scenario. The issue happens when developers try to come up with their own synchronization techniques, say, using an event, that does not have a built-in capacity to deal with forcefully terminated threads.
But even for a home-made synchronization technique, there's a solution.
In this case, we can use the RtlDllShutdownInProgress function, which returns TRUE if the process is in the shut-down state (that is, in the intermediate state after the ExitProcess call that I described above).
The shut-down state is a special state for the process when we can know for sure that there's only one thread running in that process, thus making it a 100% single-threaded application. And thus, we do not need any intra-process synchronization for such an application.
The reason we can be absolutely sure that only one thread is running in a process where RtlDllShutdownInProgress returns TRUE is that the kernel will not allow another thread to start in a process that is in the shut-down state. This is a special state for that process, and if we try to create a thread in it, we get the error STATUS_PROCESS_IS_TERMINATING: An attempt was made to access an exiting process.
The solution in this case was to call RtlDllShutdownInProgress and, if it returns TRUE, to skip entering the synchronization lock altogether.
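As a rough illustration of that approach, here is a minimal sketch of an RAII guard that consults the shut-down state before touching the lock. It is not the client's code. HomeMadeLock, ShutdownAwareGuard, IsProcessShuttingDown and the ShutdownQuery parameter are hypothetical names made up for this example, and the lock again uses an atomic flag as a stand-in for an event. On Windows the sketch resolves RtlDllShutdownInProgress from ntdll.dll at run time; on other platforms a stub that always returns false is used only so that the code still compiles.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

#ifdef _WIN32
#include <windows.h>
// RtlDllShutdownInProgress is exported by ntdll.dll. It is looked up at
// run time so that the sketch does not need an ntdll import library.
inline bool IsProcessShuttingDown() {
    using Fn = BOOLEAN(NTAPI*)(void);
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    Fn fn = ntdll ? reinterpret_cast<Fn>(
                        GetProcAddress(ntdll, "RtlDllShutdownInProgress"))
                  : nullptr;
    return fn != nullptr && fn() != FALSE;
}
#else
// Assumption: stand-in for non-Windows builds, never reports shutdown.
inline bool IsProcessShuttingDown() { return false; }
#endif

// Hypothetical home-made lock that waits forever, like the original.
class HomeMadeLock {
public:
    void Acquire() {
        while (held_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void Release() { held_.store(false, std::memory_order_release); }
    bool IsHeld() const { return held_.load(std::memory_order_acquire); }

private:
    std::atomic<bool> held_{false};
};

using ShutdownQuery = bool (*)();

// RAII guard that skips the lock entirely when the process is shutting
// down. The query is a parameter only so the skip path can be exercised
// without actually calling ExitProcess.
class ShutdownAwareGuard {
public:
    explicit ShutdownAwareGuard(HomeMadeLock& lock,
                                ShutdownQuery shutting_down = &IsProcessShuttingDown)
        : lock_(lock), owns_(!shutting_down()) {
        if (owns_) {
            lock_.Acquire();
        }
    }
    ~ShutdownAwareGuard() {
        // Release only what this guard actually acquired.
        if (owns_) {
            lock_.Release();
        }
    }
    ShutdownAwareGuard(const ShutdownAwareGuard&) = delete;
    ShutdownAwareGuard& operator=(const ShutdownAwareGuard&) = delete;

    bool OwnsLock() const { return owns_; }

private:
    HomeMadeLock& lock_;
    bool owns_;
};
```

The guard remembers whether it took the lock so that its destructor never releases a lock it skipped. Note that skipping the lock only avoids the hang: data that a terminated owner was in the middle of updating may still be in an inconsistent state.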
You can observe how it can be done in the
EnterCriticalSection function for my
That POC project was used for another purpose, but that function can illustrate a good example of the use of the
If you are brave enough to write your own synchronization primitives, make sure to understand the workings of the operating system really well. Otherwise you may be faced with some very unexpected bugs and crashes.
But better yet, don't try to reinvent the wheel. Use what the OS already provides for you: critical sections, slim reader/writer locks, or mutexes for IPC locks. I doubt that you can code those better than the authors of the operating system.
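For completeness, here is a minimal sketch of what relying on an OS-provided primitive could look like. OsLock and Counter are hypothetical names for this example. On Windows, OsLock wraps a CRITICAL_SECTION; elsewhere it falls back to std::mutex purely so that the sketch compiles on other platforms. Because OsLock exposes lock() and unlock(), the standard std::lock_guard can serve as the RAII class.

```cpp
#include <cassert>
#include <mutex>

#ifdef _WIN32
#include <windows.h>
// Thin wrapper that makes a CRITICAL_SECTION usable with std::lock_guard.
class OsLock {
public:
    OsLock() { InitializeCriticalSection(&cs_); }
    ~OsLock() { DeleteCriticalSection(&cs_); }
    OsLock(const OsLock&) = delete;
    OsLock& operator=(const OsLock&) = delete;

    void lock() { EnterCriticalSection(&cs_); }
    void unlock() { LeaveCriticalSection(&cs_); }

private:
    CRITICAL_SECTION cs_;
};
#else
// Assumption: non-Windows fallback so the example builds everywhere.
using OsLock = std::mutex;
#endif

// Hypothetical shared object protected by the OS-provided lock.
class Counter {
public:
    int Increment() {
        std::lock_guard<OsLock> guard(lock_);  // released when guard goes out of scope
        return ++value_;
    }

private:
    OsLock lock_;
    int value_ = 0;
};
```

A caller simply uses Counter::Increment from any thread and leaves the locking details to the primitive.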