In the same week, the shell team was asked to investigate two failures.
The first one was a deadlock in Explorer. The participating threads look like this:
- Thread 1 called
FreeLibrary
on a shell extension as part of normalCoFreeUnusedLibraries
processing. That DLL calledOleUninitialize
from itsDllMain
function. This thread blocked because the COM lock was held by thread 2. - Thread 2 called
CoCreateInstance
, and COM tried to load the DLL which handles the object, but the thread blocked because the loader lock was held by thread 1.
The shell extension caused this problem because it ignored the rule
against calling shell and COM functions from the
DllMain
entry point, as specifically called out
in the
DllMain
documentation as examples of functions that
should not be called.
The authors of this shell extension may never have caught this problem in their internal testing (or if they did they didn't understand what it meant) because hitting this deadlock requires that a race window be hit: The shell extension DLL needs to be unloaded on one thread at the exact same moment another thread is inside the COM global lock trying to load another DLL.
Meanwhile, another failure was traced back to a DLL calling
CoInitialize
from their
DllMain
.
This extra COM initialization count means that when the thread
called
CoUninitialize
thinking that it was uninitializing COM,
it actually merely decremented the count to 1.
The code then proceeded to do things that are not allowed
in a single-threaded apartment, believing that it had already
torn down the apartment.
But the secret CoInitialize
performed by the
shell extension violated that assumption.
Result: A thread that stopped respondingt to messages.
The authors of both of these shell extensions seemed be calling
CoUninitialize
/OleUninitialize
in order to cancel out a
CoInitialize
/OleInitialize
which they performed in their
DLL_
.
This is fundamentally unsound not only because of the general
rule of not calling COM functions inside DllMain
but also because
OLE initialization is a per-thread state,
whereas
the thread that gets the
DLL_
notification
is not necessarily the one that receives the
DLL_
notification.
It so happens that in the second case, the DLL in question was a shell
copy hook, and the hang was occuring not in Explorer but in an
application which was using SHFileOperation
to delete some files.
We could at least advise the application authors to pass the
FOFX_
flag to
IFileOperation::
to prevent copy hooks from being loaded.
Previous articles in this series: Part 1, Part 2.