Overview
For this article, I will focus on multithreading, outlining the subset of Win32 and pthread routines needed to create a simple thread control class that can be used on either Windows or MacOS. This uses some of the low-level wrapper routines covered in the previous article, and covers some issues with spawning threads in a way that works the same for both Windows and Mac.
Some multi-platform projects attempt to emulate Win32 routines using
pthreads, or to emulate pthreads using Win32 routines. This ends up being
an awkward proposition, since the two systems behave differently and
expose different types of functionality. A pthread condition variable
requires both a condition and a mutex, while Win32 only exposes a single
handle for an event. Win32 has PulseEvent and WaitForMultipleObjects,
for which there are no equivalents
in pthreads. Some libraries go to great lengths to emulate the full set
of functionality from one API using the other. In my experience, it is
easier and safer to implement a single abstraction that can use either
Win32 or pthreads, rather than dealing with code that attempts to emulate
Win32 functionality using pthreads or vice versa.
This article takes the approach of defining a single interface class, with both a Win32 version and a pthread version that provide the same behavior on both Windows and MacOS. Only the common subset of Win32 and pthread functionality is exposed — functionality specific to only one platform is ignored to avoid trying to emulate non-native functionality.
Grab this zip file: crossplat.zip. It contains a project that can be built with both DevStudio 7 and Xcode 3. There's quite a bit of extra framework in there for something I've just started working on (which you can ignore), along with some platform-specific abstractions I've used on a number of other projects built on both Mac and PC.
The Thread Function
Although the semantics are different between the platforms, the basic approach to creating and using threads is the same for both Win32 and pthreads.
As an entry point, both APIs require a thread function, which is
essentially the thread's equivalent of main(). The
operating system will call into the thread function. When the thread
function returns, the OS will automatically terminate the thread.
Win32 requires the following prototype for the function that is passed into
_beginthreadex:
unsigned __stdcall QzBaseThreadFunc(void *pContext);
For pthreads, the thread function given to pthread_create
needs to have the following prototype:
void* QzBaseThreadFunc(void *pContext);
Although the return types are different, both systems allow for a void*
pointer to be passed into the thread function. The programmer can use this pointer
to pass any desired block of data to the worker thread, such as a struct filled
with parameters. When dealing with C++ code, a common technique is to pass in
a this
pointer, allowing the thread to be contained inside a
class object. However, normal class methods cannot be used for this, since calling
a class method involves passing a hidden this
pointer.
Since only one pointer can be passed to the thread function, a simple approach is
to use two class methods, one of which is static. Since static methods do not
have hidden this
pointers, they can be used for callbacks. The
static method can recast the context pointer to a this
pointer,
then invoke a normal class method that serves as the real thread function,
which allows full access to the class's member variables.
Such a class generally looks like this:
    #include <pthread.h>

    class MyThreadClass
    {
    public:
        // This is the real thread function, which does the real work.
        // StillBusy() and DoSomething() are placeholders for app logic.
        void* RealFunc(void)
        {
            while (StillBusy()) {
                DoSomething();
            }
            return NULL;
        }

        // This is a static function, since normal class methods cannot be
        // used to create threads. The context pointer is a pointer to an
        // instance of MyThreadClass, so all the code needs to do is recast
        // the pointer and pass the function call into the real thread
        // function for this class.
        static void* StaticFunc(void *pContext)
        {
            return reinterpret_cast<MyThreadClass*>(pContext)->RealFunc();
        }

        // Calling this method will spawn a worker thread for this object.
        // We pass in a function pointer to the static function, and the
        // this pointer so StaticFunc can reference the object.
        void StartThread(void)
        {
            pthread_t hThread;
            pthread_create(&hThread, NULL, StaticFunc, this);
        }
    };
Calling the StartThread
method will create a new thread, with
StaticFunc
as the entry point and this
as the
context pointer. StaticFunc
only exists to recast the pointer
back to a MyThreadClass
pointer and invoke RealFunc.
This is a bit of a circuitous way to do things, but the end result is that a
class method is used for the thread function, providing an object oriented
wrapper for the worker thread.
The downside to this specific example is that the code only works with
pthreads. Win32 uses a slightly different prototype for the thread function,
so a different function declaration is needed for the current operating
system. That could be accomplished by using #ifdefs to
provide the correct prototype for the current platform, but that kind of
construct can be messy when multiple thread functions are required, and it
would require cross-compiling to verify every declaration.
A cleaner implementation is to move the platform-specific prototypes behind an abstraction layer. This allows a single prototype to be used for worker threads across all platforms. It also allows extra platform-specific logic to be added to the thread entry and exit code, as well as the addition of debugging support (assigning names to threads, logging when threads are created and terminated, setting status flags, etc.).
The approach I use with threading is to define a custom function pointer:
U32 FuncPointer(void *pContext);
This allows a common prototype to be used for both platforms. Microsoft's
prototype requires the __stdcall
declaration, which Xcode does not like.
Meanwhile, pthreads use a void*
as the return type, which is not
supported by Win32 (although it could be simulated if required).
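To make the idea of a common prototype concrete, here is a minimal sketch, assuming pthreads, in which a small adapter with the native signature forwards to a function of the common type. The names ThreadFunc_t, LaunchInfo, NativeEntry, RunAndJoin, and DoubleIt are hypothetical and are not part of the library, and U32 is assumed to be a 32-bit unsigned typedef.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <pthread.h>

typedef uint32_t U32;   // assumed definition of the article's U32

// The single prototype that application code writes against.
typedef U32 (*ThreadFunc_t)(void *pContext);

// Hypothetical bundle handed to the native entry point.
struct LaunchInfo {
    ThreadFunc_t Func;     // app-provided worker function
    void*        pParams;  // app-provided context pointer
    U32          Result;   // filled in when the worker returns
};

// Adapter with the prototype pthread_create expects. A Win32 build would
// supply an "unsigned __stdcall" function with the same body instead.
static void* NativeEntry(void *pContext)
{
    LaunchInfo *pInfo = static_cast<LaunchInfo*>(pContext);
    pInfo->Result = pInfo->Func(pInfo->pParams);
    return NULL;
}

// Runs func(pParams) on a new thread, waits for it, and returns its result.
U32 RunAndJoin(ThreadFunc_t func, void *pParams)
{
    LaunchInfo info = { func, pParams, 0 };
    pthread_t hThread;
    int err = pthread_create(&hThread, NULL, NativeEntry, &info);
    assert(0 == err);
    (void)err;
    pthread_join(hThread, NULL);
    return info.Result;
}

// Example worker written against the common prototype.
U32 DoubleIt(void *pContext)
{
    return 2 * *static_cast<U32*>(pContext);
}
```

Only NativeEntry knows about the platform's calling convention and return type; DoubleIt would compile unchanged against a Win32 adapter.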
A Common Interface
In my library, I use a single class declaration, as defined in QzSystem.h. This allows the app to use a single construct for interacting with threads, providing a common set of routines to start, stop, signal, and wait on a thread. More importantly, this class behaves the same, regardless of whether the Win32 or pthread implementation is being used.
    class QzThreadControl
    {
    private:
        void* m_pContext;

    public:
        QzThreadControl(void);
        ~QzThreadControl(void);

        bool StartThread(U32 (*func)(void*), void *pParams, const char threadName[]);
        void StopThread(void);
        U32  WaitForClosure(void);
        U32  WaitForEvent(U32 milliseconds = 0xFFFFFFFF);
        void SignalThread(void);
        U32  GetID(void);
        bool IsThreadAlive(void);
        bool TestStopFlag(void);
    };
The class hides its implementation by using a custom structure for each
implementation. This structure is dynamically allocated, and stored as a
void*
pointer. The definition of the structure only exists
in the platform-specific CPP file, and requires that the internal methods
cast this pointer back to the actual type before accessing any of the data
hidden in the struct. Needing to recast the pointer is a bit ugly, but
it keeps the definition of the struct hidden from the rest of the app.
Internal Structs
For Win32, the custom structure is defined as follows:
    struct QzThreadContext_t
    {
        char          ThreadName[32];
        U32           ThreadID;
        HANDLE        hThread;
        volatile bool StopRequested;
        volatile bool StopCompleted;
        void*         pParams;
        U32           (*Func)(void*);
        QzSyncEvent   SignalEvent;
    };
This struct stores all of the information required for the thread. These values are used both from the worker thread itself for information and communication, and from the main thread to signal to the worker thread and to detect when the thread has terminated.
- ThreadName
- For debugging purposes, an ASCII string is assigned to the thread. Under Win32, this is used to name the thread at the kernel level, so the name will appear in system-level debugging messages. For MacOS, this is just used in logging messages. In either case, assigning a unique string to each thread makes it much easier to identify an arbitrary thread when breaking in the debugger.
- ThreadID
- On Windows, each thread is assigned a unique ID, which is displayed by low-level debugging messages. This is really just for informational and debugging purposes. Thread creation will log the ID in the struct for reference when debugging. For MacOS, this serves no purpose, but is emulated to expose the same functionality on both platforms. The ID is useful enough on Windows that it is worth emulating on MacOS. (If you're familiar enough with the kernel internals, you could store a task ID here, but I have not needed to do this on MacOS.)
- hThread
- The thread handle is used to uniquely identify the thread. Whereas the thread ID is just informational, the handle is used in function calls to access the thread. There are a number of Win32 functions that can touch a thread, but the only one this code cares about is for detecting whether the thread has terminated. MacOS also provides a thread handle (although it uses a different type than Win32), which is needed when calling some pthread functions.
- StopRequested
- This flag is used by the waiting logic to test whether the app has requested that the thread terminate.
- StopCompleted
- Once the thread exits, this flag is set so that the main thread is able to test the exit status of the thread. Although the main thread can wait for the worker thread to terminate, this flag allows the main thread to poll the status without executing an infinite wait.
- pParams
- This is a generic void* context pointer that is provided by the app when creating a thread. This one is called pParams since there is already an m_pContext pointer being used.
- Func
- This is the static function pointer that is used for the worker thread's main function. The pParams value will be passed into this function as the void* context pointer.
- SignalEvent
- This event is used to wake up a waiting worker thread. It is used for both internal signalling to wake up a thread that needs to terminate, and to wake up the thread in the general case to process data.
A further point on using events with worker threads: Win32 programmers often
use WaitForMultipleObjects
when communicating with a worker
thread. This allows multiple events to be used: one to indicate that the
thread should terminate, another indicates that data on a queue needs to be
processed, and as many as 62 other events could be used to indicate other
logic states.
The problem with this is twofold. First, any logic being driven by
WaitForMultipleObjects
must be coded very carefully, or it
is possible that some events would be dropped, or for the code to not
detect multiple signals to the same event.
The second — and more important — problem is that pthreads do not have any concept of multiple events. It is possible to jerry-rig a form of multiple wait using pthreads, but the reliability of any such implementation would be questionable. I personally have never seen a reliable implementation. The much safer approach is to simply never use multiple-wait logic. Use only one event to signal the worker thread.
Sometimes you can get away with using a few volatile status variables to indicate
why a thread is being signalled. However, the safest approach is to define a
message queue, with all information to the worker thread going through the queue.
After a message has been appended to the queue, use the
QzThreadControl::SignalThread()
method to wake up the thread. If messages need to be returned, use a separate response
queue that the main thread can poll. Obviously, any such queues would need to
be thoroughly protected by critical sections (mutexes).
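The queue-plus-single-signal pattern can be sketched as follows. This is an illustration only: it uses the C++11 standard library (std::mutex, std::condition_variable, std::thread) in place of the library's critical section and QzSyncEvent wrappers, and the names Message, MessageQueue, and SumOnWorker are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

// Hypothetical message type: a negative value asks the worker to stop.
typedef int Message;

class MessageQueue {
public:
    // Called by the main thread: append a message, then signal the worker.
    void Push(Message msg)
    {
        {
            std::lock_guard<std::mutex> lock(m_Lock);
            m_Queue.push(msg);
        }
        m_Signal.notify_one();
    }

    // Called by the worker: block until at least one message is queued.
    Message WaitAndPop()
    {
        std::unique_lock<std::mutex> lock(m_Lock);
        m_Signal.wait(lock, [this] { return !m_Queue.empty(); });
        Message msg = m_Queue.front();
        m_Queue.pop();
        return msg;
    }

private:
    std::mutex              m_Lock;
    std::condition_variable m_Signal;
    std::queue<Message>     m_Queue;
};

// Sends the messages 1..count plus a stop message to a worker thread
// and returns the sum that the worker computed.
int SumOnWorker(int count)
{
    assert(count >= 0);
    MessageQueue queue;
    int sum = 0;
    std::thread worker([&queue, &sum] {
        for (;;) {
            Message msg = queue.WaitAndPop();
            if (msg < 0) { break; }
            sum += msg;
        }
    });
    for (int i = 1; i <= count; ++i) { queue.Push(i); }
    queue.Push(-1);
    worker.join();   // join makes the worker's writes to sum visible here
    return sum;
}
```

Note that the stop request travels through the same queue as the data, so the worker only ever waits on one thing.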
There are also various non-blocking wait techniques for inter-thread communication, but those are beyond the scope of this article.
Now, for comparison, here is the struct I use for MacOS:
    struct QzThreadContext_t
    {
        char          ThreadName[32];
        U32           ThreadID;
        pthread_t     hThread;
        volatile bool StopRequested;
        volatile bool StopCompleted;
        void*         pParams;
        U32           (*Func)(void*);
        QzSyncEvent   SignalEvent;
    };
This is almost identical to the Win32 struct. The only difference is
hThread. Win32 uses a handle (literally, just a void*)
for thread handles, whereas pthreads define a specific variable type. Depending
on the platform, pthread_t
may be a 32 or 64-bit integer, or it
may be a pointer.
Note: You could define hThread
as a void*
pointer, then
put the struct definition in the header file and reuse it for both Win32 and MacOS,
but this would not necessarily be reusable for pthread implementations on
other platforms. If you've ported code to work on OS X, you're half way to
supporting Linux. Keeping the specifics of the implementation in the CPP
file will future-proof the code for working with other versions of pthreads,
as well as any future non-backwards-compatible changes Apple makes to
pthread_t
in future versions of their OS.
For both Win32 and pthreads, the class constructor is the same: all it needs to do is allocate a struct, initialize all of the variables to zeroes (using the correct semantic symbols), and create the event that will be used to signal the thread.
    QzThreadControl::QzThreadControl(void)
        : m_pContext(NULL)
    {
        QzThreadContext_t *pTC = new QzThreadContext_t;

        pTC->ThreadName[0] = '\0';
        pTC->ThreadID      = 0;
        pTC->hThread       = NULL;
        pTC->StopRequested = false;
        pTC->StopCompleted = false;
        pTC->pParams       = NULL;
        pTC->Func          = NULL;

        pTC->SignalEvent.Create();

        m_pContext = pTC;
    }
Base Thread Function
A separate thread function is defined for each platform. This function,
QzBaseThreadFunc, is the actual function pointer that will
be passed into the OS call to create the thread. This is little more
than a hook function, which passes control to the app-provided function
pointer. By using a hook function, we gain two benefits: a common
function pointer type is exposed to the rest of the app (hiding the
platform-specific prototype), and we can have the hook function do
some set-up before calling into the worker function, then any clean-up
when the function exits.
Here is the Win32 version of the hook function (minus some logging code):
    static unsigned __stdcall QzBaseThreadFunc(void *pContext)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(pContext);

        // Apply the name to the worker thread. This name will be
        // displayed in some Win32 debugging messages, and can be
        // accessed in the debugger to help figure out which thread
        // is currently being viewed.
        QzSetThreadName(pTC->ThreadName);

        // Invoke the static thread function defined in higher-level
        // code. This function call will not return until the thread
        // has finished.
        unsigned result = pTC->Func(pTC->pParams);

        // The last thing we do is set this volatile flag, indicating
        // that the thread has terminated normally.
        pTC->StopCompleted = true;

        // Return the result code from pTC->Func(). This gets passed
        // through to Win32, which will write a termination message
        // to debug output.
        return result;
    }
The MacOS hook function is almost identical:
    static void* QzBaseThreadFunc(void *pContext)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(pContext);

        // Invoke the static thread function defined in higher-level
        // code. This function call will not return until the thread
        // has finished.
        unsigned result = pTC->Func(pTC->pParams);

        // The last thing we do is set this volatile flag, indicating
        // that the thread has terminated normally.
        pTC->StopCompleted = true;

        // Return NULL, since the return value is not important to
        // pthreads. It is possible for the app to retrieve the
        // return pointer, but that logic is specific to pthreads.
        // Since it does not generalize to Win32, we cannot do
        // anything useful with this pointer.
        return NULL;
    }
The Win32 version will set the thread's kernel-level name. Since MacOS does not have any capacity to name threads (or at least none I can find), it just ignores the thread name. The MacOS version also ignores the return result from the thread function, since the integer return type cannot be mapped to a pointer (the return value is only intended for logging purposes).
Starting Threads
Under Win32, a thread is created as follows:
    bool QzThreadControl::StartThread(
        U32 (*func)(void*), void *pParams, const char threadName[])
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        // Do not attempt to start the thread if one is already running.
        if (NULL != pTC->hThread) {
            return false;
        }

        // The thread name needs to be ASCII, so we need to use standard string
        // copy functions instead of UTF-8 routines. And since strncpy() is not
        // safe, we must explicitly make certain the copied string is terminated.
        strncpy(pTC->ThreadName, threadName, ArraySize(pTC->ThreadName));
        pTC->ThreadName[ArraySize(pTC->ThreadName)-1] = '\0';

        pTC->StopRequested = false;
        pTC->StopCompleted = false;
        pTC->pParams       = pParams;
        pTC->Func          = func;

        // Use a temporary variable to hold the thread ID, since strict
        // compiling will complain about pointer types if we attempt to
        // pass in a pointer to any other integer type.
        unsigned int id = 0;

        pTC->hThread = reinterpret_cast<Handle_t>(_beginthreadex(NULL, 0,
                            QzBaseThreadFunc, m_pContext, 0, &id));

        pTC->ThreadID = id;

        // NOTE: _beginthreadex returns NULL if it cannot create the thread
        // (as opposed to _beginthread, which returns the inconsistently
        // used INVALID_HANDLE_VALUE value if there is an error)
        return (NULL != pTC->hThread);
    }
Also, be wary of the INVALID_HANDLE_VALUE
symbol.
_beginthreadex
will return NULL
if there is an error.
Since it takes a lot of effort to force thread creation to fail, any
error handling code you put in place is almost certainly never going to
be exercised by test code, so you may never realize that the wrong value
is being used when testing for errors. Confusion about when to use
INVALID_HANDLE_VALUE
is common among Win32 programmers
(thanks to Microsoft's inconsistent use of it), and a lot of code has been
written over the years with _beginthread, only later to have
it changed over to using _beginthreadex, without careful
checking of the return symbol.
Make certain any error handling code you write is testing for NULL.
In comparison, here is the MacOS version of the code:
    bool QzThreadControl::StartThread(
        U32 (*func)(void*), void *pParams, const char threadName[])
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        // Do not attempt to start the thread if one is already running.
        if (NULL != pTC->hThread) {
            return false;
        }

        // The thread name needs to be ASCII, so we need to use standard string
        // copy functions instead of UTF-8 routines. And since strncpy() is not
        // safe, we must explicitly make certain the copied string is terminated.
        strncpy(pTC->ThreadName, threadName, ArraySize(pTC->ThreadName));
        pTC->ThreadName[ArraySize(pTC->ThreadName)-1] = '\0';

        pTC->StopRequested = false;
        pTC->StopCompleted = false;
        pTC->pParams       = pParams;
        pTC->Func          = func;
        pTC->ThreadID      = QzThreadSafeIncrement(&g_NextThreadID);

        // pthread_create reports failure through its return code, and
        // leaves the handle undefined in that case, so test the code
        // instead of the handle.
        int err = pthread_create(&(pTC->hThread), NULL, QzBaseThreadFunc, m_pContext);
        if (0 != err) {
            pTC->hThread = NULL;
            return false;
        }

        return true;
    }
This is almost identical to the Win32 implementation, with two differences.
The first is that we have to manufacture the ThreadID
value, since
pthreads do not have an equivalent value. We only do this to maintain compatibility
with Win32's logic, so the thread ID can be written to the log and exposed to any
code that needs to fetch it for testing the ID value. (It is possible to
substitute a low-level process ID, if you are familiar enough with the Linux
routines, but this information is not exposed by pthreads.)
The second difference, of course, is the call to pthread_create
to
create the worker thread.
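QzThreadSafeIncrement and g_NextThreadID come from the library's low-level wrappers and are not shown here. As a rough idea of what manufacturing an ID involves, here is a minimal sketch using std::atomic from C++11. The names g_NextId and NextThreadId are hypothetical, and returning the post-increment value is an assumption about how the real routine behaves.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

typedef uint32_t U32;   // assumed definition of the article's U32

// Hypothetical stand-in for g_NextThreadID.
static std::atomic<U32> g_NextId(0);

// Hypothetical stand-in for QzThreadSafeIncrement(). Returns a new ID
// each time it is called, and is safe to call from any thread. IDs start
// at 1, so 0 can be reserved to mean "no thread" until the counter wraps.
U32 NextThreadId()
{
    // fetch_add returns the old value, so add 1 to get the new value.
    return g_NextId.fetch_add(1) + 1;
}
```

An ID produced this way is only meaningful inside the app's own logs; the operating system knows nothing about it.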
Signalling
Signalling a thread is a common operation. This is done any time the thread needs to wake up and process data. A common case is when using a message queue to send commands to the thread: after new messages have been pushed onto the queue, the thread is signalled to make sure it wakes up and processes all of the data in the queue.
There is already an abstraction for events, so the main thread only needs to signal that event after appending a message to the queue (or by whatever other technique is used to pass data to the worker thread for processing).
    void QzThreadControl::SignalThread(void)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        pTC->SignalEvent.Signal();
    }
The worker thread would sit in a loop, waiting for events to process (see the
example code at the end of this article for one possible way to implement the
body of a worker thread that waits for events). It calls
WaitForEvent
to handle the waiting, which will return one of three
results: signalled, timed out, or a stop request. The
stop request indicates that the thread function needs to break out of
its loop, clean up, and return so that the thread can be terminated.
The distinction between signalled and timed out is less important. Although signalled indicates that there is some kind of event to process, this is not a guaranteed state. Due to concurrency, it is possible that the "new" work to process was already processed on the previous iteration of the loop, before the event was signalled. Or the thread's wait may time out just before the main thread signals the thread to wake up, so new data is available when a time-out occurs. These conditions may not be likely, but they will occur. It might happen once every few seconds or only once a week, depending on the vagaries of timing in your app. As such, the simplest approach is to treat signalled and timed out as the same state: always process some work when the thread wakes up, regardless of the reason (unless a termination request has been received).
Since abstractions hide the platform-specific details, WaitForEvent
is implemented the same on both platforms.
    U32 QzThreadControl::WaitForEvent(U32 milliseconds)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        // Test the termination flag before waiting.
        if (pTC->StopRequested) {
            return QzSyncWait_StopRequest;
        }

        U32 result = pTC->SignalEvent.Wait(milliseconds);

        // Test the termination flag once more after the wait completes.
        // We have no idea how long the wait required, so the flag may have
        // been set in the interim. This requires that the flag be volatile
        // to avoid memory caching problems.
        if (pTC->StopRequested) {
            return QzSyncWait_StopRequest;
        }

        return result;
    }
All the code really needs to do is check whether the termination flag is set. If not, it needs to wait for the sync event to become signalled.
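The QzSyncEvent class itself belongs to the low-level wrappers and is not reproduced in this article. For readers wondering how a single-handle event with a timeout can be layered on a pthread condition variable and mutex, here is a minimal sketch. SketchEvent is a hypothetical class, not the library's implementation, and the auto-reset behavior (a successful wait consumes the signal) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <pthread.h>
#include <time.h>

typedef uint32_t U32;   // assumed definition of the article's U32

// Hypothetical auto-reset event; not the library's QzSyncEvent.
class SketchEvent {
public:
    SketchEvent() : m_Signalled(false)
    {
        pthread_mutex_init(&m_Mutex, NULL);
        pthread_cond_init(&m_Cond, NULL);
    }
    ~SketchEvent()
    {
        pthread_cond_destroy(&m_Cond);
        pthread_mutex_destroy(&m_Mutex);
    }

    void Signal()
    {
        pthread_mutex_lock(&m_Mutex);
        m_Signalled = true;
        pthread_cond_signal(&m_Cond);
        pthread_mutex_unlock(&m_Mutex);
    }

    // Returns true if signalled, false if the timeout expired first.
    bool Wait(U32 milliseconds)
    {
        // pthread_cond_timedwait takes an absolute deadline.
        timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec  += milliseconds / 1000;
        deadline.tv_nsec += (milliseconds % 1000) * 1000000L;
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_sec  += 1;
            deadline.tv_nsec -= 1000000000L;
        }

        pthread_mutex_lock(&m_Mutex);
        int err = 0;
        // Loop to guard against spurious wake-ups; any non-zero return
        // code (normally ETIMEDOUT) ends the wait.
        while (!m_Signalled && (0 == err)) {
            err = pthread_cond_timedwait(&m_Cond, &m_Mutex, &deadline);
        }
        bool wasSignalled = m_Signalled;
        m_Signalled = false;   // auto-reset: consume the signal
        pthread_mutex_unlock(&m_Mutex);
        return wasSignalled;
    }

private:
    pthread_mutex_t m_Mutex;
    pthread_cond_t  m_Cond;
    bool            m_Signalled;
};
```

The boolean is what turns the condition variable into an event: a signal sent while nobody is waiting is remembered until the next call to Wait.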
Stopping Threads
Stopping a thread works the same as signalling the thread. The difference is that the termination flag needs to be set before signalling the sync event.
    void QzThreadControl::StopThread(void)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        // First, set this volatile flag. WaitForEvent() will test this
        // flag to determine whether the thread should terminate.
        pTC->StopRequested = true;

        // Then we signal this event. This will wake up the worker thread
        // so it can check the status of the StopRequested flag.
        pTC->SignalEvent.Signal();
    }
By setting the termination flag, WaitForEvent
will detect the
termination request and return the stop request enum value.
The final step in terminating a thread is to wait for the thread to
enter a fully terminated state. Win32 and pthreads both provide functions
that can be used to wait for the thread to stop. The difference between
them is that pthread_join
will never return unless the thread
terminates. However, Win32 exposes thread termination as another event
that can be tested by WaitForSingleObject. Since it is
possible to provide a short timeout for WaitForSingleObject,
the Win32 API allows an app to periodically poll whether a thread has
terminated, then go do something else if the thread is still running.
Since pthread_join
will wait forever, a generic thread
wrapper class should adopt the "wait forever" approach even when using
Win32 threads. This is a good approach, since many programmers (myself
included) have used this timeout period to assume that a thread is
hung, and then proceed with shutting down the app. The danger is that a
still-running thread can wake up during the app shut down sequence and
attempt to access resources that are being deleted, causing the app to
crash while shutting down. Some users may not notice or care about
this condition, but any good SQA group will detect this and write it up
in the bug database.
The problem is that if the app takes a long time to shut down, you may feel inclined to use a short wait so the app shuts down faster when there is a very high CPU load. This is a case where SQA or management may try to browbeat you into reducing the wait so it times out after a short period of time, allowing the app to shut down faster when being subjected to stress tests. This only results in more bug reports from SQA when the app periodically crashes while shutting down, and you have to choose between "shutdown is slow" versus "sometimes crashes when shutting down". Been there, done that, resolved the bug reports.
There is no good answer to a worker thread that deadlocks or crashes. Waiting forever for the thread to terminate can cause the whole app to deadlock. Abandoning the wait so the app can terminate may give a thread enough time to wake up and access deleted resources. Neither choice is good, and either way, the app must shut down or risk letting things get worse. The only real answer is to never let worker threads crash or deadlock — in other words, the answer to that problem is not something this article can address.
The implementation for WaitForClosure
is almost identical
for Win32 and MacOS.
The difference is that once WaitForSingleObject
has returned,
we need to call CloseHandle
to release the handle, otherwise
the handle will not be freed, causing a minor resource leak that could
result in the app running out of handle values if an extremely large number
of threads are created and destroyed over the lifetime of the app.
    U32 QzThreadControl::WaitForClosure(void)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        U32 result = QzSyncWait_Signalled;

        // WaitForClosure() may be called more than once to assure
        // that a thread is really stopped, as part of fall-back
        // error handling when shutting down. So this pointer is
        // NULL after the thread has been stopped.
        //
        if (NULL != pTC->hThread) {
            U32 flag = WaitForSingleObject(pTC->hThread, INFINITE);

            // Once the thread is done executing, we need to close
            // the handle to release the associated resources, and
            // so the rest of the logic in this class can detect
            // that the thread has finished.
            CloseHandle(pTC->hThread);
            pTC->hThread = NULL;

            if (WAIT_OBJECT_0 == flag) {
                result = QzSyncWait_Signalled;
            }
            else if (WAIT_TIMEOUT == flag) {
                result = QzSyncWait_Timeout;
            }
            else {
                result = QzSyncWait_Error;
            }
        }

        return result;
    }
The pthread version of WaitForClosure
is nearly the same,
except that it uses pthread_join
to wait for the thread
to terminate. Since pthread_join
is inherently an infinite
wait, we don't need to worry about timing out.
    U32 QzThreadControl::WaitForClosure(void)
    {
        QzThreadContext_t *pTC = reinterpret_cast<QzThreadContext_t*>(m_pContext);

        U32 result = QzSyncWait_Signalled;

        // WaitForClosure() may be called more than once to assure
        // that a thread is really stopped, as part of fall-back
        // error handling when shutting down. So this pointer is
        // NULL after the thread has been stopped.
        //
        if (NULL != pTC->hThread) {
            // This gets assigned the void* pointer that is returned by
            // the function pointer that was passed to StartThread.
            void *pReturnedVoid = NULL;

            // Wait for the thread to terminate. The return code is kept
            // in its own variable so it does not hide the outer result.
            S32 err = pthread_join(pTC->hThread, &pReturnedVoid);

            pTC->hThread = NULL;

            if (0 == err) {
                result = QzSyncWait_Signalled;
            }
            else {
                result = QzSyncWait_Error;
            }
        }

        return result;
    }
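Because pthread_join cannot time out, the StopCompleted flag is the only way for the main thread to check on a worker without blocking. The following minimal sketch shows that polling pattern with plain pthreads. The names g_Completed, Worker, and PollThenJoin are hypothetical, and std::atomic<bool> is used here instead of a volatile bool because standard C++ only guarantees cross-thread visibility for atomics; that substitution is mine and not the library's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <unistd.h>

// Hypothetical stand-in for the StopCompleted flag.
static std::atomic<bool> g_Completed(false);

static void* Worker(void*)
{
    usleep(20 * 1000);          // pretend to work for 20 ms
    g_Completed.store(true);    // the last thing the thread does
    return NULL;
}

// Polls the flag between slices of other work, then joins.
// Returns the number of polls that found the worker still running.
int PollThenJoin()
{
    g_Completed.store(false);

    pthread_t hThread;
    int err = pthread_create(&hThread, NULL, Worker, NULL);
    assert(0 == err);
    (void)err;

    int busyPolls = 0;
    while (!g_Completed.load()) {
        ++busyPolls;
        usleep(1000);           // the main thread could do other work here
    }

    // The flag is set just before the thread function returns, so this
    // join is expected to complete almost immediately.
    pthread_join(hThread, NULL);
    return busyPolls;
}
```

The join is still unconditional. Polling only decides when to call it; it never replaces it, so the thread is always fully terminated before its resources are released.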
Example Thread
An example worker thread is provided in the QzTest project (consult the TestThread.cpp source file for more comments). This example is simple, but demonstrates how all of the above code is used by a worker thread. All the thread does is wake up every 250 milliseconds, increment a counter, then go back to sleep. Visually, you will see that the thread is working because there is a little spinning character at the top of the screen.
    class TestThread
    {
    private:
        volatile S32    m_Counter;
        QzThreadControl m_Thread;

    public:
        TestThread(void)
            : m_Counter(0)
        {
        }

        ~TestThread(void)
        {
            // Always make certain that the worker thread has been terminated.
            DestroyWorkerThread();
        }

        void CreateWorkerThread(void)
        {
            m_Thread.StartThread(StaticThreadFunc, this, "ThreadFunc");
        }

        void DestroyWorkerThread(void)
        {
            if (false == m_Thread.IsThreadAlive()) {
                return;
            }

            // Request the thread to stop.
            m_Thread.StopThread();

            // Now wait for the thread to wake up, process the
            // termination request, and exit.
            if (QzSyncWait_Signalled == m_Thread.WaitForClosure()) {
                LogMessage("ThreadFunc is dead");
            }
            else {
                LogErrorMessage("ThreadFunc did not terminate");
            }
        }

        static U32 StaticThreadFunc(void *pContext)
        {
            TestThread *p = reinterpret_cast<TestThread*>(pContext);
            return p->ThreadFunc();
        }

        U32 ThreadFunc(void)
        {
            for (;;) {
                // Set the time-out duration so the thread will wake up
                // and rotate through all four states once every second.
                U32 result = m_Thread.WaitForEvent(250);

                if (QzSyncWait_Error == result) {
                    LogErrorMessage("ThreadFunc wait failed");
                    break;
                }

                if (QzSyncWait_StopRequest == result) {
                    LogMessage("ThreadFunc detected termination request");
                    break;
                }

                // Cycle through four possible counter values.
                m_Counter = (m_Counter + 1) % 4;
            }

            return 0;
        }

        S32 GetCounter(void)
        {
            return m_Counter;
        }
    };
Final Warnings
Never call _endthreadex. It will kill the thread without allowing
the stack to unwind. This prevents the destructors for stack objects from being
called, which can result in memory and resource leaks. Unfortunately, some of
the MSDN documentation shows this function being used to terminate a thread,
which has resulted in many programs getting written that make this mistake.
(Of course, if the thread function does not have any local objects, then it is
possible for this to be a safe way to terminate a thread, but as a rule, avoid
this function. If you are writing C++ code, local objects are almost
guaranteed to be added to the thread function at some point — even if
avoided in the initial version of the code, someone maintaining the code can
easily add them without realizing the problems that will result.)
Do not use _beginthread
to create a thread. First, _beginthread
will close the thread handle when the thread exits, which prevents the app from
detecting when the thread has terminated via WaitForSingleObject.
More subtly, _beginthread
may not return a valid handle to begin with,
if the worker thread can terminate fast enough. The odds of this second case happening
are very small, but possible — the worker thread needs to start, run, and exit
before the call to _beginthread
returns.
Never use the Win32 CreateThread
function. This suffers from potential
memory leaks due to standard library functions. Certain functions (such as strtok)
require thread local storage to store state information between calls. This TLS
will only be allocated the first time one of these standard functions is called,
but it will not be freed when the thread terminates. The size of the leak is
small, but will add up over the lifetime of the app if a very large number of
threads are created and destroyed. This problem does not exist with
_beginthreadex; use that function instead of CreateThread.