When not to use threads, part 2

The lesson learned from the previous experiment was that threads (at least pthreads on Linux) aren’t lightweight. They take quite some time to create and start executing. If your task is of pretty short duration you’ll spend more time setting up the thread than you will actually doing work. They also take up considerable memory. You’ll want to be considerate of these things when designing something using threads.

For my logging example task there’s another design that might be superior. Instead of starting a thread for each bit of data to be logged I could create a single thread that would log data and sleep. If more data is logged it can be passed to the sleeping thread and written. This will eliminate the setup time for each data write and greatly reduce the amount of system resources used if many writes are done in succession.

Let’s test that. Here’s the code to implement a single thread version:


// queue for log data
std::queue< std::string > LogQueue;
pthread_cond_t myconvar;
BMutex QueueLock;

// thread function
void* QueueThread( void* arg )
{
 while ( true )
 {
 QueueLock.lock();
 pthread_cond_wait( &myconvar, &QueueLock._mutex );
 while ( ! LogQueue.empty() )
 {
 fwrite( LogQueue.front().c_str(), LogQueue.front().size(), 1, fd );
 LogQueue.pop();
 }
 QueueLock.unlock();
 }
}

{
 fd = fopen( "log4.txt", "wt" );

 pthread_setconcurrency( 2 );

 pthread_attr_t attr;
 pthread_attr_init( &attr );
 pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_JOINABLE );
 pthread_attr_setstacksize ( &attr, 65536 );

 pthread_cond_init( &myconvar, NULL );

 pthread_t thread;
 int rc = pthread_create( &thread, &attr, QueueThread, NULL );

 clock_t start = clock();

 for ( int i = 0; i < tests; i++ )
 {
 QueueLock.lock();
 // queue the data
 LogQueue.push( sz );
 if ( LogQueue.size() % 4 == 0 )
 // signal the thread to write it
 pthread_cond_signal( &myconvar );
 QueueLock.unlock();
 }

 clock_t stop = clock();

 double d = ((double)(stop-start))/ CLOCKS_PER_SEC;
 printf( "%8.3f seconds (%d clocks)\n", d, (int)(stop-start) );

 // wait for queue to empty
 bool done;
 do
 {
 QueueLock.lock();
 done = LogQueue.empty();
 QueueLock.unlock();
 BSleep::sleep( 10 );
 } while ( ! done );

 pthread_cancel( thread );
 pthread_cond_destroy( &myconvar );
 pthread_attr_destroy( &attr );
 fclose( fd );
}

I’m using pthreads condition variables to communicate to the sleeping thread when new data is available.

How does this perform?

This design is much better than the previous one.

I changed the test to use 10,000,000 iterations this time because the difference between the two designs was much lower. I needed that many iteration to magnify any differences enough to make them visible.

Manual data write: 0.920 seconds (920000 clocks)
Threaded write: 14.640 seconds (14640000 clocks)

This is many orders of magnitude better than the previous design but the threaded code is 15.9 times slower still! This pretty much proves that none of the threaded designs was in any way superior to just writing the data directly to a file. The logging frameworks in all my programs aren’t going to use threading!

Have a great day

++djs

This entry was posted on Saturday, September 25th, 2010 at 11:26 am and is filed under C++, parallel processing, threads, tutorial. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Non maskable interrupt