Getting Started with tinySTM (Ubuntu 9.04)

This post is a quick guide to going from nothing to writing small tinySTM-based applications. For those who don’t know, tinySTM is a library for writing applications that use transactional memory for synchronization in lieu of traditional locks and semaphores. That raises two questions: what is synchronization, and what is transactional memory?

Loosely speaking, synchronization refers to any method that prevents processes or threads from trampling on one another. What do I mean by trampling? One example is a memory consistency error, which is what happens when threads have an inconsistent view of the same data: two threads check the value of the same integer and see different values. This typically happens when one core works from a cached copy of the variable while another thread (running on a different core) goes to RAM to read the value. And so different values are seen! Synchronization prevents problems like these.

Transactional Memory (TM) is a style of synchronization that was inspired heavily by databases. In a database, requests are encapsulated as transactions, and the database ensures integrity by rolling back any changes that were made in a partially completed transaction. This means that failed transactions won’t break your database. With TM, the same is true of memory.

With a little back story out of the way, we’re ready to start:

wget http://tinystm.org/sites/tinystm.org/files/tinySTM/tinySTM-0.9.9.tgz
tar -xvf tinySTM-0.9.9.tgz
cd tinySTM-0.9.9
sudo apt-get install libatomic-ops-dev
export LIBAO_HOME=/usr/include/atomic_ops
make

Running make compiles tinySTM and puts a static library file at ~/tinySTM-0.9.9/lib/libstm.a (assuming you unpacked the archive in your home directory). Anything we write to use tinySTM will need to link to this library file.

Let’s make sure that everything is working by compiling and running the example code that came with tinySTM.

cd test
make
cd bank
# To run these demos with multiple threads we use the "-n" option
./bank -n 3

If everything is working correctly you should get some pretty lengthy output that looks similar to this:

kris@cosmos:~/tinySTM-0.9.9/test/bank$ ./bank -n 3
Nb accounts : 1024
Duration : 10000
Nb threads : 3
Read-all rate : 20
Read threads : 0
Seed : 0
Write-all rate : 0
Write threads : 0
Type sizes : int=4/long=8/ptr=8/word=8
Initializing STM
STM flags : -O3 -DNDEBUG -Wall -Wno-unused-function -Wno-unused-label -fno-strict-aliasing -D_REENTRANT -I/usr/include/atomic_ops/include -I./include -I./src -DTLS -DDESIGN=2 -DCM=0 -DINTERNAL_STATS -DROLLOVER_CLOCK -DCLOCK_IN_CACHE_LINE -UNO_DUPLICATES_IN_RW_SETS -UWAIT_YIELD -UUSE_BLOOM_FILTER -DEPOCH_GC -UCONFLICT_TRACKING -UREAD_LOCKED_DATA -ULOCK_IDX_SWAP -UDEBUG -UDEBUG2
Creating thread 0
Creating thread 1
Creating thread 2
STARTING...
STOPPING...
Thread 0
#transfer : 1969727
#read-all : 492137
#write-all : 0
#aborts : 522377
#lock-r : 167012
#lock-w : 387
#val-r : 354978
#val-w : 0
#val-c : 0
#inv-mem : 0
#realloc : 0
#r-over : 0
#lr-ok : 0
#lr-failed : 0
Max retries : 35784
Thread 1
#transfer : 3517300
#read-all : 879229
#write-all : 0
#aborts : 986623
#lock-r : 288231
#lock-w : 691
#val-r : 697695
#val-w : 6
#val-c : 0
#inv-mem : 0
#realloc : 0
#r-over : 0
#lr-ok : 0
#lr-failed : 0
Max retries : 45082
Thread 2
#transfer : 1947009
#read-all : 486864
#write-all : 0
#aborts : 580381
#lock-r : 228081
#lock-w : 328
#val-r : 351970
#val-w : 2
#val-c : 0
#inv-mem : 0
#realloc : 0
#r-over : 0
#lr-ok : 0
#lr-failed : 0
Max retries : 57503
Bank total : 0 (expected: 0)
Duration : 10000 (ms)
#txs : 9292266 (929226.600000 / s)
#read txs : 1858230 (185823.000000 / s)
#write txs : 0 (0.000000 / s)
#update txs : 7434036 (743403.600000 / s)
#aborts : 2089381 (208938.100000 / s)
#lock-r : 683324 (68332.400000 / s)
#lock-w : 1406 (140.600000 / s)
#val-r : 1404643 (140464.300000 / s)
#val-w : 8 (0.800000 / s)
#val-c : 0 (0.000000 / s)
#inv-mem : 0 (0.000000 / s)
#realloc : 0 (0.000000 / s)
#r-over : 0 (0.000000 / s)
#lr-ok : 0 (0.000000 / s)
#lr-failed : 0 (0.000000 / s)
Max retries : 57503

It’s really no fun to run someone else’s code, so let’s build something simple from the ground up. I’ll be using the Boost Thread library for threading instead of pthreads (which is what the tinySTM examples use).

I’m going to write a very contrived example, where I’ll have a Counter class and a MyRunnable class. The Counter class will be extremely simple. In fact, it will basically just be a wrapper around an integer. The only method of interest it provides is increment(), which increments the integer by a fixed amount each time it is called. The other class, MyRunnable, is basically just the body of a Boost thread; you can think of it as a class that implements Runnable in Java.

The program will start a bunch of threads via Boost, which results in the run() method of each MyRunnable object being executed from a different thread of execution. The MyRunnables will all call increment() on the same Counter object. If everything is done right, each call should be accounted for in the end.

I will synchronize the increment() method by enclosing its body in a transaction. That means that if another thread modifies any of the memory touched in the body of increment(), the transaction will be canceled, rolled back to the original state, and tried again.

Don’t forget to copy all of the tinySTM .h files (stm.h, mod_mem.h, etc) and the library file (libstm.a) into your current working directory. With all of that in mind, here’s the example:

//File: samplestm.cpp
//Author: Kristopher Kalish
#include <cassert>
#include <iostream>
#include <boost/thread.hpp>
#include <atomic_ops.h>
#include "stm.h"

// The following macros are from the tinySTM examples, and they truly
// are useful.
/*
 * Useful macros to work with transactions. Note that, to use nested
 * transactions, one should check the environment returned by
 * stm_get_env() and only call sigsetjmp() if it is not null.
 */
#define RO                              1
#define RW                              0
#define START(id, ro)                   { sigjmp_buf *_e = stm_get_env(); stm_tx_attr_t _a = {id, ro}; sigsetjmp(*_e, 0); stm_start(_e, &_a)
#define LOAD(addr)                      stm_load((stm_word_t *)addr)
#define STORE(addr, value)              stm_store((stm_word_t *)addr, (stm_word_t)value)
#define COMMIT                          stm_commit(); }

using namespace std;

static const int INCREMENT = 5;
static const int NUM_RUNS  = 100000;

class Counter
{
public:
	Counter()
	{
		value = 0;
	}

	/**
	 * Increment the counter by five by looping. A loop was picked to
	 * make calls to increment() take more cpu time.
	 */
	void increment()
	{
		START(0, RW);

		for(int i = 0; i < INCREMENT; i++)
		{
			int tmp = (int) LOAD(&this->value);
			tmp = tmp + 1;

			STORE(&this->value, tmp);
		}

		COMMIT;
	}

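	int getValue()
	{
		return (int) value;
	}

private:
	// stm_load()/stm_store() access a full machine word (8 bytes in the
	// bank run above, where int is only 4), so the counter is stored as
	// a word to keep those accesses inside the object.
	stm_word_t value;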

};

class MyRunnable
{
public:

	MyRunnable(int id, boost::barrier* bar, Counter* count)
	{
		this->id    = id;
		this->bar   = bar;
		this->count = count;
	}

	void run()
	{
		for(int i = 0; i < NUM_RUNS; i++)
		{
			count->increment();
		}

		// all done, wait at the barrier
		bar->wait();
	}

	// The entry point for a thread
	void operator()()
	{
		// We must call stm_init_thread() at the beginning of each
		// thread's line of execution before using the tinySTM library
		stm_init_thread();

		run();

		// Call this at the end of each thread's execution to have
		// tinySTM clean up.
		stm_exit_thread();
	}

private:
	int             id;
	boost::barrier* bar;
	Counter*        count;
	
};

int main()
{
	int            numThreads = 4;
	boost::barrier my_barrier(numThreads);
	Counter        count;

	cout << "Intializing tinySTM." << endl;
	stm_init();

	cout << "Counter is starting with value: " << count.getValue() << endl;
	cout << "Starting " << numThreads << " counting threads..." << endl;

	// Need to make at least one thread
	assert(numThreads >= 1);

	// Create all of the threads in one group so that none of them is
	// still running when we read the result
	boost::thread_group threads;
	for(int i = 0; i < numThreads; i++)
		threads.create_thread(MyRunnable(i, &my_barrier, &count));

	// Wait for every thread to finish, including its stm_exit_thread()
	threads.join_all();

	cout << "Counter is ended with value: " << count.getValue() << endl;
	cout << "Counter should be: " << NUM_RUNS * numThreads * INCREMENT << endl;

	// Let tinySTM clean up after itself
	stm_exit();

	return 0;
}

Then, to compile and run, we need to link against both the tinySTM library and the Boost Thread library:

g++ samplestm.cpp -lboost_thread-mt libstm.a -o sample
./sample

Example output:

Intializing tinySTM.
Counter is starting with value: 0
Starting 4 counting threads...
Counter is ended with value: 2000000
Counter should be: 2000000
