Porting MATLAB’s Discrete Gaussian Smoothing to C++

A couple years ago I was placed in charge of porting a project written in MATLAB to C++. The resulting product, which we call LoCA Chop internally, takes a set of meshes and prepares them to be suitable for Localized Components Analysis, a method developed by some fellow students at UC Davis. LoCA Chop accepts a set of input shapes. The assumption is that all of the shapes are different samples of the same structure. In my case I was applying it to a set of hippocampi. Each input sample does not necessarily contain the entire hippocampus; portions of either end may have been omitted because they didn’t show up in the brain scan. LoCA Chop’s job is to find the largest subset of the shape that exists in all input meshes.

I found it to be a very interesting problem, and I had a lot of fun porting it to C++ because it forced me to understand the entire solution. The method works by maximizing an objective function that depends on a handful of MATLAB functions, and this is where the headaches began: the objective function has no derivative, so the optimizer is sensitive to small numerical differences, and I needed my implementation of the objective function to match the MATLAB one as closely as possible. This meant doing a lot of work to replicate MATLAB functions in C++. One of the most basic pieces I had to port was a function that smooths a vector of numbers:

function [smoothedVector] = gaussianSmooth1D(vector)
    % Construct blurring window.
    windowWidth = int16(7);
    halfWidth = windowWidth / 2;
    gaussFilter = gausswin(double(windowWidth));
    gaussFilter = gaussFilter / sum(gaussFilter); % Normalize.

    % Do the blur.
    smoothedVector = conv(vector, gaussFilter);
    smoothedVector = smoothedVector(halfWidth:end-halfWidth);

The corresponding C++ function takes a vector of doubles, smooths it, and returns it. It also turns out that MATLAB rounds the result of integer division (instead of truncating it) when you use the resulting number to do subset selection, so my C++ implementation ended up looking like this:

std::vector< double > smoothVector( std::vector< double > vect, int windowWidth )
{
    std::vector< double > gaussFilter = makeGaussianWindow( windowWidth );
    gaussFilter = normalize( gaussFilter );

    // Note the use of rounding here instead of truncating integer division.
    int halfWidth = (int) floor( (double) windowWidth / 2.0 + 0.5 ); // floor() from <cmath>

    std::vector< double > smoothedVector = convolve( vect, gaussFilter );
    std::vector< double > result(
        smoothedVector.begin() + halfWidth - 1,
        smoothedVector.end() - halfWidth );

    return result;
}

I found the implementation of gausswin to be a little inconsistent, as inside it I needed regular truncating integer division rather than the rounding division used earlier to select the subset of the smoothed vector.

// Replicates gausswin(N, Alpha) in MATLAB.
// See: http://www.mathworks.com/help/toolbox/signal/ref/gausswin.html
std::vector< double > makeGaussianWindow( int size, double alpha = 2.5 ) // 2.5 is gausswin's default
{
    std::vector< double > window( size, 0.0 );
    int halfWidth = size / 2; // integer division (truncating)

    for( int i = 0; i < size; ++i )
    {
        // w(n) = e^( -1/2 * ( alpha * n/(N/2) )^2 )
        // for n such that -N/2 <= n <= N/2
        int n = i - halfWidth;

        window[i] = exp( -0.5 * // exp() from <cmath>
            ( alpha * n / (double) halfWidth ) *
            ( alpha * n / (double) halfWidth ) );
    }

    return window;
}

It was the same story with the truncating division in convolve. I can only conclude that there must be something weird about the subset selection in MATLAB.

/**
 * Output has the same dimension as f.
 * @param f - Some data set
 * @param g - The kernel to convolve with
 */
std::vector< double > convolve( std::vector< double > f, const std::vector< double >& g )
{
    int halfKernelWidth = g.size() / 2; // truncating division

    // Pad f in MATLAB style.
    f.insert( f.begin(), halfKernelWidth, 0.0 );
    f.insert( f.end(), 2 * halfKernelWidth, 0.0 );

    std::vector< double > result( f.size(), 0.0 );

    // Set result[g.size()-1] through result[f.size()-1]
    for( int i = g.size() - 1; i < (int) f.size(); ++i )
        for( int j = i, k = 0; k < (int) g.size(); --j, ++k )
            result[i] += f[j] * g[k];

    // Set result[0] through result[g.size()-2]
    for( int i = 0; i < (int) g.size() - 1; ++i )
        for( int j = i, k = 0; j >= 0; --j, ++k )
            result[i] += f[j] * g[k];

    // More MATLAB-ism: drop the leading entries introduced by the padding
    // (halfKernelWidth is 3 for the 7-wide window used here).
    result.erase( result.begin(), result.begin() + halfKernelWidth );

    return result;
}

The implementation of normalize is trivial, but reproduced here for completeness.

std::vector< double > normalize( std::vector< double > vect )
{
    double sum = 0.0;
    for( unsigned int i = 0; i < vect.size(); ++i )
        sum += vect[i];

    assert( sum != 0 ); // assert() from <cassert>

    std::vector< double > result;
    for( unsigned int i = 0; i < vect.size(); ++i )
        result.push_back( vect[i] / sum );

    return result;
}

That concludes the implementation. Hopefully someone will find this as useful as I did. I spent a fair amount of time ensuring that these methods fell within a pretty strict numerical tolerance (1e-9) of the MATLAB implementation.

j j j

Building VTK, ITK, ITPP, and Boost on Centos 6.5

I recently had to build some software I wrote over a year ago on a Centos 6.5 machine. The vast majority of my Linux experience has been with Ubuntu and Debian, so this was a whole new world for me: a totally different package manager with a slightly different directory layout. Notice that I use some older versions of the libraries; this is because I wanted to ensure 100% compatibility with my old code base. I'm sure the instructions are more or less unchanged for newer versions. Follow up in the comments if you notice any deviations.

Without further ado, let’s get a build environment set up and get these libraries built.

yum groupinstall "Development Tools"

# We are going to need a new version of cmake to build vtk, so remove the old one
yum remove cmake
yum install qt qt4 qt4-designer
wget http://pkgs.repoforge.org/cmake/cmake-2.8.8-1.el6.rfx.x86_64.rpm
yum install cmake-2.8.8-1.el6.rfx.x86_64.rpm
wget http://www.vtk.org/files/release/5.8/vtk-5.8.0.tar.gz
tar xzvf vtk-5.8.0.tar.gz
cd VTK
ccmake .
# Press c to configure
# Set type to "Release"
# Enable qt
# Press c
# Press g

# Compile with one process per core (including hyper-threading cores)
make -j `nproc`
make install
ln -s /usr/local/vtk-5.8 /usr/local/vtk

Building ITK is very similar, except it forces you to build it outside of the source directory.

# Quote the URL so the shell doesn't treat the &s as background operators
wget -O InsightToolkit-4.1.0.tar.gz 'http://downloads.sourceforge.net/project/itk/itk/4.1/InsightToolkit-4.1.0.tar.gz?r=http%3A%2F%2Fwww.itk.org%2FITK%2Fresources%2Flegacy_releases.html&ts=1387418924&use_mirror=iweb'
tar zxvf InsightToolkit-4.1.0.tar.gz
mkdir itk-build
cd itk-build
ccmake ../InsightToolkit-4.1.0
# If you accidentally ran ccmake or cmake in the source folder you have to blow it away and remake it.
# Press c
# Disable examples
# Enable shared libs
# Release
# Press c
# Press g
make -j `nproc`
make install

IT++ is a mathematical signal processing library that I also need.

yum install blas lapack
yum install autoconf automake libtool
wget http://sourceforge.net/projects/itpp/files/itpp/4.2.0/itpp-4.2.tar.gz
tar xvzf itpp-4.2.tar.gz
cd itpp-4.2
./configure --without-fft --with-blas=/usr/lib64/libblas.so.3 --with-lapack=/usr/lib64/liblapack.so.3 CFLAGS=-fPIC CXXFLAGS=-fPIC CPPFLAGS=-fPIC
make -j `nproc`
make install

Of course every reasonable software project written in C++ depends on Boost. In my case, the version 1.41 that they package with Centos 6.5 is not adequate because the Boost Filesystem API changed in newer versions.

# As before, quote the URL so the &s don't confuse the shell
wget -O boost_1_49_0.tar.gz 'http://downloads.sourceforge.net/project/boost/boost/1.49.0/boost_1_49_0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fboost%2Ffiles%2Fboost%2F1.49.0%2F&ts=1387504197&use_mirror=superb-dca2'
tar -zxvf boost_1_49_0.tar.gz
cd boost_1_49_0/
./bootstrap.sh --prefix=/usr/local/boost
./bjam --layout=system install

There you go! Hopefully this post saves you some time, as it took a lot of trial and error for me to get the right paths and dependencies installed.

j j j

Use Webmail as the Default “mailto” Handler in Windows

Note: A guide for the impatient is at the bottom.

I was recently trying to configure Internet Explorer to use Comcast webmail as the default handler for mailto links. As I'm not really a Windows user, I was appalled at the dreary selection of add-ons for IE and the inconvenience of Windows' "Default Programs" manager. From "Default Programs" you would expect to be able to pick any application you want to be the "mailto" handler, but this is not the case.

I found an article describing how to add handlers for any type of URL, and went from there to construct a hack that allows you to use any website as a mailto handler.

It appears that Windows/IE has the following pipeline for handling a URL when it is clicked:

  1. The text of the URL is analyzed to infer the protocol (http, ftp, etc)
  2. The registry is searched to see if a handler exists
  3. The handler is invoked in the way specified by the "open" key in the registry

To jump right in, let's set up our own mailto handler.

Start regedit and navigate to the following key: HKEY_CLASSES_ROOT\mailto\Shell\open\command

This key should have a value “(default)” that specifies a command to run when a mailto URL is clicked. If you have outlook installed, it should look something like this:

"C:\PROGRA~2\MICROS~1\Office14\OUTLOOK.EXE" -c IPM.Note /m "%1"

This is simply specifying the location of an executable and the arguments to pass to it. In this case %1 is referring to the entire URL that was clicked. So for example, if I were to click a link with the URL “mailto:bill@microsoft.com” the %1 would be replaced with the text “mailto:bill@microsoft.com”.

Our hack will be to write a batch file that will take this URL, remove the first seven characters (“mailto:”), and give the email address to our favorite webmail service in our favorite browser.

Supposing that C:\Users\Kris\handle-email.bat is the location of our custom script, changing the “(default)” value to the following would run the script when a mailto URL is clicked.

C:\Users\Kris\handle-email.bat %1

This just leaves making the actual script. Create a file called handle-email.bat and place it somewhere out of the way. Open it in your favorite text editor and paste the following:

set address=%1
set address=%address:~7%
start iexplore "http://sz0085.wc.mail.comcast.net/zimbra/mail?view=compose&to=%address%"

The first line assigns the argument provided to the script to the variable called "address." In our setup, this will be something like "mailto:ie@terriblebrowser.com". The second line removes the first seven characters to get rid of the prefix "mailto:". The last line starts IE so that it goes to the URL provided. In this example, it goes to Comcast's compose message page and populates the "To:" field. Similar "magic" URLs exist for Gmail and Windows Live.


For the impatient:

Edit “(default)” in HKEY_CLASSES_ROOT\mailto\Shell\open\command to contain the following:

C:\path\to\your\script.bat %1

Make the script:

set address=%1
set address=%address:~7%
start iexplore "http://sz0085.wc.mail.comcast.net/zimbra/mail?view=compose&to=%address%"
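If you prefer not to edit the registry by hand, the same change can be captured in a .reg file that you import by double-clicking it. This is a sketch using the same placeholder script path as above (note that backslashes are doubled in .reg syntax, and @ denotes the "(default)" value):

```
Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\mailto\Shell\open\command]
@="C:\\path\\to\\your\\script.bat %1"
```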

j j j

Timing Parallel Algorithms in C++

It seems to have become common knowledge that if you want to time something in C++, you use clock(). Typically you construct something that looks like this:


#include <ctime>
#include <iostream>

using namespace std;

int main()
{
      clock_t startTime, endTime;
      startTime = clock();
      // Do some computationally expensive thing
      endTime = clock();
      cout << "It took " << (endTime - startTime) / (CLOCKS_PER_SEC / 1000) << "ms." << endl;
      return 0;
}

I'm sure most seasoned C++ developers have written something like this at one point or another. Unfortunately, this breaks down when timing a multithreaded algorithm. The clock() function is constructed in such a way that it always returns the amount of CPU time elapsed, not the real time elapsed. Perhaps this is common knowledge to some, but the reference documentation on sites like cplusplus.com simply states: "Returns the number of clock ticks elapsed since the program was launched." This is incredibly ambiguous to me, because I could think of clock ticks as either the 2.6 billion ticks per second my CPU is receiving, or the number of ticks from the built-in clock that have elapsed.

Regardless, that was tangential. To get reasonably accurate actual time that has elapsed, you need to employ gettimeofday(), which populates a struct with the current time in seconds since the epoch (plus a microsecond component).


#include <sys/time.h>
#include <iostream>

using namespace std;

int main()
{
      struct timeval startTime, endTime;
      long totalTime;

      // The second parameter is the timezone, but it's ignored in newer Linux kernels
      gettimeofday(&startTime, NULL);
      // Do some computationally expensive thing
      gettimeofday(&endTime, NULL);

      // We need to combine the two components of timeval into one value.
      totalTime =  (endTime.tv_sec - startTime.tv_sec) * 1000000L;
      totalTime += (endTime.tv_usec - startTime.tv_usec);

      cout << "It took " << (totalTime / 1000L) << "ms." << endl;
      return 0;
}

And there you have it! I’d be interested to see if this method is any more accurate on a real time linux kernel.

j j j

Tracking Memory Leaks in C++

This may be common knowledge for a lot of you, but Valgrind is a dynamic code analysis tool that discovers memory leaks for you. It can tell you exactly where leaked memory was allocated; from there you can use your intuition to decide where the memory should be freed.

The greatest thing is how easy it is to use! I just removed half a dozen memory leaks from my undergraduate thesis project in about 35 minutes with no prior experience. The only "trick" to using Valgrind is ensuring that you compile your project with debugging flags turned on ("-g" for gcc and g++). That will embed line number information in your executable so that Valgrind can generate useful output.

The Valgrind Quick Start page is by far the best introduction to Valgrind.

j j j