Porting MATLAB’s Discrete Gaussian Smoothing to C++

A couple years ago I was placed in charge of porting a project written in MATLAB to C++. The resulting product, which we call LoCA Chop internally, takes a set of meshes and prepares them to be suitable for Localized Components Analysis, a method developed by some fellow students at UC Davis. LoCA Chop accepts a set of input shapes. The assumption is that all of the shapes are different samples of the same structure. In my case I was applying it to a set of hippocampi. Each input sample does not necessarily contain the entire hippocampus; portions of either end may have been omitted because they didn’t show up in the brain scan. LoCA Chop’s job is to find the largest subset of the shape that exists in all input meshes.

I found it to be a very interesting problem, and had a lot of fun porting it to C++ because it forced me to understand the entire solution. It worked by maximizing an objective function which depended on a handful of MATLAB functions. This is where the headaches began. The method worked by optimizing a function which had no derivative, so I needed my implementation of the objective function to be as numerically similar as possible. This meant doing a lot of work to replicate MATLAB functions in C++. One of the most basic parts I had to port was a function which smoothed a vector of numbers:

function [smoothedVector] = gaussianSmooth1D(vector)
    % Construct blurring window.
    windowWidth = int16(7);
    halfWidth = windowWidth / 2;
    gaussFilter = gausswin(double(windowWidth));
    gaussFilter = gaussFilter / sum(gaussFilter); % Normalize.

    % Do the blur.
    smoothedVector = conv(vector, gaussFilter);
    smoothedVector = smoothedVector(halfWidth:end-halfWidth);

The corresponding C++ function takes a vector of doubles, smooths it, and returns it. It also turns out that MATLAB rounds the result of integer division (instead of truncating it) when you use the resulting number to do subset selection, so my C++ implementation ended up looking like this:

std::vector< double > smoothVector( std::vector vect, int windowWidth )
    std::vector gaussFilter;
    std::vector result;
    std::vector smoothedVector;

    gaussFilter = makeGaussianWindow( windowWidth );
    gaussFilter = normalize( gaussFilter );

    // Note the use of rounding here instead of truncating integer divsion
    int halfWidth = floor((double)windowWidth / 2.0 + 0.5);

    smoothedVector = convolve( vect, gaussFilter );
    result = std::vector(
        smoothedVector.begin() + halfWidth-1,
        smoothedVector.end() - halfWidth

    return result;

I found the implementation of gausswin to be a little contradictory as I needed to use regular truncating integer division inside instead of the rounding division from earlier that is used to select a subset of the smoothed vector.

// Replicates gausswin(N, Alpha) in MATLAB.
// See: http://www.mathworks.com/help/toolbox/signal/ref/gausswin.html
std::vector< double > makeGaussianWindow( int size, double alpha )
    std::vector< double > window( size, 0.0 );
    int halfWidth = size/2; // integer division (truncating)

    for( int i = 0; i < size; ++i )
        // w(n) = e^( -1/2 * ( alpha * n/(n/2) )^2 )
        // for n such that -N/2 <= n <= N/2

        int n = i - halfWidth;

        window[i] = pow( E_CONST, -0.5 * (
            ( alpha * n / ( (double) halfWidth ) ) *
            ( alpha * n / ( (double) halfWidth ) )

    return window;

It was the same story with the truncating division in convolve. I can only conclude that there must be something weird about the subset selection in MATLAB.

 * Output has same dimension has f
 * @param f - Some data set
 * @param g - The kernel to convolve with
std::vector< double > convolve( std::vector f, const std::vector& g )
    int halfKernelWidth = g.size() / 2; // truncating division

    // Pad f in matlab style
    f.insert( f.begin(), halfKernelWidth, 0.0 );
    f.insert( f.end(), 2*halfKernelWidth, 0.0 );

    //std::vector< double > result( f.size()+2*halfKernelWidth, 0.0 );
    std::vector< double > result( f.size(), 0.0 );

    // Set result[g.size()-1] to result[f.size()-1]
    for( int i = g.size() - 1; i < f.size(); ++ i )
        for( int j = i, k = 0; k < g.size(); --j, ++k )
            result[i] += f[j] * g[k];

    // Set result[0] through result[g.size()-2]
    for( int i = 0; i < g.size() - 1; ++i )
        for( int j = i, k = 0; j >= 0; --j, ++k )
            result[i] += f[j] * g[k];

    // more matlab-ism
    result.erase( result.begin(), result.begin()+3 );

    return result;

The implementation of normalize is trivial, but reproduced here for completeness.

std::vector< double > normalize( std::vector vect )
    double sum;
    std::vector< double > result;

    sum = 0.0;
    for( unsigned int i = 0; i < vect.size(); ++i )
        sum += vect[i];

    assert( sum != 0 );

    for( unsigned int i = 0; i < vect.size(); ++i )
        result.push_back( vect[i] / sum );

    return result;

That concludes the implementation. Hopefully someone will find this as useful as I did. I spent a fair amount of time ensuring that these methods fell within a pretty strict numerical tolerance (1e-9) of the MATLAB implementation.

j j j

Mutual Information in R

In my endeavors as a graduate student in image processing I have found R to be an incredibly useful tool for testing hypotheses and rapid prototyping. Ultimately everything needs to be implemented in a real compiled language like C++ to make a useful tool, but it has saved me a lot time from implementing things that are dumb. I am going to share with you my experience of using R to do feature selection.

I am trying to predict the position on a surface using shape descriptors. I have a training set of about 50 point clouds representing the surface of the corpus callosum. For each point in the training set, I compute geometric moments. These moments boil down all the information in a spherical region around the point down to a single number. They are more sophisticated than simple descriptors like curvature and should capture the relationship between position on the surface and the shape of the corpus callosum.

The problem with using geometric moments is that are infinitely many of them to choose from and many of them are redundant or not very useful. Therefore I want a way to try a lot of them and quickly rule out the useless ones. I will be using mutual information (MI) for that. MI measures the dependence of two random variables on one another using entropy. An MI of 0 indicates no dependence (and therefore no relationship). Higher values indicate a stronger relationship.

For each point, I write its medial axis coordinates and the shape descriptors to a CSV file in the following format:

AxialDistance, Angle, d0, d1, d2, d3, d4, d5, d6, d7
0.00388039, 1.66916, 0.137261, 1, 2.4811e-07, 0.0446063, 0.0102502, 1, -1.07544e-06, 0.071294, 0.00606162
0.00466314, 1.03462, 0.288215, 1, -1.34452e-06, 0.0483304, 0.0120159, 1, -2.80076e-06, 0.072637, 0.00579168
~378,000 more rows like this

I’m trying to figure out which shape descriptors will be the best predictor for the medial axis coordinates. One way to eyeball this is simply by making a simple x-y plot where one axis has the descriptor, and the other has the variable you’re trying to predict. For example:

data <- read.csv("fs_lowres_2mm_o4_i4.csv")
plot(data$d5, data$AxialDistance)

gets a plot similar to the following:
This particular descriptor seems to have a relationship (although nonlinear) with axial distance. This seems like a pretty good descriptor in comparison to something like:


In my case, I am trying to pick a set of useful feature descriptors in a relatively automatic way, so looking at plots is far too time consuming. Instead I can just calculate the mutual information for each descriptor and only look at the ones significantly above 0.

#install.packages(entropy) if you don't have the entropy package
disc=discretize2d(data$AxialDistance, data$d0, numBins1=16, numBins2=16)

So here we see what we expect, the descriptor that yields a plot that looks nearly uniformly random has an MI of 0.043 while the plot with a very patterned relationship has an MI of 0.905!

I hope this helps!

j j j