Compiling OpenCV with TBB and the Intel compiler

On an Intel system, Intel’s own C/C++ compilers usually give the best performance. Since OpenCV is all about high performance, we’ll see how to build it for maximum speed, which isn’t what you get out of the box.

Required reading: http://opencv.willowgarage.com/wiki/InstallGuide

The TBB library

First of all: the biggest performance gains don’t come from Intel’s compiler but from the TBB library (Threading Building Blocks). If you don’t want to use Intel’s compiler (or can’t, because of licensing issues), you can still use TBB and get a huge difference for a lot of OpenCV’s functions.

For Debian users, TBB is available as a package. To install it:

sudo apt-get -y install libtbb-dev

And then add the WITH_TBB=YES option to cmake (run from a build directory inside the OpenCV source tree, hence the trailing ".."):

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON -D BUILD_EXAMPLES=ON -D WITH_TBB=YES ..

To compile, you can use make -jN (with N being the number of CPU cores you have).
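If you don’t want to look up N by hand, the shell can work it out. This is a minimal sketch, assuming nproc (GNU coreutils) or getconf is available. The JOBS variable name is only for illustration, and the guard around make is my own addition so the snippet is harmless outside a build directory.

```shell
# Use one make job per CPU core; fall back to 1 if the count is unavailable.
JOBS=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Only start the build when run from a configured build directory.
if [ -f Makefile ] && [ -f CMakeCache.txt ]; then
    make -j"$JOBS"
else
    echo "would run: make -j$JOBS"
fi
```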

This should be enough to give you a tremendous performance boost. How much? On a single-core CPU, none. On a quad-core, up to roughly 4x for the functions that OpenCV parallelizes with TBB. Make sure your applications are using this OpenCV install and not your system’s default, if any!
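A quick way to confirm that TBB really made it into the build is to look at the CMake cache and at what the installed library links against. The sketch below assumes you run it from the OpenCV build directory and that the core library ended up in /usr/local/lib/libopencv_core.so (that path follows from the install prefix used above, but check yours). cache_value is a helper name made up for this example.

```shell
# Print the value of one CMake cache entry (lines look like NAME:TYPE=VALUE).
cache_value() {
    sed -n "s/^$2:[A-Z]*=//p" "$1"
}

# Run from the OpenCV build directory; expect YES (or ON) if TBB was enabled.
if [ -f CMakeCache.txt ]; then
    echo "WITH_TBB = $(cache_value CMakeCache.txt WITH_TBB)"
fi

# Assumed install location, matching CMAKE_INSTALL_PREFIX=/usr/local.
LIB=/usr/local/lib/libopencv_core.so
if [ -f "$LIB" ]; then
    # A libtbb line in the ldd output means the library was built against TBB.
    ldd "$LIB" | grep -i tbb || echo "libtbb is not linked into $LIB"
fi
```

If an application still picks up the distribution’s OpenCV, running ldd on the application binary itself shows which copy of the library it resolves to.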

Intel’s compiler suite

Now, to squeeze even more out of your CPU, you can use Intel’s compiler, which should give even better performance. It also installs IPP (Intel Integrated Performance Primitives), a library of functions optimized for Intel CPUs.

You can use Intel CC for free on Linux, for non-commercial use. You can download it from Intel’s website: http://software.intel.com/en-us/intel-compilers

Required reading: http://software.intel.com/en-us/articles/using-the-intel-compilers-for-linux-with-debian

I downloaded the package that gave me the file “cpp_studio_xe_2013_update3.tgz”, which you install like the usual commercial software (gunzip, tar, cd, ./install). After installing it, READ the screen: it will give you some commands you need to run to set up your build environment. In my case, for bash, it was:

source /opt/intel/bin/compilervars.sh intel64

This will set up the needed variables. Now, to compile OpenCV, run cmake with these options:

LINKFLAGS=-static-intel CFLAGS=-static-intel LDFLAGS=-static-intel CPPFLAGS=-static-intel CC=icc CXX=icpc cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_PYTHON_SUPPORT=ON -D BUILD_EXAMPLES=ON -DWITH_TBB=YES -DWITH_IPP=YES ..

You will notice we added WITH_IPP=YES, which enables the Intel Integrated Performance Primitives. We also set the CC and CXX variables to select icc and icpc as the compilers, since otherwise cmake will use the system’s default (GCC). Finally, the -static-intel option links Intel’s runtime libraries statically; otherwise you’ll need Intel’s libraries installed and configured on the target system. Intel also recommends you build them this way.
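To see whether the static linking worked, you can check that the installed library no longer depends on Intel’s shared runtime libraries. This is only a sketch: the library names in the pattern (libimf, libsvml, libintlc, libirng) are the ones I would expect from an Intel compiler install, the library path is an assumption based on the /usr/local prefix, and intel_runtime_deps is a made-up helper name.

```shell
# Filter ldd output (read from stdin) down to Intel compiler runtime libraries.
# The names in the pattern are assumptions; adjust them for your compiler version.
intel_runtime_deps() {
    grep -E 'lib(imf|svml|intlc|irng)\.so' || true
}

# Assumed install location, matching CMAKE_INSTALL_PREFIX=/usr/local.
LIB=/usr/local/lib/libopencv_core.so
if [ -f "$LIB" ]; then
    deps=$(ldd "$LIB" | intel_runtime_deps)
    if [ -z "$deps" ]; then
        echo "no dynamic Intel runtime dependencies in $LIB"
    else
        echo "still linked dynamically against:"
        echo "$deps"
    fi
fi
```

An empty result suggests the library can be copied to a machine that doesn’t have the Intel compiler installed.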

Final thoughts

While the gains of using Intel’s compiler may not be significant, I think it’s a good option to make sure you’re using the full potential of your processor (you paid for it anyway!).

Also, since the biggest number-cruncher in an OpenCV application is the OpenCV library itself, you shouldn’t need Intel CC to build your application. You can compile it with GCC (and it will Just Work and save you a lot of headaches). Since the library is compiled with Intel CC, IPP, and TBB, you’ll already have significant performance gains. But if you think you’ll gain some extra performance by using this compiler for your program, nothing keeps you from trying.
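For the GCC route, something like the following should do. It is a sketch under a few assumptions: app.cpp is a placeholder for your own source file, pkg-config is installed, and OpenCV’s .pc file is visible to it (you may need /usr/local/lib/pkgconfig in PKG_CONFIG_PATH). The module is called "opencv" for 2.x and 3.x installs and "opencv4" for 4.x, so the hypothetical opencv_module helper tries both.

```shell
# Pick the pkg-config module name for the installed OpenCV, if any.
opencv_module() {
    for m in opencv opencv4; do
        if pkg-config --exists "$m" 2>/dev/null; then
            echo "$m"
            return 0
        fi
    done
    return 1
}

# app.cpp is a placeholder for your own source file.
if mod=$(opencv_module) && [ -f app.cpp ]; then
    # Plain g++ build that links against the installed OpenCV.
    g++ -O2 app.cpp -o app $(pkg-config --cflags --libs "$mod")
else
    echo "OpenCV pkg-config module or app.cpp not found; nothing built"
fi
```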