Disclaimer: you know the drill. I write this with the best of intentions, but that’s it. There are no warranties whatsoever.

Hola!

I have just set up a new iMac and, of course, I installed my software from scratch, in case there were newer versions (I could have copied it from another Mac, but what the heck).

So, I searched for cd-hit, and found it here:

https://github.com/weizhongli/cdhit/releases

I downloaded version 4.6.4, which is the newest as of today. That’s a tar file with the source code. So here’s how you would normally proceed.

(I’m assuming that you have Xcode, which is at version 7.0 as of yesterday, along with its command line tools; type “xcode-select --install” to get the command line tools installed. I’m also assuming that you know how to use the terminal.)
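If you are not sure whether those assumptions hold on your machine, a quick check along these lines may help. This is only a sketch: the function name missing_tools is mine, and the list of tools (make, g++, tar) is simply what the rest of this post ends up using.

```shell
# Print the name of every tool given as an argument that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools make g++ tar)
if [ -n "$missing" ]; then
    # On a Mac, the command line tools normally provide all three.
    echo "Missing: $missing (on a Mac, try: xcode-select --install)"
else
    echo "make, g++ and tar are all available"
fi
```

If anything shows up as missing, install the command line tools before going any further.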

% mkdir BUILD
% cd BUILD
% tar zxvf ../Downloads/cd-hit-v4.6.4-2015-0603.tar.gz | tail -5
cd-hit-v4.6.4-2015-0603/psi-cd-hit/psi-cd-hit-local.pl
cd-hit-v4.6.4-2015-0603/psi-cd-hit/psi-cd-hit.pl
cd-hit-v4.6.4-2015-0603/psi-cd-hit/qsub-template
cd-hit-v4.6.4-2015-0603/psi-cd-hit/README.psi-cd-hit
cd-hit-v4.6.4-2015-0603/README
% 

Remember that I use “tail -5” only so that this thing fits into the blog; you don’t need it. It shows just the last five lines of the output of the tar command.
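If you would rather know what an archive is going to unpack before you unpack it, something like the following should do. It is a sketch, not part of what I actually ran: the function name archive_top_dirs is made up, and the archive path is the one from my Downloads folder above.

```shell
# List the distinct top-level entries of a gzipped tar archive without extracting it.
archive_top_dirs() {
    tar tzf "$1" | awk -F/ '!seen[$1]++ { print $1 }'
}

archive=../Downloads/cd-hit-v4.6.4-2015-0603.tar.gz
if [ -f "$archive" ]; then
    archive_top_dirs "$archive"
fi
```

For a well-behaved tarball this prints a single directory name, which is the one you cd into next.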

% cd cd-hit-v4.6.4-2015-0603/
% ls
ChangeLog                     clstr_quality_eval.pl
README                        clstr_quality_eval_by_link.pl
cd-hit-2d-para.pl             clstr_reduce.pl
cd-hit-auxtools               clstr_renumber.pl
cd-hit-div.pl                 clstr_rep.pl
cd-hit-para.pl                clstr_reps_faa_rev.pl
cdhit-2d.c++                  clstr_rev.pl
cdhit-454.c++                 clstr_select.pl
cdhit-common.c++              clstr_select_rep.pl
cdhit-common.h                clstr_size_histogram.pl
cdhit-div.c++                 clstr_size_stat.pl
cdhit-est-2d.c++              clstr_sort_by.pl
cdhit-est.c++                 clstr_sort_prot_by.pl
cdhit-utility.c++             clstr_sql_tbl.pl
cdhit-utility.h               clstr_sql_tbl_sort.pl
cdhit.c++                     doc
clstr2tree.pl                 license.txt
clstr2txt.pl                  make_multi_seq.pl
clstr2xml.pl                  makefile
clstr_cut.pl                  plot_2d.pl
clstr_merge.pl                plot_len1.pl
clstr_merge_noorder.pl        psi-cd-hit
% 

This is where you think “oh shit, no configure script!” But there’s a makefile. Naturally, you shall try that, right?

% make
g++  -fopenmp -O2  cdhit-common.c++ -c
cdhit-common.c++:36:9: fatal error: 'omp.h' file not found
#include<omp.h>
^
1 error generated.
makefile:69: recipe for target 'cdhit-common.o' failed
make: *** [cdhit-common.o] Error 1
% 

Fatal error?! (Very dramatic, isn’t it?) OK then. I solved it in a couple of seconds, but I’d rather tell you that there are two ways to solve this. The first one is to dispense with the omp.h “requirement.” You figure out how to do that by checking the makefile and finding the place where it reads:

# with OpenMP
# in command line:
# make openmp=yes
ifeq ($(openmp),no)
  CCFLAGS = -DNO_OPENMP
else
  CCFLAGS = -fopenmp
endif

Do you see that? You have a choice to use OpenMP, or not to use it (who knows if it’s good or bad to dispense with it?). Anyway, the simple solution is to type your make command this way instead:

% make openmp="no"
g++  -DNO_OPENMP -O2  cdhit-common.c++ -c
g++  -DNO_OPENMP -O2  cdhit-utility.c++ -c
g++  -DNO_OPENMP -O2  cdhit.c++ -c
g++  -DNO_OPENMP -O2  cdhit.o cdhit-common.o cdhit-utility.o -o cd-hit
g++  -DNO_OPENMP -O2  cdhit-est.c++ -c
g++  -DNO_OPENMP -O2  cdhit-est.o cdhit-common.o cdhit-utility.o -o cd-hit-est
g++  -DNO_OPENMP -O2  cdhit-2d.c++ -c
g++  -DNO_OPENMP -O2  cdhit-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-2d
g++  -DNO_OPENMP -O2  cdhit-est-2d.c++ -c
g++  -DNO_OPENMP -O2  cdhit-est-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-est-2d
g++  -DNO_OPENMP -O2  cdhit-div.c++ -c
g++  -DNO_OPENMP -O2  cdhit-div.o cdhit-common.o cdhit-utility.o -o cd-hit-div
g++  -DNO_OPENMP -O2  cdhit-454.c++ -c
g++  -DNO_OPENMP -O2  cdhit-454.o cdhit-common.o cdhit-utility.o -o cd-hit-454
% 

Voilà!

You’re done. Mostly, but what if you wanted to compile with openmp (parallel programming sounds so much like a “must have”)?

In that case, forget what we did above. Don’t do it (or else, type “make clean” before proceeding any further).

This is actually what I did. I have fink installed on my Macs, and I always install a GNU gcc. In this case I have gcc5, but this should also work with gcc49. Given that I have gcc5, I edited the makefile and replaced every instance of g++ with g++-5 (you might have guessed by now that GNU’s gcc has omp.h). With diff you can see what I changed (the “-O3” is not necessary, but I always change the “-O2”s to “-O3”s):

% diff makefile makefile~
2,4c2,4
< CC = g++-5 -Wall -ggdb
< CC = g++-5 -pg
< CC = g++-5
---
> CC = g++ -Wall -ggdb
> CC = g++ -pg
> CC = g++
24c24
< CCFLAGS += -O3
---
> CCFLAGS += -O2

% make
g++-5  -fopenmp -O3  cdhit-common.c++ -c
g++-5  -fopenmp -O3  cdhit-utility.c++ -c
g++-5  -fopenmp -O3  cdhit.c++ -c
g++-5  -fopenmp -O3  cdhit.o cdhit-common.o cdhit-utility.o -o cd-hit
g++-5  -fopenmp -O3  cdhit-est.c++ -c
g++-5  -fopenmp -O3  cdhit-est.o cdhit-common.o cdhit-utility.o -o cd-hit-est
g++-5  -fopenmp -O3  cdhit-2d.c++ -c
g++-5  -fopenmp -O3  cdhit-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-2d
g++-5  -fopenmp -O3  cdhit-est-2d.c++ -c
g++-5  -fopenmp -O3  cdhit-est-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-est-2d
g++-5  -fopenmp -O3  cdhit-div.c++ -c
g++-5  -fopenmp -O3  cdhit-div.o cdhit-common.o cdhit-utility.o -o cd-hit-div
g++-5  -fopenmp -O3  cdhit-454.c++ -c
g++-5  -fopenmp -O3  cdhit-454.o cdhit-common.o cdhit-utility.o -o cd-hit-454
%
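The makefile edit shown in the diff can also be scripted instead of done by hand. Here is a minimal sketch, assuming the makefile lines look exactly like the ones in the diff above; the function name patch_makefile is mine, and the backup ends up as makefile.bak rather than the makefile~ my editor left behind.

```shell
# Point the makefile at another compiler and raise -O2 to -O3,
# keeping the untouched original as <file>.bak.
patch_makefile() {
    file="$1"
    cxx="$2"
    # Only lines that still name plain g++ are rewritten,
    # so running this a second time changes nothing further.
    sed -i.bak \
        -e "s/^CC = g++\$/CC = $cxx/" \
        -e "s/^CC = g++ /CC = $cxx /" \
        -e 's/^CCFLAGS += -O2/CCFLAGS += -O3/' \
        "$file"
}

if [ -f makefile ]; then
    patch_makefile makefile g++-5
    # diff exits non-zero when the files differ, which is expected here.
    diff makefile makefile.bak || true
fi
```

The -i.bak form (suffix attached to the flag) is used because it is accepted by both the sed that ships with the Mac and GNU sed.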

Of course, I should tell you that instead of editing the makefile you could have just typed:

% make CC=g++-5 openmp="yes" debug="no"
g++-5  -fopenmp -O2  cdhit-common.c++ -c
g++-5  -fopenmp -O2  cdhit-utility.c++ -c
g++-5  -fopenmp -O2  cdhit.c++ -c
g++-5  -fopenmp -O2  cdhit.o cdhit-common.o cdhit-utility.o -o cd-hit
g++-5  -fopenmp -O2  cdhit-est.c++ -c
g++-5  -fopenmp -O2  cdhit-est.o cdhit-common.o cdhit-utility.o -o cd-hit-est
g++-5  -fopenmp -O2  cdhit-2d.c++ -c
g++-5  -fopenmp -O2  cdhit-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-2d
g++-5  -fopenmp -O2  cdhit-est-2d.c++ -c
g++-5  -fopenmp -O2  cdhit-est-2d.o cdhit-common.o cdhit-utility.o -o cd-hit-est-2d
g++-5  -fopenmp -O2  cdhit-div.c++ -c
g++-5  -fopenmp -O2  cdhit-div.o cdhit-common.o cdhit-utility.o -o cd-hit-div
g++-5  -fopenmp -O2  cdhit-454.c++ -c
g++-5  -fopenmp -O2  cdhit-454.o cdhit-common.o cdhit-utility.o -o cd-hit-454
% 

There you have it. Two solutions (or three, a question of focus) for the very same problem. Solution 1 is very simple and works all right. Solution 2 requires you to have fink (and use fink to install either gcc5 or gcc49). Your choice.
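If you can’t decide, you could let the machine decide for you. The sketch below tries to build a tiny program that includes omp.h with a given compiler and then prints the make command that fits. The function names has_openmp and make_command_for are my own inventions, and g++-5 is just the compiler I happen to have; substitute whatever yours is called.

```shell
# Succeed if the given C++ compiler can build a program that uses omp.h.
has_openmp() {
    cxx="${1:-g++}"
    command -v "$cxx" >/dev/null 2>&1 || return 1
    probe_dir=$(mktemp -d "${TMPDIR:-/tmp}/omp-probe.XXXXXX") || return 1
    printf '#include <omp.h>\nint main() { return omp_get_max_threads() > 0 ? 0 : 1; }\n' \
        > "$probe_dir/probe.cpp"
    if "$cxx" -fopenmp "$probe_dir/probe.cpp" -o "$probe_dir/probe" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$probe_dir"
    return "$status"
}

# Print the make command line that suits the given compiler.
make_command_for() {
    if has_openmp "$1"; then
        echo "make CC=$1 openmp=yes"
    else
        # Fall back to the makefile's default compiler without OpenMP.
        echo "make openmp=no"
    fi
}

make_command_for g++-5
```

It only prints the command; copy and paste it (or wrap it in eval if you trust it) from inside the cd-hit source directory.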

May you enjoy using cd-hit!

-Gabo

8 thoughts on “Compiling cd-hit under Mac OSX Yosemite”

  1. I installed it using brew. It seemed straightforward and the main tools were running just fine. But then I found that it did not install the auxtools.

    1. Interesting. The guy who develops cd-hit works at the JCVI. I’d guess they would ensure compatibility. Anyway, I’ll give it a try and let you know.

    2. OK Dewan,

      Here’s what you’re going to do:
      1. You will compile using g++-6 (change the makefile so that instead of g++ it has g++-6)
      2. Edit the cdhit-utility.h file line 95, change “push_back( item );” to “this->push_back( item );”
      3. Type make. It should compile (worked on Mac with MacOS Sierra).

      Let me know!

  2. v4.5.8 is not available on GitHub. It’s at v4.6.7, which installed exactly as easily as I describe here, only I used gcc6 (installed using fink). Why are you installing the older version?
