8000 genomes assembled from metagenomes at RefSeq

A recent article reported the recovery of close to 8000 genomes from metagenomes. These can be found at NCBI's RefSeq genome database. Anyway, here's how I've been checking the number of genomes available. The pieces of information needed are: The project number for these genomes (found in the article): PRJNA348753. The address for NCBI's RefSeq …

Discovering segmasker on a Friday

I read about the possibility of adding masking information to an NCBI blast database. This allows for running blast sequence comparisons using either soft or hard masking. Since there's already soft and hard masking of the queries, I hadn't bothered to try. Anyway, a few weeks ago, I started playing with it. I thus discovered …

Legacy blast compiled under OSX Yosemite

You might wonder why I would write about compiling NCBI's legacy blast. It's discontinued, right? Well, yes. But: some people still want to use it; other programs use it (well, psi-cd-hit requires it, I don't know if anything else does); the precompiled one at NCBI is compiled at 32 bits (I want 64, damn it!) …

Learning R: R can read compressed files!

It might be a tad hard to believe, but I am new ... well ... somewhat new, to R. I have been using R for a while, but it has been mostly by copying the proper commands from some place, understanding what they do, then putting them into a program of mine so that the …

NCBI’s blast 2.2.29+ compiled under mavericks

NOTE: I write these examples exactly the way they worked for me. No guarantees whatsoever, etc, etc. NCBI released a new version of blast: 2.2.29+ on January 6 (the newest versions of blast are always available at ftp://ftp.ncbi.nih.gov/blast/executables/LATEST/). Given my obsession with having the latest versions of whatever on my machines, and to compile them …

Compiling NCBI’s blast 2.2.28+ in Mountain Lion

Note: I left the gutter and line numbers in the shell commands below to help differentiate the terminal commands and outputs from the text describing what I did. Recently, NCBI sent an e-mail about the release of blast version 2.2.28+. Since my post on compiling NCBI's blast 2.2.27+ there have been two updates to Apple's Xcode, claiming to have …

Note on the perl programming example for newbies

Note: Not intended for professional programmers, but as an example for non-programmers wishing to see some programming action. Also a disclaimer: No warranties whatsoever. In the little perl programming example for newbies I documented before, I did not make any tests for how quickly the program ran. Here I offer a modified example keeping tr/// to …

Little PERL programming example for biology newbies

Note: Not intended for professional programmers, but as an example for non-programmers wishing to see some programming action. Also a disclaimer: No warranties whatsoever. During an exercise in my intro course in bioinformatics I asked my students to use the following command to count the number of As in E. coli K12's DNA sequence (yeah, the …
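For readers who want to try this kind of count without perl, here is a rough Python sketch. It is not the command from the post (the excerpt cuts off before showing it); the function name `count_bases` and the toy record are made up for illustration, and a real run would read the lines from a FASTA file instead.

```python
from collections import Counter


def count_bases(fasta_lines):
    """Count each letter in FASTA-formatted lines, skipping header lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        # Header lines start with '>' and are not part of the sequence.
        if not line or line.startswith(">"):
            continue
        # Upper-case so that 'a' and 'A' are counted together.
        counts.update(line.upper())
    return counts


# Toy record (hypothetical, not E. coli K12).
example = [">toy_sequence", "AACGT", "ttaa"]
counts = count_bases(example)
print(counts["A"])  # prints 4
```

To use it on a real genome, pass an open file handle instead of the list, since iterating over a file yields its lines one at a time.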

And now BLAT 35 won’t compile in my Mountain Lion, or will it?

I just learned that the new version of the BLAST-like alignment tool, blat 35, was released last November. So, of course, I immediately downloaded it and tried to compile it. I followed the instructions verbatim, but it complained that it could not find "png.h". Since I knew I had the png libraries installed through fink, I checked and …

If ssearch36 gets stuck while comparing sequences in your mac

Quick note. When I was comparing the protein sequences from C. glomerans PW2 to those in E. coli K12 MG1655, the program would get stuck with this sequence:

>gi|328954742|ref|YP_004372075.1|
MSKTIPELTLDNFDLVMQSKLPVLVDFWAPWCGPCRTLSPIVEQVAEEMSERITVAKCNVDENQDLAMKYGVMS
IPTLVLFRDGAEVSRTVGAMPKPKLVAEIEKNL

I could not figure out the cause, but found that compiling with the gcc-4 installed by fink fixed it. I just had to change this …