On Wed, Mar 21, 2012 at 10:21 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
2012/3/21 Makis Ladoukakis <makis4ever@hotmail.com>:
Dear Galaxy users,
I have been trying to add a BLAST database to my local instance of Galaxy. I have used the nr database and generated all the nhr, nin, nsq, and nal files. I have also edited the blastdb.loc file in the galaxy-dist/tool-data/ directory, and it looks like this:
database          [build data]      path
nr_01_Mar_2012    nr 15 Mar 2012    /home/user/Desktop/nr.00/nr
Nevertheless, when I start Galaxy the megablast tool can't recognise the database. Am I missing something?
The NR database comes split up into many parts, 00 to 06 currently, and you need to download them all. They are linked by the nr.pal file, which you should also have downloaded. The database is then used via the full name of the nr.pal file (but without the .pal extension).
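A quick way to confirm that all the pieces are in place is to look for the alias file and the numbered volumes next to each other. The following is only a rough sketch in Python: the function name `describe_blast_db` is made up for illustration, and it assumes the usual BLAST naming (a `.pal` or `.nal` alias file, plus volumes such as `nr.00.pin`).

```python
from pathlib import Path

# Alias-file and index-file extensions for each BLAST database type.
ALIAS_EXT = {"protein": ".pal", "nucleotide": ".nal"}
INDEX_EXT = {"protein": ".pin", "nucleotide": ".nin"}


def describe_blast_db(prefix, db_type):
    """Report what exists on disk for a BLAST database prefix.

    `prefix` is the database path without any extension (hypothetical
    example: /data/blast/nr). `db_type` is "protein" or "nucleotide".
    """
    prefix = Path(prefix)
    # The alias file ties the numbered volumes together under one name.
    alias = prefix.with_name(prefix.name + ALIAS_EXT[db_type])
    # Volumes are numbered 00, 01, ...; match their index files.
    pattern = prefix.name + ".[0-9][0-9]" + INDEX_EXT[db_type]
    volumes = sorted(p.name for p in prefix.parent.glob(pattern))
    return {"alias_file": alias.is_file(), "volumes": volumes}


if __name__ == "__main__":
    # Hypothetical location; replace with wherever the database was unpacked.
    print(describe_blast_db("/data/blast/nr", "protein"))
```

If `alias_file` comes back false, or the list of volumes is shorter than expected, the download is probably incomplete, or the path points at a single volume rather than at the alias name.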
If you are running Galaxy on a server, it is likely your systems administrator can set up (or has already set up) a shared set of NCBI BLAST databases for all the system users (including Galaxy), to avoid unnecessary copies under /home.
Note that queries about local Galaxy installations are normally handled via the galaxy-dev mailing list (although perhaps the project needs three lists now given local Galaxy installations are getting more common and not everyone wants to follow the Galaxy development itself).
Peter
Sorry, I missed something else which is vitally important: the NCBI NR database is a protein database, and should be listed in blastdb_p.loc (which is used by the BLASTP wrapper etc.), while blastdb.loc is for nucleotide databases only (and is used by the BLASTN/megablast wrapper etc.). As you were asking about megablast, you probably want the NCBI NT BLAST database instead. The NCBI sometimes, confusingly, uses the names NR and NT ambiguously, but for the file names the distinction is very important.
Peter
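To make the split between the two files concrete, here is a small Python sketch that picks the .loc file for a database type and builds one entry. It assumes the three tab-separated columns suggested by the header quoted in the question (an ID, a description, and the path without extension); the helper names and the /data/blast/nt path are hypothetical.

```python
# Which .loc file each kind of BLAST database belongs in.
LOC_FILE = {
    "nucleotide": "blastdb.loc",   # read by the BLASTN/megablast wrappers
    "protein": "blastdb_p.loc",    # read by the BLASTP wrapper
}


def loc_file_for(db_type):
    """Return the .loc file name for "nucleotide" or "protein"."""
    return LOC_FILE[db_type]


def loc_entry(db_id, description, path):
    """Build one tab-separated entry (assumed layout: id, description, path)."""
    return "\t".join([db_id, description, path])


if __name__ == "__main__":
    # Hypothetical NT entry for megablast; the path has no file extension.
    print(loc_file_for("nucleotide"))
    print(loc_entry("nt_01_Mar_2012", "nt 15 Mar 2012", "/data/blast/nt"))
```

The columns must be separated by real tab characters, not spaces, and Galaxy normally needs a restart before it picks up changes to a .loc file.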