On Mon, Apr 29, 2013 at 3:27 AM, Mike Dyall-Smith mike.dyallsmith@gmail.com wrote:
Dear Peter Cock, thanks for your advice. Just to be clear, do I leave the files within their decompressed folders or do I put all the individual files into one folder? I assume the former, but want to be sure. Thanks again, Mike DS
Hi Mike,
Unless you're using a graphical decompression tool which is trying to be too helpful, each tar-ball does *not* decompress into its own folder. The files should all be in the *same* folder.
I use this to verify the checksums,
$ md5sum --check nr.00.tar.gz.md5
nr.00.tar.gz: OK
Then I use this to decompress the tar-balls (repeating it for each file in turn),
$ tar -zxvf nr.00.tar.gz
(Actually I don't do this personally any more - it has been set up to happen automatically when the NCBI updates the databases.)
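If you want to run the two commands above over every volume in one go, something along these lines should work. This is only a sketch, not our actual update script: verify_and_unpack is a made-up helper name, and it assumes GNU md5sum and that each *.tar.gz sits next to its *.tar.gz.md5 file in the folder you pass in.

```shell
#!/bin/sh
# verify_and_unpack is a hypothetical helper, not part of BLAST+ or Galaxy.
# It checks each tar-ball against its .md5 file and, if the checksum
# matches, extracts it into the same folder.
verify_and_unpack() {
    dir="$1"
    (
        # The .md5 files name the tar-ball without a path, so run from inside the folder.
        cd "$dir" || exit 1
        for sum in *.tar.gz.md5; do
            [ -e "$sum" ] || continue      # no checksum files in this folder
            archive="${sum%.md5}"
            if md5sum --check "$sum" >/dev/null 2>&1; then
                tar -zxf "$archive"        # files land in the current folder
            else
                echo "checksum failed: $archive" >&2
                exit 1                     # stop at the first bad download
            fi
        done
    )
}
```

You would call it as verify_and_unpack /data/blastdb/ncbi (or wherever your tar-balls are). It stops at the first tar-ball which fails the checksum, so a corrupted download is never unpacked.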
We keep all our NCBI databases in the same folder,
$ ls /data/blastdb/ncbi/nr.*
/data/blastdb/ncbi/nr.00.phd
/data/blastdb/ncbi/nr.00.phi
/data/blastdb/ncbi/nr.00.phr
/data/blastdb/ncbi/nr.00.pin
/data/blastdb/ncbi/nr.00.pnd
/data/blastdb/ncbi/nr.00.pni
/data/blastdb/ncbi/nr.00.pog
/data/blastdb/ncbi/nr.00.ppd
/data/blastdb/ncbi/nr.00.ppi
/data/blastdb/ncbi/nr.00.psd
/data/blastdb/ncbi/nr.00.psi
/data/blastdb/ncbi/nr.00.psq
/data/blastdb/ncbi/nr.00.tar.gz
/data/blastdb/ncbi/nr.00.tar.gz.md5
...
/data/blastdb/ncbi/nr.10.phd
/data/blastdb/ncbi/nr.10.phi
/data/blastdb/ncbi/nr.10.phr
/data/blastdb/ncbi/nr.10.pin
/data/blastdb/ncbi/nr.10.pnd
/data/blastdb/ncbi/nr.10.pni
/data/blastdb/ncbi/nr.10.pog
/data/blastdb/ncbi/nr.10.ppd
/data/blastdb/ncbi/nr.10.ppi
/data/blastdb/ncbi/nr.10.psd
/data/blastdb/ncbi/nr.10.psi
/data/blastdb/ncbi/nr.10.psq
/data/blastdb/ncbi/nr.10.tar.gz
/data/blastdb/ncbi/nr.10.tar.gz.md5
/data/blastdb/ncbi/nr.pal
We can then refer to the NR database at the command line as /data/blastdb/ncbi/nr or as just nr if the BLAST database path is configured to check this folder.
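One way to configure that path is the BLASTDB environment variable, which the BLAST+ tools check when a database name is given without a folder. A minimal sketch, assuming BLAST+ is installed and your databases are in the same folder as ours (adjust the path to suit):

```shell
# Tell BLAST+ where to look for databases given by name only.
# The path is the folder used in this thread; change it for your system.
export BLASTDB=/data/blastdb/ncbi

# With that set, this should print a summary of the NR database
# (commented out here since it needs BLAST+ and the database in place):
# blastdbcmd -db nr -info
```

Putting the export line in your shell start-up file (or in whatever environment your Galaxy jobs run under) saves typing the full path each time.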
In this folder we also have other NCBI databases, like NT:
$ ls /data/blastdb/ncbi/nt.*
/data/blastdb/ncbi/nt.00.nhd
/data/blastdb/ncbi/nt.00.nhi
/data/blastdb/ncbi/nt.00.nhr
/data/blastdb/ncbi/nt.00.nin
/data/blastdb/ncbi/nt.00.nnd
/data/blastdb/ncbi/nt.00.nni
/data/blastdb/ncbi/nt.00.nog
/data/blastdb/ncbi/nt.00.nsd
/data/blastdb/ncbi/nt.00.nsi
/data/blastdb/ncbi/nt.00.nsq
/data/blastdb/ncbi/nt.00.tar.gz
/data/blastdb/ncbi/nt.00.tar.gz.md5
...
/data/blastdb/ncbi/nt.13.nhd
/data/blastdb/ncbi/nt.13.nhi
/data/blastdb/ncbi/nt.13.nhr
/data/blastdb/ncbi/nt.13.nin
/data/blastdb/ncbi/nt.13.nnd
/data/blastdb/ncbi/nt.13.nni
/data/blastdb/ncbi/nt.13.nog
/data/blastdb/ncbi/nt.13.nsd
/data/blastdb/ncbi/nt.13.nsi
/data/blastdb/ncbi/nt.13.nsq
/data/blastdb/ncbi/nt.13.tar.gz
/data/blastdb/ncbi/nt.13.tar.gz.md5
/data/blastdb/ncbi/nt.nal
Note you don't need to keep the *.tar.gz and the *.md5 files once you've verified the checksum (using md5sum to detect any data corruption during download) and decompressed the tar-ball.
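If you do want to reclaim the disk space, a cautious way is to delete a tar-ball only once everything inside it is present in the folder. Again just a sketch: cleanup_archives is a made-up helper name, and it assumes the file names contain no spaces (true of the NCBI volumes listed here).

```shell
#!/bin/sh
# cleanup_archives is a hypothetical helper, not part of BLAST+ or Galaxy.
# It removes each tar-ball and its .md5 file, but only if every file
# listed inside the tar-ball already exists in the folder.
cleanup_archives() {
    dir="$1"
    (
        cd "$dir" || exit 1
        for archive in *.tar.gz; do
            [ -e "$archive" ] || continue      # no tar-balls in this folder
            # List the archive contents; skip the archive if it cannot be read.
            members=$(tar -ztf "$archive") || {
                echo "keeping $archive: could not list contents" >&2
                continue
            }
            missing=0
            for member in $members; do         # assumes no spaces in file names
                [ -e "$member" ] || missing=1
            done
            if [ "$missing" -eq 0 ]; then
                rm -f "$archive" "$archive.md5"
            else
                echo "keeping $archive: not fully extracted" >&2
            fi
        done
    )
}
```

Run it as cleanup_archives /data/blastdb/ncbi after the decompression step. Anything it cannot confirm as fully extracted is left alone.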
Peter
P.S. This galaxy-user list is meant for discussion of using the tools within Galaxy from an end user perspective. Although there is talk about creating a new Galaxy mailing list specifically for deployment questions like this, currently galaxy-devel is preferred for this kind of discussion.
galaxy-user@lists.galaxyproject.org