Hello,

I am trying to get the reference genomes to appear in our NGS tools. In my bowtie.loc file for instance I have the following line:

hg18 /share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18.fasta.1.ebwt

[galaxy@ tool-data]$ cd /share/shared/data/genomes/hg18/bowtie/
[galaxy]$ ls
Homo_sapiens_assembly18.fasta.1.ebwt  Homo_sapiens_assembly18.fasta.4.ebwt
Homo_sapiens_assembly18.fasta.2.ebwt  Homo_sapiens_assembly18.fasta.rev.1.ebwt
Homo_sapiens_assembly18.fasta.3.ebwt  Homo_sapiens_assembly18.fasta.rev.2.ebwt

Do the files provided in the .loc file have to be fasta files? Where can these fasta files be obtained?

-Rob
Hi Robert,

The fasta file that you created the indexes from should be located in the same directory hierarchy as the indexes themselves. For some tools (Bowtie is one of them), a symbolic link to the fasta file in the directory with the indexes is also required.

General instructions to set up indexes:
http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup

Instructions for setting up the builds.txt and other precursor steps:
http://wiki.g2.bx.psu.edu/Admin/Data%20Integration

For your example, the data could be organized like:

/share/shared/data/genomes/hg18/
/share/shared/data/genomes/hg18/seq
/share/shared/data/genomes/hg18/seq/Homo_sapiens_assembly18.fasta
/share/shared/data/genomes/hg18/bowtie/
/share/shared/data/genomes/hg18/bowtie/<bowtie_index_files>
/share/shared/data/genomes/hg18/bowtie/<symbolic_link_to_fasta>

-- where <symbolic_link_to_fasta> is named exactly like the original fasta file, in your case, "Homo_sapiens_assembly18.fasta"
-- and where all of the <bowtie_index_files> have the full original fasta file name with the index name appended (your example has this correct)

Then, in the bowtie_indices.loc file, there will be only one row for the fasta genome, not one row for each individual index file.
So, one row, tab delimited, 4 fields:

<unique_build_id> <dbkey> <display_name> <file_base_path>

Where each field could be, for example:

Homo_sapiens_assembly18
Hs18
Human (Homo sapiens)
/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18

-- note that there is no ".fasta" in the <file_base_path> field
-- put all four of these fields in one single row in the actual file; I only put them on individual lines to make the contents of each field clear
-- be sure to use only tabs, no extra spaces, to delimit the fields
-- use the actual file system path in the <file_base_path> (avoid following symbolic links, as these have been problematic in the past for some users)

The sample .loc files have this information plus more examples:
http://bitbucket.org/galaxy/galaxy-central/src/a10bb73f5793/tool-data/bowtie...

We just started up an rsync server to host the same genomes as those available on Galaxy Main. Or, you can obtain genomes from any source - making the data available in fasta format is the only requirement. Full wiki documentation for the rsync server, linked in with the other NGS setup wikis, and a broader announcement will be coming out later this week, but this prior post covers the basics:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-July/010607.html

Hopefully this helps you to get set up,

Jen
Galaxy team

On 8/8/12 2:32 PM, Robert Chase wrote:
Hello,
I am trying to get the reference genomes to appear in our NGS tools. In my bowtie.loc file for instance I have the following line:
hg18 /share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18.fasta.1.ebwt

[galaxy@ tool-data]$ cd /share/shared/data/genomes/hg18/bowtie/
[galaxy]$ ls
Homo_sapiens_assembly18.fasta.1.ebwt  Homo_sapiens_assembly18.fasta.4.ebwt
Homo_sapiens_assembly18.fasta.2.ebwt  Homo_sapiens_assembly18.fasta.rev.1.ebwt
Homo_sapiens_assembly18.fasta.3.ebwt  Homo_sapiens_assembly18.fasta.rev.2.ebwt
Do the files provided in the .loc file have to be fasta files? Where can these fasta files be obtained?
-Rob
-- Jennifer Jackson http://galaxyproject.org
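
For anyone following along, the directory layout, symbolic link, and single tab-delimited row described in the reply above could be sketched in shell roughly as follows. This is only an illustration under stated assumptions: the variable names GENOME_ROOT, FASTA_NAME, and LOC_FILE are made up for the example, the script works in a scratch directory with an empty placeholder fasta rather than real data, and the field values are copied from the reply as written. In a real install the row would go into the bowtie_indices.loc file under Galaxy's tool-data directory, and the fasta and .ebwt files would already exist.

```shell
#!/bin/sh
# Sketch only: defaults to a scratch directory so it cannot touch real data.
# For a real setup GENOME_ROOT would be /share/shared/data/genomes/hg18.
set -eu

# Hypothetical variable names, chosen for this example.
GENOME_ROOT="${GENOME_ROOT:-$(mktemp -d)/hg18}"
FASTA_NAME="Homo_sapiens_assembly18.fasta"

# seq/ holds the fasta, bowtie/ holds the index files.
mkdir -p "$GENOME_ROOT/seq" "$GENOME_ROOT/bowtie"

# Empty placeholder standing in for the real fasta (demo only).
[ -e "$GENOME_ROOT/seq/$FASTA_NAME" ] || : > "$GENOME_ROOT/seq/$FASTA_NAME"

# Symbolic link in the index directory, named exactly like the fasta file.
ln -sf "../seq/$FASTA_NAME" "$GENOME_ROOT/bowtie/$FASTA_NAME"

# One row, four fields, separated by tabs only:
# unique_build_id, dbkey, display_name, file_base_path (no ".fasta" suffix).
# The scratch location of LOC_FILE is an assumption for the demo.
LOC_FILE="$GENOME_ROOT/bowtie_indices.loc"
printf '%s\t%s\t%s\t%s\n' \
    "Homo_sapiens_assembly18" \
    "Hs18" \
    "Human (Homo sapiens)" \
    "$GENOME_ROOT/bowtie/Homo_sapiens_assembly18" > "$LOC_FILE"

# Extra sanity check (not part of the reply): every non-comment row
# must have exactly 4 tab-separated fields, otherwise exit non-zero.
awk -F '\t' '!/^#/ && NF != 4 { bad = 1; print "bad row " NR ": " $0 }
             END { exit bad }' "$LOC_FILE"
```

Using printf with explicit \t separators avoids the stray spaces that tend to creep in when a .loc row is typed by hand, which is the failure the reply warns about.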