Re: [galaxy-dev] Bowtie .loc file with fasta.1.ebwt

9 Aug 2012

Hi Robert,

The fasta file that you created the indexes from should be located in 
the same directory hierarchy as the indexes themselves. For some tools 
(Bowtie is one of them), a symbolic link to the fasta file in the 
directory with the indexes is also required.

General instructions to set up indexes:
http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup

Instructions for setting up the builds.txt and other precursor steps:
http://wiki.g2.bx.psu.edu/Admin/Data%20Integration

For your example, the data could be organized like:

/share/shared/data/genomes/hg18/
/share/shared/data/genomes/hg18/seq
/share/shared/data/genomes/hg18/seq/Homo_sapiens_assembly18.fasta
/share/shared/data/genomes/hg18/bowtie/
/share/shared/data/genomes/hg18/bowtie/<bowtie_index_files>
/share/shared/data/genomes/hg18/bowtie/<symbolic_link_to_fasta>

  -- where <symbolic_link_to_fasta> is named exactly like the original 
fasta file, in your case, "Homo_sapiens_assembly18.fasta"
  -- and where all of the <bowtie_index_files> have the full original 
fasta file name with the index name appended (your example has this correct)
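
As a rough illustration of the symbolic link step, here is a minimal 
Python sketch. The function name link_fasta_into_index_dir is 
hypothetical (it is not part of Galaxy or Bowtie); it only assumes the 
layout described above, with the fasta under seq/ and the indexes under 
bowtie/.

```python
import os


def link_fasta_into_index_dir(fasta_path, index_dir):
    """Create a symlink in index_dir named exactly like the fasta file.

    Hypothetical helper for illustration only; not a Galaxy function.
    """
    # The link keeps the original file name, e.g. Homo_sapiens_assembly18.fasta
    link_path = os.path.join(index_dir, os.path.basename(fasta_path))
    # Do nothing if a file or link with that name already exists
    if not os.path.lexists(link_path):
        os.symlink(os.path.abspath(fasta_path), link_path)
    return link_path
```

For the example layout, this would be called with 
/share/shared/data/genomes/hg18/seq/Homo_sapiens_assembly18.fasta as 
fasta_path and /share/shared/data/genomes/hg18/bowtie as index_dir.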

Then, in the bowtie_indices.loc file, add only one row for the fasta 
genome, not one row for each individual index file.

So, one row, tab-delimited, 4 fields:
<unique_build_id>   <dbkey>   <display_name>   <file_base_path>

Where each field could be, for example:

Homo_sapiens_assembly18
Hs18
Human (Homo sapiens)
/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18

  -- note that there is no ".fasta" in the <file_base_path> field
  -- put all four of these fields in one single row in the actual 
file; I only put them on individual lines to make the contents of each 
field clear
  -- be sure to use only tabs, no extra spaces, to delimit the fields
  -- use the actual file system path in the <file_base_path> (avoid 
following symbolic links, as these have been problematic in the past for 
some users)
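
To make the row format concrete, here is a small Python sketch that 
assembles the four example fields into a single tab-delimited row and 
rejects the mistakes mentioned in the notes above. The function 
make_loc_line is a hypothetical helper, not something Galaxy provides, 
and the checks are only an assumption about what is worth catching.

```python
def make_loc_line(build_id, dbkey, display_name, base_path):
    """Join the four fields into one tab-delimited .loc row.

    Hypothetical helper for illustration only; not a Galaxy function.
    """
    fields = [build_id, dbkey, display_name, base_path]
    for field in fields:
        # No empty fields, embedded tabs, or leading/trailing spaces
        if not field or "\t" in field or field != field.strip():
            raise ValueError("bad field: %r" % field)
    # Per the note above, <file_base_path> has no ".fasta" on the end
    if base_path.endswith(".fasta"):
        raise ValueError("base path should not end in .fasta")
    return "\t".join(fields)


line = make_loc_line(
    "Homo_sapiens_assembly18",
    "Hs18",
    "Human (Homo sapiens)",
    "/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18",
)
```

The resulting line can be appended to bowtie_indices.loc as is. If the 
path you have contains symbolic links, os.path.realpath() returns the 
resolved file system path.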

The sample .loc files have this information plus more examples:
http://bitbucket.org/galaxy/galaxy-central/src/a10bb73f5793/tool-data/bowtie...

We just started an rsync server that hosts the same genomes as those 
available on Galaxy Main. You can also obtain genomes from any other 
source; the only requirement is that the data be available in fasta 
format. Full wiki documentation for the rsync server, linked with the 
other NGS setup wikis, and a broader announcement will be coming out 
later this week, but this prior post covers the basics:
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-July/010607.html

Hopefully this helps you to get set up,

Jen
Galaxy team

On 8/8/12 2:32 PM, Robert Chase wrote:
...
Hello,
I am trying to get the reference genomes to appear in our NGS tools. In
my bowtie.loc file for instance I have the following line:
hg18
/share/shared/data/genomes/hg18/bowtie/Homo_sapiens_assembly18.fasta.1.ebwt
[galaxy@ tool-data]$ cd /share/shared/data/genomes/hg18/bowtie/
[galaxy]$ ls
Homo_sapiens_assembly18.fasta.1.ebwt
Homo_sapiens_assembly18.fasta.2.ebwt
Homo_sapiens_assembly18.fasta.3.ebwt
Homo_sapiens_assembly18.fasta.4.ebwt
Homo_sapiens_assembly18.fasta.rev.1.ebwt
Homo_sapiens_assembly18.fasta.rev.2.ebwt
Do the files provided in the .loc file have to be fasta files? Where can
these fasta files be obtained?
-Rob
-- 
Jennifer Jackson
http://galaxyproject.org