Thanks Jennifer,

I did get an error when I ran tophat2 with a FASTA from my history as the reference, so I modified line 105 of the tophat
wrapper (bowtie2-build instead of bowtie-build in the command line).
Now "Tophat for Illumina Find splice junctions using RNA-seq data" runs without error.
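
For anyone making the same change, here is a minimal sketch of the idea: the index builder has to match the Bowtie generation that TopHat will call. The function name index_builder_for and the file names reference.fa / reference are placeholders of mine, not taken from the Galaxy wrapper, and the command is only printed, not run.

```shell
# Pick the index builder that matches the Bowtie major version.
# (Hypothetical helper; the real wrapper builds its command differently.)
index_builder_for() {
    case "$1" in
        1) echo "bowtie-build" ;;
        2) echo "bowtie2-build" ;;
        *) echo "unknown bowtie major version: $1" >&2; return 1 ;;
    esac
}

# TopHat 2 with Bowtie 2 needs .bt2 files, so the builder must be bowtie2-build.
builder=$(index_builder_for 2)
# Dry run: show the command rather than executing it.
echo "$builder reference.fa reference"
```

Both builders take the FASTA file first and the index base name second, so only the executable name changes.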

Thank you again for your help,
Sarah


Jennifer Jackson wrote:
Hi Sarah,

On 4/11/13 8:02 AM, Sarah Maman wrote:
Thanks Jennifer,

Excuse me, my previous mail contained an error: in fact, the reference genome from my history was in FASTA format (the name said GTF file, but the format was FASTA).
So, when I run tophat with a reference genome from my history, here is the error message (my reference genome is a FASTA file):

OK, now this looks like a tool/index mismatch problem. Most likely rooted in a binary path issue.
Error in tophat:

[2013-04-11 14:57:12] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-11 14:57:12] Checking for Bowtie
		  Bowtie version:	 2.0.0.7
[2013-04-11 14:57:12] Checking for Samtools
		Samtools version:	 0.1.19.0
[2013-04-11 14:57:13] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/tmp/1078173.1.workq/tmpzxEFNK/dataset_6485.*.bt2)
    

Settings:
 blablabla  OK.....
Total time for backward call to driver() for mirror index: 00:00:57
TopHat v2.0.5
tophat -p 4  /tmp/1078173.1.workq/tmpzxEFNK/dataset_6485 /work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
Epilog : job finished at jeu. avril 11 14:57:18 CEST 2013

    
And here are my bowtie and tophat versions:

$ which bowtie
bowtie -v
0.12.8
This is good
   $ which tophat
tophat -v
2.0.5
This is most likely the problem. There is probably a symbolic link pointing from tophat -> tophat2. You will want to remove that. The tool wrappers will be looking for the correct binary/indexes for the version they are each dependent on. This means that if you are running Tophat2 for Illumina, you want both tophat2 and bowtie2 to be used, along with the bowtie2 indexes. This is detailed in the wikis I sent in the Tophat/Bowtie sections, for both dependencies and the index set up.

My guess at this point, without seeing your exact files, is that you need to add the index path to the bowtie2 loc file, and remove/adjust the symbolic link as I stated above, then restart, and test again to see if that fixes the problem.
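
As a rough sketch of how to see which index paths a loc file actually advertises: this assumes the four tab-separated columns used in the sample loc files (unique build id, dbkey, display name, index base path). The function name loc_index_paths is just an illustration.

```shell
# Print the index base path (4th tab-separated column) of every
# non-comment entry in a Galaxy .loc file.
# Assumption: the loc file uses the four-column layout of the sample files.
loc_index_paths() {
    awk -F '\t' '!/^#/ && NF >= 4 { print $4 }' "$1"
}
```

Call it with the path to your bowtie2 loc file. Each path it prints should be a base name for which the .bt2 files exist on disk and are readable by the galaxy user.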

But we also have available on our cluster:
bowtie2 --version is 2.0.0-beta7
This is also good, if "which bowtie" == v0.12.8 and "which bowtie2" == v2.

If "bowtie" is pointing to v2 on your cluster nodes, then remove that symbolic link, so that it instead points to the correct binary (v0.12.8). The same goes for bowtie2: it should point to the v2 binary. bowtie/tophat and bowtie2/tophat2 are not the same executables and use different indexes - this is most likely why you had to use the bowtie v0.12.8 loc to get tophat2 going to begin with.
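
A minimal sketch of re-pointing such a link, assuming you have write access to the directory that holds it. The function name repoint_link is hypothetical; it refuses to touch anything that is a real file rather than a symbolic link.

```shell
# Re-point a symbolic link at a different target.
# Refuses to overwrite a regular file, so a real binary is never clobbered.
repoint_link() {
    link=$1
    target=$2
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing: $link exists and is not a symbolic link" >&2
        return 1
    fi
    # -s symbolic, -f replace an existing link, -n do not follow the old link
    ln -sfn "$target" "$link"
}
```

For example, repoint_link /some/bin/bowtie /some/install/bowtie-0.12.8/bowtie, where both paths are placeholders for the real locations on your cluster nodes.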

Hope it works this time! Please keep replies on the list to help us with tracking,

Jen
Galaxy team

Could you please tell me how to point to the v2 binaries (how to change symbolic links) ?

Thanks in advance,
Sarah


Jennifer Jackson wrote:
Hi Sarah,

It still sounds like there is a path problem - this is why the tools are looking in the wrong loc file. When bowtie2/tophat2 installs, it will create a symbolic link that names itself as just "bowtie" or "tophat", pointing to the v2 binaries.

When you run these, what do you get?

   $ which bowtie
 
   $ which tophat

My guess is that these are symbolic links pointing to the v2 binaries. You will want to remove these. This is noted in the NGS set-up wiki, but easy to miss.
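
A small sketch of that check, in case it helps: it reports where each command resolves on the PATH and, for a symbolic link, what it points to. The function name describe_binary is mine, and the output on your cluster nodes will of course differ.

```shell
# Report where a command resolves and, if it is a symbolic link, its target.
describe_binary() {
    resolved=$(command -v "$1" 2>/dev/null) || { echo "$1: not found on PATH"; return 1; }
    if [ -L "$resolved" ]; then
        echo "$1: $resolved -> $(readlink "$resolved")"
    else
        echo "$1: $resolved (not a symbolic link)"
    fi
}

# Check both generations of each tool; a missing tool is reported, not fatal.
for tool in bowtie bowtie2 tophat tophat2; do
    describe_binary "$tool" || true
done
```

Run it as the galaxy user on a cluster node, since that is the environment the jobs see.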

For the custom reference genome portion below, there is a mix-up here. A custom reference genome is in fasta format, not GTF format. I think what you are doing is using a reference annotation file with the process. Both can be used with RNA-seq tools, but the reference genome is the one with the indexes. The link I sent about custom reference genomes explains how to use one of these, if you still want to try.

I think it is worth reviewing the path and loc info, plus the output of the commands above, unless this already helps you solve the problem on your own.

Thanks!

Jen
Galaxy team

On 4/11/13 6:16 AM, Sarah Maman wrote:
Thanks a lot Jennifer,

The restart is done, and the full paths were OK.

I don't know why, but version 2 of Tophat (the tophat tool available from Galaxy) searches for indexes in the bowtie-index.loc file and not in bowtie2-index.loc.
So I've added my bowtie2 index paths to the bowtie-index.loc file, and tophat runs...


But when I want to run tophat with a reference genome from my history, here is the error message (my reference genome is a GTF file):
Error in tophat:

[2013-04-11 14:57:12] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-11 14:57:12] Checking for Bowtie
		  Bowtie version:	 2.0.0.7
[2013-04-11 14:57:12] Checking for Samtools
		Samtools version:	 0.1.19.0
[2013-04-11 14:57:13] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/tmp/1078173.1.workq/tmpzxEFNK/dataset_6485.*.bt2)
    

Settings:
 blablabla  OK.....
Total time for backward call to driver() for mirror index: 00:00:57
TopHat v2.0.5
tophat -p 4  /tmp/1078173.1.workq/tmpzxEFNK/dataset_6485 /work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
Epilog : job finished at jeu. avril 11 14:57:18 CEST 2013
Thanks in advance,
Sarah


Jennifer Jackson wrote:
Hi Sarah,

Let's try to sort this out. Your problem does not seem to be the same as in the question referenced, but we can see. First - just to double check - since setting up the genome, you have restarted the server? If not, do that first and check to see if that fixes the problem. Basically, you want to follow this checklist and restarting is the final step:
http://wiki.galaxyproject.org/Admin/NGS%20Local%20Setup

If the problem persists, then would you please send a few more details:

1 - full paths* on your system where you keep the .bt2 indexes, sam index, and .fa file. Maybe do an "ls -l" on these dirs so we can check the symbolic links are in place and named correctly.

* as a note, these should be "hard paths" and not symbolic (except for the .fa links), and must have permissions set to be accessible to the "galaxy user"

2 - lines from your bowtie2_indices.loc and sam_fa_indices.loc file for this genome. I may have you double check your builds.txt file later. If this doesn't sound familiar, it could be the problem: the genome must be in there, too. See this wiki: http://wiki.galaxyproject.org/Admin/Data%20Integration

3 - full error message you get when you try to run this using a genome in fasta format from your history. It really shouldn't be the same error - something is not right with the settings and a custom genome is not actually being used if that is the case. Give it another try and see what happens, then send that info. This is a bit of a side case, we should get your basic install correct, but knowing how to do this is a good thing and easy to learn.
http://wiki.galaxyproject.org/Support#Custom_reference_genome
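
For item 1, a sketch like the following may be a quick first check alongside "ls -l". It assumes the standard six-file layout of a Bowtie 2 index (base.1.bt2 through base.4.bt2, plus base.rev.1.bt2 and base.rev.2.bt2); the function name check_bt2_index is only for illustration.

```shell
# Report which of the six Bowtie 2 index files exist for an index base name.
# Succeeds only when all six are present.
check_bt2_index() {
    prefix=$1
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ -f "$prefix.$suffix" ]; then
            echo "found:   $prefix.$suffix"
        else
            echo "missing: $prefix.$suffix"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Pass it the same base path that TopHat prints in its "Could not find Bowtie 2 index files" message, without the ".*.bt2" part.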

It is OK to mask out anything like user names/groups you don't want to share. Please keep this on the list in case we need other feedback.

Thanks!

Jen
Galaxy team

On 4/10/13 3:15 AM, Sarah Maman wrote:
Hello,


When I run tophat ("Tophat for Illumina Find splice junctions using RNA-seq data"), the job fails with truncated files. However, index files are available, and I get exactly the same error message using a built-in index or one from my history.


Tool execution generated the following error message:

Error in tophat:

[2013-04-10 09:17:07] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-10 09:17:07] Checking for Bowtie
          Bowtie version:     2.0.0.7
[2013-04-10 09:17:07] Checking for Samtools
        Samtools version:     0.1.19.0
[2013-04-10 09:17:07] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files (/work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa.*.bt2)


The tool produced the following additional output:

TopHat v2.0.5
tophat -p 4  /work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa /work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the BAM file.
Epilog : job finished at mer. avril 10 09:17:12 CEST 2013


In this post (http://dev.list.galaxyproject.org/tophat-for-illumina-looking-in-wrong-directory-for-bowtie2-indexes-tt4658609.html#none), no solution was found.

Do you have any idea?
Sarah Maman
-- 
          --*--
Sarah Maman
INRA - LGC - SIGENAE
http://www.sigenae.org/
Chemin de Borde-Rouge - Auzeville - BP 52627
31326 Castanet-Tolosan cedex - FRANCE
Tel:   +33(0)5.61.28.57.08
Fax:   +33(0)5.61.28.57.53 
         --*--


___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

-- 
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org

