Hi, thanks!

Let me clarify my question about the index being generated multiple times.

When I work on the command line, I do this:
Bam,Bai -> Prog1 -> Bam,Bai(v2) -> Prog2 -> Bam,Bai(v3)

What I understand Galaxy to be doing ("always overwrite" [1]):
Bam,Bai -> Prog1 -> Bam,Bai(v2) -> Bam(v2) -> Bam,Bai(v2) -> Prog2 -> Bam,Bai(v3) -> Bam(v3) -> Bam,Bai(v3)

Best, Alexander

[1] https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary...

2015-06-19 1:26 GMT-05:00 Björn Grüning <bjoern.gruening@gmail.com>:
Hi Alexander,
Am 19.06.2015 um 01:44 schrieb Alexander Vowinkel:
Hi,
when I 'produce' a bam file in my output, I can download the bam file and a corresponding bai file.
I am now wondering: where does the bai file come from?
From Galaxy! :) Galaxy has a concept of datatypes and metadata. In the case of BAM files, Galaxy produces an index file for every position-sorted BAM file.
When is it being generated, what code is used for that?
As soon as the BAM file is created.
https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary...
When I do multiple actions with my bam file - is the index generated every time? (overhead)
No, why should it? The BAM file stays the same, right? So if you use the BAM file multiple times, the one index is reused each time.
Where is the bai saved on disk (if it is)?
You can access it with this snippet: $your_bam_file.metadata.bam_index
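As a rough illustration of how a tool could use that path, here is a minimal Python sketch that links a BAM file and its existing index side by side, so a program expecting `<name>.bam.bai` next to the BAM finds it without re-indexing. The helper name `stage_bam_with_index` and the `input.bam` naming are hypothetical; in a real Galaxy tool the two source paths would come from the dataset itself and from its `metadata.bam_index`.

```python
import os


def stage_bam_with_index(bam_path, bai_path, workdir, name="input"):
    """Symlink a BAM and its index into workdir as <name>.bam and <name>.bam.bai.

    Hypothetical helper, not part of Galaxy. bai_path is assumed to be the
    index that already exists for bam_path.
    """
    os.makedirs(workdir, exist_ok=True)
    bam_link = os.path.join(workdir, name + ".bam")
    bai_link = os.path.join(workdir, name + ".bam.bai")
    for src, dst in ((bam_path, bam_link), (bai_path, bai_link)):
        # Replace a stale link left over from an earlier run.
        if os.path.lexists(dst):
            os.remove(dst)
        os.symlink(os.path.abspath(src), dst)
    return bam_link, bai_link
```

Symlinks keep the original files untouched and avoid copying large BAM files; the downstream program is then pointed at the returned BAM link.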
I found that in the IUC gatk, a new bai is always generated when a gatk tool is started. This creates quite an overhead. Isn't there a better solution?
Do you have an idea? Without Galaxy you also create an index once and reuse it; this is what Galaxy is doing. This concept is also used for other datatypes and improves usability dramatically.
If this was helpful, please consider asking this question again on Biostar. I think it might be useful for others as well.
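The "create once, reuse" idea can be sketched in Python as a freshness check: rebuild the index only when it is missing or older than the BAM file. This is an illustration under stated assumptions, not Galaxy's actual code; `ensure_index` and the `build_index` callback are hypothetical names, and the callback stands in for whatever indexer is used (for example a call to `samtools index`).

```python
import os


def ensure_index(bam_path, bai_path, build_index):
    """Call build_index(bam_path, bai_path) only if the index is missing or stale.

    Returns True if the index was (re)built, False if the existing one was reused.
    build_index is a hypothetical callback supplied by the caller.
    """
    index_is_fresh = (
        os.path.exists(bai_path)
        and os.path.getmtime(bai_path) >= os.path.getmtime(bam_path)
    )
    if index_is_fresh:
        return False
    build_index(bam_path, bai_path)
    return True
```

Comparing modification times is a cheap heuristic; it assumes that a BAM file whose content changes also gets a newer timestamp.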
Cheers, Bjoern
@ref https://github.com/galaxyproject/tools-iuc/issues/194
Best, Alexander