Hi,

Thanks! Let me clarify my question about the index being generated multiple times.
 
On the command line, I do this:
Bam,Bai -> Prog1 -> Bam,Bai(v2) -> Prog2 -> Bam,Bai(v3)

As I understand it, this is what Galaxy does ("always overwrite" [1]):

Bam,Bai -> Prog1 -> Bam,Bai(v2) -> Bam(v2) -> Bam,Bai(v2) ->
Prog2 -> Bam,Bai(v3) -> Bam(v3) -> Bam,Bai(v3)
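
To put numbers on it, here is a toy sketch in plain Python. It only counts index builds in the two workflows above; none of the function names come from Galaxy, I made them up for illustration, and it assumes each tool writes its own index as in my command-line case.

```python
# Toy model only: counts how often an index gets built in each workflow.
# None of these names exist in Galaxy; they are hypothetical.

def run_tool(bam_version, counter):
    """A tool (Prog1/Prog2) writes a new BAM together with its own BAI."""
    counter["index_builds"] += 1
    new_version = bam_version + 1
    return new_version, f"bai(v{new_version})"


def galaxy_set_meta(bam_version, counter):
    """Assumed Galaxy behaviour: keep only the BAM, always build a fresh index."""
    counter["index_builds"] += 1
    return f"bai(v{bam_version})"


def command_line_pipeline(steps):
    counter = {"index_builds": 0}
    version = 1
    for _ in range(steps):
        version, _bai = run_tool(version, counter)
    return counter["index_builds"]


def galaxy_pipeline(steps):
    counter = {"index_builds": 0}
    version = 1
    for _ in range(steps):
        version, _tool_bai = run_tool(version, counter)  # the tool's BAI is dropped
        _bai = galaxy_set_meta(version, counter)         # index rebuilt afterwards
    return counter["index_builds"]


print(command_line_pipeline(2))  # 2
print(galaxy_pipeline(2))        # 4
```

So for my two-step example the index would be built four times instead of twice, if my reading of [1] is right.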

Best,
Alexander

[1] https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary.py#L314


2015-06-19 1:26 GMT-05:00 Björn Grüning <bjoern.gruening@gmail.com>:
Hi Alexander,

On 19.06.2015 at 01:44, Alexander Vowinkel wrote:
> Hi,
>
> when I 'produce' a bam file in my output, I can download the bam file and a
> corresponding bai file.
>
> I am now wondering where the bai file comes from.

From Galaxy! :)
Galaxy has a concept of datatypes and metadata. In the case of BAM
files, Galaxy produces an index file for every position-sorted BAM file.

> When is it being generated, what code is used for that?

As soon as the BAM file is created.
https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary.py#L312
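
Roughly speaking, that step boils down to running an indexer on the freshly written file. A minimal sketch of the idea, assuming `samtools` is on the PATH; the helper names are mine and this is not the code from binary.py, which differs in detail:

```python
import subprocess


def build_index_command(bam_path, bai_path):
    # `samtools index <in.bam> <out.index>` writes the index to bai_path.
    return ["samtools", "index", bam_path, bai_path]


def index_bam(bam_path, bai_path):
    # Hypothetical helper: raises CalledProcessError if samtools fails,
    # for example when the BAM is not position sorted.
    subprocess.run(build_index_command(bam_path, bai_path), check=True)
```

Calling `index_bam("sorted.bam", "sorted.bam.bai")` would then leave the index next to the BAM.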

> When I do multiple actions with my bam file - is the index generated every
> time? (overhead)

No, why should it? The BAM file stays the same, right? So if you use
the BAM multiple times, the one index is reused each time.

> Where is the bai saved on disk (if it is)?

You can access it with this snippet: $your_bam_file.metadata.bam_index
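
Many tools expect the index to sit next to the BAM under a matching name, so the usual trick is to link both files into the working directory instead of indexing again. Here is that idea sketched in plain Python; `stage_bam_with_index` and the `input.bam` name are made up for illustration, and the two paths would be whatever `$your_bam_file` and `$your_bam_file.metadata.bam_index` resolve to:

```python
import os
import tempfile


def stage_bam_with_index(bam_path, bai_path, work_dir):
    """Symlink an existing BAM and its index side by side in work_dir."""
    bam_link = os.path.join(work_dir, "input.bam")
    bai_link = bam_link + ".bai"  # naming convention assumed: <bam>.bai
    os.symlink(os.path.abspath(bam_path), bam_link)
    os.symlink(os.path.abspath(bai_path), bai_link)
    return bam_link, bai_link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-ins for the dataset file and its metadata file.
        bam = os.path.join(tmp, "dataset.dat")
        bai = os.path.join(tmp, "metadata.dat")
        open(bam, "w").close()
        open(bai, "w").close()
        job_dir = os.path.join(tmp, "job")
        os.mkdir(job_dir)
        print(stage_bam_with_index(bam, bai, job_dir))
```

Nothing is copied or recomputed here; the tool just sees `input.bam` and `input.bam.bai`.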

> I found that in the IUC gatk, a new bai is always generated when a gatk
> tool is started. This creates quite an overhead. Isn't there a better
> solution?

Do you have an idea?
Without Galaxy you also create an index once and reuse it; this is what
Galaxy is doing. This concept is also used for other datatypes and
improves usability dramatically.


If this was helpful, please consider asking this question on Biostar
as well. I think it might be useful for others too.

Cheers,
Bjoern

> @ref https://github.com/galaxyproject/tools-iuc/issues/194
>
> Best,
> Alexander