The response speed is awesome!

On GitHub, Eric answered:

https://github.com/galaxyproject/tools-iuc/blob/0887009a23d176b21536c9fd8a18c4fecc417d4f/tools/bedtools/multiCov.xml see L12 of that file for where the bai file is stored.

That explains a lot. Very precise example. Thank you!

What I'm still wondering is:

When is it generated, and what code does that?

More specifically:
When I use Picard SortSam in Galaxy, the output is a bam file with an accompanying bai file.
But when I use Picard on the command line, the output is just a bam file.

As far as I can tell, the Picard SortSam Galaxy tool itself does not generate the index (unless I missed something).

Which leads me to:

When I run several tools on the same bam file, is the index regenerated every time? (overhead)

GATK processes bam files together with their bai files.
It would be wasteful to pass in only the bam files and regenerate bai files that already exist.
How could the existing index be reused instead?
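
To make the question concrete, here is a minimal sketch of the kind of reuse I mean. It is not how Galaxy or the GATK wrappers actually do it. The function name stage_bam_with_index and the file names input.bam / input.bam.bai are hypothetical, and I am assuming the downstream tool looks for the index next to the bam file. In a Galaxy tool wrapper, the bai path would come from the dataset's bam_index metadata, as in the multiCov.xml linked above.

```python
import os


def stage_bam_with_index(bam_path, bai_path, workdir):
    """Symlink a bam file and its existing index side by side in workdir.

    Hypothetical helper: the link names input.bam / input.bam.bai are an
    assumption about what the downstream tool expects.
    Returns (bam_link, bai_link, index_is_fresh).
    """
    bam_link = os.path.join(workdir, "input.bam")
    bai_link = os.path.join(workdir, "input.bam.bai")
    os.symlink(os.path.abspath(bam_path), bam_link)

    # Reuse the index only if it exists and is not older than the bam file.
    index_is_fresh = (
        os.path.exists(bai_path)
        and os.path.getmtime(bai_path) >= os.path.getmtime(bam_path)
    )
    if index_is_fresh:
        os.symlink(os.path.abspath(bai_path), bai_link)
    return bam_link, bai_link, index_is_fresh
```

If index_is_fresh comes back False, the caller would still have to build the index once (for example with samtools index), but not on every tool run.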

Thanks,
Alexander


2015-06-18 18:44 GMT-05:00 Alexander Vowinkel <vowinkel.alexander@gmail.com>:
Hi,

When I 'produce' a bam file in my output, I can download the bam file and a corresponding bai file.

I am now wondering: where does the bai file come from?
When is it generated, and what code does that?
When I run several tools on the same bam file, is the index regenerated every time? (overhead)
Where is the bai saved on disk (if it is)?

I found that in the IUC GATK tools, a new bai is always generated when a GATK tool is started. This creates quite some overhead. Isn't there a better solution?

@ref https://github.com/galaxyproject/tools-iuc/issues/194

Best,
Alexander