Hello Galaxy Team,
 
I have been using Galaxy for SNP detection with great success. Basically, I followed Anton's screencast without any problems; the only change was to use BWA instead of Bowtie. Until now, I have always assigned my raw read files to the hg19 reference genome. Now I want to try the GATK pipeline to analyze my samples, but I am running into a problem with the BAM/BAI files.
 
Here is what I did. I imported my Illumina paired-end reads into Galaxy and assigned them to the hg_g1k_v37 reference genome instead of hg19. From there, I followed the exact same process as before: FastQ Groomer, Summary Statistics, Boxplots, Align with BWA, filter on SAM, SAM-to-BAM, generate BAI file. I made sure that hg_g1k_v37 was selected as the reference genome in every step that required that information.
 
Everything seemed to run successfully, as all of the boxes turned green. But when I tried to view the BAM file in IGV (as a QC step before the GATK pipeline), I received the following error: "Error reading bam file. This usually indicates a problem with the index (bai) file. ArrayIndexOutofBoundsException: 4682 (4682)."
 
I ran the exact same analysis with hg19 as the reference genome, and the resulting BAM/BAI files worked perfectly fine in IGV. Can anyone tell me what the problem is and how to fix it?
 
Thanks,
Mike Dufault