Hi Ross,
thanks for your answer. I found a dirty fix for merging pairs of BAM files, though I had to change a couple of things in my local installation:
- Add read groups to each BAM file separately using Picard's "Add or Replace Groups" tool (with ID=s1 for one file and ID=s2 for the other)
- Create the "rg.txt" file containing something like this:
@RG ID:s1 SM:s1 LB:s1 PL:Illumina
@RG ID:s2 SM:s2 LB:s2 PL:Illumina
- Modify sam_merge.py to call:
"samtools merge -rh path/to/rg.txt %s %s..."
It works. The problem is that every pair of files will end up with the same read group IDs and labels, unless the rg.txt file is changed each time.
Would it be very difficult to add to the Galaxy wrapper the option of creating rg.txt on the fly and adding the -h option to the samtools call?
I'm not familiar with creating wrappers for Galaxy. Any suggestions as to where to start?
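To make the request concrete, here is a rough sketch of how rg.txt could be generated on the fly for each merge. It is not the actual Galaxy code: the function names, the choice of using each file's base name as ID, SM and LB, and the default platform are all assumptions for illustration. As I understand it, "samtools merge -r" infers the RG tag from the input file names, so the IDs in the header file need to match those names. Depending on the samtools version, the header file may also need the @SQ lines.

```python
import os


def build_rg_header(bam_paths, platform="Illumina"):
    """Return one tab-separated @RG line per input BAM.

    Assumption: the read group ID, sample and library are all set to the
    file's base name without extension (e.g. /data/s1.bam -> s1).
    """
    lines = []
    for path in bam_paths:
        rg_id = os.path.splitext(os.path.basename(path))[0]
        fields = ["@RG", "ID:%s" % rg_id, "SM:%s" % rg_id,
                  "LB:%s" % rg_id, "PL:%s" % platform]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


def write_rg_header(bam_paths, header_path, platform="Illumina"):
    """Write the @RG lines to header_path (the hypothetical rg.txt)."""
    with open(header_path, "w") as handle:
        handle.write(build_rg_header(bam_paths, platform))


def build_merge_command(output_bam, bam_paths, header_path):
    """Build the samtools call as an argument list (no shell quoting needed)."""
    return (["samtools", "merge", "-r", "-h", header_path, output_bam]
            + list(bam_paths))
```

The wrapper would then call write_rg_header() on the selected inputs and pass the list from build_merge_command() to subprocess, so each merge gets its own header file instead of a shared rg.txt.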
Thanks again,
Camille
Camille, thanks for reporting this - I think you have found a bug.
We definitely need to be able to preserve metadata when we merge bams.
Thanks for your suggestion of using Picard's MergeSamFiles - yes, I think it
might be a good fix for this problem - but it will take a little while
and won't reach the Main site for a few weeks once it's done. It is
possible to write your own wrapper locally if you need it fast.
Sorry for the inconvenience and thanks again.
On Wed, Aug 3, 2011 at 6:15 PM, Camille Stephan
<camille.stephan@irbbarcelona.org> wrote:
> Hello guys,
> I'm trying to run a pipeline following the best practices for SNP and indel
> discovery as described by the people at Broad, and I'm running into trouble
> with the GATK tools in a local installation of Galaxy.
> The main problem is that merging BAM files with the samtools merge
> tool doesn't keep the read group for each sample, causing "Count Covariates" to
> crash. The pipeline works fine with a single BAM file, but I need to realign
> at least two files at a time.
> Is there a way to set the read group of a merged BAM inside Galaxy? Are
> there plans to include the "merge" tool from Picard in Galaxy? Is there an
> easy way for me to do this locally? (Although I would like to run this in
> the cloud later on when the workflow is ready).
>
> Thanks!
> Camille
>
> --
> ***
> Camille Stephan-Otto Attolini, PhD
> Senior Research Officer, Bioinformatics and Biostatistics unit
> IRB Barcelona
> Tel (+34) 93 402 0553
--
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;