On Fri, Apr 20, 2012 at 2:17 AM, Brad Chapman <chapmanb@50mail.com> wrote:
Lance and Peter; Peter, thanks for noticing the problem and duplicate tools. Lance, I'm happy to merge these so there are not two different versions out there.
I prefer your use of genomeCoverageBed over my custom hacks. That's a nice approach I totally missed.
I avoid the need for the sam indexes by creating the file directly from the information in the BAM header. I don't think there is any way around creating it since it's required by the UCSC tools as well, but everything you need is in the BAM header.
Indeed - I remember looking at that with you back in March 2011, including the special case of BAM files lacking an embedded SAM header (where the BAM header alone suffices).
There might be a sneaky way to do this with samtools view -H and awk but I'm not nearly skilled enough to pull that out.
Using pysam works nicely, and therefore I stuck with Python ;)
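For reference, here is a minimal sketch of building the chromosome sizes file from header information. It is not the code from either script. The function names are made up for illustration. The first helper parses the @SQ lines of SAM header text (for example the output of samtools view -H), and the second assumes a recent pysam where AlignmentFile exposes references and lengths, which as far as I know come from the BAM header itself.

```python
def parse_sq_lines(header_text):
    """Return [(name, length), ...] from the @SQ lines of SAM header text."""
    sizes = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # skip @HD, @PG, @RG, @CO lines
        # Each field after the record type looks like TAG:VALUE
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        sizes.append((fields["SN"], int(fields["LN"])))
    return sizes


def sizes_from_bam(bam_path):
    """Same information via pysam (assumed installed; imported lazily)."""
    import pysam
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return list(zip(bam.references, bam.lengths))


def write_chrom_sizes(sizes, handle):
    """Write the two-column, tab-separated file the UCSC tools expect."""
    for name, length in sizes:
        handle.write("%s\t%d\n" % (name, length))
```

Either helper gives the same list of (name, length) pairs, so the wrapper can write the sizes file without needing a separate index of the reference.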
Let me know what you think. I can also update my python wrapper script to use the genomeCoverageBed approach instead if you think that's easier.
I've made an update to Brad's script from the Tool Shed (attached), switching to genomeCoverageBed and bedGraphToBigWig (based on the approach used in Lance's script). In doing so I dropped the region support (which wasn't exposed to the Galaxy interface anyway). genomeCoverageBed doesn't support this directly, but I think we could use samtools view for it, if you want this functionality.

Sadly, I then noticed that the Tool Shed version was out of date, lacking the new normalization option added here:
https://github.com/chapmanb/bcbb/commits/master/nextgen/scripts/bam_to_wiggl...

This was enough for my immediate needs today, but I'd happily try to merge this into the git version, update the XML file to match, and add the new split option. We could list this with three contributing authors, if you both like?

Peter
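To make the two-step conversion concrete, here is a rough sketch of how a wrapper might call the tools. This is not the attached script. The function names and file paths are hypothetical, and it assumes genomeCoverageBed and bedGraphToBigWig are on the PATH and the BAM file is coordinate sorted.

```python
import subprocess


def coverage_cmd(bam_path, sizes_path, split=True):
    """Command line for a bedGraph coverage track from a BAM file."""
    cmd = ["genomeCoverageBed", "-bg", "-ibam", bam_path, "-g", sizes_path]
    if split:
        # -split counts spliced reads by their aligned blocks only
        cmd.append("-split")
    return cmd


def bigwig_cmd(bedgraph_path, sizes_path, bigwig_path):
    """Command line for converting a sorted bedGraph to BigWig."""
    return ["bedGraphToBigWig", bedgraph_path, sizes_path, bigwig_path]


def bam_to_bigwig(bam_path, sizes_path, bedgraph_path, bigwig_path, split=True):
    """Run both steps; genomeCoverageBed writes the bedGraph to stdout."""
    with open(bedgraph_path, "w") as out:
        subprocess.check_call(coverage_cmd(bam_path, sizes_path, split), stdout=out)
    subprocess.check_call(bigwig_cmd(bedgraph_path, sizes_path, bigwig_path))
```

Keeping the command construction separate from the execution makes it easy to check the arguments without having the tools installed, and leaves room to put a samtools view step in front if region support is wanted again.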