Hi Jen,
thank you for your answer!
I have used the Add or Replace group tool and it worked pretty well, so that I could use the FreeBayes tool with no problem!
Now I have another question: I have been pre-processing my data with the NGS: GATK tools according to their Best Practices and I am ready for SNP calling. I have read the Unified Genotyper documentation and, since I am working with bacterial genome sequences, I would need to set the -sample-ploidy argument to 1 (default 2). I cannot find this option in the Galaxy version of this tool, not even in the advanced options. How can I do that?
Thank you very much! Debora
Message: 3
Date: Fri, 27 Sep 2013 14:02:50 -0700
From: Jennifer Jackson jen@bx.psu.edu
To: garzetti garzetti@mvp.uni-muenchen.de
Cc: galaxy-user@bx.psu.edu
Subject: Re: [galaxy-user] SNP calling problems
Message-ID: 5245F27A.7020200@bx.psu.edu
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
Hi Debora,
Sorry to hear that you are having problems. We can help get you going again! Please see below:
On 9/26/13 7:20 AM, garzetti wrote:
Dear all,
I have been looking for an answer to my problem in all the Galaxy Support resources, but with no success. I am sorry if this topic has already been discussed!
So, I am analyzing MiSeq data on the main Galaxy. I have Fastq files from 4 paired-end samples. After having checked the quality with FastQC and groomed them, I have performed a BWA mapping, filtered the results and converted the SAM to BAM files (for each sample separately). I have then called SNPs with Freebayes and SAMtools, encountering problems in both cases.
- SAMtools: if I run the Generate pileup tool, the Filter pileup tool doesn't recognize any valid format in the files I have in my History, and I cannot go on with the analysis. Why is that? What can I do?
Make sure that the output format is set as "pileup" and the tool will accept the input. Click on the pencil icon to make the datatype assignment change. http://wiki.galaxyproject.org/Support#Tool_doesn.27t_recognize_dataset
Note that Mpileup has an option to produce .bcf format, and that is not the same as pileup. If you have selected that type of output, then either re-run the tool with options that create pileup format, or convert bcf -> vcf and use one of the tools that work with vcf format to work with your data downstream from there.
- I have performed variant calling with Freebayes on single BAM files and on one merged BAM file built from all four of my BWA mapping files. In all cases, the last column is "unknown", while it should be the name of my sample. This is not a big deal for the single vcf files, but for the merged BAM file, I cannot tell which sample each SNP was detected in. I think there is a problem with the BAM files, which are not properly indexed. Also, Freebayes needs an RG tag. Is there a tool in Galaxy I can use to index BAM files and add the RG tag?
The tool "NGS: Picard (beta) -> Add or Replace Groups" can be used to annotate SAM/BAM files. This tool can be a bit picky about formats, so just watch for that if you get an error.
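To check whether a file already carries a read group before running FreeBayes, one option is to inspect its header (for example, the text printed by `samtools view -H`). The sketch below is only an illustration: the function name `read_group_samples` and the example header are made up for this post, and it assumes the header has already been obtained as plain text. It returns the sample name (SM) recorded for each read group ID; an empty result is consistent with the "unknown" sample column described above.

```python
from typing import Dict


def read_group_samples(sam_header: str) -> Dict[str, str]:
    """Map each read-group ID to its sample name (SM) from SAM header text."""
    samples: Dict[str, str] = {}
    for line in sam_header.splitlines():
        if not line.startswith("@RG"):
            continue
        # After the record type, header fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "ID" in fields:
            # A read group without an SM tag is reported with an empty sample name.
            samples[fields["ID"]] = fields.get("SM", "")
    return samples


# Hypothetical header for illustration only (not taken from the thread).
example_header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:5000000\n"
    "@RG\tID:run1\tSM:isolate_A\tPL:ILLUMINA\tLB:lib1\n"
)
print(read_group_samples(example_header))
```

When several BAM files are merged, each one needs its own read group ID and SM value beforehand so that the variant caller can report the samples separately.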
Quick tip: You can click on the bug icon on failed datasets to see the complete error message, which will often tell you exactly what is wrong so that you can correct it. Clicking the icon doesn't automatically submit a bug report, which is good to know when you are in a hurry at night or on weekends or just want to troubleshoot yourself. You can use this on any error dataset to get more information if the stderr/stdout links under the dataset's "i" info button or the "Info" field in its attributes do not provide enough detail. This works on servers that have bug reporting enabled. The public Main server does, and it is straightforward to configure on local/cloud instances, including your own, even if you use one only for small local file manipulations or file backup/storage (very handy, and key file backups are always a good idea when doing analysis anywhere). See the Admin wiki section for more.
Going forward, there is a short screencast about the Learning resources in Galaxy, published as a Galaxy Page. It will be uploaded to Vimeo sometime in the next 24 hrs and will likely be updated to include the very latest as the infrastructure updates on Main settle out over the next few weeks. For now, follow this link and click on the "Learning Resources" graphic to launch the quick tour: https://main.g2.bx.psu.edu/u/galaxyproject/p/screencasts-usegalaxyorg
Galaxy team's Vimeo account: http://vimeo.com/channels/581769 We are uploading all of our vids, old & new, right now and over the next few days. We really like them and hope our users do too and follow along. The public Main server will soon have direct links to this content on the center home page, as part of the "New & Improved" Galaxy experience! I won't give an ETA, as this is in progress, but can hint that soon == expected very soon. (!)
Good luck and let us know if you need more help,
Jen Galaxy team
I hope someone can help me!
Thank you very much! Debora
Hi Debora,
This option was introduced in a version of GATK that is newer than the one on the public Main Galaxy server under the tool group "NGS: GATK Tools (beta)". The version of the tools running here is "GATK (1.4)".
This is noted here for reference and in the tool output: http://wiki.galaxyproject.org/Admin/Tools/Tool%20Dependencies
The GATK forum has some discussion on the topic: http://gatkforums.broadinstitute.org/discussion/1214/can-i-use-gatk-on-non-d...
The newer version of GATK has not been wrapped for Galaxy. But if you or someone else reading this post were to wrap it, the Tool Shed would be a great place to share it: usegalaxy.org/toolshed
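For anyone running a newer GATK outside Galaxy, here is a rough sketch of how the haploid setting could be passed on the command line. It assumes a GATK 2.x/3.x style invocation in which UnifiedGenotyper accepts `--sample_ploidy`; please check the documentation for the exact version you install. The helper name `unified_genotyper_cmd` and all file paths are hypothetical, and the code only builds the argument list without running anything.

```python
from typing import List


def unified_genotyper_cmd(gatk_jar: str, reference: str, bam: str,
                          out_vcf: str, ploidy: int = 1) -> List[str]:
    """Build (but do not run) a UnifiedGenotyper command line.

    Assumes a GATK 2.x/3.x release in which --sample_ploidy is available;
    the option is not present in GATK 1.4.
    """
    if ploidy < 1:
        raise ValueError("ploidy must be a positive integer")
    return [
        "java", "-jar", gatk_jar,
        "-T", "UnifiedGenotyper",
        "-R", reference,                   # reference FASTA
        "-I", bam,                         # BAM with read groups
        "-o", out_vcf,                     # output VCF
        "--sample_ploidy", str(ploidy),    # 1 for haploid bacterial genomes
    ]


# Hypothetical paths for illustration only.
cmd = unified_genotyper_cmd("GenomeAnalysisTK.jar", "ref.fasta",
                            "sample.bam", "sample.vcf")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.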
Hopefully this helps to clarify,
Jen Galaxy team
On 10/1/13 1:48 AM, garzetti wrote:
Hi Jen,
thank you for your answer!
I have used the Add or Replace group tool and it worked pretty well, so that I could use the FreeBayes tool with no problem!
Now I have another question: I have been pre-processing my data with the NGS: GATK tools according to their Best Practices and I am ready for SNP calling. I have read the Unified Genotyper documentation and, since I am working with bacterial genome sequences, I would need to set the -sample-ploidy argument to 1 (default 2). I cannot find this option in the Galaxy version of this tool, not even in the advanced options. How can I do that?
Thank you very much! Debora