Why doesn't Bowtie in Galaxy accept colorspace reads directly?
Hi All,
I'm wondering why the Bowtie version in Galaxy (even the latest) does NOT support .csfasta/.qual input files directly, even though they are mentioned under "Map with Bowtie for SOLiD". The same is true of "BWA for SOLiD". One would expect direct support for colorspace files. Do you have any plans to implement this? I think it would be a great help to SOLiD users. I look forward to your comments.
Thanks, Raj
________________________________
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely for the use of the addressee(s). If you are not the intended recipient, please notify the sender by e-mail and delete the original message. Further, you are not to copy, disclose, or distribute this e-mail or its contents to any other person and any such actions that are unlawful. This e-mail may contain viruses. Ocimum Biosolutions has taken every reasonable precaution to minimize this risk, but is not liable for any damage you may sustain as a result of any virus in this e-mail. You should carry out your own virus checks before opening the e-mail or attachment. The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system. OCIMUMBIO SOLUTIONS (P) LTD
Raj: It does support colorspace files, but you have to convert them to FASTQ first. That sounds bad, but you don't have to convert to base space to store the data in FASTQ, so there is no loss of data - just one extra step.
https://main.g2.bx.psu.edu/tool_runner?tool_id=fastq_combiner
Only Bowtie 1.x supports colorspace (and that's all that's available on the public Galaxy anyway).
https://main.g2.bx.psu.edu/tool_runner?tool_id=bowtie_color_wrapper
Brad
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
-- Brad Langhorst langhorst@neb.com 978-380-7564
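As a rough illustration of the conversion step described in this reply, the sketch below merges a .csfasta file and its matching _QV.qual file into colorspace FASTQ without translating anything to base space. It is not the code behind Galaxy's fastq_combiner tool. The function names are hypothetical, and three conventions are assumptions: Sanger (Phred+33) quality encoding, one quality value per color call with the leading primer base unscored, and negative quality values (missing calls) clamped to 0.

```python
from itertools import zip_longest


def read_records(lines):
    """Yield (name, body_lines) pairs from csfasta or qual lines.

    Blank lines and '#' comment lines are skipped.
    """
    name, body = None, []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            name, body = line[1:], []
        else:
            body.append(line)
    if name is not None:
        yield name, body


def csfasta_qual_to_fastq(csfasta_lines, qual_lines):
    """Yield 4-line colorspace FASTQ records as strings.

    Assumptions (not taken from the thread): Phred+33 encoding, one
    quality value per color, negative qualities clamped to 0.
    """
    pairs = zip_longest(read_records(csfasta_lines), read_records(qual_lines))
    for seq_rec, qual_rec in pairs:
        if seq_rec is None or qual_rec is None:
            raise ValueError("csfasta and qual inputs have different read counts")
        (seq_name, seq_parts), (qual_name, qual_parts) = seq_rec, qual_rec
        if seq_name != qual_name:
            raise ValueError(f"read order mismatch: {seq_name!r} vs {qual_name!r}")
        seq = "".join(seq_parts)  # primer base followed by color calls
        scores = [int(q) for q in " ".join(qual_parts).split()]
        if len(scores) != len(seq) - 1:
            raise ValueError(f"{seq_name}: expected one quality per color")
        # Clamp into the printable Sanger range before encoding.
        quals = "".join(chr(min(max(q, 0), 93) + 33) for q in scores)
        yield f"@{seq_name}\n{seq}\n+\n{quals}\n"


if __name__ == "__main__":
    demo_csfasta = ["# comment", ">r1_F3", "T0123"]
    demo_qual = ["# comment", ">r1_F3", "10 20 30 -1"]
    for record in csfasta_qual_to_fastq(demo_csfasta, demo_qual):
        print(record, end="")
```

Both inputs are streamed in lockstep, so the read-name check also catches csfasta and qual files that have drifted out of sync.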
Thanks for the reply, Brad, but the format conversion is a burden when dealing with multiple samples, especially paired-end or mate-pair samples: it doubles the work. I'd therefore prefer to supply the csfasta/qual files directly with the -f and --Q1/--Q2 options, as described in the Bowtie manual (http://bowtie-bio.sourceforge.net/manual.shtml), quoted below:
"bowtie also handles input in the form of parallel .csfasta and _QV.qual files. Use -f to specify the .csfasta files and -Q (for unpaired reads) or --Q1/--Q2 (for paired-end reads) to specify the corresponding _QV.qual files. It is not necessary to first convert to FASTQ, though bowtie also handles FASTQ-formatted colorspace reads (with -q, the default)"
Why should the system spend time converting the files when the tool itself can accept the original formats? Please share your thoughts.
Raj
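For anyone trying the direct csfasta/qual route outside Galaxy, here is a minimal sketch of how the corresponding Bowtie 1 command line might be assembled. The -f, -Q and --Q1/--Q2 options come from the manual excerpt quoted in this message; -C (colorspace alignment), -S (SAM output) and -1/-2 (mate files) are recalled from the Bowtie 1 manual rather than from this thread, so check them against your installed version. The function name, its parameters and the file names are hypothetical.

```python
import shlex


def bowtie_colorspace_command(index, reads, quals, output,
                              mates=None, mate_quals=None):
    """Build a Bowtie 1 argument list for csfasta/qual input.

    index      -- basename of a colorspace Bowtie index
    reads      -- .csfasta file (first mate when paired)
    quals      -- matching _QV.qual file
    output     -- SAM file to write
    mates      -- optional second-mate .csfasta file
    mate_quals -- _QV.qual file for the second mate
    """
    # -C: colorspace, -S: SAM output, -f: FASTA-style (csfasta) reads.
    cmd = ["bowtie", "-C", "-S", "-f"]
    if mates is None:
        # Unpaired reads: a single qual file via -Q.
        cmd += ["-Q", quals, index, reads]
    else:
        if mate_quals is None:
            raise ValueError("paired input needs a qual file for each mate")
        # Paired reads: one qual file per mate via --Q1/--Q2.
        cmd += ["--Q1", quals, "--Q2", mate_quals,
                index, "-1", reads, "-2", mates]
    cmd.append(output)
    return cmd


if __name__ == "__main__":
    single = bowtie_colorspace_command(
        "ref_color", "sample_F3.csfasta", "sample_F3_QV.qual", "sample.sam")
    print(shlex.join(single))
```

Returning a list rather than a shell string means it can be passed straight to `subprocess.run` without quoting problems. Mate orientation settings for SOLiD libraries are deliberately left out of this sketch.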
On Mon, Sep 17, 2012 at 9:18 AM, Praveen Raj Somarajan <Praveen.s@ocimumbio.com> wrote:
Thanks for the reply, Brad, but the format conversion is a burden when dealing with multiple samples, especially paired-end or mate-pair samples: it doubles the work. I'd therefore prefer to supply the csfasta/qual files directly with the -f and --Q1/--Q2 options, as described in the Bowtie manual (quoted below):
“bowtie also handles input in the form of parallel .csfasta and _QV.qual files. Use -f to specify the .csfasta files and -Q (for unpaired reads) or --Q1/--Q2 (for paired-end reads) to specify the corresponding _QV.qual files. It is not necessary to first convert to FASTQ, though bowtie also handles FASTQ-formatted colorspace reads (with -q, the default)”
Why should the system spend time converting the files when the tool itself can accept the original formats?
Please share your thoughts.
Raj
Since Bowtie itself supports colorspace FASTA+QUAL, in theory the Galaxy wrapper could too. Galaxy does have the file formats "csfasta" and "qualsolid" defined, but neither is currently used here - just "fastqcssanger" (FASTQ color-space, Sanger encoding): https://bitbucket.org/galaxy/galaxy-central/src/fe12d92febf9/tools/sr_mappin...
However, requiring twice the number of input files would make this quite complex to implement, and also harder for the end user to use. Going to (colorspace) FASTQ as early as possible simplifies data management (you don't have to keep the two files in sync) and as a bonus saves you disk space (QUAL is very inefficient). If you are linking your Galaxy directly to your sequencing LIMS (as some people are, for Illumina at least), doing the conversion to FASTQ as part of that would make for a nicer end-user experience.
Peter
Raj: It would not be a big job to add some parameters to the XML tool definition for bowtie… if you're running this on your own instance. If you're planning to run on the public Galaxy, you'll probably get results more quickly by doing the conversion than by convincing the managers of that system to implement a change. If it's any consolation, it's not a terribly expensive conversion - we do it for all of our SOLiD data now.
Brad
-- Brad Langhorst langhorst@neb.com 978-380-7564
Thanks, Brad. I can probably look into the XML tool definition and change it. In the meantime, I'll go with the current solution of converting the files.
Raj
participants (3)
- Langhorst, Brad
- Peter Cock
- Praveen Raj Somarajan