Speed up the galaxy
Hi all, How can I speed up Galaxy? For example, how can I make it use more cores and memory?
Hi, This is a pretty broad question. However, I would recommend that you start at: http://usegalaxy.org/production
--nate (replying to 泽 蔡's message of Dec 4, 2012, 4:36 AM)
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Hi, I read that documentation page, but I don't think it covers what I need. I have installed a local instance of Galaxy and run it on a single PC. I need to process large data sets, and with Galaxy's default configuration every task takes a long time. For example, I ran the FASTQ Groomer on a large file; the procedure was very slow, and Galaxy did not use the full capacity of my machine. So I want to know how I can make Galaxy run faster. Uploading files is already quick; I just need to know how to run tools faster.
-- 泽 蔡 (replying to Nate Coraor's message of Tuesday, December 4, 2012, 9:38 PM)
Hi, I presume the best way to optimise your current problem is to check whether you really need to groom your data. If it is old data, presumably yes, but if it is recent data in Illumina 1.8+ encoding (http://en.wikipedia.org/wiki/FASTQ_format), grooming is not necessary: a 100% speedup :-) Groomer takes a long time on our servers as well, but because of the new Illumina format we didn't bother to optimise it further, for instance by parallelisation.
Alex (replying to 泽 蔡's message of Tuesday, December 4, 2012, 16:08)
Hi Alex, I looked at the Wikipedia page, but I am still a little confused. We sequenced with Solexa. I have pasted one read from my data below; can you tell me whether I need to run Groomer? The read looks like this:
@HWUSI-EAS1734_0003_FC620JEAAXX:8:1:1174:9013#0/1
AGAAGTACATCGCGATGCCGTTNCCNNCGAAGGCGATAGNNNACAAGNCCAAATGNTTCTNCATCNNNCNCGAGNNGNCGAGGNCGCCGTGCGACCCTGC
+HWUSI-EAS1734_0003_FC620JEAAXX:8:1:1174:9013#0/1
Ya^a`edddeddc\c`a`dc]\Ba^BBZ]ZZ`ZZZ]a]]BBB^[`\UB_V[V\`ZBSZX^BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
-- 泽 蔡 (replying to Alex Bossers' message of Tuesday, December 4, 2012, 11:21 PM)
Hi, You can use FastQC to find out the quality encoding of your sequences: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ Alternatively, you can use this script: http://www.uppmax.uu.se/userscript/check-fastq-quality-score-format
David (replying to 泽 蔡's message of Tue, 4 Dec 2012 23:41:27 +0800)
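For readers who want to see what such an encoding check boils down to, here is a minimal Python sketch. It is not taken from FastQC or the UPPMAX script; the function name `guess_quality_offset` and the two thresholds are assumptions based on the character ranges in the Wikipedia FASTQ table (characters below ASCII 59 only occur with the Sanger/Illumina 1.8+ offset of 33, and characters above ASCII 74 normally only occur with the older offset of 64). It is a heuristic, so it reports `None` when the data fits both ranges.

```python
def guess_quality_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Returns 33, 64, or None when the characters seen fit both encodings.
    The thresholds are a heuristic, not a definitive test.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    if min(codes) < 59:
        # Too low for any offset-64 encoding (Solexa starts at ASCII 59).
        return 33
    if max(codes) > 74:
        # Higher than Illumina 1.8+ (offset 33) normally produces.
        return 64
    return None


# First part of the quality line quoted in this thread (raw string because
# the data contains backslashes).
sample = r"Ya^a`edddeddc\c`a`dc]\Ba^BB"
print(guess_quality_offset([sample]))
```

In practice you would pass in every fourth line of the file (or a sample of them) rather than a single read, since one read may not cover the full range of scores.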
Hi, Thanks a lot!
-- 泽 蔡 (replying to David Roquis's message of Wednesday, December 5, 2012, 12:31 AM)
It looks like the old Illumina 1.5 encoding. So yes, tools that require fastqsanger will need the data to be groomed first. If you are up for some programming, you can speed this up considerably by using a precalculated translation (hash) table. That way you do not have to do any calculation; you just translate each quality line using generic regexp/grep/sed-like tools, or $seq =~ tr/STARTSCORES/SANGERSCORES/ in Perl. The wiki table could be an alternative option. You will likely still have to set the upload type to fastqsanger, since Galaxy will probably sniff the header and see that the file is FASTQ but not in 1.8+ encoding. Good luck!
Alex (replying to 泽 蔡's message of Tuesday, December 4, 2012, 16:41)
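The precalculated-table idea from Alex's message can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a replacement for the FASTQ Groomer: it assumes the input uses Phred scores with an ASCII offset of 64 (Illumina 1.3 to 1.5), that every record is exactly four lines, and that a plain shift of 31 is all that is needed. Original Solexa-scaled scores need a real score conversion and are not handled. The names `PHRED64_TO_33`, `convert_quality` and `convert_fastq` are made up for illustration.

```python
# Build the lookup table once: every offset-64 character (ASCII 64..126)
# maps to the character 31 positions lower, which is its offset-33 form.
PHRED64_TO_33 = str.maketrans({chr(c): chr(c - 31) for c in range(64, 127)})


def convert_quality(line):
    """Translate one quality line; characters outside the table (such as
    the trailing newline) pass through unchanged."""
    return line.translate(PHRED64_TO_33)


def convert_fastq(lines):
    """Yield FASTQ lines with only the quality line of each four-line
    record translated. Header, sequence and '+' lines are left alone."""
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield convert_quality(line)
        else:
            yield line


record = ["@read1\n", "ACGT\n", "+\n", "hhBB\n"]
print("".join(convert_fastq(record)), end="")
```

Only the fourth line of each record is translated because header lines start with `@`, which is itself ASCII 64 and would otherwise be rewritten. For a real file, `convert_fastq` can be fed an open file object and its output written straight to a new file, so the whole data set never has to sit in memory.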
Hi Alex, I now have another problem. I am working with two FASTQ files that are paired-end and in Illumina 1.8 encoding, so I don't need to groom them. However, I need to use the "filter by quality" and "Fastq interlacer" tools, and without grooming these two tools cannot recognize the files. Any idea how to solve this?
-- 泽 蔡 (replying to Alex Bossers' message of Wednesday, December 5, 2012, 4:02 AM)
You can just change the format of the data from fastq to fastqsanger if you're sure about the error format (use the pencil, then the datatype tab). Note: fastqsanger != fastqcsanger
Brad (replying to 泽 蔡's message of Dec 7, 2012, 3:20 AM)
-- Brad Langhorst langhorst@neb.com
Hi Brad, I'm sorry, I don't quite follow you. Are the "pencil" and the "datatype tab" tools in Galaxy?
-- 泽 蔡 (replying to Brad Langhorst's message of Friday, December 7, 2012, 7:53 PM)
See that pencil icon? It lets you change the settings for a given dataset. After clicking it, you'll see a Datatype tab.
Brad
On Dec 7, 2012, at 7:23 AM, 泽 蔡 <caizexi123@yahoo.com.cn> wrote:
Hi Brad, I'm sorry, I can't quite follow you. Are the "pencil" and the "datatype tab" tools in Galaxy?
Hi Brad, Galaxy Support,
Besides changing them individually, is there a way to modify the settings for a large number of files from fastq to fastqsanger in batch? Either via BioBlend or manually via the UI?
Thanks, Dave
On Fri, Dec 7, 2012 at 3:53 AM, Langhorst, Brad <Langhorst@neb.com> wrote:
You can just change the format of the data from fastq to fastqsanger if you're sure about the quality score encoding (use the pencil icon, then the Datatype tab). Note: fastqsanger != fastqcssanger (the color-space variant).
Brad
On Dec 7, 2012, at 3:20 AM, 泽 蔡 <caizexi123@yahoo.com.cn> wrote:
Hi Alex
Now there is another problem. I am dealing with two FASTQ files; they are Illumina 1.8 encoded and paired-end, so I don't need to groom them. But I need to use "Filter by quality" and "FASTQ interlacer", and without grooming these two tools cannot recognize the files. Any idea how to solve this problem?
*From:* "Bossers, Alex" <Alex.Bossers@wur.nl> *To:* 泽 蔡 <caizexi123@yahoo.com.cn> *Cc:* "galaxy-dev@lists.bx.psu.edu" <galaxy-dev@lists.bx.psu.edu> *Sent:* Wednesday, December 5, 2012, 4:02 AM *Subject:* RE: Re: [galaxy-dev] Re: Speed up the galaxy
It seems to be the old Illumina 1.5 encoding.
So yes, tools that require fastqsanger would need grooming.
If you are up for some programming, you can seriously speed this up by using a precalculated translation (hash) table.
That way you do not have to do any calculation; you just translate each quality line using generic regexp/sed-like tools, or $seq =~ tr/STARTSCORES/SANGERSCORES/ in Perl.
The Wikipedia table could be an alternative option. You will likely still have to set the upload type to fastqsanger, since Galaxy will probably sniff the header and see that it is FASTQ but not that it is 1.8+ encoded.
Good luck!
Alex
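The lookup-table idea above can be sketched in Python instead of Perl. This is only a minimal sketch, assuming the input is Illumina 1.3 to 1.7 style Phred+64 data with unwrapped four-line records; the names `to_sanger` and `convert_fastq` are made up for illustration and are not part of Galaxy or the FASTQ Groomer. Older Solexa-scaled scores need a non-linear conversion and are not handled here.

```python
# Precomputed translation table: Phred+64 quality characters -> Phred+33 (Sanger).
# Assumption: scores 0..62, so every output character stays printable ASCII.
PHRED64_TO_PHRED33 = str.maketrans(
    {chr(q + 64): chr(q + 33) for q in range(63)}
)


def to_sanger(quality_line):
    """Translate one Phred+64 quality string to Phred+33 by table lookup."""
    bad = [c for c in quality_line if ord(c) < 64]
    if bad:
        # Characters below '@' cannot be Phred+64; refuse rather than corrupt data.
        raise ValueError("not Phred+64: unexpected character %r" % bad[0])
    return quality_line.translate(PHRED64_TO_PHRED33)


def convert_fastq(lines):
    """Yield FASTQ lines with every fourth line (the quality line) translated.

    Assumes unwrapped FASTQ: exactly four lines per record.
    """
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield to_sanger(line.rstrip("\r\n")) + "\n"
        else:
            yield line
```

To run it over a file, one could pass an open file handle to `convert_fastq` and write the yielded lines to a new file. Because the work per character is a single table lookup, several files can be converted in separate processes to use more cores.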
________________________________ From: 泽 蔡 [caizexi123@yahoo.com.cn] Sent: Tuesday, December 4, 2012 16:41 To: Bossers, Alex Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Re: Speed up the galaxy
Hi Alex
I looked at the Wikipedia page, but I am a little confused. We sequenced with Solexa. I have pasted a read from my data; can you tell me whether I need to run Groom? The read looks like this:
@HWUSI-EAS1734_0003_FC620JEAAXX:8:1:1174:9013#0/1
AGAAGTACATCGCGATGCCGTTNCCNNCGAAGGCGATAGNNNACAAGNCCAAATGNTTCTNCATCNNNCNCGAGNNGNCGAGGNCGCCGTGCGACCCTGC
+HWUSI-EAS1734_0003_FC620JEAAXX:8:1:1174:9013#0/1
Ya^a`edddeddc\c`a`dc]\Ba^BBZ]ZZ`ZZZ]a]]BBB^[`\UB_V[V\`ZBSZX^BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
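Whether a file needs grooming can be estimated from the range of characters on its quality lines. The following is a rough heuristic sketch based on the ASCII ranges in the Wikipedia FASTQ table, not the method Galaxy's sniffer or the FASTQ Groomer uses; the function name and the returned labels are made up for illustration. It is only as reliable as the sample it sees: a Phred+33 file containing nothing but high-quality bases could be misclassified.

```python
def guess_quality_encoding(quality_lines):
    """Guess the FASTQ quality encoding from the lowest character seen.

    Returns one of three ad hoc labels:
      "phred33"  - Sanger / Illumina 1.8+ (characters start at '!', ASCII 33)
      "solexa64" - old Solexa scale (characters start at ';', ASCII 59)
      "phred64"  - Illumina 1.3 to 1.7 (characters start at '@', ASCII 64)
    """
    codes = [ord(c) for line in quality_lines for c in line.strip()]
    if not codes:
        raise ValueError("no quality characters given")
    lowest = min(codes)
    if lowest < 59:
        return "phred33"
    if lowest < 64:
        return "solexa64"
    return "phred64"
```

Applied to the quality line of the read above, whose lowest character is B (ASCII 66), this returns "phred64", which agrees with Alex's reading of the data as Illumina 1.5.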
Hi Dave,
The FASTQ Groomer tool will convert your FASTQ files (of unknown base quality scale) to fastqsanger. Are you sure these files are not already Sanger scaled? Modern Illumina pipelines produce fastqsanger files.
If you do know the scale, just import the files explicitly as fastqsanger (not fastq).
Galaxy team: maybe the sniffer should be smarter about guessing the file format…
If you need to convert a large number of files, you could set up a Galaxy workflow with a single FASTQ Groomer step. That would allow you to start the job on many FASTQ files at once.
Brad
On Apr 25, 2013, at 12:14 AM, Dave Lin <dave@verdematics.com> wrote:
Hi Brad, Galaxy Support,
Besides changing them individually, is there a way to modify the settings for a large number of files from fastq to fastqsanger in batch? Either via BioBlend or manually via the UI?
Thanks, Dave
Hi Brad,
In the past, I had always run the time-consuming FASTQ Groomer step. However, based on this thread, I now realize this step is sometimes unnecessary.
Here is my question. I have a large number of FASTQ data files that are already in data libraries. They were imported in "fastq" format. I was trying to figure out whether there is an easy way to change the metadata to "fastqsanger" in batch, without having to do it manually, file by file.
One approach is to create another library and reimport the files with the correct "fastqsanger" format selected. However, I wanted to see whether there is an existing feature that lets me select a bunch of files and then edit the metadata.
Thanks, Dave
On Thu, Apr 25, 2013 at 1:58 AM, Langhorst, Brad <Langhorst@neb.com> wrote:
Hi dave:
The FASTQ Groomer tool will convert your FASTQ files (of unknown base quality scale) to fastqsanger. Are you sure these files are not already Sanger scaled? Modern Illumina pipelines produce fastqsanger files.
If you do know the scale, just import the files explicitly as fastqsanger (not fastq).
Galaxy team: maybe the sniffer should be smarter about guessing the file format…
If you need to convert a large number of files, you could set up a Galaxy workflow with a single FASTQ Groomer step. That would allow you to start the job on many FASTQ files at once.
Brad
Hi Dave,
Maybe someone knows a better way…
I would probably muck around in the database directly if there were a lot of files.
One could probably move them all into a history, groom them, then pull the data back into your libraries, deleting the old ones.
Brad
participants (6)
- Bossers, Alex
- Dave Lin
- David Roquis
- Langhorst, Brad
- Nate Coraor
- 泽 蔡