Hi,

I am new to this and I hope someone can help. I have pooled sequencing data that I am trying to analyse using Galaxy. I've done quite a bit of online searching, and it seems that FreeBayes should be able to do this if I select "set population", tick the "Assume that samples result from pooled sequencing" option, and change the ploidy to n x 2 (the number of alleles, where n is the number of subjects and the organism is diploid).

However, whenever I do this I get an error, usually just "Killed".

I was originally setting my ploidy rather high (190, as I have 95 subjects pooled), so I wondered if this was the problem. However, it also fails with a ploidy of only 4. I've tried various settings to see where I am going wrong, all with the same BAM file:

Set population model options: Do not set --> works
Set population model options: set, Assume that samples result from pooled sequencing: not ticked, Default ploidy for the analysis: 2 --> works
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 2 --> works
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 4 --> fails (killed)
Set population model options: set, Assume that samples result from pooled sequencing: ticked, Default ploidy for the analysis: 10 --> fails (killed)

It seems that it is the ploidy part that I am doing wrong, as it works with the pooled option and a ploidy of 2. I'm sure I have to change the ploidy though, or else how does the program know how many subjects are in the pool? Also, everything that I've read says you have to change the ploidy.

I apologise if my question is naive. As I said, I am new to Galaxy and this is the first thing I am trying to do!

Any help / suggestions would be appreciated,

Thanks,
Nic
Hi Nic,

Yes, the program is running into a memory issue with this setting (confirmed by reviewing your bug report, thank you!).

This issue is not specific to Galaxy or to our server/cluster. It seems to be in the tool itself, and it comes up on different systems, in different cases, whenever the ploidy setting deviates from 1 or 2. So, sticking with ploidy = 2 is one option.

You might try contacting the tool author at the Freebayes google group for more detailed advice. The link is: https://groups.google.com/forum/#!forum/freebayes

Best,

Jen
Galaxy team

(In reply to Nicola Smith's message of 4/18/13 8:34 AM)
___________________________________________________________ The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists, please use the interface at:
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
-- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org
Hi,

I wanted to let you know that I've resolved this issue.

The problem is that the tool must consider the likelihoods of all possible un-phased genotypes with N alleles and M copies across P pools. This becomes quite a big number when both N and M are large, as can happen in low-complexity loci and deep pools.

See: https://github.com/ekg/freebayes/commit/576bc703c246035342538a0feeecd13c28f3..., and also https://groups.google.com/forum/#!topic/freebayes/R6dReM4sPoQ for a discussion of how this can be dealt with.

The --use-best-n-alleles option was previously applied only to SNPs, which made it ineffective against the combinatorial expansion, as most multiallelic loci contain indels or other kinds of non-SNP variation. In the most recent version it can be set low (e.g. 2 or 3 in your case) to prevent the memory blowup.

The most recent version of freebayes is not yet in Galaxy, but I am working on making it available there.

Sorry for the troubles. I hope you'll still have a chance to analyze your data with the pooled functionality in freebayes.

Erik

(In reply to Jennifer Jackson <jen@bx.psu.edu>, Wed, Apr 24, 2013 at 9:55 PM)
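To give a rough sense of the numbers behind Erik's explanation, here is a minimal Python sketch. It is not taken from freebayes; it only applies the standard multiset count, under the assumption that an un-phased genotype with M copies drawn from N alleles is an unordered selection with repetition, and that every one of the P pools has the same ploidy. The function names are hypothetical, and the joint count is a worst-case bound rather than what freebayes actually enumerates.

```python
from math import comb


def unphased_genotypes(ploidy: int, n_alleles: int) -> int:
    """Count unordered genotypes: multisets of size `ploidy` over `n_alleles` alleles."""
    # Stars-and-bars formula: C(M + N - 1, N - 1).
    return comb(ploidy + n_alleles - 1, n_alleles - 1)


def joint_genotype_combinations(ploidy: int, n_alleles: int, n_pools: int) -> int:
    """Worst-case number of joint genotype assignments across pools.

    Assumption for illustration only: each pool independently takes any of
    the possible genotypes, so the counts multiply.
    """
    return unphased_genotypes(ploidy, n_alleles) ** n_pools


if __name__ == "__main__":
    # Ploidy 190 corresponds to 95 pooled diploid subjects, as in Nic's data.
    for n_alleles in (2, 3, 5, 8):
        count = unphased_genotypes(190, n_alleles)
        print(f"ploidy=190, alleles={n_alleles}: {count} genotypes per pool")
```

With two alleles the count grows only linearly with ploidy (191 genotypes at ploidy 190), but each additional candidate allele multiplies it sharply, which is why capping the number of alleles considered with --use-best-n-alleles keeps the problem tractable.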
participants (3)
- Erik Garrison
- Jennifer Jackson
- Nicola Smith