August 2013 - galaxy-user - lists.galaxyproject.org

Fwd: HCC Analysis Pipeline
by Enis Afgan 05 Aug '13

Hi Moritz,

I'm forwarding your message to the galaxy-user mailing list and to Jennifer from Galaxy. Between the two, you should be able to get some help; unfortunately this is not my area of expertise, so I'm afraid I won't be of much help.

Good luck,
Enis

---------- Forwarded message ----------
From: Moritz Juchler <juchler(a)stud.uni-heidelberg.de>
Date: Fri, Aug 2, 2013 at 6:52 PM
Subject: Re: HCC Analysis Pipeline
To: enis.afgan(a)irb.hr

Sorry Mr. Afgan, I messed up the history link. This is the current one:
https://main.g2.bx.psu.edu/u/mj--/h/whole-exome-somatic-gene-mutation-extra…

Or just ignore the older message; I have included its text again here:

Dear Sir Afgan,

I am Moritz Juchler from the University of Heidelberg. I read through your slides for Bioblend and thought that maybe you could help me with my problem. For my Bachelor thesis I have to set up Galaxy to find SNPs in genomes from HCC patients. I have a server on which I installed Galaxy locally, since a) we have very large files (>30GB per patient) and b) the data is subject to data protection.

My goal is to do the first 5 steps of this pipeline:
http://www.nature.com/ng/journal/v44/n6/extref/ng.2256-S1.pdf (page 2)
from this paper:
http://www.nature.com/ng/journal/v44/n6/pdf/ng.2256.pdf

This is my current workflow:
https://main.g2.bx.psu.edu/u/mj--/w/workflow-whole-exome-somatic-gene-mutat…
and the matching *executed* history (with results):
https://main.g2.bx.psu.edu/u/mj--/h/whole-exome-somatic-gene-mutation-extra…

My question is: Am I going in the right direction? I really don't know if I'm doing the correct steps :( Are there any links or papers that explain which specific tool I have to use for each of the first five steps?

Of course, if you do not have the time to answer my question, please refer me to someone who can answer it and maybe has the time for it.

Best
Moritz

On 2 August 2013 18:50, Moritz Juchler <juchler(a)stud.uni-heidelberg.de> wrote:

> Dear Sir Afgan,
>
> I am Moritz Juchler from the University of Heidelberg. I read through your slides for Bioblend and thought that maybe you could help me with my problem. For my Bachelor thesis I have to set up Galaxy to find SNPs in genomes from HCC patients. I have a server on which I installed Galaxy locally, since a) we have very large files (>30GB per patient) and b) the data is subject to data protection.
>
> My goal is to do the first 5 steps of this pipeline:
> http://www.nature.com/ng/journal/v44/n6/extref/ng.2256-S1.pdf (page 2)
> from this paper:
> http://www.nature.com/ng/journal/v44/n6/pdf/ng.2256.pdf
>
> This is my current workflow:
> https://main.g2.bx.psu.edu/u/mj--/w/workflow-whole-exome-somatic-gene-mutat…
> and the matching *executed* history (with results):
> https://main.g2.bx.psu.edu/u/mj--/h/clean-whole-exome-somatic-gene-mutation…
>
> My question is: Am I going in the right direction? I really don't know if I'm doing the correct steps :( Are there any links or papers that explain which specific tool I have to use for each of the first five steps?
>
> Of course, if you do not have the time to answer my question, please refer me to someone who can answer it and maybe has the time for it.
>
> Best
> Moritz

Egg build failed for numpy 1.6.0
by lahcen campbell 05 Aug '13

Hi,

I am trying to install a local version of Galaxy on my MacBook (x86_64, OS X 10.8.4). I am having a lot of trouble fetching and installing specific egg dependencies. Initially pysam was giving me trouble, but once I ran scramble.py -e pysam, the problem was fixed.

Now I have trouble with the numpy egg dependency. Does anyone have any idea why an egg would fail to build? I am assuming it is because I already have a more recent numpy version (1.7.1) installed. What should I do, short of uninstalling the newer version of numpy?

Any help on the matter would be greatly appreciated. Thank you.

L

--
Dr. Lahcen Campbell
Contact: lahcencampbell(a)gmail.com
http://bioinf.may.ie/index.html

Whole Exome Somatic Gene Mutation Extraction
by Moritz Juchler 01 Aug '13

Hey,

this is my current workflow:
https://main.g2.bx.psu.edu/u/mj--/w/workflow-whole-exome-somatic-gene-mutat…
and the matching *executed* history (with results):
https://main.g2.bx.psu.edu/u/mj--/h/workflow-constructed-from-history-https…

My goal is to do 2-3 steps of this pipeline:
http://www.nature.com/ng/journal/v44/n6/extref/ng.2256-S1.pdf (page 2)
from this paper:
http://www.nature.com/ng/journal/v44/n6/pdf/ng.2256.pdf

I have several questions (numbers are step in workflow / step in history):

6/22: Filter SAM: is it necessary? If yes, should I set only one flag ("if unmapped: Do not set states"), or the 3 flags I chose?

10/45: Realigner: are known indels necessary? If yes, which ones? (I got this from http://www.broadinstitute.org/gatk//events/2038/GATKwh0-BP-2-Realignment.pdf which says on slide 14 that a BAM input file is not needed; is that right?)

12/49: BIGGEST PROBLEM: As you can see, the Indel Realigner output is empty :( Where is my mistake? I followed everything the GATK best practices suggested, but it failed anyway.

Any hints, links to papers, or answers that help me complete the above-mentioned pipeline or answer my questions are welcome :)

Best
Moritz
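
As background for the Filter SAM question, it may help to recall that the SAM FLAG field is a bit field, so "unmapped" is a single bit (0x4) that can be tested independently of the others. The following is only a minimal sketch and not part of the original message: the helper names `decode_flag` and `is_mapped` are hypothetical, the bit values follow the SAM format specification, and nothing here reflects how the Galaxy Filter SAM tool is implemented or which flags the pipeline in the paper requires.

```python
# Bit values of the SAM FLAG field, as defined in the SAM format specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "not primary alignment",
    0x200: "read fails quality checks",
    0x400: "read is PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if flag & bit]


def is_mapped(flag):
    """True if the 'read unmapped' bit (0x4) is not set (hypothetical helper)."""
    return not flag & 0x4


if __name__ == "__main__":
    for value in (99, 4, 1024):
        print(value, is_mapped(value), decode_flag(value))
```

Filtering on "unmapped = not set" alone keeps every aligned read; adding further flags (for example duplicates or non-primary alignments) narrows the set further, which is a choice the paper's supplementary methods would have to settle.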

Re: [galaxy-user] Cuffdiff-cummerbund with biological replicates problem
by Jeremy Goecks 01 Aug '13

Thanks for the information. I've added an option to the Cuffdiff tool so that the read group files can be output by Galaxy; this should make it possible to run Cuffdiff in Galaxy and then Cummerbund for replicate analysis. This change will make it out to our public server in a couple of weeks.

Thanks,
J.

On Jul 31, 2013, at 5:39 PM, Mike Shamblott wrote:

> Thanks
>
> My tests confirm that cummerbund does require the read group files to do replicate analyses. Therefore Galaxy cannot be used to do cuffdiff on replicates when cummerbund is the next step in the workflow.
>
> Mike
>
> On Jul 31, 2013, at 9:45 AM, Jeremy Goecks <jeremy.goecks(a)emory.edu> wrote:
>
>> In the past, others have had success using Cummerbund with Galaxy, and there's even a Cummerbund wrapper in the tool shed:
>>
>> http://toolshed.g2.bx.psu.edu/view/jjohnson/cummerbund
>>
>> That said, it appears that replicate information is largely contained in the read group tracking files, which are not currently included in Galaxy's Cuffdiff outputs. I don't know if these files are required by Cummerbund to do replicate analysis. This would be a good question for the Cummerbund developers, as would the question of what the p and q values mean when doing replicate analysis.
>>
>> If you find that Galaxy is lacking something that Cummerbund needs to function correctly, that would be very useful information to share with the list.
>>
>> Best,
>> J.
>>
>> On Jul 26, 2013, at 8:50 PM, Mike Shamblott wrote:
>>
>>> I'm trying to run Cuffdiff on a set of 10 human samples with biological replication, then download the results for further analyses in Cummerbund (v2.1.1). It seems like a standard workflow, but I cannot get cummerbund to acknowledge replicates. I download the 11 cuffdiff output files and rename them to the names expected by cummerbund. Cummerbund builds a CuffSet with no warnings and most analyses work as expected. The problem comes any time I try to see the results of replication. For example, in cummerbund, replicates() returns an empty set and any type of plot returns an error when replicates=T is included as an argument.
>>>
>>> There is no evidence of replication data in any of the 11 cuffdiff output files. The data is presented with the group name only. From this, I conclude that the problem is with cuffdiff, since there is no replicate data for cummerbund to build into its db. I see that there are several read group files that are produced by cuffdiff but cannot be downloaded in Galaxy. Is this the problem, and if so, how can Galaxy be used to generate data with (essential) replication? Are the p and q significance values reported in the output files a result of replicate analysis?
>>>
>>> I have tried to ask this question in several different forums without success. The responses I've gotten suggest it's a Galaxy issue rather than either cuffdiff or cummerbund. I'm hoping someone here can help answer my questions.
>>>
>>> Hopeful,
>>>
>>> Mike
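
The thread turns on which Cuffdiff files are present after download and renaming, so a quick pre-flight check of the output directory can be sketched as below. This is an illustration only, not something from the thread: the function name `check_cuffdiff_dir` is hypothetical, and the file names are an assumption based on the usual Cuffdiff output naming (11 core files plus the read group files); they should be checked against the Cuffdiff and cummeRbund versions in use.

```python
import os
import sys

# Assumed names of the 11 standard Cuffdiff outputs (not listed in the thread).
CORE_FILES = [
    "genes.fpkm_tracking",
    "isoforms.fpkm_tracking",
    "cds.fpkm_tracking",
    "tss_groups.fpkm_tracking",
    "gene_exp.diff",
    "isoform_exp.diff",
    "cds_exp.diff",
    "tss_group_exp.diff",
    "splicing.diff",
    "promoters.diff",
    "cds.diff",
]

# Assumed names of the read group files that carry per-replicate data.
REPLICATE_FILES = [
    "genes.read_group_tracking",
    "isoforms.read_group_tracking",
    "cds.read_group_tracking",
    "tss_groups.read_group_tracking",
    "read_groups.info",
]


def check_cuffdiff_dir(path):
    """Report which expected files are absent from a Cuffdiff output directory."""
    present = set(os.listdir(path))
    return {
        "missing_core": [name for name in CORE_FILES if name not in present],
        "missing_replicate": [name for name in REPLICATE_FILES if name not in present],
    }


if __name__ == "__main__":
    # Directory to check: first command-line argument, else the current directory.
    report = check_cuffdiff_dir(sys.argv[1] if len(sys.argv) > 1 else ".")
    for kind, names in report.items():
        print(kind + ":", ", ".join(names) if names else "none")
```

If `missing_replicate` is non-empty while `missing_core` is empty, the directory matches the situation described above: the core results load, but there is no replicate-level data for Cummerbund to read.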
