April 2010 - galaxy-user - lists.galaxyproject.org

Re: [galaxy-user] [galaxy-bugs] cDNA mapping 454
by Jeremy Goecks 28 Apr '10

Hi Chris,

I'm cc'ing galaxy-user because your questions are more about use than bugs.

> Do you have any suggestions for mapping cDNA reads using Lastz?
>
> I imagine the gaps caused by splicing cause problems for mapping cDNA reads using the commonly used setting?

You're correct here.

> Maybe setting up a "cDNA" mapping setting would be helpful, if such a thing is possible with LastZ.

I looked through the LastZ documentation but didn't see anything that would support mapping cDNA to a reference genome.

> I suppose an alternative would be to map with bowtie, trimming all the reads to be equal length?

You could try this, but splicing is likely to prevent good mapping. A better alternative is to use Tophat: http://tophat.cbcb.umd.edu/

Tophat is a splice junction mapper; among other outputs, it provides a list of mapped reads in SAM format. We have a very simple version of Tophat running on our test server that you can try out: http://test.g2.bx.psu.edu/

We'll have a more complete version of Tophat up in the next day or two. A caveat: Tophat is designed for Illumina data, so you may not get optimal results when using 454 data.

Best,
J.

Re: [galaxy-user] galaxy-user Digest, Vol 46, Issue 13
by Ross 27 Apr '10

Jelle, thanks for pointing out this dead-end link; it is now fixed. For the record, bitbucket marks any link preceded by 'wiki:' as an unsafe URL.

For the most up-to-date information on the WGA/SNP tools, http://rgenetics.org is a good place to look.

> Message: 6
> Date: Mon, 26 Apr 2010 09:03:14 +0200
> From: Jelle Scholtalbers <j.scholtalbers(a)gmail.com>
> To: galaxy <galaxy-user(a)bx.psu.edu>
> Subject: [galaxy-user] rgenetics and atlas
> Message-ID: <u2wb66932c31004260003y3754116ev99cde23351ac6e6e(a)mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi,
>
> I was wondering where I could find all the rgenetics dependencies as
> the link at the bottom of:
> http://bitbucket.org/galaxy/galaxy-central/wiki/ToolDependencies
> doesn't seem to bring me anywhere.
> Furthermore it seems that I'm missing the module atlas for genetrack,
> although I haven't seen anything about the need for this dependency.
> Should this be added as a dependency or is this an egg that should be
> automatically fetched?
>
> Cheers,
> Jelle

how to create and use workflows
by Martin Senger 26 Apr '10

I am a new Galaxy user and I am trying to find out more about the creation and usage of workflows. Once you have a workflow, its usage seems to be quite straightforward. But for workflow creation, I am still looking for more documentation. I am especially interested in these topics:

* How can I start a workflow with data from my computer if the upload step is not available for workflows?
* What should I do for my new tools (added in my own Galaxy installation) to be able to participate in workflows? Is there anything special to add in their XML description file, perhaps? [Perhaps this question is more for the galaxy-dev mailing list?]
* Is there perhaps more documentation on Galaxy workflows anywhere?

Thanks for any help,
Martin

--
Martin Senger
email: martin.senger@gmail.com, martin.senger@kaust.edu.sa
skype: martinsenger

rgenetics and atlas
by Jelle Scholtalbers 26 Apr '10

Hi, I was wondering where I could find all the rgenetics dependencies as the link at the bottom of: http://bitbucket.org/galaxy/galaxy-central/wiki/ToolDependencies doesn't seem to bring me anywhere. Furthermore it seems that I'm missing the module atlas for genetrack, although I haven't seen anything about the need for this dependency. Should this be added as a dependency or is this an egg that should be automatically fetched? Cheers, Jelle

Galaxy server is down or Up ??
by Saurabh.V. Laddha 24 Apr '10

I am mapping with Bowtie and it is taking an unexpectedly long time. I usually map data with Bowtie in 30 minutes or so, but this time it is taking more than 18 hours. So I wanted to know: is the Galaxy server down, or is the problem at my end?

Regards,
Saurabh

Re: [galaxy-user] A question regarding Galaxy
by Anton Nekrutenko 21 Apr '10

Caiti:

I am forwarding your e-mail to the galaxy-user list. To answer your question: you will soon be able to upload tarred, gzipped datasets that may contain multiple files.

a.

On Apr 16, 2010, at 10:29 AM, Caiti Smukowski wrote:

> Hi Anton,
>
> I am a graduate student in Mohamed Noor's lab at Duke University - I
> believe you met at a conference where the Galaxy tool was discussed.
> Mohamed suggested I contact you with a quick question. I am
> interested in using Galaxy to do some linear regressions, but I am
> encountering a problem. I have 640 individual datasets I would like
> to look at. It seems I would have to individually upload each one or
> alternatively upload them all as one file. I have decided to upload
> them as one file, and then I was hoping to find a tool where I could
> separate the file by row into different files once in Galaxy (file
> consists of 3 columns and thousands of rows, the first column has
> names in it that I would like to sort). So ideally, I would like to
> sort the file by column one (name) into individual files. Is there a
> way to do this?
>
> Thanks.
>
> Best,
>
> Caiti Smukowski

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org
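For readers with the same problem, the split Caiti describes (one output file per distinct name in column one) can be sketched outside Galaxy in a few lines of Python. This is only an illustration, not a Galaxy tool: the function name `split_by_first_column`, the tab delimiter, and the `.tsv` output naming are assumptions, and it presumes the names in column one are safe to use as file names.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_first_column(src, out_dir, delimiter="\t"):
    """Write one file per distinct value found in the first column of `src`.

    Hypothetical helper: assumes a delimited text file whose first column
    holds names that are valid file names. Returns the sorted list of names.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group rows by the value in column one, preserving row order per group.
    groups = defaultdict(list)
    with open(src, newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if row:  # skip blank lines
                groups[row[0]].append(row)

    # Emit one file per name, e.g. out_dir/<name>.tsv
    for name, rows in groups.items():
        with open(out_dir / f"{name}.tsv", "w", newline="") as out:
            writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
            writer.writerows(rows)

    return sorted(groups)
```

A call such as `split_by_first_column("all_data.tsv", "by_name")` (both paths are placeholders) would leave one file per name in the `by_name` directory. The whole input is held in memory, which should be fine for thousands of rows.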

Question about FASTQ Groomer
by Timothy Hughes 17 Apr '10

Hi Jianchao,

> I am asking this question because I used to use Maq's
> sol2sanger (I guess it is just similar to your "Solexa") to convert all
> data generated by Illumina 1.5.

The different fastq formats are broadly summarised by:

S - Sanger        Phred+33,  41 values (0, 40)
I - Illumina 1.3  Phred+64,  41 values (0, 40)
X - Solexa        Solexa+64, 68 values (-5, 62)

However, at least in my version of the MAQ software (some months old), the sol2sanger conversion converts from X to S and NOT from I to S. So if you feed I to the MAQ converter, you are going to get slightly incorrect Sanger qualities (because it expects the input qualities to have been calculated using the Solexa formula, but they have in fact been calculated using Phred). If you search on seqanswers.com you will find a post that details how you need to modify the MAQ conversion script to make the conversion from I to S.

Could this explain the discrepancies you observe?

Tim.

> ------------------------------
>
> Message: 4
> Date: Fri, 16 Apr 2010 11:20:09 -0400
> From: "Yao, Jianchao" <jyao(a)cshl.edu>
> To: <galaxy-user(a)bx.psu.edu>
> Subject: [galaxy-user] Question about FASTQ Groomer
> Message-ID: <2A031782CDB83F44A26147B7A11C2C95013002CF(a)mailbox11.cshl.edu>
> Content-Type: text/plain; charset="us-ascii"
>
> To Whom It May Concern:
>
> I am a new user to Galaxy. In the function of "FASTQ Groomer", I noticed
> there is an option for "Input FASTQ quality scores type". My question is
> what different conversions you will do when I choose "Solexa" or
> "Illumina 1.3+". I am asking this question because I used to use Maq's
> sol2sanger (I guess it is just similar to your "Solexa") to convert all
> data generated by Illumina 1.5. It seems like, based on your options, I
> should have chosen other conversion (e.g., your "Illumina 1.3+") to
> convert data generated by Illumina 1.5
>
> Also, it looks like "Solexa" and "Illumina 1.3+" just differ in the
> quality score calculation. But, when I use BWA and SAMtools to do
> mapping and call SNPs, I notice the size of the bam or pileup files are
> very different between those two different conversions. Also, it looks
> like even the coverage for some of the bases are different when choosing
> different conversions.
>
> Can you tell me how the conversion can affect the final result in terms
> of coverage?
>
> All your help will be greatly appreciated!
>
> -Jianchao Yao
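To make the difference between the encodings concrete, here is a minimal Python sketch of the two conversions to Sanger. It is not the FASTQ Groomer or MAQ code; the function names are made up for illustration. It assumes the standard relationship between the two scales, Q_phred = 10 * log10(10^(Q_solexa / 10) + 1), and rounds the result to the nearest integer.

```python
import math


def illumina13_to_sanger(qual: str) -> str:
    """I -> S: the scores are already Phred, so only the ASCII offset
    changes (64 -> 33)."""
    return "".join(chr(ord(c) - 64 + 33) for c in qual)


def solexa_to_phred(q_solexa: int) -> int:
    """Convert one Solexa-scaled score to the nearest Phred score."""
    return round(10 * math.log10(10 ** (q_solexa / 10) + 1))


def solexa_to_sanger(qual: str) -> str:
    """X -> S: decode with offset 64, rescale Solexa -> Phred, then
    re-encode with offset 33."""
    return "".join(chr(solexa_to_phred(ord(c) - 64) + 33) for c in qual)
```

The two conversions agree at high quality and diverge at the low end: the character `h` (score 40) becomes `I` under both, but `@` (score 0) becomes `!` (Phred 0) when treated as Illumina 1.3 and `$` (Phred 3) when treated as Solexa. Applying the Solexa conversion to Illumina 1.3+ data therefore shifts the low-quality bases, which is one plausible reason for downstream tools that filter on base quality to report different results.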

Question about FASTQ Groomer
by Yao, Jianchao 16 Apr '10

To Whom It May Concern:

I am a new user to Galaxy. In the "FASTQ Groomer" tool, I noticed there is an option for "Input FASTQ quality scores type". My question is what different conversions you do when I choose "Solexa" or "Illumina 1.3+". I am asking because I used to use Maq's sol2sanger (I guess it is similar to your "Solexa") to convert all data generated by Illumina 1.5. Based on your options, it seems I should have chosen the other conversion (e.g., your "Illumina 1.3+") to convert data generated by Illumina 1.5.

Also, it looks like "Solexa" and "Illumina 1.3+" differ only in the quality score calculation. But when I use BWA and SAMtools to do mapping and call SNPs, I notice that the sizes of the bam or pileup files are very different between those two conversions. It also looks like even the coverage for some of the bases is different when choosing different conversions.

Can you tell me how the conversion can affect the final result in terms of coverage?

All your help will be greatly appreciated!

-Jianchao Yao

setting environment for sge scripts
by Andreas Kuntzagk 15 Apr '10

Hi,

I followed the advice for multiple installed python versions: I set up ~galaxy/galaxy-python/python and set the environment for the galaxy user to have that in the PATH.

The problem is that when a job is started on GridEngine, it is not submitted with the "-V" flag, so it does not inherit the environment. It therefore runs with a different python version and, more importantly, LD_LIBRARY_PATH is not set, so some tools that need a special library (libRblas in this case) fail.

Where can I fix that?

Regards,
Andreas

data privacy?
by Peter Andrews 14 Apr '10

I ran into a researcher the other day who said her 'boss does not want me to put our data on galaxy'. What can I tell her to reassure them of privacy and safety? Thanks! -- -------------- Peter Andrews Programmer Computational Genetics Lab Dartmouth Hitchcock Medical Center (603) 653-9963
