September 2013 - galaxy-user - lists.galaxyproject.org

question
by Bondici, Ibi 25 Sep '13

On the FastUnifrac website that was available at the beginning of the year, I was able to obtain p-values using a “unifrac_env.txt” file with the Greengenes Core as the reference tree; the “Automatically generate category mapping file” option was available then. Can I have access to the previous website rather than the new Galaxy one? Or how can I get the “Automatically generate category mapping file” option?

Viorica (Ibi) Bondici
PhD Candidate
Department of Food and Bioproduct Sciences
College of Agriculture and Bioresources
University of Saskatchewan

Stitch MAF blocks
by Jennifer Di Tommaso 25 Sep '13

Hi,

I don't understand how to use the tool "Stitch MAF blocks". I uploaded a small BED file and now I need to run this tool, but I don't understand the next step. I'm searching for novel lincRNAs in some RNA-seq data; I found the lncRscan tool (http://code.google.com/p/lncrscan/wiki/example), and now I have to use "Stitch MAF blocks".

The point is: I upload the BED file, I select Fetch Alignments -> Stitch MAF blocks, and then I can do nothing. I can only choose between "locally cached alignments" and "alignments in your history", but MAF type/MAF file remains empty in either case.

Could someone kindly help me solve this problem? Do I need to download a MAF file? Is it normally on the Galaxy server, but down at the moment? How can I produce a MAF file containing only some species (29 mammals)?

Thank you very much.

Jennifer

Re: [galaxy-user] Metagenomic filtering
by Scott Tighe 25 Sep '13

Jing et al.,

Thank you for the offer to write some code to help advance the metagenomics arena. It is certainly needed.

The problem with megablast and shotgun metagenomics is well known: without proper understanding and the correct software, it will yield very misleading and in many cases incorrect data. Those of us who, for specific reasons, do NOT wish to move to a protein-level comparison are stuck.

*The Problem:* If I megablast 50 million sequences from a HiSeq run, millions of rRNA sequences will have a 99% match to the rRNA GenBank deposits of all microbes. That is not surprising, since rRNA is highly conserved. The difference between E. coli and Shigella is 1 to 2 bases over the full 1540 bp 16S. So 16S is not useful at the genus level, and certainly not at the species level.

*So what happens:* The returned matches will have many hits to whatever model organism is in GenBank. For example, E. coli has 13000 entries for rRNA and Sphaerotilus has 3 entries for rRNA. If the blasted sequence matches both, the results will mislead the investigator into thinking they have 13000 hits to E. coli, EVEN if the microbe is Sphaerotilus.

*The cure?:* Is there a way to filter out/remove all such hits? Let's say, for example, that a result has a first match (say E. coli) at >99%, a second match (say Pseudomonas) at >99%, and a third, fourth and fifth match at >99% for three other organisms. This sequence _must_ be discarded because it is a conserved sequence. Basically, conserved sequence is the enemy and invalidates the entire result.

*Another problem:* If you have a reference sample with 19 non-model microbes, and you run that by HiSeq shotgun for metagenomics and then megablast, what do you think you get? If E. coli is not in the reference sample, how many hits to it do you think you get? Yes, tens of thousands. So without removing conserved sequences, your data is wrong and you are much better served by culturing, running a Biolog metabolic panel, and comparing that to the sequence result.

So where do we start? I have some shotgun metagenomics data from the reference sample which included the 19 microbes. That was data from a MiSeq.

Scott

Scott Tighe
Senior Core Laboratory Research Staff
Advanced Genome Technologies Core
University of Vermont
Vermont Cancer Center
149 Beaumont ave
Health Science Research Facility 303/305
Burlington Vermont 05405
802-656-2557

On 9/20/2013 9:17 PM, Jing Yu wrote:
> Hi Scott,
>
> I can do some perl programming, such as local/remote blasting. Can you specify your problem a little more clearly, so that maybe I can write a program to do just that?
>
> Regards,
> Jing
>
> Gerald
>
> 16S is basically useless for identification to genus. Since I started sequencing 16S in 1992, I have come to realize that without sequencing the full 1540 bases, it is generally misleading, and even then, it is not accurate enough to nail the genus in more than 1/2 the cases. However, what is your feeling on ITS and gyrase? They seem to be far more discriminating, but those databases were decommissioned some time ago.
>
> The desirable thing would be for Galaxy or NCBI to add a "filter conserved genes" option [i.e. any hit with a second choice greater than 3% distance]. Something such as that.
>
> If you (or others) are aware of such a thing, I'd love to hear about it.
>
> Sincerely
> Scott
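The "cure" described in this message could be sketched roughly as follows. This is only a minimal illustration, not an existing Galaxy or NCBI feature: it assumes the megablast hits have already been reduced to (read ID, organism, percent identity) tuples, and the function name `filter_conserved_reads` and its parameters are made up for the example. A read is kept only if its hits at or above the identity cutoff all point to the same organism; reads with no high-identity hit at all are simply not returned.

```python
from collections import defaultdict


def filter_conserved_reads(hits, identity_cutoff=99.0, max_organisms=1):
    """Return the IDs of reads whose high-identity hits name at most max_organisms organisms.

    hits: iterable of (read_id, organism, percent_identity) tuples
          (a hypothetical, pre-parsed form of the megablast output).
    """
    organisms_per_read = defaultdict(set)
    for read_id, organism, percent_identity in hits:
        # Only hits at or above the cutoff count as evidence for an organism.
        if percent_identity >= identity_cutoff:
            organisms_per_read[read_id].add(organism)

    # A read matching several organisms at high identity is treated as
    # conserved sequence and dropped.
    return {
        read_id
        for read_id, organisms in organisms_per_read.items()
        if len(organisms) <= max_organisms
    }


if __name__ == "__main__":
    example_hits = [
        ("read1", "Escherichia coli", 99.5),
        ("read1", "Pseudomonas", 99.2),
        ("read2", "Sphaerotilus", 99.8),
        ("read2", "Sphaerotilus", 99.1),
    ]
    print(filter_conserved_reads(example_hits))
```

In this toy input, read1 matches two organisms above the cutoff and is discarded, while read2 only ever matches Sphaerotilus and is kept. Grouping by organism rather than by GenBank entry also means that 13000 E. coli entries count once, the same as 3 Sphaerotilus entries.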

de novo RNA-Seq workflow
by Carlos Canchaya 24 Sep '13

Hi guys,

I am looking for a de novo RNA-seq workflow that uses Trinity. Any idea whether one is available?

Best,
Carlos

Carlos Canchaya
ccanchaya(a)gmail.com

Re: [galaxy-user] History will not load?
by Jennifer Jackson 24 Sep '13

Hi Maria,

Yes, this is a known issue and we have been working on resolving it today. A red banner is up on the public Main server with a description. That said, comparing my own testing from when you first sent this in with a test I ran just now, there appears to be some improvement already. The banner will be updated once the issue is fully cleared.

Thank you for your patience,

Jen
Galaxy team

On 9/24/13 6:13 AM, Maria Hoffman wrote:
> Hello,
> Unfortunately, it appears as though the history will not load again this AM and the jobs I set to run last night (cuffdiff) have not completed either.
> Sorry to keep bothering you guys.
>
> Maria
>
> On Mon, Sep 23, 2013 at 2:36 PM, Jennifer Jackson <jen(a)bx.psu.edu> wrote:
>
> > Hello Maria,
> >
> > The issue has been resolved. Our apologies for the inconvenience,
> >
> > Jen
> > Galaxy team
> >
> > On 9/22/13 11:55 AM, Maria Hoffman wrote:
>> Hello,
>> I have been having trouble with Galaxy since yesterday. I have been trying to run cuffdiff, which has been waiting to run since Friday afternoon, and now my history will not even load (red error message shows up).
>> Any insight would be much appreciated.
>> Maria
>>
>> ___________________________________________________________
>> The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org <http://usegalaxy.org>. Please keep all replies on the list by using "reply all" in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list:
>>
>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>
>> To manage your subscriptions to this and other Galaxy lists, please use the interface at:
>>
>> http://lists.bx.psu.edu/
>>
>> To search Galaxy mailing lists use the unified search at:
>>
>> http://galaxyproject.org/search/mailinglists/
>
> --
> Jennifer Hillman-Jackson
> http://galaxyproject.org

--
Jennifer Hillman-Jackson
http://galaxyproject.org

Re: [galaxy-user] your video of using cloudman
by Anton Nekrutenko 24 Sep '13

Elwood:

With large fastq files your best option will be either using the public server (which is having difficulties today) or running a local instance of Galaxy. We are very much aware that the public site is not behaving well. It will be back online today, and in the coming weeks we will be switching the underlying infrastructure, bringing much-needed relief to all Galaxy users at once.

The best venue for posting such questions is our galaxy-user mailing list (I CC'ed it here).

Thank you, and sorry for the main site troubles.

anton

On Tue, Sep 24, 2013 at 11:29 AM, Elwood Linney <ellinney(a)gmail.com> wrote:
> Hello,
> Because of the problems with Galaxy online, I am looking into using Galaxy on the cloud, but even with all the information available it is sometimes confusing for individuals who wish to use existing Galaxy in a specific manner.
>
> In your video of transferring datasets to the cloud you used URLs for relatively small files, but under normal circumstances with Galaxy online, this does not work for large files. I generally work with fastq files of about 16 GB in size. Would I be able to transfer a file of this size via URL to the cloud, or would I have to use some other means (for Galaxy online I use fetch)?
>
> Rather than bother you with this question, is there a specific address that I should send this type of question to (like the user list, though it seems like there is so much dysfunctional with Galaxy online that it takes a few weeks to get a reply from that)?
>
> Elwood Linney
> Professor of Molecular Genetics and Microbiology
> Duke University Medical Center

--
Anton Nekrutenko
Associate Professor
Dept. of Biochemistry and Molecular Biology
www.galaxyproject.org
(814) 826-3051

Secure file upload gateway for Galaxy
by Georgios Magklaras 24 Sep '13

Hi,

Our team at the University of Oslo is building a Life Science Portal based on Galaxy. We operate several standalone instances and we have the necessary sysadmin experience, but we really need to implement a more secure file upload mechanism than FTP (we do not like sending cleartext password credentials in the open), and we understand from this screencast that Galaxy does not integrate an upload method other than FTP:

http://screencast.g2.bx.psu.edu/quickie_17_ftp_upload/flow.html

One possible solution is to set up an upload server with a huge scratch space that runs an SFTP upload gateway on one end and an IP-restricted FTP server on the other, so that users can then upload/index the SFTP-uploaded data into their Galaxy session via the URL upload field. This two-step process might be a bit cumbersome for some of our users and we are looking for ways to simplify it.

Do you have recommended recipes for integrating an SFTP/Aspera upload gateway with Galaxy? We would welcome advice on this matter.

GM

Best regards,
--
George Magklaras PhD
RHCE no: 805008309135525
Head of IT/Senior Systems Engineer
Biotechnology Center of Oslo and the Norwegian Center for Molecular Medicine/
Vitenskapelig Databehandling (VD) - Research Computing Services - USIT
EMBnet TMPC Chair
http://folk.uio.no/georgios
http://www.uio.no/english/services/it/research/hpc/abel/
Tel: +47 22840535
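The second step of the two-step process described above (turning a file that arrived over SFTP into something a user can paste into Galaxy's URL upload field) might be sketched like this. It is a rough illustration under stated assumptions only: the host name `ftp-gateway.example.org`, the scratch path `/scratch/uploads`, and the helper `galaxy_upload_url` are all hypothetical and do not come from the post or from Galaxy itself.

```python
from pathlib import PurePosixPath
from urllib.parse import quote

# Hypothetical values, chosen only for illustration.
FTP_HOST = "ftp-gateway.example.org"
SCRATCH_ROOT = PurePosixPath("/scratch/uploads")


def galaxy_upload_url(uploaded_path, host=FTP_HOST, root=SCRATCH_ROOT):
    """Map a file under the SFTP scratch area to an FTP URL for the URL upload field."""
    path = PurePosixPath(uploaded_path)
    # relative_to raises ValueError if the file lies outside the scratch area.
    relative = path.relative_to(root)
    # PurePosixPath does not resolve "..", so reject it explicitly.
    if ".." in relative.parts:
        raise ValueError(f"path escapes the scratch area: {uploaded_path}")
    # Percent-encode spaces and other special characters, keeping "/" separators.
    return f"ftp://{host}/{quote(str(relative))}"


if __name__ == "__main__":
    print(galaxy_upload_url("/scratch/uploads/alice/run 1.fastq"))
```

A small helper of this kind could be exposed on the gateway so that users copy a ready-made URL instead of assembling it by hand, which is one way to make the two-step process less cumbersome.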

non-coding RNA annotation
by Hoang, Thanh 23 Sep '13

Hi all,

I am analyzing small RNA sequencing data from mouse tissue. Does anyone know where I can download an annotation file for non-coding RNA?

Thanks,
Thanh

Galaxy Main - Files not Found, Server Error
by Yardley, Nathan 23 Sep '13

Hello,

I am getting a resource not found error when I try to view data on the Galaxy Main Server.

For some of my FASTQ Groomed files I get:

The resource could not be found.
File Not Found (/galaxy/main_pool/pool7/files/006/764/dataset_6764692.dat).

[[Note: the number before .dat changes depending on the file requested for viewing]]

For some of my Map with Bowtie for Illumina files I get:

Internal Server Error
Galaxy was unable to sucessfully complete your request
An error occurred. This may be an intermittent problem due to load or other unpredictable factors, reloading the page may address the problem.
The error has been logged to our team. If you want to contact us about this error, please reference the following
GURU MEDITATION: #dc57d31266564204baf90d698880401a

I also have two mapping jobs that have not been processed since Friday afternoon (it is Monday morning as of this post).

Any help on this matter would be great. Thank you very much for your help.

Nathan

History will not load?
by Maria Hoffman 23 Sep '13

Hello,

I have been having trouble with Galaxy since yesterday. I have been trying to run cuffdiff, which has been waiting to run since Friday afternoon, and now my history will not even load (a red error message shows up). Any insight would be much appreciated.

Maria
