December 2013 - galaxy-user - lists.galaxyproject.org

Help with Summary Statistics
by D. A. Cowart 23 May '14


Hello,

I am attempting to use Galaxy to calculate the mean sequence read length and identify the range of read lengths for my 454 data. The data has already been organized and sorted by species. The format of the data is as follows:

>HD4AU5D01BHBCQ
CTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTC
>HD4AU5D01A093M
CTCTGTCGCTCTGTCTCTCTTCTCTCTCTCTCTCTCT
etc... for each species.

I have attempted to use the "Summary Statistics" button; however, it appears to be only for numerical data and not sequence data. Is this tool/task available via Galaxy?

Thank you,
Dominique Cowart
User name: dac330
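For reference, the statistic being asked for (mean read length plus the minimum and maximum) can be sketched outside Galaxy in a few lines of Python. This is only an illustrative sketch, not a Galaxy tool: the function names `read_lengths` and `summarize` are made up here, and the input is assumed to be FASTA with each header line starting with `>`.

```python
from statistics import mean


def read_lengths(fasta_lines):
    """Yield the length of each sequence in an iterable of FASTA lines."""
    length = None  # None until the first '>' header has been seen
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if length is not None:
                yield length  # finish the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        yield length  # last record in the file


def summarize(fasta_lines):
    """Return read count, mean, minimum and maximum read length."""
    lengths = list(read_lengths(fasta_lines))
    if not lengths:
        raise ValueError("no FASTA records found")
    return {
        "reads": len(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


# Hypothetical two-read input; the second sequence is wrapped over two lines.
example = [">read1", "CTCTCT", ">read2", "CTCTGTCG", "CT"]
stats = summarize(example)
print(stats)
```

To run it per species, one would call `summarize(open(path))` on each species file (the file paths being whatever the data were sorted into).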


multiplex data
by Ann Holtz-Morris 03 Jan '14


Hi,

I have several questions regarding multiplex data and its analysis.

First, I'm using a MiSeq run on 96-plex (dual index) data. While our current MiSeq data has had the adapters (including the indexes) clipped by the machine, it would be nice to be able to clip adapters using a file of adapters, rather than one at a time, for our older data, which didn't have the adapters clipped.

Second, we have the MiSeq run a BWA alignment to the reference genome for each of the 96 samples. The folder containing them all averages about 5GB of data. I noticed that on a local version of Galaxy, a user with administrator privileges can load multiple files. But how can a regular user load them to the main Galaxy? I've been reading the developer thread; is this something that will be released soon-ish?

Third, is there a way to run a workflow (several Samtools and Picard analyses) on multiple files so they all get processed?

Thanks,
Ann
--
Ann Holtz-Morris, M.S.
SRA III
de Jong Laboratory
CHORI
aholtzmorris(a)chori.org
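To make the first request concrete, here is a rough Python sketch of clipping reads against a whole list of adapters at once. It is a deliberately simplified illustration and not how any Galaxy tool is implemented: it only handles exact, full-length adapter matches (no mismatches, no partial adapters at the read end), and the names `load_adapters` and `clip_read` are hypothetical. The adapter file is assumed to hold one sequence per line, optionally with FASTA-style `>` headers.

```python
def load_adapters(lines):
    """Read adapter sequences, one per line, skipping blanks and '>' headers."""
    adapters = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith(">"):
            adapters.append(line.upper())
    return adapters


def clip_read(seq, adapters):
    """Cut the read at the earliest exact occurrence of any adapter."""
    cut = len(seq)
    for adapter in adapters:
        pos = seq.find(adapter)
        if pos != -1:
            cut = min(cut, pos)
    return seq[:cut]


# Illustrative adapter list and read (assumptions, not from the original post).
adapter_file = [">adapter_1", "AGATCGGAAGAGC", ">adapter_2", "CTGTCTCTTATACACATCT"]
adapters = load_adapters(adapter_file)
clipped = clip_read("ACGTACGTAGATCGGAAGAGCTTT", adapters)
print(clipped)
```

Dedicated trimming tools also handle sequencing errors and adapters that run off the end of the read, which this sketch does not attempt.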


using SnpEff and SNPSift in Galaxy
by shamsher jagat 02 Jan '14


I have a VCF file and I want to filter it for nonsynonymous/deletion/insertion sequence variations. Once I have filtered this file, I want to compare tumor vs. normal samples and then annotate such variations. I believe I can filter this file using SnpSift and then annotate it with SnpEff. When I try to use the SnpSift filter, it just says "arbitrary expression". Are there rules for how to write the expression for a particular filter within Galaxy? If anyone has used SnpSift in Galaxy, please share your expertise. Thanks, Kanwar
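As a point of reference for the filtering step, insertions and deletions can be recognized from the REF and ALT columns of a VCF alone, whereas "nonsynonymous" depends on gene annotation and so is only known after an annotation step such as SnpEff. The following Python sketch illustrates the first part only. It is not SnpSift's expression syntax, the names `variant_class` and `indel_records` are hypothetical, and symbolic alleles such as `<DEL>` are simply reported as "other".

```python
def variant_class(ref, alt):
    """Classify one REF/ALT pair by comparing allele lengths."""
    if alt.startswith("<") or alt in {"*", "."}:
        return "other"  # symbolic or missing allele, not handled here
    if len(ref) == len(alt):
        return "SNV" if len(ref) == 1 else "MNV"
    return "insertion" if len(alt) > len(ref) else "deletion"


def indel_records(vcf_lines):
    """Yield (chrom, pos, ref, alt, classes) for records with an indel allele."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and the column header
        fields = line.rstrip("\n").split("\t")
        ref = fields[3]
        classes = {variant_class(ref, alt) for alt in fields[4].split(",")}
        if classes & {"insertion", "deletion"}:
            yield fields[0], int(fields[1]), ref, fields[4], sorted(classes)


# Hypothetical records: one SNV, one insertion, one deletion.
vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t.\t.\t.",
    "chr1\t200\t.\tA\tAT\t.\t.\t.",
    "chr1\t300\t.\tATG\tA\t.\t.\t.",
]
indels = list(indel_records(vcf))
print(indels)
```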


retaining or reclaiming bed ids
by Gold, Bert (NIH/NCI) [E] 02 Jan '14


Hi!

Having provided a name (field 4) in a UCSC bed file ( http://www.genome.ucsc.edu/FAQ/FAQformat.html#format1 ) and sought a RefSeq name using the UCSC Table Browser ( http://www.genome.ucsc.edu/cgi-bin/hgTables ), I would now like to recover which line of the bed file delivered which line of the output file… However, I am told I need Galaxy to provide a workflow to do this. Can anyone explain how?

E.g., one line of my bed file looks like:

chr2 2723752 2723777 seqid6354405 0 -

and one line of my intersected Table Browser output looks like:

chr1 176432306 176811970 NM_020318 0 + 176525458 176811590 0 23 248,1835,1072,146,294,193,122,490,129,92,194,147,136,217,172,178,214,169,136,110,72,99,455, 0,92236,131353,207799,226966,228955,232567,235929,239436,243188,246812,248664,276455,276809,302495,306436,307796,326638,328176,330389,336890,377002,379209,

Clearly the first line of my bed doesn't correspond to the first line of my intersection output, but as my bed is long, what reference can I use to unambiguously identify which line of output the first line of my intersection corresponds to? How do I do this in Galaxy?

PS - I tried this workflow earlier today without success, aiming to achieve a similar objective: https://usegalaxy.org/u/james/w/workflow-from-ucsc-genes-and-symbols

PPS - I also note similar issues were raised in this discussion, with Galaxy promoted as the solution, but with no real details about how to achieve the desired results: http://redmine.soe.ucsc.edu/forum/index.php?t=msg&goto=10615&S=0d1b303e6dfd…

Bert Gold, Ph.D., FACMG
Staff Scientist
NCI-Frederick
Frederick, MD 21702
VOICE: 301-846-5098
EMAIL: golda(a)mail.nih.gov
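The underlying operation here is an interval join that carries the bed name (field 4) through to the matching RefSeq records. A minimal Python sketch of that idea follows. It is not a Galaxy workflow, the function names are hypothetical, it reads only the first four whitespace-separated columns of each file, and it assumes BED-style 0-based, half-open coordinates. The second bed line in the example is invented so that there is at least one overlap to show.

```python
from collections import defaultdict


def parse_intervals(lines):
    """Yield (line_no, chrom, start, end, name) from BED-like lines."""
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        yield line_no, fields[0], int(fields[1]), int(fields[2]), fields[3]


def match_bed_to_genes(bed_lines, gene_lines):
    """For each bed line, list the names of overlapping gene records."""
    genes = defaultdict(list)
    for _, chrom, start, end, name in parse_intervals(gene_lines):
        genes[chrom].append((start, end, name))
    for line_no, chrom, start, end, name in parse_intervals(bed_lines):
        # Half-open intervals overlap when each starts before the other ends.
        hits = [g for s, e, g in genes.get(chrom, []) if s < end and start < e]
        yield line_no, name, hits


bed = [
    "chr2 2723752 2723777 seqid6354405 0 -",       # line quoted in the post
    "chr1 176500000 176500025 seqid_example 0 +",  # hypothetical line
]
refseq = ["chr1 176432306 176811970 NM_020318 0 +"]
matches = list(match_bed_to_genes(bed, refseq))
for line_no, name, hits in matches:
    print(line_no, name, hits)
```

Because every output row keeps both the bed line number and the bed name, each RefSeq hit can be traced back to the bed line that produced it. This scan is quadratic per chromosome, so a long bed file would want sorted intervals or an interval tree instead.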


Problems when analyzing my data
by Yitian Xu 02 Jan '14


Hi Galaxy,

I am a user from Cornell University, and your website is a great help to me and my research. But there are two problems I cannot figure out by myself, and I hope you can help me.

1. When I upload data via FTP, there is an option for the mouse reference genome mm10. When I get to Tophat2, there is only mm9. Is it a problem if I use mm10 at the beginning and mm9 at Tophat2? Or will you perhaps update Tophat2?

2. I have around 50G of space missing. I have one and only one history (at least that I can see) with 171.5G, but when I checked my preferences it showed that I had used 225.2G. I don't know what the missing 50G is being counted for, so I don't know how to make room for my ongoing analysis. My user name is douyadou. Can you take a minute to check?

Thanks.
Best,
Yitian


Chrdecoy error in trackster
by doanea＠mskcc.org 02 Jan '14


Hello,

I've been using Galaxy for RNA-seq analysis. Many thanks for this great resource! I have run into a problem that I hope someone might be able to help me with. When loading my .BAM files and/or .gtf files into Trackster, I get an error stating: "Input error: Chromosome chrDecoy found in your input file but not in your genome file."

My Illumina fastq files were QCed and aligned using tophat to hg19, and I used cufflinks with hg19 as the annotation guide. All my files have hg19 as the dbkey. My guess is that the hg19 I used as a reference differs from the built-in model, but I am not sure how I might fix this. Fwiw, I am able to load all my data into IGV, using hg19 as the genome, and visualize everything.

Any suggestions would be really appreciated!

Thanks!
Ashley
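One way to narrow down an error like this is to list which chromosome names occur in the input file but not in the genome's chromosome list. The Python sketch below does that for a GTF. It is illustrative only: the function names are hypothetical, and the genome file is assumed to be a plain two-column "name length" listing, which may not match how a given Galaxy server stores its builds.

```python
def chromosomes_in_gtf(gtf_lines):
    """Collect the sequence names from column 1 of a GTF file."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.split("\t", 1)[0])
    return names


def chromosomes_in_len_file(len_lines):
    """Collect names from an assumed two-column 'name length' file."""
    return {line.split()[0] for line in len_lines if line.strip()}


def unknown_chromosomes(gtf_lines, len_lines):
    """Return names used in the GTF that the genome file does not define."""
    return sorted(chromosomes_in_gtf(gtf_lines) - chromosomes_in_len_file(len_lines))


# Hypothetical miniature inputs.
gtf = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "g1";',
    'chrDecoy\tCufflinks\texon\t5\t50\t.\t+\t.\tgene_id "g2";',
]
genome = ["chr1\t249250621"]
missing = unknown_chromosomes(gtf, genome)
print(missing)
```

Any name this reports would need either to be filtered out of the input or to be present in the genome definition used for visualization.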


Problems with Galaxy 101 Tutorial
by Lance Parsons 02 Jan '14


I have a user who is running into some problems on the Galaxy 101 tutorial at: https://usegalaxy.org/u/aun1/p/galaxy101. Specifically, in Safari, it's prompting for a userid and password for each image (and the galaxy account doesn't work). I can confirm this behavior myself. On Chrome, there is no prompt, but no images show up. Any suggestions would be quite welcome. Hoping to give a good first impression... ;-) -- Lance Parsons - Scientific Programmer 134 Carl C. Icahn Laboratory Lewis-Sigler Institute for Integrative Genomics Princeton University


Transferring data to a different account
by Yona Kim 02 Jan '14


Hello all, is there any way in Galaxy to transfer data to a different account? Thanks in advance, Yona Kim


January 2014 Galaxy Update Newsletter
by Dave Clements 31 Dec '13



EdgeR returning empty output
by Anto Praveen Rajkumar Rajamani 19 Dec '13


Hi, I used the http://galaxy.nbic.nl/ server in August 2013 to run edgeR on my RNA-seq data and successfully identified differentially expressed genes. However, now that I am trying it with another digital expression matrix, I repeatedly get only empty output. Has anyone used the edgeR wrapper recently? What could be the problem? Best wishes, Anto
